CN105530554B - Video abstract generation method and device

Info

Publication number: CN105530554B (granted; earlier publication CN105530554A)
Application number: CN201410570690.4A
Authority: CN (China)
Legal status: Active
Inventors: 董振江, 邓硕, 田玉敏, 唐铭谦, 冯艳
Assignee (original and current): Nanjing ZTE New Software Co Ltd
Priority applications: CN201410570690.4A; PCT/CN2014/094701 (published as WO2015184768A1)
Original language: Chinese (zh)

Classifications

    • H: Electricity
    • H04: Electric communication technique
    • H04N: Pictorial communication, e.g. television
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80: Generation or processing of content or additional data by content creator independently of the distribution process; content per se
    • H04N 21/85: Assembly of content; generation of multimedia applications
    • H04N 21/854: Content authoring
    • H04N 21/8549: Creating video summaries, e.g. movie trailer

Abstract

The invention provides a video summary generation method and device. The method comprises: dividing an original video into a plurality of fields of view; assigning each object track contained in the original video to the field of view to which it is closest, according to the proximity of the track to each field of view; computing an activity index for each field of view from the activity levels of the object tracks within it, and classifying each field of view as important or secondary according to whether its activity index exceeds a preset threshold; and processing the object tracks in the important and secondary fields of view in parallel, then merging the processed fields of view to generate the video summary. Because the object tracks in the important and secondary fields of view are processed in parallel, the method reduces the amount of computation required for track combination, speeds up the computation, and lets the user focus more simply and clearly on the main targets in the important fields of view.

Description

Video abstract generation method and device
Technical Field
The present invention relates to the field of image recognition, and in particular to a video summary generation method and device.
Background
Video summarization, also called video synopsis or video condensation, is a condensed representation of video content. Moving objects are extracted automatically or semi-automatically through moving-object analysis, the motion track of each object is analyzed, and the different objects are then spliced into a common background scene and combined in a certain way. With the development of video technology, video summarization plays an increasingly important role in video analysis and content-based video retrieval.
In the field of social public safety, video surveillance systems have become an important component of maintaining social order and strengthening social management. However, video recordings involve very large amounts of stored data and long storage times, and the traditional approach of searching recordings for clues and evidence consumes large amounts of manpower, material resources and time; its efficiency is extremely low, and the best opportunity to solve a case can be missed.
For the prior-art problem that an optimal summary video cannot be found quickly in large-scale video data, no effective solution has yet been proposed.
Disclosure of Invention
In order to overcome the defects of the prior art, the embodiments of the present invention provide a video summary generation method and device.
To solve the technical problem, the embodiments of the present invention adopt the following technical solutions:
According to one aspect of the embodiments of the present invention, a video summary generation method is provided, comprising: dividing an original video into a plurality of fields of view; assigning each object track contained in the original video to the field of view to which it is closest, according to the proximity of the track to each field of view; computing an activity index for each field of view from the activity levels of the object tracks within it, and classifying each field of view as important or secondary according to whether its activity index exceeds a preset threshold; and processing the object tracks in the important and secondary fields of view in parallel, then merging the processed fields of view to generate the video summary.
Wherein the dividing of the original video into a plurality of fields of view comprises: determining the direction of the scene in the original video; and dividing the original video into a plurality of fields of view according to the direction of the scene, wherein the directions of the fields of view are consistent with the direction of the scene.
Wherein the determining of the direction of the scene in the original video comprises: acquiring the start and end points of a plurality of object tracks in the scene of the original video; calculating the coordinate difference between the start and end points of each object track to determine the direction of the track; and judging the direction of the scene in the original video from the direction of the majority of the object tracks, wherein the direction of the scene is consistent with the direction taken by most of the tracks.
Wherein the assigning of each object track contained in the original video to the field of view to which it is closest, according to the proximity of the track to each field of view, comprises: acquiring a line-segment feature for each field of view, the line-segment feature comprising the coordinates of the start and end points of the field of view and the number of object tracks it contains; acquiring the coordinates of the start and end points of an object track and calculating the proximity of the track to each field of view; assigning each object track contained in the original video to the field of view to which it is closest, according to that proximity; and updating the line-segment feature of that closest field of view with the coordinates of the track's start and end points.
Wherein the computing of the activity index of each field of view from the activity levels of the object tracks within it, and the classifying of each field of view as important or secondary according to whether the activity index exceeds a preset threshold, comprise: the activity level being positively correlated with the object area corresponding to the track and with the track's duration, summing the activity levels of all object tracks in a field of view to obtain the activity index of that field of view; and classifying each field of view as important or secondary according to whether its activity index exceeds the preset threshold.
Optionally, the processing of the object tracks in the important and secondary fields of view in parallel and the merging of the processed fields of view to generate the video summary comprise: if the plurality of fields of view are all important, solving for the optimal object-track combination of each field of view with a first preset function, thereby determining the track combination corresponding to the optimal solution; and generating the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the processing of the object tracks in the important and secondary fields of view in parallel and the merging of the processed fields of view to generate the video summary comprise: if the plurality of fields of view are all secondary, solving for the optimal object-track combination of each field of view with a second preset function, thereby determining the track combination corresponding to the optimal solution; and generating the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the processing of the object tracks in the important and secondary fields of view in parallel and the merging of the processed fields of view to generate the video summary comprise: if the plurality of fields of view include both important and secondary fields of view: when two important fields of view are adjacent, merging them into one important field of view and solving for the optimal object-track combination of the merged field of view with a first preset function; when the important fields of view are not adjacent to each other, solving for the optimal object-track combination of each important field of view separately with the first preset function, thereby determining the track combination corresponding to each optimal solution; solving for the optimal object-track combination of each secondary field of view with a second preset function, thereby determining the track combination corresponding to each optimal solution; and generating the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the processing of the object tracks in the important and secondary fields of view in parallel and the merging of the processed fields of view to generate the video summary comprise: if the plurality of fields of view include both important and secondary fields of view: when two important fields of view are adjacent, merging them into one important field of view and solving for the optimal object-track combination of the merged field of view with a first preset function; when the important fields of view are not adjacent to each other, solving for the optimal object-track combination of each important field of view separately with the first preset function, thereby determining the track combination corresponding to each optimal solution; copying the object tracks in the secondary fields of view into the background image as they appear in the original video; and merging all the fields of view according to these processing results to generate the video summary.
According to another aspect of the embodiments of the present invention, a video summary generation apparatus is also provided, comprising: a first dividing module, configured to divide an original video into a plurality of fields of view; a classification module, configured to assign each object track contained in the original video to the field of view to which it is closest, according to the proximity of the track to each field of view; a second dividing module, configured to compute an activity index for each field of view from the activity levels of the object tracks within it, and to classify each field of view as important or secondary according to whether the activity index exceeds a preset threshold; and a merge processing module, configured to process the object tracks in the important and secondary fields of view in parallel and to merge the processed fields of view to generate the video summary.
Wherein the first dividing module comprises: a first calculation unit, configured to determine the direction of the scene in the original video; and a first dividing unit, configured to divide the original video into a plurality of fields of view according to the direction of the scene, the directions of the fields of view being consistent with the direction of the scene.
Wherein the first calculation unit comprises: a first acquisition unit, configured to acquire the start and end points of a plurality of object tracks in the scene of the original video; a difference calculation unit, configured to calculate the coordinate difference between the start and end points of each object track and determine the direction of the track; and a judging unit, configured to judge the direction of the scene in the original video from the direction of the majority of the object tracks, the direction of the scene being consistent with the direction taken by most of the tracks.
Wherein the classification module comprises: a second acquisition unit, configured to acquire a line-segment feature for each field of view, the line-segment feature comprising the coordinates of the start and end points of the field of view and the number of object tracks it contains; a distance calculation unit, configured to acquire the coordinates of the start and end points of an object track and calculate the proximity of the track to each field of view; a first classification unit, configured to assign each object track contained in the original video to the field of view to which it is closest, according to that proximity; and an updating unit, configured to update the line-segment feature of that closest field of view with the coordinates of the track's start and end points.
Wherein the second dividing module comprises: an activity index calculation unit, in which the activity level of a track is positively correlated with the object area corresponding to the track and with the track's duration, and the activity index of a field of view is obtained by summing the activity levels of all object tracks within it; and a second dividing unit, configured to classify each field of view as important or secondary according to whether its activity index exceeds a preset threshold.
Optionally, the merge processing module comprises: a first merging unit, configured to solve, if the plurality of fields of view are all important, for the optimal object-track combination of each field of view with a first preset function, thereby determining the track combination corresponding to the optimal solution; and a first processing unit, configured to generate the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the merge processing module comprises: a second merging unit, configured to solve, if the plurality of fields of view are all secondary, for the optimal object-track combination of each field of view with a second preset function, thereby determining the track combination corresponding to the optimal solution; and a second processing unit, configured to generate the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the merge processing module comprises: a third merging unit, configured, if the plurality of fields of view include both important and secondary fields of view, to merge two adjacent important fields of view into one important field of view and solve for the optimal object-track combination of the merged field of view with a first preset function; to solve, if the important fields of view are not adjacent to each other, for the optimal object-track combination of each important field of view separately with the first preset function, thereby determining the track combination corresponding to each optimal solution; and to solve for the optimal object-track combination of each secondary field of view with a second preset function, thereby determining the track combination corresponding to each optimal solution; and a third processing unit, configured to generate the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the merge processing module comprises: a fourth merging unit, configured, if the plurality of fields of view include both important and secondary fields of view, to merge two adjacent important fields of view into one important field of view and solve for the optimal object-track combination of the merged field of view with a first preset function; to solve, if the important fields of view are not adjacent to each other, for the optimal object-track combination of each important field of view separately with the first preset function, thereby determining the track combination corresponding to each optimal solution; and to copy the object tracks in the secondary fields of view into the background image as they appear in the original video; and a fourth processing unit, configured to merge all the fields of view according to these processing results to generate the video summary.
The embodiments of the present invention have the following beneficial effects: in the video summary generation method of the embodiments, processing the object tracks in the important and secondary fields of view in parallel reduces the amount of computation required for track combination, speeds up the computation, and lets the user focus more simply and clearly on the main targets in the important fields of view.
Drawings
FIG. 1 is a flow chart illustrating the basic steps of a video summary generation method according to an embodiment of the present invention;
FIG. 2 is a first application diagram of a video summary generation method according to an embodiment of the present invention;
FIG. 3 is a second application diagram of a video summary generation method according to an embodiment of the present invention;
FIG. 4 is a third application diagram of a video summary generation method according to an embodiment of the present invention;
FIG. 5 is a fourth application diagram of a video summary generation method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a video summary generation apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
Example one
FIG. 1 and FIG. 2 are schematic diagrams of an embodiment of the present invention. As shown in FIG. 1, an embodiment of the present invention provides a video summary generation method, comprising:
Step 101: dividing an original video into a plurality of fields of view;
Step 102: assigning each object track contained in the original video to the field of view to which it is closest, according to the proximity of the track to each field of view;
Step 103: computing an activity index for each field of view from the activity levels of the object tracks within it, and classifying each field of view as important or secondary according to whether the activity index exceeds a preset threshold;
Step 104: processing the object tracks in the important and secondary fields of view in parallel, and merging the processed fields of view to generate the video summary.
In this video summary generation method, the object tracks in the important and secondary fields of view are processed in parallel, which reduces the amount of computation required for track combination, speeds up the computation, and lets the user focus more simply and clearly on the main targets in the important fields of view.
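The four steps compose into a compact pipeline. The sketch below is a minimal, self-contained Python illustration of that composition; the band-based view split, the midpoint-based assignment, the area-times-duration activity level and all function names are assumptions of this sketch, not the patent's prescribed implementation, and the per-view processing is stubbed out.

```python
from concurrent.futures import ThreadPoolExecutor

def process_view(view):
    # Placeholder for the per-view track combination (first or second
    # preset function, depending on view["important"]).
    return view

def merge_views(views):
    # Placeholder for stitching the processed views into the summary.
    return views

def generate_summary(tracks, k, threshold):
    # Step 101: divide the scene into k fields of view (here: k vertical
    # bands over x-coordinates normalized to [0, 1)).
    views = [{"tracks": []} for _ in range(k)]
    # Step 102: assign each track to the closest field of view (here: by
    # the x-coordinate of its midpoint; the patent uses a line-segment
    # distance instead).
    for t in tracks:
        mid_x = (t["start"][0] + t["end"][0]) / 2
        views[min(int(mid_x * k), k - 1)]["tracks"].append(t)
    # Step 103: activity index = sum of (area x duration) over the view's
    # tracks; views above the threshold are "important".
    for v in views:
        v["activity"] = sum(t["area"] * t["duration"] for t in v["tracks"])
        v["important"] = v["activity"] > threshold
    # Step 104: process important and secondary views in parallel, merge.
    with ThreadPoolExecutor() as pool:
        processed = list(pool.map(process_view, views))
    return merge_views(processed)
```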
Further, step 101 in the above embodiment of the present invention specifically comprises:
determining the direction of the scene in the original video;
and dividing the original video into a plurality of fields of view according to the direction of the scene, wherein the directions of the fields of view are consistent with the direction of the scene.
That is, the original video can be divided into k fields of view according to actual requirements, where k is a positive integer.
In the above embodiment, the direction of the scene in the original video may be determined as follows:
First, the start and end points of a plurality of object tracks in the scene of the original video are acquired.
The plurality of tracks may be all of the tracks in the scene or only a part of them; for example, if the scene contains 100 object tracks, the scene direction may be computed from 20 of them or from all 100.
Then, the coordinate difference between the start and end points of each object track is calculated to determine the direction of the track.
Specifically, if the absolute difference of the vertical coordinates of the start and end points is greater than the absolute difference of the horizontal coordinates, the direction of the track is judged to be vertical; if it is smaller, the direction of the track is judged to be horizontal.
Finally, the direction of the scene in the original video is judged from the direction of the majority of the object tracks: the direction of the scene is consistent with the direction taken by most of the tracks.
That is, if most of the object tracks run horizontally or vertically, the scene direction is horizontal or vertical, respectively.
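A direct transcription of this rule (the coordinate-difference test plus the majority vote) might look as follows; breaking ties toward "horizontal" is an assumption of this sketch, since the text leaves the equality case open.

```python
def track_direction(start, end):
    # Vertical if the absolute y-difference between the start and end
    # points exceeds the absolute x-difference, horizontal otherwise
    # (the equality case is not specified in the text).
    dx = abs(end[0] - start[0])
    dy = abs(end[1] - start[1])
    return "vertical" if dy > dx else "horizontal"

def scene_direction(tracks):
    # tracks: (start_point, end_point) pairs; per the text, a subset of
    # the scene's tracks (e.g. 20 of 100) is enough.
    votes = [track_direction(s, e) for s, e in tracks]
    return max(("horizontal", "vertical"), key=votes.count)

# Two mostly horizontal tracks and one vertical one -> "horizontal".
print(scene_direction([((0, 0), (9, 1)), ((1, 2), (8, 3)), ((4, 0), (5, 7))]))
```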
Specifically, step 102 in the above embodiment of the present invention comprises:
acquiring a line-segment feature for each field of view, the line-segment feature comprising the coordinates of the start and end points of the field of view and the number of object tracks contained in it (the feature includes, but is not limited to, these quantities);
acquiring the coordinates of the start and end points of an object track and calculating the proximity of the track to each field of view, where the proximity may be calculated according to a distance calculation formula;
and assigning each object track contained in the original video to the field of view to which it is closest, according to that proximity.
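The distance formula itself is not fixed by the text; the sketch below uses the summed Euclidean distance between corresponding endpoints of the track and of each view's line-segment feature, one simple choice consistent with the quantities defined above (an assumption of this sketch).

```python
import math

def proximity(track, view):
    # Summed endpoint distance between the track's (start, end) points and
    # the view's line-segment feature; one possible "distance calculation
    # formula", not necessarily the patent's.
    (ts, te), (vs, ve) = track, view
    return math.dist(ts, vs) + math.dist(te, ve)

def closest_view_index(track, views):
    # Index of the field of view this track should be assigned to.
    return min(range(len(views)), key=lambda i: proximity(track, views[i]))
```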
In the embodiment of the present invention, preferably, after each object track has been added to a field of view, the line-segment feature of that closest field of view may also be updated according to the coordinates of the track's start and end points. The update includes n_k = n_k + 1, where n_k is the number of object tracks contained in the field of view before the track joins and n_k + 1 is the number after it joins. The start and end coordinates of the field of view, (x_s^k, y_s^k) and (x_e^k, y_e^k), are likewise updated from the abscissa and ordinate of the track's start point, x'_s and y'_s, and of its end point, x'_e and y'_e. The update formulas themselves appear only as images in the published text; a running average over the tracks joined so far, for example x_s^k = (n_k * x_s^k + x'_s) / (n_k + 1) applied to each of the four coordinates, is consistent with these definitions. In the embodiment of the present invention, the initial start and end points of a field of view may be taken as the start and end points of the first object track added to it.
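Since the published update formulas survive only as images, the sketch below implements the running-average reading described above; treat the averaging itself as an assumption, while the count update n_k = n_k + 1 and the seeding from the first track are stated in the text.

```python
def update_view(view, track):
    # view:  {"start": (x, y), "end": (x, y), "n": track count}
    # track: ((x'_s, y'_s), (x'_e, y'_e))
    (vxs, vys), (vxe, vye), n = view["start"], view["end"], view["n"]
    (txs, tys), (txe, tye) = track
    # Assumed running average of the start/end coordinates:
    view["start"] = ((n * vxs + txs) / (n + 1), (n * vys + tys) / (n + 1))
    view["end"] = ((n * vxe + txe) / (n + 1), (n * vye + tye) / (n + 1))
    view["n"] = n + 1  # n_k = n_k + 1 once the track has joined the view

# A view is seeded with the first track's start and end points:
view = {"start": (0.0, 0.0), "end": (10.0, 0.0), "n": 1}
update_view(view, ((2.0, 1.0), (12.0, 1.0)))
print(view)  # {'start': (1.0, 0.5), 'end': (11.0, 0.5), 'n': 2}
```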
Specifically, step 103 in the above embodiment of the present invention comprises:
the activity level of an object track being positively correlated with the object area corresponding to the track and with the track's duration, computing the activity index of a field of view by summing the activity levels of all object tracks within it;
wherein the object area of a track can be calculated from the height and width of the object itself;
and classifying each field of view as important or secondary according to whether its activity index exceeds a preset threshold.
To illustrate the division into important and secondary fields of view: suppose that in an actual scene the original video is divided into 3 fields of view. The activity indexes of the 3 fields of view are computed separately and compared with the preset threshold. Any field of view whose activity index is greater than the threshold is classified as important; if even the largest of the activity indexes is smaller than the threshold, all 3 fields of view are secondary.
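In code, with the activity level of a track taken as object area times duration (the product is one simple positively correlated combination, an assumption of this sketch):

```python
def activity_index(view):
    # Sum of per-track activity levels; "area x duration" is one simple
    # positively correlated choice.
    return sum(t["area"] * t["duration"] for t in view["tracks"])

def classify_views(views, threshold):
    # Split the fields of view on the preset threshold.
    important, secondary = [], []
    for v in views:
        (important if activity_index(v) > threshold else secondary).append(v)
    return important, secondary
```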
Specifically, step 104 in the above embodiment of the present invention comprises:
if the plurality of fields of view are all important, solving for the optimal object-track combination of each field of view with a first preset function, thereby determining the track combination corresponding to the optimal solution;
and generating the video summary from the optimal object-track combinations of all the fields of view.
As a preferred embodiment, the following first and second preset functions are provided to further illustrate the present invention. In the embodiment of the present invention, the first preset function uses a complex transfer mapping energy function to solve for the optimal object-track combination of each field of view, which may be computed with the following formula:
E(MAP) = E_a(BO) + α*E_tps(BO) + β*E_ntps(BO) + γ*E_tc(BO) + λ*E_tct(BO)
where E(MAP) is the complex transfer mapping energy function; BO is the set of object tracks within the important field of view; E_a(BO) is the activity energy cost, a penalty incurred if a target does not appear in the summary video; E_tps(BO) is the relative positive-order cost, a penalty incurred if a target is added to the summary video out of its original order; E_ntps(BO) is the relative reverse-order cost, a penalty incurred when two temporally related objects are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, a penalty incurred when the tracks of two objects that do not collide in the original video collide in the summary result; E_tct(BO) is the true-collision cost, under which a collision in the summary result between two objects that also collide in the original video incurs no penalty, E_tct(BO) taking a negative value; and α, β, γ, λ are preset weighting coefficients whose specific values can be chosen according to the needs of the actual scene.
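The five cost terms are characterized above only in prose, so the sketch below keeps them as caller-supplied functions and shows only the weighted sum and how a combination would be selected; everything beyond the formula itself is an assumption of this sketch.

```python
def complex_energy(BO, E_a, E_tps, E_ntps, E_tc, E_tct,
                   alpha, beta, gamma, lam):
    # Weighted sum from the formula above; E_tct contributes a negative
    # (rewarding) value for collisions preserved from the original video.
    return (E_a(BO)
            + alpha * E_tps(BO)
            + beta * E_ntps(BO)
            + gamma * E_tc(BO)
            + lam * E_tct(BO))

def best_combination(candidates, **terms_and_weights):
    # The optimal object-track combination is the candidate set BO that
    # minimizes the energy.
    return min(candidates,
               key=lambda BO: complex_energy(BO, **terms_and_weights))
```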
FIG. 2 is a first application diagram of the video summary generation method according to an embodiment of the present invention. This application mainly targets complex motion scenes, in which the moving objects are relatively large and relatively numerous. As shown in FIG. 2, the application is implemented through the following steps:
Step 201: initialize the number of fields of view.
That is, the original video is divided into a plurality of fields of view; the exact number may be determined according to actual needs, for example 3 or 5.
Step 202: calculate the direction of the fields of view.
Specifically, the direction of the fields of view is calculated from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding fields of view are horizontal or vertical, respectively.
Step 203: calculate the field of view to which each object track belongs.
Specifically, the proximity of each object track to each field of view may be calculated with a distance calculation formula, and each object track contained in the original video is assigned to the field of view to which it is closest.
Step 204: update the straight-line model of the field of view.
Specifically, after each object track has been added to a field of view, the line-segment feature of that closest field of view may be updated from the coordinates of the track's start and end points, ready for the next object track to be added.
Step 205: calculate the activity index of each field of view.
Specifically, the activity index of a field of view is computed from the activity levels of the object tracks within it.
Step 206: compare the activity index of each field of view with the preset threshold.
A field of view whose activity index is greater than the preset threshold is judged important; otherwise it is judged secondary.
Step 207: process the object tracks with the first preset function.
Specifically, owing to the particular scenes in this application, the computed fields of view are all important; the first preset function is used to solve for the optimal object-track combination of each field of view, the track combination corresponding to the optimal solution is determined, and the video summary is generated.
Example two
As shown in FIG. 1 and FIG. 3, this embodiment of the present invention includes steps 101, 102, 103 and 104 of the first embodiment, but implements step 104 in a different manner. The parts identical to the first embodiment are not repeated; only the differences are described below:
Specifically, step 104 in this embodiment of the present invention comprises:
if the plurality of fields of view are all secondary, solving for the optimal object-track combination of each field of view with a second preset function, thereby determining the track combination corresponding to the optimal solution;
and generating the video summary from the optimal object-track combinations of all the fields of view.
As a preferred implementation, the second preset function in this embodiment uses a simple transfer mapping energy function, simple relative to the complex transfer mapping energy function of the first embodiment, to solve for the optimal object-track combination of each field of view. The formula itself appears only as an image in the published text; it is evaluated over pairs of moving object trajectories b_m and b_b within the secondary field of view, with γ a preset weight coefficient whose specific value can be chosen according to the actual scene.
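Reading the lost formula as a pairwise, collision-only energy over the secondary view's tracks (an assumption suggested by the b_m, b_b pairing and the single weight γ), a sketch is:

```python
from itertools import combinations

def simple_energy(tracks, collision_cost, gamma):
    # tracks: object trajectories of one secondary field of view;
    # collision_cost(b_m, b_b): penalty when the two (time-shifted)
    # trajectories collide in the summary.
    return sum(gamma * collision_cost(b_m, b_b)
               for b_m, b_b in combinations(tracks, 2))
```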
FIG. 3 is a second application diagram of the video summary generation method according to an embodiment of the present invention. This application mainly targets simple motion scenes, in which the moving objects are relatively small and relatively few. As shown in FIG. 3, the application is implemented through the following steps:
Step 301: initialize the number of fields of view.
That is, the original video is divided into a plurality of fields of view; the exact number may be determined according to actual needs, for example 3 or 5.
Step 302: calculate the direction of the fields of view.
Specifically, the direction of the fields of view is calculated from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding fields of view are horizontal or vertical, respectively.
Step 303: calculate the field of view to which each object track belongs.
Specifically, the proximity of each object track to each field of view may be calculated with a distance calculation formula, and each object track contained in the original video is assigned to the field of view to which it is closest.
Step 304: update the straight-line model of the field of view.
Specifically, after each object track has been added to a field of view, the line-segment feature of that closest field of view may be updated from the coordinates of the track's start and end points, ready for the next object track to be added.
Step 305: calculate the activity index of each field of view.
Specifically, the activity index of a field of view is computed from the activity levels of the object tracks within it.
Step 306: compare the activity index of each field of view with the preset threshold.
A field of view whose activity index is greater than the preset threshold is judged important; otherwise it is judged secondary.
Step 307: process the object tracks with the second preset function.
Specifically, owing to the particular scenes in this application, the computed fields of view are all secondary; the second preset function is used to solve for the optimal object-track combination of each field of view, the track combination corresponding to the optimal solution is determined, and the video summary is generated.
Example three
As shown in FIG. 1 and FIG. 4, this embodiment of the present invention includes steps 101, 102, 103 and 104 of the first embodiment, but implements step 104 in a different manner. The parts identical to the first embodiment are not repeated; only the differences are described below:
Specifically, step 104 in this embodiment of the present invention comprises:
if the plurality of fields of view include both important and secondary fields of view: when two important fields of view are adjacent, merging them into one important field of view and solving for the optimal object-track combination of the merged field of view with a first preset function; when the important fields of view are not adjacent to each other, solving for the optimal object-track combination of each important field of view separately with the first preset function, thereby determining the track combination corresponding to each optimal solution; and solving for the optimal object-track combination of each secondary field of view with a second preset function, thereby determining the track combination corresponding to each optimal solution;
and generating the video summary from the optimal object-track combinations of all the fields of view.
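A sketch of the adjacency handling, assuming the fields of view are listed in spatial order so that adjacency in the scene is adjacency in the list:

```python
def merge_adjacent_important(views):
    # views: [{"important": bool, "tracks": [...]}, ...] in spatial order.
    merged = []
    for v in views:
        if v["important"] and merged and merged[-1]["important"]:
            # Two adjacent important views collapse into one.
            merged[-1]["tracks"] += v["tracks"]
        else:
            merged.append({"important": v["important"],
                           "tracks": list(v["tracks"])})
    return merged
# Afterwards, important views go to the first preset function and secondary
# views to the second, each view solved independently (hence in parallel).
```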
The optimal object-track combination within each important field of view can be solved with the first preset function, after which the track combination corresponding to the optimal solution is determined; a prior-art function may also be used for this purpose. As a preferred implementation, the first preset function in this embodiment uses the complex transfer mapping energy function to solve for the optimal object-track combination of each field of view, which may be computed with the following formula:
E(MAP) = E_a(BO) + α*E_tps(BO) + β*E_ntps(BO) + γ*E_tc(BO) + λ*E_tct(BO)
where E(MAP) is the complex transfer mapping energy function; BO is the set of object tracks within the important field of view; E_a(BO) is the activity energy cost, a penalty incurred if a target does not appear in the summary video; E_tps(BO) is the relative positive-order cost, a penalty incurred if a target is added to the summary video out of its original order; E_ntps(BO) is the relative reverse-order cost, a penalty incurred when two temporally related objects are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, a penalty incurred when the tracks of two objects that do not collide in the original video collide in the summary result; E_tct(BO) is the true-collision cost, under which a collision in the summary result between two objects that also collide in the original video incurs no penalty, E_tct(BO) taking a negative value; and α, β, γ, λ are preset weighting coefficients whose specific values can be chosen according to the needs of the actual scene.
The optimal object-track combination within each secondary field of view can be solved with the second preset function, after which the track combination corresponding to the optimal solution is determined; a prior-art function may also be used for this purpose. As a preferred implementation, the second preset function in this embodiment uses the simple transfer mapping energy function, simple relative to the complex transfer mapping energy function of the first embodiment, to solve for the optimal object-track combination of each field of view; as described in the second embodiment, it is evaluated over pairs of moving object trajectories b_m and b_b within the secondary field of view, with γ a preset weight coefficient whose specific value can be chosen according to the actual scene.
FIG. 4 is a third application diagram of the video summary generation method according to an embodiment of the present invention. This application mainly targets structurally complex motion scenes in which the moving objects are irregular: in some regions the motion is simple and the objects are few, while in other regions the relative motion is complex. As shown in FIG. 4, the application is implemented through the following steps:
Step 401: initialize the number of fields of view.
That is, the original video is divided into a plurality of fields of view; the exact number may be determined according to actual needs, for example 3 or 5.
Step 402: calculate the direction of the fields of view.
Specifically, the direction of the fields of view is calculated from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding fields of view are horizontal or vertical, respectively.
Step 403: calculate the field of view to which each object track belongs.
Specifically, the proximity of each object track to each field of view may be calculated with a distance calculation formula, and each object track contained in the original video is assigned to the field of view to which it is closest.
Step 404: update the straight-line model of the field of view.
Specifically, after each object track has been added to a field of view, the line-segment feature of that closest field of view may be updated from the coordinates of the track's start and end points, ready for the next object track to be added.
Step 405: calculate the activity index of each field of view.
Specifically, the activity index of a field of view is computed from the activity levels of the object tracks within it.
Step 406: compare the activity index of each field of view with the preset threshold.
A field of view whose activity index is greater than the preset threshold is judged important; otherwise it is judged secondary.
Step 407: check whether two important fields of view are adjacent.
If two important fields of view are adjacent to each other, continue with step 408.
Step 408: merge, i.e. merge the two adjacent important fields of view.
Step 409: process the object tracks in the important fields of view with the first preset function;
Step 410: process the object tracks in the secondary fields of view with the second preset function;
and finally, generate the video summary from the optimal object-track combinations of all the fields of view.
Example four
As shown in FIG. 1 and FIG. 5, this embodiment of the present invention includes steps 101, 102, 103 and 104 of the first embodiment, but implements step 104 in a different manner. The parts identical to the first embodiment are not repeated; only the differences are described below:
Specifically, step 104 in this embodiment of the present invention comprises:
if the plurality of fields of view include both important and secondary fields of view: when two important fields of view are adjacent, merging them into one important field of view and solving for the optimal object-track combination of the merged field of view with a first preset function; when the important fields of view are not adjacent to each other, solving for the optimal object-track combination of each important field of view separately with the first preset function, thereby determining the track combination corresponding to each optimal solution; and copying the object tracks in the secondary fields of view into the background image as they appear in the original video;
and generating the video summary from the optimal object-track combinations of all the fields of view.
The optimal object-track combination within each important field of view can be solved with the first preset function, after which the track combination corresponding to the optimal solution is determined; a prior-art function may also be used for this purpose. As a preferred implementation, the first preset function in this embodiment uses the complex transfer mapping energy function to solve for the optimal object-track combination of each field of view, which may be computed with the following formula:
E(MAP) = E_a(BO) + α*E_tps(BO) + β*E_ntps(BO) + γ*E_tc(BO) + λ*E_tct(BO)
where E(MAP) is the complex transfer mapping energy function; BO is the set of object tracks within the important field of view; E_a(BO) is the activity energy cost, a penalty incurred if a target does not appear in the summary video; E_tps(BO) is the relative positive-order cost, a penalty incurred if a target is added to the summary video out of its original order; E_ntps(BO) is the relative reverse-order cost, a penalty incurred when two temporally related objects are added to the summary video in reverse order; E_tc(BO) is the pseudo-collision cost, a penalty incurred when the tracks of two objects that do not collide in the original video collide in the summary result; E_tct(BO) is the true-collision cost, under which a collision in the summary result between two objects that also collide in the original video incurs no penalty, E_tct(BO) taking a negative value; and α, β, γ, λ are preset weighting coefficients whose specific values can be chosen according to the needs of the actual scene.
The object tracks in the secondary fields of view are copied into the background image as they appear in the original video, and the video summary is finally generated.
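A sketch of this copy step with NumPy: each object patch from the original video is pasted into the background at its original position, with a boolean mask selecting the object's pixels (the patch/mask representation is an assumption of this sketch).

```python
import numpy as np

def paste_secondary_tracks(background: np.ndarray, patches) -> np.ndarray:
    # patches: iterable of (x, y, patch, mask) cut from the original video;
    # mask is a boolean array marking the object's pixels inside the patch.
    out = background.copy()
    for x, y, patch, mask in patches:
        h, w = patch.shape[:2]
        region = out[y:y + h, x:x + w]
        region[mask] = patch[mask]  # writes through to `out`
    return out
```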
FIG. 5 is a fourth application diagram of the video summary generation method according to an embodiment of the present invention. This application mainly targets structurally complex motion scenes in which the moving targets are irregular: in some regions the motion is simple and the targets are few, while in other regions the relative motion is complex. As shown in FIG. 5, the application is implemented through the following steps:
Step 501: initialize the number of fields of view.
That is, the original video is divided into a plurality of fields of view; the exact number may be determined according to actual needs, for example 3 or 5.
Step 502: calculate the direction of the fields of view.
Specifically, the direction of the fields of view is calculated from the direction of the scene in the original video: if the scene direction is horizontal or vertical, the corresponding fields of view are horizontal or vertical, respectively.
Step 503: calculate the field of view to which each object track belongs.
Specifically, the proximity of each object track to each field of view may be calculated with a distance calculation formula, and each object track contained in the original video is assigned to the field of view to which it is closest.
Step 504: update the straight-line model of the field of view.
Specifically, after each object track has been added to a field of view, the line-segment feature of that closest field of view may be updated from the coordinates of the track's start and end points, ready for the next object track to be added.
Step 505: calculate the activity index of each field of view.
Specifically, the activity index of a field of view is computed from the activity levels of the object tracks within it.
Step 506: compare the activity index of each field of view with the preset threshold.
A field of view whose activity index is greater than the preset threshold is judged important; otherwise it is judged secondary.
Step 507: check whether two important fields of view are adjacent.
If two important fields of view are adjacent to each other, continue with step 508.
Step 508: merge, i.e. merge the two adjacent important fields of view.
Step 509: process the object tracks in the important fields of view with the first preset function;
Step 510: copy the object tracks in the secondary fields of view into the background image as they appear in the original video;
and finally, generate the video summary from the optimal object-track combinations of all the fields of view.
Example five
As shown in FIG. 6, an embodiment of the present invention further provides a video summary generation apparatus, the apparatus 60 comprising:
a first dividing module 61, configured to divide an original video into a plurality of fields of view;
a classification module 62, configured to assign each object track contained in the original video to the field of view to which it is closest, according to the proximity of the track to each field of view;
a second dividing module 63, configured to compute an activity index for each field of view from the activity levels of the object tracks within it, and to classify each field of view as important or secondary according to whether the activity index exceeds a preset threshold;
and a merge processing module 64, configured to process the object tracks in the important and secondary fields of view in parallel and to merge the processed fields of view to generate the video summary.
Wherein the first dividing module 61 comprises: a first calculation unit, configured to determine the direction of the scene in the original video; and a first dividing unit, configured to divide the original video into a plurality of fields of view according to the direction of the scene, the directions of the fields of view being consistent with the direction of the scene.
Wherein the first calculation unit comprises: a first acquisition unit, configured to acquire the start and end points of a plurality of object tracks in the scene of the original video; a difference calculation unit, configured to calculate the coordinate difference between the start and end points of each object track and determine the direction of the track; and a judging unit, configured to judge the direction of the scene in the original video from the direction of the majority of the object tracks, the direction of the scene being consistent with the direction taken by most of the tracks.
Wherein the classification module 62 comprises: a second acquisition unit, configured to acquire a line-segment feature for each field of view, the line-segment feature comprising the coordinates of the start and end points of the field of view and the number of object tracks it contains; a distance calculation unit, configured to acquire the coordinates of the start and end points of an object track and calculate the proximity of the track to each field of view; a first classification unit, configured to assign each object track contained in the original video to the field of view to which it is closest, according to that proximity;
and an updating unit, configured to update the line-segment feature of that closest field of view with the coordinates of the track's start and end points.
Wherein the second dividing module 63 comprises: an activity index calculation unit, in which the activity level of a track is positively correlated with the object area corresponding to the track and with the track's duration, and the activity index of a field of view is obtained by summing the activity levels of all object tracks within it; and a second dividing unit, configured to classify each field of view as important or secondary according to whether its activity index exceeds a preset threshold.
Optionally, the merge processing module 64 comprises: a first merging unit, configured to solve, if the plurality of fields of view are all important, for the optimal object-track combination of each field of view with a first preset function, thereby determining the track combination corresponding to the optimal solution; and a first processing unit, configured to generate the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the merge processing module 64 comprises: a second merging unit, configured to solve, if the plurality of fields of view are all secondary, for the optimal object-track combination of each field of view with a second preset function, thereby determining the track combination corresponding to the optimal solution; and a second processing unit, configured to generate the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the merge processing module 64 comprises: a third merging unit, configured, if the plurality of fields of view include both important and secondary fields of view, to merge two adjacent important fields of view into one important field of view and solve for the optimal object-track combination of the merged field of view with the first preset function; to solve, if the important fields of view are not adjacent to each other, for the optimal object-track combination of each important field of view separately with the first preset function, thereby determining the track combination corresponding to each optimal solution; and to solve for the optimal object-track combination of each secondary field of view with the second preset function, thereby determining the track combination corresponding to each optimal solution; and a third processing unit, configured to generate the video summary from the optimal object-track combinations of all the fields of view.
Optionally, the merge processing module 64 comprises: a fourth merging unit, configured, if the plurality of fields of view include both important and secondary fields of view, to merge two adjacent important fields of view into one important field of view and solve for the optimal object-track combination of the merged field of view with the first preset function; to solve, if the important fields of view are not adjacent to each other, for the optimal object-track combination of each important field of view separately with the first preset function, thereby determining the track combination corresponding to each optimal solution; and to copy the object tracks in the secondary fields of view into the background image as they appear in the original video; and a fourth processing unit, configured to merge all the fields of view according to these processing results to generate the video summary.
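Structurally, the apparatus is the method's four modules composed in sequence; a minimal wiring sketch (all names illustrative, not the patent's):

```python
class VideoSummaryDevice:
    # Mirrors apparatus 60 of FIG. 6: modules 61-64 composed in order.
    def __init__(self, divider, classifier, grader, merger):
        self.divider = divider        # first dividing module 61
        self.classifier = classifier  # classification module 62
        self.grader = grader          # second dividing module 63
        self.merger = merger          # merge processing module 64

    def run(self, video):
        views = self.divider(video)
        views = self.classifier(video, views)
        important, secondary = self.grader(views)
        return self.merger(important, secondary)
```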
In the video summary generation method of the embodiments of the present invention, processing the object tracks in the important and secondary fields of view in parallel reduces the amount of computation required for track combination, speeds up the computation, and lets the user focus more simply and clearly on the main targets in the important fields of view.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (16)

1. A method for generating a video summary, characterized by comprising the following steps:
dividing an original video into a plurality of fields of view, comprising:
determining the direction of a scene in the original video;
dividing the original video into a plurality of fields of view according to the direction of the scene, wherein the directions of the fields of view are consistent with the direction of the scene;
assigning each object track contained in the original video to the field of view to which it is closest, according to the proximity of the object track to each field of view;
computing an activity index for each field of view from the activity levels of the object tracks within it, and classifying each field of view as important or secondary according to whether the activity index exceeds a preset threshold;
and processing the object tracks in each important field of view and each secondary field of view in parallel, and merging the processed fields of view to generate the video summary.
2. The method of claim 1, wherein determining the direction of the scene in the original video comprises:
acquiring the initial point and the end point of each of a plurality of object tracks in the scene of the original video;
calculating the coordinate difference between the initial point and the end point of each object track to determine the direction of the object track; and
determining the direction of the scene in the original video according to the direction of the majority of the object tracks, wherein the direction of the scene is consistent with the direction of the majority of the object tracks.
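By way of illustration only, the direction estimate of claim 2 can be sketched in Python. The (start, end) coordinate-pair representation of a track and the quantization of directions into horizontal/vertical are assumptions made for the example; the claim fixes only the coordinate-difference calculation and the majority rule.

```python
# A minimal sketch of the scene-direction estimate in claim 2 (illustrative).
from collections import Counter

def track_direction(start, end):
    """Classify a track from the coordinate difference of its endpoints."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    return "horizontal" if abs(dx) >= abs(dy) else "vertical"

def scene_direction(tracks):
    """The scene direction follows the majority of the track directions."""
    votes = Counter(track_direction(s, e) for s, e in tracks)
    return votes.most_common(1)[0][0]

# Three mostly horizontal tracks outvote one vertical track.
tracks = [((0, 0), (100, 5)), ((10, 40), (90, 38)),
          ((0, 60), (120, 55)), ((50, 0), (52, 80))]
print(scene_direction(tracks))  # -> 'horizontal'
```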
3. The method of claim 1, wherein dividing each object track contained in the original video into the view closest to that object track according to the degree of proximity between the object track and each view comprises:
acquiring a line segment feature for each view, the line segment feature comprising: the coordinates of the start and stop points of the view and the number of object tracks contained in the view;
acquiring the coordinates of the start and stop points of an object track, and calculating the degree of proximity between the object track and each view;
dividing each object track contained in the original video into its closest view according to the degree of proximity; and
updating the line segment feature of the closest view according to the coordinates of the start and stop points of the object track.
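The nearest-view assignment of claim 3 can likewise be sketched under stated assumptions: a summed endpoint distance stands in for the degree of proximity, and a running-mean update of the segment endpoints stands in for the feature update; neither choice is fixed by the claim.

```python
# A sketch of the track-to-view assignment in claim 3 (proximity measure
# and update rule are illustrative assumptions).
from dataclasses import dataclass
import math

@dataclass
class View:
    start: tuple       # line segment feature: start-point coordinates
    stop: tuple        # line segment feature: stop-point coordinates
    n_tracks: int = 0  # line segment feature: number of tracks assigned

def proximity(t_start, t_stop, view):
    """Smaller is closer: summed endpoint distances to the view's segment."""
    return math.dist(t_start, view.start) + math.dist(t_stop, view.stop)

def assign_track(t_start, t_stop, views):
    closest = min(views, key=lambda v: proximity(t_start, t_stop, v))
    # Update the closest view's segment feature (running mean of endpoints).
    n = closest.n_tracks
    closest.start = tuple((n * a + b) / (n + 1) for a, b in zip(closest.start, t_start))
    closest.stop = tuple((n * a + b) / (n + 1) for a, b in zip(closest.stop, t_stop))
    closest.n_tracks += 1
    return closest

views = [View((0, 20), (200, 20)), View((0, 80), (200, 80))]
chosen = assign_track((5, 18), (190, 25), views)
print(views.index(chosen), chosen.n_tracks)  # -> 0 1
```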
4. The method of claim 1, wherein computing the activity index for each view according to the activity degree of the object tracks in the view, and dividing the views into important views and secondary views according to whether the activity index exceeds a preset threshold, comprises:
computing the activity index of each view by summing the activity degrees of all the object tracks in the view, wherein the activity degree of an object track is positively correlated with the object area corresponding to the object track and with the duration of the object track; and
dividing the views into important views and secondary views according to whether the activity index exceeds the preset threshold.
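For claim 4, one concrete and purely illustrative choice of activity degree is the product of object area and track duration, which satisfies the positive-correlation requirement; any other positively correlated combination would equally fit the claim language.

```python
# A sketch of the activity index in claim 4; area * duration is an
# illustrative activity degree, not the one mandated by the claim.

def activity_index(view_tracks):
    """Sum the activity degrees of all object tracks in one view."""
    return sum(area * duration for area, duration in view_tracks)

def classify_views(views, threshold):
    """Split views into important / secondary by the activity index."""
    important = [v for v in views if activity_index(v) > threshold]
    secondary = [v for v in views if activity_index(v) <= threshold]
    return important, secondary

# Each view is a list of (object_area, duration) pairs.
views = [[(400, 12), (250, 8)], [(80, 3)]]
print(classify_views(views, threshold=1000))
# -> ([[(400, 12), (250, 8)]], [[(80, 3)]])
```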
5. The method of claim 1, wherein processing the object tracks in each important view and each secondary view in parallel, and combining the views obtained after the parallel processing to generate the video summary, comprises:
if the plurality of views are all important views, solving for the optimal solution of the object track combination of each view separately using a first preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and
generating the video summary according to the optimal object track combinations of all the views.
6. The method of claim 1, wherein processing the object tracks in each important view and each secondary view in parallel, and combining the views obtained after the parallel processing to generate the video summary, comprises:
if the plurality of views are all secondary views, solving for the optimal solution of the object track combination of each view separately using a second preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and
generating the video summary according to the optimal object track combinations of all the views.
7. The method of claim 1, wherein processing the object tracks in each important view and each secondary view in parallel, and combining the views obtained after the parallel processing to generate the video summary, comprises:
if the plurality of views include both important views and secondary views: merging two important views into one important view when they are adjacent, and solving for the optimal solution of the object track combination of the merged important view using a first preset function; when important views are not adjacent to each other, solving for the optimal solution of the object track combination of each important view separately using the first preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and solving for the optimal solution of the object track combination of each secondary view separately using a second preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and
generating the video summary according to the optimal object track combinations of all the views.
8. The method of claim 1, wherein processing the object tracks in each important view and each secondary view in parallel, and combining the views obtained after the parallel processing to generate the video summary, comprises:
if the plurality of views include both important views and secondary views: merging two important views into one important view when they are adjacent, and solving for the optimal solution of the object track combination of the merged important view using a first preset function; when important views are not adjacent to each other, solving for the optimal solution of the object track combination of each important view separately using the first preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and copying the object tracks in the secondary views into a background image obtained from the original video; and
combining all the views according to the processing results to generate the video summary.
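Claims 5 through 8 amount to one dispatch over the view types, sketched below. The dictionary representation of a view (an integer position along the scene direction plus a track list), the thread pool, and the two stand-in functions f1 and f2 are all assumptions for the example; the claims fix only that adjacent important views are merged, that the first preset function handles important views, and that secondary views get the cheaper second function (claims 5-7) or a direct background copy (claim 8).

```python
# A sketch of the parallel per-view processing in claims 5-8 (illustrative).
from concurrent.futures import ThreadPoolExecutor

def merge_adjacent(important):
    """Merge runs of adjacent important views into one, as in claims 7/8."""
    merged, run = [], []
    for view in sorted(important, key=lambda v: v["pos"]):
        if run and view["pos"] == run[-1]["pos"] + 1:
            run.append(view)
        else:
            if run:
                merged.append(run)
            run = [view]
    if run:
        merged.append(run)
    return [{"pos": r[0]["pos"], "tracks": sum((v["tracks"] for v in r), [])}
            for r in merged]

def process_views(important, secondary, f1, f2):
    """Apply f1 to (merged) important views and f2 to secondary views in
    parallel, returning per-view results ready to be combined."""
    groups = merge_adjacent(important)
    with ThreadPoolExecutor() as pool:
        return list(pool.map(f1, groups)) + list(pool.map(f2, secondary))

def f1(view):  # stand-in for the first preset function
    return sorted(view["tracks"])

def f2(view):  # stand-in for the second preset function / background copy
    return view["tracks"]

important = [{"pos": 0, "tracks": [3, 1]}, {"pos": 1, "tracks": [2]}]
secondary = [{"pos": 3, "tracks": [9]}]
print(process_views(important, secondary, f1, f2))  # -> [[1, 2, 3], [9]]
```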
9. A video summary generation apparatus, characterized by comprising:
a first partitioning module, configured to divide an original video into a plurality of views, wherein the first partitioning module comprises:
a first computing unit, configured to determine the direction of a scene in the original video; and
a first dividing unit, configured to divide the original video into a plurality of views according to the direction of the scene, wherein the directions of the views are consistent with the direction of the scene;
a classification module, configured to divide each object track contained in the original video into the view closest to that object track, according to the degree of proximity between the object track and each view;
a second partitioning module, configured to compute an activity index for each view according to the activity degree of the object tracks in the view, and to divide the views into important views and secondary views according to whether the activity index exceeds a preset threshold; and
a merge processing module, configured to process the object tracks in each important view and each secondary view in parallel, and to combine the views obtained after the parallel processing to generate the video summary.
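For orientation, the module structure of claim 9 maps naturally onto a small class skeleton; every name below is hypothetical and stands in for the claimed modules rather than reproducing any actual implementation.

```python
# Hypothetical skeleton mirroring the apparatus of claim 9 (names illustrative).

class VideoSummaryApparatus:
    def __init__(self, first_partitioning, classification,
                 second_partitioning, merge_processing):
        self.first_partitioning = first_partitioning    # video -> views
        self.classification = classification            # tracks -> closest views
        self.second_partitioning = second_partitioning  # views -> important/secondary
        self.merge_processing = merge_processing        # parallel merge -> summary

    def summarize(self, video):
        views = self.first_partitioning(video)
        self.classification(video, views)
        important, secondary = self.second_partitioning(views)
        return self.merge_processing(important, secondary)
```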
10. The apparatus of claim 9, wherein the first computing unit comprises:
a first acquisition unit, configured to acquire the initial point and the end point of each of a plurality of object tracks in the scene of the original video;
a difference value calculation unit, configured to calculate the coordinate difference between the initial point and the end point of each object track and to determine the direction of the object track; and
a judging unit, configured to determine the direction of the scene in the original video according to the direction of the majority of the object tracks, wherein the direction of the scene is consistent with the direction of the majority of the object tracks.
11. The apparatus of claim 9, wherein the classification module comprises:
a second acquisition unit, configured to acquire a line segment feature of each view, the line segment feature comprising: the coordinates of the start and stop points of the view and the number of object tracks contained in the view;
a distance calculation unit, configured to acquire the coordinates of the start and stop points of an object track and to calculate the degree of proximity between the object track and each view;
a first classification unit, configured to divide each object track contained in the original video into its closest view according to the degree of proximity; and
an updating unit, configured to update the line segment feature of the closest view according to the coordinates of the start and stop points of the object track.
12. The apparatus of claim 9, wherein the second partitioning module comprises:
an activity index calculation unit, configured to compute the activity index of each view by summing the activity degrees of all the object tracks in the view, wherein the activity degree of an object track is positively correlated with the object area corresponding to the object track and with the duration of the object track; and
a second dividing unit, configured to divide the views into important views and secondary views according to whether the activity index exceeds the preset threshold.
13. The apparatus of claim 9, wherein the merge processing module comprises:
a first merging unit, configured to, if the plurality of views are all important views, solve for the optimal solution of the object track combination of each view separately using a first preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and
a first processing unit, configured to generate the video summary according to the optimal object track combinations of all the views.
14. The apparatus of claim 9, wherein the merge processing module comprises:
a second merging unit, configured to, if the plurality of views are all secondary views, solve for the optimal solution of the object track combination of each view separately using a second preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and
a second processing unit, configured to generate the video summary according to the optimal object track combinations of all the views.
15. The apparatus of claim 9, wherein the merge processing module comprises:
a third merging unit, configured to, if the plurality of views include both important views and secondary views: merge two important views into one important view when they are adjacent, and solve for the optimal solution of the object track combination of the merged important view using a first preset function; when important views are not adjacent to each other, solve for the optimal solution of the object track combination of each important view separately using the first preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and solve for the optimal solution of the object track combination of each secondary view separately using a second preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and
a third processing unit, configured to generate the video summary according to the optimal object track combinations of all the views.
16. The apparatus of claim 9, wherein the merge processing module comprises:
a fourth merging unit, configured to, if the plurality of views include both important views and secondary views: merge two important views into one important view when they are adjacent, and solve for the optimal solution of the object track combination of the merged important view using a first preset function; when important views are not adjacent to each other, solve for the optimal solution of the object track combination of each important view separately using the first preset function, thereby determining the optimal object track combination corresponding to the optimal solution; and copy the object tracks in the secondary views into a background image obtained from the original video; and
a fourth processing unit, configured to combine all the views according to the processing results to generate the video summary.
CN201410570690.4A 2014-10-23 2014-10-23 Video abstract generation method and device Active CN105530554B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410570690.4A CN105530554B (en) 2014-10-23 2014-10-23 Video abstract generation method and device
PCT/CN2014/094701 WO2015184768A1 (en) 2014-10-23 2014-12-23 Method and device for generating video abstract

Publications (2)

Publication Number Publication Date
CN105530554A (en) 2016-04-27
CN105530554B (en) 2020-08-07

Family

ID=54766027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410570690.4A Active CN105530554B (en) 2014-10-23 2014-10-23 Video abstract generation method and device

Country Status (2)

Country Link
CN (1) CN105530554B (en)
WO (1) WO2015184768A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227759B (en) * 2016-07-14 2019-09-13 中用科技有限公司 A kind of method and device of dynamic generation video frequency abstract
CN108959312B (en) 2017-05-23 2021-01-29 华为技术有限公司 Method, device and terminal for generating multi-document abstract
CN107995535B (en) * 2017-11-28 2019-11-26 百度在线网络技术(北京)有限公司 A kind of method, apparatus, equipment and computer storage medium showing video
CN110505534B (en) * 2019-08-26 2022-03-08 腾讯科技(深圳)有限公司 Monitoring video processing method, device and storage medium
CN111526434B (en) * 2020-04-24 2021-05-18 西北工业大学 Converter-based video abstraction method
CN112884808B (en) * 2021-01-26 2022-04-22 石家庄铁道大学 Video concentrator set partitioning method for reserving target real interaction behavior

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092963A (en) * 2013-01-21 2013-05-08 信帧电子技术(北京)有限公司 Video abstract generating method and device
CN103686453A (en) * 2013-12-23 2014-03-26 苏州千视通信科技有限公司 Method for improving video abstract accuracy by dividing areas and setting different particle sizes
JP5600040B2 (en) * 2010-07-07 2014-10-01 日本電信電話株式会社 Video summarization apparatus, video summarization method, and video summarization program

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8699806B2 (en) * 2006-04-12 2014-04-15 Google Inc. Method and apparatus for automatically summarizing video
US8503523B2 (en) * 2007-06-29 2013-08-06 Microsoft Corporation Forming a representation of a video item and use thereof
US8432965B2 (en) * 2010-05-25 2013-04-30 Intellectual Ventures Fund 83 Llc Efficient method for assembling key video snippets to form a video summary
CN102375816B (en) * 2010-08-10 2016-04-20 中国科学院自动化研究所 A kind of Online Video enrichment facility, system and method
CN102256065B (en) * 2011-07-25 2012-12-12 中国科学院自动化研究所 Automatic video condensing method based on video monitoring network
CN103092925B (en) * 2012-12-30 2016-02-17 信帧电子技术(北京)有限公司 A kind of video abstraction generating method and device
CN103200463A (en) * 2013-03-27 2013-07-10 天脉聚源(北京)传媒科技有限公司 Method and device for generating video summary
CN103345764B (en) * 2013-07-12 2016-02-10 西安电子科技大学 A kind of double-deck monitor video abstraction generating method based on contents of object

Similar Documents

Publication Publication Date Title
CN105530554B (en) Video abstract generation method and device
US11643076B2 (en) Forward collision control method and apparatus, electronic device, program, and medium
CN104200237B (en) One kind being based on the High-Speed Automatic multi-object tracking method of coring correlation filtering
CN105513349B (en) Mountainous area highway vehicular events detection method based on double-visual angle study
CN103593679A (en) Visual human-hand tracking method based on online machine learning
CN104216925A (en) Repetition deleting processing method for video content
CN110827312A (en) Learning method based on cooperative visual attention neural network
CN102842036A (en) Intelligent multi-target detection method facing ship lock video monitoring
CN103985257A (en) Intelligent traffic video analysis method
CN107358622A (en) A kind of video information processing method and system based on visualization movement locus
CN108471497A (en) A kind of ship target real-time detection method based on monopod video camera
CN111191531A (en) Rapid pedestrian detection method and system
CN110688873A (en) Multi-target tracking method and face recognition method
CN109299700A (en) Subway group abnormality behavioral value method based on crowd density analysis
CN112149471A (en) Loopback detection method and device based on semantic point cloud
CN111738085B (en) System construction method and device for realizing automatic driving simultaneous positioning and mapping
CN116523970B (en) Dynamic three-dimensional target tracking method and device based on secondary implicit matching
CN106683113B (en) Feature point tracking method and device
CN112699842A (en) Pet identification method, device, equipment and computer readable storage medium
CN115330841A (en) Method, apparatus, device and medium for detecting projectile based on radar map
CN113963310A (en) People flow detection method and device for bus station and electronic equipment
CN104182990A (en) A method for acquiring a sequence image motion target area in real-time
Luo et al. An improved moment-preserving auto threshold image segmentation algorithm
Wang et al. AGR-Fcn: Adversarial generated region based on fully convolutional networks for single-and multiple-instance object detection
CN114419453B (en) Group target detection method based on electromagnetic scattering characteristics and topological configuration

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200715

Address after: No. 68 Bauhinia Road, Yuhuatai District, Nanjing, Jiangsu 210012

Applicant after: Nanjing Zhongxing New Software Co.,Ltd.

Address before: Legal Affairs Department, ZTE Building, Keji South Road, Hi-tech Industrial Park, Nanshan District, Shenzhen, Guangdong 518057

Applicant before: ZTE Corp.

GR01 Patent grant