WO2022206312A1 - Automatic cropping method and apparatus for panoramic video, and terminal and storage medium


Info

Publication number
WO2022206312A1
Authority
WO
WIPO (PCT)
Prior art keywords
video frame
time point
video
point pair
timestamp
Prior art date
Application number
PCT/CN2022/079779
Other languages
French (fr)
Chinese (zh)
Inventor
万顺
Original Assignee
影石创新科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 影石创新科技股份有限公司
Publication of WO2022206312A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440281Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content

Definitions

  • the invention belongs to the technical field of video processing, and in particular relates to an automatic editing method, device, terminal and storage medium for panoramic video.
  • One-shot (a single continuous take) is widely used in video post-processing.
  • unnecessary video clips (for example, clips containing obstacles) are removed: multiple video clips are cut out and then joined together using software, so that the resulting video feels as if the clips were shot in one take, without obvious frame skipping.
  • an embodiment of the present invention provides an automatic editing method for panoramic video, the method includes the following steps:
  • before the step of obtaining the second time point pair by using a preset feature matching algorithm based on the first time point pair, the method further includes:
  • the step of obtaining a second time point pair using a preset feature matching algorithm based on the first time point pair includes:
  • the second time point pair is obtained by using a preset feature matching algorithm based on the panoramic picture of each video frame and the first time point pair.
  • the first time point pair includes a first initial timestamp and a first termination timestamp
  • the step of obtaining the second time point pair by using a preset feature matching algorithm based on the first time point pair includes the following steps:
  • the timestamp of the current first video frame and the timestamp of the found second video frame are set as the second time point pair; if the second video frame is not found, then jump to the step of determining the current first video frame according to the first initial timestamp.
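The search-and-match loop described in this step can be sketched as follows. This is an illustrative reconstruction in Python, not the patent's implementation; the frame timestamps, the two difference ranges, and the `is_match` predicate are all hypothetical stand-ins (a real system would run feature matching between the two frames).

```python
def find_second_time_point_pair(timestamps, t_init, t_term,
                                first_range, second_range, is_match):
    """Return (t1, t2) where frame t1 lies near t_init (within first_range),
    frame t2 lies near t_term (within second_range), and is_match(t1, t2)
    holds; return None if the ranges are exhausted without a match."""
    # Candidate "first" frames near the first initial timestamp,
    # ordered from nearest to farthest, as the text describes.
    firsts = sorted((t for t in timestamps if abs(t - t_init) <= first_range),
                    key=lambda t: abs(t - t_init))
    # Candidate "second" frames near the first termination timestamp.
    seconds = sorted((t for t in timestamps if abs(t - t_term) <= second_range),
                     key=lambda t: abs(t - t_term))
    for t1 in firsts:
        for t2 in seconds:
            if is_match(t1, t2):
                # These two timestamps form the second time point pair.
                return (t1, t2)
    return None  # no matching pair within the difference ranges
```

With a toy predicate that accepts frames exactly 10 s apart, marking a pair at 5 s and 15 s returns `(5.0, 15.0)`.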
  • the step of using the feature matching algorithm to find a second video frame matching the first video frame includes:
  • the first gaze point area picture is determined according to the shooting angle of the current first video frame, and the feature matching algorithm is used to find a second video frame matching the first gaze point area image.
  • the first time point pair includes a first initial timestamp and a first termination timestamp
  • the step of using a preset feature matching algorithm based on the first time point pair to obtain the second time point pair further includes:
  • the timestamp of the current third video frame and the timestamp of the found fourth video frame are set as the second time point pair; if the fourth video frame is not found, then jump to the step of determining the current third video frame according to the second initial timestamp.
  • the step of using the feature matching algorithm to find the fourth video frame matching the current third video frame includes:
  • the second gaze point area picture is determined according to the shooting angle of the current third video frame, and the feature matching algorithm is used to find a fourth video frame matching the second gaze point area image.
  • the step of calculating the rendering parameters of the rendering model according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair includes:
  • if the current decoding timestamp is not within the time period formed by the second time point pair, the rendering parameters of the rendering model are calculated according to the shooting angle of the current video frame
  • if the current decoding timestamp is within the time period formed by the second time point pair, interpolation calculation is performed using the animation interpolation type according to the current decoding timestamp and the shooting angles of the video frames corresponding to the second time point pair, and the rendering parameters of the rendering model are calculated according to the interpolation result.
  • the animation interpolation type is one or more of linear, slow-in fast-out, fast-in slow-out, or slow, then fast, then slow.
  • the second time point pair includes a second initial timestamp and a second termination timestamp, and before the step of rendering the panoramic video picture fitted on the rendering model according to the rendering parameters, the method further includes:
  • the step of rendering the panoramic video picture fitted on the rendering model according to the rendering parameters includes:
  • the panoramic video picture fitted on the rendering model is rendered according to the rendering parameters and the gradient fusion parameters.
  • an embodiment of the present invention provides an automatic editing device for panoramic video, wherein the device includes:
  • a mark obtaining unit configured to obtain the first time point pair marked in the panoramic video file, the first time point pair representing the video segment that the user desires to cut;
  • a cropping area determination unit configured to use a preset feature matching algorithm based on the first time point pair to obtain a second time point pair, the second time point pair representing the actual video segment to be cropped;
  • a video generation unit configured to calculate the rendering parameters of the rendering model according to the shooting angle of the current video frame or the animation interpolation type corresponding to the second time point pair, and to render the panoramic video picture fitted on the rendering model according to the rendering parameters until the current video frame is the last frame, resulting in a clipped flat video.
  • an embodiment of the present invention also provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.
  • an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the above method are implemented.
  • the present invention obtains the first time point pair marked in the panoramic video file, the first time point pair representing the video segment that the user expects to cut out; obtains the second time point pair based on the first time point pair using a preset feature matching algorithm, the second time point pair representing the actual video segment to be cropped; calculates the rendering parameters of the rendering model according to the shooting angle of the current video frame or the animation interpolation type corresponding to the second time point pair; and renders the panoramic video picture fitted on the rendering model according to the rendering parameters, thereby realizing automatic editing of one-shot video, reducing editing complexity and improving editing efficiency.
  • FIG. 1 is a flowchart of the implementation of the automatic editing method for panoramic video provided by the first embodiment of the present invention;
  • FIG. 2 is a schematic structural diagram of an automatic editing device for panoramic video provided by Embodiment 2 of the present invention;
  • FIG. 3 is a schematic structural diagram of a terminal according to Embodiment 3 of the present invention.
  • FIG. 1 shows the implementation process of the automatic editing method for panoramic video provided by the first embodiment of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
  • step S101: a first time point pair marked in the panoramic video file is acquired.
  • the embodiments of the present invention are applicable to terminal equipment, and the terminal equipment includes a computer, a camera, a smart phone, a tablet, etc., and the terminal equipment can implement the method by installing corresponding panoramic video processing software or plug-ins.
  • the first time point pair includes a first initial time stamp and a first termination time stamp
  • the first time point pair represents a video segment that the user desires to cut, that is, the video clip between the first initial time stamp and the first termination time stamp is the video clip that the user desires to cut out, and the first time point pair can be obtained according to the user's marking operation during the shooting process.
  • step S102: a preset feature matching algorithm is used to obtain a second time point pair based on the first time point pair.
  • a preset feature matching algorithm is used to obtain a second time point pair based on the first time point pair, where the second time point pair represents the actual video segment to be trimmed and includes a second initial timestamp and a second termination timestamp; that is, the video segment between the second initial timestamp and the second termination timestamp is the actual video segment to be cut, obtained by using the feature matching algorithm.
  • the feature matching algorithm can be an algorithm such as SIFT, SURF, ORB, BRISK, or FREAK.
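As an illustration of how the named binary-descriptor algorithms (ORB, BRISK, FREAK) are typically matched, here is a minimal brute-force Hamming matcher with a ratio test in Python/NumPy. This is a generic sketch, not the patent's method; in practice a library matcher such as OpenCV's `BFMatcher` would be used, and the descriptors in the usage example are toy values.

```python
import numpy as np

def hamming(a, b):
    # Popcount of the XOR of two uint8 binary descriptors.
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def match_descriptors(des1, des2, ratio=0.8):
    """Brute-force nearest-neighbour matching with a Lowe-style ratio test.
    Returns (index-in-des1, index-in-des2) pairs that pass the test."""
    matches = []
    for i, d in enumerate(des1):
        dists = [hamming(d, e) for e in des2]
        order = np.argsort(dists)
        best, second = int(order[0]), int(order[1])
        # Accept only matches clearly better than the runner-up.
        if dists[best] < ratio * dists[second]:
            matches.append((i, best))
    return matches
```

For example, a descriptor `[255, 0]` matched against `[[0, 255], [255, 0]]` pairs with the second candidate (Hamming distance 0 versus 16).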
  • the current first video frame is determined according to the first initial timestamp, and the feature matching algorithm is used to find the second video frame matching the current first video frame. If the second video frame is found, the timestamp of the current first video frame and the timestamp of the found second video frame are set as the second time point pair; that is, the second initial timestamp of the second time point pair is the timestamp of the current first video frame, and the second termination timestamp of the second time point pair is the timestamp of the found second video frame.
  • the difference between the timestamp of the current first video frame and the first initial timestamp is within a first difference range
  • the first difference range may be determined according to the first fixed value preset by the user and the first initial timestamp
  • the maximum value of the first difference range can be set to zero, that is, the current first video frame is a video frame before the first initial time stamp; the difference between the time stamp of the second video frame and the first termination time stamp is within a second difference range, the second difference range can also be determined according to a second fixed value preset by the user and the first termination timestamp, and further, the minimum value of the second difference range can be set to zero, that is, the second video frame found is a video frame after the first termination timestamp.
  • the found video frame whose matching degree with the current first video frame reaches a matching degree threshold can be used as the second video frame. Preferably, the first gaze point area picture is determined according to the shooting angle of the current first video frame, and the feature matching algorithm is used to find the second video frame matching the first gaze point area picture, thereby reducing the amount of calculation in the feature matching process.
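Restricting matching to a gaze-point region could look like the sketch below, assuming an equirectangular panorama where yaw in [-180, 180) maps to columns and pitch in [-90, 90] maps to rows. The mapping convention and window size are assumptions for illustration; the patent does not specify them.

```python
import numpy as np

def gaze_region(pano, yaw_deg, pitch_deg, win_w, win_h):
    """Crop a gaze-point window from an equirectangular panorama,
    centred on the shooting angle (yaw, pitch), wrapping horizontally."""
    h, w = pano.shape[:2]
    cx = int((yaw_deg + 180.0) / 360.0 * w) % w       # yaw -> column
    cy = int((90.0 - pitch_deg) / 180.0 * h)          # pitch -> row
    cy = min(max(cy, win_h // 2), h - win_h // 2)     # clamp vertically
    rows = slice(cy - win_h // 2, cy + win_h // 2)
    cols = np.arange(cx - win_w // 2, cx + win_w // 2) % w  # wrap at seam
    return pano[rows][:, cols]
```

Feature matching then runs only on this window instead of the full panorama, which is where the claimed reduction in computation comes from.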
  • when determining the current first video frame, an initial time stamp can be set, and the set initial time stamp is denoted as the third initial time stamp; the video frame corresponding to the third initial time stamp is used as the first current first video frame.
  • for example, if the third initial time stamp is equal to the first initial time stamp, the video frame corresponding to the first initial time stamp is taken as the first current first video frame; if the second video frame is not found, the video frames corresponding to the previous or next timestamps of the third initial timestamp are taken as the current first video frame in order from near to far, until the second video frame is found or feature matching has been performed on all video frames within the first difference range.
  • the video frame corresponding to the third initial timestamp may be a video frame a preset duration before the first initial timestamp, for example, a video frame 4 seconds before the first initial timestamp; accordingly, the first difference range is also adjusted according to the third initial timestamp.
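The timestamp arithmetic here is simple but worth making concrete. In the sketch below, the 4-second offset from the text is used as a default, and the half-width of the difference range is an assumed illustrative value; both would be presets in a real implementation.

```python
def shifted_search_window(t_first_init, preset_back=4.0, half_range=2.0):
    """Place the search start (the third initial timestamp) a preset
    duration before the first initial timestamp, and shift the first
    difference range so it is centred on the new start (clamped at 0)."""
    t_third_init = max(0.0, t_first_init - preset_back)
    window = (max(0.0, t_third_init - half_range), t_third_init + half_range)
    return t_third_init, window
```

For a mark at 10 s this yields a search start of 6 s and a window of (4 s, 8 s).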
  • the third initial time stamp can also be obtained by a feature matching algorithm; specifically, the third initial time stamp can be determined according to the changes in the video pictures of the N video frames before the first initial time stamp.
  • when searching for the second video frame, an initial time stamp can also be set, and the set initial time stamp is denoted as the fourth initial time stamp.
  • the video frame corresponding to the fourth initial time stamp is used first to perform feature matching; for example, if the fourth initial time stamp is equal to the first termination time stamp, the video frame corresponding to the first termination time stamp is used first to perform feature matching, and if the second video frame is not found, the video frames corresponding to the previous or next timestamps of the fourth initial timestamp are used to perform feature matching, until the second video frame is found or feature matching has been performed on all video frames within the second difference range.
  • the video frame corresponding to the fourth initial timestamp may be a video frame a preset duration after the first termination timestamp; accordingly, the second difference range is also adjusted according to the fourth initial timestamp.
  • the fourth initial timestamp can also be obtained by a feature matching algorithm. Specifically, the fourth initial timestamp can be determined according to the changes of video images of N video frames after the first termination timestamp.
  • the current third video frame is determined according to the first termination timestamp, and the feature matching algorithm is used to find the fourth video frame matching the current third video frame.
  • the timestamp of the current third video frame and the timestamp of the fourth video frame are set as the second time point pair; if the fourth video frame is not found, then jump to the step of determining the current third video frame according to the second initial time stamp, so as to search for the fourth video frame according to the re-determined current third video frame.
  • the difference between the timestamp of the current third video frame and the first termination timestamp is within the third difference range
  • the difference between the timestamp of the fourth video frame and the first initial timestamp is within the fourth difference range.
  • the specific implementation manner of searching for the fourth video frame according to the current third video frame is similar to the specific implementation manner of searching for the second video frame according to the first video frame, and details are not described herein.
  • the second gaze point area picture is determined according to the shooting angle of the current third video frame, and the feature matching algorithm is used to find the fourth video frame matching the second gaze point area picture, thereby reducing the amount of calculation in the feature matching process.
  • when using the preset feature matching algorithm to obtain the second time point pair, it is further judged whether the vertical direction satisfies the constraint; only when the constraint is satisfied is the feature matching algorithm used to further acquire the second time point pair, so as to improve the effectiveness of acquiring the second time point pair and ensure the video editing effect.
  • the gaze point area of the video frame corresponding to the first initial time stamp may be used as the reference image, the video frame corresponding to the first termination time stamp may be used as the image to be detected, corner detection is performed on the image to be detected and the reference image, and whether the vertical constraint is satisfied is determined according to the corner detection results.
  • each video frame is obtained from the panoramic video file, and each obtained video frame is rendered into a panoramic image
  • the second time point pair is obtained by using the preset feature matching algorithm based on the first time point pair
  • the second time point pair is obtained by using the preset feature matching algorithm based on the panoramic image of each video frame and the first time point pair; in other words, when the feature matching algorithm is used to obtain the second time point pair, it is always obtained based on the panoramic image.
  • step S103: the rendering parameters of the rendering model are calculated according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair, and the panoramic video picture fitted on the rendering model is rendered according to the rendering parameters.
  • the rendering parameters may include the pitch (pitch angle), yaw (yaw angle), roll (roll angle), fov (field of view), distance, etc. of the virtual camera.
  • the shooting angle of each video frame can be obtained from the gyroscope data recorded when shooting the panoramic video file.
  • the rendering parameters are calculated differently depending on whether the current decoding timestamp is within or outside the time period formed by the second time point pair. Preferably, if the current decoding timestamp is not within the time period formed by the second time point pair, the rendering parameters of the rendering model are calculated according to the shooting angle of the current video frame; if the current decoding timestamp is within the time period formed by the second time point pair, interpolation calculation is performed using the animation interpolation type according to the current decoding timestamp and the shooting angles of the video frames corresponding to the second time point pair, and the rendering parameters of the rendering model are calculated according to the interpolation result, so that the calculation method of the rendering parameters is determined according to the time period in which the current decoding timestamp falls.
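The two-branch calculation can be sketched for a single rendering parameter (yaw). The easing curves are plausible forms for the interpolation types named in the text, not formulas taken from the patent, and `angle_of_frame` stands in for a lookup of the gyroscope-derived shooting angle at a timestamp.

```python
def ease(u, kind="linear"):
    """Assumed curves for the named interpolation types, u in [0, 1]."""
    if kind == "slow_then_fast":
        return u * u                       # ease-in
    if kind == "fast_then_slow":
        return 1.0 - (1.0 - u) ** 2        # ease-out
    if kind == "slow_fast_slow":
        return u * u * (3.0 - 2.0 * u)     # smoothstep
    return u                               # linear

def rendering_angle(t, pair, angle_of_frame, kind="linear"):
    """Yaw for decode timestamp t. Outside the second time point pair's
    period: the frame's own shooting angle. Inside: interpolate between
    the shooting angles at the pair's two endpoints."""
    t2_start, t2_end = pair
    if not (t2_start <= t <= t2_end):
        return angle_of_frame(t)
    u = (t - t2_start) / (t2_end - t2_start)
    a0, a1 = angle_of_frame(t2_start), angle_of_frame(t2_end)
    return a0 + (a1 - a0) * ease(u, kind)
```

With a hypothetical yaw track `t -> 10 * t` and a pair (2 s, 4 s), a timestamp of 3 s under linear easing gives the midpoint yaw of 30 degrees, while timestamps outside the pair return the frame's own angle.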
  • the shooting angle of view of the video frames corresponding to the second time point pair specifically refers to the shooting angle of the video frame corresponding to the second initial timestamp of the second time point pair and the shooting angle of the video frame corresponding to the second termination timestamp of the second time point pair.
  • the animation interpolation type is one or more of linear, slow-in fast-out, fast-in slow-out, or slow, then fast, then slow, so as to enrich the animation effects; the interpolation type may be one type pre-specified by the user, or a combination of multiple types.
  • for example, the animation interpolation type may be linear first, and then fast-in slow-out.
  • before rendering the panoramic video picture fitted on the rendering model according to the rendering parameters, the exposure difference between the video frame corresponding to the second initial timestamp and the video frame corresponding to the second termination timestamp is calculated, the gradient fusion parameters of each video frame to be fused are obtained according to the exposure difference, and the panoramic video picture fitted on the rendering model is rendered according to the rendering parameters and the gradient fusion parameters, so as to further improve the visual effect of the transition animation corresponding to the second time point pair.
  • gradient fusion parameters usually include transparency.
  • before obtaining the gradient fusion parameters of each video frame to be fused according to the exposure difference, it is further determined whether the exposure difference is greater than a preset exposure difference threshold. If not, the gradient fusion parameters of each video frame to be fused are obtained according to the exposure difference; if the threshold is exceeded, a corresponding reminder is sent to the user, asking whether to continue automatically editing the video when the exposure difference is large, thereby improving the effectiveness of automatic clipping. It should be noted that if there are multiple sets of second time point pairs, the method described above can be used to obtain the gradient fusion parameters of the video frames to be fused corresponding to each set of second time point pairs, and rendering is performed based on the rendering parameters and the gradient fusion parameters.
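A minimal sketch of the exposure check and gradient fusion parameters, under the assumptions that exposure is approximated by normalised mean intensity and that the fusion parameter is a linear transparency ramp (the text says gradient fusion parameters usually include transparency); the threshold value is illustrative.

```python
import numpy as np

def gradient_fusion_alphas(frame_a, frame_b, n_frames, exposure_thresh=0.25):
    """frame_a / frame_b: frames at the second initial and second
    termination timestamps (float images in [0, 1]). Below the exposure
    threshold, return a linear alpha ramp for the frames to be fused;
    above it, return None so the caller can prompt the user."""
    diff = abs(float(frame_a.mean()) - float(frame_b.mean()))
    if diff > exposure_thresh:
        return None  # large exposure gap: remind the user before editing
    # Transparency ramps from 0 to 1 across the transition frames.
    return [i / (n_frames - 1) for i in range(n_frames)]
```

For two frames with mean intensities 0.5 and 0.6 and a 5-frame transition, this yields the ramp [0.0, 0.25, 0.5, 0.75, 1.0]; a 0.5-versus-0.9 pair exceeds the threshold and returns None.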
  • step S104: it is determined whether the current video frame is the last frame; if not, jump to step S103; if yes, execute step S105.
  • step S105: a clipped video is generated.
  • the clipped video can be exported, so that the user can play the exported video on a video playback device.
  • the first time point pair marked in the panoramic video file is obtained, the first time point pair representing the video segment that the user expects to cut out; the second time point pair is obtained by using a preset feature matching algorithm based on the first time point pair, the second time point pair representing the actual video clip to be cropped; the rendering parameters of the rendering model are calculated according to the shooting angle of the current video frame or the animation interpolation type corresponding to the second time point pair; and the panoramic video picture fitted on the rendering model is rendered according to the rendering parameters until the current video frame is the last frame, generating the edited flat video, thereby realizing automatic editing of one-shot video, reducing editing complexity and improving editing efficiency.
  • Embodiment 2:
  • FIG. 2 shows the structure of the automatic editing device for panoramic video provided by the second embodiment of the present invention.
  • the parts related to the embodiment of the present invention are shown, including:
  • the mark obtaining unit 21 is used to obtain the first time point pair marked in the panoramic video file, and the first time point pair represents the video segment that the user desires to cut out;
  • a cropping region determination unit 22 configured to obtain a second time point pair using a preset feature matching algorithm based on the first time point pair, the second time point pair representing the actual video segment to be cropped;
  • the video generation unit 23 is configured to calculate the rendering parameters of the rendering model according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair, and to render the panoramic video picture fitted on the rendering model according to the rendering parameters until the current video frame is the last frame, resulting in a clipped flat video.
  • the device further includes:
  • the panoramic video generation unit is used to obtain each video frame from the panoramic video file, and render each obtained video frame into a panoramic image;
  • the cropping region determination unit further includes:
  • the region determination subunit is configured to use a preset feature matching algorithm to obtain the second time point pair based on the panoramic picture of each video frame and the first time point pair.
  • the first time point pair includes a first initial timestamp and a first termination timestamp
  • the cropping region determining unit includes:
  • a first determining unit configured to determine the current first video frame according to the first initial time stamp, wherein the difference between the time stamp of the current first video frame and the first initial time stamp is within the first difference range;
  • a first search unit configured to use the feature matching algorithm to search for a second video frame matching the current first video frame, wherein the difference between the timestamp of the second video frame and the first termination timestamp is within a second difference range; and
  • a first acquisition unit configured to set the timestamp of the current first video frame and the timestamp of the found second video frame as the second time point pair if the second video frame is found, and to trigger the first determining unit to determine the current first video frame according to the first initial timestamp if the second video frame is not found.
  • the first search unit further includes:
  • the first search subunit is configured to determine the first gaze point area picture according to the shooting angle of the current first video frame, and use a feature matching algorithm to search for the second video frame matching the first gaze point area image.
  • the first time point pair includes a first initial timestamp and a first termination timestamp
  • the cropping region determining unit includes:
  • a second determining unit configured to determine the current third video frame according to the first termination timestamp, wherein the difference between the timestamp of the current third video frame and the first termination timestamp is within a third difference range;
  • a second search unit configured to use the feature matching algorithm to search for a fourth video frame matching the current third video frame, wherein the difference between the timestamp of the fourth video frame and the first initial timestamp is within a fourth difference range; and
  • a second acquiring unit configured to set the timestamp of the current third video frame and the timestamp of the found fourth video frame as the second time point pair if the fourth video frame is found, and, if the fourth video frame is not found, to trigger the second determining unit to determine the current third video frame according to the first termination timestamp.
  • the second search unit includes:
  • the second search subunit is configured to determine a second gaze point area picture according to the shooting angle of view of the current third video frame, and to use the feature matching algorithm to search for a fourth video frame matching the second gaze point area picture.
  • the video generation unit further includes:
  • a first parameter calculation unit configured to calculate the rendering parameters of the rendering model according to the shooting angle of view of the current video frame if the current decoding timestamp is not within the time period formed by the second time point pair; and
  • a second parameter calculation unit configured, if the current decoding timestamp is within the time period formed by the second time point pair, to perform interpolation calculation using the animation interpolation type according to the current decoding timestamp and the shooting angles of view of the video frames corresponding to the second time point pair, and to calculate the rendering parameters of the rendering model according to the interpolation calculation result.
  • the animation interpolation type is one or more of linear, slow in and fast out, fast in and slow out, or first slow then fast and then slow.
  • the second time point pair includes a second initial timestamp and a second termination timestamp
  • the apparatus further includes:
  • an exposure degree acquiring unit configured to acquire an exposure degree difference between the video frame corresponding to the second initial timestamp and the video frame corresponding to the second termination timestamp;
  • a fusion parameter acquisition unit configured to acquire gradient fusion parameters of each video frame to be fused according to the exposure difference;
  • the video generation unit also includes:
  • the rendering fusion unit is configured to render the panoramic video picture fitted on the rendering model according to the rendering parameters and the gradient fusion parameters.
  • each unit of the automatic editing device for panoramic video may be implemented by corresponding hardware or software units, and each unit may be an independent software and hardware unit, or may be integrated into a software and hardware unit.
  • FIG. 3 shows the structure of the terminal provided by Embodiment 3 of the present invention. For convenience of description, only the part related to the embodiment of the present invention is shown.
  • the terminal 3 in the embodiment of the present invention includes a processor 30 , a memory 31 , and a computer program 32 stored in the memory 31 and running on the processor 30 .
  • When the processor 30 executes the computer program 32, the steps in the above-mentioned method embodiments, for example, steps S101 to S105 shown in FIG. 1, are implemented. Alternatively, when the processor 30 executes the computer program 32, the functions of the units in the above-mentioned apparatus embodiments, for example, the functions of the units 21 to 23 shown in FIG. 2, are implemented.
  • The first time point pair marked in the panoramic video file is obtained, where the first time point pair represents the video segment that the user expects to crop out; the second time point pair is obtained by using a preset feature matching algorithm based on the first time point pair, where the second time point pair represents the actual video segment to be cropped; the rendering parameters of the rendering model are calculated according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair; and the panoramic video picture fitted on the rendering model is rendered according to the rendering parameters until the current video frame is the last frame, generating the edited flat video. Automatic editing of a one-shot video is thereby realized, reducing editing complexity and improving editing efficiency.
  • Embodiment 4:
  • A computer-readable storage medium, in which a computer program is stored; when the computer program is executed by a processor, the steps in the foregoing method embodiments, for example, steps S101 to S105 shown in FIG. 1, are implemented. Alternatively, when the computer program is executed by the processor, the functions of the units in the above-mentioned apparatus embodiments, for example, the functions of the units 21 to 23 shown in FIG. 2, are implemented.
  • The first time point pair marked in the panoramic video file is obtained, where the first time point pair represents the video segment that the user expects to crop out; the second time point pair is obtained by using a preset feature matching algorithm based on the first time point pair, where the second time point pair represents the actual video segment to be cropped; the rendering parameters of the rendering model are calculated according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair; and the panoramic video picture fitted on the rendering model is rendered according to the rendering parameters until the current video frame is the last frame, generating the edited flat video. Automatic editing of a one-shot video is thereby realized, reducing editing complexity and improving editing efficiency.
  • The computer-readable storage medium of the embodiments of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example, a memory such as a ROM/RAM, a magnetic disk, an optical disk, a flash memory, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present invention is applicable to the technical field of video processing. Provided are an automatic cropping method and apparatus for a panoramic video, and a terminal and a storage medium. The method comprises: acquiring a first time point pair marked in a panoramic video file, wherein the first time point pair represents a video segment that a user expects to crop out; on the basis of the first time point pair, acquiring a second time point pair by using a pre-set feature matching algorithm, wherein the second time point pair represents an actual video segment to be cropped; according to a photographing field of view of the current video frame or an animation interpolation type corresponding to the second time point pair, calculating rendering parameters of a rendering model, and according to the rendering parameters, rendering a panoramic video picture which is attached to the rendering model until the current video frame is the last frame; and generating a cropped planar video. Therefore, automatic cropping of a one-shot video is realized, cropping complexity is reduced, and cropping efficiency is improved.

Description

Automatic editing method, apparatus, terminal and storage medium for panoramic video

Technical Field

The present invention belongs to the technical field of video processing, and in particular relates to an automatic editing method, apparatus, terminal and storage medium for panoramic video.

Background Art
One-shot (single-take) footage is used extremely widely in video post-processing: in a recorded video, unwanted segments in the middle (for example, segments containing obstacles) are removed, multiple video clips are cut out, and the clips are then joined with software so that the final video looks as if it had been shot in a single take, without obvious frame-skipping artifacts.
Technical Problem

However, in the prior art all of this is done manually, so editing is complex and inefficient.

Technical Solutions

Based on this, in view of the above technical problem, it is necessary to provide an automatic editing method, apparatus, terminal and storage medium for panoramic video.
In one aspect, an embodiment of the present invention provides an automatic editing method for panoramic video, the method including the following steps:

obtaining a first time point pair marked in a panoramic video file, the first time point pair representing a video segment that the user expects to crop out;

obtaining a second time point pair by using a preset feature matching algorithm based on the first time point pair, the second time point pair representing the actual video segment to be cropped; and

calculating rendering parameters of a rendering model according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair, rendering the panoramic video picture fitted on the rendering model according to the rendering parameters, and repeating this step until the current video frame is the last frame, thereby generating an edited flat video.
Preferably, before the step of obtaining the second time point pair by using the preset feature matching algorithm based on the first time point pair, the method further includes:

obtaining each video frame from the panoramic video file, and rendering each obtained video frame into a panoramic picture;

and the step of obtaining the second time point pair by using the preset feature matching algorithm based on the first time point pair includes:

obtaining the second time point pair by using the preset feature matching algorithm based on the panoramic picture of each video frame and the first time point pair.
Preferably, the first time point pair includes a first initial timestamp and a first termination timestamp, and the step of obtaining the second time point pair by using the preset feature matching algorithm based on the first time point pair includes:

determining the current first video frame according to the first initial timestamp, wherein the difference between the timestamp of the current first video frame and the first initial timestamp is within a first difference range;

using the feature matching algorithm to find a second video frame matching the current first video frame, wherein the difference between the timestamp of the second video frame and the first termination timestamp is within a second difference range; and

if the second video frame is found, setting the timestamp of the current first video frame and the timestamp of the found second video frame as the second time point pair, and if the second video frame is not found, jumping to the step of determining the current first video frame according to the first initial timestamp.
Preferably, the step of using the feature matching algorithm to find the second video frame matching the current first video frame includes:

determining a first gaze point area picture according to the shooting angle of view of the current first video frame, and using the feature matching algorithm to find a second video frame matching the first gaze point area picture.
Preferably, the first time point pair includes a first initial timestamp and a first termination timestamp, and the step of obtaining the second time point pair by using the preset feature matching algorithm based on the first time point pair further includes:

determining the current third video frame according to the first termination timestamp, wherein the difference between the timestamp of the current third video frame and the first termination timestamp is within a third difference range;

using the feature matching algorithm to find a fourth video frame matching the current third video frame, wherein the difference between the timestamp of the fourth video frame and the first initial timestamp is within a fourth difference range; and

if the fourth video frame is found, setting the timestamp of the current third video frame and the timestamp of the found fourth video frame as the second time point pair, and if the fourth video frame is not found, jumping to the step of determining the current third video frame according to the first termination timestamp.
Preferably, the step of using the feature matching algorithm to find the fourth video frame matching the current third video frame includes:

determining a second gaze point area picture according to the shooting angle of view of the current third video frame, and using the feature matching algorithm to find a fourth video frame matching the second gaze point area picture.
Preferably, the step of calculating the rendering parameters of the rendering model according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair includes:

if the current decoding timestamp is not within the time period formed by the second time point pair, calculating the rendering parameters of the rendering model according to the shooting angle of view of the current video frame; and

if the current decoding timestamp is within the time period formed by the second time point pair, performing interpolation calculation using the animation interpolation type according to the current decoding timestamp and the shooting angles of view of the video frames corresponding to the second time point pair, and calculating the rendering parameters of the rendering model according to the interpolation calculation result.
Preferably, the animation interpolation type is one or more of linear, slow-in/fast-out, fast-in/slow-out, or slow-then-fast-then-slow.
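The interpolation types above can be viewed as easing curves that map a normalized time t in [0, 1] to an interpolation weight. The sketch below is illustrative only: the exact curve shapes (quadratic ease-in/ease-out, smoothstep) and the function names are assumptions, not definitions from this document.

```python
# Hedged sketch: easing curves for the listed animation interpolation types.
# The specific curve shapes are illustrative assumptions.

def linear(t: float) -> float:
    return t

def slow_in_fast_out(t: float) -> float:
    # Starts slowly, accelerates toward the end (quadratic ease-in).
    return t * t

def fast_in_slow_out(t: float) -> float:
    # Starts quickly, decelerates toward the end (quadratic ease-out).
    return 1.0 - (1.0 - t) * (1.0 - t)

def slow_fast_slow(t: float) -> float:
    # Slow at both ends, fast in the middle (smoothstep).
    return t * t * (3.0 - 2.0 * t)

def interpolate_angle(start_angle, end_angle, t, easing=linear):
    """Blend the shooting angles of the two frames of the second
    time point pair using the chosen easing curve."""
    w = easing(t)
    return start_angle + (end_angle - start_angle) * w
```

At a decoding timestamp inside the spliced segment, t would be the normalized position, e.g. `t = (dts - t_start) / (t_end - t_start)`.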
Preferably, the second time point pair includes a second initial timestamp and a second termination timestamp, and before the step of rendering the panoramic video picture fitted on the rendering model according to the rendering parameters, the method further includes:

obtaining the exposure difference between the video frame corresponding to the second initial timestamp and the video frame corresponding to the second termination timestamp; and

obtaining gradient fusion parameters of each video frame to be fused according to the exposure difference;

and the step of rendering the panoramic video picture fitted on the rendering model according to the rendering parameters includes:

rendering the panoramic video picture fitted on the rendering model according to the rendering parameters and the gradient fusion parameters.
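One plausible reading of the gradient fusion step is a per-frame exposure offset that ramps from the exposure of the frame at the second initial timestamp to that of the frame at the second termination timestamp, so brightness changes gradually instead of jumping at the splice. The linear ramp and the function name below are illustrative assumptions:

```python
def gradient_fusion_params(exposure_start, exposure_end, num_frames):
    """Return a per-frame exposure compensation for each of the
    num_frames frames to be fused, ramping linearly from the
    exposure of the start frame toward that of the end frame.
    Assumption: a simple linear ramp stands in for the gradient."""
    if num_frames <= 0:
        return []
    if num_frames == 1:
        return [exposure_end - exposure_start]
    diff = exposure_end - exposure_start
    return [diff * i / (num_frames - 1) for i in range(num_frames)]
```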
In another aspect, an embodiment of the present invention provides an automatic editing apparatus for panoramic video, the apparatus including:

a mark obtaining unit, configured to obtain a first time point pair marked in a panoramic video file, the first time point pair representing a video segment that the user expects to crop out;

a cropping region determining unit, configured to obtain a second time point pair by using a preset feature matching algorithm based on the first time point pair, the second time point pair representing the actual video segment to be cropped; and

a video generation unit, configured to calculate rendering parameters of a rendering model according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair, and to render the panoramic video picture fitted on the rendering model according to the rendering parameters, until the current video frame is the last frame, thereby generating an edited flat video.
In another aspect, an embodiment of the present invention further provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.

In another aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the above method.
Technical Effect

The present invention obtains a first time point pair marked in a panoramic video file, where the first time point pair represents a video segment that the user expects to crop out; obtains a second time point pair by using a preset feature matching algorithm based on the first time point pair, where the second time point pair represents the actual video segment to be cropped; calculates rendering parameters of a rendering model according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair; and renders the panoramic video picture fitted on the rendering model according to the rendering parameters, thereby realizing automatic editing of a one-shot video, reducing editing complexity and improving editing efficiency.
Description of Drawings

FIG. 1 is a flowchart of the implementation of the automatic editing method for panoramic video provided by Embodiment 1 of the present invention;

FIG. 2 is a schematic structural diagram of the automatic editing apparatus for panoramic video provided by Embodiment 2 of the present invention; and

FIG. 3 is a schematic structural diagram of the terminal provided by Embodiment 3 of the present invention.
本发明的实施方式Embodiments of the present invention
为了使本发明的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本发明进行进一步详细说明。应当理解,此处所描述的具体实施例仅仅用以解释本发明,并不用于限定本发明。In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.
以下结合具体实施例对本发明的具体实现进行详细描述:The specific implementation of the present invention is described in detail below in conjunction with specific embodiments:
实施例一:Example 1:
图1示出了本发明实施例一提供的全景视频的自动剪辑方法的实现流程,为了便于说明,仅示出了与本发明实施例相关的部分,详述如下:FIG. 1 shows the implementation process of the automatic editing method for panoramic video provided by the first embodiment of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown, and the details are as follows:
In step S101, a first time point pair marked in the panoramic video file is obtained.

The embodiments of the present invention are applicable to a terminal device, including a computer, a camera, a smartphone, a tablet and the like; the terminal device can implement this method by installing corresponding panoramic video processing software or plug-ins.

In the embodiment of the present invention, the first time point pair includes a first initial timestamp and a first termination timestamp and represents the video segment that the user expects to crop out; that is, the video segment between the first initial timestamp and the first termination timestamp is the segment the user wants removed. The first time point pair can be obtained from the user's marking operations during shooting.
In step S102, a second time point pair is obtained by using a preset feature matching algorithm based on the first time point pair.

In the embodiment of the present invention, the first time point pair marked by the user's marking operations is not very accurate, and frame skipping may occur during actual playback; therefore, the video segment that actually needs to be cropped must be re-determined. In this method, the second time point pair is obtained by using a preset feature matching algorithm based on the first time point pair. The second time point pair represents the actual video segment to be cropped and includes a second initial timestamp and a second termination timestamp; that is, the video segment between the second initial timestamp and the second termination timestamp is the actual segment to be cropped as obtained by the feature matching algorithm. The feature matching algorithm may be an algorithm such as SIFT, SURF, ORB, BRISK or FREAK.
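In practice the matchers named above (SIFT, SURF, ORB, BRISK, FREAK) compare keypoint descriptors, typically via a library such as OpenCV. The dependency-free sketch below substitutes grayscale-histogram overlap as a stand-in similarity measure purely to illustrate the "two frames match if the score reaches a threshold" decision; it is not one of the named algorithms.

```python
# Stand-in for descriptor matching (SIFT/SURF/ORB/BRISK/FREAK): each
# "frame" is a 2-D list of grayscale values in [0, 255], and similarity
# is the overlap of normalized intensity histograms. Illustrative only.

def histogram(frame, bins=16):
    counts = [0] * bins
    total = 0
    for row in frame:
        for v in row:
            counts[min(v * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]

def match_score(frame_a, frame_b, bins=16):
    """Similarity in [0, 1]: overlap of normalized histograms."""
    ha, hb = histogram(frame_a, bins), histogram(frame_b, bins)
    return sum(min(a, b) for a, b in zip(ha, hb))

def frames_match(frame_a, frame_b, threshold=0.9):
    return match_score(frame_a, frame_b) >= threshold
```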
When the preset feature matching algorithm is used to obtain the second time point pair, preferably, the current first video frame is determined according to the first initial timestamp, and the feature matching algorithm is used to find a second video frame matching the current first video frame. If the second video frame is found, the timestamp of the current first video frame and the timestamp of the found second video frame are set as the second time point pair; that is, the second initial timestamp of the second time point pair is the timestamp of the current first video frame, and the second termination timestamp is the timestamp of the found second video frame. Further, if the second video frame is not found, the process jumps back to the step of determining the current first video frame according to the first initial timestamp, so that the search for the second video frame is performed with a newly determined current first video frame.

Here, the difference between the timestamp of the current first video frame and the first initial timestamp is within a first difference range, which may be determined according to a first fixed value preset by the user and the first initial timestamp; further, the maximum value of the first difference range may be set to zero, i.e., the current first video frame is a video frame before the first initial timestamp. The difference between the timestamp of the second video frame and the first termination timestamp is within a second difference range, which may likewise be determined according to a second fixed value preset by the user and the first termination timestamp; further, the minimum value of the second difference range may be set to zero, i.e., the found second video frame is a video frame after the first termination timestamp.
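The search just described can be sketched as a double loop: for each candidate frame whose timestamp lies within the first difference range around the first initial timestamp, scan the candidate frames within the second difference range around the first termination timestamp for a match; the first match found yields the second time point pair. The frame representation and the `matches` predicate below are illustrative assumptions:

```python
def find_second_time_point_pair(frames, t_start, t_end,
                                start_range, end_range, matches):
    """frames: list of (timestamp, frame) sorted by timestamp.
    start_range / end_range: allowed |timestamp - t_start| (resp.
    |timestamp - t_end|) for candidate frames, i.e. the difference
    ranges.  matches(a, b) -> bool is the feature-matching predicate
    (in practice, descriptor matching).
    Returns (second_initial_ts, second_termination_ts) or None."""
    start_candidates = [(ts, f) for ts, f in frames
                        if abs(ts - t_start) <= start_range]
    end_candidates = [(ts, f) for ts, f in frames
                      if abs(ts - t_end) <= end_range]
    for ts_a, frame_a in start_candidates:
        for ts_b, frame_b in end_candidates:
            if matches(frame_a, frame_b):
                # Matching content on both sides of the cut means the
                # splice point avoids a visible jump.
                return ts_a, ts_b
    return None
```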
When the feature matching algorithm is used to find the second video frame matching the current first video frame, a found video frame whose matching degree with the current first video frame reaches a matching degree threshold can be used as the second video frame. Preferably, a first gaze point area picture is determined according to the shooting angle of view of the current first video frame, and the feature matching algorithm is used to find a second video frame matching the first gaze point area picture, thereby reducing the amount of calculation in the feature matching process.
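Restricting matching to the gaze point region can be sketched as cropping a window of the equirectangular panorama around the viewing direction, so the matcher only processes that window. The window size and the yaw/pitch-to-pixel mapping below are illustrative assumptions:

```python
def gaze_region(panorama, yaw_deg, pitch_deg, fov_deg=90):
    """panorama: 2-D list (rows of pixels) in equirectangular layout,
    covering 360 degrees horizontally and 180 degrees vertically.
    Returns the sub-image centered on the viewing direction.
    Assumptions: yaw 0 maps to column 0, pitch 0 to the horizon."""
    height = len(panorama)
    width = len(panorama[0])
    # Map the viewing direction to pixel coordinates.
    cx = int((yaw_deg % 360) / 360 * width)
    cy = int((90 - pitch_deg) / 180 * height)
    half_w = int(fov_deg / 360 * width) // 2
    half_h = int(fov_deg / 180 * height) // 2
    rows = range(max(0, cy - half_h), min(height, cy + half_h))
    return [[panorama[r][c % width]            # wrap around horizontally
             for c in range(cx - half_w, cx + half_w)]
            for r in rows]
```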
When determining the current first video frame, an initial timestamp can be set; for ease of description, this set timestamp is called the third initial timestamp, and the video frame corresponding to it is used as the first candidate first video frame. For example, if the third initial timestamp equals the first initial timestamp, the video frame corresponding to the first initial timestamp is used as the first candidate; if no second video frame is found for that candidate, the video frames corresponding to the timestamps before and after the third initial timestamp are used as the current first video frame in near-to-far order, until a second video frame is found or all video frames within the first difference range have undergone feature matching. In addition, considering that there is usually a short pause when an obstacle is encountered during actual shooting, the video frame corresponding to the third initial timestamp may be a video frame a preset duration before the first initial timestamp, for example 4 seconds before it; correspondingly, the first difference range is adjusted according to the third initial timestamp. Of course, the third initial timestamp can also be obtained by a feature matching algorithm; specifically, it can be determined from the changes in the video pictures of the N video frames before the first initial timestamp.
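The near-to-far order described above can be written as a generator that yields the anchor timestamp first and then alternates between earlier and later timestamps; the names and bounds below are illustrative:

```python
def near_to_far(timestamps, anchor_index, max_offset=None):
    """Yield timestamps in near-to-far order around anchor_index:
    the anchor itself, then index-1, index+1, index-2, index+2, ...
    max_offset bounds the scan (i.e. the difference range)."""
    n = len(timestamps)
    if max_offset is None:
        max_offset = n
    yield timestamps[anchor_index]
    for off in range(1, max_offset + 1):
        if anchor_index - off >= 0:
            yield timestamps[anchor_index - off]
        if anchor_index + off < n:
            yield timestamps[anchor_index + off]
```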
When searching for the second video frame, an initial timestamp can likewise be set; for ease of description, this set timestamp is called the fourth initial timestamp. Feature matching is first performed with the video frame corresponding to the fourth initial timestamp; for example, if the fourth initial timestamp equals the first termination timestamp, the video frame corresponding to the first termination timestamp is used first. If no second video frame is found, the video frames corresponding to the timestamps before and after the fourth initial timestamp are used for feature matching in near-to-far order, until a second video frame is found or all video frames within the second difference range have undergone feature matching. In addition, considering that there may likewise be a short pause just after an obstacle has been passed during actual shooting, the video frame corresponding to the fourth initial timestamp may be a video frame a preset duration after the first termination timestamp, for example 4 seconds after it; correspondingly, the second difference range is adjusted according to the fourth initial timestamp. Of course, the fourth initial timestamp can also be obtained by a feature matching algorithm; specifically, it can be determined from the changes in the video pictures of the N video frames after the first termination timestamp.
When the preset feature matching algorithm is used to obtain the second time point pair, in another preferred implementation, the current third video frame is determined according to the first termination timestamp, and the feature matching algorithm is used to find a fourth video frame matching the current third video frame. If the fourth video frame is found, the timestamp of the current third video frame and the timestamp of the fourth video frame are set as the second time point pair; if the fourth video frame is not found, the process jumps back to the step of determining the current third video frame, so that the search for the fourth video frame is performed with a newly determined current third video frame. The difference between the timestamp of the current third video frame and the first termination timestamp is within a third difference range, and the difference between the timestamp of the fourth video frame and the first initial timestamp is within a fourth difference range. The specific implementation of searching for the fourth video frame from the current third video frame is similar to that of searching for the second video frame from the first video frame, and is not repeated here.
When the feature matching algorithm is used to search for a fourth video frame matching the current third video frame, preferably, a second gaze point region picture is determined according to the shooting angle of view of the current third video frame, and the feature matching algorithm is used to search for a fourth video frame matching the second gaze point region picture, thereby reducing the amount of computation in the feature matching process.
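As a rough illustration of why restricting matching to the gaze point region saves computation, the sketch below maps a shooting angle of view (yaw/pitch) to a pixel rectangle in an equirectangular frame. The field-of-view defaults and the simplification of ignoring wrap-around at the ±180° seam are assumptions, not part of the embodiment:

```python
def gaze_region(width, height, yaw_deg, pitch_deg, h_fov_deg=90.0, v_fov_deg=60.0):
    """Map a shooting angle of view to a pixel rectangle (x0, y0, x1, y1) in an
    equirectangular frame of size width x height. Simplified sketch: the region
    is clipped at the frame edges and wrap-around at the ±180° seam is ignored."""
    cx = (yaw_deg + 180.0) % 360.0 / 360.0 * width   # 360° of yaw spans the full width
    cy = (90.0 - pitch_deg) / 180.0 * height         # 180° of pitch spans the full height
    half_w = h_fov_deg * width / 720.0
    half_h = v_fov_deg * height / 360.0
    return (max(0, int(cx - half_w)), max(0, int(cy - half_h)),
            min(width, int(cx + half_w)), min(height, int(cy + half_h)))
```

For a 3600×1800 frame with a 90°×60° viewport, the region covers only 1/12 of the full panorama, so any per-pixel or per-feature matching cost drops accordingly.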
Considering that large camera shake during shooting will degrade the editing result, when the preset feature matching algorithm is used to obtain the second time point pair, it is further determined whether the vertical direction satisfies a constraint condition, and the feature matching algorithm is used to further obtain the second time point pair only when the constraint is satisfied, so as to improve the effectiveness of obtaining the second time point pair and guarantee the video editing effect. Specifically, the gaze point region of the video frame corresponding to the first initial timestamp may be taken as a reference image and the video frame corresponding to the first termination timestamp as an image to be detected; corner detection is then performed on the image to be detected and the reference image, and whether the vertical constraint is satisfied is determined based on the corner detection results.
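One possible (hypothetical) reading of this check: after corners detected in the reference image have been matched to corners in the image to be detected, the vertical displacement of the matched pairs is tested against a threshold. The median statistic and the 5% threshold below are assumptions for illustration only; the embodiment does not fix the decision rule:

```python
import statistics

def vertical_constraint_satisfied(corner_pairs, frame_height, max_ratio=0.05):
    """corner_pairs: [((x_ref, y_ref), (x_det, y_det)), ...] -- corners detected
    in the reference image matched to corners in the image to be detected.
    The constraint is treated as satisfied when the median vertical displacement
    stays below an (assumed) fraction of the frame height."""
    shifts = [abs(det[1] - ref[1]) for ref, det in corner_pairs]
    return statistics.median(shifts) <= max_ratio * frame_height
```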
Preferably, before the second time point pair is obtained using the preset feature matching algorithm based on the first time point pair, each video frame is obtained from the panoramic video file and rendered into a panoramic picture; the second time point pair is then obtained using the preset feature matching algorithm based on the panoramic picture of each video frame and the first time point pair. In other words, when the feature matching algorithm is used to obtain the second time point pair, matching is performed entirely on the panoramic pictures.
It should be pointed out here that the user may actually expect multiple video segments to be cropped out; that is, multiple sets of first time point pairs may be marked in the panoramic video file. For each set of first time point pairs, the method described in this step can be used to find the corresponding second time point pair.
In step S103, rendering parameters of the rendering model are calculated according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair, and the panoramic video picture fitted onto the rendering model is rendered according to the rendering parameters.
In this embodiment of the present invention, the rendering parameters may include, among others, the pitch, yaw, roll, fov (field of view), and distance of the virtual camera; the shooting angle of view of each video frame can be obtained from the gyroscope data recorded when the panoramic video file was shot.
The rendering parameters are calculated differently depending on whether the current decoding timestamp falls inside or outside the time period formed by the second time point pair. Preferably, if the current decoding timestamp is not within the time period formed by the second time point pair, the rendering parameters of the rendering model are calculated according to the shooting angle of view of the current video frame; if the current decoding timestamp is within that time period, interpolation is performed using the animation interpolation type according to the current decoding timestamp and the shooting angles of view of the video frames corresponding to the second time point pair, and the rendering parameters of the rendering model are calculated from the interpolation result. The calculation method of the rendering parameters is thus determined by the time period in which the current decoding timestamp falls. Here, the shooting angles of view of the video frames corresponding to the second time point pair specifically refer to the shooting angle of view of the video frame corresponding to the second initial timestamp of the pair and the shooting angle of view of the video frame corresponding to the second termination timestamp of the pair.
It should be pointed out here that if there are multiple sets of second time point pairs and the current decoding timestamp falls within the time period formed by one of them, interpolation is performed using the animation interpolation type according to the shooting angles of view of the video frames corresponding to the second time point pair forming that time period, and the rendering parameters of the rendering model are calculated from the interpolation result.
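The two-branch calculation above can be sketched minimally as follows, assuming `view_at` returns the gyroscope-derived shooting angle for a timestamp and the rendering parameters are stored as a flat dict (pitch, yaw, roll, fov, distance); all names are illustrative:

```python
def render_params(t, pair, view_at, ease=lambda u: u):
    """Choose rendering parameters for decode timestamp t.

    pair    : (t_start, t_end), the time period formed by the second time point pair
    view_at : timestamp -> dict of rendering parameters (pitch, yaw, roll, fov,
              distance) derived from the gyroscope data
    ease    : animation interpolation curve on [0, 1] (linear by default)

    Sketch only: a plain linear blend is applied per parameter, which ignores
    yaw wrap-around at +/-180 degrees that a real implementation must handle."""
    t0, t1 = pair
    if not (t0 <= t <= t1):
        return view_at(t)                # outside the period: use the frame's own angle
    u = ease((t - t0) / (t1 - t0))       # inside: interpolate between the two endpoints
    a, b = view_at(t0), view_at(t1)
    return {k: a[k] + (b[k] - a[k]) * u for k in a}
```

Passing a non-linear `ease` reproduces the behavior of the animation interpolation types discussed below without changing this selection logic.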
Preferably, the animation interpolation type is one or more of linear, slow-in fast-out, fast-in slow-out, or slow-then-fast-then-slow, so as to enrich the animation effect. The interpolation type may be a single type specified in advance by the user, or a combination of several types; for example, the animation interpolation type may be linear first, then fast-in slow-out.
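The four named interpolation types can be mapped to easing curves on [0, 1]. The exact formulas below are assumed for illustration, since the embodiment names the types but does not fix their equations; `combine` sketches the "linear first, then fast-in slow-out" combination:

```python
import math

# Assumed mapping of the four named interpolation types to easing curves on [0, 1].
EASING = {
    "linear":           lambda u: u,
    "slow_in_fast_out": lambda u: u * u,                              # starts slowly
    "fast_in_slow_out": lambda u: 1.0 - (1.0 - u) ** 2,               # starts quickly
    "slow_fast_slow":   lambda u: (1.0 - math.cos(math.pi * u)) / 2,  # eases both ends
}

def combine(first, second, split=0.5):
    """Combination of two types: `first` drives the opening half of the motion,
    `second` the closing half (continuous at the split point)."""
    return lambda u: (first(u / split) * 0.5 if u < split
                      else 0.5 + second((u - split) / (1.0 - split)) * 0.5)
```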
Considering that there may be a difference in exposure between the footage shot before and after the video segment to be cropped out, preferably, before the panoramic video picture fitted onto the rendering model is rendered according to the rendering parameters, the exposure difference between the video frame corresponding to the second initial timestamp and the video frame corresponding to the second termination timestamp is obtained, and gradient fusion parameters for each video frame to be fused are obtained according to this exposure difference. If the current video frame is a frame to be fused, the panoramic video picture fitted onto the rendering model is rendered according to both the rendering parameters and the gradient fusion parameters, further improving the visual effect of the transition animation corresponding to the second time point pair. The gradient fusion parameters typically include transparency. Before the gradient fusion parameters of each frame to be fused are obtained according to the exposure difference, it is further determined whether the exposure difference exceeds a preset exposure difference threshold: if it does not, the gradient fusion parameters are obtained according to the exposure difference; if it does, a corresponding reminder is issued asking the user whether to continue the automatic editing, thereby improving the effectiveness of automatic editing when the exposure difference is large. It should be noted here that if there are multiple sets of second time point pairs, the method described above can be used to obtain the gradient fusion parameters of the video frames to be fused corresponding to each set, and rendering is performed based on the rendering parameters and the gradient fusion parameters.
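A hypothetical sketch of this step: a linear transparency ramp across the frames to be fused, guarded by the threshold check that asks the user instead of editing silently. The 1.5 EV threshold and the linear ramp are assumptions; the embodiment only says the parameters are derived from the exposure difference:

```python
def fade_alphas(exposure_diff, n_frames, max_diff=1.5):
    """Derive per-frame transparency for the frames to be fused.

    exposure_diff : exposure difference between the frames at the second initial
                    and second termination timestamps (in EV, an assumption)
    n_frames      : number of frames in the transition (must be >= 2)
    max_diff      : assumed threshold above which the user is reminded instead

    Returns None when the threshold is exceeded, signalling "ask the user";
    otherwise a linear transparency ramp from 0 to 1."""
    if abs(exposure_diff) > max_diff:
        return None
    return [i / (n_frames - 1) for i in range(n_frames)]
```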
In step S104, it is determined whether the current video frame is the last frame; if not, the process jumps to step S103; if so, step S105 is executed.
In step S105, the edited video is generated.
In this embodiment of the present invention, after the edited flat video is generated, it can be exported so that the user can play the exported video on a video playback device.
In this embodiment of the present invention, a first time point pair marked in a panoramic video file is obtained, the first time point pair representing the video segment that the user expects to be cropped out; a second time point pair is obtained using a preset feature matching algorithm based on the first time point pair, the second time point pair representing the actual video segment to be cropped; rendering parameters of a rendering model are calculated according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair; and the panoramic video picture fitted onto the rendering model is rendered according to the rendering parameters until the current video frame is the last frame, generating an edited flat video. This realizes automatic editing of one-take ("one shot to the end") videos, reduces editing complexity, and improves editing efficiency.
Embodiment 2:
FIG. 2 shows the structure of the apparatus for automatically editing a panoramic video provided by Embodiment 2 of the present invention. For convenience of description, only the parts related to this embodiment of the present invention are shown, including:
a mark obtaining unit 21, configured to obtain a first time point pair marked in a panoramic video file, the first time point pair representing the video segment that the user expects to be cropped out;
a cropping region determining unit 22, configured to obtain a second time point pair using a preset feature matching algorithm based on the first time point pair, the second time point pair representing the actual video segment to be cropped; and
a video generating unit 23, configured to calculate rendering parameters of a rendering model according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair, and render the panoramic video picture fitted onto the rendering model according to the rendering parameters until the current video frame is the last frame, to generate an edited flat video.
Preferably, the apparatus further includes:
a panoramic video generating unit, configured to obtain each video frame from the panoramic video file and render each obtained video frame into a panoramic picture;
and the cropping region determining unit further includes:
a region determining subunit, configured to obtain the second time point pair using the preset feature matching algorithm based on the panoramic picture of each video frame and the first time point pair.
Preferably, the first time point pair includes a first initial timestamp and a first termination timestamp, and the cropping region determining unit includes:
a first determining unit, configured to determine the current first video frame according to the first initial timestamp, where the difference between the timestamp of the current first video frame and the first initial timestamp is within a first difference range;
a first searching unit, configured to search, using the feature matching algorithm, for a second video frame matching the current first video frame, where the difference between the timestamp of the second video frame and the first termination timestamp is within a second difference range; and
a first obtaining unit, configured to: if the second video frame is found, set the timestamp of the current first video frame and the timestamp of the found second video frame as the second time point pair; and if the second video frame is not found, trigger the first determining unit to determine the current first video frame according to the first initial timestamp.
Preferably, the first searching unit further includes:
a first searching subunit, configured to determine a first gaze point region picture according to the shooting angle of view of the current first video frame, and to search, using the feature matching algorithm, for a second video frame matching the first gaze point region picture.
Preferably, the first time point pair includes a first initial timestamp and a first termination timestamp, and the cropping region determining unit includes:
a second determining unit, configured to determine the current third video frame according to the first termination timestamp, where the difference between the timestamp of the current third video frame and the first termination timestamp is within a third difference range;
a second searching unit, configured to search, using the feature matching algorithm, for a fourth video frame matching the current third video frame, where the difference between the timestamp of the fourth video frame and the first initial timestamp is within a fourth difference range; and
a second obtaining unit, configured to: if the fourth video frame is found, set the timestamp of the current third video frame and the timestamp of the found fourth video frame as the second time point pair; and if the fourth video frame is not found, trigger the second determining unit to determine the current third video frame according to the first termination timestamp.
Preferably, the second searching unit includes:
a second searching subunit, configured to determine a second gaze point region picture according to the shooting angle of view of the current third video frame, and to search, using the feature matching algorithm, for a fourth video frame matching the second gaze point region picture.
Preferably, the video generating unit further includes:
a first parameter calculating unit, configured to calculate the rendering parameters of the rendering model according to the shooting angle of view of the current video frame if the current decoding timestamp is not within the time period formed by the second time point pair; and
a second parameter calculating unit, configured to: if the current decoding timestamp is within the time period formed by the second time point pair, perform interpolation using the animation interpolation type according to the current decoding timestamp and the shooting angles of view of the video frames corresponding to the second time point pair, and calculate the rendering parameters of the rendering model according to the interpolation result.
Preferably, the animation interpolation type is one or more of linear, slow-in fast-out, fast-in slow-out, or slow-then-fast-then-slow.
Preferably, the second time point pair includes a second initial timestamp and a second termination timestamp, and the apparatus further includes:
an exposure obtaining unit, configured to obtain the exposure difference between the video frame corresponding to the second initial timestamp and the video frame corresponding to the second termination timestamp; and
a fusion parameter obtaining unit, configured to obtain gradient fusion parameters of each video frame to be fused according to the exposure difference;
and the video generating unit further includes:
a rendering fusion unit, configured to render the panoramic video picture fitted onto the rendering model according to the rendering parameters and the gradient fusion parameters.
In this embodiment of the present invention, each unit of the apparatus for automatically editing a panoramic video may be implemented by a corresponding hardware or software unit; each unit may be an independent software or hardware unit, or the units may be integrated into a single software or hardware unit, which is not intended to limit the present invention. For the specific implementation of each unit of the apparatus, reference may be made to the description of the foregoing method embodiments, and details are not repeated here.
Embodiment 3:
FIG. 3 shows the structure of the terminal provided by Embodiment 3 of the present invention. For convenience of description, only the parts related to this embodiment of the present invention are shown.
The terminal 3 of this embodiment of the present invention includes a processor 30, a memory 31, and a computer program 32 that is stored in the memory 31 and executable on the processor 30. When the processor 30 executes the computer program 32, the steps in the foregoing method embodiments are implemented, for example, steps S101 to S105 shown in FIG. 1; alternatively, when the processor 30 executes the computer program 32, the functions of the units in the foregoing apparatus embodiments are implemented, for example, the functions of units 21 to 23 shown in FIG. 2.
In this embodiment of the present invention, a first time point pair marked in a panoramic video file is obtained, the first time point pair representing the video segment that the user expects to be cropped out; a second time point pair is obtained using a preset feature matching algorithm based on the first time point pair, the second time point pair representing the actual video segment to be cropped; rendering parameters of a rendering model are calculated according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair; and the panoramic video picture fitted onto the rendering model is rendered according to the rendering parameters until the current video frame is the last frame, generating an edited flat video. This realizes automatic editing of one-take ("one shot to the end") videos, reduces editing complexity, and improves editing efficiency.
Embodiment 4:
In an embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program. When the computer program is executed by a processor, the steps in the foregoing method embodiments are implemented, for example, steps S101 to S105 shown in FIG. 1; alternatively, when the computer program is executed by a processor, the functions of the units in the foregoing apparatus embodiments are implemented, for example, the functions of units 21 to 23 shown in FIG. 2.
In this embodiment of the present invention, a first time point pair marked in a panoramic video file is obtained, the first time point pair representing the video segment that the user expects to be cropped out; a second time point pair is obtained using a preset feature matching algorithm based on the first time point pair, the second time point pair representing the actual video segment to be cropped; rendering parameters of a rendering model are calculated according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair; and the panoramic video picture fitted onto the rendering model is rendered according to the rendering parameters until the current video frame is the last frame, generating an edited flat video. This realizes automatic editing of one-take ("one shot to the end") videos, reduces editing complexity, and improves editing efficiency.
The computer-readable storage medium of the embodiments of the present invention may include any entity or apparatus capable of carrying computer program code, or a recording medium, for example, a memory such as a ROM/RAM, a magnetic disk, an optical disc, or a flash memory.
The above descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A method for automatically editing a panoramic video, characterized in that the method comprises the following steps:
    obtaining a first time point pair marked in a panoramic video file, the first time point pair representing a video segment that a user expects to be cropped out;
    obtaining a second time point pair using a preset feature matching algorithm based on the first time point pair, the second time point pair representing an actual video segment to be cropped;
    calculating rendering parameters of a rendering model according to a shooting angle of view of a current video frame or an animation interpolation type corresponding to the second time point pair, rendering a panoramic video picture fitted onto the rendering model according to the rendering parameters, and repeating this step until the current video frame is a last frame, to generate an edited flat video.
2. The method according to claim 1, characterized in that before the step of obtaining the second time point pair using the preset feature matching algorithm based on the first time point pair, the method further comprises:
    obtaining each video frame from the panoramic video file, and rendering each obtained video frame into a panoramic picture;
    and the step of obtaining the second time point pair using the preset feature matching algorithm based on the first time point pair comprises:
    obtaining the second time point pair using the preset feature matching algorithm based on the panoramic picture of each video frame and the first time point pair.
3. The method according to claim 1, characterized in that the first time point pair comprises a first initial timestamp and a first termination timestamp, and the step of obtaining the second time point pair using the preset feature matching algorithm based on the first time point pair comprises:
    determining a current first video frame according to the first initial timestamp, wherein a difference between a timestamp of the current first video frame and the first initial timestamp is within a first difference range;
    searching, using the feature matching algorithm, for a second video frame matching the current first video frame, wherein a difference between a timestamp of the second video frame and the first termination timestamp is within a second difference range;
    if the second video frame is found, setting the timestamp of the current first video frame and the timestamp of the found second video frame as the second time point pair; and if the second video frame is not found, jumping to the step of determining the current first video frame according to the first initial timestamp;
    wherein the step of searching, using the feature matching algorithm, for the second video frame matching the current first video frame comprises:
    determining a first gaze point region picture according to a shooting angle of view of the current first video frame, and searching, using the feature matching algorithm, for a second video frame matching the first gaze point region picture.
4. The method according to claim 1, characterized in that the first time point pair comprises a first initial timestamp and a first termination timestamp, and the step of obtaining the second time point pair using the preset feature matching algorithm based on the first time point pair further comprises:
    determining a current third video frame according to the first termination timestamp, wherein a difference between a timestamp of the current third video frame and the first termination timestamp is within a third difference range;
    searching, using the feature matching algorithm, for a fourth video frame matching the current third video frame, wherein a difference between a timestamp of the fourth video frame and the first initial timestamp is within a fourth difference range;
    if the fourth video frame is found, setting the timestamp of the current third video frame and the timestamp of the found fourth video frame as the second time point pair; and if the fourth video frame is not found, jumping to the step of determining the current third video frame according to the first termination timestamp;
    wherein the step of searching, using the feature matching algorithm, for the fourth video frame matching the current third video frame comprises:
    determining a second gaze point region picture according to a shooting angle of view of the current third video frame, and searching, using the feature matching algorithm, for a fourth video frame matching the second gaze point region picture.
5. The method according to claim 1, characterized in that the step of calculating the rendering parameters of the rendering model according to the shooting angle of view of the current video frame or the animation interpolation type corresponding to the second time point pair comprises:
    if a current decoding timestamp is not within a time period formed by the second time point pair, calculating the rendering parameters of the rendering model according to the shooting angle of view of the current video frame;
    if the current decoding timestamp is within the time period formed by the second time point pair, performing interpolation using the animation interpolation type according to the current decoding timestamp and shooting angles of view of video frames corresponding to the second time point pair, and calculating the rendering parameters of the rendering model according to an interpolation result.
6. The method according to claim 1, characterized in that the animation interpolation type is one or more of linear, slow-in fast-out, fast-in slow-out, or slow-then-fast-then-slow.
7. The method according to claim 1, characterized in that the second time point pair comprises a second initial timestamp and a second termination timestamp, and before the step of rendering the panoramic video picture fitted onto the rendering model according to the rendering parameters, the method further comprises:
    obtaining an exposure difference between a video frame corresponding to the second initial timestamp and a video frame corresponding to the second termination timestamp;
    obtaining gradient fusion parameters of each video frame to be fused according to the exposure difference;
    and the step of rendering the panoramic video picture fitted onto the rendering model according to the rendering parameters comprises:
    rendering the panoramic video picture fitted onto the rendering model according to the rendering parameters and the gradient fusion parameters.
8. An apparatus for automatically editing a panoramic video, characterized in that the apparatus comprises:
    a mark obtaining unit, configured to obtain a first time point pair marked in a panoramic video file, the first time point pair representing a video segment that a user expects to be cropped out;
    a cropping region determining unit, configured to obtain a second time point pair using a preset feature matching algorithm based on the first time point pair, the second time point pair representing an actual video segment to be cropped; and
    a video generating unit, configured to calculate rendering parameters of a rendering model according to a shooting angle of view of a current video frame or an animation interpolation type corresponding to the second time point pair, and render a panoramic video picture fitted onto the rendering model according to the rendering parameters until the current video frame is a last frame, to generate an edited flat video.
  9. A terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
PCT/CN2022/079779 2021-03-31 2022-03-08 Automatic cropping method and apparatus for panoramic video, and terminal and storage medium WO2022206312A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110351843.6A CN113115106B (en) 2021-03-31 2021-03-31 Automatic editing method, device, terminal and storage medium for panoramic video
CN202110351843.6 2021-03-31

Publications (1)

Publication Number Publication Date
WO2022206312A1

Family

ID=76713516

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/079779 WO2022206312A1 (en) 2021-03-31 2022-03-08 Automatic cropping method and apparatus for panoramic video, and terminal and storage medium

Country Status (2)

Country Link
CN (1) CN113115106B (en)
WO (1) WO2022206312A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113115106B (en) * 2021-03-31 2023-05-05 影石创新科技股份有限公司 Automatic editing method, device, terminal and storage medium for panoramic video
CN115002335B (en) * 2021-11-26 2024-04-09 荣耀终端有限公司 Video processing method, apparatus, electronic device, and computer-readable storage medium
CN114866837B (en) * 2022-05-26 2023-10-13 影石创新科技股份有限公司 Video processing method, device, computer equipment and storage medium

Citations (6)

Publication number Priority date Publication date Assignee Title
CN108322831A (en) * 2018-02-28 2018-07-24 广东美晨通讯有限公司 video playing control method, mobile terminal and computer readable storage medium
CN109688463A (en) * 2018-12-27 2019-04-26 北京字节跳动网络技术有限公司 A kind of editing video generation method, device, terminal device and storage medium
CN110087123A (en) * 2019-05-15 2019-08-02 腾讯科技(深圳)有限公司 Video file production method, device, equipment and readable storage medium storing program for executing
CN110691202A (en) * 2019-08-28 2020-01-14 咪咕文化科技有限公司 Video editing method, device and computer storage medium
CN110855904A (en) * 2019-11-26 2020-02-28 Oppo广东移动通信有限公司 Video processing method, electronic device and storage medium
CN113115106A (en) * 2021-03-31 2021-07-13 影石创新科技股份有限公司 Automatic clipping method, device, terminal and storage medium of panoramic video

Family Cites Families (10)

Publication number Priority date Publication date Assignee Title
US10154228B1 (en) * 2015-12-23 2018-12-11 Amazon Technologies, Inc. Smoothing video panning
CN108282694B (en) * 2017-01-05 2020-08-18 阿里巴巴集团控股有限公司 Panoramic video rendering method and device and electronic equipment
CN107529091B (en) * 2017-09-08 2020-08-04 广州华多网络科技有限公司 Video editing method and device
CN107888988A (en) * 2017-11-17 2018-04-06 广东小天才科技有限公司 A kind of video clipping method and electronic equipment
CN107968922A (en) * 2017-11-23 2018-04-27 深圳岚锋创视网络科技有限公司 A kind of panoramic video is recorded as the method, apparatus and portable terminal of planar video
CN108366294A (en) * 2018-03-06 2018-08-03 广州市千钧网络科技有限公司 A kind of video method of cutting out and device
WO2020103040A1 (en) * 2018-11-21 2020-05-28 Boe Technology Group Co., Ltd. A method for generating and displaying panorama images based on rendering engine and a display apparatus
CN109618093A (en) * 2018-12-14 2019-04-12 深圳市云宙多媒体技术有限公司 A kind of panoramic video live broadcasting method and system
CN110703976B (en) * 2019-08-28 2021-04-13 咪咕文化科技有限公司 Clipping method, electronic device, and computer-readable storage medium
CN110971929B (en) * 2019-10-31 2022-07-29 咪咕互动娱乐有限公司 Cloud game video processing method, electronic equipment and storage medium


Also Published As

Publication number Publication date
CN113115106B (en) 2023-05-05
CN113115106A (en) 2021-07-13

Similar Documents

Publication Publication Date Title
WO2022206312A1 (en) Automatic cropping method and apparatus for panoramic video, and terminal and storage medium
CN106375674B (en) Method and apparatus for finding and using video portions related to adjacent still images
WO2020252910A1 (en) Image distortion correction method, apparatus, electronic device and readable storage medium
WO2016155377A1 (en) Picture display method and device
US20230040548A1 (en) Panorama video editing method,apparatus,device and storage medium
US20090162042A1 (en) Guided photography based on image capturing device rendered user recommendations
KR101655078B1 (en) Method and apparatus for generating moving photograph
JP2014506026A (en) Portrait image synthesis from multiple images captured by a portable device
JP5949331B2 (en) Image generating apparatus, image generating method, and program
JP2004181233A5 (en)
JP6899002B2 (en) Image processing methods, devices, computer-readable storage media and electronic devices
JP2010011289A (en) Image capturing apparatus and program
CN113973190A (en) Video virtual background image processing method and device and computer equipment
JP7253622B2 (en) Image stabilization method for panorama video and portable terminal
US10491804B2 (en) Focus window determining method, apparatus, and device
JPWO2018062538A1 (en) Display device and program
CN112036311A (en) Image processing method and device based on eye state detection and storage medium
WO2022121963A1 (en) Image occlusion detection method and apparatus, photographing device and medium
WO2022028407A1 (en) Panoramic video editing method, apparatus and device, and storage medium
CN110809797B (en) Micro video system, format and generation method
CN107547939A (en) Method, system and the portable terminal of panoramic video file clip
WO2019000715A1 (en) Method and system for processing image
TWI284851B (en) Photographing apparatus and photographing method
JP7444604B2 (en) Image processing device and method, and imaging device
JP2012222465A (en) Image processing apparatus, image processing method, and computer program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22778496

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22778496

Country of ref document: EP

Kind code of ref document: A1