CN117201955A - Video shooting method, device, equipment and storage medium

Video shooting method, device, equipment and storage medium

Info

Publication number
CN117201955A
Authority
CN
China
Prior art keywords
video
video stream
multimedia file
image
image fusion
Prior art date
Legal status
Pending
Application number
CN202210601210.0A
Other languages
Chinese (zh)
Inventor
崔瀚涛
苗锋
Current Assignee
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date
Filing date
Publication date
Application filed by Honor Device Co Ltd
Priority to CN202210601210.0A
Priority to PCT/CN2023/087623 (WO2023231585A1)
Publication of CN117201955A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/62 Control of parameters via user interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/95 Computational photography systems, e.g. light-field imaging systems
    • H04N23/951 Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The application discloses a video shooting method, apparatus, device and storage medium, and belongs to the technical field of video processing. The method comprises the following steps: during video shooting, multiple video streams are acquired; image fusion processing is performed on the multiple video streams to obtain a first video stream, and the image fusion parameters corresponding to the first video stream are obtained; a first multimedia file containing the first video stream is generated, and a second multimedia file containing the multiple video streams and the image fusion parameters is generated; and the first multimedia file is stored in association with the second multimedia file. In this way, after video shooting ends, a fused video stream with an image fusion effect can be generated from the multiple video streams and the image fusion parameters in the second multimedia file. Because its image fusion effect is better than that of the first video stream generated in the first multimedia file during shooting, the user ultimately obtains a video stream with a better image fusion effect for playback.

Description

Video shooting method, device, equipment and storage medium
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a video capturing method, apparatus, device, and storage medium.
Background
With the development of terminal technology, terminals have gradually integrated functions such as communication, photographing, video and audio, and have become an indispensable part of people's daily lives. A user can use a terminal to shoot videos and record the moments of daily life.
Currently, terminals support shooting video with multiple cameras simultaneously. Specifically, a terminal can collect multiple video streams through multiple cameras at the same time, and then perform image fusion processing on the multiple video streams to obtain a fused video stream, so that video images of the fused video stream are displayed on a video recording interface. After video shooting ends, the terminal can also store the fused video stream for the user to watch later.
However, during video shooting the terminal is constrained by the camera device, the processing chip, the image algorithms and the like, so it is difficult for the terminal to devote sufficient video processing capability while guaranteeing real-time recording, and the image quality of the fused video stream obtained during shooting is therefore relatively poor.
Disclosure of Invention
The application provides a video shooting method, apparatus, device and storage medium, which can generate a video stream with a better image fusion effect after video shooting ends. The technical solution is as follows:
In a first aspect, a video shooting method is provided. In the method, multiple video streams are acquired during video shooting; image fusion processing is then performed on the multiple video streams to obtain a first video stream, and image fusion parameters corresponding to the first video stream are obtained. Then, a first multimedia file containing the first video stream is generated, and a second multimedia file containing the multiple video streams and the image fusion parameters is generated. The first multimedia file is stored in association with the second multimedia file.
The image fusion parameters corresponding to the first video stream indicate the image fusion manner used for the multiple video streams when the first video stream is obtained. The image fusion parameters may include an image stitching mode, and may further include an image stitching position for each of the multiple video streams. The image stitching mode may include one or more of an up-down stitching mode, a left-right stitching mode, a picture-in-picture nesting mode, and the like. The image stitching position of any one of the multiple video streams indicates where the video image of that video stream is placed when stitching is performed according to the corresponding image stitching mode.
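For illustration only, the sketch below shows one way such image fusion parameters could be represented as a data structure. The field names (stitch_mode, positions, timestamp_us) and the mode identifiers are assumptions made for the sketch, not the encoding actually used by the application.

```python
from dataclasses import dataclass, field
from typing import Dict

# Illustrative identifiers for the stitching modes described above.
MODE_TOP_BOTTOM = "top_bottom"      # up-down stitching mode
MODE_LEFT_RIGHT = "left_right"      # left-right stitching mode
MODE_PICTURE_IN_PICTURE = "pip"     # picture-in-picture nesting mode

@dataclass
class ImageFusionParams:
    """Image fusion parameters corresponding to one frame of the first video stream."""
    stitch_mode: str                       # which image stitching mode to use
    # Image stitching position of each video stream, keyed by a stream identifier,
    # e.g. {"front": "top", "rear": "bottom"} or {"rear": "main", "front": "sub"}.
    positions: Dict[str, str] = field(default_factory=dict)
    timestamp_us: int = 0                  # aligned with the video frame timestamps

# Example: the front-camera image is stitched above the rear-camera image.
params = ImageFusionParams(
    stitch_mode=MODE_TOP_BOTTOM,
    positions={"front": "top", "rear": "bottom"},
    timestamp_us=0,
)
```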
The first video stream in the first multimedia file has an image fusion effect. Therefore, when video shooting ends, the user can watch the first video stream with the image fusion effect in the stored first multimedia file, and can promptly share the first multimedia file with others.
The multiple video streams in the second multimedia file are original video streams that have not undergone image fusion processing, i.e. video streams without an image fusion effect. The image fusion parameters in the second multimedia file indicate the image fusion manner to be adopted when the multiple video streams in the second multimedia file are fused later. Therefore, after video shooting ends, the terminal can play each of the multiple video streams from the stored second multimedia file, and can also generate a fused video stream with an image fusion effect from the multiple video streams and the image fusion parameters in the stored second multimedia file. Because the terminal no longer needs to record video in real time after shooting ends, it can provide higher video processing capability, so the image fusion effect of the fused video stream generated from the second multimedia file is better than that of the first video stream generated in the first multimedia file during shooting, and the user ultimately obtains a video stream with a better image fusion effect for playback.
In one possible manner, the operation of acquiring multiple video streams may be: acquiring one video stream collected by each of a plurality of cameras, to obtain the multiple video streams.
This manner corresponds to a multi-shot and co-recorded scene, in which video is recorded simultaneously through a plurality of cameras, so that one video stream collected by each of the plurality of cameras is obtained.
As an example, the plurality of cameras may all be provided on the terminal. In this case, the terminal records video through its own plurality of cameras, so the terminal can acquire one video stream collected by each of the plurality of cameras to obtain the multiple video streams.
As another example, some of the plurality of cameras may be provided on the terminal, and the other cameras may be provided on a cooperative device that is in a multi-screen collaboration state with the terminal. In this case, the terminal records video simultaneously through its own camera and the camera of the cooperative device, and the cooperative device sends the video stream collected by its own camera to the terminal, so the terminal can obtain the video stream collected by its own camera and the video stream collected by the camera of the cooperative device to obtain the multiple video streams.
In another possible manner, the operation of acquiring multiple video streams may be: acquiring one video stream collected by a camera, and performing image processing on that video stream to obtain another video stream.
This manner corresponds to a single-shot and co-recorded scene, in which video is recorded through one camera to obtain the video stream collected by that camera, and image processing is performed on that video stream to obtain another video stream, so that two video streams are obtained: the original video stream and the video stream obtained through image processing.
It should be noted that the terminal may also perform different kinds of image processing on that one video stream to obtain several different video streams, so that at least three video streams may be obtained: the original video stream and at least two video streams obtained through different image processing.
Optionally, after the image fusion processing is performed on the multiple video streams to obtain the first video stream, video images of the first video stream may be displayed on a video recording interface, so that a real-time preview of the shot video is provided during shooting and the user can learn the image fusion effect of the video in time.
In one possible manner, the operation of generating the second multimedia file containing the multiple video streams and the image fusion parameters may be: encoding each of the multiple video streams separately to obtain multiple video files; for any one of the multiple video files, taking that video file as a video track and the image fusion parameters as a parameter track, and encapsulating the video track and the parameter track to obtain a corresponding multi-track file; and determining the multiple multi-track files that correspond one-to-one to the multiple video files as the second multimedia file.
In this way, the video file of each of the multiple video streams is encapsulated separately to obtain a corresponding multi-track file, so that one multi-track file is obtained for each of the multiple video streams and multiple multi-track files are obtained. The second multimedia file then comprises these multiple multi-track files.
In another possible manner, the operation of generating the second multimedia file containing the multiple video streams and the image fusion parameters may be: encoding each of the multiple video streams separately to obtain multiple video files; taking each of the multiple video files as a video track to obtain multiple video tracks; taking the image fusion parameters as a parameter track; and encapsulating the multiple video tracks and the parameter track to obtain the second multimedia file.
In this way, the second multimedia file is obtained by encapsulating the video files of all of the multiple video streams together into a single file.
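A minimal sketch of this second manner is given below, assuming the encoded streams and the per-frame fusion parameters are already available in memory. The container layout (a JSON dictionary with a "tracks" list) is only a stand-in for a real multi-track container such as MP4, and build_second_multimedia_file is a hypothetical helper, not an API of the application; in the first manner described above, the same parameter track would instead be encapsulated once together with each individual video file.

```python
import json

def build_second_multimedia_file(encoded_streams, fusion_params_list, out_path):
    """Sketch: every encoded video stream becomes one video track and the image
    fusion parameters become one parameter track, all encapsulated into a single
    multi-track file.

    encoded_streams    : assumed dict {stream_id: bytes of the encoded video file}
    fusion_params_list : assumed list of per-frame fusion parameter dicts
    """
    container = {"tracks": []}
    # One video track per original (un-fused) video stream.
    for stream_id, data in encoded_streams.items():
        container["tracks"].append({
            "type": "video",
            "stream_id": stream_id,
            "bytes": len(data),      # stand-in for the actual sample data
        })
    # One parameter track carrying the frame-by-frame image fusion parameters.
    container["tracks"].append({
        "type": "fusion_parameters",
        "samples": fusion_params_list,
    })
    with open(out_path, "w") as f:
        json.dump(container, f)
```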
Further, after video shooting ends, the first video stream in the first multimedia file may be displayed in a video list together with an association button indicating the second multimedia file associated with the first multimedia file. If a selection operation on the association button is detected, the multiple video streams in the second multimedia file are displayed, so that the user can see which original video streams were fused in the first multimedia file and can conveniently select any of the multiple video streams in the second multimedia file to play.
Further, after video shooting ends, the multiple video streams may be obtained from the second multimedia file, and then at least one of the multiple video streams is played. For example, the multiple video streams may be presented in a video list, and the user may then choose to play at least one of them. Then, if a fusion adjustment instruction for the video images of that at least one video stream is received during its playback, the image fusion parameters in the second multimedia file are updated according to the fusion adjustment information carried by the fusion adjustment instruction.
The fusion adjustment instruction indicates the adjusted image fusion manner to be adopted for the multiple video streams. During playback of the at least one video stream, the user can manually trigger a fusion adjustment instruction frame by frame as needed, where the fusion adjustment instruction indicates a change to the image fusion manner, for example a change to the image stitching mode and/or a change to the image stitching position of each video stream. That is, the fusion adjustment information carried in the fusion adjustment instruction may include the image stitching mode to be adjusted to, and/or the image stitching position to which each video stream is to be adjusted. The terminal can therefore modify the image fusion parameters in the second multimedia file according to the fusion adjustment information, so that the image fusion parameters are updated and the image fusion processing subsequently performed according to the image fusion parameters in the second multimedia file meets the user's latest requirements.
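The sketch below illustrates how such a frame-by-frame update of the image fusion parameters might be applied. The parameter-track representation and the fields of the adjustment (first_frame, last_frame, stitch_mode, positions) are assumptions made for the sketch, not the actual instruction format.

```python
def apply_fusion_adjustment(param_track, adjustment):
    """Sketch of updating the image fusion parameters in the second multimedia
    file according to a fusion adjustment instruction (names are assumptions).

    param_track : list of per-frame parameter dicts
    adjustment  : dict with the frame range to modify, and optionally a new
                  stitching mode and/or new stitching positions per stream
    """
    for i in range(adjustment["first_frame"], adjustment["last_frame"] + 1):
        frame_params = param_track[i]
        if "stitch_mode" in adjustment:        # change the image stitching mode
            frame_params["stitch_mode"] = adjustment["stitch_mode"]
        if "positions" in adjustment:          # change per-stream stitching positions
            frame_params["positions"].update(adjustment["positions"])
    return param_track

# Example: from frame 120 to frame 299, switch to picture-in-picture with the
# front-camera stream as the sub-picture nested on the rear-camera main picture.
# param_track = apply_fusion_adjustment(param_track, {
#     "first_frame": 120, "last_frame": 299,
#     "stitch_mode": "pip",
#     "positions": {"rear": "main", "front": "sub"},
# })
```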
Further, after video shooting ends, the multiple video streams and the image fusion parameters may be obtained from the second multimedia file, image fusion processing is then performed on the multiple video streams according to the image fusion parameters to obtain a second video stream, and a third multimedia file is generated from the second video stream. Furthermore, the first multimedia file stored in association with the second multimedia file may be updated to the third multimedia file.
In this case, the image fusion parameters corresponding to the first video stream are the same as those corresponding to the second video stream; that is, the same image fusion manner is used to perform image fusion processing on the multiple video streams to obtain the first video stream and the second video stream. Because video no longer needs to be recorded in real time after shooting ends, higher video processing capability is available, and the image fusion effect of the second video stream generated at this point is better than that of the first video stream generated during shooting.
In this case, the first multimedia file stored in association with the second multimedia file is updated to the third multimedia file, so that the video stream in the multimedia file stored in association with the second multimedia file is one with a better image fusion effect, and the user ultimately obtains a video stream with a better image fusion effect for playback.
It should be noted that if the image fusion parameters in the second multimedia file have been updated according to a fusion adjustment instruction triggered by the user, a third multimedia file may be generated from the second multimedia file. The multimedia file stored in association with the second multimedia file (which may be the first multimedia file or an older third multimedia file) is then updated to the newly generated third multimedia file, so that the video stream in the multimedia file stored in association with the second multimedia file is one with a good image fusion effect that meets the user's latest image fusion requirements.
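A high-level sketch of this post-shooting flow is given below, under the assumption that the second multimedia file has already been parsed into its original streams plus a parameter track; fuse_frame, encode and store are hypothetical callables standing in for the terminal's actual fusion, encoding and storage steps, not real APIs.

```python
def regenerate_fused_file(second_file, fuse_frame, encode, store):
    """Sketch: read the original video streams and the (possibly updated) image
    fusion parameters from the second multimedia file, re-run image fusion with
    full (non-real-time) processing capability to obtain the second video stream,
    generate the third multimedia file from it, and replace the file stored in
    association with the second multimedia file.
    """
    streams = second_file["video_streams"]          # the original, un-fused streams
    param_track = second_file["fusion_parameters"]  # frame-by-frame fusion parameters

    fused_frames = []
    for i, params in enumerate(param_track):
        frames_i = [s[i] for s in streams]          # time-stamp-aligned i-th frames
        fused_frames.append(fuse_frame(frames_i, params))  # higher-quality fusion

    third_file = encode(fused_frames)               # encode the second video stream
    store(third_file, associate_with=second_file)   # replaces the previously associated file
```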
In a second aspect, a video photographing apparatus is provided, which has a function of implementing the video photographing method behavior in the first aspect. The video shooting device comprises at least one module, and the at least one module is used for realizing the video shooting method provided by the first aspect.
In a third aspect, a video capturing apparatus is provided, where the video capturing apparatus includes a processor and a memory, where the memory is configured to store a program for supporting the video capturing apparatus to perform the video capturing method provided in the first aspect, and store data related to implementing the video capturing method in the first aspect. The processor is configured to execute a program stored in the memory. The video capture device may further include a communication bus for establishing a connection between the processor and the memory.
In a fourth aspect, a computer readable storage medium is provided, in which instructions are stored which, when run on a computer, cause the computer to perform the video capturing method according to the first aspect described above.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the video capturing method of the first aspect described above.
The technical effects obtained by the second, third, fourth and fifth aspects are similar to the technical effects obtained by the corresponding technical means in the first aspect, and are not described in detail herein.
Drawings
Fig. 1 is a schematic structural diagram of a terminal according to an embodiment of the present application;
FIG. 2 is a block diagram of a software system of a terminal according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a video image according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a first video recording interface according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a second video recording interface according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a third video recording interface according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a fourth video recording interface according to an embodiment of the present application;
fig. 8 is a flowchart of a video shooting method according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a dual video container provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of another dual video container provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of a video list provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of generating a third multimedia file according to an embodiment of the present application;
fig. 13 is a schematic diagram of a first video capturing method according to an embodiment of the present application;
fig. 14 is a schematic diagram of a second video capturing method according to an embodiment of the present application;
fig. 15 is a schematic diagram of a third video capturing method according to an embodiment of the present application;
fig. 16 is a schematic diagram of a fourth video capturing method according to an embodiment of the present application;
fig. 17 is a schematic structural diagram of a video capturing apparatus according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that references to "a plurality" in this application mean two or more. In the description of the present application, "/" means "or" unless otherwise indicated; for example, A/B may represent A or B. "And/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone. In addition, to facilitate a clear description of the technical solutions of the present application, the words "first", "second", etc. are used to distinguish between identical or similar items having substantially the same function and effect. Those skilled in the art will appreciate that the words "first", "second", etc. do not limit the quantity or the order of execution, and do not necessarily indicate a difference.
The statements of "one embodiment" or "some embodiments" and the like, described in this disclosure, mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present disclosure. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the present application are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. Furthermore, the terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless otherwise specifically noted.
The terminal according to the embodiment of the present application will be described first.
Fig. 1 is a schematic structural diagram of a terminal according to an embodiment of the present application. Referring to fig. 1, the terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (universal serial bus, USB) interface 130, a charge management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display 194, a subscriber identity module (subscriber identity module, SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It should be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the terminal 100. In other embodiments of the application, the terminal 100 may include more or fewer components than shown, or combine certain components, or split certain components, or have a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units, such as: the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a memory, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors.
The controller may be the nerve center and command center of the terminal 100. The controller can generate operation control signals according to instruction operation codes and timing signals, and complete the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that the processor 110 has just used or recycled. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Repeated accesses are avoided and the latency of the processor 110 is reduced, thereby improving the efficiency of the system.
The charge management module 140 is configured to receive a charge input from a charger. The charger can be a wireless charger or a wired charger. In some wired charging embodiments, the charge management module 140 may receive a charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, the charge management module 140 may receive wireless charging input through a wireless charging coil of the terminal 100. The charging management module 140 may also supply power to the terminal 100 through the power management module 141 while charging the battery 142.
The power management module 141 is used for connecting the battery 142, and the charge management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140 to power the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be configured to monitor battery capacity, battery cycle number, battery health (leakage, impedance) and other parameters. In other embodiments, the power management module 141 may also be provided in the processor 110. In other embodiments, the power management module 141 and the charge management module 140 may be disposed in the same device.
The wireless communication function of the terminal 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
The mobile communication module 150 may provide a solution including 2G/3G/4G/5G wireless communication applied to the terminal 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA), etc. The mobile communication module 150 may receive electromagnetic waves from the antenna 1, perform processes such as filtering, amplifying, and the like on the received electromagnetic waves, and transmit the processed electromagnetic waves to the modem processor for demodulation. The mobile communication module 150 can amplify the signal modulated by the modem processor, and convert the signal into electromagnetic waves through the antenna 1 to radiate. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be provided in the same device as at least some of the modules of the processor 110.
The wireless communication module 160 may provide solutions for wireless communication including wireless local area network (wireless local area networks, WLAN) (e.g., wireless fidelity (wireless fidelity, wi-Fi) network), bluetooth (BT), global navigation satellite system (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field wireless communication technology (near field communication, NFC), infrared technology (IR), etc., applied on the terminal 100. The wireless communication module 160 may be one or more devices that integrate at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, modulates the electromagnetic wave signals, filters the electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, frequency modulate it, amplify it, and convert it to electromagnetic waves for radiation via the antenna 2.
Terminal 100 implements display functions via a GPU, display 194, and application processor, etc. The GPU is a microprocessor for image processing, and is connected to the display 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The terminal 100 may implement photographing functions through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like. The terminal 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
In an embodiment of the present application, the terminal 100 may record video through one or more cameras 193. In a multi-shot and co-recorded scene, the terminal 100 performs video recording simultaneously through a plurality of cameras 193. In a single-shot and co-recorded scene, the terminal 100 performs video recording through one camera 193. The camera 193 is used to capture a video stream. The video stream may be collected by the camera 193 and then transmitted to the ISP for processing.
As an example, the video image of the video stream acquired by the camera 193 is in a RAW format, and the ISP may convert the video image in the RAW format in the video stream into a video image in a YUV format, and then perform basic processing on the video image in the YUV format, such as adjusting contrast, removing noise, and the like.
In a multi-shot and co-recorded scene, the ISP may receive the video stream collected by each of the plurality of cameras 193 and perform basic processing on the multiple video streams before transmitting them to the application processor. In a single-shot and co-recorded scene, the ISP may receive the video stream collected by one camera 193, perform basic processing on that video stream, and then perform image processing on the processed video stream, for example enlarging and cropping, to obtain another video stream, and then transmit the two video streams to the application processor.
The application processor can perform image fusion processing on the multiple paths of video streams to obtain a first video stream, and can also generate a first multimedia file containing the first video stream, and further, can display video images of the first video stream on a video interface through a video codec, a GPU and a display screen 194 to realize video preview.
Meanwhile, the application processor can also acquire image fusion parameters corresponding to the first video stream, wherein the image fusion parameters are used for indicating an image fusion mode of the multi-path video stream when the first video stream is obtained, then a second multimedia file containing the image fusion parameters corresponding to the multi-path video stream and the first video stream is generated, the second multimedia file and the first multimedia file are stored in an associated mode, and the image fusion parameters in the second multimedia file are used for indicating the image fusion mode needed to be adopted by the multi-path video stream in the second multimedia file in the subsequent fusion. Thus, after the video shooting is finished, a video stream with better image fusion effect can be generated according to the stored second multimedia file.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the storage capability of the terminal 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement data storage functions, for example storing music, video and other files in the external memory card.
The internal memory 121 may be used to store computer-executable program code that includes instructions. The processor 110 performs various functional applications of the terminal 100 and data processing by executing instructions stored in the internal memory 121. The internal memory 121 may include a storage program area and a storage data area. The storage program area may store an application program (such as a sound playing function, an image playing function, etc.) required for at least one function of the operating system, etc. The storage data area may store data (e.g., audio data, phonebook, etc.) created by the terminal 100 during use, and the like. In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and the like.
The terminal 100 may implement audio functions such as music playing, recording, etc. through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, etc.
The audio module 170 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or a portion of the functional modules of the audio module 170 may be disposed in the processor 110.
The SIM card interface 195 is used to connect a SIM card. A SIM card can be brought into contact with or separated from the terminal 100 by being inserted into or withdrawn from the SIM card interface 195. The terminal 100 may support 1 or N SIM card interfaces, where N is an integer greater than 1. The SIM card interface 195 may support Nano SIM cards, Micro SIM cards, and the like. Multiple cards may be inserted into the same SIM card interface 195 simultaneously, and the cards may be of the same type or of different types. The SIM card interface 195 may also be compatible with different types of SIM cards and with external memory cards. The terminal 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the terminal 100 employs an eSIM, i.e. an embedded SIM card, which may be embedded in the terminal 100 and cannot be separated from the terminal 100.
The software system of the terminal 100 will be described next.
The software system of the terminal 100 may employ a layered architecture, an event driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture. In the embodiment of the application, an Android (Android) system with a layered architecture is taken as an example, and a software system of the terminal 100 is illustrated.
Fig. 2 is a block diagram of the software system of the terminal 100 according to an embodiment of the present application. Referring to fig. 2, the layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided, from top to bottom, into an application layer (APP), an application framework layer (FWK), an Android runtime (Android Runtime) and system layer, and a kernel layer (kernel).
The application layer may include a series of application packages. As shown in fig. 2, the application package may include applications for cameras, gallery, calendar, phone calls, maps, navigation, WLAN, bluetooth, music, video, short messages, etc.
The application framework layer provides an application programming interface (application programming interface, API) and programming framework for application programs of the application layer. The application framework layer includes a number of predefined functions. As shown in FIG. 2, the application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, a notification manager, and the like. The window manager is used for managing window programs. The window manager can acquire the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like. The content provider is used to store and retrieve data, which may include video, images, audio, calls made and received, browsing history and bookmarks, phonebooks, etc., and make such data accessible to the application. The view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and the like. The view system may be used to construct a display interface for an application, which may be comprised of one or more views, such as a view that includes displaying a text notification icon, a view that includes displaying text, and a view that includes displaying a picture. The telephony manager is used to provide communication functions of the terminal 100, such as management of call status (including on, off, etc.). The resource manager provides various resources to the application program, such as localization strings, icons, pictures, layout files, video files, and the like. The notification manager allows the application to display notification information in a status bar, can be used to communicate notification type messages, can automatically disappear after a short dwell, and does not require user interaction, e.g., the notification manager is used to notify that a download is complete, a message alert, etc. The notification manager may also be a notification that appears in the system top status bar in the form of a chart or a scroll bar text, such as a notification of a background running application. The notification manager may also be a notification that appears on the screen in the form of a dialog window, such as prompting a text message in a status bar, sounding a prompt, vibrating an electronic device, flashing an indicator light, etc.
The Android Runtime includes a core library and a virtual machine, and is responsible for scheduling and management of the Android system. The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android. The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files, and is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system layer may include a plurality of functional modules such as: surface manager (surface manager), media Libraries (Media Libraries), three-dimensional graphics processing Libraries (e.g., openGL ES), two-dimensional graphics engines (e.g., SGL), etc. The surface manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications. Media libraries support a variety of commonly used audio, video format playback and recording, still image files, and the like. The media library may support a variety of audio and video encoding formats, such as: MPEG4, h.264, MP3, AAC, AMR, JPG, PNG, etc. The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like. A two-dimensional graphics engine is a drawing engine that draws two-dimensional drawings.
The kernel layer is the layer between hardware and software. The kernel layer includes at least a display driver, a camera driver, an audio driver, a sensor driver, and the like.
Before explaining the embodiments of the present application in detail, application scenarios related to the embodiments of the present application are explained.
Currently, as shown in fig. 3, in many video scenes, a terminal such as a mobile phone, a tablet computer, a notebook computer, etc. can display a video image 31 of each of multiple video streams during video capturing. The multiple video streams may be video streams collected by different cameras, and such video recording scenes may be referred to as multi-shot and co-recorded scenes. Alternatively, the multiple video streams may be video streams collected by one camera but processed differently, and such video scenes may be referred to as single shot co-recorded scenes.
These two video scenes are exemplarily described below.
First kind of video scene: multi-shot and co-recorded scene
In a multi-shot and co-recorded scene, video is recorded simultaneously by a plurality of cameras, and video images of video streams acquired by each of the plurality of cameras are displayed in a video recording interface (which may also be referred to as a video preview interface or a video shooting interface).
In one possible case, the terminal has a plurality of cameras whose shooting directions are different. The terminal can start a multi-camera video recording function to simultaneously record video through a plurality of cameras of the terminal, and then video images of video streams acquired by each of the cameras are displayed in a video recording interface.
For example, the terminal may have a front camera and a rear camera. After the terminal starts the multi-camera video recording function, a front camera and a rear camera of the terminal are started, the front camera collects one path of video stream, and the rear camera collects one path of video stream. Thereafter, as shown in fig. 4, the terminal may display a video image 421 of the video stream collected by the front camera and a video image 422 of the video stream collected by the rear camera in the video interface 41.
In another possible case, the terminal is in a multi-screen collaborative state with other devices (which may be referred to as collaborative devices), both the terminal and the collaborative device have cameras, and the terminal can take pictures by means of the cameras of the collaborative device. The terminal can start a collaborative video recording function to simultaneously record video through the camera of the terminal and the camera of the collaborative device, and then video images of video streams collected by the camera of the terminal and video images of video streams collected by the camera of the collaborative device are displayed in a video recording interface.
The terminal and the cooperative device are provided with a camera, and after the terminal starts the cooperative video recording function, the terminal starts the camera of the terminal and instructs the cooperative device to start the camera of the cooperative device. The camera of the terminal can collect one path of video stream, the camera of the cooperative equipment can collect one path of video stream, and the cooperative equipment can send the video stream collected by the camera of the cooperative equipment to the terminal. Thereafter, as shown in fig. 5, the terminal 501 may display a video image 521 of the video stream collected by its own camera and a video image 522 of the video stream collected by the camera of the cooperative apparatus 502 in the video interface 51.
The second video scene: single shot and co-recorded scene
In a single-shot and co-recorded scene, video is recorded through one camera, and video images of differently processed video streams derived from the stream collected by that camera are displayed in the video recording interface.
In one possible case, the terminal has one camera. The terminal can start a single-shot video recording function to record video through its camera, and then display, in the video recording interface, the video images of the differently processed video streams derived from the stream collected by that camera.
For example, the terminal may have a rear camera. After the terminal starts the single-shot video recording function, it starts its rear camera, the rear camera collects one video stream, and the terminal enlarges and crops the video images of that video stream to obtain the video images of another video stream. Thereafter, as shown in fig. 6, the terminal may display, in the video recording interface 61, a video image 622 of the original video stream collected by the rear camera and a video image 621 of the other video stream obtained through the enlarging and cropping. The video image 622 is an original video image captured by the rear camera, and the video image 621 is a video image obtained by enlarging and cropping the original video image 622.
In any of the above video recording scenes, the terminal acquires multiple video streams during video shooting and displays a video image of each of the multiple video streams in the video recording interface. Optionally, when the video images of the multiple video streams are displayed, the video images of the individual video streams may be stitched according to a specific image stitching mode to obtain the video images of a fused video stream, and the video images of the fused video stream are then displayed in the video recording interface.
Illustratively, the image stitching modes may include an up-down stitching mode, a left-right stitching mode, a picture-in-picture nesting mode, and the like. The up-down stitching mode means stitching the video images of each of the multiple video streams in order from top to bottom, so that in the video image of the fused video stream obtained according to the up-down stitching mode, the video images of the individual video streams are arranged from top to bottom in sequence. For example, as shown in fig. 4, 5 or 6, the video image 32 of the fused video stream displayed in the video recording interface is obtained by stitching the video images of the individual video streams according to the up-down stitching mode. The left-right stitching mode means stitching the video images of each of the multiple video streams in order from left to right, so that in the video image of the fused video stream obtained according to the left-right stitching mode, the video images of the individual video streams are arranged from left to right in sequence. The picture-in-picture nesting mode means that while the main picture is displayed in full screen, a sub-picture is simultaneously displayed on a small area of the main picture. That is, the picture-in-picture nesting mode takes the video image of one of the multiple video streams as the main picture, takes the video images of the other video streams as sub-pictures, and stitches the sub-pictures onto small areas of the main picture. For example, as shown in fig. 7, during multi-shot video recording, the terminal may display a video image 32 of a fused video stream in the video recording interface 71, where the video image 32 includes a video image 721 of the video stream collected by the front camera of the terminal and a video image 722 of the video stream collected by the rear camera of the terminal; the video image 722 of the video stream collected by the rear camera is the main picture, and the video image 721 of the video stream collected by the front camera is the sub-picture overlaid on a small area of the main picture.
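As an illustration of the three stitching modes described above, the following sketch stitches decoded frames represented as H x W x 3 numpy arrays. It is a simplified approximation (nearest-neighbour scaling, fixed margins, centre-free placement), not the terminal's actual fusion implementation.

```python
import numpy as np

def stitch_top_bottom(images):
    """Up-down stitching: frames arranged from top to bottom in order."""
    width = min(img.shape[1] for img in images)
    return np.vstack([img[:, :width] for img in images])

def stitch_left_right(images):
    """Left-right stitching: frames arranged from left to right in order."""
    height = min(img.shape[0] for img in images)
    return np.hstack([img[:height, :] for img in images])

def stitch_picture_in_picture(main_img, sub_img, scale=0.25, margin=16):
    """Picture-in-picture nesting: the sub-picture overlaid on a small area of the main picture."""
    h, w = main_img.shape[:2]
    sub_h, sub_w = int(h * scale), int(w * scale)
    # Nearest-neighbour shrink of the sub-picture (a real implementation would
    # use a proper resampling filter).
    ys = np.linspace(0, sub_img.shape[0] - 1, sub_h).astype(int)
    xs = np.linspace(0, sub_img.shape[1] - 1, sub_w).astype(int)
    small = sub_img[ys][:, xs]
    out = main_img.copy()
    out[margin:margin + sub_h, margin:margin + sub_w] = small
    return out
```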
As can be seen from the above video recording scenes, after the terminal acquires multiple video streams during video shooting, it needs to perform image fusion processing on the multiple video streams to obtain a fused video stream, so as to display the video images of the fused video stream on the video recording interface. After video shooting ends, the terminal can also store the fused video stream for the user to watch later. However, during video shooting, the terminal is limited by the camera device, the processing chip, the image algorithms and the like, so it is difficult to devote sufficient video processing capability while guaranteeing real-time recording, and the image quality of the fused video stream obtained during shooting is therefore relatively poor.
Therefore, the embodiment of the application provides a video shooting method, which is used for carrying out image fusion processing on multiple paths of video streams in the video shooting process to obtain a first video stream. And then, generating not only a first multimedia file containing the first video stream, but also a second multimedia file containing the multipath video stream and image fusion parameters corresponding to the first video stream, and storing the first multimedia file and the second multimedia file in an associated manner. Therefore, after the video shooting is finished, the user can watch the first video stream with the image fusion effect in the stored first multimedia file, and can share the first multimedia file to other people for watching in time. And the terminal can also generate a fusion video stream with an image fusion effect according to the multipath video streams and the image fusion parameters in the stored second multimedia file. After the video shooting is finished, the terminal does not need to record the video in real time, so that higher video processing capability can be provided, the image fusion effect of the fusion video stream generated according to the second multimedia file is better than that of the first video stream generated in the first multimedia file in the video shooting process, and the user can finally obtain the video stream with better image fusion effect for playing.
The video shooting method provided by the embodiment of the application is explained in detail below.
Fig. 8 is a flowchart of a video photographing method according to an embodiment of the present application, and the method is applied to a terminal, which may be the terminal 100 described in the embodiments of fig. 1-2. Referring to fig. 8, the method includes:
Step 801: the terminal acquires multiple video streams during video shooting.
The time stamps of the video images of the multiple video streams are aligned. That is, for each time stamp there is one corresponding frame of video image in each of the multiple video streams; in other words, the time stamps of the i-th frames of the respective video streams are the same, where i is a positive integer.
Optionally, if the video shooting process is multi-shot co-recording, the multiple video streams may be video streams collected by different cameras. Alternatively, if the video shooting process is single-shot co-recording, the multiple video streams may be video streams that are collected by one camera but processed differently.
In this case, the operation of the terminal to acquire multiple video streams may be implemented in the following two ways:
the first way is: the terminal acquires one path of video stream acquired by each camera in the plurality of cameras to obtain multiple paths of video streams.
This manner corresponds to a multi-shot and co-recorded scene, in which video is recorded simultaneously through a plurality of cameras, so that one video stream collected by each of the plurality of cameras is obtained.
As an example, the plurality of cameras may be all provided at the terminal. At this time, the terminal records the video through a plurality of cameras, so that the terminal can acquire one video stream acquired by each camera in the plurality of cameras to obtain multiple video streams.
As another example, some of the plurality of cameras may be provided on the terminal, and the other cameras may be provided on a cooperative device that is in a multi-screen collaboration state with the terminal. In this case, the terminal records video simultaneously through its own camera and the camera of the cooperative device, and the cooperative device sends the video stream collected by its own camera to the terminal, so the terminal can obtain the video stream collected by its own camera and the video stream collected by the camera of the cooperative device to obtain the multiple video streams.
The second way is: the terminal acquires one path of video stream acquired by the camera, and performs image processing on the one path of video stream to acquire the other path of video stream.
This manner corresponds to a single-shot and co-recorded scene, in which video is recorded through one camera to obtain the video stream collected by that camera, and image processing is performed on that video stream to obtain another video stream, so that two video streams are obtained: the original video stream and the video stream obtained through image processing.
It should be noted that the terminal may further perform different image processing on the one path of video stream to obtain different video streams, so that at least three paths of video streams including the original video stream and at least two paths of video streams obtained through different image processing may be obtained.
The terminal performing image processing on the video stream means processing the video images of that video stream; for example, the video images may be enlarged and cropped to obtain the video images of another video stream.
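A minimal sketch of this enlargement-and-cropping step is shown below, assuming each frame is an H x W x 3 array; the zoom factor, centre crop and nearest-neighbour resampling are illustrative choices only, not the terminal's actual processing.

```python
import numpy as np

def derive_second_stream_frame(frame, zoom=2.0):
    """Sketch of the single-shot co-recording case: a second video stream is
    derived from the original frame by cropping the centre region and enlarging
    it (digital zoom) back to the original resolution.
    """
    h, w = frame.shape[:2]
    crop_h, crop_w = int(h / zoom), int(w / zoom)
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    crop = frame[top:top + crop_h, left:left + crop_w]
    # Nearest-neighbour enlargement back to the original resolution.
    ys = np.linspace(0, crop_h - 1, h).astype(int)
    xs = np.linspace(0, crop_w - 1, w).astype(int)
    return crop[ys][:, xs]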
Optionally, the camera may be disposed at the terminal, or may be disposed at a cooperative device in a multi-screen cooperative state with the terminal, which is not limited in the embodiment of the present application.
Step 802: and the terminal performs image fusion processing on the multiple paths of video streams to obtain a first video stream.
The terminal performs image fusion processing on the multiple paths of video streams, namely, performs fusion processing on video images of each path of video stream in the multiple paths of video streams to obtain video images of a first video stream. Thus, the first video stream is a video stream with a specific image fusion effect.
Because the time stamps of the video images of each video stream in the multiple video streams are aligned, fusion processing can be performed on multiple video images with the same time stamp in the multiple video streams. Specifically, every time the ith frame of video image of each video stream in the multiple video streams is obtained, the ith frame of video image of each video stream in the multiple video streams is subjected to fusion processing to obtain the ith frame of video image of the first video stream, that is, the video images of each video stream in the multiple video streams are subjected to fusion processing frame by frame to obtain each frame of video image of the first video stream, so that the time stamp of the video image of the first video stream is aligned with the time stamp of the video image of each video stream in the multiple video streams. Thus, after the image fusion processing is performed on the multiple paths of video streams, the obtained video image of the first video stream contains the image after the fusion processing is performed on the video image of each path of video stream in the multiple paths of video streams.
When the terminal performs fusion processing on the video images of each path of video stream in the multiple paths of video streams, the terminal can splice the video images of each path of video stream in the multiple paths of video streams according to specific image fusion parameters.
The image fusion parameter is used for indicating the image fusion mode of the multi-path video stream. The image fusion parameters may include an image stitching mode, and may further include an image stitching position of each of the multiple video streams. The image stitching mode may include one or more of an up-down stitching mode, a left-right stitching mode, a picture-in-picture nesting mode, and the like, which is not limited in the embodiment of the present application. The image splicing position of any one of the multiple video streams is used for indicating the position of the video image of the video stream when the video image is spliced according to the corresponding image splicing mode.
For example, the multiple video streams include video stream a and video stream B. Assuming that the image stitching mode is an up-down stitching mode, the image stitching position of the video stream a is up, and the image stitching position of the video stream B is down, the terminal may stitch the video image of the video stream a above the video image of the video stream B to obtain the video image of the first video stream, where the upper half of the video image of the first video stream is the video image of the video stream a, and the lower half is the video image of the video stream B. Or, assuming that the image stitching mode is a picture-in-picture nested mode, the image stitching position of the video stream a is a main picture, and the image stitching position of the video stream B is a sub-picture, the terminal may stitch the video image of the video stream B on a small area of the video stream a to obtain the video image of the first video stream, where the main picture of the video image of the first video stream is the video image of the video stream a, and the sub-picture is the video image of the video stream B.
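As an illustrative, non-limiting sketch in Python (not the disclosed implementation), the following snippet shows how video images of two streams might be stitched according to such image fusion parameters; the parameter layout (a dictionary with "mode" and "positions" keys), the stream identifiers and the picture-in-picture placement values are hypothetical.

import numpy as np

def fuse_up_down(top_img, bottom_img):
    # Place one video image above the other (both assumed to have equal width).
    return np.vstack([top_img, bottom_img])

def fuse_picture_in_picture(main_img, sub_img, y=40, x=40, scale=4):
    # Nest a shrunken sub-picture onto a small region of the main picture.
    small = sub_img[::scale, ::scale]
    out = main_img.copy()
    out[y:y + small.shape[0], x:x + small.shape[1]] = small
    return out

def fuse(frames, params):
    # frames: {"A": image_of_stream_A, "B": image_of_stream_B} (hypothetical ids).
    if params["mode"] == "up_down":
        order = sorted(frames, key=lambda k: params["positions"][k] != "up")
        return fuse_up_down(frames[order[0]], frames[order[1]])
    if params["mode"] == "picture_in_picture":
        main = next(k for k, p in params["positions"].items() if p == "main")
        sub = next(k for k, p in params["positions"].items() if p == "sub")
        return fuse_picture_in_picture(frames[main], frames[sub])
    raise ValueError("unknown image stitching mode")

# Hypothetical usage: stitch a front-camera frame above a rear-camera frame.
front = np.zeros((360, 640, 3), dtype=np.uint8)
rear = np.zeros((360, 640, 3), dtype=np.uint8)
fused = fuse({"A": front, "B": rear},
             {"mode": "up_down", "positions": {"A": "up", "B": "down"}})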
It should be noted that, because the terminal performs fusion processing on the video images of each video stream in the multiple video streams frame by frame to obtain each frame of video image of the first video stream, the image fusion parameters also exist frame by frame. That is, the ith frame of video image of each path of video stream in the multiple paths of video streams corresponds to an image fusion parameter, the ith frame of video image of the first video stream obtained according to the ith frame of video image of each path of video stream in the multiple paths of video streams also corresponds to the image fusion parameter, the image fusion parameter can also have a time stamp, and the time stamp of the image fusion parameter is aligned with the time stamp of the ith frame of video image of each path of video stream in the multiple paths of video streams and the time stamp of the ith frame of video image of the first video stream.
Optionally, the image fusion parameters adopted by the terminal when the video images of each video stream in the multiple video streams are fused may be default, or may be preset by a user according to own needs before capturing the video, or may be automatically determined by the terminal according to the content of the video image of each video stream in the multiple video streams, which is not limited in the embodiment of the present application.
In some embodiments, the user may also actively adjust the image fusion parameters during video capture. For example, assume that the default image stitching mode is the up-down stitching mode. When shooting is just started, the terminal adopts a default up-down splicing mode to splice video images of each path of video stream in the multi-path video stream, after shooting is carried out for a period of time, a user can adjust the image splicing mode in the terminal to be a picture-in-picture nested mode, and then the terminal adopts the picture-in-picture nested mode to continue splicing the video images of each path of video stream in the multi-path video stream.
It should be noted that, the image fusion parameters of the video images of each frame of each video stream in the multiple video streams may be the same or different. In some embodiments, the image fusion manner of the multiple video streams may be continuously changed during the whole video capturing process, where the change may be from a manual adjustment of a user, such as a manual adjustment of an image stitching mode during the video capturing process, or the change may be from an automatic adjustment of the terminal, such as the terminal may select different image fusion parameters according to different contents of video images of the multiple video streams.
For example, the terminal has a front camera and a rear camera. In the first 10 seconds of video capture, the terminal performs image fusion processing on the multiple video streams by using default image fusion parameters. Suppose that the image stitching mode in the default image fusion parameters is the up-down stitching mode, the image stitching position of the video stream collected by the front camera is up, and the image stitching position of the video stream collected by the rear camera is down. Then, as shown in fig. 4, in the first 10 seconds of video capture, the terminal may stitch the video image 421 of the video stream collected by the front camera and the video image 422 of the video stream collected by the rear camera according to the up-down stitching mode to obtain the video image 32 of the first video stream displayed in the video recording interface 41, where the video image 421 and the video image 422 are arranged from top to bottom in the video image 32 of the first video stream.
After the video has been captured for 10 seconds, the user manually adjusts the image fusion parameters: the image stitching mode in the adjusted image fusion parameters is the picture-in-picture nesting mode, the image stitching position of the video stream collected by the front camera is the sub-picture, and the image stitching position of the video stream collected by the rear camera is the main picture. Then, as shown in fig. 7, after the first 10 seconds of video capture, the terminal may stitch the video image 721 of the video stream collected by the front camera and the video image 722 of the video stream collected by the rear camera according to the picture-in-picture nesting mode to obtain the video image 32 of the first video stream displayed in the video recording interface 71, where the video image 722 in the video image 32 of the first video stream is the main picture, and the video image 721 is the sub-picture superimposed on a small area of the main picture.
Step 803: the terminal acquires image fusion parameters corresponding to the first video stream.
The image fusion parameter (which may also be referred to as Metadata) corresponding to the first video stream is used to indicate an image fusion manner of the multiple video streams when the first video stream is obtained, and specifically, the image fusion parameter is parameter information of a splicing manner of video images of the multiple video streams. That is, the terminal performs fusion processing on the video images of each path of video stream in the multiple paths of video streams frame by frame to obtain each frame of video image of the first video stream, and then can acquire image fusion parameters corresponding to each frame of video image of the first video stream frame by frame. In this case, the image fusion parameter corresponding to the i-th frame video image of the first video stream is used to indicate the image fusion mode of the i-th frame video image of each path of video streams in the multiple paths of video streams when the i-th frame video image of the first video stream is obtained, that is, the image fusion parameter corresponding to the i-th frame video image of the first video stream is the image fusion parameter adopted when the i-th frame video image of each path of video streams in the multiple paths of video streams is fused.
Because the image fusion parameters corresponding to each frame of video image of the first video stream are obtained frame by frame, the image fusion parameters corresponding to the first video stream are actually a parameter stream, the image fusion parameters of the parameter stream have time stamps, the time stamps of the image fusion parameters of the parameter stream are aligned with the time stamps of the video images of the first video stream, and the image fusion parameters of the parameter stream are used for indicating how to obtain the video images of the first video stream according to the fusion of the video images of each path of video stream in the multipath video stream, namely, the image fusion parameters of the parameter stream are description of the image fusion mode of frame by frame.
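As an illustrative, non-limiting sketch in Python (not the disclosed implementation), the parameter stream described above could be represented by per-frame records whose time stamps match the fused video images; the class and field names below are hypothetical.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FusionParam:
    timestamp_us: int          # aligned with the time stamp of the corresponding fused video image
    stitching_mode: str        # e.g. "up_down", "left_right" or "picture_in_picture"
    positions: Dict[str, str]  # stream identifier -> image stitching position

@dataclass
class FusionParamStream:
    params: List[FusionParam] = field(default_factory=list)

    def record(self, timestamp_us: int, mode: str, positions: Dict[str, str]) -> None:
        # Called once per fused frame, so the parameter stream is frame-by-frame.
        self.params.append(FusionParam(timestamp_us, mode, dict(positions)))

# Hypothetical usage: the first fused frame was stitched top/bottom.
metadata = FusionParamStream()
metadata.record(0, "up_down", {"front": "up", "rear": "down"})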
Further, after the first video stream is obtained, the terminal can display the video image of the first video stream on the video recording interface, namely, one frame of video image of the first video stream can be displayed on the video recording interface every time the video image of the first video stream is obtained, so that real-time preview of the shot video can be realized in the video shooting process, and a user can know the image fusion effect of the video in time conveniently.
For example, the terminal has a front camera and a rear camera. Suppose that the image stitching mode in the image fusion parameters is the up-down stitching mode, the image stitching position of the video stream collected by the front camera is up, and the image stitching position of the video stream collected by the rear camera is down. Then, as shown in fig. 4, the terminal may stitch the video image 421 of the video stream collected by the front camera and the video image 422 of the video stream collected by the rear camera according to the up-down stitching mode to obtain the video image 32 of the first video stream displayed in the video recording interface 41, where the video image 421 and the video image 422 are arranged from top to bottom in the video image 32 of the first video stream.
For another example, the terminal has a front camera and a rear camera. Suppose that the image stitching mode in the image fusion parameters is the picture-in-picture nesting mode, the image stitching position of the video stream collected by the front camera is the sub-picture, and the image stitching position of the video stream collected by the rear camera is the main picture. Then, as shown in fig. 7, the terminal may stitch the video image 721 of the video stream collected by the front camera and the video image 722 of the video stream collected by the rear camera according to the picture-in-picture nesting mode to obtain the video image 32 of the first video stream displayed in the video recording interface 71, where the video image 722 in the video image 32 of the first video stream is the main picture, and the video image 721 is the sub-picture superimposed on a small area of the main picture.
Step 804: the terminal generates a first multimedia file comprising a first video stream.
The first multimedia file is a file for playing the first video stream. The first video stream in the first multimedia file has an image fusion effect.
The terminal can be continuously fused to obtain the video image of the first video stream in the video shooting process, so that the first multimedia file can be continuously generated according to the first video stream. Therefore, after the video shooting is finished, the terminal can obtain the first multimedia file containing the complete first video stream, so that the user can share the first multimedia file in real time.
Optionally, when generating the first multimedia file including the first video stream, the terminal may first encode the first video stream to obtain a video file, and then encapsulate the video file and other related files (including but not limited to an audio file) to obtain the first multimedia file. Of course, the terminal may generate the first multimedia file including the first video stream in other manners, which is not limited by the embodiment of the present application.
The format of the video file may be a preset format, for example, the Moving Picture Experts Group 4 (MPEG-4) format, that is, the MP4 format, or the Flash Video (FLV) format, or the like; of course, other formats may also be used, which is not limited in the embodiment of the present application.
The audio file may be obtained by encoding an audio stream. The audio stream may be continuously acquired by the terminal during the video capturing process, for example, may be continuously acquired by a microphone of the terminal. The time stamps of the audio frames of the audio stream are aligned with the time stamps of the video images of each of the multiple video streams. The format of the audio file may be the same as or different from the format of the video file, for example, the format of the audio file may be MP4 format, FLV format, Advanced Audio Coding (AAC) format, etc., which is not limited in the embodiment of the present application.
When the terminal encapsulates the video file and other related files, the video file may be used as a video track (track), other related files may be used as other tracks (e.g., an audio file may be used as an audio track), and the video track and the other tracks are then encapsulated to obtain a multi-track file as the first multimedia file. Here, a track is a sequence ordered by time stamps.
For example, the terminal may use a video multiplexer to encapsulate (also referred to as synthesizing (mux)) the video track corresponding to the video file and the audio track corresponding to the audio file into an MP4 file, where the MP4 file is a multi-track file, that is, a first multimedia file.
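As an illustrative, non-limiting sketch in Python (not a real container writer and not the disclosed implementation), the following snippet only models the structure described above: the encoded video file becomes a video track, the encoded audio file becomes an audio track, and both are packaged into one multi-track object standing in for the first multimedia file; all class and key names are hypothetical.

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Track:
    kind: str                         # "video", "audio" or "parameter"
    samples: List[Tuple[int, bytes]]  # (timestamp, payload) pairs ordered by time

@dataclass
class MultiTrackFile:
    tracks: Dict[str, Track] = field(default_factory=dict)

def mux_first_multimedia_file(video_samples, audio_samples) -> MultiTrackFile:
    # Package the video track and the audio track into one multi-track file.
    f = MultiTrackFile()
    f.tracks["video"] = Track("video", video_samples)
    f.tracks["audio"] = Track("audio", audio_samples)
    return f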
Step 805: the terminal generates a second multimedia file containing the multiple video streams and the image fusion parameters.
Each of the multiple video streams is stored separately in the second multimedia file, i.e., each video stream is independent. The second multimedia file may be used to play each of the multiple video streams separately. The multiple video streams in the second multimedia file are original video streams without image fusion processing, i.e. video streams without image fusion effect. The image fusion parameter in the second multimedia file is used for indicating an image fusion mode which needs to be adopted by the multi-path video stream in the second multimedia file in the subsequent fusion.
The terminal can continuously acquire video images of each path of video streams in the multipath video streams in the video shooting process, and can continuously acquire the image fusion parameters in the process of continuously carrying out image fusion processing on the multipath video streams, so that a second multimedia file can be continuously generated according to the multipath video streams and the image fusion parameters. Therefore, after video shooting is finished, the terminal can obtain the second multimedia file containing the complete multi-path video stream and the complete image fusion parameters, so that the multi-path video stream can be conveniently subjected to post-processing according to the image fusion parameters, and the post-processing space of the multi-path video stream is improved.
Alternatively, the operation of step 805 may be implemented in two possible ways as follows.
A first possible way is: the terminal encodes each path of video stream in the multiple paths of video streams respectively to obtain multiple video files; for any one video file in the plurality of video files, packaging the video file and the image fusion parameters to obtain a corresponding packaging file; a plurality of encapsulation files corresponding to the plurality of video files one to one are determined as second multimedia files.
The format of the video file may be a preset format, for example, may be MP4 format, FLV format, etc., which is not limited in the embodiment of the present application.
In this way, the video files of each path of video stream in the multiple paths of video streams are individually packaged to obtain a corresponding packaged file, so that the packaged file of each path of video stream in the multiple paths of video streams can be obtained, and a plurality of packaged files are obtained. The second multimedia file includes the plurality of encapsulated files.
Optionally, when the terminal encapsulates a certain video file and the image fusion parameter, other related files may also be encapsulated together, for example, the terminal may encapsulate the video file, the image fusion parameter and the audio file, so as to obtain a corresponding encapsulated file.
The audio file may be obtained by encoding an audio stream. The audio stream may be continuously acquired by the terminal during the video capturing process, for example, may be continuously acquired by a microphone of the terminal. The time stamps of the audio frames of the audio stream are aligned with the time stamps of the video images of each of the multiple video streams. The format of the audio file may be the same as or different from the format of the video file, for example, the format of the audio file may be MP4 format, FLV format, AAC format, etc., which is not limited in the embodiment of the present application.
Optionally, when the terminal encapsulates a certain video file, the image fusion parameter and other related files, the video file may be used as a video track, the image fusion parameter may be used as a parameter track, other related files may be used as other tracks, and the video track, the parameter track and the other tracks are encapsulated to obtain a corresponding multi-track file as an encapsulated file. In this case, a plurality of multi-track files corresponding to the plurality of video files one by one are determined as the second multimedia file.
For example, for any one video file in the plurality of video files, the terminal may use a video multiplexer to package the video track corresponding to the video file, the parameter track corresponding to the image fusion parameter, and the audio track corresponding to the audio file into an MP4 file, where the MP4 file is a multi-track file. And then determining the MP4 files which are obtained by encapsulation and correspond to the video files one by one as second multimedia files.
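As an illustrative, non-limiting sketch in Python (not the disclosed implementation), the first way could be modelled as follows: each encoded video file is wrapped into its own encapsulation file together with the parameter track and the audio track, and the resulting set of encapsulation files constitutes the second multimedia file; the stream identifiers and key names are hypothetical.

def build_second_multimedia_file_per_stream(video_files, fusion_params, audio_file):
    # video_files: {"A": encoded_video_file_A, "B": encoded_video_file_B}.
    # Every encapsulation file carries its own copy of the parameter track.
    return {
        name: {
            "video_track": encoded,
            "parameter_track": fusion_params,
            "audio_track": audio_file,
        }
        for name, encoded in video_files.items()
    }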
A second possible way is: the terminal encodes each path of video stream in the multiple paths of video streams respectively to obtain multiple video files; and packaging the video files and the image fusion parameters to obtain a second multimedia file.
The format of the video file may be a preset format, for example, may be MP4 format, FLV format, etc., which is not limited in the embodiment of the present application.
In this way, the plurality of video files of the multi-path video stream are integrally encapsulated to obtain an encapsulated file as the second multimedia file.
Optionally, when the terminal encapsulates the plurality of video files and the image fusion parameters, other related files may also be encapsulated together, for example, the terminal may encapsulate the plurality of video files and the image fusion parameters and the audio file, so as to obtain a second multimedia file.
The audio file may be obtained by encoding an audio stream. The audio stream may be continuously acquired by the terminal during the video capturing process, for example, may be continuously acquired by a microphone of the terminal. The time stamps of the audio frames of the audio stream are aligned with the time stamps of the video images of each of the multiple video streams. The format of the audio file may be the same as or different from the format of the video file, for example, the format of the audio file may be MP4 format, FLV format, AAC format, etc., which is not limited in the embodiment of the present application.
Optionally, when the terminal encapsulates the plurality of video files, the image fusion parameters and other related files, each video file in the plurality of video files may be used as a video track to obtain a plurality of video tracks, the image fusion parameters are used as parameter tracks, other related files are used as other tracks, and then the plurality of video tracks, the parameter tracks and the other tracks are encapsulated to obtain the second multimedia file.
For example, the terminal may use a video multiplexer to package a plurality of video tracks corresponding to the plurality of video files one by one, a parameter track corresponding to the image fusion parameter, and an audio track corresponding to the audio file into an MP4 file, where the MP4 file is a multi-track file, that is, a second multimedia file.
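As an illustrative, non-limiting sketch in Python (not the disclosed implementation), the second way could instead place all encoded video files into a single multi-track structure with one parameter track and one audio track; the track names are hypothetical.

def build_second_multimedia_file_single(video_files, fusion_params, audio_file):
    # One multi-track file: one video track per stream, plus a parameter track
    # and an audio track, all sharing the same timeline.
    tracks = {f"video_{name}": encoded for name, encoded in video_files.items()}
    tracks["parameter"] = fusion_params
    tracks["audio"] = audio_file
    return tracks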
Step 806: the terminal stores the first multimedia file in association with the second multimedia file.
The first video stream in the first multimedia file has an image fusion effect. Therefore, when the video shooting is finished, the user can watch the first video stream with the image fusion effect in the stored first multimedia file, and can share the first multimedia file to other people for watching in time.
The multiple video streams in the second multimedia file are original video streams without image fusion processing, i.e. video streams without image fusion effect. The image fusion parameter in the second multimedia file is used for indicating an image fusion mode which needs to be adopted by the multi-path video stream in the second multimedia file in the subsequent fusion. Therefore, after video shooting is finished, the terminal can play each path of video stream in the multiple paths of video streams according to the stored second multimedia file, and can generate a fusion video stream with an image fusion effect according to the multiple paths of video streams and the image fusion parameters in the stored second multimedia file. After the video shooting is finished, the terminal does not need to record the video in real time, so that higher video processing capability can be provided, the image fusion effect of the fusion video stream generated by the terminal according to the second multimedia file is better than that of the first video stream generated in the first multimedia file in the video shooting process, and the user can finally obtain the video stream with better image fusion effect for playing.
As an example, when the terminal stores the first multimedia file in association with the second multimedia file, the first multimedia file and the second multimedia file may be bound and associated to form a video container, and this video container may be referred to as a dual video container in the embodiment of the present application. That is, the terminal may store the first multimedia file and the second multimedia file in a dual video container to enable associated storage of the first multimedia file and the second multimedia file.
For example, if the second multimedia file is obtained in the first manner in step 805, as shown in fig. 9, the dual video container may store a first multimedia file and a second multimedia file. The first multimedia file contains the first video stream, and the first video stream has the image fusion effect. The second multimedia file contains a plurality of package files, such as package file A and package file B shown in fig. 9, and each of the package files contains one video stream and the image fusion parameters, where the video stream is an original video stream without the image fusion effect. As shown in fig. 9, package file A contains video stream A and package file B contains video stream B, and neither video stream A nor video stream B has the image fusion effect.
For another example, if the second multimedia file is obtained in the second manner in step 805, as shown in fig. 10, the dual video container may store a first multimedia file and a second multimedia file. The first multimedia file contains the first video stream, and the first video stream has the image fusion effect. The second multimedia file contains the multiple video streams (such as video stream A and video stream B shown in fig. 10) and the image fusion parameters, and the multiple video streams are all original video streams without the image fusion effect.
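As an illustrative, non-limiting sketch in Python (not the disclosed implementation), the associated storage could be modelled as one container object binding the two multimedia files together; the class and attribute names are hypothetical.

from dataclasses import dataclass
from typing import Any

@dataclass
class DualVideoContainer:
    first_multimedia_file: Any   # fused first video stream, ready to share
    second_multimedia_file: Any  # original video streams plus the image fusion parameters

def store_in_association(first_file, second_file) -> DualVideoContainer:
    return DualVideoContainer(first_file, second_file)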
It should be noted that the implementation specifications of the dual video container are different according to the video scene. Illustratively, in a multi-shot and single-shot co-recorded scene, the implementation specification of the dual video container may be as shown in table 1 below.
TABLE 1
The embodiment of the present application is described by taking table 1 as an example only, and table 1 does not limit the embodiment of the present application.
The terminal can continuously generate a first multimedia file and a second multimedia file in the video shooting process, and store the first multimedia file and the second multimedia file in an associated mode. Further, after the video capturing is finished, the terminal may also display the first video stream in the stored first multimedia file in a video list (may also be referred to as a gallery), so that the user may select to play the first video stream in the first multimedia file.
As an example, the terminal may display an association button in the video list. The association button is used for indicating the second multimedia file associated with the first multimedia file. Therefore, if the terminal detects a selection operation on the association button, the terminal can display the multiple video streams in the second multimedia file, so that the user can know from which original video streams the first video stream in the first multimedia file is fused, and can conveniently select and play any one of the multiple video streams in the second multimedia file.
For example, as shown in fig. 11, the terminal may present a first video stream 1102 of a first multimedia file in a video list 1101 and display an associated button 1103. In this case, the user may choose to play the first video stream 1102. Thereafter, as shown in fig. 11 (a), if the user clicks the association button 1103, the terminal displays the multi-path video stream 1104 in the second multimedia file in response to the click operation (i.e., selection operation) of the association button 1103 as shown in fig. 11 (b). In this case, the user may choose to play any one of the multiple video streams 1104.
As another example, the terminal may display a video thumbnail corresponding to each of the multiple video streams in the second multimedia file in the video list. In this way, if the terminal detects the selection operation on any one of the displayed video thumbnails, the terminal can display one path of video stream corresponding to the video thumbnail in the second multimedia file, so that the user can select to play the one path of video stream.
Of course, in addition to the two exemplary manners described above, the terminal may also display the multiple video streams in the second multimedia file in other manners, which is not limited by the embodiment of the present application.
Further, after the video shooting is finished, the terminal may further obtain the multiple paths of video streams from the second multimedia file, and then play at least one path of video streams in the multiple paths of video streams. For example, the terminal may present the multiple video streams in a video list, and then the user may select to play at least one of the multiple video streams.
And if the terminal receives a fusion adjustment instruction aiming at the video image of the at least one path of video stream in the playing process of the at least one path of video stream, updating the image fusion parameters in the second multimedia file according to fusion adjustment information carried by the fusion adjustment instruction.
The fusion adjustment instruction is used for indicating an image fusion mode required to be adopted for adjusting the multi-path video stream. The user can manually trigger a fusion adjustment instruction according to the own requirement in the playing process of the at least one path of video stream, wherein the fusion adjustment instruction is used for indicating to change the image fusion mode, for example, the image splicing mode can be indicated to be changed, and/or the image splicing position of each path of video stream is changed. That is, the fusion adjustment information carried in the fusion adjustment instruction may include an image stitching mode to be adjusted, and/or may include an image stitching position to be adjusted by each path of video stream. Therefore, the terminal can modify the image fusion parameters in the second multimedia file according to the fusion adjustment information, so that the update of the image fusion parameters is realized, and the image fusion processing performed subsequently according to the image fusion parameters in the second multimedia file meets the latest requirements of users.
For example, the multiple video streams include a video stream A and a video stream B, where in the video capturing process, the image splicing mode of the video stream A and the video stream B in the first 10 seconds is the up-down splicing mode, and the image splicing mode after 10 seconds is the left-right splicing mode. In this case, among the image fusion parameters in the second multimedia file, the image splicing modes in the image fusion parameters whose time stamps are within the first 10 seconds are all the up-down splicing mode, and the image splicing modes in the image fusion parameters whose time stamps are after 10 seconds are all the left-right splicing mode.
After the video shooting is finished, the terminal plays the video stream A or the video stream B according to the second multimedia file, or plays the video stream A and the video stream B simultaneously. At this time, if the user wants to adjust the image splicing mode of the first 3 seconds to the left-right splicing mode, the user may trigger, in the playing process of the video stream A and/or the video stream B, a fusion adjustment instruction for the video images of the first 3 seconds of the video stream A and/or the video stream B, where the fusion adjustment instruction is used to instruct that the image splicing mode of the video images of the first 3 seconds be adjusted to the left-right splicing mode. In this case, the terminal updates the image fusion parameters in the second multimedia file according to the fusion adjustment instruction: in the updated image fusion parameters, the image splicing modes in the image fusion parameters whose time stamps are within the first 3 seconds are all the left-right splicing mode, the image splicing modes in the image fusion parameters whose time stamps are within 3 seconds to 10 seconds are all the up-down splicing mode, and the image splicing modes in the image fusion parameters whose time stamps are after 10 seconds are all the left-right splicing mode.
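As an illustrative, non-limiting sketch in Python (not the disclosed implementation), a fusion adjustment instruction could be applied by rewriting every image fusion parameter whose time stamp falls inside the adjusted range, as in the 3-second example above; the dictionary keys and the frame interval are hypothetical.

def apply_fusion_adjustment(param_stream, start_us, end_us, new_mode, new_positions=None):
    # param_stream: list of dicts with "timestamp_us", "mode" and "positions" keys.
    for p in param_stream:
        if start_us <= p["timestamp_us"] < end_us:
            p["mode"] = new_mode
            if new_positions is not None:
                p["positions"] = dict(new_positions)
    return param_stream

# Hypothetical example matching the text: about 10 seconds of 30 fps parameters,
# with the first 3 seconds switched to the left-right stitching mode.
params = [{"timestamp_us": i * 33_333, "mode": "up_down",
           "positions": {"A": "up", "B": "down"}} for i in range(300)]
apply_fusion_adjustment(params, 0, 3_000_000, "left_right", {"A": "left", "B": "right"})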
Further, after the video photographing is finished, the terminal may generate a third multimedia file according to the second multimedia file. Specifically, the terminal may obtain the multiple paths of video streams and the image fusion parameters from the second multimedia file, then perform image fusion processing on the multiple paths of video streams according to the image fusion parameters to obtain a second video stream, and generate a third multimedia file according to the second video stream. Further, the terminal may update the first multimedia file stored in association with the second multimedia file to a third multimedia file.
In this case, the image fusion parameters corresponding to the first video stream are the same as the image fusion parameters corresponding to the second video stream, that is, the terminal performs image fusion processing on the multiple video streams in the same image fusion manner to obtain the first video stream and the second video stream. However, during video capture the terminal is limited by the camera device, the processing chip, the image algorithm and the like, and it is difficult for the terminal to provide strong video processing capability while guaranteeing real-time video recording, so the image fusion effect of the first video stream generated by the terminal in the video capturing process is likely to be relatively poor. After the video shooting is finished, the terminal no longer needs to record the video in real time and can therefore provide higher video processing capability, so the image fusion effect of the second video stream generated by the terminal at this time is better than that of the first video stream generated in the video capturing process.
In this case, the terminal updates the first multimedia file stored in association with the second multimedia file into the third multimedia file, so that the video stream in the multimedia file stored in association with the second multimedia file is a video stream with better image fusion effect, and thus, the user can finally obtain the video stream with better image fusion effect for playing.
It should be noted that, if the terminal updates the image fusion parameters in the second multimedia file according to the fusion adjustment instruction triggered by the user, the terminal may generate a third multimedia file according to the second multimedia file. And then updating the multimedia file (possibly the first multimedia file and possibly the old third multimedia file) stored in association with the second multimedia file into the newly generated third multimedia file, so that the video stream in the multimedia file stored in association with the second multimedia file is a video stream with good image fusion effect which meets the latest image fusion requirement of the user.
When the terminal obtains the multiple paths of video streams and the image fusion parameters from the second multimedia file, the terminal may firstly decapsulate (demux) the second multimedia file to obtain multiple video files and the image fusion parameters, and then decode each video file in the multiple video files to obtain the multiple paths of video streams.
The manner in which the terminal generates the third multimedia file according to the second video stream is similar to the manner in which the first multimedia file is generated according to the first video stream, which is not described in detail in the embodiment of the present application.
For example, as shown in fig. 12, the terminal decapsulates the second multimedia file to obtain a video file a, a video file B and the image fusion parameter, decodes the video file a to obtain a video stream a, decodes the video file B to obtain a video stream B, and then performs image fusion processing on the video stream a and the video stream B according to the image fusion parameter to obtain a second video stream. And then, the terminal encodes the second video stream to obtain a video file C, and encapsulates the video file C to obtain a third multimedia file.
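As an illustrative, non-limiting sketch in Python (not the disclosed implementation), the post-processing flow of fig. 12 could be organised as below; the decoder, fusion, encoder and encapsulation helpers are passed in as hypothetical placeholders because their concrete implementations are not specified here.

def generate_third_multimedia_file(second_multimedia_file, decode, fuse_frames,
                                   encode, encapsulate):
    # De-encapsulate: hypothetical keys "videos" (encoded video files) and
    # "params" (per-frame image fusion parameters) of the second multimedia file.
    video_files = second_multimedia_file["videos"]
    fusion_params = second_multimedia_file["params"]
    streams = {name: decode(f) for name, f in video_files.items()}  # lists of frames
    fused = []
    for i, params in enumerate(fusion_params):  # frame-by-frame, time stamps aligned
        frames_i = {name: frames[i] for name, frames in streams.items()}
        fused.append(fuse_frames(frames_i, params))
    video_file_c = encode(fused)           # encode the second video stream
    return encapsulate(video_file_c)       # third multimedia file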
In the embodiment of the application, multiple paths of video streams are acquired in the video shooting process. And then, carrying out image fusion processing on the multi-path video stream to obtain a first video stream, and obtaining image fusion parameters corresponding to the first video stream, wherein the image fusion parameters are used for indicating the image fusion mode of the multi-path video stream when the first video stream is obtained. And then, generating a first multimedia file containing the first video stream, and generating a second multimedia file containing the multi-path video stream and the image fusion parameters, wherein the image fusion parameters in the second multimedia file are used for indicating an image fusion mode which is needed to be adopted by the multi-path video stream in the second multimedia file in the subsequent fusion. The first multimedia file is stored in association with the second multimedia file. Therefore, after the video shooting is finished, the user can watch the first video stream with the image fusion effect in the stored first multimedia file, and can share the first multimedia file to other people for watching in time. And the terminal can also generate a fusion video stream with an image fusion effect according to the multipath video streams and the image fusion parameters in the stored second multimedia file. After the video shooting is finished, the terminal does not need to record the video in real time, so that higher video processing capability can be provided, the image fusion effect of the fusion video stream generated by the terminal according to the second multimedia file is better than that of the first video stream generated in the first multimedia file in the video shooting process, and the user can finally obtain the video stream with better image fusion effect for playing.
For ease of understanding, the above-described video photographing method is exemplified below with reference to fig. 13 to 16.
The following describes a video capturing method in a multi-shot co-recorded scene with reference to fig. 13 and 14.
Fig. 13 is a schematic diagram of a video shooting method according to an embodiment of the present application. The method is applied to a multi-shot and co-recorded scene, and in this case, the terminal records the video through the camera A and the camera B at the same time. The method may comprise the following steps (1) to (4):
(1) The camera A collects video stream A, and the video stream A is transmitted to the image fusion module and the associated storage module after being processed by the ISP front end module 0 and the ISP rear end module 0.
For example, the video image of the video stream a collected by the camera a may be in a RAW format, the ISP front-end module 0 may convert the video image of the video stream a in the RAW format into a video image of a YUV format, and the ISP back-end module 0 may perform basic processing on the video image of the video stream a in the YUV format, such as adjusting contrast, removing noise, and the like.
(2) The camera B collects a video stream B, and the video stream B is transmitted to the image fusion module and the associated storage module after being processed by the ISP front-end module 1 and the ISP rear-end module 1.
For example, the video image of the video stream B collected by the camera B may be in a RAW format, the ISP front-end module 1 may convert the video image of the video stream B in the RAW format into a video image of the YUV format, and the ISP back-end module 1 may perform basic processing on the video image of the video stream B in the YUV format, such as adjusting contrast, removing noise, and the like.
(3) The image fusion module performs image fusion processing on the video stream A and the video stream B to obtain a first video stream, and sends image fusion parameters corresponding to the first video stream to the associated storage module, wherein the first video stream has an image fusion effect.
Alternatively, the video image of the first video stream may be displayed as a preview video image on a video interface, so as to implement video preview (preview). Optionally, a first multimedia file containing the first video stream may also be generated and stored.
(4) The associated storage module generates a second multimedia file containing the video stream A, the video stream B and the image fusion parameters, and stores the second multimedia file in association with the first multimedia file.
It should be noted that, in the embodiment of the present application, the preview video image may be a video image of the first video stream, that is, the image fusion manner corresponding to the preview video image and the video image of the first video stream in the stored first multimedia file is the same. However, the embodiment of the present application is merely described by taking this as an example, and in actual use, the image fusion manner corresponding to the preview video image and the video image of the first video stream in the stored first multimedia file may also be different. In this case, as shown in fig. 14, the method may include the following steps a to e:
Step a: the camera A collects video stream A, and the video stream A is transmitted to the preview module, the slicing module and the associated storage module after being processed by the ISP front end module 0 and the ISP rear end module 0.
Step b: the camera B collects video stream B, and the video stream B is transmitted to the preview module, the slicing module and the associated storage module after being processed by the ISP front end module 1 and the ISP rear end module 1.
Step c: and the preview module performs image fusion processing on the video stream A and the video stream B to obtain a preview video stream, and the video image of the preview video stream is used as a preview video image to be displayed on a video interface, wherein the preview video stream has an image fusion effect.
Step d: and the slicing module performs image fusion processing on the video stream A and the video stream B to obtain a first video stream, and sends image fusion parameters corresponding to the first video stream to the associated storage module to generate and store a first multimedia file containing the first video stream.
In this case, the image fusion manners used by the preview module and the slicing module may be different. In addition, compared with the slicing module, the preview module performs a simpler operation when performing image fusion processing on the video stream A and the video stream B; for example, the slicing module needs to perform image anti-shake processing when performing image fusion processing on the video stream A and the video stream B, whereas the preview module does not need to perform image anti-shake processing.
Step e: the associated storage module generates a second multimedia file containing the video stream A, the video stream B and the image fusion parameters, and stores the second multimedia file in association with the first multimedia file.
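As an illustrative, non-limiting sketch in Python (not the disclosed implementation), the difference between the two fusion paths could look as follows, where the fusion and stabilisation helpers are hypothetical placeholders.

def preview_path(frame_a, frame_b, fuse):
    # Lightweight fusion for the on-screen preview video image.
    return fuse(frame_a, frame_b)

def slicing_path(frame_a, frame_b, fuse, stabilise):
    # The slicing module additionally performs image anti-shake processing
    # before fusing, and its result goes into the first multimedia file.
    return fuse(stabilise(frame_a), stabilise(frame_b))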
The following describes a video capturing method in a single-shot co-recorded scene with reference to fig. 15 and 16.
Fig. 15 is a schematic diagram of a video shooting method according to an embodiment of the present application. The method is applied to a single-shot and co-recorded scene, and in this case, the terminal records the video through the camera A. The method may comprise the following steps (1) to (5):
(1) The camera A collects video stream A, and the video stream A is transmitted to the ISP rear end module 0 and the ISP rear end module 1 after being processed by the ISP front end module 0.
For example, the video image of the video stream a collected by the camera a may be in a RAW format, and the ISP front-end module 0 may convert the video image of the video stream a in the RAW format into a video image of a YUV format.
(2) And after the ISP back-end module 0 performs basic processing on the video stream A, the video stream A is transmitted to the image fusion module and the associated storage module.
For example, ISP backend module 0 may perform basic processing on the video image in YUV format of video stream a, such as adjusting contrast, removing noise, and the like.
(3) After the ISP back-end module 1 performs image processing on the video stream A, a video stream A' is obtained, and the video stream A' is transmitted to the image fusion module and the associated storage module.
For example, the ISP back-end module 1 may perform image processing on the video image in YUV format of the video stream A, for example, may perform magnification processing, cropping processing, and the like on the video image in YUV format of the video stream A based on specific logic. For example, this specific logic may be human body tracking or other salient-subject tracking logic.
(4) The image fusion module performs image fusion processing on the video stream A and the video stream A' to obtain a first video stream, and sends image fusion parameters corresponding to the first video stream to the associated storage module, wherein the first video stream has an image fusion effect.
Optionally, the video image of the first video stream may be displayed as a preview video image on a video interface, so as to implement video preview. Optionally, a first multimedia file containing the first video stream may also be generated and stored.
(5) The associated storage module generates a second multimedia file containing the video stream A, the video stream A' and the image fusion parameters, and stores the second multimedia file in association with the first multimedia file.
It should be noted that, in the embodiment of the present application, the preview video image may be a video image of the first video stream, that is, the image fusion manner corresponding to the preview video image and the video image of the first video stream in the stored first multimedia file is the same. However, the embodiment of the present application is merely described by taking this as an example, and in actual use, the image fusion manner corresponding to the preview video image and the video image of the first video stream in the stored first multimedia file may also be different. In this case, as shown in fig. 16, the method may include the following steps a to f:
step a: the camera A collects video stream A, and the video stream A is transmitted to the ISP rear end module 0 and the ISP rear end module 1 after being processed by the ISP front end module 0.
Step b: and after the ISP back-end module 0 performs basic processing on the video stream A, the video stream A is transmitted to the preview module, the slicing module and the associated storage module.
Step c: after image processing is performed on the video stream A by the ISP back-end module 1, a video stream A' is obtained, and the video stream A' is transmitted to the preview module, the slicing module and the associated storage module.
Step d: and the preview module performs image fusion processing on the video stream A and the video stream A' to obtain a preview video stream, and the video image of the preview video stream is used as a preview video image to be displayed on a video interface, wherein the preview video stream has an image fusion effect.
Step e: and the slicing module performs image fusion processing on the video stream A and the video stream A' to obtain a first video stream, and sends image fusion parameters corresponding to the first video stream to the associated storage module to generate and store a first multimedia file containing the first video stream.
In this case, the image fusion manners used by the preview module and the slicing module may be different. In addition, compared with the slicing module, the preview module performs a simpler operation when performing image fusion processing on the video stream A and the video stream A'; for example, the slicing module needs to perform image anti-shake processing when performing image fusion processing on the video stream A and the video stream A', whereas the preview module does not need to perform image anti-shake processing.
Step f: the associated storage module generates a second multimedia file containing the video stream A, the video stream A' and the image fusion parameters, and stores the second multimedia file in association with the first multimedia file.
Fig. 17 is a schematic structural diagram of a video capturing apparatus according to an embodiment of the present application, where the apparatus may be implemented by software, hardware, or a combination of both as part or all of a computer device, and the computer device may be the terminal 100 described in the embodiment of fig. 1-2. Referring to fig. 17, the apparatus includes: a first acquisition module 1701, a processing module 1702, a second acquisition module 1703, a first generation module 1704, a second generation module 1705, and a storage module 1706.
A first obtaining module 1701, configured to obtain multiple video streams during a video capturing process;
the processing module 1702 is configured to perform image fusion processing on multiple paths of video streams to obtain a first video stream;
the second obtaining module 1703 is configured to obtain an image fusion parameter corresponding to the first video stream, where the image fusion parameter is used to indicate an image fusion manner of the multiple paths of video streams when the first video stream is obtained;
a first generation module 1704 for generating a first multimedia file comprising a first video stream;
a second generating module 1705, configured to generate a second multimedia file comprising multiple video streams and image fusion parameters;
the storage module 1706 is configured to store the first multimedia file in association with the second multimedia file.
Optionally, the first obtaining module 1701 is configured to:
acquiring one path of video stream acquired by each camera in the plurality of cameras to obtain multiple paths of video streams;
wherein, a plurality of cameras are arranged at the terminal; or, a part of cameras in the plurality of cameras are arranged at the terminal, and the other part of cameras are arranged at the cooperative equipment in a multi-screen cooperative state with the terminal.
Optionally, the first obtaining module 1701 is configured to:
acquiring a path of video stream acquired by a camera;
And carrying out image processing on one path of video stream to obtain the other path of video stream.
Optionally, the image fusion parameters include an image stitching mode, and the image stitching mode includes one or more of a top-bottom stitching mode, a left-right stitching mode, and a picture-in-picture nesting mode.
Optionally, the apparatus further comprises:
and the display module is used for displaying video images of the first video stream on the video recording interface.
Optionally, the second generating module 1705 is configured to:
encoding each path of video stream in the multiple paths of video streams respectively to obtain multiple video files;
for any one of the plurality of video files, taking one video file as a video track, taking the image fusion parameter as a parameter track, and packaging the video track and the parameter track to obtain a corresponding multi-track file;
a plurality of multi-track files corresponding to the plurality of video files one-to-one are determined as the second multimedia file.
Optionally, the second generating module 1705 is configured to:
encoding each path of video stream in the multiple paths of video streams respectively to obtain multiple video files;
taking each video file in the plurality of video files as a video track to obtain a plurality of video tracks;
Taking the image fusion parameters as parameter tracks;
and packaging the plurality of video tracks and the parameter tracks to obtain a second multimedia file.
Optionally, the apparatus further comprises:
the third acquisition module is used for acquiring multiple paths of video streams and image fusion parameters from the second multimedia file after video shooting is finished;
the processing module 1702 is further configured to perform image fusion processing on the multiple paths of video streams according to the image fusion parameters to obtain a second video stream;
the first generation module 1704 is configured to generate a third multimedia file according to the second video stream.
Optionally, the apparatus further comprises:
and the first updating module is used for updating the first multimedia file stored in association with the second multimedia file into a third multimedia file.
Optionally, the apparatus further comprises:
the fourth acquisition module is used for acquiring multiple paths of video streams from the second multimedia file after video shooting is finished;
the playing module is used for playing at least one path of video stream in the multiple paths of video streams;
and the second updating module is used for updating the image fusion parameters in the second multimedia file according to the fusion adjustment information carried by the fusion adjustment instruction if the fusion adjustment instruction of the video images of at least one path of video stream is received in the playing process of the at least one path of video stream.
Optionally, the apparatus further comprises:
the first display module is used for displaying a first video stream and an associated button in a first multimedia file in a video list after video shooting is finished;
and the second display module is used for displaying the multipath video streams in the second multimedia file if the selection operation of the associated button is detected.
In the embodiment of the application, multiple paths of video streams are acquired in the video shooting process. And then, carrying out image fusion processing on the multi-path video stream to obtain a first video stream, and obtaining image fusion parameters corresponding to the first video stream, wherein the image fusion parameters are used for indicating the image fusion mode of the multi-path video stream when the first video stream is obtained. And then, generating a first multimedia file containing the first video stream, and generating a second multimedia file containing the multi-path video stream and the image fusion parameters, wherein the image fusion parameters in the second multimedia file are used for indicating an image fusion mode which is needed to be adopted by the multi-path video stream in the second multimedia file in the subsequent fusion. The first multimedia file is stored in association with the second multimedia file. Therefore, after the video shooting is finished, the user can watch the first video stream with the image fusion effect in the stored first multimedia file, and can share the first multimedia file to other people for watching in time. And the device can also generate a fusion video stream with image fusion effect according to the multipath video streams and the image fusion parameters in the stored second multimedia file. After the video shooting is finished, the device does not need to record the video in real time, so that higher video processing capability can be provided, the image fusion effect of the fusion video stream generated according to the second multimedia file is better than that of the first video stream generated in the first multimedia file in the video shooting process, and a user can finally obtain the video stream with better image fusion effect for playing.
It should be noted that: in the video shooting device provided in the above embodiment, only the division of the above functional modules is used for illustration, and in practical application, the above functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above.
The functional units and modules in the above embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit, and the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for ease of distinguishing them from each other and are not intended to limit the protection scope of the embodiments of the present application.
The video shooting device and the video shooting method provided in the foregoing embodiments belong to the same concept. For the specific working processes of the units and modules in the foregoing embodiments and the technical effects they bring, reference may be made to the method embodiments, and details are not described herein again.
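As a similarly hedged sketch of what the updating and playing modules described above might do after shooting ends — re-fusing the raw streams offline, replacing the first multimedia file with the resulting third file, and updating the stored fusion parameters when the user adjusts the fusion during playback — the following assumes the same toy data layout as the previous sketch; the function names are again invented for illustration only.

import json

def refuse_after_shooting(second_file: dict, fuse_offline) -> dict:
    # Re-run the image fusion offline, where more processing capability is available.
    params = json.loads(second_file["fusion_params"])
    streams = second_file["videos"]                      # {camera_id: [frames]}
    frame_count = min(len(frames) for frames in streams.values())
    second_stream = [fuse_offline({cam: frames[i] for cam, frames in streams.items()}, params)
                     for i in range(frame_count)]
    return {"video": second_stream}                      # the third multimedia file

def on_fusion_adjustment(second_file: dict, adjustment: dict) -> None:
    # Update the stored image fusion parameters when the user adjusts the fusion during playback.
    params = json.loads(second_file["fusion_params"])
    params.update(adjustment)                            # e.g. {"stitching_mode": "left_right"}
    second_file["fusion_params"] = json.dumps(params)

def replace_first_file(library: dict, clip_id: str, third_file: dict) -> None:
    # Swap the first multimedia file recorded in real time for the better offline-fused result.
    library[clip_id]["first"] = third_file

Because the re-fusion runs offline rather than in real time during recording, it can use a heavier fusion routine (passed in here as fuse_offline) than the one applied while shooting.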
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used for implementation, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (Digital Subscriber Line, DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or a data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital versatile disc (Digital Versatile Disc, DVD)), or a semiconductor medium (for example, a solid state disk (Solid State Disk, SSD)).
The above embodiments are not intended to limit the present application. Any modification, equivalent replacement, improvement, or the like made within the technical scope of the present application shall fall within the protection scope of the present application.

Claims (14)

1. A video shooting method, applied to a terminal, the method comprising:
in the video shooting process, acquiring multiple paths of video streams;
performing image fusion processing on the multiple paths of video streams to obtain a first video stream;
acquiring image fusion parameters corresponding to the first video stream, wherein the image fusion parameters are used for indicating an image fusion mode of the multiple paths of video streams when the first video stream is obtained;
generating a first multimedia file containing the first video stream;
generating a second multimedia file containing the multiple paths of video streams and the image fusion parameters;
and storing the first multimedia file in association with the second multimedia file.
2. The method of claim 1, wherein the obtaining multiple video streams comprises:
acquiring one path of video stream collected by each camera in a plurality of cameras to obtain the multiple paths of video streams;
wherein the plurality of cameras are all arranged on the terminal; or, some of the plurality of cameras are arranged on the terminal, and the other cameras are arranged on a cooperative device in a multi-screen cooperative state with the terminal.
3. The method of claim 1, wherein the obtaining multiple video streams comprises:
acquiring one path of video stream collected by a camera;
and performing image processing on the one path of video stream to obtain another path of video stream.
4. The method of any one of claims 1 to 3, wherein the image fusion parameters comprise an image stitching mode, and the image stitching mode comprises one or more of a top-bottom stitching mode, a side-by-side stitching mode, and a picture-in-picture nesting mode.
5. The method of any one of claims 1 to 4, wherein after performing image fusion processing on the multiple paths of video streams to obtain the first video stream, the method further comprises:
and displaying the video image of the first video stream on a video recording interface.
6. The method of any one of claims 1 to 5, wherein generating a second multimedia file containing the multiple paths of video streams and the image fusion parameters comprises:
encoding each path of video stream in the multiple paths of video streams separately to obtain a plurality of video files;
for any one video file in the plurality of video files, taking the one video file as a video track, taking the image fusion parameters as a parameter track, and packaging the one video track and the parameter track to obtain a corresponding multi-track file;
and determining a plurality of multi-track files corresponding one-to-one to the plurality of video files as the second multimedia file.
7. The method of any one of claims 1 to 5, wherein generating a second multimedia file containing the multiple paths of video streams and the image fusion parameters comprises:
encoding each path of video stream in the multiple paths of video streams separately to obtain a plurality of video files;
taking each video file in the plurality of video files as a video track to obtain a plurality of video tracks;
taking the image fusion parameters as parameter tracks;
and packaging the plurality of video tracks and the parameter tracks to obtain the second multimedia file.
8. The method of any one of claims 1 to 7, further comprising:
after video shooting is finished, acquiring the multiple paths of video streams and the image fusion parameters from the second multimedia file;
performing image fusion processing on the multi-path video stream according to the image fusion parameters to obtain a second video stream;
and generating a third multimedia file according to the second video stream.
9. The method of claim 8, wherein after generating the third multimedia file according to the second video stream, the method further comprises:
and updating the first multimedia file stored in association with the second multimedia file into the third multimedia file.
10. The method of any one of claims 1 to 9, wherein the method further comprises:
after video shooting is finished, acquiring the multipath video streams from the second multimedia file;
playing at least one video stream in the multiple video streams;
if a fusion adjustment instruction for the video images of the at least one path of video stream is received in the playing process of the at least one path of video stream, updating the image fusion parameters in the second multimedia file according to fusion adjustment information carried by the fusion adjustment instruction.
11. The method of any one of claims 1 to 10, further comprising:
after video shooting is finished, displaying, in a video list, the first video stream in the first multimedia file and an association button;
and if a selection operation on the association button is detected, displaying the multiple paths of video streams in the second multimedia file.
12. A video shooting device, the device comprising:
a first acquisition module, used for acquiring multiple paths of video streams in the video shooting process;
a processing module, used for performing image fusion processing on the multiple paths of video streams to obtain a first video stream;
a second acquisition module, used for acquiring image fusion parameters corresponding to the first video stream, wherein the image fusion parameters are used for indicating an image fusion mode of the multiple paths of video streams when the first video stream is obtained;
a first generation module, used for generating a first multimedia file containing the first video stream;
a second generation module, used for generating a second multimedia file containing the multiple paths of video streams and the image fusion parameters;
and a storage module, used for storing the first multimedia file in association with the second multimedia file.
13. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, which computer program, when executed by the processor, implements the method according to any of claims 1-11.
14. A computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the method of any of claims 1-11.
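For readers less familiar with multi-track containers, the two packaging options recited in claims 6 and 7 can be illustrated with a small, purely hypothetical Python sketch; it is not part of the claims, it assumes a toy dictionary-based "container" rather than any real muxer or file format, and all function names are invented for illustration only.

import json

def package_per_stream(encoded_video_files: list, fusion_params: dict) -> list:
    # Claim 6 style: one multi-track file per encoded video file,
    # each carrying its own video track plus a copy of the parameter track.
    param_track = {"type": "parameters", "data": json.dumps(fusion_params)}
    return [{"tracks": [{"type": "video", "data": vf}, param_track]}
            for vf in encoded_video_files]

def package_single_file(encoded_video_files: list, fusion_params: dict) -> dict:
    # Claim 7 style: one second multimedia file holding all video tracks
    # plus a single shared parameter track.
    tracks = [{"type": "video", "data": vf} for vf in encoded_video_files]
    tracks.append({"type": "parameters", "data": json.dumps(fusion_params)})
    return {"tracks": tracks}

encoded = ["rear_stream.h264", "front_stream.h264"]
params = {"stitching_mode": "top_bottom"}
per_stream_files = package_per_stream(encoded, params)      # two multi-track files
single_file = package_single_file(encoded, params)          # one file with three tracks

In the claim 6 variant each encoded stream yields its own multi-track file carrying a copy of the parameter track, whereas in the claim 7 variant a single file carries all the video tracks and one shared parameter track.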
CN202210601210.0A 2022-05-30 2022-05-30 Video shooting method, device, equipment and storage medium Pending CN117201955A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210601210.0A CN117201955A (en) 2022-05-30 2022-05-30 Video shooting method, device, equipment and storage medium
PCT/CN2023/087623 WO2023231585A1 (en) 2022-05-30 2023-04-11 Video capturing method and apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210601210.0A CN117201955A (en) 2022-05-30 2022-05-30 Video shooting method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117201955A true CN117201955A (en) 2023-12-08

Family

ID=88992914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210601210.0A Pending CN117201955A (en) 2022-05-30 2022-05-30 Video shooting method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN117201955A (en)
WO (1) WO2023231585A1 (en)

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008022568A (en) * 2005-08-25 2008-01-31 Sony Corp Data generation method, data structure, unit and method for recording, and program
CN102801979A (en) * 2012-08-09 2012-11-28 武汉微创光电股份有限公司 Multi-channel video hybrid coding method and device
CN105338290A (en) * 2014-06-10 2016-02-17 杭州海康威视数字技术股份有限公司 Code stream synthetic method and apparatus
CN105472371A (en) * 2016-01-13 2016-04-06 腾讯科技(深圳)有限公司 Video code stream processing method and device
CN106454130A (en) * 2016-11-29 2017-02-22 广东欧珀移动通信有限公司 Control method, control device and electric device
CN107454269A (en) * 2017-09-08 2017-12-08 维沃移动通信有限公司 A kind of image pickup method and mobile terminal
CN109120867A (en) * 2018-09-27 2019-01-01 乐蜜有限公司 Image synthesizing method and device
CN109587401A (en) * 2019-01-02 2019-04-05 广州市奥威亚电子科技有限公司 The more scene capture realization method and systems of electronic platform
CN111343415A (en) * 2018-12-18 2020-06-26 杭州海康威视数字技术股份有限公司 Data transmission method and device
US20210020200A1 (en) * 2019-07-15 2021-01-21 Beijing Xiaomi Mobile Software Co., Ltd. Method and apparatus for obtaining audio-visual information, device, and storage medium
CN112752036A (en) * 2020-12-28 2021-05-04 北京爱奇艺科技有限公司 Video processing method and device
CN113596319A (en) * 2021-06-16 2021-11-02 荣耀终端有限公司 Picture-in-picture based image processing method, apparatus, storage medium, and program product
WO2021248988A1 (en) * 2020-06-12 2021-12-16 华为技术有限公司 Cross-terminal screen recording method, terminal device, and storage medium
CN113923391A (en) * 2021-09-08 2022-01-11 荣耀终端有限公司 Method, apparatus, storage medium, and program product for video processing
CN113923351A (en) * 2021-09-09 2022-01-11 荣耀终端有限公司 Method, apparatus, storage medium, and program product for exiting multi-channel video shooting
CN114257760A (en) * 2021-12-10 2022-03-29 广东科凯达智能机器人有限公司 Video splicing processing method, intelligent robot and system
CN114466145A (en) * 2022-01-30 2022-05-10 北京字跳网络技术有限公司 Video processing method, device, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101611531B1 (en) * 2009-11-13 2016-04-12 삼성전자주식회사 Camera apparatus and method for providing recorded image
CN108881957A (en) * 2017-11-02 2018-11-23 北京视联动力国际信息技术有限公司 A kind of mixed method and device of multimedia file
CN112954219A (en) * 2019-03-18 2021-06-11 荣耀终端有限公司 Multi-channel video recording method and equipment
CN114466246A (en) * 2022-02-14 2022-05-10 维沃移动通信有限公司 Video processing method and device

Also Published As

Publication number Publication date
WO2023231585A1 (en) 2023-12-07

Similar Documents

Publication Publication Date Title
CN110109636B (en) Screen projection method, electronic device and system
WO2020253719A1 (en) Screen recording method and electronic device
CN112286477B (en) Screen projection display method and related product
CN110231905B (en) Screen capturing method and electronic equipment
CN113542839B (en) Screen projection method of electronic equipment and electronic equipment
US11949978B2 (en) Image content removal method and related apparatus
CN116112786A (en) Video shooting method and electronic equipment
CN113556479B (en) Method for sharing camera by multiple applications and electronic equipment
CN113254409B (en) File sharing method, system and related equipment
CN114356258A (en) Electronic device, screen projection method thereof and medium
CN109819306B (en) Media file clipping method, electronic device and server
CN115484380B (en) Shooting method, graphical user interface and electronic equipment
CN113835649A (en) Screen projection method and terminal
CN115514882A (en) Distributed shooting method, electronic device and medium
CN114697742A (en) Video recording method and electronic equipment
CN115002336A (en) Video information generation method, electronic device and medium
CN112822544A (en) Video material file generation method, video synthesis method, device and medium
CN115037872B (en) Video processing method and related device
CN114466101B (en) Display method and electronic equipment
CN117201955A (en) Video shooting method, device, equipment and storage medium
CN115550559A (en) Video picture display method, device, equipment and storage medium
CN115344176A (en) Display method and electronic equipment
CN114244955A (en) Service sharing method and system and electronic equipment
WO2023142731A1 (en) Method for sharing multimedia file, sending end device, and receiving end device
CN116668762B (en) Screen recording method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination