WO2021179783A1 - Free viewpoint-based video live broadcast processing method, device, system, chip and medium - Google Patents

Free viewpoint-based video live broadcast processing method, device, system, chip and medium Download PDF

Info

Publication number
WO2021179783A1
WO2021179783A1 · PCT/CN2021/070575 (CN2021070575W)
Authority
WO
WIPO (PCT)
Prior art keywords
video
server
camera
image
synchronized
Prior art date
Application number
PCT/CN2021/070575
Other languages
French (fr)
Chinese (zh)
Inventor
胡强
孙正忠
张迎梁
Original Assignee
叠境数字科技(上海)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 叠境数字科技(上海)有限公司
Publication of WO2021179783A1

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 — Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/21805 — Source of audio or video content enabling multiple viewpoints, e.g. using a plurality of cameras
    • H04N21/2187 — Live feed
    • H04N21/23424 — Splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H04N21/2343 — Reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 — Transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N21/2387 — Stream processing in response to a playback request from an end-user, e.g. for trick-play
    • H04N21/242 — Synchronization processes, e.g. processing of PCR [Program Clock References]
    • H04N21/4307 — Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/44016 — Splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N21/4402 — Reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 — Transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N21/8547 — Content authoring involving timestamps for synchronizing content

Definitions

  • the present invention relates to the field of live video broadcasting, and in particular to a method, device, system, chip and medium for processing live video with switchable viewing angles based on free-viewpoint technology. It is applicable to live broadcasts of streamers, galas, sports events, and the like.
  • the traditional live video broadcast method usually uses one or more cameras to shoot.
  • in a host's live broadcast there is often only one camera, so users can only watch from a single perspective; at a gala or sports event there are multiple cameras on site, a director is responsible for switching between them, and viewers watch the live broadcast according to the director's cuts.
  • the problem with the above live broadcast methods is that the viewer's viewing angle is fixed and not under their control: the viewer can only passively accept the current camera angle and cannot freely choose the angle they want to see.
  • the live broadcaster can actually push multiple video streams to the audience at the same time, so that the audience can freely switch their perspectives.
  • this method does not improve the viewing experience very well.
  • viewers often cannot switch viewing angles in an optimal way, and the switching process suffers from stuttering and discontinuity; in practice the result is often worse than director-controlled switching.
  • bullet time, as a cinematic special effect, offers advantages such as 360-degree viewing and frozen-time playback; it is often used in slow-motion replays and delivers an excellent viewing experience and visual effect.
  • traditional bullet time, however, is a post-production effect that cannot be applied to live broadcasts: the moment and speed of the effect are determined by the effects producer, so users cannot watch a bullet-time effect at any moment they choose, which is very restrictive.
  • the present invention aims to provide a live video processing method, device, system, chip and medium based on free-viewpoint technology, which allows users, while watching a live broadcast, to switch viewing angles freely and smoothly within a certain range, with no delay or stuttering, and further realizes user-controllable dynamic and static bullet-time effects.
  • the technical solution adopted by the present invention includes the following steps:
  • images of synchronized videos with different viewing angles are captured through the camera array.
  • S1 also includes performing image correction on several synchronized videos.
  • the image correction includes: locating calibration points in the images taken by the cameras, calculating the deviation of the measured object relative to a standard position, and generating corrected synchronized video images.
  • the large picture is compressed, then encapsulated into a streaming media format, and then transmitted.
  • timestamps are added to the elementary stream to generate a packetized elementary stream (PES); the header of each PES packet carries a decoding timestamp (DTS) and a presentation timestamp (PTS), which indicate, respectively, when the decoder should decode and when it should display the data.
  • the capture time of the data in the current PES packet is used as the presentation timestamp in the packet header, and the decoding timestamp is computed from the frame type.
  • the user pulls the live data from the server for decoding, obtains the large picture, and selects a sub-picture for display and playback.
  • the pictures of different cameras are switched and displayed, thereby realizing the effect of bullet time.
  • the present invention also provides a live video processing system, including: an acquisition module, a splicing module, a compression module, and a decoding module, wherein: the acquisition module is used to collect multi-angle synchronized videos;
  • the splicing module is used to scale down the synchronized videos and splice them into an ultra-high-resolution large picture, which is compressed by the compression module and transmitted to the server;
  • the decoding module is used to receive the compressed data from the server and decode it, and the user can select the viewing angle for playback.
  • the acquisition module collects synchronous video images of different viewing angles through a camera array.
  • a correction module is also included, and the correction module is used to perform image correction on several synchronized videos.
  • the correction module locates calibration points in the images taken by the cameras, calculates the deviation of the measured object relative to the standard position, and generates corrected synchronized video images.
  • the splicing module is used to scale each synchronized video down to the same low resolution, and then splice the frames captured at the same instant into one large ultra-high-resolution picture.
  • the present invention also provides a video live broadcast device, including a camera array, a server, a main controller, and a player, wherein each server controls several cameras and is connected to the main controller; the main controller is also connected to several players;
  • the camera and server are used for synchronous acquisition and distributed processing of the camera, the main controller is used for image splicing and streaming, and the player is used for video decoding and interactive video rendering.
  • the present invention also provides a chip, which is characterized by comprising a processor, configured to call and run a computer program from a memory, so that a device installed with the chip executes any one of the methods.
  • the present invention also provides an electronic device, including a processor, and a memory for storing executable instructions of the processor, and the processor executes any of the methods when it is running.
  • the present invention also provides a computer-readable medium on which computer program instructions are stored; when the computer program instructions are executed by a processor, any one of the methods described is implemented.
  • the present invention allows users to switch viewpoints freely, and realizes a customizable bullet-time effect through sliding control. This gives viewers great freedom in choosing the viewing angle, so that each viewer can genuinely see a different live video.
  • through image-splicing transmission, the invention guarantees strict synchronization of the video pictures and audio, eliminates the delay and stuttering caused by switching between video streams, and achieves a smooth switching effect.
  • the system is highly extensible: while serving interactive live video broadcasting, the main control computer can also edit and process the pictures to produce videos with bullet-time effects, which can be played back directly at galas and live events.
  • FIG. 1a and 1b are schematic diagrams of the structure of a camera array according to an embodiment of the present invention.
  • Figure 2 is a schematic diagram of the structure of a video live broadcast device.
  • Fig. 3 is a schematic flowchart of an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of an embodiment of the present invention. The specific implementation steps are as follows:
  • S1: Synchronously collect video images of different viewing angles through a circular camera array.
  • a 120-180 degree camera array is usually used.
  • a 120-degree camera array with 16 cameras is taken as an example, with the cameras placed as shown in FIG. 1a. If 360-degree ring shooting is required, more cameras can be added as needed, with corresponding changes to the processing and playback ends, mainly in the image-splicing part.
  • as shown in Figure 1b, the cameras are arranged in a circle and calibrated with the center of the circle as the calibration center.
  • here, low resolution refers to a resolution less than or equal to 960×540, such as 480×240.
  • ultra-high resolution refers to a resolution greater than or equal to 3840×2160.
  • the video live broadcast device based on this embodiment is divided into four parts: a camera, a server, a main controller, and a player.
  • Each server controls several cameras and is connected to the main controller; the main controller is also connected to several players.
  • the camera and the server are mainly responsible for the synchronous acquisition and distributed processing of the camera, the main controller is mainly responsible for image stitching and streaming, and the player is mainly responsible for video decoding and interactive video rendering.
  • the distributed processing uses multiple servers, and each server is connected to 4-8 cameras.
  • the servers are connected by a synchronization line, so that a synchronization signal is relayed from the first server to each subsequent server in turn, triggering all cameras to capture synchronously.
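As a rough illustration of how the main controller might assemble the per-camera frames it receives into synchronized sets, the following Python sketch buckets frame records by trigger timestamp. The `(cam_id, timestamp_ms)` representation and the `tol_ms` tolerance are assumptions for illustration, not details from the patent:

```python
def group_synchronized(frames, tol_ms=5):
    """Group per-camera frames into synchronized capture instants.
    `frames` is a list of (cam_id, timestamp_ms) sorted by time;
    frames whose timestamps fall within tol_ms of the first frame
    of the group are treated as one hardware-triggered instant."""
    groups, current, t0 = [], [], None
    for cam, t in frames:
        if t0 is None or t - t0 > tol_ms:
            if current:
                groups.append(current)
            current, t0 = [], t
        current.append((cam, t))
    if current:
        groups.append(current)
    return groups
```

With three cameras triggered at ~0 ms and two more records at ~33 ms, the sketch yields two synchronized groups, matching the idea that every camera fires once per relayed sync pulse.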
  • the server will encode and transmit the video of each camera picture, so that all the camera pictures are synchronously transmitted to the main controller (computer).
  • the advantage of this distributed synchronous acquisition scheme lies in the modularity and scalability of the system, that is, adding cameras later will not be affected by the performance of a single server and the number of camera interfaces, only the number of servers needs to be added.
  • preprocessing algorithms for the camera images, such as image alignment, geometric correction, and color correction, can all run on the servers, so that each server only needs to process the images of 4-8 cameras, which speeds up processing.
  • S2: Perform image correction and scaling, and stitch the images of different viewing angles into one ultra-high-resolution video frame.
  • since each camera in the array has some error in angle and height after installation, the cameras must first be calibrated to determine their accurate intrinsic and extrinsic parameters, as well as their distortion parameters.
  • calibration points are located in the images taken by the cameras, and the deviation of the measured object relative to the standard position, including the angular deviation and the displacement deviation, is calculated; from this the camera parameters are finally derived.
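A minimal sketch of the deviation computation described above, assuming a simple 2-D pixel model with the circle center as the angular reference; the function name and the point representation are illustrative, not from the patent:

```python
import math

def deviation(detected, expected, center):
    """Given a detected calibration point, its expected (standard)
    position, and the circle center (all (x, y) in pixels), return
    the angular deviation in degrees, measured around the center,
    and the displacement deviation in pixels."""
    ang = lambda p: math.atan2(p[1] - center[1], p[0] - center[0])
    angle_dev = math.degrees(ang(detected) - ang(expected))
    disp_dev = math.dist(detected, expected)
    return angle_dev, disp_dev
```

In a real pipeline these per-point deviations would feed a full camera calibration (intrinsics, extrinsics, distortion), but the two scalar deviations are what the text singles out.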
  • once calibrated, real-time interactive live broadcasting can be carried out.
  • as the camera array synchronously captures the different viewing angles, and each lens distorts to a different degree, the camera parameters obtained from calibration are used to correct the lens distortion. Then, after scaling each camera image to the same low resolution, the images of the different cameras captured at the same instant are spliced into one ultra-high-resolution picture.
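The splicing step for the 16-camera embodiment can be sketched as a simple tiling: sixteen 960×540 frames laid out in a grid yield one 3840×2160 frame. The 4×4 row-major layout is an assumption for illustration (the patent fixes only the per-camera and stitched resolutions, not the tile order):

```python
import numpy as np

TILE_W, TILE_H = 960, 540     # per-camera low resolution
COLS, ROWS = 4, 4             # assumed 4x4 row-major layout

def stitch(frames):
    """Splice 16 synchronized low-res frames (each H x W x 3 uint8)
    into one 3840x2160 ultra-high-resolution frame."""
    assert len(frames) == COLS * ROWS
    big = np.empty((ROWS * TILE_H, COLS * TILE_W, 3), np.uint8)
    for i, f in enumerate(frames):
        r, c = divmod(i, COLS)
        big[r*TILE_H:(r+1)*TILE_H, c*TILE_W:(c+1)*TILE_W] = f
    return big
```

Because the tiling is a pure memory copy, it adds negligible latency compared with the per-camera scaling and correction that precede it.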
  • S3: Compress and encapsulate the ultra-high-resolution picture, and transmit it to the RTMP streaming media server.
  • a hardware video encoder is used to encode and compress the ultra-high-resolution images, and encapsulate them into a certain streaming media format, and then transmit them over the network.
  • the elementary stream output by the encoder is timestamped to generate a packetized elementary stream (PES).
  • the header of each PES packet carries two important pieces of information: the decoding timestamp (DTS) and the presentation timestamp (PTS).
  • the system uses the capture time of the data in the current PES packet as the PTS in the packet header; the DTS can be calculated from the frame type.
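A sketch of "PTS from capture time, DTS from frame type": with B-frames, decode order differs from display order, so DTS is assigned monotonically in decode order while PTS keeps the capture time. The one-frame reorder delay and the 33 ms frame duration are illustrative assumptions, not values from the patent:

```python
FRAME_DUR = 33  # ms, ~30 fps (assumed)

def assign_timestamps(frames):
    """frames: list of (capture_ms, frame_type) in display order.
    Returns dicts with pts/dts in decode order: each I/P anchor is
    emitted before the B-frames that display ahead of it, and DTS
    increases by one frame duration per decoded frame."""
    decode, pending_b = [], []
    for t, f in frames:
        if f == 'B':
            pending_b.append((t, f))      # B waits for its future anchor
        else:
            decode.append((t, f))         # anchor decodes first
            decode.extend(pending_b)
            pending_b = []
    decode.extend(pending_b)
    base = frames[0][0] - FRAME_DUR       # one-frame reorder delay
    return [{'type': f, 'pts': t, 'dts': base + i * FRAME_DUR}
            for i, (t, f) in enumerate(decode)]
```

For a display-order group I B B P, the sketch reproduces the classic decode order I P B B, with DTS never exceeding PTS.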
  • the encapsulated data is transmitted over the network to the RTMP streaming media server for distribution and live broadcast.
  • this embodiment uses 16 cameras, each with a resolution of 960×540 (low resolution); the stitched picture has a resolution of 3840×2160 (ultra-high resolution), that of a standard 4K image, which almost all mainstream devices can decode smoothly. The large picture can then be encoded and streamed to a public network server for users to access.
  • this method ensures that the pictures of all cameras in every frame are strictly synchronized and the audio track is unique, so audio and video never fall out of sync while switching perspectives. Since the pictures of all cameras are delivered to the user as a whole, once the user decodes the entire large picture, switching the viewing angle causes no freezing or delay, which greatly improves the interactive viewing experience.
  • S4: The user terminal pulls the live stream and decodes it to obtain the ultra-high-resolution picture.
  • the user terminal selects the corresponding view of the ultra-high resolution picture according to the current user's view angle for display and playback.
  • the video decoding part of the player uses existing video decoding technology, so the range of devices that can play the interactive video is very wide: it works on both desktop and mobile terminals, and in both native clients and web pages.
  • the client pulls the live data from the streaming media server and decodes it to obtain the ultra-high-resolution picture; it does not display this picture directly, but selects from it the sub-picture corresponding to the current viewing angle for display.
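Selecting the sub-picture is the inverse of the stitching step: a crop of the decoded frame. Again assuming the illustrative 4×4 row-major layout of 960×540 tiles:

```python
import numpy as np

TILE_W, TILE_H = 960, 540   # per-camera resolution
COLS = 4                    # assumed 4x4 row-major grid of 16 views

def select_view(big, cam_idx):
    """Crop the sub-picture of camera `cam_idx` (0-15, row-major)
    out of the decoded 3840x2160 stitched frame; only this crop is
    shown to the user, never the whole frame."""
    row, col = divmod(cam_idx, COLS)
    return big[row*TILE_H:(row+1)*TILE_H,
               col*TILE_W:(col+1)*TILE_W]
```

Because the crop is just an array view over the already-decoded frame, switching `cam_idx` between frames costs essentially nothing, which is why the angle switch is free of stutter.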
  • when the user slides to switch perspectives, a transition picture, that is, the bullet-time picture, is shown, and playback then continues from the viewing angle where the slide ended.
  • for example, suppose the current viewing angle is channel 5, whose picture is being displayed and played.
  • if the user slides to channel 20, the client first plays the transition video: as time advances, one frame is taken from each channel from the 5th to the 20th for display and playback.
  • after reaching channel 20, that channel plays normally until the next switching operation. This lets the user slide the phone screen during playback and switch smoothly and freely between multiple cameras in real time, while the live picture stays in continuous motion without freezing.
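The dynamic bullet-time transition described above can be sketched as a playback schedule: one (frame, camera) pair per step, with live time advancing one frame per step while the camera sweeps toward the target. The function name and the one-frame-per-camera pacing are illustrative assumptions:

```python
def transition_schedule(start_cam, end_cam, start_frame):
    """Plan the bullet-time transition from start_cam to end_cam:
    each step displays one frame from the next camera in the sweep
    while live time keeps advancing one frame per step."""
    step = 1 if end_cam >= start_cam else -1
    cams = range(start_cam, end_cam + step, step)
    return [(start_frame + i, cam) for i, cam in enumerate(cams)]
```

Holding `start_frame + i` fixed at `start_frame` instead would give the static (frozen-time) variant of the effect, so the same schedule covers both dynamic and static bullet time.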
  • the resolution of the stitched picture and of each camera picture can be adjusted as needed. However, with many cameras, if the per-camera resolution is not to be made too small, the stitched video resolution inevitably becomes very high, reaching 8K or more. Decoding such high resolutions demands more device performance, especially on mobile terminals: current mobile devices generally decode up to 4K, so a higher resolution cannot be decoded smoothly on many of them.
  • the specific method is as follows.
  • the frame rate of the stitched video is increased from the standard 30 frames per second to 60 frames per second, so that every two frames carry one instant of camera content. For example, with 32 cameras, every 16 cameras are still stitched into one large picture, yielding two stitched pictures per instant.
  • the playback end decodes two adjacent frames every time, and caches the contents of the two large images into an image array for subsequent rendering of the interactive bullet time effect.
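The two-frames-per-instant scheme can be sketched on the playback side as: split each decoded 4K frame back into its 16 tiles, then concatenate the tiles of the even and odd frames into a 32-view array for the bullet-time renderer. The tile layout and the even/odd camera assignment (cameras 0-15 in the even frame, 16-31 in the odd frame) are assumptions for illustration:

```python
import numpy as np

TILE_W, TILE_H, COLS = 960, 540, 4   # assumed layout, as before

def split_tiles(big):
    """Split one stitched 3840x2160 frame into its 16 camera tiles."""
    rows = big.shape[0] // TILE_H
    return [big[r*TILE_H:(r+1)*TILE_H, c*TILE_W:(c+1)*TILE_W]
            for r in range(rows) for c in range(COLS)]

def views_for_instant(frame_even, frame_odd):
    """Two adjacent 60 fps frames carry one capture instant for
    32 cameras; cache all 32 views for interactive rendering."""
    return split_tiles(frame_even) + split_tiles(frame_odd)
```

The renderer then indexes this 32-element array exactly as it would index the 16 tiles of a single 4K frame, so the interactive logic is unchanged.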
  • the distributed synchronous acquisition scheme used in the system can improve the modularity and scalability of the system.
  • adding cameras later is not limited by the performance of a single server or its number of camera interfaces; it suffices to add more servers.
  • the image preprocessing algorithm can also be placed on the server side for processing, so that each server only needs to process the images of 4-8 cameras, thus accelerating the processing.
  • using multi-view stereo vision, the present invention can also estimate depth information from the cameras for scene reconstruction.
  • the viewpoints can then be densified (virtual viewpoints interpolated between cameras), achieving an even smoother angle-switching effect.
  • this embodiment also facilitates the integration of computer vision algorithms to achieve different visual effects.
  • the present invention also provides an electronic device, including: at least one processor; and a memory coupled with the at least one processor, the memory storing executable instructions which, when executed, implement the above-mentioned method of the present invention.
  • the memory may include random access memory, flash memory, read-only memory, programmable read-only memory, non-volatile memory, or registers.
  • the processor may be a central processing unit (CPU), a graphics processing unit (GPU), or the like.
  • the processor can execute executable instructions stored in the memory to implement the various processes described herein.
  • the memory in this embodiment may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be ROM (read-only memory), PROM (programmable ROM), EPROM (erasable programmable ROM), EEPROM (electrically erasable programmable ROM), or flash memory.
  • the volatile memory may be RAM (Random Access Memory), which is used as an external cache.
  • many forms of RAM are available, such as SRAM (static RAM), DRAM (dynamic RAM), SDRAM (synchronous DRAM), DDR SDRAM (double data rate synchronous DRAM), ESDRAM (enhanced SDRAM), SLDRAM (SyncLink DRAM), and DRRAM (direct Rambus RAM).
  • the memories described herein are intended to include, but are not limited to, these and any other suitable types of memories.
  • the memory stores the following elements (upgrade packages, executable units, or data structures), or a subset or extended set of them: an operating system and application programs.
  • the operating system includes various system programs, such as a framework layer, a core library layer, and a driver layer, which are used to implement various basic services and process hardware-based tasks.
  • the application programs, including various applications, are used to implement various application services.
  • a program that implements the method of the embodiment of the present invention may be included in an application program.
  • the processor calls a program or instruction stored in the memory, specifically, it may be a program or instruction stored in an application program, and the processor is used to execute the above method steps.
  • the embodiment of the present invention also provides a chip for executing the above-mentioned method.
  • the chip includes a processor, which is used to call and run a computer program from the memory, so that the device installed with the chip is used to execute the above method.
  • the present invention also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the above-mentioned method of the present invention are realized.
  • the machine-readable storage medium may include, but is not limited to, various known and unknown types of non-volatile memory.
  • the embodiment of the present invention also provides a computer program product, including computer program instructions, which cause a computer to execute the foregoing method.
  • the disclosed system, electronic device, and method may be implemented in other ways.
  • the division of units is only a logical function division, and there may be other division methods in actual implementation.
  • multiple units or components can be combined or integrated into another system.
  • the coupling between the various units may be direct coupling or indirect coupling.
  • the functional units in the embodiments of the present application may be integrated into one processing unit, or may exist separately as physical units, and so on.
  • the numbering of the processes does not imply their order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a machine-readable storage medium. The technical solution of the present application can therefore be embodied in the form of a software product, which can be stored in a machine-readable storage medium and carries out all or part of the process.
  • the foregoing storage media may include various media capable of storing program codes, such as ROM, RAM, removable disks, hard disks, magnetic disks, or optical disks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A free viewpoint-based live video processing method and system, and an apparatus and a medium. The method comprises: acquiring several synchronized videos from multiple angles; stitching the several synchronized videos into a large image and transmitting it to a server; and decoding to obtain the large image from the server and selecting an angle for playback. The present invention allows a user to switch viewpoints freely, and a user-defined bullet-time effect is achieved by means of a sliding control.

Description

Free viewpoint-based video live broadcast processing method, device, system, chip and medium

Technical Field

The present invention relates to the field of live video streaming, and specifically to a variable-viewpoint live video processing method, device, system, chip and medium based on free-viewpoint technology. It realizes end-to-end, real-time, multi-view interactive live broadcasting and is widely applicable to live events such as anchor shows, galas, and sports broadcasts.

Background

With the rapid development of network multimedia technology, the demand for information processing capability keeps growing. The carrier of information has evolved from text and images to video, and applications such as live video streaming and short-video social networking have become popular. In particular, with the official launch of commercial 5G and the continuous build-out of base stations, a high-speed, convenient network foundation has driven many innovations in the live streaming industry. In traditional live video streaming, users cannot freely choose the viewing angle they want to watch; they basically watch whatever picture the director pushes to them, which lacks entertainment value and novelty.

Traditional live video streaming usually uses one or more cameras for shooting. In an anchor's live stream there is often only one camera, so users can only watch from a single viewpoint; at a gala or sports event there are multiple cameras on site, a director is responsible for switching between shots, and the audience watches whatever the director cuts to. The problem with this approach is that the viewing angle is single and uncontrollable: viewers can only passively accept the current angle and cannot freely choose the one they want to see. With the development of broadband and the rise of 5G, a broadcaster can in fact push multiple video streams to viewers simultaneously so that they can switch viewpoints freely. However, this does not improve the viewing experience much: viewers often cannot switch angles in an optimal way, and the switching process suffers from stutter and discontinuity, so it can actually be worse than director-side switching.

In addition, bullet time, a video special effect used in films, offers 360-degree viewing, frozen time, and other advantages; it is often used in slow-motion replays and delivers an excellent viewing experience and visual effect. However, traditional bullet time is a post-production effect that cannot be applied to live broadcasts, and the clip selection and speed of the effect are fixed by the effects artists, so users cannot watch a bullet-time effect at any moment of their choosing. This is very restrictive.
Summary of the Invention
To solve the existing problems, the present invention aims to provide a live video processing method, device, system, chip and medium based on free-viewpoint technology, which allows users to switch viewpoints smoothly and freely within a certain range while watching a live broadcast, without delay or stutter, and further realizes user-controllable dynamic and static bullet-time effects.

To achieve the above objective, the method of the technical solution adopted by the present invention includes the following steps:

S1: capture several synchronized videos from multiple angles;

S2: stitch the several synchronized videos into a large image and transmit it to a server;

S3: decode to obtain the large image from the server, and select an angle for playback.
In some embodiments, in S1, synchronized video frames of different viewing angles are captured by a camera array.

In some embodiments, S1 further includes performing image correction on the several synchronized videos.

In some embodiments, the image correction includes: locating calibration points in the images captured by the cameras, thereby calculating the deviation of the measured object relative to a standard position and generating corrected synchronized video images. In some embodiments, in S2, the large image is compressed, encapsulated into a streaming media format, and then transmitted.

In some embodiments, time stamps are added to the elementary stream to generate a packetized elementary stream; the header of each packetized elementary stream packet carries two important fields, a decoding time stamp and a presentation time stamp, which respectively indicate the decoding time and the display time of the data at the decoding end.

In some embodiments, the capture time corresponding to the data in the current packetized elementary stream packet is used as the presentation time stamp in that packet's header, and the decoding time stamp is calculated from the frame type.

In some embodiments, in S2, each synchronized video is first downscaled to the same low resolution, and then the several synchronized video frames of the same instant are stitched into one ultra-high-resolution large image.

In some embodiments, in S3, the user pulls the live data from the server and decodes it, obtains the large image, and then selects a picture from it for display and playback.

In some embodiments, after the large image is obtained, when the user swipes the screen or drags the mouse, the display switches between the pictures of different cameras, thereby realizing a bullet-time effect.
The present invention also provides a live video processing system, including an acquisition module, a stitching module, a compression module, and a decoding module, wherein the acquisition module is used to capture multi-angle synchronized videos;

the stitching module is used to stitch the synchronized videos into a large image, which is compressed by the compression module into one ultra-high-resolution picture and transmitted to the server;

the decoding module is used to receive the compressed data from the server and decode it, with the user selecting a viewing angle for playback.
The acquisition module captures synchronized video frames of different viewing angles through a camera array.

The system further includes a correction module for performing image correction on the several synchronized videos.

The correction module locates calibration points in the images captured by the cameras, thereby calculating the deviation of the measured object relative to a standard position and generating corrected synchronized video images.

The stitching module downscales each synchronized video to the same low resolution and then stitches the several synchronized video frames of the same instant into one ultra-high-resolution large image.
The present invention also provides a live video broadcast apparatus, including a camera array, servers, a main controller, and players, wherein each server controls several cameras and is connected to the main controller; the main controller is also connected to several players. The cameras and servers are used for synchronized capture and distributed processing, the main controller for image stitching and stream pushing, and the players for video decoding and interactive video rendering.

The present invention also provides a chip, characterized by comprising a processor configured to call and run a computer program from a memory, so that a device equipped with the chip executes any one of the methods described above.

The present invention also provides an electronic device, including a processor and a memory for storing executable instructions of the processor, the processor executing any one of the methods described above when running.

The present invention also provides a computer-readable medium on which computer program instructions are stored; when the computer program instructions are executed by a processor, any one of the methods described above is implemented.
Compared with the prior art, the present invention allows users to switch viewpoints freely and to obtain a user-defined bullet-time effect through sliding control. This gives viewers great freedom in choosing viewing angles, so that each viewer can truly see a different live video.

By transmitting stitched images, the present invention guarantees strict synchronization between the synchronized video pictures and the audio, eliminates the delay and stutter caused by switching between video streams, and achieves a smooth switching effect.

The system is highly extensible: while being used for interactive live video streaming, the individual camera views can also be edited on the main control computer to produce bullet-time videos that are played directly at galas and live sports events.
Brief Description of the Drawings

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1a and Fig. 1b are schematic structural diagrams of a camera array according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of the live video broadcast apparatus;

Fig. 3 is a schematic flowchart of an embodiment of the present invention.
Detailed Description

The implementation of the present application is described below through specific examples in conjunction with Figs. 1-3. Those skilled in the art can easily understand other advantages and effects of the present application from the content disclosed in this specification. The present invention can also be implemented or applied through other different specific embodiments, and the details in this specification can be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present application. It should be noted that, where there is no conflict, the following embodiments and the features in the embodiments can be combined with each other.
Referring to Fig. 3, a schematic flowchart of an embodiment of the present invention, the specific implementation steps are as follows:

S1: synchronously capture video frames of different viewing angles through a ring-shaped camera array.
First, a ring-shaped camera array capable of shooting video from different angles needs to be built. In most live video broadcasts, since the audience is usually interested only in the front of the performer or stage, a 120-180 degree camera array is typically used. This embodiment takes a 120-degree array of 16 cameras as an example, placed as shown in Fig. 1a. If 360-degree surround shooting is required, more cameras can be added as needed, with some changes on the processing and playback ends, mainly in the image-stitching part. Referring to Fig. 1b, the cameras are arranged in a full circle and calibrated with the circle's center as the calibration center. With 24 cameras installed, adjacent cameras are evenly distributed at 15° intervals around the center. Since the angles and heights cannot be perfectly accurate with manual deployment, camera calibration is performed after the array is set up to obtain the camera parameters. During the live broadcast, the captured images are corrected using the camera parameters obtained from calibration. The corrected image of each camera is then scaled to the same low resolution, and the frames of the different cameras at the same instant are stitched into one ultra-high-resolution frame. In the present invention, low resolution means a resolution of at most 960*540, such as 480x240; ultra-high resolution means a resolution of at least 3840x2160.
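The ring geometry just described can be sketched numerically; the function below (its name and the unit-radius circle are our illustrative assumptions, not from the patent) places cameras evenly along an arc and aims each at the center, reproducing the 15° spacing of the 24-camera full circle:

```python
import math

def ring_camera_positions(num_cameras: int, arc_degrees: float, radius: float = 1.0):
    """Place cameras evenly along an arc, all aimed at the arc's center.

    Returns a list of (x, y, yaw_degrees) tuples; yaw points at the origin.
    """
    if num_cameras < 2:
        raise ValueError("need at least two cameras")
    # On a full circle the first and last camera must not coincide.
    step = arc_degrees / num_cameras if arc_degrees >= 360 else arc_degrees / (num_cameras - 1)
    positions = []
    for i in range(num_cameras):
        angle = math.radians(i * step)
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        yaw = math.degrees(math.atan2(-y, -x)) % 360.0  # face the center
        positions.append((x, y, yaw))
    return positions

# 24 cameras on a full circle -> 15 degrees between neighbours
full_ring = ring_camera_positions(24, 360.0)
```

For the 120-degree arc of 16 cameras in this embodiment, `ring_camera_positions(16, 120.0)` spaces neighbours 8 degrees apart.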
Referring to Fig. 2, the live video broadcast apparatus on which this embodiment is based consists of four main parts: cameras, servers, a main controller, and players. Each server controls several cameras and is connected to the main controller; the main controller is also connected to several players. The cameras and servers are mainly responsible for synchronized capture and distributed processing, the main controller for image stitching and stream pushing, and the players for video decoding and interactive video rendering.

In the distributed processing, multiple servers are used, each connected to 4-8 cameras. The servers are linked by a synchronization line so that the synchronization signal can be passed from the first server to each subsequent server in turn, achieving synchronized, triggered capture across all cameras. Each server also video-encodes and transmits each of its camera feeds, so that all camera feeds are delivered synchronously to the main controller (a computer). The advantage of this distributed synchronized-capture scheme is the modularity and scalability of the system: adding cameras later is not limited by the performance or camera-interface count of a single server; only the number of servers needs to grow. Preprocessing algorithms for the camera feeds, such as image alignment and correction or color correction, can also be run on the servers, so each server only has to process the feeds of 4-8 cameras, which accelerates the processing.
S2: perform image correction and scaling on the frames of the different viewing angles and stitch them into one frame of ultra-high video resolution.
Since each camera's angle and height have some error after the array is installed, the cameras must first be calibrated to determine their accurate intrinsic and extrinsic parameters as well as their distortion parameters. Calibration points are located in the images captured by the cameras, and the deviation of the measured object relative to the standard position, including angular deviation and displacement deviation, is calculated; the camera parameters are then computed. Once calibration is complete, real-time interactive live broadcasting can proceed: after the camera array synchronously captures frames of different viewing angles, and because each lens has its own degree of distortion, the camera parameters obtained from calibration are used to correct the lens distortion and generate corrected images. Each camera image is then scaled to the same low resolution, and the frames of the different cameras at the same instant are stitched into one ultra-high-resolution frame.
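The deviation computation described above can be illustrated with a minimal sketch. A real system would use a full calibration toolchain (for example OpenCV's chessboard calibration); the pinhole simplification and all names below are our assumptions, not the patent's method:

```python
import math

def deviation_from_standard(detected_px, standard_px, focal_px):
    """Deviation of a calibration point from its standard position.

    detected_px / standard_px: (x, y) pixel coordinates of the marker.
    focal_px: focal length in pixels, used to convert the pixel offset
              into an angular deviation under a pinhole model.
    Returns (dx, dy, angle_deg): displacement in pixels plus the
    corresponding angular deviation of the optical axis.
    """
    dx = detected_px[0] - standard_px[0]
    dy = detected_px[1] - standard_px[1]
    angle = math.degrees(math.atan2(math.hypot(dx, dy), focal_px))
    return dx, dy, angle
```

A marker found 100 px right of its expected spot with a 1000 px focal length corresponds to an angular deviation of about 5.7 degrees; correction would warp the image to cancel this offset.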
Alternatively, motorized pan-tilt heads can be used to adjust each camera's angle, position, and other parameters very precisely, so that every camera is aimed at the same point with consistent parameters, achieving the same effect.
S3: compress, encapsulate, and transmit the ultra-high-resolution frame to an RTMP streaming media server.
A hardware video encoder is used to encode and compress the ultra-high-resolution frames, which are encapsulated into a streaming media format and then transmitted over the network. Specifically, the elementary stream output by the encoder is first processed and given time stamps to generate a packetized elementary stream. The header of each packetized elementary stream packet carries two important fields, the decoding time stamp (DTS) and the presentation time stamp (PTS), which respectively indicate the decoding time and the display time of the data at the decoding end. To prevent playback from running alternately fast and slow due to network packet loss and similar causes, this system uses the capture time corresponding to the data in the current packet as the PTS in that packet's header; the DTS can be calculated from the frame type. Finally, the encapsulated data is transmitted over the network to an RTMP streaming media server for live-broadcast use.
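The timestamping rule (capture time as PTS, DTS derived from frame type) can be sketched as a toy model. The 90 kHz clock, the fixed reorder delay, and the names below are illustrative assumptions; real encoders derive DTS from the actual reference structure of the stream:

```python
def assign_timestamps(frames, fps=30, clock_hz=90_000, reorder_delay=2):
    """frames: list of (capture_index, frame_type), frame_type in {'I','P','B'}.

    PTS is taken directly from the capture time (shifted by a constant so
    DTS never goes negative); DTS for reference frames ('I'/'P') is moved
    earlier by a fixed reorder delay so they decode before the B-frames
    that depend on them. A toy model of the rule in the text, not a codec.
    """
    tick = clock_hz // fps
    out = []
    for idx, ftype in frames:
        pts = (idx + reorder_delay) * tick  # capture time on a 90 kHz clock
        dts = pts if ftype == "B" else pts - reorder_delay * tick
        out.append({"type": ftype, "pts": pts, "dts": dts})
    return out
```

Because PTS comes from the capture clock rather than from arrival times, frames lost and retransmitted on the network still display at their original pace.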
This embodiment uses 16 cameras, each with a resolution of 960*540 (low resolution); the stitched image has a resolution of 3840*2160 (ultra-high resolution), the resolution of a standard 4K image, which can be decoded smoothly on almost all mainstream devices. The large image can then be pushed via video encoding to a public-network server for users to access. This method ensures that the images of all cameras in each frame are strictly synchronized, and the audio track is unique, so audio and video never fall out of sync while switching viewpoints. Since the feeds of all cameras are delivered to the user as a whole and the user decodes the entire large image, switching viewpoints causes no stutter or delay whatsoever, greatly improving the interactive viewing experience.
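The arithmetic here works out to a 4x4 grid: sixteen 960x540 tiles fill exactly 3840x2160. A minimal NumPy sketch of the tiling and the per-camera crop (function names are ours):

```python
import numpy as np

TILE_W, TILE_H, GRID = 960, 540, 4  # 4x4 grid of 960x540 tiles -> 3840x2160

def stitch(frames):
    """Tile 16 synchronized camera frames (each H x W x 3) into one big image."""
    assert len(frames) == GRID * GRID
    rows = [np.hstack(frames[r * GRID:(r + 1) * GRID]) for r in range(GRID)]
    return np.vstack(rows)

def crop(big, cam):
    """Cut camera `cam`'s tile back out of the stitched image (row-major order)."""
    r, c = divmod(cam, GRID)
    return big[r * TILE_H:(r + 1) * TILE_H, c * TILE_W:(c + 1) * TILE_W]
```

The player decodes the single 4K frame once, then `crop` selects whichever camera the viewer is currently looking at; no per-camera stream needs to be re-fetched.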
S4: the client pulls the live stream and decodes it to obtain the ultra-high-resolution frame; according to the current user viewing angle, the client selects the corresponding view within the ultra-high-resolution frame for display and playback.

The video decoding part of the player uses existing video decoding technology, so the interactive video runs on a very wide range of devices: it can be used on both desktop and mobile, and can be viewed in both native clients and web pages.
The client pulls the live data from the streaming media server and decodes it to obtain the ultra-high-resolution frame, but does not display that frame directly; instead, it selects the view corresponding to the current viewing angle within the ultra-high-resolution frame for display and playback. When the user swipes the display screen left or right, a transition sequence, i.e. the bullet-time footage, is shown along the swipe trajectory, and playback then continues from the viewing angle where the swipe ended. For example, if the current viewpoint is camera 5 and that camera's feed is being displayed, when the user swipes to camera 20, the client first plays the transition video, taking one frame from each feed from camera 5 through camera 20 as time advances; once camera 20 is reached, that feed plays normally until the next switching operation. The user can thus slide the phone screen while the video plays and switch smoothly between cameras in real time, with the live picture in continuous motion and no need for the picture to pause. With a different number of cameras, the resolution of the stitched image and of each camera can be adjusted and modified as needed. However, if there are many cameras and one does not want each camera's view to become too small, the stitched video resolution inevitably becomes very high, reaching 8K or more. Decoding video of such high resolution places heavy demands on device performance, especially on mobile, because the common video decoding capability of current mobile devices is 4K; an excessive resolution makes smooth decoding impossible on many mobile devices.
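The swipe-driven transition described above (one frame from each intermediate camera) amounts to a simple index schedule; a hypothetical sketch:

```python
def transition_schedule(start_cam: int, end_cam: int):
    """Per-frame camera indices for a swipe from start_cam to end_cam.

    While the transition plays, each successive decoded frame is cropped
    from the next camera along the swipe direction; playback then stays
    on end_cam until the next switch.
    """
    step = 1 if end_cam >= start_cam else -1
    return list(range(start_cam, end_cam + step, step))
```

For the example in the text, a swipe from camera 5 to camera 20 yields the 16-step schedule 5, 6, ..., 20, one crop per decoded frame.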
To address this, the video frame rate can be increased. Specifically, the frame rate of the stitched stream is raised from the standard 30 frames per second to 60 frames per second, so that two transmitted frames can carry one capture instant's content. For example, with 32 cameras, every 16 cameras are still stitched into one large image, yielding two stitched images per instant. For every two adjacent decoded frames, the player caches the contents of the two large images into an image array for subsequent rendering of the interactive bullet-time effect.
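The frame-rate-doubling trick is a simple addressing rule: with 32 cameras and 16 tiles per stitched frame, camera k lives in tile k mod 16 of frame k // 16 within each pair of consecutive 60 fps frames. A sketch (names are ours):

```python
CAMS_PER_FRAME = 16  # tiles per stitched 4K frame

def locate(camera_index: int):
    """Map a camera index to (frame_offset, tile_index) within a 60 fps pair.

    frame_offset 0 -> first stitched frame of the pair, 1 -> second;
    tile_index addresses the tile inside that frame (row-major 4x4 grid).
    """
    return divmod(camera_index, CAMS_PER_FRAME)
```

The player buffers both frames of each pair, so any of the 32 views can be rendered without exceeding the 4K decode limit of typical mobile hardware.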
In the above embodiment, the distributed synchronized-capture scheme used in the system improves its modularity and scalability: adding cameras later is not limited by factors such as the performance or camera-interface count of a single server; only the number of servers needs to grow. Image preprocessing algorithms can likewise be run on the server side, so each server only has to process the feeds of 4-8 cameras, which accelerates the processing.

Meanwhile, using multi-view stereo vision, the present invention can also estimate camera depth information for scene reconstruction. On the player side, by combining light-field rendering with virtual-viewpoint interpolation, the viewpoints can be densified, achieving an even smoother viewpoint-switching effect. Combined with a light-field refocusing algorithm, effects such as background blur can be added while switching viewpoints. This embodiment thus also makes it easy to integrate computer-vision algorithms to achieve different visual effects.
In addition, the present invention also provides an electronic device, including: at least one processor; and a memory coupled to the at least one processor and storing executable instructions which, when executed by the at least one processor, implement the above method of the present invention.

For example, the memory may include random access memory, flash memory, read-only memory, programmable read-only memory, non-volatile memory, registers, and so on. The processor may be a central processing unit (CPU) or a graphics processing unit (GPU). The memory can store executable instructions, and the processor can execute the executable instructions stored in the memory to implement the various processes described herein.

It can be understood that the memory in this embodiment may be volatile or non-volatile, or may include both. The non-volatile memory may be ROM (read-only memory), PROM (programmable ROM), EPROM (erasable PROM), EEPROM (electrically erasable PROM), or flash memory. The volatile memory may be RAM (random access memory), used as an external cache. By way of example and not limitation, many forms of RAM are available, such as SRAM (static RAM), DRAM (dynamic RAM), SDRAM (synchronous DRAM), DDR SDRAM (double data rate SDRAM), ESDRAM (enhanced SDRAM), SLDRAM (Synchlink DRAM), and DRRAM (Direct Rambus RAM). The memories described herein are intended to include, but are not limited to, these and any other suitable types of memory.
In some embodiments, the memory stores the following elements, or a subset or an extended set thereof: upgrade packages, executable units, or data structures, namely an operating system and application programs.
The operating system includes various system programs, such as a framework layer, a core library layer, and a driver layer, for implementing various basic services and processing hardware-based tasks. The application programs include various applications for implementing various application services. A program implementing the method of an embodiment of the present invention may be included in an application program.
In an embodiment of the present invention, the processor performs the above method steps by calling a program or instructions stored in the memory, which may specifically be a program or instructions stored in an application program.
An embodiment of the present invention further provides a chip for performing the above method. Specifically, the chip includes a processor configured to call and run a computer program from the memory, so that a device in which the chip is installed performs the above method.
The present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above method of the present invention are implemented.
For example, the machine-readable storage medium may include, but is not limited to, various known and unknown types of non-volatile memory.
An embodiment of the present invention further provides a computer program product comprising computer program instructions that cause a computer to perform the above method.
Those skilled in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of software and electronic hardware. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present application.
In the embodiments of the present application, the disclosed system, electronic device, and method may be implemented in other ways. For example, the division into units is merely a division by logical function; other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system. In addition, the coupling between the units may be direct or indirect. Furthermore, the functional units in the embodiments of the present application may be integrated into one processing unit or may exist as separate physical entities, and so on.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and shall not constitute any limitation on the implementation of the embodiments of the present application.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a machine-readable storage medium. Accordingly, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a machine-readable storage medium and may include a number of instructions causing an electronic device to perform all or part of the processes of the technical solutions described in the embodiments of the present application. The above storage medium may include various media capable of storing program code, such as a ROM, a RAM, a removable disk, a hard disk, a magnetic disk, or an optical disc.
The above content describes only specific implementations of the present application, and the protection scope of the present application is not limited thereto. Those skilled in the art may make changes or substitutions within the technical scope disclosed in the present application, and all such changes or substitutions shall fall within the protection scope of the present application.

Claims (16)

  1. A live video broadcast processing method, characterized by comprising the following steps:
    S1: capturing a plurality of synchronized videos from multiple angles;
    S2: stitching the plurality of synchronized videos into a large image and transmitting it to a server;
    S3: decoding the large image on the server and selecting a viewing angle for playback.
  2. The live video broadcast processing method according to claim 1, wherein, in S1, synchronized video frames of different viewpoints are captured by a camera array.
  3. The live video broadcast processing method according to claim 1 or 2, wherein S1 further comprises performing image correction on the plurality of synchronized videos.
  4. The live video broadcast processing method according to claim 3, wherein the image correction comprises: locating a calibration point in an image captured by the camera, thereby calculating the deviation of the measured object from a standard position, and generating corrected images of the synchronized videos.
  5. The live video broadcast processing method according to claim 1 or 2, wherein, in S2, the large image is compressed, encapsulated into a streaming media format, and then transmitted.
  6. The live video broadcast processing method according to claim 5, wherein time labels are added to the elementary stream to generate a packetized elementary stream, and a decoding timestamp and a presentation timestamp are carried in the header of each packetized elementary stream packet to indicate, respectively, the decoding time and the display time of the data at the decoding end.
  7. The live video broadcast processing method according to claim 5 or 6, wherein the capture time of the data in the current packetized elementary stream packet is used as the presentation timestamp in the header of that packet, and the decoding timestamp is calculated from the frame type.
  8. The live video broadcast processing method according to claim 1 or 2, wherein, in S2, each synchronized video is first compressed to the same low resolution, and the synchronized videos of the same moment are then stitched into one ultra-high-resolution large image.
  9. The live video broadcast processing method according to claim 1, wherein, in S4, the user pulls the live broadcast data from the server and decodes it, and after the large image is obtained, selects a picture from it for display and playback.
  10. The live video broadcast processing method according to claim 1 or 9, wherein, in S4, after the large image is obtained, the pictures of different cameras are switched as the user swipes the screen or drags the mouse, thereby achieving a bullet-time effect.
  11. A live video broadcast processing system, comprising an acquisition module, a stitching module, a compression module, and a decoding module, wherein the acquisition module is configured to capture synchronized videos from multiple angles;
    the stitching module is configured to stitch the synchronized videos into a large image, which is compressed by the compression module and assembled into an ultra-high-resolution large picture, and to transmit it to a server;
    the decoding module is configured to receive the compressed data from the server and decode it, the user selecting a viewpoint for playback.
  12. The live video broadcast processing system according to claim 11, further comprising a correction module configured to perform image correction on the plurality of synchronized videos.
  13. The live video broadcast processing system according to claim 12, wherein the correction module locates a calibration point in an image captured by the camera, thereby calculating the deviation of the measured object from a standard position, and generates corrected synchronized video images.
  14. The live video broadcast processing system according to claim 11, wherein the stitching module is configured to compress each synchronized video to the same low resolution and then stitch the synchronized videos of the same moment into one ultra-high-resolution large image.
  15. A live video broadcast apparatus, comprising a plurality of cameras, a plurality of servers, a main controller, and players, wherein each server controls a plurality of cameras and is connected to the main controller; the main controller is also connected to a plurality of players; the cameras and the servers perform synchronized capture and distributed processing, the main controller performs image stitching and stream pushing, and the players perform video decoding and interactive video rendering.
  16. A computer-readable medium having computer program instructions stored thereon, wherein, when the computer program instructions are processed and executed, the method according to any one of claims 1 to 10 is implemented.
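The tiling step shared by claims 1 and 8, stitching same-moment frames from several cameras into one large image so a single stream carries all viewpoints, can be sketched as follows. This is a minimal illustration over toy 2D-list "frames"; the near-square grid layout and function names are our own assumptions, not the patent's implementation.

```python
import math

def tile_frames(frames, cols=None):
    """Tile equally sized frames (2D lists of pixels) into one large grid image."""
    h, w, n = len(frames[0]), len(frames[0][0]), len(frames)
    cols = cols or math.ceil(math.sqrt(n))  # near-square grid by default (assumption)
    rows = math.ceil(n / cols)
    big = [[0] * (w * cols) for _ in range(h * rows)]
    for idx, frame in enumerate(frames):
        r, c = divmod(idx, cols)            # camera idx -> grid cell
        for y in range(h):
            for x in range(w):
                big[r * h + y][c * w + x] = frame[y][x]
    return big

# Four toy 2x4 "camera frames", each filled with its camera index.
frames = [[[cam] * 4 for _ in range(2)] for cam in range(4)]
big = tile_frames(frames)  # a 2x2 grid, i.e. a 4x8 large image
```

In a real deployment each tile would first be downscaled to the same low resolution (claim 8) before tiling, so the combined image stays within the encoder's limits.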
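The correction idea of claims 4 and 13, locating a calibration point in a camera image and measuring its offset from a standard position, can be illustrated with the toy sketch below. The deviation drives a simple translation here; all names and the translation-only model are assumptions for illustration, since the patent does not specify the correction transform.

```python
def deviation(detected, standard):
    """Per-axis deviation of the detected calibration point from its standard position."""
    return (detected[0] - standard[0], detected[1] - standard[1])

def shift_image(img, dx, dy, fill=0):
    """Translate a 2D image so the calibration point returns to its standard position."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y + dy, x + dx         # sample from the offset location
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out

dx, dy = deviation(detected=(5, 3), standard=(4, 2))  # -> (1, 1)
```

A production system would instead fit a full homography or lens-distortion model from many calibration points; the per-axis offset above only conveys the idea.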
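The timestamp rule of claims 6 and 7, capture time as the presentation timestamp (PTS) and a decoding timestamp (DTS) derived from frame type, can be sketched as follows. The one-B-frame GOP model (DTS = PTS - 2*dur for I/P reference frames, DTS = PTS for B frames) is our own simplifying assumption, not the patent's formula; real muxers carry both stamps in the PES packet header.

```python
def assign_timestamps(frames, dur=40):
    """frames: (frame_type, capture_ms) in decode order -> [(pts, dts)]."""
    stamps = []
    for ftype, capture in frames:
        pts = capture                                  # claim 7: PTS = capture time
        # Reference frames are decoded ahead of the B frame they bracket,
        # so their DTS is pulled earlier (one-B-frame lookahead model).
        dts = pts - 2 * dur if ftype in ("I", "P") else pts
        stamps.append((pts, dts))
    return stamps

# Decode order I, P, B, P, B for display order I, B, P, B, P.
stamps = assign_timestamps([("I", 0), ("P", 80), ("B", 40),
                            ("P", 160), ("B", 120)])
```

Note that the resulting DTS sequence is strictly increasing while PTS follows capture order, which is exactly why both stamps are needed in the packet header.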
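Finally, the interaction of claim 10 can be sketched as: a horizontal swipe or mouse drag maps to a camera index, and the player crops that camera's tile out of the decoded large image. The grid parameters and mapping below are assumptions for the example, not values fixed by the patent.

```python
def camera_from_drag(x, screen_w, num_cams):
    """Map a horizontal drag position to a clamped camera index."""
    return min(num_cams - 1, max(0, int(x * num_cams / screen_w)))

def crop_tile(big, cam, cols, tile_h, tile_w):
    """Cut camera `cam`'s tile out of the stitched large image."""
    r, c = divmod(cam, cols)
    return [row[c * tile_w:(c + 1) * tile_w]
            for row in big[r * tile_h:(r + 1) * tile_h]]

big = [[0, 0, 1, 1],            # 2x2 grid of 1x2 tiles, one per camera
       [2, 2, 3, 3]]
view = crop_tile(big, camera_from_drag(700, 800, 4), cols=2,
                 tile_h=1, tile_w=2)
```

Because every viewpoint is already present in the decoded frame, switching cameras during a drag is a pure crop with no network round trip, which is what makes the bullet-time effect smooth.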
PCT/CN2021/070575 2020-03-11 2021-01-07 Free viewpoint-based video live broadcast processing method, device, system, chip and medium WO2021179783A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010166693.7 2020-03-11
CN202010166693.7A CN111355967A (en) 2020-03-11 2020-03-11 Video live broadcast processing method, system, device and medium based on free viewpoint

Publications (1)

Publication Number Publication Date
WO2021179783A1 true WO2021179783A1 (en) 2021-09-16



Country Status (2)

Country Link
CN (1) CN111355967A (en)
WO (1) WO2021179783A1 (en)


Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111355967A (en) * 2020-03-11 2020-06-30 叠境数字科技(上海)有限公司 Video live broadcast processing method, system, device and medium based on free viewpoint
CN111866525A (en) * 2020-09-23 2020-10-30 腾讯科技(深圳)有限公司 Multi-view video playing control method and device, electronic equipment and storage medium
CN114513674A (en) * 2020-11-16 2022-05-17 上海科技大学 Interactive live broadcast data transmission/processing method, processing system, medium and server
CN112887744B (en) * 2021-01-21 2022-03-04 上海薏欣文化传播有限公司 Live broadcast data transmission control method for large healthy intelligent live broadcast hall
CN114915798A (en) * 2021-02-08 2022-08-16 阿里巴巴集团控股有限公司 Real-time video generation method, multi-camera live broadcast method and device
CN114915823B (en) * 2021-02-08 2024-04-02 腾讯科技(北京)有限公司 Video playing control method and device, storage medium and electronic equipment
CN113242452A (en) * 2021-06-15 2021-08-10 中国人民解放军91388部队 Video display method, device, system, equipment and storage medium
CN113596583A (en) * 2021-08-05 2021-11-02 四开花园网络科技(广州)有限公司 Video stream bullet time data processing method and device
CN113573079B (en) * 2021-09-23 2021-12-24 北京全心数字技术有限公司 Method for realizing free visual angle live broadcast mode
CN113938711A (en) * 2021-10-13 2022-01-14 北京奇艺世纪科技有限公司 Visual angle switching method and device, user side, server and storage medium
CN115174942A (en) * 2022-07-08 2022-10-11 叠境数字科技(上海)有限公司 Free visual angle switching method and interactive free visual angle playing system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140059588A1 (en) * 2012-08-22 2014-02-27 Yahoo Japan Corporation Advertisement distribution apparatus and advertisement distribution method
CN105847851A (en) * 2016-04-19 2016-08-10 北京金山安全软件有限公司 Panoramic video live broadcast method, device and system and video source control equipment
CN105872569A (en) * 2015-11-27 2016-08-17 乐视云计算有限公司 Video playing method and system, and devices
CN106550239A (en) * 2015-09-22 2017-03-29 北京同步科技有限公司 360 degree of panoramic video live broadcast systems and its implementation
CN107396085A (en) * 2017-08-24 2017-11-24 三星电子(中国)研发中心 A kind of processing method and system of full multi-view video image
CN111355967A (en) * 2020-03-11 2020-06-30 叠境数字科技(上海)有限公司 Video live broadcast processing method, system, device and medium based on free viewpoint

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR200220325Y1 (en) * 1998-12-29 2001-05-02 김형국 Dozer for Valve Packing
CN100588250C (en) * 2007-02-05 2010-02-03 北京大学 Method and system for rebuilding free viewpoint of multi-view video streaming
CN102307309A (en) * 2011-07-29 2012-01-04 杭州电子科技大学 Somatosensory interactive broadcasting guide system and method based on free viewpoints
CN109495760A (en) * 2018-12-25 2019-03-19 虎扑(上海)文化传播股份有限公司 A kind of method of multiple groups camera live broadcasting


Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113891044A (en) * 2021-09-29 2022-01-04 天翼物联科技有限公司 Video live broadcast method and device, computer equipment and computer readable storage medium
CN113891111A (en) * 2021-09-29 2022-01-04 北京拙河科技有限公司 Live broadcast method, device, medium and equipment for billion pixel video
CN113891112A (en) * 2021-09-29 2022-01-04 北京拙河科技有限公司 Live broadcast method, device, medium and equipment for billion pixel video
CN113891112B (en) * 2021-09-29 2023-12-05 北京拙河科技有限公司 Live broadcasting method, device, medium and equipment of billion pixel video
CN113891111B (en) * 2021-09-29 2023-11-21 北京拙河科技有限公司 Live broadcasting method, device, medium and equipment of billion pixel video
CN114189696A (en) * 2021-11-24 2022-03-15 阿里巴巴(中国)有限公司 Video playing method and device
CN114189696B (en) * 2021-11-24 2024-03-08 阿里巴巴(中国)有限公司 Video playing method and device
WO2023103843A1 (en) * 2021-12-10 2023-06-15 华为技术有限公司 Method, device, and system for displaying bullet comment of free viewpoint video
CN114245129A (en) * 2022-02-22 2022-03-25 湖北芯擎科技有限公司 Image processing method, image processing device, computer equipment and storage medium
CN114697501B (en) * 2022-03-23 2023-08-11 南京云创大数据科技股份有限公司 Time-based monitoring camera image processing method and system
CN114697501A (en) * 2022-03-23 2022-07-01 南京云创大数据科技股份有限公司 Monitoring camera image processing method and system based on time
CN115209126A (en) * 2022-07-01 2022-10-18 上海建桥学院有限责任公司 Bullet time stereo image acquisition system and synchronous control method
CN115499673A (en) * 2022-08-30 2022-12-20 深圳市思为软件技术有限公司 Live broadcast method and device
CN115499673B (en) * 2022-08-30 2023-10-20 深圳市思为软件技术有限公司 Live broadcast method and device
CN115834921A (en) * 2022-11-17 2023-03-21 北京奇艺世纪科技有限公司 Video processing method, video processing apparatus, video processing server, storage medium, and program product
CN116016978B (en) * 2023-01-05 2024-05-24 香港中文大学(深圳) Picture guiding and broadcasting method and device for online class, electronic equipment and storage medium
CN116016978A (en) * 2023-01-05 2023-04-25 香港中文大学(深圳) Picture guiding and broadcasting method and device for online class, electronic equipment and storage medium
CN116366905B (en) * 2023-02-28 2024-01-09 北京优酷科技有限公司 Video playing method and device and electronic equipment
CN116366905A (en) * 2023-02-28 2023-06-30 北京优酷科技有限公司 Video playing method and device and electronic equipment
CN116614648A (en) * 2023-04-18 2023-08-18 天翼数字生活科技有限公司 Free view video display method and system based on view angle compensation system
CN116614648B (en) * 2023-04-18 2024-06-07 天翼数字生活科技有限公司 Free view video display method and system based on view angle compensation system
CN116614650A (en) * 2023-06-16 2023-08-18 上海随幻智能科技有限公司 Voice and picture synchronous private domain live broadcast method, system, equipment, chip and medium
CN117579843A (en) * 2024-01-17 2024-02-20 淘宝(中国)软件有限公司 Video coding processing method and electronic equipment
CN117579843B (en) * 2024-01-17 2024-04-02 淘宝(中国)软件有限公司 Video coding processing method and electronic equipment
CN117939183A (en) * 2024-03-21 2024-04-26 中国传媒大学 Multi-machine-position free view angle guided broadcasting method and system

Also Published As

Publication number Publication date
CN111355967A (en) 2020-06-30


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21767063; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21767063; Country of ref document: EP; Kind code of ref document: A1)