WO2021218573A1 - Video playback method, apparatus and system, and computer storage medium - Google Patents

Video playback method, apparatus and system, and computer storage medium

Info

Publication number
WO2021218573A1
WO2021218573A1 · PCT/CN2021/085477 · CN2021085477W
Authority
WO
WIPO (PCT)
Prior art keywords
playback
video
surround
time
rotation
Application number
PCT/CN2021/085477
Other languages
English (en)
French (fr)
Inventor
郑洛
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd.
Priority to EP21797606.7A (published as EP4135312A4)
Publication of WO2021218573A1
Priority to US17/977,404 (published as US20230045876A1)

Classifications

    • H04N 7/181: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast, for receiving images from a plurality of remote sources
    • G06F 16/745: Information retrieval of video data; browsing or visualisation of the internal structure of a single video sequence
    • H04N 5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects
    • G06F 16/743: Information retrieval of video data; browsing or visualisation of a collection of video files or sequences
    • G06F 3/04847: Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
    • H04N 21/21805: Source of audio or video content enabling multiple viewpoints, e.g. using a plurality of cameras
    • H04N 21/2393: Interfacing the upstream path of the transmission network, involving handling client requests
    • H04N 21/2665: Gathering content from different sources, e.g. Internet and satellite
    • H04N 21/4728: End-user interface for selecting a Region Of Interest [ROI], e.g. for requesting a higher-resolution version of a selected region
    • H04N 21/6587: Control parameters, e.g. trick play commands, viewpoint selection
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/90: Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
    • H04N 5/2624: Studio circuits for obtaining an image which is composed of whole input images, e.g. splitscreen
    • H04N 5/265: Mixing

Definitions

  • This application relates to the technical field of video processing, and in particular to a video playback method, device and system, and computer storage medium.
  • Surround playback requires front-end shooting with multiple cameras placed at specific locations to capture video images of the same focal area from different angles. Camera synchronization technology is used to ensure that the multiple cameras capture images at the same time and at the same frequency. The cameras then send the captured video streams to a video processing platform, which processes the multiple video streams and thereby realizes surround playback of the focal area on the terminal.
  • In related technologies, the server usually splices video frames with the same capture time from multiple video streams into a single video frame.
  • For example, front-end shooting uses 16 cameras to capture video images of the same focal area from different angles.
  • The server adjusts the resolution of the video frames in each of the 16 received video streams to 960×540, and then combines the 16 video frames with the same capture time into a 4×4 grid, obtaining a single video frame with a resolution of 3840×2160 (that is, 4K) and thus a combined video stream.
  • The server sends the combined video stream to the terminal. After decoding it, the terminal selects 1/16 of each video frame (the video image captured by one camera) for playback according to the set viewing camera position.
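The tiling arithmetic of this related-technology example can be sketched as follows (a hypothetical helper, not from the application itself):

```python
# Sketch of the 4x4 mosaic described above: 16 streams downscaled to
# 960x540 and packed into one 3840x2160 (4K) frame; the terminal then
# crops a single tile (1/16 of the frame) to play one camera's view.

TILE_W, TILE_H = 960, 540   # per-camera resolution after downscaling
GRID = 4                    # 4x4 layout holds 16 camera views

def tile_rect(camera_index):
    """Return (x, y, w, h) of the tile for camera_index (0..15)
    inside the mosaic frame, assuming row-major placement."""
    row, col = divmod(camera_index, GRID)
    return (col * TILE_W, row * TILE_H, TILE_W, TILE_H)
```

This illustrates why the effective playback resolution of the related technology is capped at 960×540 per view regardless of the source resolution, which is the application limitation the present application addresses.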
  • This application provides a video playback method, device, system, and computer storage medium, which can solve the problem that video playback in related technologies is subject to significant application limitations.
  • In a first aspect, a video playback method is provided. The method includes: an upper-layer device receives a surround playback request sent by a terminal, where the surround playback request includes rotation camera position information, and the rotation camera position information is used to indicate a rotation range.
  • the upper-layer device determines playback time information based on the surround playback request.
  • the upper-layer device generates a rotating segment based on the rotation camera position information and the playback time information.
  • the rotating segment includes groups of pictures (GOPs) corresponding to the multiple camera positions within the rotation range.
  • each GOP includes one or more frames of video images.
  • the upper-layer device sends the rotating segment to the terminal.
  • the rotation camera position information includes one or more of the identifier of the start camera position, the identifier of the end camera position, the rotation direction, or the rotation angle.
  • the playback time information includes one or more of the playback start time, the playback end time, or the surround playback duration; alternatively, the playback time information includes the target playback moment.
  • the video playback method provided in this application is not limited by the number of cameras used for front-end shooting, and has a wide range of applications.
  • the play time information includes the play start time and the play end time.
  • the process by which the upper-layer device generates the rotating segment according to the rotation camera position information and the playback time information includes: the upper-layer device obtains m video segments corresponding to each of the multiple camera positions in the period from the playback start time to the playback end time, where m is a positive integer.
  • the upper-layer device extracts one or more GOPs from the m video segments corresponding to each camera position according to the playback time information.
  • the upper-layer device assembles the extracted GOPs to obtain the rotating segment.
  • Specifically, the upper-layer device extracts one or more GOPs from the m video segments corresponding to each camera position according to the playback time information as follows: the upper-layer device determines, according to the surround playback duration and the number of camera positions, the GOP extraction time and the number of GOPs to extract for each camera position, where the surround playback duration is equal to the difference between the playback end time and the playback start time.
  • the upper-layer device then extracts GOPs from the m video segments corresponding to each camera position according to that camera position's GOP extraction time and number of GOPs to extract.
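A minimal sketch of this per-camera-position GOP schedule. Assumptions not fixed by the text: times are in seconds, the surround playback duration is split evenly across the camera positions in rotation order, and every GOP is `gop_len` seconds long.

```python
# Hedged sketch of the GOP extraction schedule described above: for each
# camera position, compute when to start extracting GOPs and how many to
# extract, given the surround playback window and the list of positions.

def gop_schedule(play_start, play_end, camera_ids, gop_len=0.1):
    duration = play_end - play_start           # surround playback duration
    per_cam = duration / len(camera_ids)       # time span covered per camera
    n_gops = max(1, round(per_cam / gop_len))  # GOPs extracted per camera
    schedule = []
    for i, cam in enumerate(camera_ids):
        extract_at = play_start + i * per_cam  # GOP extraction time
        schedule.append((cam, extract_at, n_gops))
    return schedule
```

Concatenating the extracted GOPs in schedule order yields a segment that sweeps through the camera positions while playback time keeps advancing, i.e. a dynamic rotating segment.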
  • the rotating segment generated by the upper-layer device is a dynamic rotating segment.
  • the upper-layer device can generate dynamic rotating segments during the terminal video playback process, thereby realizing dynamic surround playback on the terminal.
  • dynamic surround playback of video content by the terminal means that the terminal plays video images continuously in time sequence; that is, each video image frame the terminal plays and the frame played before it are two frames captured consecutively in time.
  • the play time information includes the target play time.
  • the process by which the upper-layer device generates the rotating segment according to the rotation camera position information and the playback time information includes: the upper-layer device obtains the target video segment corresponding to each of the multiple camera positions, where the time period corresponding to the target video segment includes the target playback moment.
  • the upper-layer device extracts, from the target video segment corresponding to each camera position, a GOP corresponding to the target playback moment, and this GOP includes one frame of video image.
  • the upper-layer device assembles the extracted GOPs to obtain the rotating segment.
  • In this case, the rotating segment generated by the upper-layer device is a static rotating segment.
  • the upper-layer device can generate static rotating segments while the terminal's video is paused, so as to realize static surround playback on the terminal.
  • static surround playback of video content by the terminal refers to the terminal playing, in surround fashion, video images captured by the multiple cameras at the same moment.
  • the upper-layer device determines the start camera position, the end camera position, and the rotation direction according to the rotation camera position information.
  • the upper-layer device determines the multiple camera positions from the start camera position to the end camera position along the rotation direction.
  • when assembling the extracted GOPs into the rotating segment, the upper-layer device assembles the extracted GOPs in sequence according to the rotation direction.
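The position-selection step above can be sketched as follows, assuming positions numbered 1..n around a ring (as in the 20-camera scene described later); the helper itself is hypothetical:

```python
# Minimal sketch of selecting the camera positions from the start
# position to the end position along the rotation direction. Positions
# wrap around the ring, so a clockwise path from 18 to 2 passes 19, 20, 1.

def camera_path(start, end, n, clockwise=True):
    step = 1 if clockwise else -1
    path, cam = [start], start
    while cam != end:
        cam = (cam - 1 + step) % n + 1  # wrap around within 1..n
        path.append(cam)
    return path
```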
  • In one implementation, the upper-layer device determines the playback time information based on the surround playback request as follows: the upper-layer device determines the playback start time and the playback end time according to the time at which the surround playback request was received and a preset strategy, where the preset strategy includes a preset surround playback duration.
  • Alternatively, the surround playback request includes the playback start time and the playback end time,
  • and the upper-layer device determines the playback time information by reading the playback start time and the playback end time from the surround playback request.
  • Alternatively, the surround playback request includes the playback start time, and the upper-layer device determines the playback end time according to the playback start time and the preset surround playback duration.
  • Alternatively, the surround playback request includes the surround playback duration, and the upper-layer device determines the playback start time and the playback end time according to the time at which the surround playback request was received and that surround playback duration.
  • Alternatively, the surround playback request includes the playback start time and the surround playback duration,
  • and the upper-layer device determines the playback end time according to the playback start time and the surround playback duration.
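The alternatives above can be collapsed into one hypothetical helper; the request field names and the preset duration value are assumptions, not taken from the application:

```python
# Resolve (playback start, playback end) from whatever the surround
# playback request carries, falling back to the request-arrival time and
# a preset surround playback duration when fields are absent.

DEFAULT_SURROUND_DURATION = 2.0  # preset surround playback duration (s)

def resolve_play_times(request, received_at):
    start = request.get("play_start", received_at)
    duration = request.get("surround_duration", DEFAULT_SURROUND_DURATION)
    end = request.get("play_end", start + duration)
    return start, end
```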
  • the GOP is encoded in an independent transmission encapsulation mode.
  • Each GOP can be used as a separate slice for independent transmission.
  • In a second aspect, a video playback method is provided. The method includes: when the terminal receives a rotation instruction, the terminal sends to the upper-layer device a surround playback request generated based on the rotation instruction, where the surround playback request includes rotation camera position information and the rotation camera position information is used to indicate the rotation range.
  • the terminal receives the rotating segment sent by the upper-layer device; the rotating segment includes the GOPs corresponding to the multiple camera positions within the rotation range, and each GOP includes one or more frames of video images. The terminal then decodes and plays the rotating segment.
  • In one implementation, when the terminal detects a sliding operation, it determines that the rotation instruction is received.
  • the terminal determines the rotation camera position information according to the sliding information of the sliding operation, where the sliding information includes one or more of the sliding start position, the sliding length, the sliding direction, or the sliding angle.
  • the terminal then generates the surround playback request based on the rotation camera position information.
  • In another implementation, when the terminal receives a target remote control instruction, it determines that the rotation instruction is received; the target remote control instruction includes remote control button information, and the remote control button information includes the button identifier and/or the number of button presses.
  • the terminal determines the rotation camera position information based on the remote control button information,
  • and then generates the surround playback request based on the rotation camera position information.
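As an illustration of the sliding-operation case, a terminal might map a swipe to rotation camera position information as follows; the pixels-per-position ratio and all field names are invented for this sketch:

```python
# Hypothetical swipe-to-rotation mapping: the horizontal swipe distance
# determines how many camera positions to rotate through, and its sign
# determines the rotation direction.

PIXELS_PER_POSITION = 40  # swipe distance advancing one camera position

def rotation_info_from_swipe(current_pos, swipe_dx, n_positions):
    offset = int(swipe_dx / PIXELS_PER_POSITION)
    end_pos = (current_pos - 1 + offset) % n_positions + 1
    return {
        "start_position": current_pos,
        "end_position": end_pos,
        "direction": "clockwise" if swipe_dx >= 0 else "counterclockwise",
    }
```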
  • the terminal does not need to change its playback logic: it only needs to send the surround playback request to the upper-layer device after receiving the rotation instruction and then decode the rotating segment, thereby realizing surround playback of the video picture, whose resolution can be the same as the resolution of the video images in the rotating segment. Therefore, the video playback method provided in this application is not limited by the number of cameras used for front-end shooting, and has a wide range of applications.
  • In a third aspect, a video playback device is provided. The device includes a plurality of functional modules, and these functional modules interact to implement the method in the above first aspect and its embodiments.
  • the multiple functional modules can be implemented based on software, hardware, or a combination of software and hardware, and the multiple functional modules can be combined or divided arbitrarily based on specific implementations.
  • In a fourth aspect, a video playback device is provided. The device includes a plurality of functional modules, and these functional modules interact to implement the method in the above second aspect and its embodiments.
  • the multiple functional modules can be implemented based on software, hardware, or a combination of software and hardware, and the multiple functional modules can be combined or divided arbitrarily based on specific implementations.
  • In a fifth aspect, a video playback system is provided, including an upper-layer device and a terminal, where the upper-layer device includes the video playback device described in the third aspect, and the terminal includes the video playback device described in the fourth aspect.
  • In a sixth aspect, a video playback device is provided, including a processor and a memory;
  • the memory is used to store a computer program, and the computer program includes program instructions;
  • the processor is configured to call the computer program to implement the video playback method according to any one of the first aspect, or the video playback method according to any one of the second aspect.
  • In a seventh aspect, a computer storage medium is provided, and instructions are stored on the computer storage medium.
  • when the instructions are executed by a processor of a computer device, the video playback method described in the first aspect or the second aspect is implemented.
  • In an eighth aspect, a chip is provided, including a programmable logic circuit and/or program instructions. When the chip runs, it implements the method in the first aspect and its implementations, or the method in the second aspect and its implementations.
  • the upper-layer device determines the playback time information according to the surround playback request sent by the terminal, and then generates the rotating segment according to the playback time information and the rotation camera position information in the surround playback request. Since the rotating segment contains the GOPs corresponding to the multiple camera positions within the rotation range indicated by the rotation camera position information, after receiving the rotating segment the terminal decodes it to achieve surround playback of the video picture. The resolution of the played video image may be the same as the resolution of the video images in the rotating segment. Therefore, the video playback method provided by the embodiments of the present application is not limited by the number of cameras used for front-end shooting, and has a wide range of applications.
  • the upper-layer device can be a video distribution server or a network device, which can reduce the requirements on the processing performance of the video processing server and achieve high reliability.
  • FIG. 1 is a schematic structural diagram of a video playback system provided by an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a video segment provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a camera distribution scene on the media source side according to an embodiment of the present application
  • FIG. 4 is a schematic flowchart of a video playback method provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a generation process of a rotating segment provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another rotating segment generation process provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a video playback device provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of another video playback device provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of another video playback device provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of still another video playback device provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of another video playback device provided by an embodiment of the present application.
  • FIG. 12 is a block diagram of a video playback device provided by an embodiment of the present application.
  • FIG. 1 is a schematic structural diagram of a video playback system provided by an embodiment of the present application. As shown in FIG. 1, the system includes: a media source 101, a video server 102, and a terminal 103.
  • the media source 101 is used to provide multiple video streams.
  • the media source 101 includes a plurality of cameras 1011 and a front-end encoder 1012.
  • the camera 1011 is connected to the front-end encoder 1012.
  • Each camera 1011 is used to collect a video stream and transmit the collected video stream to the front-end encoder 1012.
  • the front-end encoder 1012 is configured to encode video streams collected by multiple cameras 1011 and send the encoded video streams to the video server 102.
  • multiple cameras 1011 are used to collect video images of different angles in the same focal area, and the multiple cameras 1011 collect images at the same time and frequency.
  • a camera synchronization technology can be used to achieve synchronized shooting of multiple cameras 1011.
  • the number of cameras in the figure is only used as an exemplary description, and not as a limitation to the video playback system provided in the embodiment of the present application.
  • the multiple cameras may be arranged in a circle, in a sector, or in another layout.
  • the embodiment of the present application does not limit the arrangement of the cameras.
  • the video server 102 is configured to process the video stream sent by the media source 101 using OTT (over the top) technology, and distribute the processed video stream to the terminal through a content delivery network (CDN).
  • CDN is an intelligent virtual network built on the basis of existing networks, relying on edge servers deployed everywhere.
  • the video server 102 includes a video processing server 1021 and a video distribution server 1022.
  • the video processing server 1021 is used to process the video stream using OTT technology and send the processed video stream to the video distribution server 1022; the video distribution server 1022 is used to distribute the video stream to the terminal.
  • the video processing server 1021 may also be referred to as a video processing platform.
  • the video processing server 1021 may be a server, or a server cluster composed of several servers, or a cloud computing service center.
  • the video distribution server 1022 is an edge server.
  • the terminal 103 is a video playback terminal, and is used to decode and play the video stream sent by the video server 102.
  • the terminal 103 can change the playback angle of view through one or more control methods, such as touch control, voice control, gesture control, or remote controller control.
  • the embodiment of the present application does not limit the control method for triggering the terminal to change the playback angle.
  • the terminal 103 may be a mobile phone, a tablet computer, or a smart wearable device that can change the playback angle through touch or voice control.
  • the terminal 103 may also be a device such as a set top box (STB) that can change the playback angle through the control of a remote controller.
  • the video server 102 and the terminal 103 transmit video streams based on a hypertext transfer protocol (HTTP).
  • the front-end encoder 1012 on the media source 101 side, or the video processing server 1021 on the video server 102 side, obtains the multiple video streams, re-encodes (which may also be called transcodes) each video stream to obtain GOPs, and generates video segments for transmission based on the GOPs.
  • multiple GOPs are usually encapsulated in one video segment, and each GOP includes one or more frames of video images.
  • a GOP is a group of video images that are continuous in time.
  • the time stamp of the GOP obtained by re-encoding the video stream corresponds to the time when the camera captures the video image in the GOP.
  • for example, the time stamp of the GOP can be set to the capture time of the last frame of video image in the GOP.
  • alternatively, the GOP corresponds to a start timestamp and an end timestamp:
  • the start timestamp is the capture time of the first frame of video image in the GOP, and the end timestamp is the capture time of the last frame of video image in the GOP.
  • the time length of the GOP is less than or equal to 100 milliseconds.
  • the time parameter of GOP can be set by the manager.
  • the number of video image frames contained in each GOP is positively correlated with the camera's shooting frame rate; that is, the higher the frame rate, the more video image frames each GOP contains.
  • for example, the GOP may include 2 frames of video images (corresponding to a frame rate of 25 frames per second (FPS), abbreviated 25FPS), 3 frames (corresponding to 30FPS), 5 frames (corresponding to 50FPS), or 6 frames (corresponding to 60FPS).
  • the GOP may also include only one frame of video image, or more frames, which is not limited in the embodiments of the present application.
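The frame counts above follow from capping the GOP length at 100 milliseconds; as a sketch:

```python
# A camera shooting at fps frames per second fits at most fps * 0.1
# whole frames into one GOP window. The 100 ms cap comes from the text
# above; as noted there, the time parameter is configurable.

MAX_GOP_SECONDS = 0.1  # GOP time length capped at 100 milliseconds

def frames_per_gop(fps):
    # largest whole number of frames that fits in one GOP window
    return max(1, int(fps * MAX_GOP_SECONDS))
```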
  • the GOPs in the video fragments are encoded in an independent transmission encapsulation manner, so that each GOP can be used as a separate chunk for independent transmission.
  • for example, video segments may be encapsulated in the fragmented MP4 (fmp4) format.
  • the fmp4 format is a streaming media format defined in the MPEG-4 standard proposed by the Moving Picture Experts Group (MPEG).
  • FIG. 2 is a schematic structural diagram of a video segment provided by an embodiment of the present application. As shown in FIG. 2, the video segment includes n encapsulation headers and n data fields (mdat). Each mdat carries the data of one GOP; that is, n GOPs are encapsulated in the video segment.
  • Each encapsulation header includes a moof field.
  • the encapsulation method of the video segmentation may also be referred to as a multi-moof header encapsulation method.
  • the encapsulation header may also include a styp field and a sidx field.
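A toy model of this multi-moof layout may help. Real fmp4 encapsulation headers and mdat boxes are binary ISO-BMFF structures; this sketch only mirrors the pairing of one header and one mdat per GOP, which is what lets each GOP be transmitted as an independent chunk:

```python
# Build a simplified multi-moof "segment": n (header, mdat) pairs, one
# GOP payload per pair. The styp brand and sidx contents here are
# placeholders, not a faithful box encoding.

def build_segment(gops):
    """gops: list of per-GOP payloads (bytes)."""
    segment = []
    for seq, payload in enumerate(gops, start=1):
        header = {"styp": "msdh", "sidx": len(payload), "moof": {"seq": seq}}
        segment.append((header, payload))  # one header + one mdat per GOP
    return segment
```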
  • the video processing server 1021 on the video server 102 side generates a media content index (also referred to as an OTT index) according to externally set data.
  • the media content index is used to describe the information of each video stream; it is essentially a file describing video stream information.
  • the information of the video stream includes the address information of the video stream and the time information of the video stream.
  • the address information of the video stream is used to indicate the acquisition address of the video stream.
  • the address information of the video stream may be a uniform resource locator (URL) address corresponding to the video stream.
  • the time information of the video stream is used to indicate the start time and end time of each video segment in the video stream.
  • the media content index may also include camera location information.
  • the camera position information includes the number of camera positions (that is, the number of cameras on the media source side) and the camera position angle corresponding to each video stream.
  • the camera position angle corresponding to a video stream is the position angle of the camera that captures that video stream.
  • FIG. 3 is a schematic diagram of a camera distribution scene on the media source side according to an embodiment of the present application.
  • the scene includes 20 cameras, denoted as cameras 1-20.
  • the 20 cameras are arranged in a circular arrangement for shooting the same focal area M, and the shooting focal point is point O.
  • the camera angle corresponding to one of the cameras can be set to 0, and the camera angle corresponding to other cameras can be calculated accordingly.
  • the camera position angle corresponding to camera 4 can be set to 0°, and the camera position angles corresponding to other cameras can be calculated respectively.
  • the camera position angle corresponding to camera 9 is 90°
  • the camera position angle corresponding to camera 14 is 180°
  • the camera position angle corresponding to camera 19 is 270°.
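The camera position angles in the example above follow from the even circular arrangement: 20 cameras are spaced 360/20 = 18° apart, and camera 4 is defined as 0°. A minimal sketch (the clockwise-increasing numbering is inferred from the example values and is an assumption):

```python
def camera_angle(camera_id, zero_id=4, num_cameras=20):
    """Position angle of a camera on an evenly spaced ring,
    with camera `zero_id` defined as 0 degrees."""
    step = 360 / num_cameras  # 18 degrees between adjacent cameras
    return ((camera_id - zero_id) % num_cameras) * step

assert camera_angle(9) == 90.0    # matches the example: camera 9 -> 90
assert camera_angle(14) == 180.0  # camera 14 -> 180
assert camera_angle(19) == 270.0  # camera 19 -> 270
```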
  • Managers can input the number of cameras and the corresponding camera angles of each camera into the video processing server for the video processing server to generate a media content index.
  • the media content index in the embodiment of the present application may be an m3u8 file (may be called an HLS index) or a media presentation description (media presentation description, MPD) file (may be called a DASH index).
  • m3u8 files refer to m3u files in UTF-8 encoding format.
  • the process for the terminal to obtain the video content in the video server includes: the terminal first downloads the media content index from the video server, and obtains the information of the video stream by analyzing the media content index.
  • the terminal selects the video stream that currently needs to be played, extracts the URL address of the video stream from the media content index, and then sends a media content request to the video server through the URL address of the video stream.
  • after receiving the media content request, the video server sends the corresponding video stream to the terminal.
  • the video playback system may also include a network device 104, and the video server 102 and the terminal 103 are connected through the network device 104.
  • the network device 104 may be a gateway or other intermediate device.
  • the video server 102 and the terminal 103 may also be directly connected, which is not limited in the embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a video playback method provided by an embodiment of the present application. This method can be applied to the video playback system shown in FIG. 1. As shown in Figure 4, the method includes:
  • Step 401 When the terminal receives the rotation instruction, the terminal generates a surround playback request.
  • the surround playback request includes rotation camera position information, and the rotation camera position information is used to indicate the rotation range.
  • the terminal can determine the start camera position, the end camera position, and the rotation direction according to the rotation instruction and the camera position information.
  • the rotation camera position information may include the identifier of the starting camera position, the identifier of the ending camera position, and the rotation direction.
  • the terminal may determine the rotation angle according to the rotation instruction, and in this case, the rotation machine position information may include the rotation angle.
  • the surround playback request generated by the terminal is used to request dynamic surround playback of the video content.
  • the surround playback request is also used to determine the playback start time and playback end time.
  • the surround playback request further includes playback time information, and the playback time information includes one or more of a playback start time, a playback end time, or a surround playback duration.
  • the surround playback request generated by the terminal is used to request static surround playback of the video content.
  • the surround playback request is also used to determine the target playback moment.
  • the surround playback request includes the target playback moment, and the target playback moment may be a video pause moment.
  • Static surround playback of video content refers to surround playback of video images corresponding to the target playback moment provided by multiple cameras.
  • the terminal when the terminal detects a sliding operation on the video playback interface, the terminal determines that the rotation instruction is received.
  • the terminal determines the rotation machine position information according to the sliding information of the sliding operation, and the sliding information includes one or more of a sliding start position, a sliding length, a sliding direction, or a sliding angle.
  • the terminal generates a surround playback request based on the rotation machine position information.
  • the sliding starting position, sliding length and sliding direction can be used to determine the starting position, the ending position and the direction of rotation.
  • the sliding angle can be used to determine the rotation angle.
  • the sliding starting position corresponds to the starting camera position
  • the sliding direction corresponds to the rotation direction
  • the sliding length is used to determine the number of camera positions to be switched.
  • a sliding direction to the left indicates counterclockwise rotation
  • a sliding direction to the right indicates clockwise rotation.
  • each time the sliding length increases by one unit length, one camera position is switched.
  • the unit length can be set to 1 cm.
  • for example, if the sliding length reaches 3 cm, 3 camera positions are switched.
  • the sliding sensitivity is negatively correlated with the unit length setting value, that is, the smaller the unit length setting value, the higher the sliding sensitivity.
  • the sliding sensitivity can be set according to actual needs.
  • for example, assume that the sliding direction is to the right, the sliding length is 5 cm, and the unit length is 1 cm; then 5 camera positions are switched clockwise.
  • assuming that the starting camera position is camera 9 in FIG. 3, the terminal determines that the rotation direction is clockwise and that the ending camera position is camera 14.
  • the surround playback duration can also be defined by sliding duration.
  • the surround playback duration can be equal to the sliding duration.
  • the sliding angle is used to determine the rotation angle; the rotation angle and the sliding angle may be set to satisfy a certain relationship, for example, the rotation angle may be made equal to the sliding angle, or equal to twice the sliding angle, and so on.
  • the sign of the rotation angle can also be used to indicate the rotation direction; for example, a positive rotation angle indicates clockwise rotation, and a negative rotation angle indicates counterclockwise rotation.
  • the terminal when the terminal receives the target remote control instruction sent by the remote control device, the terminal determines that the rotation instruction is received.
  • the target remote control instruction includes remote control button information, and the remote control button information includes button identification and/or the number of keystrokes.
  • the terminal determines the rotating machine position information according to the remote control button information. Then the terminal generates a surround playback request based on the rotation machine position information.
  • the button identifier can be used to determine the rotation direction.
  • the number of keystrokes can be used to determine the number of switch positions.
  • the rotation direction is determined based on the key identification.
  • when the remote control button information includes the identifier of the left button, it means that the rotation direction is counterclockwise; when the remote control button information includes the identifier of the right button, it means that the rotation direction is clockwise.
  • other buttons on the remote control device can also be set to control the direction of rotation, which is not limited in the embodiment of the present application.
  • the number of keystrokes is used to define the number of positions to be switched. For example, if the number of keystrokes is 1, it means that one machine position is switched.
  • for example, if the remote control button information includes the identifier of the left button and the number of keystrokes is 3, it means that 3 camera positions are rotated counterclockwise.
  • the terminal determines that the rotation direction is counterclockwise according to the button identifier, determines that the number of camera positions to be switched is 3 based on the number of keystrokes, and determines that the ending camera position is camera 6.
  • the surround playback duration can also be defined by the key duration.
  • the surround playback duration can be equal to the key duration.
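Both input mappings described above, the sliding gesture and the remote-control keys, reduce to stepping a camera index around the ring by a computed number of positions. A minimal sketch, assuming the 20-camera ring of FIG. 3 with 1-based wrap-around numbering (the function names are illustrative):

```python
def positions_from_slide(slide_len_cm, unit_len_cm=1.0):
    """Number of camera positions to switch for a sliding gesture:
    one position per completed unit length."""
    return int(slide_len_cm // unit_len_cm)

def end_camera(start, steps, clockwise, num_cameras=20):
    """Step `steps` positions around the ring from `start` (1-based ids),
    wrapping past camera 20 back to camera 1."""
    delta = steps if clockwise else -steps
    return (start - 1 + delta) % num_cameras + 1

# Sliding example: start at camera 9, slide 5 cm right (clockwise).
assert end_camera(9, positions_from_slide(5), clockwise=True) == 14

# Remote-control example: left button (counterclockwise) pressed 3 times.
assert end_camera(9, 3, clockwise=False) == 6
```

The same helper also handles wrap-around, e.g. three clockwise steps from camera 19 land on camera 2.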
  • Step 402 The terminal sends a surround playback request to the upper layer device.
  • the upper device refers to the upstream device of the terminal.
  • the upper-layer device may be a video server (specifically, a video distribution server) or a network device in the video playback system shown in FIG. 1.
  • Step 403 The upper layer device determines the play time information based on the surround play request.
  • the surround playback request is used to request dynamic surround playback of the video content
  • the playback time information includes the playback start time and the playback end time.
  • the upper-layer device determines the playback time information based on the surround playback request in the following five ways:
  • the implementation process of step 403 includes: the upper-layer device determines the playback start time and the playback end time according to the time when the surround playback request is received and a preset strategy.
  • the preset strategy includes the preset surround playback duration.
  • the preset strategy defines that the video playback moment at which the upper-layer device receives the surround playback request is used as the playback start time, and that the interval between the playback end time and the playback start time is equal to the preset surround playback duration. For example, if the video playback moment at which the upper-layer device receives the surround playback request is 00:19:35 and the preset surround playback duration is 2 seconds, the upper-layer device determines that the playback start time is 00:19:35 and the playback end time is 00:19:37.
  • the preset strategy may also define that a video playback moment separated by a certain time length from the receiving moment of the surround playback request is used as the playback start time.
  • the playback start time may be located before the receiving moment of the surround playback request in time sequence, or after the receiving moment of the surround playback request in time sequence.
  • the receiving time of the surround playback request is 00:19:35
  • the playback start time may be 00:19:34
  • the playback start time may also be 00:19:36.
  • the surround playback request includes the playback start time and the playback end time.
  • the implementation process of step 403 includes: the upper-layer device recognizes the playback start time and the playback end time in the surround playback request.
  • the specified field of the pre-defined or pre-configured surround playback request is used to carry the playback start time and the playback end time.
  • the pre-definition can be defined in a standard or protocol; the pre-configuration can be pre-negotiation between the upper-layer device and the terminal.
  • after receiving the surround playback request, the upper-layer device can identify the playback start time and the playback end time from the designated field.
  • the upper-layer device determines that the playback start time is 00:19:35 and the playback end time is 00:19:37.
  • the surround playback request includes the playback start time.
  • the implementation process of step 403 includes: the upper-layer device determines the playback end time according to the playback start time and the preset surround playback duration.
  • the upper-layer device determines that the playback end time is 00:19:37.
  • the surround playback request includes the surround playback duration.
  • the implementation process of step 403 includes: the upper-layer device determines the playback start time and the playback end time according to the time when the surround playback request is received and the surround playback duration.
  • the upper-layer device determines the playback start time and the playback end time
  • the surround playback request includes the playback start time and the surround playback duration.
  • the implementation process of step 403 includes: the upper-layer device determines the playback end time according to the playback start time and the surround playback duration.
  • the upper-layer device determines that the playback end time is 00:19:37.
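The five ways described above can be collapsed into one precedence rule: take the playback start time from the request if present, otherwise use the receiving moment; take the playback end time from the request if present, otherwise add the surround playback duration, which itself falls back to the preset value. A hedged sketch of that rule (the request field names are assumptions, not part of the source):

```python
from datetime import datetime, timedelta

PRESET_SURROUND_DURATION = timedelta(seconds=2)  # preset strategy value

def resolve_play_window(request, receive_time):
    """Determine (playback start, playback end) per the five modes above."""
    start = request.get("play_start") or receive_time           # modes 1, 4
    end = request.get("play_end")                               # mode 2
    if end is None:
        duration = request.get("surround_duration", PRESET_SURROUND_DURATION)
        end = start + duration                                  # modes 3, 5
    return start, end

recv = datetime(2021, 1, 1, 0, 19, 35)
# Mode 1: empty request, preset 2-second duration -> 00:19:35 .. 00:19:37
start, end = resolve_play_window({}, recv)
assert (start.time().isoformat(), end.time().isoformat()) == ("00:19:35",
                                                              "00:19:37")
```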
  • the surround playback request is used to request static surround playback of the video content
  • the playback time information includes the target playback moment.
  • the surround playback request includes the target playback moment.
  • the target playback time is not included in the surround playback request, and the upper-layer device determines the target playback moment according to the moment when the surround playback request is received.
  • the method for the upper-layer device to determine the target playback moment may refer to the first implementation manner described above in which the upper-layer device determines the playback start time, and is not repeated here in the embodiment of the present application.
  • Step 404 The upper-layer device determines the starting camera position, the ending camera position, and the rotation direction according to the rotation camera position information.
  • the rotation camera position information includes the starting camera position identifier, the ending camera position identifier, and the rotation direction
  • after receiving the surround playback request, the upper-layer device can determine the starting camera position, the ending camera position, and the rotation direction directly from the content of the rotation camera position information.
  • the upper-layer device determines the end camera position and the rotation direction according to the start camera position and the rotation angle. For example, referring to FIG. 3, assuming that the starting camera position determined by the upper-layer device is camera 9 and the rotation angle carried in the surround playback request is -90°, the upper-layer device determines that the rotation direction is counterclockwise and the ending camera position is camera 4.
  • Step 405 The upper-level device determines multiple camera positions from the start camera position to the end camera position along the rotation direction.
  • the multiple camera positions determined by the upper-level device may include all camera positions from the start camera position to the end camera position along the rotation direction.
  • the multiple camera positions determined by the upper-layer device include camera 9, camera 10, camera 11, camera 12, camera 13, and camera 14.
  • the multiple camera positions determined by the upper-layer device may include only some of the camera positions from the starting camera position to the ending camera position along the rotation direction. For example, assume that the union of the shooting area of camera 11 and the shooting area of camera 13 in FIG. 3 covers the shooting area of camera 12;
  • then camera 12 may not be included in the multiple camera positions determined by the upper-layer device.
  • when the video images captured by camera 9 to camera 14 are played in static surround, since the video images captured by camera 11 and the video images captured by camera 13 cover the content captured by camera 12, omitting camera 12 does not cause a sudden change in the video picture during surround playback, so that the smoothness of the surround playback picture can be ensured.
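The camera-thinning rule above, dropping a camera position whose shooting area is covered by the union of its neighbours' areas, can be sketched with shooting areas modelled as sets of scene cells; the set representation and the concrete coverage values are purely for illustration:

```python
def thin_cameras(cameras, areas):
    """Keep the endpoint cameras; drop any interior camera whose shooting
    area is covered by the union of its two immediate neighbours' areas.
    `cameras` is ordered along the rotation direction (>= 2 entries)."""
    kept = [cameras[0]]
    for prev, cur, nxt in zip(cameras, cameras[1:], cameras[2:]):
        if not areas[cur] <= areas[prev] | areas[nxt]:
            kept.append(cur)
    kept.append(cameras[-1])
    return kept

# Cameras 9..14 of FIG. 3, each covering two hypothetical scene cells;
# camera 12's area happens to lie inside the union of 11's and 13's.
areas = {c: {2 * c, 2 * c + 1} for c in range(9, 15)}
areas[12] = {23, 26}  # fully covered by neighbours 11 and 13
assert thin_cameras(list(range(9, 15)), areas) == [9, 10, 11, 13, 14]
```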
  • Step 406 The upper layer device generates a rotating segment according to the rotating machine position information and the playing time information.
  • the rotating segment includes GOPs corresponding to multiple cameras within the rotating range.
  • the rotation segment sequentially includes GOPs corresponding to multiple cameras from the start camera to the end camera along the rotation direction.
  • the surround playback request is used to request dynamic surround playback of video content
  • each GOP in the rotating segment includes one or more frames of video images.
  • the implementation process of step 406 includes:
  • step 4061A the upper-layer device obtains m video segments from the playback start time to the playback end time corresponding to each of the multiple camera positions, where m is a positive integer.
  • for example, assume that the playback start time is T1
  • the playback end time is T2
  • the number of the multiple camera positions is q, where q is an integer greater than 0 and T2>T1
  • each camera position corresponds to one video stream
  • the time period (T1, T2) includes m video segments.
  • the upper-layer device respectively obtains the m video segments in the time period (T1, T2) for each of the q camera positions.
  • step 4062A the upper-layer device extracts one or more GOPs from the m video segments corresponding to each camera position according to the play time information.
  • the upper-layer device determines the GOP extraction time and the GOP extraction quantity corresponding to each camera position according to the surround playback duration and the number of multiple cameras, and the surround playback duration is equal to the difference between the playback end time and the playback start time.
  • the upper layer device extracts GOPs from the m video segments corresponding to each camera position according to the GOP extraction time and the number of GOP extractions corresponding to each camera position.
  • the GOP extraction time corresponding to the previous camera position is temporally ahead of the GOP extraction time corresponding to the latter camera position.
  • the number of GOPs extracted for each camera position is equal to the surround playback duration divided by the product of the time length of one GOP and the number of the multiple camera positions (the result may be rounded up or down).
  • continuing with the example in step 4061A, assuming that the time length of each GOP is t, the number of GOPs extracted for each camera position is equal to (T2-T1)/(q*t).
  • step 4063A the upper-layer device assembles the extracted GOPs to obtain the rotating segment.
  • the upper layer device sequentially assembles the extracted GOPs according to the rotation direction to obtain a rotating segment, and the rotating segment is a dynamic rotating segment.
  • assuming that each video segment includes 5 GOPs and that the number of GOPs extracted for each camera position is 1, please refer to FIG. 5, which is a schematic diagram of a rotating segment generation process provided by an embodiment of the present application.
  • the GOPs in the video segments corresponding to each camera are numbered 1-5 sequentially.
  • the GOP numbered 1 is extracted from the video segment corresponding to the first camera position, and the GOP numbered 2 is extracted from the video segment corresponding to the second camera position,
  • the GOP numbered 3 is extracted from the video segment corresponding to the third camera position, the GOP numbered 4 is extracted from the video segment corresponding to the fourth camera position, and the GOP numbered 5 is extracted from the video segment corresponding to the fifth camera position.
  • the GOPs extracted from the video segments corresponding to the 5 camera positions are sequentially assembled according to the rotation direction to obtain dynamic rotating segments.
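Steps 4061A-4063A above amount to a staggered round-robin extraction: the GOP extraction time advances by one position per camera along the rotation direction, and the picks are concatenated. A minimal sketch of the FIG. 5 example, with list indices standing in for GOP extraction times (the stream representation is illustrative):

```python
def make_dynamic_rotation_segment(streams, gops_per_camera=1):
    """streams: per-camera GOP lists, ordered along the rotation direction.
    Each camera contributes the GOPs immediately after those used by the
    previous camera, so the assembled segment advances in time while
    rotating across camera positions."""
    segment = []
    cursor = 0  # GOP extraction time, expressed as a GOP index
    for gop_list in streams:
        segment.extend(gop_list[cursor:cursor + gops_per_camera])
        cursor += gops_per_camera
    return segment

# Five cameras, each segment holding GOPs numbered 1..5, as in FIG. 5.
streams = [[f"cam{c}-gop{n}" for n in range(1, 6)] for c in range(1, 6)]
assert make_dynamic_rotation_segment(streams) == [
    "cam1-gop1", "cam2-gop2", "cam3-gop3", "cam4-gop4", "cam5-gop5",
]
```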
  • the surround playback request is used to request static surround playback of video content, and each GOP in the rotating segment includes one frame of video image. Then the implementation process of step 406 includes:
  • step 4061B the upper-layer device obtains the target video segment corresponding to each of the multiple camera positions, and the time period corresponding to the target video segment includes the target playback moment.
  • the time period corresponding to the target video segment includes the target playback time, which means that the target playback time is located between the start time and the end time of the target video segment.
  • step 4062B the upper-layer device extracts a GOP corresponding to the target playback moment from the target video segment corresponding to each camera position.
  • a GOP corresponding to the target playback moment refers to a GOP in which the capture moment of the video image is the target playback moment.
  • step 4063B the upper-layer device assembles the extracted GOPs to obtain the rotating segment.
  • the upper layer device sequentially assembles the extracted GOPs according to the rotation direction to obtain a rotating segment, and the rotating segment is a static rotating segment.
  • each video segment includes 5 GOPs.
  • FIG. 6 is a schematic diagram of another rotation segment generation process provided by an embodiment of the present application.
  • the GOPs in the video segments corresponding to each camera position are numbered 1-5 sequentially, and the GOP corresponding to the target playback moment is the GOP numbered 2, so the GOP numbered 2 is extracted from each of the 5 camera positions.
  • the GOPs extracted from the video segments corresponding to the 5 camera positions are sequentially assembled to obtain static rotating segments.
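Steps 4061B-4063B above pick, from each camera position, the single GOP whose time span contains the target playback moment, and concatenate the picks along the rotation direction. A minimal sketch of the FIG. 6 example, assuming fixed-duration GOPs starting at t=0 (the timing model is an assumption for illustration):

```python
def make_static_rotation_segment(streams, target_time, gop_duration):
    """Extract from each camera the GOP covering `target_time` and
    assemble the picks in rotation order; every pick freezes the same
    moment seen from a different camera position."""
    index = int(target_time // gop_duration)  # GOP containing the moment
    return [gop_list[index] for gop_list in streams]

# Five cameras, each segment holding GOPs numbered 1..5 of 1 s each;
# a target moment of 1.5 s falls inside the GOP numbered 2, as in FIG. 6.
streams = [[f"cam{c}-gop{n}" for n in range(1, 6)] for c in range(1, 6)]
assert make_static_rotation_segment(streams, 1.5, 1.0) == [
    "cam1-gop2", "cam2-gop2", "cam3-gop2", "cam4-gop2", "cam5-gop2",
]
```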
  • the number of GOPs included in a rotated segment may be the same as or different from the number of GOPs included in other video segments.
  • the number of GOPs included in a rotating segment may be less than the number of GOPs included in other video segments, which is not limited in the embodiment of the present application.
  • the upper-layer device when the upper-layer device is a network device, after receiving the surround playback request, the upper-layer device first downloads the media content index from the video server, and obtains the video stream information by parsing the media content index. The upper-layer device extracts the URL address of the video stream corresponding to each of the multiple cameras from the media content index, and then obtains the corresponding video segments through the URL of the video stream.
  • Step 407 The upper layer device sends the rotating segment to the terminal.
  • when the surround playback request is used to request dynamic surround playback of video content, after sending the rotating segment to the terminal, the upper-layer device continues to send the video stream corresponding to the ending camera position to the terminal, so that the terminal can smoothly switch from the playback picture corresponding to the starting camera position to the playback picture corresponding to the ending camera position.
  • the upper layer device stops sending video data to the terminal after sending the rotating segment to the terminal.
  • Step 408 The terminal decodes and plays the rotating segment.
  • the terminal decodes and plays the rotating segment, which can realize the surround playback of the video images corresponding to the multiple cameras from the start camera to the end camera along the rotation direction.
  • the resolution of the video image played by the terminal may be the same as the resolution of the video image in the rotating segment.
  • the upper-layer device determines the playback time information according to the surround playback request sent by the terminal, and then generates the rotating segment according to the playback time information and the rotation camera position information in the surround playback request. Since the rotating segment contains GOPs corresponding to multiple camera positions within the rotation range indicated by the rotation camera position information, after receiving the rotating segment, the terminal decodes the rotating segment to achieve surround playback of the video picture. The resolution of the played video image may be the same as the resolution of the video images in the rotating segment. Therefore, the video playback method provided by the embodiments of the present application is not limited by the number of cameras used for front-end shooting, and has a wide range of applications.
  • the upper-layer device can be a video distribution server or a network device, which can reduce the requirements on the processing performance of the video processing server and achieve high reliability.
  • FIG. 7 is a schematic structural diagram of a video playback device provided by an embodiment of the present application.
  • the apparatus is used for upper-layer equipment, for example, the upper-layer equipment may be a video server or a network device in a video playback system as shown in FIG. 1.
  • the device 70 includes:
  • the receiving module 701 is configured to receive a surround playback request sent by a terminal, where the surround playback request includes rotation camera position information, and the rotation camera position information is used to indicate a rotation range.
  • the first determining module 702 is configured to determine the play time information based on the surround play request.
  • the generating module 703 is configured to generate a rotating segment according to the rotating camera position information and the playing time information.
  • the rotating segment includes GOPs corresponding to multiple camera positions within the rotation range, and each GOP includes one or more frames of video images.
  • the sending module 704 is used to send the rotating segment to the terminal.
  • the play time information includes the play start time and the play end time
  • the generating module 703 is configured to:
  • the generating module 703 is specifically configured to: determine the GOP extraction time and the number of GOP extractions corresponding to each camera position according to the surround playback duration and the number of the multiple camera positions, where the surround playback duration is equal to the time difference between the playback end time and the playback start time; and extract GOPs from the m video segments corresponding to each camera position according to the GOP extraction time and the number of GOP extractions corresponding to each camera position.
  • the play time information includes the target play time
  • the generating module 703 is configured to:
  • the time period corresponding to the target video segment includes the target playback moment; extract, from the target video segment corresponding to each camera position, a GOP corresponding to the target playback moment,
  • where the GOP includes one frame of video image; and assemble the extracted GOPs to obtain the rotating segment.
  • the apparatus 70 further includes:
  • the second determining module 705 is configured to determine the starting camera position, the ending camera position, and the rotation direction according to the rotating camera position information.
  • the third determining module 706 is used to determine multiple camera positions from the start camera position to the end camera position along the rotation direction;
  • the generating module 703 is configured to sequentially assemble the extracted GOPs according to the rotation direction to obtain rotating segments.
  • the first determining module 702 is configured to determine the playback start time and the playback end time according to the moment when the surround playback request is received and a preset strategy, where the preset strategy includes the preset surround playback duration; or, the surround playback request includes the playback start time and the playback end time.
  • the first determination module 702 is used to identify the playback start time and the playback end time in the surround playback request; or, the surround playback request includes the playback start time, the first determination module 702, configured to determine the playback end time according to the playback start time and the preset surround playback duration; or, the surround playback request includes the surround playback duration, and the first determining module 702 is configured to determine the playback end time according to the moment when the surround playback request is received and the surround playback Duration, the playback start time and playback end time are determined; or, the surround playback request includes the playback start time and the surround playback duration, and the first determining module 702 is configured to determine the playback end time according to the playback start time and the surround playback duration.
  • the GOP is encoded in an independent transmission encapsulation mode.
  • the upper-layer device determines the playback time information according to the surround playback request sent by the terminal through the first determining module, and then generates the rotating segment through the generating module according to the playback time information and the rotation camera position information in the surround playback request.
  • Since the rotating segment contains GOPs corresponding to multiple camera positions within the rotation range indicated by the rotation camera position information, after receiving the rotating segment, the terminal decodes the rotating segment to achieve surround playback of the video picture. The resolution of the played video image may be the same as the resolution of the video images in the rotating segment. Therefore, the video playback apparatus provided by the embodiments of the present application is not limited by the number of cameras used for front-end shooting, and has a wide range of applications.
  • the upper-layer device can be a video distribution server or a network device, which can reduce the requirements on the processing performance of the video processing server and achieve high reliability.
  • FIG. 9 is a schematic structural diagram of another video playback device provided by an embodiment of the present application.
  • the device is used in a terminal.
  • the device may be the terminal 103 in the video playback system shown in FIG. 1.
  • the device 90 includes:
  • the sending module 901 is configured to send a surround playback request generated based on the rotation instruction to the upper device when the terminal receives a rotation instruction, the surround playback request includes rotation machine position information, and the rotation machine position information is used to indicate the rotation range.
  • the receiving module 902 is configured to receive a rotating segment sent by an upper-layer device.
  • the rotating segment includes GOPs corresponding to multiple camera positions within the rotation range, and each GOP includes one or more frames of video images.
  • the playing module 903 is used to decode and play the rotating segments.
  • the device 90 further includes:
  • the first determining module 904 is configured to determine that a rotation instruction is received when the terminal detects a sliding operation on the video playback interface;
  • the second determining module 905 is configured to determine rotation machine position information according to the sliding information of the sliding operation, and the sliding information includes one or more of a sliding start position, a sliding length, a sliding direction, or a sliding angle;
  • the generating module 906 is configured to generate a surround playback request based on the rotation machine position information.
  • the device 90 further includes:
  • the third determining module 907 is configured to determine that the rotation instruction is received when the terminal receives a target remote control instruction sent by a remote control device, where the target remote control instruction includes remote control button information, and the remote control button information includes a button identifier and/or a number of keystrokes;
  • the fourth determining module 908 is configured to determine the rotation machine position information based on the remote control button information;
  • the generating module 906 is configured to generate a surround playback request based on the rotation machine position information.
  • after receiving the rotation instruction, the terminal sends a surround playback request to the upper-layer device through the sending module, and then receives, through the receiving module, the rotation segment sent by the upper-layer device. Since the rotation segment contains the GOPs corresponding to multiple camera positions within the rotation range indicated by the rotation machine position information, the terminal decodes the rotation segment through the playback module to achieve surround playback of the video picture, and the resolution of the played video picture may be the same as the resolution of the video images in the rotation segment. Therefore, the video playback method provided by the embodiments of the present application is not limited by the number of cameras used for front-end shooting and has a wide range of applications.
  • the upper-layer device can be a video distribution server or a network device, which reduces the processing-performance requirements on the video processing server and provides high reliability.
  • the embodiment of the present application also provides a video playback system, which includes: an upper-layer device and a terminal.
  • the upper layer equipment includes the video playback device shown in FIG. 7 or FIG. 8, and the terminal includes the video playback device shown in any one of FIG. 9 to FIG. 11.
  • Fig. 12 is a block diagram of a video playback device provided by an embodiment of the present application.
  • the video playback device may be an upper-layer device or a terminal, the upper-layer device may be a video server or a network device, and the terminal may be a mobile phone, a tablet computer, a smart wearable device, or a set-top box.
  • the video playback device 120 includes a processor 1201 and a memory 1202.
  • the memory 1202 is configured to store a computer program, where the computer program includes program instructions;
  • the processor 1201 is configured to call the computer program to implement the actions performed by the upper-layer device or the actions performed by the terminal in the video playback method shown in FIG. 4.
  • the video playback device 120 further includes a communication bus 1203 and a communication interface 1204.
  • the processor 1201 includes one or more processing cores, and the processor 1201 executes various functional applications and data processing by running a computer program.
  • the memory 1202 may be used to store computer programs.
  • the memory may store an operating system and at least one application program unit required by the function.
  • the operating system can be a real-time operating system (Real Time eXecutive, RTX), LINUX, UNIX, WINDOWS, or OS X.
  • the communication interface 1204 is used to communicate with other storage devices or network devices.
  • the communication interface of the upper-layer device may be used to send rotating segments to the terminal, and the communication interface of the terminal may be used to send a surround playback request to the upper-layer device.
  • the network device can be a switch or router.
  • the memory 1202 and the communication interface 1204 are respectively connected to the processor 1201 through a communication bus 1203.
  • the embodiment of the present application also provides a computer storage medium with instructions stored on it. When the instructions are executed by the processor of a computer device, the actions performed by the upper-layer device or the actions performed by the terminal in the video playback method described in the above method embodiments are implemented.
  • a person of ordinary skill in the art may understand that all or part of the steps of the above embodiments can be completed by hardware, or by a program instructing relevant hardware; the program can be stored in a computer-readable storage medium.
  • the storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like.


Abstract

This application discloses a video playback method, apparatus, and system, and a computer storage medium, and belongs to the field of video processing technologies. An upper-layer device receives a surround playback request sent by a terminal, where the surround playback request includes rotation camera position information, and the rotation camera position information is used to indicate a rotation range. The upper-layer device determines playback time information based on the surround playback request. The upper-layer device generates a rotation segment according to the rotation camera position information and the playback time information, where the rotation segment includes GOPs corresponding to multiple camera positions within the rotation range, and a GOP includes one or more frames of video images. The upper-layer device sends the rotation segment to the terminal. In this application, after receiving the rotation segment, the terminal decodes the rotation segment to implement surround playback of the video picture, and the resolution of the played video picture can be the same as the resolution of the video images in the rotation segment. The method is not limited by the number of cameras used for front-end shooting and has a wide range of applications.

Description

Video playback method, apparatus, and system, and computer storage medium
This application claims priority to Chinese Patent Application No. 202010354123.0, filed with the China National Intellectual Property Administration on April 29, 2020 and entitled "Video playback method, apparatus, and system, and computer storage medium", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of video processing technologies, and in particular, to a video playback method, apparatus, and system, and a computer storage medium.
Background
With the rapid development of Internet technologies, users have begun to pursue a better video viewing experience, which gives rise to the demand for surround viewing around a target object. The demand for surround viewing is especially strong in sports events, concerts, and other scenarios with a specific focus. To meet this demand, surround playback needs to be implemented on the terminal.
Surround playback requires that multiple cameras distributed at specific positions be used for front-end shooting to capture video pictures of the same focus area from different angles, and that, based on camera synchronization technology, the cameras capture images at the same moments and at the same frequency. The cameras then separately send the captured video streams to a video processing platform, which processes the multiple video streams so that surround playback of the focus area can be implemented on the terminal.
In the related art, the server side usually stitches video frames with the same capture moment from the multiple video streams into one video frame. For example, 16 cameras are used for front-end shooting to capture video pictures of the same focus area from different angles. The server side adjusts the resolution of the video frames in each of the 16 received video streams to 960×540, and then combines the 16 video frames with the same capture moment in a 4×4 grid into one video frame with a resolution of 3840×2160 (that is, 4K), obtaining a single video stream, which it sends to the terminal. After decoding this stream, the terminal selects 1/16 of the video picture (the picture captured by one camera) for playback according to the configured viewing camera position.
However, with the video playback method in the related art, the resolution of the picture played by the terminal is inversely proportional to the number of cameras used for front-end shooting, which limits the number of cameras that can be used and therefore severely restricts the applicability of the method.
Summary
This application provides a video playback method, apparatus, and system, and a computer storage medium, which can solve the problem in the related art that the applicability of video playback is severely restricted.
According to a first aspect, a video playback method is provided. The method includes: an upper-layer device receives a surround playback request sent by a terminal, where the surround playback request includes rotation camera position information indicating a rotation range; the upper-layer device determines playback time information based on the surround playback request; the upper-layer device generates a rotation segment according to the rotation camera position information and the playback time information, where the rotation segment includes groups of pictures (group of picture, GOP) corresponding to multiple camera positions within the rotation range, and a GOP includes one or more frames of video images; and the upper-layer device sends the rotation segment to the terminal.
Optionally, the rotation camera position information includes one or more of an identifier of a start camera position, an identifier of an end camera position, a rotation direction, or a rotation angle. Optionally, the playback time information includes one or more of a playback start moment, a playback end moment, or a surround playback duration; alternatively, the playback time information includes a target playback moment.
In this application, because the rotation segment contains the GOPs corresponding to the multiple camera positions within the rotation range indicated by the rotation camera position information, after receiving the rotation segment, the terminal decodes it to implement surround playback of the video picture, and the resolution of the played video picture may be the same as the resolution of the video images in the rotation segment. Therefore, the video playback method provided in this application is not limited by the number of cameras used for front-end shooting and has a wide range of applications.
In a possible implementation, the playback time information includes a playback start moment and a playback end moment. The process in which the upper-layer device generates the rotation segment according to the rotation camera position information and the playback time information includes: the upper-layer device obtains, for each of the multiple camera positions, m video segments from the playback start moment to the playback end moment, where m is a positive integer; the upper-layer device extracts one or more GOPs from the m video segments corresponding to each camera position according to the playback time information; and the upper-layer device assembles the extracted GOPs to obtain the rotation segment.
Optionally, the process in which the upper-layer device extracts one or more GOPs from the m video segments corresponding to each camera position according to the playback time information includes: the upper-layer device determines a GOP extraction moment and a GOP extraction quantity for each camera position according to the surround playback duration and the number of camera positions, where the surround playback duration is equal to the difference between the playback end moment and the playback start moment; and the upper-layer device extracts GOPs from the m video segments corresponding to each camera position according to the GOP extraction moment and the GOP extraction quantity for that camera position.
In this implementation, the rotation segment generated by the upper-layer device is a dynamic rotation segment. The upper-layer device can generate the dynamic rotation segment during video playback on the terminal, thereby implementing dynamic surround playback on the terminal. In this application, dynamic surround playback of video content by the terminal means that the terminal plays video images that are consecutive in time, that is, each played video frame and the previously played frame are two frames captured consecutively in time.
In another possible implementation, the playback time information includes a target playback moment. The process in which the upper-layer device generates the rotation segment according to the rotation camera position information and the playback time information includes: the upper-layer device obtains a target video segment corresponding to each of the multiple camera positions, where the time period corresponding to the target video segment contains the target playback moment; the upper-layer device extracts, from the target video segment corresponding to each camera position, one GOP corresponding to the target playback moment, where the GOP includes one frame of video image; and the upper-layer device assembles the extracted GOPs to obtain the rotation segment.
In this implementation, the rotation segment generated by the upper-layer device is a static rotation segment. The upper-layer device can generate the static rotation segment when video playback on the terminal is paused, thereby implementing static surround playback on the terminal. In this application, static surround playback of video content by the terminal means that the terminal plays, in surround, the video images captured by multiple cameras at the same moment.
Optionally, the upper-layer device determines a start camera position, an end camera position, and a rotation direction according to the rotation camera position information, and determines the multiple camera positions from the camera positions from the start camera position to the end camera position along the rotation direction. In the above two implementations, the process in which the upper-layer device assembles the extracted GOPs to obtain the rotation segment includes: the upper-layer device assembles the extracted GOPs in sequence along the rotation direction to obtain the rotation segment.
Optionally, the process in which the upper-layer device determines the playback time information based on the surround playback request includes: the upper-layer device determines the playback start moment and the playback end moment according to the moment at which the surround playback request is received and a preset policy, where the preset policy includes a preset surround playback duration. Alternatively, the surround playback request includes the playback start moment and the playback end moment, and the upper-layer device identifies the playback start moment and the playback end moment in the surround playback request. Alternatively, the surround playback request includes the playback start moment, and the upper-layer device determines the playback end moment according to the playback start moment and the preset surround playback duration. Alternatively, the surround playback request includes the surround playback duration, and the upper-layer device determines the playback start moment and the playback end moment according to the moment at which the surround playback request is received and the surround playback duration. Alternatively, the surround playback request includes the playback start moment and the surround playback duration, and the upper-layer device determines the playback end moment according to the playback start moment and the surround playback duration.
Optionally, the GOPs are encoded in an independent-transmission encapsulation mode, so that each GOP can be transmitted and used independently as a separate chunk.
According to a second aspect, a video playback method is provided. The method includes: when a terminal receives a rotation instruction, the terminal sends, to an upper-layer device, a surround playback request generated based on the rotation instruction, where the surround playback request includes rotation camera position information indicating a rotation range; the terminal receives a rotation segment sent by the upper-layer device, where the rotation segment includes GOPs corresponding to multiple camera positions within the rotation range, and a GOP includes one or more frames of video images; and the terminal decodes and plays the rotation segment.
Optionally, when the terminal detects a sliding operation on the video playback interface, the terminal determines that the rotation instruction is received. The terminal determines the rotation camera position information according to sliding information of the sliding operation, where the sliding information includes one or more of a sliding start position, a sliding length, a sliding direction, or a sliding angle. The terminal generates the surround playback request based on the rotation camera position information.
Optionally, when the terminal receives a target remote control instruction sent by a remote control device, the terminal determines that the rotation instruction is received, where the target remote control instruction includes remote control button information, and the remote control button information includes a button identifier and/or a number of button presses. The terminal determines the rotation camera position information based on the remote control button information, and generates the surround playback request based on the rotation camera position information.
In this application, the terminal does not need to change its playback logic. It only needs to send the surround playback request to the upper-layer device after receiving the rotation instruction and then decode the rotation segment to implement surround playback of the video picture, and the resolution of the played video picture may be the same as the resolution of the video images in the rotation segment. Therefore, the video playback method provided in this application is not limited by the number of cameras used for front-end shooting and has a wide range of applications.
According to a third aspect, a video playback apparatus is provided. The apparatus includes multiple functional modules that interact to implement the method in the first aspect and its implementations. The functional modules may be implemented based on software, hardware, or a combination of software and hardware, and may be arbitrarily combined or divided based on a specific implementation.
According to a fourth aspect, a video playback apparatus is provided. The apparatus includes multiple functional modules that interact to implement the method in the second aspect and its implementations. The functional modules may be implemented based on software, hardware, or a combination of software and hardware, and may be arbitrarily combined or divided based on a specific implementation.
According to a fifth aspect, a video playback system is provided. The system includes an upper-layer device and a terminal, where the upper-layer device includes the video playback apparatus according to the third aspect, and the terminal includes the video playback apparatus according to the fourth aspect.
According to a sixth aspect, a video playback apparatus is provided, including a processor and a memory;
the memory is configured to store a computer program, where the computer program includes program instructions;
the processor is configured to invoke the computer program to implement the video playback method according to any implementation of the first aspect, or implement the video playback method according to any implementation of the second aspect.
According to a seventh aspect, a computer storage medium is provided. The computer storage medium stores instructions, and when the instructions are executed by a processor of a computer device, the video playback method according to any implementation of the first aspect or the second aspect is implemented.
According to an eighth aspect, a chip is provided. The chip includes a programmable logic circuit and/or program instructions, and when the chip runs, the method in the first aspect and its implementations or the method in the second aspect and its implementations is implemented.
The beneficial effects brought by the technical solutions provided in this application include at least the following:
The upper-layer device determines the playback time information according to the surround playback request sent by the terminal, and then generates the rotation segment according to the playback time information and the rotation camera position information in the surround playback request. Because the rotation segment contains the GOPs corresponding to the multiple camera positions within the rotation range indicated by the rotation camera position information, after receiving the rotation segment, the terminal decodes it to implement surround playback of the video picture, and the resolution of the played video picture may be the same as the resolution of the video images in the rotation segment. Therefore, the video playback method provided in the embodiments of this application is not limited by the number of cameras used for front-end shooting and has a wide range of applications. In addition, the upper-layer device may be a video distribution server or a network device, which reduces the processing-performance requirements on the video processing server and provides high reliability.
Brief Description of Drawings
FIG. 1 is a schematic structural diagram of a video playback system according to an embodiment of this application;
FIG. 2 is a schematic structural diagram of a video segment according to an embodiment of this application;
FIG. 3 is a schematic diagram of a camera distribution scenario on the media source side according to an embodiment of this application;
FIG. 4 is a schematic flowchart of a video playback method according to an embodiment of this application;
FIG. 5 is a schematic diagram of a process of generating a rotation segment according to an embodiment of this application;
FIG. 6 is a schematic diagram of another process of generating a rotation segment according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of a video playback apparatus according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of another video playback apparatus according to an embodiment of this application;
FIG. 9 is a schematic structural diagram of still another video playback apparatus according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of yet another video playback apparatus according to an embodiment of this application;
FIG. 11 is a schematic structural diagram of still yet another video playback apparatus according to an embodiment of this application;
FIG. 12 is a block diagram of a video playback apparatus according to an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the following further describes the implementations of this application in detail with reference to the accompanying drawings.
FIG. 1 is a schematic structural diagram of a video playback system according to an embodiment of this application. As shown in FIG. 1, the system includes a media source 101, a video server 102, and a terminal 103.
The media source 101 is configured to provide multiple video streams. Referring to FIG. 1, the media source 101 includes multiple cameras 1011 and a front-end encoder 1012. The cameras 1011 are connected to the front-end encoder 1012. Each camera 1011 captures one video stream and transmits it to the front-end encoder 1012. The front-end encoder 1012 encodes the video streams captured by the cameras 1011 and sends the encoded streams to the video server 102. In this embodiment of this application, the multiple cameras 1011 capture video images of the same focus area from different angles, and they capture images at the same moments and at the same frequency. Optionally, camera synchronization technology may be used to achieve synchronized shooting by the multiple cameras 1011. The number of cameras in the figure is merely an example and does not limit the video playback system provided in this embodiment of this application. The cameras may be arranged in a ring, a fan shape, or another layout; the arrangement of the cameras is not limited in this embodiment of this application.
The video server 102 is configured to process the video streams sent by the media source 101 using OTT (over the top) technology and distribute the processed streams to terminals through a content delivery network (content delivery network, CDN). A CDN is an intelligent virtual network built on top of the existing network that relies on edge servers deployed in various places. Optionally, referring to FIG. 1, the video server 102 includes a video processing server 1021 and a video distribution server 1022. The video processing server 1021 processes the video streams using OTT technology and sends the processed streams to the video distribution server 1022; the video distribution server 1022 distributes the streams to terminals. The video processing server 1021 may also be called a video processing platform, and may be one server, a server cluster composed of several servers, or a cloud computing service center. The video distribution server 1022 is an edge server.
The terminal 103, that is, the video playback end, decodes and plays the video streams sent by the video server 102. Optionally, the terminal 103 can change the playback angle through one or more control modes such as touch, voice control, gesture control, or remote control. The control mode that triggers the terminal to change the playback angle is not limited in this embodiment of this application. For example, the terminal 103 may be a device such as a mobile phone, a tablet computer, or a smart wearable device that can change the playback angle through touch or voice control, or a device such as a set top box (set top box, STB) that can change the playback angle under the control of a remote control.
In this embodiment of this application, the video streams are transmitted between the video server 102 and the terminal 103 based on the hyper text transfer protocol (hyper text transfer protocol, HTTP). Optionally, after obtaining the multiple video streams, the front-end encoder 1012 on the media source 101 side or the video processing server 1021 on the video server 102 side re-encodes (which may also be called transcodes) each video stream to obtain GOPs and generates video segments based on the GOPs for transmission. A video segment usually encapsulates multiple GOPs, and each GOP includes one or more frames of video images. A GOP is a group of video images that are consecutive in time. The timestamp of a GOP obtained by re-encoding a video stream corresponds to the moment at which the camera captured the video images in that GOP. For example, the timestamp of a GOP may be set to the capture moment of the last frame of video image in the GOP. For another example, when a GOP includes multiple frames of video images, the GOP has a start timestamp and an end timestamp, where the start timestamp is the capture moment of the first frame in the GOP, and the end timestamp is the capture moment of the last frame in the GOP.
Optionally, the time length of a GOP is less than or equal to 100 milliseconds. The time parameter of the GOP can be set by an administrator. For a fixed time length, the number of video image frames in each GOP is positively correlated with the shooting frame rate of the camera: the higher the frame rate, the more frames each GOP contains. For example, a GOP may include 2 frames of video images (corresponding to 25 frames per second (frame per second, FPS), 25FPS for short), 3 frames (corresponding to 30FPS), 5 frames (corresponding to 50FPS), or 6 frames (corresponding to 60FPS). Of course, a GOP may also include only 1 frame or more frames of video images, which is not limited in this embodiment of this application.
In this embodiment of this application, the GOPs in a video segment are encoded in an independent-transmission encapsulation mode, so that each GOP can be transmitted and used independently as a separate chunk. For example, the video segment may be encapsulated in the fragmented mp4 (fragmented mp4, fmp4) format. The fmp4 format is a streaming media format defined in the MPEG-4 standard proposed by the moving picture expert group (MPEG). FIG. 2 is a schematic structural diagram of a video segment according to an embodiment of this application. As shown in FIG. 2, the video segment includes n encapsulation headers and n data fields (mdat), and each mdat carries the data of one GOP; that is, the video segment encapsulates n GOPs, where n is an integer greater than 1. Each encapsulation header includes a moof field. This encapsulation mode of the video segment may also be called multi-moof-header encapsulation. Optionally, the encapsulation header may further include a styp field and a sidx field.
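The multi-moof layout described above can be illustrated with a short sketch. This is not code from the patent: the helper names and the toy buffer are assumptions for illustration, following the ISO base media file format convention that every top-level box starts with a 4-byte big-endian size followed by a 4-byte type code.

```python
import struct

def list_boxes(buf: bytes) -> list[tuple[str, int]]:
    """Return (type, size) for each top-level box in an fmp4-style buffer."""
    boxes, pos = [], 0
    while pos + 8 <= len(buf):
        size, = struct.unpack_from(">I", buf, pos)  # 4-byte big-endian box size
        btype = buf[pos + 4:pos + 8].decode("ascii")  # 4-byte type code
        boxes.append((btype, size))
        pos += size
    return boxes

def box(btype: bytes, payload: bytes = b"") -> bytes:
    """Build a toy box: size field + type code + payload."""
    return struct.pack(">I", 8 + len(payload)) + btype + payload

# Toy buffer imitating the FIG. 2 layout: a styp header, then n (moof, mdat)
# pairs, each mdat standing in for one GOP's data
buf = box(b"styp") + (box(b"moof") + box(b"mdat", b"GOPdata")) * 2
print(list_boxes(buf))
# [('styp', 8), ('moof', 8), ('mdat', 15), ('moof', 8), ('mdat', 15)]
```

Because each GOP travels in its own moof+mdat pair, a downstream device can cut the buffer at any pair boundary and forward that pair as an independent chunk, which is what the independent-transmission encapsulation relies on.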
The video processing server 1021 on the video server 102 side generates a media content index (which may also be called an OTT index) according to externally configured data. The media content index describes the information of each video stream and is essentially a file describing that information. The information of a video stream includes the address information of the stream, the time information of the stream, and the like. The address information indicates the address from which the stream can be obtained; for example, it may be the uniform resource locator (uniform resource locator, URL) address corresponding to the stream. The time information indicates the start moment and end moment of each video segment in the stream. Optionally, the media content index may further include camera position information, including the number of camera positions (that is, the number of cameras on the media source side) and the camera position angle corresponding to each video stream. The camera position angle corresponding to a video stream is the camera position angle of the corresponding camera.
For example, FIG. 3 is a schematic diagram of a camera distribution scenario on the media source side according to an embodiment of this application. As shown in FIG. 3, the scenario includes 20 cameras, denoted camera 1 to camera 20. The 20 cameras are arranged in a ring to shoot the same focus area M, with the shooting focus at point O. The camera position angle of one of the cameras may be set to 0, and the camera position angles of the other cameras are computed accordingly. For example, if the camera position angle of camera 4 is set to 0°, then the camera position angle of camera 9 is 90°, that of camera 14 is 180°, and that of camera 19 is 270°. The administrator can input the number of cameras and the camera position angle of each camera into the video processing server, for the video processing server to generate the media content index.
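The angle assignment in this example can be sketched as a small helper. The function name and signature are illustrative assumptions, not from the patent; the rule encoded is the one above: evenly spaced cameras on a ring, with one camera chosen as the 0° reference.

```python
def camera_angles(num_cameras: int, reference: int) -> dict[int, float]:
    """Map camera number (1-based) to its position angle in degrees,
    taking `reference` as the 0-degree camera on an evenly spaced ring."""
    step = 360 / num_cameras  # angular spacing between adjacent cameras
    return {cam: ((cam - reference) * step) % 360
            for cam in range(1, num_cameras + 1)}

# 20 ring cameras with camera 4 as the 0-degree reference, as in FIG. 3
angles = camera_angles(20, reference=4)
print(angles[9], angles[14], angles[19])  # 90.0 180.0 270.0
```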
Optionally, the media content index in this embodiment of this application may be an m3u8 file (which may be called an HLS index) or a media presentation description (media presentation description, MPD) file (which may be called a DASH index). An m3u8 file is an m3u file in the UTF-8 encoding format.
The process in which the terminal obtains video content from the video server includes: the terminal first downloads the media content index from the video server and parses it to obtain the information of the video streams. The terminal selects the video stream to be played, extracts the URL address of that stream from the media content index, and then sends a media content request to the video server through the URL address. After receiving the media content request, the video server sends the corresponding video stream to the terminal.
Optionally, still referring to FIG. 1, the video playback system may further include a network device 104, and the video server 102 and the terminal 103 are connected through the network device 104. The network device 104 may be a gateway or another intermediate device. Of course, the video server 102 and the terminal 103 may also be directly connected, which is not limited in this embodiment of this application.
FIG. 4 is a schematic flowchart of a video playback method according to an embodiment of this application. The method can be applied to the video playback system shown in FIG. 1. As shown in FIG. 4, the method includes:
Step 401: When the terminal receives a rotation instruction, the terminal generates a surround playback request.
The surround playback request includes rotation camera position information indicating a rotation range. Optionally, when the media content index obtained by the terminal includes camera position information, after receiving the rotation instruction, the terminal may determine a start camera position, an end camera position, and a rotation direction according to the rotation instruction and the camera position information; in this case, the rotation camera position information may include the identifier of the start camera position, the identifier of the end camera position, and the rotation direction. Alternatively, after receiving the rotation instruction, the terminal may determine a rotation angle according to the rotation instruction; in this case, the rotation camera position information may include the rotation angle.
Optionally, when the terminal receives the rotation instruction in the video playing state, the surround playback request generated by the terminal is used to request dynamic surround playback of video content. In this case, the surround playback request is further used to determine a playback start moment and a playback end moment. Optionally, the surround playback request further includes playback time information, which includes one or more of the playback start moment, the playback end moment, or a surround playback duration.
Optionally, when the terminal receives the rotation instruction in the video-paused state, the surround playback request generated by the terminal is used to request static surround playback of video content. In this case, the surround playback request is further used to determine a target playback moment. Optionally, the surround playback request includes the target playback moment, which may be the moment at which the video was paused. Static surround playback of video content means surround playback of the video pictures corresponding to the target playback moment provided by multiple camera positions.
In a possible implementation, when the terminal detects a sliding operation on the video playback interface, the terminal determines that the rotation instruction is received. The terminal determines the rotation camera position information according to sliding information of the sliding operation, where the sliding information includes one or more of a sliding start position, a sliding length, a sliding direction, or a sliding angle. The terminal then generates the surround playback request based on the rotation camera position information. The sliding start position, sliding length, and sliding direction can be used to determine the start camera position, the end camera position, and the rotation direction; the sliding angle can be used to determine the rotation angle.
Optionally, the sliding start position corresponds to the start camera position, the sliding direction corresponds to the rotation direction, and the sliding length defines the number of camera positions to switch. Sliding to the left indicates counterclockwise rotation, and sliding to the right indicates clockwise rotation. Each unit length of sliding switches one camera position. For example, if the unit length is set to 1 centimeter, a sliding length of 3 centimeters switches 3 camera positions. The sliding sensitivity is negatively correlated with the unit-length setting: the smaller the unit length, the higher the sensitivity. The sliding sensitivity can be set according to actual requirements.
For example, suppose the sliding direction is to the right, the sliding length is 5 centimeters, and the unit length is 1 centimeter; this indicates a clockwise rotation across 5 camera positions. Referring to FIG. 3, if the start camera position corresponding to the sliding start position is camera 9, the terminal determines that the rotation direction is clockwise and the end camera position is camera 14.
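The swipe-to-camera mapping described above can be sketched as follows. All function and parameter names are illustrative assumptions, not part of the patent; the rule encoded is the one in the text: one unit length of sliding switches one camera position, a right swipe means clockwise, on a ring of 20 cameras as in FIG. 3.

```python
def swipe_to_rotation(start_cam: int, length_cm: float, direction: str,
                      num_cams: int = 20, unit_cm: float = 1.0) -> tuple[int, str]:
    """Return (end_camera, rotation_direction) for a horizontal swipe
    starting at 1-based camera `start_cam` on a ring of `num_cams` cameras."""
    switched = int(length_cm // unit_cm)  # camera positions to switch
    if direction == "right":              # right swipe -> clockwise
        end = (start_cam - 1 + switched) % num_cams + 1
        return end, "clockwise"
    end = (start_cam - 1 - switched) % num_cams + 1
    return end, "counterclockwise"

# The FIG. 3 example: 5 cm right swipe from camera 9 -> clockwise to camera 14
print(swipe_to_rotation(9, 5, "right"))  # (14, 'clockwise')
```

Shrinking `unit_cm` raises the sensitivity, exactly as the text notes: the same physical swipe then crosses more camera positions.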
Optionally, when the surround playback request is used to request dynamic surround playback of video content, the sliding duration may also define the surround playback duration; for example, the surround playback duration may be made equal to the sliding duration.
Optionally, the sliding angle is used to determine the rotation angle. The rotation angle may be set to satisfy a certain relationship with the sliding angle, for example, equal to the sliding angle, or equal to twice the sliding angle, and so on. When the rotation camera position information includes the rotation angle, the sign of the rotation angle may also indicate the rotation direction: a positive rotation angle indicates clockwise rotation, and a negative rotation angle indicates counterclockwise rotation.
In another possible implementation, when the terminal receives a target remote control instruction sent by a remote control device, the terminal determines that the rotation instruction is received. The target remote control instruction includes remote control button information, which includes a button identifier and/or a number of button presses. The terminal determines the rotation camera position information according to the remote control button information, and then generates the surround playback request based on the rotation camera position information. The button identifier can be used to determine the rotation direction, and the number of button presses can be used to determine the number of camera positions to switch.
Optionally, the rotation direction is determined based on the button identifier. For example, when the remote control button information includes the identifier of the left button, the rotation direction is counterclockwise; when it includes the identifier of the right button, the rotation direction is clockwise. Of course, other buttons on the remote control device may also be set to control the rotation direction, which is not limited in this embodiment of this application. The number of button presses defines the number of camera positions to switch; for example, one press switches one camera position.
For example, suppose the remote control button information includes the identifier of the left button and the number of button presses is 3; this indicates a counterclockwise rotation across 3 camera positions. Referring to FIG. 3, if the start camera position is camera 9, the terminal determines from the button identifier that the rotation direction is counterclockwise, determines from the number of button presses that 3 camera positions are to be switched, and thus determines that the end camera position is camera 6.
Optionally, when the surround playback request is used to request dynamic surround playback of video content, the button-press duration may also define the surround playback duration; for example, the surround playback duration may be made equal to the button-press duration.
Step 402: The terminal sends the surround playback request to the upper-layer device.
The upper-layer device is a device upstream of the terminal. Optionally, the upper-layer device may be the video server (specifically, the video distribution server) or the network device in the video playback system shown in FIG. 1.
Step 403: The upper-layer device determines playback time information based on the surround playback request.
In an optional embodiment of this application, the surround playback request is used to request dynamic surround playback of video content, and the playback time information includes a playback start moment and a playback end moment. The upper-layer device may determine the playback time information based on the surround playback request in the following five ways:
In the first way, the implementation of step 403 includes: the upper-layer device determines the playback start moment and the playback end moment according to the moment at which the surround playback request is received and a preset policy, where the preset policy includes a preset surround playback duration.
Optionally, the preset policy defines that the video playback moment at which the upper-layer device receives the surround playback request is taken as the playback start moment, and that the interval between the playback end moment and the playback start moment equals the preset surround playback duration. For example, if the video playback moment at which the upper-layer device receives the surround playback request is 00:19:35 and the preset surround playback duration is 2 seconds, the upper-layer device determines that the playback start moment is 00:19:35 and the playback end moment is 00:19:37. Alternatively, the preset policy may define that a video playback moment at a certain interval from the receiving moment of the surround playback request (corresponding to a video playback moment) is taken as the playback start moment; the playback start moment may precede or follow the receiving moment of the surround playback request in time. For example, if the receiving moment of the surround playback request is 00:19:35, the playback start moment may be 00:19:34, or it may be 00:19:36.
In the second way, the surround playback request includes the playback start moment and the playback end moment, and the implementation of step 403 includes: the upper-layer device identifies the playback start moment and the playback end moment in the surround playback request.
Optionally, a designated field of the surround playback request is predefined or preconfigured to carry the playback start moment and the playback end moment. Predefinition may be definition in a standard or protocol; preconfiguration may be prior negotiation between the upper-layer device and the terminal. After receiving the surround playback request, the upper-layer device can identify the playback start moment and the playback end moment from the designated field.
For example, if the designated field of the surround playback request carries two moments, 00:19:35 and 00:19:37, the upper-layer device determines that the playback start moment is 00:19:35 and the playback end moment is 00:19:37.
In the third way, the surround playback request includes the playback start moment, and the implementation of step 403 includes: the upper-layer device determines the playback end moment according to the playback start moment and the preset surround playback duration.
For example, if the playback start moment carried in the surround playback request is 00:19:35 and the preset surround playback duration is 2 seconds, the upper-layer device determines that the playback end moment is 00:19:37.
In the fourth way, the surround playback request includes the surround playback duration, and the implementation of step 403 includes: the upper-layer device determines the playback start moment and the playback end moment according to the moment at which the surround playback request is received and the surround playback duration.
Optionally, for the way in which the upper-layer device determines the playback start moment and the playback end moment, refer to the first way described above; details are not described again in this embodiment of this application.
In the fifth way, the surround playback request includes the playback start moment and the surround playback duration, and the implementation of step 403 includes: the upper-layer device determines the playback end moment according to the playback start moment and the surround playback duration.
For example, if the playback start moment carried in the surround playback request is 00:19:35 and the surround playback duration is 2 seconds, the upper-layer device determines that the playback end moment is 00:19:37.
In another optional embodiment of this application, the surround playback request is used to request static surround playback of video content, and the playback time information includes a target playback moment. Optionally, the surround playback request includes the target playback moment. Alternatively, the surround playback request does not include the target playback moment, and the upper-layer device determines the target playback moment according to the moment at which the surround playback request is received; for the way of determining the target playback moment, refer to the way in which the upper-layer device determines the playback start moment in the first way described above, and details are not described again in this embodiment of this application.
Step 404: The upper-layer device determines a start camera position, an end camera position, and a rotation direction according to the rotation camera position information.
Optionally, when the rotation camera position information includes the identifier of the start camera position, the identifier of the end camera position, and the rotation direction, after receiving the surround playback request, the upper-layer device can determine the start camera position, the end camera position, and the rotation direction from the content of the rotation camera position information.
Optionally, when the rotation camera position information includes the rotation angle, after receiving the surround playback request, the upper-layer device determines the end camera position and the rotation direction according to the start camera position and the rotation angle. For example, referring to FIG. 3, if the start camera position determined by the upper-layer device is camera 9 and the rotation angle carried in the surround playback request is -90°, the upper-layer device determines that the rotation direction is counterclockwise and the end camera position is camera 4.
Step 405: The upper-layer device determines multiple camera positions from the camera positions from the start camera position to the end camera position along the rotation direction.
Optionally, the multiple camera positions determined by the upper-layer device may include all camera positions from the start camera position to the end camera position along the rotation direction. For example, referring to FIG. 3, if the start camera position is camera 9, the end camera position is camera 14, and the rotation direction is clockwise, the multiple camera positions determined by the upper-layer device include, in order, camera 9, camera 10, camera 11, camera 12, camera 13, and camera 14. Alternatively, when the surround playback request is used to request static surround playback of video content, the multiple camera positions determined by the upper-layer device may include only some of the camera positions from the start camera position to the end camera position along the rotation direction. For example, assuming that in FIG. 3 the union of the shooting areas of camera 11 and camera 13 completely covers the shooting area of camera 12, the multiple camera positions determined by the upper-layer device may exclude camera 12. During static surround playback of the video pictures captured by cameras 9 to 14, because the pictures shot by camera 11 and camera 13 contain the picture shot by camera 12, no abrupt change occurs in the video picture during surround playback, so the smoothness of the surround playback picture can be guaranteed.
Step 406: The upper-layer device generates a rotation segment according to the rotation camera position information and the playback time information.
The rotation segment includes the GOPs corresponding to the multiple camera positions within the rotation range. Optionally, the rotation segment includes, in order, the GOPs corresponding to the multiple camera positions from the start camera position to the end camera position along the rotation direction.
In an optional embodiment of this application, the surround playback request is used to request dynamic surround playback of video content, and each GOP in the rotation segment includes one or more frames of video images. In this case, the implementation of step 406 includes:
Step 4061A: The upper-layer device obtains, for each of the multiple camera positions, m video segments from the playback start moment to the playback end moment, where m is a positive integer.
For example, suppose the multiple camera positions include q camera positions in order along the rotation direction, the playback start moment is T1, and the playback end moment is T2, where q is an integer greater than 0 and T2 > T1, and the video stream corresponding to each camera position includes m video segments in the time period (T1, T2). The upper-layer device then obtains the m video segments corresponding to each of the q camera positions in the time period (T1, T2).
Step 4062A: The upper-layer device extracts one or more GOPs from the m video segments corresponding to each camera position according to the playback time information.
Optionally, the upper-layer device determines a GOP extraction moment and a GOP extraction quantity for each camera position according to the surround playback duration and the number of camera positions, where the surround playback duration is equal to the difference between the playback end moment and the playback start moment. The upper-layer device extracts GOPs from the m video segments corresponding to each camera position according to the GOP extraction moment and the GOP extraction quantity for that camera position.
Optionally, for two camera positions arranged along the rotation direction, the GOP extraction moment corresponding to the former camera position precedes, in time, the GOP extraction moment corresponding to the latter. The GOP extraction quantity for each camera position is equal to the surround playback duration divided by the product of the GOP time length and the number of camera positions (the quotient may be rounded up or down).
For example, continuing with the example in step 4061A, if the time length of each GOP is t, the GOP extraction quantity for each camera position is equal to (T2-T1)/(q*t).
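The extraction rule above can be sketched numerically. The helper below is an illustrative assumption (names invented, rounding down chosen from the two options the text allows); it implements the stated quantity (T2-T1)/(q*t) and assigns successive extraction windows along the rotation direction.

```python
import math

def extraction_plan(t1: float, t2: float, q: int, gop_len: float):
    """Per camera position, return (extraction_start_moment, gop_count)
    for q camera positions covering the window [t1, t2) in rotation order."""
    per_cam = math.floor((t2 - t1) / (q * gop_len))  # GOPs per camera, rounded down
    span = per_cam * gop_len                          # time covered by one camera
    return [(t1 + i * span, per_cam) for i in range(q)]

# 2-second surround window, 5 camera positions, 0.1 s GOPs -> 4 GOPs each,
# each camera's extraction moment 0.4 s after the previous one's
plan = extraction_plan(0.0, 2.0, 5, 0.1)
print(plan[0], plan[-1])  # (0.0, 4) (1.6, 4)
```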
Step 4063A: The upper-layer device assembles the extracted GOPs to obtain the rotation segment.
Optionally, the upper-layer device assembles the extracted GOPs in sequence along the rotation direction to obtain the rotation segment, which is a dynamic rotation segment.
For example, referring to the example in step 4061A, suppose q=5, m=1, each video segment includes 5 GOPs, and the GOP extraction quantity for each camera position is 1. Refer to FIG. 5, which is a schematic diagram of a process of generating a rotation segment according to an embodiment of this application. As shown in FIG. 5, the GOPs in the video segment corresponding to each camera position are numbered 1 to 5 in order. GOP 1 is extracted from the video segment of the first camera position, GOP 2 from the second, GOP 3 from the third, GOP 4 from the fourth, and GOP 5 from the fifth. The GOPs extracted from the video segments of the 5 camera positions are assembled in sequence along the rotation direction to obtain a dynamic rotation segment.
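The FIG. 5 assembly above can be sketched as a diagonal selection. The list-of-lists layout and function name are illustrative assumptions, with GOPs represented as strings: camera i (in rotation order) contributes its i-th GOP, so the assembled segment advances in time as it advances around the ring.

```python
def dynamic_rotation_segment(segments: list[list[str]]) -> list[str]:
    """segments[i] holds the GOP list of the (i+1)-th camera along the
    rotation; take GOP i from camera i and concatenate in rotation order."""
    return [cam_gops[i] for i, cam_gops in enumerate(segments)]

# q=5 cameras, one video segment of 5 GOPs each, as in the FIG. 5 example
segments = [[f"cam{c}_gop{g}" for g in range(1, 6)] for c in range(1, 6)]
print(dynamic_rotation_segment(segments))
# ['cam1_gop1', 'cam2_gop2', 'cam3_gop3', 'cam4_gop4', 'cam5_gop5']
```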
In another optional embodiment of this application, the surround playback request is used to request static surround playback of video content, and each GOP in the rotation segment includes one frame of video image. In this case, the implementation of step 406 includes:
Step 4061B: The upper-layer device obtains a target video segment corresponding to each of the multiple camera positions, where the time period corresponding to the target video segment contains the target playback moment.
That the time period corresponding to the target video segment contains the target playback moment means that the target playback moment lies between the start moment and the end moment of the target video segment.
Step 4062B: The upper-layer device extracts, from the target video segment corresponding to each camera position, one GOP corresponding to the target playback moment.
The GOP corresponding to the target playback moment is the GOP whose video image was captured at the target playback moment.
Step 4063B: The upper-layer device assembles the extracted GOPs to obtain the rotation segment.
Optionally, the upper-layer device assembles the extracted GOPs in sequence along the rotation direction to obtain the rotation segment, which is a static rotation segment.
For example, suppose there are 5 camera positions and each video segment includes 5 GOPs. Refer to FIG. 6, which is a schematic diagram of another process of generating a rotation segment according to an embodiment of this application. As shown in FIG. 6, the GOPs in the video segment corresponding to each camera position are numbered 1 to 5 in order, and the GOP corresponding to the target playback moment is GOP 2; GOP 2 is therefore extracted from each of the 5 camera positions. The GOPs extracted from the video segments of the 5 camera positions are assembled in sequence along the rotation direction to obtain a static rotation segment.
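The FIG. 6 assembly can be sketched in the same illustrative style (data layout and names assumed): for static surround playback, the GOP with the same index, the one matching the target playback moment, is taken from every camera position, so the assembled segment holds one moment seen from every angle.

```python
def static_rotation_segment(segments: list[list[str]], target_idx: int) -> list[str]:
    """Take the GOP at target_idx (0-based) from each camera's target video
    segment, concatenating in rotation order."""
    return [cam_gops[target_idx] for cam_gops in segments]

# 5 cameras, 5 GOPs per segment; the target playback moment maps to GOP 2
segments = [[f"cam{c}_gop{g}" for g in range(1, 6)] for c in range(1, 6)]
print(static_rotation_segment(segments, target_idx=1))
# ['cam1_gop2', 'cam2_gop2', 'cam3_gop2', 'cam4_gop2', 'cam5_gop2']
```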
Optionally, the number of GOPs contained in the rotation segment may be the same as or different from the number of GOPs contained in other video segments; for example, the rotation segment may contain fewer GOPs than other video segments, which is not limited in this embodiment of this application.
Optionally, when the upper-layer device is a network device, after receiving the surround playback request, the upper-layer device first downloads the media content index from the video server and parses it to obtain the information of the video streams. The upper-layer device extracts, from the media content index, the URL address of the video stream corresponding to each of the multiple camera positions, and then obtains the corresponding video segments through the URLs of the video streams.
Step 407: The upper-layer device sends the rotation segment to the terminal.
Optionally, when the surround playback request is used to request dynamic surround playback of video content, after sending the rotation segment to the terminal, the upper-layer device continues to send the video stream corresponding to the end camera position, so that the terminal can smoothly switch from the playback picture of the start camera position to that of the end camera position. When the surround playback request is used to request static surround playback of video content, the upper-layer device stops sending video data to the terminal after sending the rotation segment.
Step 408: The terminal decodes and plays the rotation segment.
By decoding and playing the rotation segment, the terminal implements surround playback of the video pictures corresponding to the multiple camera positions from the start camera position to the end camera position along the rotation direction. The resolution of the video picture played by the terminal may be the same as the resolution of the video images in the rotation segment.
The order of the steps of the method embodiments provided in the embodiments of this application can be appropriately adjusted, and steps can be added or removed according to the situation. Any variation readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application, and details are not described again.
In summary, in the video playback method provided in the embodiments of this application, the upper-layer device determines the playback time information according to the surround playback request sent by the terminal, and then generates the rotation segment according to the playback time information and the rotation camera position information in the surround playback request. Because the rotation segment contains the GOPs corresponding to the multiple camera positions within the rotation range indicated by the rotation camera position information, after receiving the rotation segment, the terminal decodes it to implement surround playback of the video picture, and the resolution of the played video picture may be the same as the resolution of the video images in the rotation segment. Therefore, the video playback method provided in the embodiments of this application is not limited by the number of cameras used for front-end shooting and has a wide range of applications. In addition, the upper-layer device may be a video distribution server or a network device, which reduces the processing-performance requirements on the video processing server and provides high reliability.
FIG. 7 is a schematic structural diagram of a video playback apparatus according to an embodiment of this application. The apparatus is used in an upper-layer device; for example, the upper-layer device may be the video server or the network device in the video playback system shown in FIG. 1. As shown in FIG. 7, the apparatus 70 includes:
a receiving module 701, configured to receive a surround playback request sent by a terminal, where the surround playback request includes rotation camera position information indicating a rotation range;
a first determining module 702, configured to determine playback time information based on the surround playback request;
a generating module 703, configured to generate a rotation segment according to the rotation camera position information and the playback time information, where the rotation segment includes groups of pictures (GOPs) corresponding to multiple camera positions within the rotation range, and a GOP includes one or more frames of video images; and
a sending module 704, configured to send the rotation segment to the terminal.
Optionally, the playback time information includes a playback start moment and a playback end moment, and the generating module 703 is configured to:
obtain, for each of the multiple camera positions, m video segments from the playback start moment to the playback end moment, where m is a positive integer; extract one or more GOPs from the m video segments corresponding to each camera position according to the playback time information; and assemble the extracted GOPs to obtain the rotation segment.
Optionally, the generating module 703 is specifically configured to: determine a GOP extraction moment and a GOP extraction quantity for each camera position according to the surround playback duration and the number of camera positions, where the surround playback duration is equal to the difference between the playback end moment and the playback start moment; and extract GOPs from the m video segments corresponding to each camera position according to the GOP extraction moment and the GOP extraction quantity for that camera position.
Optionally, the playback time information includes a target playback moment, and the generating module 703 is configured to:
obtain a target video segment corresponding to each of the multiple camera positions, where the time period corresponding to the target video segment contains the target playback moment; extract, from the target video segment corresponding to each camera position, one GOP corresponding to the target playback moment, where the GOP includes one frame of video image; and assemble the extracted GOPs to obtain the rotation segment.
Optionally, as shown in FIG. 8, the apparatus 70 further includes:
a second determining module 705, configured to determine a start camera position, an end camera position, and a rotation direction according to the rotation camera position information; and
a third determining module 706, configured to determine the multiple camera positions from the camera positions from the start camera position to the end camera position along the rotation direction;
the generating module 703 is configured to assemble the extracted GOPs in sequence along the rotation direction to obtain the rotation segment.
Optionally, the first determining module 702 is configured to determine the playback start moment and the playback end moment according to the moment at which the surround playback request is received and a preset policy, where the preset policy includes a preset surround playback duration; alternatively, the surround playback request includes the playback start moment and the playback end moment, and the first determining module 702 is configured to identify the playback start moment and the playback end moment in the surround playback request; alternatively, the surround playback request includes the playback start moment, and the first determining module 702 is configured to determine the playback end moment according to the playback start moment and the preset surround playback duration; alternatively, the surround playback request includes the surround playback duration, and the first determining module 702 is configured to determine the playback start moment and the playback end moment according to the moment at which the surround playback request is received and the surround playback duration; alternatively, the surround playback request includes the playback start moment and the surround playback duration, and the first determining module 702 is configured to determine the playback end moment according to the playback start moment and the surround playback duration.
Optionally, the GOPs are encoded in an independent-transmission encapsulation mode.
In summary, in the video playback apparatus provided in the embodiments of this application, the upper-layer device determines the playback time information through the first determining module according to the surround playback request sent by the terminal, and then generates the rotation segment through the generating module according to the playback time information and the rotation camera position information in the surround playback request. Because the rotation segment contains the GOPs corresponding to the multiple camera positions within the rotation range indicated by the rotation camera position information, after receiving the rotation segment, the terminal decodes it to implement surround playback of the video picture, and the resolution of the played video picture may be the same as the resolution of the video images in the rotation segment. Therefore, the video playback method provided in the embodiments of this application is not limited by the number of cameras used for front-end shooting and has a wide range of applications. In addition, the upper-layer device may be a video distribution server or a network device, which reduces the processing-performance requirements on the video processing server and provides high reliability.
FIG. 9 is a schematic structural diagram of still another video playback apparatus according to an embodiment of this application. The apparatus is used in a terminal; for example, the apparatus may be the terminal 103 in the video playback system shown in FIG. 1. As shown in FIG. 9, the apparatus 90 includes:
a sending module 901, configured to: when the terminal receives a rotation instruction, send, to an upper-layer device, a surround playback request generated based on the rotation instruction, where the surround playback request includes rotation camera position information indicating a rotation range;
a receiving module 902, configured to receive a rotation segment sent by the upper-layer device, where the rotation segment includes groups of pictures (GOPs) corresponding to multiple camera positions within the rotation range, and a GOP includes one or more frames of video images; and
a playback module 903, configured to decode and play the rotation segment.
Optionally, as shown in FIG. 10, the apparatus 90 further includes:
a first determining module 904, configured to determine that the rotation instruction is received when the terminal detects a sliding operation on the video playback interface;
a second determining module 905, configured to determine the rotation camera position information according to sliding information of the sliding operation, where the sliding information includes one or more of a sliding start position, a sliding length, a sliding direction, or a sliding angle; and
a generating module 906, configured to generate the surround playback request based on the rotation camera position information.
Optionally, as shown in FIG. 11, the apparatus 90 further includes:
a third determining module 907, configured to determine that the rotation instruction is received when the terminal receives a target remote control instruction sent by a remote control device, where the target remote control instruction includes remote control button information, and the remote control button information includes a button identifier and/or a number of button presses;
a fourth determining module 908, configured to determine the rotation camera position information based on the remote control button information; and
a generating module 906, configured to generate the surround playback request based on the rotation camera position information.
In summary, in the video playback apparatus provided in the embodiments of this application, after receiving the rotation instruction, the terminal sends the surround playback request to the upper-layer device through the sending module, and then receives, through the receiving module, the rotation segment sent by the upper-layer device. Because the rotation segment contains the GOPs corresponding to the multiple camera positions within the rotation range indicated by the rotation camera position information, after receiving the rotation segment, the terminal decodes it through the playback module to implement surround playback of the video picture, and the resolution of the played video picture may be the same as the resolution of the video images in the rotation segment. Therefore, the video playback method provided in the embodiments of this application is not limited by the number of cameras used for front-end shooting and has a wide range of applications. In addition, the upper-layer device may be a video distribution server or a network device, which reduces the processing-performance requirements on the video processing server and provides high reliability.
With regard to the apparatuses in the above embodiments, the specific manner in which each module performs operations has been described in detail in the embodiments of the related method, and details are not elaborated here.
An embodiment of this application further provides a video playback system, which includes an upper-layer device and a terminal. The upper-layer device includes the video playback apparatus shown in FIG. 7 or FIG. 8, and the terminal includes the video playback apparatus shown in any one of FIG. 9 to FIG. 11.
FIG. 12 is a block diagram of a video playback apparatus according to an embodiment of this application. The video playback apparatus may be an upper-layer device or a terminal; the upper-layer device may be a video server or a network device, and the terminal may be a mobile phone, a tablet computer, a smart wearable device, a set-top box, or the like. As shown in FIG. 12, the video playback apparatus 120 includes a processor 1201 and a memory 1202.
The memory 1202 is configured to store a computer program, where the computer program includes program instructions;
the processor 1201 is configured to invoke the computer program to implement the actions performed by the upper-layer device or the actions performed by the terminal in the video playback method shown in FIG. 4.
Optionally, the video playback apparatus 120 further includes a communication bus 1203 and a communication interface 1204.
The processor 1201 includes one or more processing cores, and performs various functional applications and data processing by running the computer program.
The memory 1202 may be configured to store the computer program. Optionally, the memory may store an operating system and the application program units required by at least one function. The operating system may be an operating system such as a real-time operating system (Real Time eXecutive, RTX), LINUX, UNIX, WINDOWS, or OS X.
There may be multiple communication interfaces 1204, which are used to communicate with other storage devices or network devices. For example, in this embodiment of this application, the communication interface of the upper-layer device may be used to send the rotation segment to the terminal, and the communication interface of the terminal may be used to send the surround playback request to the upper-layer device. The network device may be a switch, a router, or the like.
The memory 1202 and the communication interface 1204 are each connected to the processor 1201 through the communication bus 1203.
An embodiment of this application further provides a computer storage medium storing instructions. When the instructions are executed by a processor of a computer device, the actions performed by the upper-layer device or the actions performed by the terminal in the video playback method described in the above method embodiments are implemented.
A person of ordinary skill in the art may understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the embodiments of this application, the terms "first", "second", and "third" are used for description purposes only and shall not be understood as indicating or implying relative importance.
In this application, the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: only A exists, both A and B exist, and only B exists. In addition, the character "/" in this document generally indicates an "or" relationship between the associated objects.
The above are merely optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, improvement, and the like made within the concept and principles of this application shall fall within the protection scope of this application.

Claims (23)

  1. A video playback method, wherein the method comprises:
    receiving, by an upper-layer device, a surround playback request sent by a terminal, wherein the surround playback request comprises rotation camera position information, and the rotation camera position information is used to indicate a rotation range;
    determining, by the upper-layer device, playback time information based on the surround playback request;
    generating, by the upper-layer device, a rotation segment according to the rotation camera position information and the playback time information, wherein the rotation segment comprises groups of pictures (GOPs) corresponding to a plurality of camera positions within the rotation range, and a GOP comprises one or more frames of video images; and
    sending, by the upper-layer device, the rotation segment to the terminal.
  2. The method according to claim 1, wherein the playback time information comprises a playback start moment and a playback end moment, and the generating, by the upper-layer device, a rotation segment according to the rotation camera position information and the playback time information comprises:
    obtaining, by the upper-layer device, for each of the plurality of camera positions, m video segments from the playback start moment to the playback end moment, wherein m is a positive integer;
    extracting, by the upper-layer device, one or more GOPs from the m video segments corresponding to each camera position according to the playback time information; and
    assembling, by the upper-layer device, the extracted GOPs to obtain the rotation segment.
  3. The method according to claim 2, wherein the extracting, by the upper-layer device, one or more GOPs from the m video segments corresponding to each camera position according to the playback time information comprises:
    determining, by the upper-layer device, a GOP extraction moment and a GOP extraction quantity for each camera position according to a surround playback duration and the number of the plurality of camera positions, wherein the surround playback duration is equal to the difference between the playback end moment and the playback start moment; and
    extracting, by the upper-layer device, GOPs from the m video segments corresponding to each camera position according to the GOP extraction moment and the GOP extraction quantity for the camera position.
  4. The method according to claim 1, wherein the playback time information comprises a target playback moment, and the generating, by the upper-layer device, a rotation segment according to the rotation camera position information and the playback time information comprises:
    obtaining, by the upper-layer device, a target video segment corresponding to each of the plurality of camera positions, wherein the time period corresponding to the target video segment contains the target playback moment;
    extracting, by the upper-layer device, from the target video segment corresponding to each camera position, one GOP corresponding to the target playback moment, wherein the GOP comprises one frame of video image; and
    assembling, by the upper-layer device, the extracted GOPs to obtain the rotation segment.
  5. The method according to any one of claims 2 to 4, wherein the method further comprises:
    determining, by the upper-layer device, a start camera position, an end camera position, and a rotation direction according to the rotation camera position information; and
    determining, by the upper-layer device, the plurality of camera positions from the camera positions from the start camera position to the end camera position along the rotation direction;
    wherein the assembling, by the upper-layer device, the extracted GOPs to obtain the rotation segment comprises:
    assembling, by the upper-layer device, the extracted GOPs in sequence along the rotation direction to obtain the rotation segment.
  6. The method according to claim 2 or 3, wherein
    the determining, by the upper-layer device, playback time information based on the surround playback request comprises:
    determining, by the upper-layer device, the playback start moment and the playback end moment according to the moment at which the surround playback request is received and a preset policy, wherein the preset policy comprises a preset surround playback duration;
    or, the surround playback request comprises the playback start moment and the playback end moment, and the determining, by the upper-layer device, playback time information based on the surround playback request comprises:
    identifying, by the upper-layer device, the playback start moment and the playback end moment in the surround playback request;
    or, the surround playback request comprises the playback start moment, and the determining, by the upper-layer device, playback time information based on the surround playback request comprises:
    determining, by the upper-layer device, the playback end moment according to the playback start moment and a preset surround playback duration;
    or, the surround playback request comprises the surround playback duration, and the determining, by the upper-layer device, playback time information based on the surround playback request comprises:
    determining, by the upper-layer device, the playback start moment and the playback end moment according to the moment at which the surround playback request is received and the surround playback duration;
    or, the surround playback request comprises the playback start moment and the surround playback duration, and the determining, by the upper-layer device, playback time information based on the surround playback request comprises:
    determining, by the upper-layer device, the playback end moment according to the playback start moment and the surround playback duration.
  7. The method according to any one of claims 1 to 6, wherein the GOPs are encoded in an independent-transmission encapsulation mode.
  8. A video playback method, wherein the method comprises:
    when a terminal receives a rotation instruction, sending, by the terminal to an upper-layer device, a surround playback request generated based on the rotation instruction, wherein the surround playback request comprises rotation camera position information, and the rotation camera position information is used to indicate a rotation range;
    receiving, by the terminal, a rotation segment sent by the upper-layer device, wherein the rotation segment comprises groups of pictures (GOPs) corresponding to a plurality of camera positions within the rotation range, and a GOP comprises one or more frames of video images; and
    decoding and playing, by the terminal, the rotation segment.
  9. The method according to claim 8, wherein the method further comprises:
    when the terminal detects a sliding operation on a video playback interface, determining, by the terminal, that the rotation instruction is received;
    determining, by the terminal, the rotation camera position information according to sliding information of the sliding operation, wherein the sliding information comprises one or more of a sliding start position, a sliding length, a sliding direction, or a sliding angle; and
    generating, by the terminal, the surround playback request based on the rotation camera position information.
  10. The method according to claim 8, wherein the method further comprises:
    when the terminal receives a target remote control instruction sent by a remote control device, determining, by the terminal, that the rotation instruction is received, wherein the target remote control instruction comprises remote control button information, and the remote control button information comprises a button identifier and/or a number of button presses;
    determining, by the terminal, the rotation camera position information based on the remote control button information; and
    generating, by the terminal, the surround playback request based on the rotation camera position information.
  11. A video playback apparatus, used in an upper-layer device, wherein the apparatus comprises:
    a receiving module, configured to receive a surround playback request sent by a terminal, wherein the surround playback request comprises rotation camera position information, and the rotation camera position information is used to indicate a rotation range;
    a first determining module, configured to determine playback time information based on the surround playback request;
    a generating module, configured to generate a rotation segment according to the rotation camera position information and the playback time information, wherein the rotation segment comprises groups of pictures (GOPs) corresponding to a plurality of camera positions within the rotation range, and a GOP comprises one or more frames of video images; and
    a sending module, configured to send the rotation segment to the terminal.
  12. The apparatus according to claim 11, wherein the playback time information comprises a playback start moment and a playback end moment, and the generating module is configured to:
    obtain, for each of the plurality of camera positions, m video segments from the playback start moment to the playback end moment, wherein m is a positive integer;
    extract one or more GOPs from the m video segments corresponding to each camera position according to the playback time information; and
    assemble the extracted GOPs to obtain the rotation segment.
  13. The apparatus according to claim 12, wherein the generating module is configured to:
    determine a GOP extraction moment and a GOP extraction quantity for each camera position according to a surround playback duration and the number of the plurality of camera positions, wherein the surround playback duration is equal to the difference between the playback end moment and the playback start moment; and
    extract GOPs from the m video segments corresponding to each camera position according to the GOP extraction moment and the GOP extraction quantity for the camera position.
  14. The apparatus according to claim 11, wherein the playback time information comprises a target playback moment, and the generating module is configured to:
    obtain a target video segment corresponding to each of the plurality of camera positions, wherein the time period corresponding to the target video segment contains the target playback moment;
    extract, from the target video segment corresponding to each camera position, one GOP corresponding to the target playback moment, wherein the GOP comprises one frame of video image; and
    assemble the extracted GOPs to obtain the rotation segment.
  15. The apparatus according to any one of claims 12 to 14, wherein the apparatus further comprises:
    a second determining module, configured to determine a start camera position, an end camera position, and a rotation direction according to the rotation camera position information; and
    a third determining module, configured to determine the plurality of camera positions from the camera positions from the start camera position to the end camera position along the rotation direction;
    wherein the generating module is configured to assemble the extracted GOPs in sequence along the rotation direction to obtain the rotation segment.
  16. The apparatus according to claim 12 or 13, wherein
    the first determining module is configured to determine the playback start moment and the playback end moment according to the moment at which the surround playback request is received and a preset policy, wherein the preset policy comprises a preset surround playback duration;
    or, the surround playback request comprises the playback start moment and the playback end moment, and the first determining module is configured to identify the playback start moment and the playback end moment in the surround playback request;
    or, the surround playback request comprises the playback start moment, and the first determining module is configured to determine the playback end moment according to the playback start moment and a preset surround playback duration;
    or, the surround playback request comprises the surround playback duration, and the first determining module is configured to determine the playback start moment and the playback end moment according to the moment at which the surround playback request is received and the surround playback duration;
    or, the surround playback request comprises the playback start moment and the surround playback duration, and the first determining module is configured to determine the playback end moment according to the playback start moment and the surround playback duration.
  17. The apparatus according to any one of claims 11 to 16, wherein the GOPs are encoded in an independent-transmission encapsulation mode.
  18. A video playback apparatus, used in a terminal, wherein the apparatus comprises:
    a sending module, configured to: when the terminal receives a rotation instruction, send, to an upper-layer device, a surround playback request generated based on the rotation instruction, wherein the surround playback request comprises rotation camera position information, and the rotation camera position information is used to indicate a rotation range;
    a receiving module, configured to receive a rotation segment sent by the upper-layer device, wherein the rotation segment comprises groups of pictures (GOPs) corresponding to a plurality of camera positions within the rotation range, and a GOP comprises one or more frames of video images; and
    a playback module, configured to decode and play the rotation segment.
  19. The apparatus according to claim 18, wherein the apparatus further comprises:
    a first determining module, configured to determine that the rotation instruction is received when the terminal detects a sliding operation on a video playback interface;
    a second determining module, configured to determine the rotation camera position information according to sliding information of the sliding operation, wherein the sliding information comprises one or more of a sliding start position, a sliding length, a sliding direction, or a sliding angle; and
    a generating module, configured to generate the surround playback request based on the rotation camera position information.
  20. The apparatus according to claim 18, wherein the apparatus further comprises:
    a third determining module, configured to determine that the rotation instruction is received when the terminal receives a target remote control instruction sent by a remote control device, wherein the target remote control instruction comprises remote control button information, and the remote control button information comprises a button identifier and/or a number of button presses;
    a fourth determining module, configured to determine the rotation camera position information based on the remote control button information; and
    a generating module, configured to generate the surround playback request based on the rotation camera position information.
  21. A video playback system, wherein the system comprises an upper-layer device and a terminal, the upper-layer device comprises the video playback apparatus according to any one of claims 11 to 17, and the terminal comprises the video playback apparatus according to any one of claims 18 to 20.
  22. A video playback apparatus, comprising a processor and a memory, wherein
    the memory is configured to store a computer program, and the computer program comprises program instructions; and
    the processor is configured to invoke the computer program to implement the video playback method according to any one of claims 1 to 7, or implement the video playback method according to any one of claims 8 to 10.
  23. A computer storage medium, wherein the computer storage medium stores instructions, and when the instructions are executed by a processor of a computer device, the video playback method according to any one of claims 1 to 10 is implemented.
PCT/CN2021/085477 2020-04-29 2021-04-03 Video playback method, apparatus, and system, and computer storage medium WO2021218573A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21797606.7A EP4135312A4 (en) 2020-04-29 2021-04-03 VIDEO PLAYING METHOD, DEVICE AND SYSTEM AND COMPUTER STORAGE MEDIUM
US17/977,404 US20230045876A1 (en) 2020-04-29 2022-10-31 Video Playing Method, Apparatus, and System, and Computer Storage Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010354123.0A CN113572975B (zh) 2020-04-29 2020-04-29 视频播放方法、装置及***、计算机存储介质
CN202010354123.0 2020-04-29

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/977,404 Continuation US20230045876A1 (en) 2020-04-29 2022-10-31 Video Playing Method, Apparatus, and System, and Computer Storage Medium

Publications (1)

Publication Number Publication Date
WO2021218573A1 true WO2021218573A1 (zh) 2021-11-04

Family

ID=78158348

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/085477 WO2021218573A1 (zh) 2020-04-29 2021-04-03 视频播放方法、装置及***、计算机存储介质

Country Status (4)

Country Link
US (1) US20230045876A1 (zh)
EP (1) EP4135312A4 (zh)
CN (1) CN113572975B (zh)
WO (1) WO2021218573A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022100742A1 (zh) * 2020-11-16 2022-05-19 华为云计算技术有限公司 Video encoding and video playback methods, apparatuses, and systems

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12021927B2 (en) * 2022-04-15 2024-06-25 Arcus Holding A/S Location based video data transmission
CN114615540A (zh) * 2022-05-11 2022-06-10 北京搜狐新动力信息技术有限公司 Video resolution switching method and apparatus, storage medium, and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130039632A1 (en) * 2011-08-08 2013-02-14 Roy Feinson Surround video playback
CN106550239A (zh) * 2015-09-22 2017-03-29 北京同步科技有限公司 360-degree panoramic live video system and implementation method thereof
CN109257611A (zh) * 2017-07-12 2019-01-22 阿里巴巴集团控股有限公司 Video playback method and apparatus, terminal device, and server
CN109996110A (zh) * 2017-12-29 2019-07-09 中兴通讯股份有限公司 Video playback method, terminal, server, and storage medium
CN110719425A (zh) * 2018-07-11 2020-01-21 视联动力信息技术股份有限公司 Video data playback method and apparatus

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7843497B2 (en) * 1994-05-31 2010-11-30 Conley Gregory J Array-camera motion picture device, and methods to produce new visual and aural effects
AU2002307545A1 (en) * 2001-04-20 2002-11-05 Corp. Kewazinga Navigable camera array and viewer therefore
JP4148671B2 (ja) * 2001-11-06 2008-09-10 ソニー株式会社 Display image control processing device, moving image information transmission/reception system, display image control processing method, moving image information transmission/reception method, and computer program
US7142209B2 (en) * 2004-08-03 2006-11-28 Microsoft Corporation Real-time rendering system and process for interactive viewpoint video that was generated using overlapping images of a scene captured from viewpoints forming a grid
CN101630117A (zh) * 2008-07-15 2010-01-20 上海虚拟谷数码科技有限公司 System for multi-camera synchronized acquisition of 360° high-definition panoramas and application thereof
US9462301B2 (en) * 2013-03-15 2016-10-04 Google Inc. Generating videos with multiple viewpoints
JP6354244B2 (ja) * 2014-03-25 2018-07-11 大日本印刷株式会社 Image playback terminal, image playback method, program, and multi-viewpoint image playback system
US10129579B2 (en) * 2015-10-15 2018-11-13 At&T Mobility Ii Llc Dynamic video image synthesis using multiple cameras and remote control
CN107979732A (zh) * 2016-10-25 2018-05-01 北京优朋普乐科技有限公司 Multi-view video playback method and apparatus
CN108876926B (zh) * 2017-05-11 2021-08-10 京东方科技集团股份有限公司 Navigation method and system in a panoramic scene, and AR/VR client device
US20200213631A1 (en) * 2017-06-29 2020-07-02 4Dreplay Korea, Inc. Transmission system for multi-channel image, control method therefor, and multi-channel image playback method and apparatus
WO2019004498A1 (ko) * 2017-06-29 2019-01-03 포디리플레이 인코포레이티드 Multi-channel video generation method, multi-channel video playback method, and multi-channel video playback program
US11736675B2 (en) * 2018-04-05 2023-08-22 Interdigital Madison Patent Holdings, Sas Viewpoint metadata for omnidirectional video
US10356387B1 (en) * 2018-07-26 2019-07-16 Telefonaktiebolaget Lm Ericsson (Publ) Bookmarking system and method in 360° immersive video based on gaze vector information
JP6571859B1 (ja) * 2018-12-26 2019-09-04 Amatelus Inc. Video distribution device, video distribution system, video distribution method, and video distribution program

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130039632A1 (en) * 2011-08-08 2013-02-14 Roy Feinson Surround video playback
CN106550239A (zh) * 2015-09-22 2017-03-29 北京同步科技有限公司 360-degree panoramic video live-streaming system and implementation method thereof
CN109257611A (zh) * 2017-07-12 2019-01-22 Alibaba Group Holding Limited Video playback method and apparatus, terminal device, and server
CN109996110A (zh) * 2017-12-29 2019-07-09 ZTE Corporation Video playback method, terminal, server, and storage medium
CN110719425A (zh) * 2018-07-11 2020-01-21 视联动力信息技术股份有限公司 Video data playback method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4135312A4

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022100742A1 (zh) * 2020-11-16 2022-05-19 Huawei Cloud Computing Technologies Co., Ltd. Video encoding and video playback methods, apparatuses, and system

Also Published As

Publication number Publication date
EP4135312A4 (en) 2023-07-19
CN113572975B (zh) 2023-06-06
CN113572975A (zh) 2021-10-29
US20230045876A1 (en) 2023-02-16
EP4135312A1 (en) 2023-02-15

Similar Documents

Publication Publication Date Title
US11470405B2 (en) Network video streaming with trick play based on separate trick play files
WO2021218573A1 (zh) Video playback method, apparatus and system, and computer storage medium
US10567765B2 (en) Streaming multiple encodings with virtual stream identifiers
EP3459247A1 (en) Most-interested region in an image
KR20190137915A (ko) Video playback method, apparatus, and system
US20180063590A1 (en) Systems and Methods for Encoding and Playing Back 360° View Video Content
EP4192020B1 (en) Channel change method and apparatus
WO2014193996A2 (en) Network video streaming with trick play based on separate trick play files
US20200304552A1 (en) Immersive Media Metrics For Rendered Viewports
CN111447503A (zh) Viewpoint switching method for multi-viewpoint video, server, and system
JP2020524450A (ja) Transmission system for multi-channel video, control method therefor, and multi-channel video playback method and apparatus
CN108989833B (zh) Method and apparatus for generating a video cover image
Tang et al. Audio and video mixing method to enhance WebRTC
US9667885B2 (en) Systems and methods to achieve interactive special effects
CN115174942A (zh) Free-viewpoint switching method and interactive free-viewpoint playback system
US11172238B1 (en) Multiple view streaming
WO2022222533A1 (zh) Video playback method, apparatus and system, and computer-readable storage medium
KR20110129064A (ko) Content virtual segmentation method, and streaming service provision method and system using the same
WO2022100742A1 (zh) Video encoding and video playback methods, apparatuses, and system
CN112291577B (zh) Live video transmission method and apparatus, storage medium, and electronic apparatus
WO2024082561A1 (zh) Video processing method and apparatus, computer, readable storage medium, and program product
Seo et al. Bandwidth-Efficient Transmission Method for User View-Oriented Video Services
KR20230013461A (ko) Video storage device, video monitoring device, and method executed thereon
KR20230103789A (ko) Operating method of a system for transmitting multi-channel video, and system performing the same
CN112784108A (zh) Data processing method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21797606

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021797606

Country of ref document: EP

Effective date: 20221108

NENP Non-entry into the national phase

Ref country code: DE