EP3398346A1 - Video streams - Google Patents
Info
- Publication number
- EP3398346A1 (application EP16816318.6A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- video
- scene
- stream
- scene information
- timestamps
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/85406—Content authoring involving a specific file format, e.g. MP4 format
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/61—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
- H04L65/612—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/61—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
- H04L65/613—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for the control of the source by the destination
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/765—Media network packet handling intermediate
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/21805—Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/23439—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/236—Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
- H04N21/2365—Multiplexing of several video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8543—Content authoring using a description language, e.g. Multimedia and Hypermedia information coding Expert Group [MHEG], eXtensible Markup Language [XML]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
Definitions
- the present invention relates to video streams. More in particular, the present invention relates to providing multiple video streams of a scene.
- a scene may be shot from several camera locations or camera angles simultaneously, each camera having its own camera location or camera angle and therefore producing its own view of the scene.
- a spectator is able to choose a view of her liking, for example a view of one of the goals when watching a football match.
- SourceTV by Valve Corporation (https://developer.valvesoftware.com/wiki/SourceTV) in principle allows a very large number of spectators to watch online games, each spectator being able to choose her own view. This is achieved by distributing game data via a network of distributed servers and proxies to spectator clients. Each proxy can serve up to 255 spectators. This means that for allowing very large numbers of spectators to simultaneously watch a game, the required number of servers and proxies can become prohibitive.
- each spectator requires her own spectator client, a client device for non-participants. As each spectator client requires a substantial amount of processing and therefore uses a significant amount of energy, the SourceTV solution is also inefficient with regard to the use of energy.
- a solution preferably comprising an apparatus and a method, which allows a large number of spectators to view a scene, such as a scene of a computer game, from different (virtual) viewpoints chosen by the spectators, which solution has improved scaling properties.
- This allows a stable composite video image to be formed from two or more synchronized video streams.
- Spectators may compose a desired video image from the images (or image frames) of the available video streams, thus limiting the number of video streams to be produced.
- video streams relating to scenes can be easily scaled without requiring substantial amounts of hardware.
- the invention provides a method of producing multiple video streams by using at least one scene information stream relating to a scene, wherein the scene information stream comprises metadata descriptive of at least one event, said metadata including a time indication of the event, the method comprising:
- each partial view covering at least a part of the scene
- because each partial view covers at least part of the scene, at least two parts of the scene can be offered to the spectators, each of those parts of the scene being represented by a separate video stream generated on the basis of the metadata from the scene information stream.
- the present invention advantageously uses time indications normally present in a scene information stream and transforms these time indications into timestamps which can be used for synchronization during the rendering and/or playing out of the views. More in particular, in accordance with the invention the time indications of the scene information stream are transformed into timestamps configured for synchronously rendering the video streams.
- the video can be synchronized so as to provide synchronized partial views.
- assigning the timestamps to the video streams, and transmitting each video stream together with its assigned timestamps it is ensured that the timestamps are present in the video streams so as to allow synchronization of the rendered streams.
- determining the partial views is preferably carried out by the entity providing the video stream service and may be independent from user input. That is, the partial views may not be determined by users. Instead, user views may be composed from the partial views offered (e.g. a user device may generate a user view by requesting one or more partial views). Such a user view may be generated by, for example, stitching (synchronized) video frames related to different partial views, and/or by cropping the desired user view from video data related to the one or more partial views. By composing a desired view from available partial views, it is no longer necessary to provide a virtually unlimited number of partial views.
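The stitching and cropping described above can be sketched as follows. This is a minimal illustration, assuming dummy RGB frames and an assumed tile resolution; nothing here is taken from the patent's actual implementation.

```python
import numpy as np

TILE_W, TILE_H = 640, 360  # resolution of one partial view (assumed)

def stitch_horizontal(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Join two time-synchronized frames side by side."""
    assert left.shape[0] == right.shape[0], "frames must share height"
    return np.concatenate([left, right], axis=1)

def crop(frame: np.ndarray, x: int, y: int, w: int, h: int) -> np.ndarray:
    """Cut the desired user view out of the stitched frame."""
    return frame[y:y + h, x:x + w]

# Two synchronized partial views (dummy frames: one black, one white).
left = np.zeros((TILE_H, TILE_W, 3), dtype=np.uint8)
right = np.full((TILE_H, TILE_W, 3), 255, dtype=np.uint8)

stitched = stitch_horizontal(left, right)             # 1280 pixels wide
user_view = crop(stitched, x=320, y=0, w=640, h=360)  # straddles both tiles
```

Note that the crop window deliberately spans the tile boundary: a user view need not align with any single partial view, which is exactly why the frames must be time-synchronized before stitching.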
- partial views cover at least part of the scene, they may cover the entire scene.
- each partial view covers a different part of the scene, although partial views may partially overlap. In some embodiments, it can be advantageous to provide at least one view of the entire scene.
- Transforming the time indications into timestamps may be carried out in several ways.
- transforming the time indications comprises applying a linear transform.
- other transforms are also possible, such as transforms in which a single time indication influences multiple timestamps.
- the transform may involve a calculation or a look-up table.
- the scene information stream is generated by a computer-generated game. That is, the invention can advantageously be used to provide video streams of computer game scenes.
- the scene information stream may also originate from a motion picture, such as a computer-generated motion picture.
- At least two partial views may overlap. By providing overlapping views, the number of (partial) video streams can be reduced. Some partial views may entirely be constituted by parts of other partial views.
- the method may comprise composing a view to be displayed by using one or more of said partial views.
- the so-called camera position of the partial views can be suitably chosen.
- the camera position refers to the angle and the perspective of the views.
- the partial views correspond with an infinite-distance camera position. That is, the partial views have the perspective corresponding with an infinite-distance camera position.
- the angle of the camera position relative to the horizontal plane of the views may be suitably chosen, and may vary from 0° to 90°, for example 60°.
- the transmitting may be based on HTTP adaptive streaming.
- the method may further comprise requesting the at least one video stream by using a spatial manifest structure.
- the invention also provides a software program product comprising instructions allowing a processor to carry out the method described above.
- the software program product may be stored on a tangible carrier, such as a DVD or a USB stick.
- the software program product may be stored on a server from which it may be downloaded using the Internet.
- the software program product contains software instructions which can be carried out by the processor of a device, such as a server, a user device (for example a smartphone), and/or a monitoring device.
- the invention further provides a spatial manifest data structure configured for use in the method described above, and in particular configured for producing multiple video streams from a scene information stream. More in particular, the spatial manifest data structure comprises
- the spatial manifest data file preferably allows time synchronized video frames from the at least two video streams to be combined into a video frame for display.
- the invention still further provides an apparatus configured for generating multiple video streams on the basis of the metadata from at least one scene information stream relating to a scene, wherein the scene information stream comprises metadata descriptive of at least one event, the metadata including a time indication of said event, the apparatus comprising a video coordinator and at least two video generators, the video coordinator being configured for:
- each video generator being configured for:
- each partial view covering at least part of the scene, and by allocating a video generator to each partial view, at least two different parts of the scene can be produced, each part of the scene being represented by a video stream generated from the scene information stream by a separate video generator.
- each video generator is configured for rendering video frames of the allocated partial view by using said metadata from said scene information stream, and encoding the video frames into a video stream. That is, each video generator is capable of generating video frames of the particular partial view allocated to the video generator and producing a video stream containing those frames.
- each video generator is configured for splitting (that is, segmenting) the video stream into time segments, preferably time segments suitable for HTTP-based adaptive streaming.
- a preferred example of HTTP-based adaptive streaming is MPEG DASH.
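The time segmentation mentioned above can be sketched as follows. The 2-second segment duration and the SegmentTemplate-style file names are illustrative assumptions (common DASH choices), not values mandated by the patent.

```python
SEGMENT_DURATION = 2.0  # seconds per segment (a common DASH choice)

def segment_frames(frame_times, seg_dur=SEGMENT_DURATION):
    """Group frame presentation times into consecutive time segments."""
    segments = {}
    for t in frame_times:
        index = int(t // seg_dur)          # which segment this frame lands in
        segments.setdefault(index, []).append(t)
    # Produce (segment_name, frame_count) pairs with template-style names.
    return [(f"video_seg_{i:05d}.m4s", len(frames))
            for i, frames in sorted(segments.items())]

frame_times = [n / 25.0 for n in range(125)]   # 5 s of video at 25 fps
listing = segment_frames(frame_times)
```

Because every video generator segments against the same timeline, segments with the same index from different partial views cover the same time interval, which is what lets a client combine them.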
- At least one video generator comprises:
- At least one spectator client configured for generating the video stream for the allocated partial view
- control unit configured for controlling the transforming of the time indications of the scene information stream into timestamps
- At least one encoder configured for encoding the video stream
- At least one multiplexer for assigning the timestamps to the video stream.
- the control unit of the video generator can carry out the controlling of the transforming of the time indications.
- the control unit of the video generator may not, or not only, control the transforming of time indications carried out by other units, but may carry out the transforming itself. That is, in some embodiments the control unit of a video generator can be configured for transforming the time indications of the scene information stream into timestamps itself.
- the at least one multiplexer of each video generator may be configured for segmenting the video stream into time segments. This allows time-segmented video streams to be produced (that is, generated).
- At least one video generator comprises:
- At least one streaming client configured for generating the video stream for the allocated partial view
- control unit configured for controlling the transforming of the time indications of the scene information stream into timestamps
- At least one encoder configured for encoding the video stream
- At least one multiplexer for assigning the timestamps to the video stream.
- the spatial manifest data file may advantageously be used by the spectator client of the video generator, preferably in addition to its regular end user use.
- the time indications may be transformed into timestamps in various ways. It is preferred, however, that each video generator is configured for linearly transforming the time indications into timestamps.
- a linear transform has the benefits of efficiency and simplicity.
- a special embodiment of a linear transform is the identity transform, which may be used in some embodiments.
- a linear transform of the type y = a·x + b, with both a and b non-zero, is preferred.
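A minimal sketch of this preferred linear transform, mapping a scene time indication x to a timestamp y. The concrete values of a (a 90 kHz presentation clock rate) and b (a fixed offset) are illustrative assumptions.

```python
A = 90_000  # e.g. ticks per second of an MPEG 90 kHz presentation clock (assumed)
B = 63_000  # fixed non-zero offset (assumed)

def to_timestamp(time_indication_s: float) -> int:
    """Linearly transform a time indication (in seconds) into a timestamp."""
    return int(A * time_indication_s + B)

# The identity transform is the special case a = 1, b = 0:
identity = lambda x: 1 * x + 0
```

Applying the same a and b in every video generator is what makes the resulting timestamps comparable across streams, and hence usable for synchronized rendering.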
- the video coordinator may comprise:
- a rendering coordinator configured for coordinating the partial views of the video
- an encoding coordinator configured for coordinating settings used for the encodings
- a segmentation coordinator configured for coordinating the segmenting into time segments of the video streams and preferably also for producing manifest files describing relationships between the video streams
- a video coordinator control unit configured for controlling the rendering coordinator, the encoding coordinator and the segmentation coordinator.
- the video coordinator controls the video generators mentioned above.
- the apparatus may further comprise a scene information server configured for producing the scene information stream and a timer configured for supplying the time indications to the scene information server.
- the invention yet further provides a system for supplying multiple video streams, the system comprising:
- the at least one video streaming client is not constituted by a separate unit, but by the video streaming client of the video generator (recursive embodiment).
- the at least one video streaming client is preferably configured for retrieving video streams, preferably using a manifest file according to the MPEG DASH SRD standard format. It will be understood that the at least one video streaming client may additionally, or alternatively, be configured for retrieving a video stream according to another standard.
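A manifest per the MPEG DASH SRD scheme (ISO/IEC 23009-1) describes each partial view's position within the scene with a comma-separated property value. The sketch below builds such a value string; the scheme URI is from the standard, while the grid dimensions are assumptions.

```python
SRD_SCHEME = "urn:mpeg:dash:srd:2014"

def srd_value(source_id, x, y, w, h, total_w, total_h):
    """Build the comma-separated SRD value string for one partial view."""
    return f"{source_id},{x},{y},{w},{h},{total_w},{total_h}"

# Partial view at grid position (column 3, row 1) of an assumed 8x7 tile grid.
value = srd_value(source_id=0, x=3, y=1, w=1, h=1, total_w=8, total_h=7)
prop = f'<SupplementalProperty schemeIdUri="{SRD_SCHEME}" value="{value}"/>'
```

A streaming client can read these properties to learn which adaptation sets are spatial neighbors, and request exactly those needed for the user's desired view.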
- the apparatus of the system is configured for transmitting selection information indicative of the partial views to the scene information server, and the scene information server is configured for transmitting partial scene information streams to the respective video generators.
- the scene information server is configured for transmitting partial scene information streams to the respective video generators.
- the system described above may further comprise at least one display device connected to the at least one video streaming client.
- the at least one display device may be a separate device, or may be integral with at least one video streaming client.
- video streaming client may refer to a device comprising a processor, a memory, and a receiver for receiving and/or transmitting a video stream according to the invention, as the case may be.
- Non-limiting examples of such devices are tablets, smartphones, television sets, game consoles and set-top boxes, as will readily be understood by those skilled in the art.
- Fig. 1 schematically shows a system for supplying multiple video streams in accordance with the invention.
- Fig. 2 schematically shows a matrix of partial views which together constitute a composite view of a scene as used in the invention.
- FIG. 3 schematically shows an exemplary embodiment of a video generator in accordance with the present invention.
- Fig. 4 schematically shows an exemplary embodiment of a video coordinator in accordance with the present invention.
- Fig. 5 schematically shows an alternative embodiment of a video generator in accordance with the present invention.
- Fig. 6 schematically shows an exemplary embodiment of a method in accordance with the invention.
- Fig. 7 schematically shows a transformation of time indications into timestamps in accordance with the invention.
- Fig. 8 schematically shows an exemplary embodiment of communication between a video coordinator and video generators in accordance with the present invention.
- Fig. 9 schematically shows an exemplary embodiment of communication between a scene information server, a video generator and a CDN ingest node in accordance with the present invention.
- Fig. 10 schematically shows an exemplary embodiment of communication between a CDN delivery server, a recursive video generator and a CDN ingest node in accordance with the present invention.
- Fig. 11 schematically shows an exemplary embodiment of a computer program product in accordance with the invention.
- the present invention makes it possible to produce a plurality of video streams from a single scene, while providing timestamps in those video streams for synchronization. This allows a stable composite video image to be formed from two or more synchronized video streams. Spectators may therefore compose a desired video image from the available video streams.
- video streams relating to scenes can be easily scaled without requiring substantial amounts of hardware.
- the term scene may refer to an image of a source video.
- An exemplary embodiment of a system according to the invention is schematically illustrated in Fig. 1.
- the system shown merely by way of non-limiting example in Fig. 1 comprises an apparatus 10 for producing (that is, generating) multiple video streams, a communication network 5, video streaming clients 6 and display devices 7.
- the apparatus 10 is shown to comprise a scene information server 1, a timer 2, video generators 3 and a video coordinator 4.
- the scene information server 1 can be configured for producing a scene information stream relating to a scene, such as a scene of a computer game or a scene of a film.
- the scene information stream produced by the scene information server 1 can contain metadata describing a scene, in particular events of a scene, but will typically not contain image data. It can thus be considered a metadata stream, i.e. a stream containing only metadata.
- the scene information stream contains metadata descriptive of events of the scene, including one or more time indications, each time indication relating to an event in the scene.
- a timer 2 is present to provide a time reference for the time indications.
- the system clock of the server 1 may be used for this purpose.
- the timer 2 is an integral part of the scene information server 1.
- An event in a scene may for example relate to a door opening, a shot being fired or a body falling on the ground. Each event is provided with a time indication to allow synchronization at a later stage.
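An event record in a scene information stream might look like the sketch below: pure metadata plus a time indication, no image data. The field names and values are hypothetical, chosen only to mirror the examples in the text.

```python
from dataclasses import dataclass

@dataclass
class SceneEvent:
    time_indication: float   # scene/game time of the event, in seconds
    kind: str                # e.g. "door_open", "shot_fired", "body_fall"
    position: tuple          # where in the scene the event occurs

stream = [
    SceneEvent(12.40, "door_open", (10, 4)),
    SceneEvent(12.55, "shot_fired", (11, 4)),
    SceneEvent(13.10, "body_fall", (11, 5)),
]

# The time indications allow independently rendered views of these events to
# be synchronized at a later stage.
ordered = sorted(stream, key=lambda e: e.time_indication)
```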
- the scene information server 1 inserts the time indications in the scene information stream sent to the video generators 3.
- an event may be described as an action, user or computer generated, which causes a change, over time, in the video data associated with a scene. The action may thus be described using metadata and may be used to generate (computer generated) images forming a video stream (such as a networked video game).
- the apparatus 10 comprises at least two video generators.
- Each video generator 3 is configured for generating, on the basis of the metadata from the scene information stream output by the scene information server 1, a video stream for a partial view of the scene.
- Each partial view covers at least a part of the scene, while the partial views may together cover the entire scene. That is, each video generator 3 can be said to be configured for producing a partial video stream. This allows spectators to select one or more of the produced partial video streams for rendering.
- in addition to generating a video stream, a video generator has the task of allowing its video stream to be properly synchronized with the video streams of other video generators.
- a video generator is configured for transforming the time indications of the scene information stream into timestamps, assigning the timestamps to the respective video stream, and transmitting both the video stream and the timestamps assigned to it.
- the timestamps can be embedded in the video stream, thus ensuring that the timestamps are transmitted together with the video stream.
- the timestamps can be used to synchronize the rendering of the video streams and, as mentioned before, are derived from events in the scene information stream.
- the timestamps are derived from the time indications by a linear transformation, but the invention is not so limited.
- the video generators 3 are also configured for generating video frames of the allocated partial view by using the metadata of the scene information stream. That is, each video generator uses its part of the metadata of the scene information stream to generate video frames which can later be displayed.
- the video generators may be further configured for encoding those video frames into a video stream, thus producing a video stream from the video frames.
- the video generators 3 transmit the video streams, via the network 5, to video streaming clients 6 of the spectators.
- the video streaming clients 6 use the display devices 7 to display the partial views represented by the video streams.
- a video coordinator 4 is connected with the video generators 3 to coordinate the production (that is, the generation) of the video streams, for example by assigning partial views to the respective video generators. More in particular, the video coordinator can be configured for determining two or more partial views of the scene, each partial view covering at least part of the scene, and allocating a video generator to each partial view (or, conversely, to allocate a partial view to each video generator). In a typical embodiment, only a single video generator is allocated to each partial view in order to save resources.
- a scene which can be displayed using the present invention is schematically illustrated in Fig. 2.
- the scene 20 is shown to be divided into 56 partial views.
- the partial views all have the same size, but in other embodiments the sizes of the partial views may vary.
- Some partial views may cover, for example, a quarter of the scene, while others may cover only 1/56th of the scene, as shown in Fig. 2, or even less.
- the partial views have numbers relating to their coordinates in the rectangular grid.
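The coordinate-based numbering can be sketched as below. The 8×7 split of the 56 partial views and the row-major, 1-based numbering are assumptions for illustration; the text fixes only the total of 56.

```python
COLS, ROWS = 8, 7   # assumed grid: 8 columns x 7 rows = 56 partial views

def view_number(col: int, row: int) -> int:
    """Map grid coordinates to a partial-view number (row-major, from 1)."""
    return row * COLS + col + 1

def view_coords(number: int) -> tuple:
    """Inverse mapping: partial-view number back to (col, row)."""
    index = number - 1
    return (index % COLS, index // COLS)

total = COLS * ROWS
```

Such a deterministic numbering lets a client translate a desired screen region directly into the set of partial-view streams it must request.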
- the partial views of the scene can have different “camera positions” or viewpoints of virtual cameras.
- the viewpoint of all partial views is at an acute angle to the scene, for example 30 to 45 degrees, while the distance is "infinite".
- the scene coverage of the partial views is in typical embodiments different for each partial view.
- An exemplary embodiment of a video generator according to the invention is schematically illustrated in Fig. 3.
- the video generator 3 of Fig. 3 is shown to comprise a video generator control unit 31, a spectator client 32, an encoder 33, a multiplexer 34 and an optional wall clock 35.
- the spectator client 32 is configured for receiving at least part of the scene information stream, that is, at least the part relating to the partial view to which the video generator 3 was allocated.
- the video generator 3, and hence the spectator client 32, can receive the entire scene information stream, while in other embodiments the video generator 3 only receives a selected part of the scene information stream.
- the video generator 3, or another unit of the apparatus 10, is configured for transmitting selection information to the scene information server 1, which selection information allows the video generator 3 to receive only the part of the scene information stream that pertains to the partial view to which the video generator was allocated.
- the scene information server can be configured for transmitting partial scene information streams to the respective video generators.
- the spectator client 32 receives a (partial or whole) scene information stream and outputs video frames.
- the spectator client 32 outputs those video frames together with time indications to the encoder 33. These time indications were already present in the scene information stream and are in those embodiments passed on by the spectator client.
- the spectator client 32 outputs the video frames to the encoder 33 and the time indications to the control unit 31, where the time indications are transformed into timestamps, which are then output to the multiplexer 34.
- the encoder 33 receives the video frames from the spectator client 32 and outputs encoded video frames to the multiplexer 34.
- the multiplexer 34 multiplexes the encoded video frames into containers suitable for transmission and/or storage.
- the output of the multiplexer may comprise an MPEG transport stream, an RTP stream or an ISOBMFF (.mp4) type file, for example.
- the multiplexer 34 may be constituted by a segmenter (formatter, packager), which produces a stream consisting of files.
- the video may be generated in a number of different quality levels, and each of these quality levels may comprise a plurality of segments (which may also be referred to as chunks or fragments).
- the multiplexer 34 also transforms the time indications into timestamps and adds those timestamps to the containers. This transformation is schematically illustrated in Fig. 7. Suitable timestamps are so-called presentation timestamps, but the invention is not limited to presentation timestamps.
- the transformation of time indications into timestamps may for example be pipeline-based, API-based, delay-based, or based upon a combination of two or three of these techniques.
- the spectator client 32, the encoder 33 and the multiplexer 34 constitute a pipeline, for example a GStreamer pipeline (see http://gstreamer.freedesktop.org).
- the spectator client 32 provides raw frames (for example in the RGB or YUV format, RGB and YUV being well-known color spaces), each frame having a time indication which corresponds with the game time (assuming the scene is a computer game scene).
- the encoder 33 encodes the raw frames into encoded frames and tracks frame numbers, for example by using a counter.
- the multiplexer adds (presentation type or other) timestamps and puts the resulting data into containers.
- Special transformation functions may be added between the spectator client 32 and the encoder 33, and/or between the encoder 33 and the multiplexer 34, to transform the time indications into frame numbers, and to transform frame numbers into (presentation) timestamps.
- the video generator control unit 31 coordinates these transformation functions and controls their settings.
- the video generator control unit 31 is an application which controls the spectator client 32, the encoder 33 and the multiplexer 34 via an API (Application Programming Interface). As soon as the spectator client 32 has produced a new raw data frame, it informs the control unit 31. In response, the control unit 31 transforms the time indication of the raw frame into a frame number, and the frame is sent to the encoder 33. Similarly, the control unit 31 may transform the frame number into a timestamp when the encoded frame is transferred from the encoder 33 to the multiplexer 34.
- the control unit 31 controls (and/or knows) the processing time (that is, the delay) induced by the spectator client 32, the encoder 33 and the multiplexer 34.
- the control unit 31 receives the time indications (typically equal to the game time in game-based embodiments) from the spectator client 32 and determines the (presentation or other) timestamp to be inserted by the multiplexer 34.
- Other embodiments may combine features of at least two of the pipeline-based, the API-based and the delay-based embodiments.
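The delay-based variant described above can be sketched in a few lines. This is a hypothetical illustration, not the claimed implementation: the function name, the 90 kHz clock rate (common for MPEG presentation timestamps) and the parameters are assumptions not given in the description.

```python
# Hypothetical sketch of a delay-based transformation: a control unit that
# knows the fixed pipeline delay converts a game-time indication (seconds
# since game start) into a 90 kHz presentation timestamp. Names and the
# 90 kHz clock choice are assumptions for illustration.

MPEG_CLOCK_HZ = 90_000  # assumed timestamp clock rate

def time_indication_to_pts(game_time_s: float,
                           time_zero_s: float,
                           pipeline_delay_s: float) -> int:
    """Map a game-time indication to a presentation timestamp.

    game_time_s       -- time indication attached to the raw frame
    time_zero_s       -- game time that corresponds to timestamp 0
    pipeline_delay_s  -- known delay of spectator client + encoder + multiplexer
    """
    presentation_time = (game_time_s - time_zero_s) + pipeline_delay_s
    return round(presentation_time * MPEG_CLOCK_HZ)

# Two generators sharing the same time zero and delay produce identical
# timestamps for frames showing the same game instant, so their partial
# views can be rendered synchronously.
pts_a = time_indication_to_pts(12.50, 10.0, 0.2)
pts_b = time_indication_to_pts(12.50, 10.0, 0.2)
assert pts_a == pts_b
```

Pipeline-based and API-based variants differ only in *where* this calculation runs (inside the pipeline elements, or in the control unit via an API), not in the arithmetic itself.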
- the control unit 31 of the video generator 3 is configured for exchanging information with the video coordinator 4 (see Fig. 1 ).
- the information the control unit 31 receives from the video coordinator 4 is, for example, the identification of the partial view.
- the wall clock 35 which is present in this embodiment serves to provide time references for producing the timestamps.
- the spectator client 32 may be synchronized with the game clock, which ensures that the frame generation rate is synchronous with the game time. If the spectator client 32 is not synchronized with the game clock, a buffer may be used between the spectator client 32 and the encoder 33 to ensure that a frame is only sent to the encoder when the frame is complete.
- a video generator may include two or more spectator clients, thus being able to render multiple virtual camera views in parallel.
- each video generator may contain multiple encoders, thus being able to generate multiple resolutions, multiple quality levels and/or multiple bitrates.
- a video generator may be constituted by a hardware unit, such as a hardware unit comprising a microprocessor, or by a software unit such as a thread, a process or a virtual machine running on a computer host.
- a video coordinator may also be implemented in hardware and/or software.
- An exemplary embodiment of a video coordinator 4 is schematically illustrated in Fig. 4.
- the embodiment of Fig. 4 comprises a video coordinator control unit 41, a rendering coordinator 42, an encoding coordinator 43, a multiplexing coordinator 44, and an optional wall clock 45.
- the rendering coordinator 42 determines the positions of the virtual cameras defining the partial views, the directions of the virtual cameras and their fields of view. These parameters may be determined once, for example at the beginning of a game, but may in some embodiments be changed, for example during a game.
- the encoding coordinator 43 determines the settings used for the encoding in the encoder 33, while the multiplexing coordinator 44 produces manifest files (MFs) which describe the structure of the partial views (the so-called tiling).
- the multiplexing coordinator 44 describes for each partial view its name, virtual camera direction and position, timing, bitrate, video quality and/or other parameters.
- the manifest files are transmitted via the network (5 in Fig. 1 ).
- a manifest file may be a spatial representation description (SRD) as specified in amendment 2 of MPEG-DASH part 1 (ISO/IEC 23009-1).
- the video coordinator control unit 41 coordinates the rendering coordinator 42, the encoding coordinator 43, and the multiplexing coordinator 44.
- the control unit 41 may provide the multiplexing coordinator with the settings of the rendering coordinator and the encoding coordinator. It also communicates settings between video generators (3 in Fig. 1).
- the control unit 41 may also create or delete video generators. When the game action moves to another area, a new video generator for a new partial view may be created, while an existing video generator may be deleted when there are no spectators for its partial view.
- the wall clock 45 is, in a typical embodiment, synchronized with the wall clocks of the video generators. As a result, all processes within the video generators are synchronized, which avoids buffer overflows and underruns.
- An alternative embodiment of a video generator 3 is schematically illustrated in Fig. 5.
- the embodiment of Fig. 5 also comprises a video generator control unit 31, an encoder 33, a multiplexer 34 and an optional wall clock 35.
- instead of a spectator client, the embodiment of Fig. 5 comprises a streaming client 32'. This allows this embodiment to be recursive, as the output of a recursive video generator may be used as input for another recursive video generator. In other words, received video streams are decoded and then pass to the post-processing phase of stitching.
- the streaming client 32' of the (recursive) video generator 3 and the video streaming client 6 may be the same. This allows the resources required for providing video streams in accordance with the invention to be further reduced.
- the streaming client 32' selects a set of partial views, receives the video streams relating to those partial views, decodes these partial video streams, stitches the decoded partial video streams together and forwards the result to the encoder 33.
- This embodiment has the advantage that a streaming client requires less processing than a spectator client. Another advantage of this embodiment is that it may introduce less delay.
- the method 60 of Fig. 6 comprises an initial or start step 61 , in which the method is initiated.
- in step 62, at least two partial views of the scene are determined, each partial view covering at least part of the scene.
- in step 63, a video stream is generated from the scene information stream for each of the partial views.
- in step 64, the time indications of the scene information stream are transformed into timestamps configured for synchronously rendering the video streams.
- in step 65, the timestamps are assigned to the video streams, while in step 66 each video stream is transmitted together with its assigned timestamps. The method may end in step 66. It will be understood that the method 60 may be repeated as desired.
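The steps of method 60 can be sketched as follows. All names are illustrative assumptions; the generation and transmission steps are reduced to placeholders, and the 90 kHz timestamp unit is assumed, the point being that all streams share one timestamp base.

```python
# Minimal sketch of method 60: determine partial views (step 62), generate a
# video stream per view (step 63), transform time indications into shared
# timestamps (step 64), assign them (step 65) and transmit each stream with
# its timestamps (step 66). All functions/values are placeholders.

def run_method_60(scene_info_stream, num_views=4):
    # Step 62: determine at least two partial views of the scene.
    partial_views = [f"view_{i}" for i in range(num_views)]
    outputs = []
    for view in partial_views:
        # Step 63: generate a video stream per partial view (placeholder
        # frames carrying a time indication t).
        frames = [(t, f"{view}_frame_{t}") for t in range(3)]
        # Steps 64-65: transform time indications into timestamps shared by
        # all streams, so clients can render them synchronously.
        stream = [(t * 90_000, frame) for t, frame in frames]
        # Step 66: transmit (here: collect) each stream with its timestamps.
        outputs.append((view, stream))
    return outputs

streams = run_method_60(scene_info_stream=None, num_views=2)
# Every stream carries the same timestamp for the same scene instant.
assert streams[0][1][0][0] == streams[1][1][0][0]
```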
- Fig. 8 shows an example of communication between a video coordinator 4 and video generators 3.
- the video coordinator 4 determines how it wants the spectator video to be provided (for example how the spectator video is to be encoded and segmented, and how the partial views are to be configured). These values may be pre-configured, or set by a human operator.
- the video coordinator provides dedicated video generator instructions to each video generator. Each of the video generators starts the instructed video generating process (or processes) and confirms its successful start-up to the video coordinator by sending a message (labelled "200 OK" in the present example).
- Once the video coordinator 4 has received confirmation from all video generators 3, it publishes the manifest file (also referred to as spatial manifest data file). Users (that is, spectators) can now interactively watch the game.
- the video coordinator may push the video generator instructions via an HTTP GET message or a previously established websocket connection. The distribution may also be carried out in a pull-based manner, where video generators retrieve the video generator instructions from the video coordinator. Messaging and signalling protocols may be used as well, for example XMPP or SIP.
- the code below provides an exemplary embodiment of a video generator instruction.
- the video generator uses the video generator instruction to configure its spectator client, encoder and multiplexer (see Fig. 3) in order to generate segments for partial view A31 (see Fig. 2).
- the video generator instruction is formatted using XML, but a skilled person may use some other suitable formatting type, such as JSON or ASN.1.
- the exemplary video generator instruction has four elements:
- <timing_instruction>. This element provides generic instructions about timing. It has the following sub-elements: o <wall_clock_server_address>. This element provides the URL attribute of the wall clock server that the video generator can use to synchronise its wall clock.
- This element provides the UTC time attribute that is used as "time zero" for the rendering, encoding and segmentation processes in the video generators.
- This element provides the frame rate attribute of the video generated by the video generator expressed in frames-per-second (fps).
- This element provides instructions to the spectator client of the video generator. It has the following sub-elements:
- This element identifies the game instance with an id attribute. It is used to distinguish game instances to the Source TV Proxy.
- This element provides attributes of the virtual camera. Examples of such attributes are:
- Lens parameters, such as focal length and aperture.
- Field of view, for example 48 degrees horizontally and 36 degrees vertically.
- Sensor parameters, e.g. a sensor sensitivity of ISO 100.
- the element can also provide attributes about the projection that is used to map the virtual world, e.g. the Mercator projection.
- This element provides parameters about the virtual screen to which the spectator client renders its game data and events. It may include the following sub-elements:
- o <resolution>. The overall screen resolution for this spectator client, for example 7680 x 4320 pixels.
- This element identifies the section of the overall screen that should be rendered by the spectator client. In this embodiment it is a rectangular area whose top-left pixel has coordinates (5760,0) and whose resolution is 1920 x 1080.
- This element provides instructions to the encoder of the video generator. It has the following sub-elements (note that the video frame rate has already been provided as an attribute in the generic <timing_instruction> element):
- the codec and codec profile to be used.
- This element provides instructions to the multiplexer of the video generator. It has the following sub-elements: o <ingest_node>. This element provides instructions to ingest segments into a Content Delivery Network (CDN).
- This embodiment uses the File Transfer Protocol (FTP) as the ingest method; it provides the address of the FTP server, as well as a username and password.
- a skilled person may also use other CDN ingest methods, like HTTP or a websocket.
- o <file_name>. This element provides the information that the multiplexer needs to generate the file names of the segments (start value 0000, increment 1).
- the first generated segment is shoot_m_up_instance_qxw_A31_0000.mp4
- the second is shoot_m_up_instance_qxw_A31_0001.mp4, etcetera.
- the .mp4 extension indicates that an ISOBMFF container is to be used.
- the skilled person may use other extensions and their associated containers, for example .ts, .avi, .mov, .wmv or another. o <segment_parameters>. This element provides parameters of the segment. This embodiment provides an attribute on the duration of a segment, for example 2 seconds.
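The file-name derivation described for the <file_name> element can be sketched as follows; the helper function and its signature are assumptions, but the base name, start value and increment follow the example segments given above.

```python
# Sketch of the <file_name> behaviour: the multiplexer derives segment
# names from a base name, a start value of 0000 and an increment of 1.
# The base name below is the one from the example video generator
# instruction; the function itself is an illustrative assumption.

def segment_name(base: str, index: int, start: int = 0, increment: int = 1,
                 extension: str = "mp4") -> str:
    """Build the file name of the index-th generated segment."""
    number = start + index * increment
    return f"{base}_{number:04d}.{extension}"

base = "shoot_m_up_instance_qxw_A31"
assert segment_name(base, 0) == "shoot_m_up_instance_qxw_A31_0000.mp4"
assert segment_name(base, 1) == "shoot_m_up_instance_qxw_A31_0001.mp4"
```

Changing `extension` to e.g. `"ts"` would match the other container types mentioned above.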
- a virtual microphone could be placed and oriented in the virtual world, and audio parameters could be provided, including an audio sample rate, for example 96000 samples per second, and an audio codec, for example "mp4a.40.2".
- the audio could be provided separately, for example in a generic way for the whole virtual world or in a specific way to an identified partial view (which may also be referred to as video tile).
- the audio could also be integrated with the partial view and provided in the same .mp4 container as the video.
- Fig. 9 schematically shows an embodiment of the streaming inputs and outputs of a video generator when it is generating partial views.
- the video generator 3 is tuned to a broadcast of game data and events (e.g. the broadcast being the scene information stream, and the game data (e.g. time indication) and events being the metadata), for example provided by the Source TV Proxy, which may be a Source TV proxy according to the prior art.
- Alternative embodiments join a multicast of game data and events, or retrieve these via unicast. If the virtual world is very large, and the generated partial view represents only a small portion of the virtual world, optimizations can be made to retrieve only a limited (filtered) set of game data and events. For example, the spectator client could provide the coordinates of the area of the virtual world that it is interested in, and the Source TV Proxy would provide only those game data and events relevant to that area.
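The filtered retrieval described above could, for example, look like the sketch below; the event representation and field names are assumptions made purely for illustration.

```python
# Hypothetical sketch of the filtering optimization: the spectator client
# announces the rectangle of the virtual world it is interested in, and
# only game events positioned inside that rectangle are forwarded to it.
# The dict-based event format is an assumption, not part of the source.

def filter_events(events, area):
    """Keep only events whose (x, y) position lies inside `area`.

    area -- (x_min, y_min, x_max, y_max) in virtual-world coordinates
    """
    x_min, y_min, x_max, y_max = area
    return [e for e in events
            if x_min <= e["x"] <= x_max and y_min <= e["y"] <= y_max]

events = [{"id": 1, "x": 10, "y": 20}, {"id": 2, "x": 500, "y": 20}]
visible = filter_events(events, area=(0, 0, 100, 100))
assert [e["id"] for e in visible] == [1]
```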
- the video generator 3 can render one or more frames of the partial view.
- the frames are encoded by the encoder and segmented by the multiplexer (see Fig. 3).
- the multiplexer then uploads a generated segment with the correct file name to the CDN Ingest Node 8 indicated in the video generator instruction. This process is repeated for each subsequent segment.
- the process illustrated in Fig. 9 can be coordinated by a video coordinator 4 (see Fig. 1 ).
- the code below provides an embodiment of a video generator instruction to a recursive video generator (see also Fig. 10).
- the video generator has a video streaming client to generate partial view B21.
- the video streaming client preferably uses a dedicated manifest file ("for internal use") to learn which partial views are available and what their properties are. It combines this information with the information from the video generator instruction. It then deduces that it needs to download partial views A31, A32, A41 and A42. It uses these four partial views to compose a single partial view and reduces the resolution as instructed.
- the newly generated partial view is encoded and segmented similarly to the previous embodiment.
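The deduction step above can be sketched as a rectangle-intersection test against SRD-style tile coordinates from the internal manifest. The tile coordinates and the region covered by partial view B21 below are illustrative assumptions (chosen so that the four tiles named above are selected), not values from an actual manifest.

```python
# Sketch of how a recursive video generator could deduce which source
# tiles it needs: intersect the target region (partial view B21) with each
# tile's rectangle. All coordinate values are illustrative assumptions.

def overlaps(a, b):
    """True if rectangles a and b, given as (x, y, w, h), intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

tiles = {  # name -> (x, y, width, height) in overall-screen pixels
    "A31": (3840, 0, 1920, 1080), "A32": (5760, 0, 1920, 1080),
    "A41": (3840, 1080, 1920, 1080), "A42": (5760, 1080, 1920, 1080),
    "A11": (0, 0, 1920, 1080),
}
target_b21 = (3840, 0, 3840, 2160)  # assumed region covered by B21
needed = sorted(n for n, r in tiles.items() if overlaps(r, target_b21))
assert needed == ["A31", "A32", "A41", "A42"]
```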
- This element provides instructions to the video streaming client of the recursive video generator. It has the following sub-elements:
- This element provides the URL attribute to retrieve the manifest file. o <screen>. Similar to the previous embodiment.
- the output audio could be a weighted average of the input audio from the composing partial views, a selected subset of those (zero, one or more), or a newly created downmix of the different available sound channels.
- Fig. 10 schematically shows an embodiment of the streaming inputs and outputs of a recursive video generator (see Fig. 5) when it is generating partial views.
- the recursive video generator 3 retrieves the manifest file (MPD) from a server, e.g. a CDN Delivery Node 9, using the URL provided in the video generation instruction.
- in alternative embodiments, the manifest file is retrieved from the video coordinator (4 in Fig. 1), pushed by the video coordinator to the video generation client, or provided as part of the video generation instruction.
- once the manifest file is received, it is analysed using the video generation instruction.
- the analysis determines which segments should be retrieved, and these are retrieved from a server, for example a CDN Delivery Node, which is not necessarily the one that provided the manifest file.
- frames are rendered and encoded, and a segment of partial view B21 is generated and uploaded to the CDN Ingest Node 8. This process is repeated for each subsequent output segment.
- the process illustrated in Fig. 10 can be coordinated by a video coordinator 4 (see Fig. 1 ).
- the video coordinator publishes the manifest file to the users (that is, the spectators).
- This publication could be, for example, the publication on a website of a hyperlink pointing to a location where the manifest file can be retrieved.
- the publication could also be the pushing of the hyperlink or manifest file to the subscribed user devices.
- a user device may have a video streaming client that parses the manifest file and starts retrieving the relevant segments, depending on the navigation by the user (spectator) through the virtual world.
- the code below provides an embodiment of a manifest file (partial) as provided to the user, and may be used by the video streaming clients 6 shown in Fig. 1.
- this manifest file is a standards-compliant MPD, following the MPEG-DASH SRD standard (ISO/IEC 23009-1:2014 Amd. 2) and the associated MPEG-DASH standard (ISO/IEC 23009-1:2014).
- the manifest file provides, among others, the following elements:
- the exemplary manifest file has a single period that lasts 7200 seconds (2 hours).
- a skilled person may include multiple periods, for example for advertisement insertion.
- o source_id 1. This identifies the (camera) source. This embodiment has only a single camera. Multiple cameras could be identified with multiple source_id values.
- o object_x 3840. This is the horizontal (x) coordinate of the top-left pixel.
- o object_y 0. This is the vertical (y) coordinate of the top-left pixel.
- the present AdaptationSet has only a single representation. A skilled person may include multiple representations, such that an adaptive streaming client (HAS client) can switch to a lower-bandwidth version of the stream when needed.
- the element has the following attributes:
- o bandwidth 5000000. Bandwidth expressed in bits per second (5 megabits per second).
- o width 1920. Width expressed in pixels.
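The adaptive switching mentioned above (a HAS client falling back to a lower-bandwidth representation when needed) can be sketched with the bandwidth attribute alone. The selection policy and the representation list below are assumptions for illustration, not part of the MPEG-DASH specification text.

```python
# Sketch of HAS-style representation selection: pick the highest-bandwidth
# Representation (bandwidth attribute in bits per second) that fits the
# measured throughput, falling back to the lowest one if none fits.
# The data layout and policy are illustrative assumptions.

def pick_representation(representations, throughput_bps):
    """Return the best representation that fits, else the lowest one."""
    fitting = [r for r in representations if r["bandwidth"] <= throughput_bps]
    pool = fitting or [min(representations, key=lambda r: r["bandwidth"])]
    return max(pool, key=lambda r: r["bandwidth"])

reps = [{"id": "low", "bandwidth": 1_000_000, "width": 960},
        {"id": "high", "bandwidth": 5_000_000, "width": 1920}]
assert pick_representation(reps, 6_000_000)["id"] == "high"
assert pick_representation(reps, 2_000_000)["id"] == "low"
```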
- Fig. 11 schematically shows a software program product 110 which contains instructions allowing a processor to carry out embodiments of the method of the invention.
- the software program product 110 may contain a tangible carrier, such as a DVD, on which the instructions are stored.
- An alternative carrier is a portable semiconductor memory, such as a so-called USB stick.
- alternatively, the carrier may be constituted by a remote server from which the software program product may be downloaded, for example via the internet.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15202816 | 2015-12-28 | ||
PCT/EP2016/082694 WO2017114821A1 (en) | 2015-12-28 | 2016-12-27 | Video streams |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3398346A1 true EP3398346A1 (en) | 2018-11-07 |
Family
ID=55085487
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP16816318.6A Withdrawn EP3398346A1 (en) | 2015-12-28 | 2016-12-27 | Video streams |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP3398346A1 (en) |
WO (1) | WO2017114821A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3442240A1 (en) * | 2017-08-10 | 2019-02-13 | Nagravision S.A. | Extended scene view |
KR102559966B1 (en) * | 2018-05-25 | 2023-07-26 | 라인플러스 주식회사 | Method and system for transmitting and reproducing video of dynamic bit rate using a plurality of channels |
EP3831075A1 (en) * | 2018-07-30 | 2021-06-09 | Koninklijke KPN N.V. | Generating composite video stream for display in vr |
US11924442B2 (en) | 2018-11-20 | 2024-03-05 | Koninklijke Kpn N.V. | Generating and displaying a video stream by omitting or replacing an occluded part |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9843844B2 (en) * | 2011-10-05 | 2017-12-12 | Qualcomm Incorporated | Network streaming of media data |
WO2014057131A1 (en) * | 2012-10-12 | 2014-04-17 | Canon Kabushiki Kaisha | Method and corresponding device for streaming video data |
GB2513139A (en) * | 2013-04-16 | 2014-10-22 | Canon Kk | Method and corresponding device for streaming video data |
GB2516825B (en) * | 2013-07-23 | 2015-11-25 | Canon Kk | Method, device, and computer program for encapsulating partitioned timed media data using a generic signaling for coding dependencies |
CN106233745B (en) * | 2013-07-29 | 2021-01-15 | 皇家Kpn公司 | Providing tile video streams to clients |
- 2016-12-27 WO PCT/EP2016/082694 patent/WO2017114821A1/en active Application Filing
- 2016-12-27 EP EP16816318.6A patent/EP3398346A1/en not_active Withdrawn
Also Published As
Publication number | Publication date |
---|---|
WO2017114821A1 (en) | 2017-07-06 |