WO2012021174A2 - EXPERIENCE OR "SENTIO" CODECS, AND METHODS AND SYSTEMS FOR IMPROVING QoE AND ENCODING BASED ON QoE EXPERIENCES - Google Patents

EXPERIENCE OR "SENTIO" CODECS, AND METHODS AND SYSTEMS FOR IMPROVING QoE AND ENCODING BASED ON QoE EXPERIENCES

Info

Publication number
WO2012021174A2
WO2012021174A2 PCT/US2011/001425 US2011001425W WO2012021174A2 WO 2012021174 A2 WO2012021174 A2 WO 2012021174A2 US 2011001425 W US2011001425 W US 2011001425W WO 2012021174 A2 WO2012021174 A2 WO 2012021174A2
Authority
WO
WIPO (PCT)
Prior art keywords
codec
sentio
experience
codecs
encoding
Prior art date
Application number
PCT/US2011/001425
Other languages
French (fr)
Other versions
WO2012021174A3 (en)
Inventor
Tara Lemmey
Stanislav Vonog
Surin Nikolay
Original Assignee
Net Power And Light Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Net Power And Light Inc. filed Critical Net Power And Light Inc.
Publication of WO2012021174A2 publication Critical patent/WO2012021174A2/en
Publication of WO2012021174A3 publication Critical patent/WO2012021174A3/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/23439 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/162 User input
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/164 Feedback from the receiver or from the transmission channel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234327 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24 Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
    • H04N21/2402 Monitoring of the downstream path of the transmission network, e.g. bandwidth available
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258 Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25866 Management of end-user data
    • H04N21/25891 Management of end-user data being end-user preferences

Definitions

  • the present teaching relates to experience or "sentio" codecs enabling adaptive encoding and transmission for heterogeneous data streams of different nature involving a variety of content and data types including video, audio, physical gesture, geo-location, voice input, synchronization events, computer-generated graphics, etc.
  • "Sentio" codec expands the existing concept of codecs to maximize final Quality of Service/Experience in a real-time, heterogeneous network, multi-device, social environment.
  • the present invention contemplates a variety of experience or "sentio" codecs, and methods and systems for enabling an experience platform, as well as a Quality of Experience (QoE) engine which allows the sentio codec to select a suitable encoding engine or device.
  • "Sentio" codec expands the existing concept of codec to work in real-time, heterogeneous network, multi-device, social environment to maximize final Quality of Service/Experience.
  • the sentio codec is capable of encoding and transmitting data streams that correspond to participant experiences with a variety of different dimensions and features.
  • the following description provides one paradigm for understanding the multi-dimensional experience available to the participants, and as implemented utilizing a sentio codec. There are many suitable ways of describing, characterizing and implementing the sentio codec and experience platform contemplated herein.
  • FIG. 1 is a block diagram of a sentio codec
  • FIG. 2 provides a screen shot useful for illustrating how a hybrid encoding scheme can be used to accomplish low-latency transmission
  • FIG. 3 is a block diagram of a "sentio" codec model that shows events, data streams, and the selection of different encoders based on device capabilities and network feedback.
  • FIG. 4 illustrates an exemplary selection of a particular codec based on device capabilities (screen size in the example) and network type (3G vs. Wi-Fi in the example).
  • FIG. 5 is a block diagram of a "sentio" codec model that shows events, data streams, and the selection of different encoders, applying specific group analysis services in a massive social environment.
  • FIG. 6 illustrates an ensemble of devices interacting and their output streamed to and displayed on a single display; and FIG. 7 illustrates an exemplary architecture of a simple operating system.
  • the term "sentio" is Latin roughly corresponding to perception or to perceive with one's senses, hence the original nomenclature "sensio codec"
  • The primary goal of a video codec is to achieve the maximum compression rate for digital video while maintaining high picture quality; audio codecs are similar. But video and audio codecs alone are insufficient to generate and capture a full experience, such as a real-time experience enabled by hybrid encoding and the encoding of other experience aspects such as gestures, emotions, etc.
  • Fig. 2 will now be described to provide an example experience showing four layers where video encoding alone is inadequate under constrained network connectivity conditions (low bandwidth, high packet loss or jitter, etc.).
  • a first layer is generated by Autodesk 3ds Max instantiated on a suitable layer source, such as on an experience server or a content server.
  • a second layer is an interactive frame around the 3ds Max layer, and in this example is generated on a client device by an experience agent.
  • a third layer is the black box in the bottom-left corner with the text "FPS" and "bandwidth", and is generated on the client device but pulls data by accessing a service engine available on the service platform.
  • a fourth layer is a red-green-yellow grid which demonstrates an aspect of region-detection code (e.g., different regions being selectively encoded) and is generated and computed on the service platform, and then merged with the 3ds Max layer on the experience server.
  • Figures 2, 3 and 4 illustrate how hybrid encoding approaches can be used to accomplish low-latency transmission.
  • the first layer provides an Autodesk 3ds Max image including a rotating teapot, the first layer including moving images, static or nearly static images, and graphic and/or text portions.
  • a video codec alone is inadequate to accomplish the hybrid encoding scheme covering video, pictures and commands. While it is theoretically possible to encode the entire first layer using only a video codec, latency and other issues can prohibit real-time and/or quality experiences. A low-latency protocol can solve this problem by efficiently encoding the data.
  • a multiplicity of video codecs can be used to improve encoding and transmission.
  • H.264 can be used if a hardware decoder is available, thus saving battery life and improving performance, or a better video codec (e.g., low latency) can be used if the device fails to support H.264.
  • the present teaching contemplates an experience or sentio codec capable of encoding and transmitting data streams that correspond to experiences with a variety of different dimensions and features.
  • These dimensions include the known audio and video, but may further include any conceivable element of a participant experience, such as gestures, gestures plus voice commands, "game mechanics" (which can be used to boost QoE when current conditions, such as the network, would not otherwise allow it, e.g., by applying a sound distortion effect specific to a given experience when data loss occurs), and emotions (perhaps as detected via voice or facial expressions, various sensor data, microphone input, etc.).
  • virtual experiences can be encoded via the sentio codec.
  • virtual goods are evolved into virtual experiences. Virtual experiences expand upon limitations imposed by virtual goods by adding additional dimensions to the virtual goods.
  • User A transmits flowers as a virtual good to User B.
  • the transmission of the virtual flowers is enhanced by adding emotion by way of sound, for example.
  • the virtual flowers are also changed to a virtual experience when User B can do something with the flowers, for example User B can affect the flowers through any sort of motion or gesture.
  • User A can also transmit the virtual goods to User B by making a "throwing" gesture using a mobile device, so as to "toss" the virtual goods to User B.
  • the sentio codec improves the QoE to a consumer or experience participant on the device of their choice. This is accomplished through a variety of mechanisms, selected and implemented, possibly dynamically, based on the specific application and available resources.
  • the sentio codec encodes multi-dimensional data streams in real-time, adapting to network capability.
  • a QoE engine operating within the sentio codec makes decisions on how to use different available codecs.
  • the network stack can be implemented as hybrid, as described above, and in further detail with reference to Vonog et al.'s US Pat. App. 12/569,876.
  • the sentio codec can include 1) a variety of codecs for each segment of experience described above, 2) a hybrid network stack with network intelligence, 3) data about available devices, and 4) a QoE engine that makes decisions on how to encode. It will be appreciated that QoE is achieved through various strategies that work differently for each given experience (say a zombie karaoke game vs. live stadium rock concert experience), and adapt in real-time to the network and other available resources, know the devices involved and take advantages of various psychological tricks to conceal imperfections which inevitably arise, particularly when the provided experience is scaled for many participants and devices.
  • Fig. 1 illustrates a block diagram of one embodiment of a sentio codec 200.
  • the sentio codec 200 includes a plurality of codecs such as video codecs 202, audio codecs 204, graphic language codecs 206, sensor data codecs 208, and emotion codecs 210.
  • the sentio codec 200 further includes a quality of experience (QoE) decision engine 212 and a network engine 214.
  • the codecs, the QoE decision engine 212, and the network engine 214 work together to encode one or more data streams and transmit the encoded data according to a low-latency transfer protocol supporting the various encoded data types.
  • One suitable low-latency protocol and more details related to the network engine 214 can be found in Vonog et al.'s U.S. Pat. App. No. 12/569,876.
  • the sentio codec 200 can be designed to take all aspects of the experience platform into consideration when executing the transfer protocol.
  • the parameters and aspects include available network bandwidth, transmission device characteristics and receiving device characteristics.
  • the sentio codec 200 can be implemented to be responsive to commands from an experience composition engine or other outside entity to determine how to prioritize data for transmission.
  • audio is the most important component of an experience data stream.
  • a specific application may desire to emphasize video or gesture commands.
  • the sentio codec provides the capability of encoding data streams corresponding to many different senses or dimensions of an experience.
  • a device 12 may include a video camera capturing video images and audio from a participant.
  • the user image and audio data may be encoded and transmitted directly or, perhaps after some intermediate processing, via the experience composition engine 48, to the service platform 46 where one or a combination of the service engines can analyze the data stream to make a determination about an emotion of the participant. This emotion can then be encoded by the sentio codec and transmitted to the experience composition engine 48, which in turn can incorporate this into a dimension of the experience.
  • a participant gesture can be captured as a data stream, e.g. by a motion sensor or a camera on device 12, and then transmitted to the service platform 46, where the gesture can be interpreted, and transmitted to the experience composition engine 48 or directly back to one or more devices 12 for incorporation into a dimension of the experience.
  • the sentio codec delivers the best QoE to a consumer on the device of their choice through current network. This is accomplished through a variety of mechanisms, selected and implemented based on the specific application and available resources.
  • Additionally, the following description is related to a simple operating system, which follows generally the fundamental concepts discussed above with further distinctions.
  • a server communicates with a first device, wherein the first device can detect surrounding devices, and an application program is executable by the server, wherein the application program is controlled by the first device and the output of the application program is directed by the server to one of the devices detected by the first device.
  • a minimum set of requirements exists in order for the first device to detect and interact with other devices in the cloud computing environment.
  • a traditional operating system is inappropriate for such enablement because the device does not need full operating system capabilities. Instead, a plurality of codecs is sufficient to enable device interaction.
  • Figure 6 illustrates an ensemble of devices interacting and their output streamed to and displayed on a single display.
  • Multiple users having devices participate in an activity, for example watching live sports.
  • The video of the live sports is streamed as a layer (layer1) from a content delivery network and displayed to the users.
  • a user having device 1 can play the role of commentator and the audio from device 1 is streamed as a layer (layer2) and rendered to the users.
  • a user having device 2 can, for example, be drawing plays and the drawings are streamed as another layer and displayed to the users.
  • a user having device 3 can, for example, be typing up facts that are streamed as another layer and displayed to the users as a ticker tape.
  • the devices and users make an ensemble in that they have different roles and experiences together while participating in the same activity.
  • Figure 7 illustrates an exemplary architecture of a simple operating system.
  • a simple operating system includes input capabilities, output capabilities, a network stack, a device agent, a plurality of codecs, services routing and an optional user interface shell.
  • the simple operating system receives input including requests for services, and routes the requests for services to the appropriate available computing capabilities.
  • the simple operating system performs minimal input processing to decipher what services are being requested, only to determine where to route the request.
  • the device agent provides information regarding the location of best computing available for a particular request.
  • the simple operating system performs no input processing and automatically routes input for processing to another device or to the cloud.
  • the simple operating system routes requests for services to another device, to a server in the cloud, or to computing capability available locally on the device hosting the simple operating system.
  • the plurality of codecs maintain a network connection and can activate output capabilities.
  • the simple operating system does not include any local services. All requests are sent to the cloud for services.
  • a device hosting the simple operating system can also host a traditional operating system.
  • Services are defined at the API Layer of the platform. Services are categorized into Dimensions. Dimensions can be recombined into Layers. Layers form to make features in the user experience
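
By way of illustration only, the request routing attributed above to the simple operating system (minimal input processing, a device agent that reports where the best computing is available, and routing to the local device, another device, or the cloud) might be sketched as follows. The `Target` record, its `cost` score, and the `DeviceAgent.best_for` method are hypothetical names introduced for this sketch; the source does not specify how the device agent ranks computing locations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str            # e.g. "local", "nearby-device", "cloud"
    services: frozenset  # services this target can perform
    cost: float          # hypothetical ranking score; lower is better


class DeviceAgent:
    """Knows the available computing locations (assumed interface)."""

    def __init__(self, targets):
        self.targets = list(targets)

    def best_for(self, service):
        # Return the cheapest target offering the service, or None.
        candidates = [t for t in self.targets if service in t.services]
        return min(candidates, key=lambda t: t.cost) if candidates else None


def route(request, agent):
    """Inspect only the service name, then hand the request off."""
    service = request["service"]  # minimal input processing
    target = agent.best_for(service)
    if target is None:
        raise LookupError(f"no computing capability offers {service!r}")
    return target.name


if __name__ == "__main__":
    agent = DeviceAgent([
        Target("local", frozenset({"render"}), 1.0),
        Target("cloud", frozenset({"render", "transcode"}), 5.0),
    ])
    print(route({"service": "render"}, agent))
    print(route({"service": "transcode"}, agent))
```

In the variant where the simple operating system includes no local services, the same sketch applies with the "local" target simply omitted from the agent's list.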

Abstract

Certain embodiments teach a variety of experience or "sentio" codecs, and methods and systems for enabling an experience platform, as well as a Quality of Experience (QoE) engine which allows the sentio codec to select a suitable encoding engine or device. The sentio codec is capable of encoding and transmitting data streams that correspond to participant experiences with a variety of different dimensions and features. As will be appreciated, the following description provides one paradigm for understanding the multi-dimensional experience available to the participants, and as implemented utilizing a sentio codec. There are many suitable ways of describing, characterizing and implementing the sentio codec and experience platform contemplated herein.

Description

EXPERIENCE OR "SENTIO" CODECS, AND METHODS AND SYSTEMS FOR IMPROVING QoE AND ENCODING BASED ON QoE FOR EXPERIENCES
CLAIM OF PRIORITY
[0001] The present application claims priority to the following U.S. Provisional Applications: U.S. Provisional Patent Application No. 61/373,236, entitled "EXPERIENCE OR "SENTIO" CODECS, AND METHODS AND SYSTEMS FOR IMPROVING QoE AND ENCODING BASED ON QoE FOR EXPERIENCES," filed on 08/12/2010, and U.S. Provisional Patent Application No. 61/373,229, entitled "METHOD AND SYSTEM FOR A SIMPLE OPERATING SYSTEM AS AN EXPERIENCE CODEC," filed on 08/12/2010, both of which are incorporated in their entireties herein by this reference.
FIELD OF INVENTION
[0002] The present teaching relates to experience or "sentio" codecs enabling adaptive encoding and transmission for heterogeneous data streams of different nature involving a variety of content and data types including video, audio, physical gesture, geo-location, voice input, synchronization events, computer-generated graphics, etc. The "sentio" codec expands the existing concept of codecs to maximize final Quality of Service/Experience in a real-time, heterogeneous network, multi-device, social environment.
SUMMARY OF THE INVENTION
[0003] The present invention contemplates a variety of experience or "sentio" codecs, and methods and systems for enabling an experience platform, as well as a Quality of Experience (QoE) engine which allows the sentio codec to select a suitable encoding engine or device. "Sentio" codec expands the existing concept of codec to work in real-time, heterogeneous network, multi-device, social environment to maximize final Quality of Service/Experience.
[0004] As will be described in more detail below, the sentio codec is capable of encoding and transmitting data streams that correspond to participant experiences with a variety of different dimensions and features. As will be appreciated, the following description provides one paradigm for understanding the multi-dimensional experience available to the participants, and as implemented utilizing a sentio codec. There are many suitable ways of describing, characterizing and implementing the sentio codec and experience platform contemplated herein.
BRIEF DESCRIPTION OF DRAWINGS
[0005] These and other objects, features and characteristics of the present invention will become more apparent to those skilled in the art from a study of the following detailed description in conjunction with the appended claims and drawings, all of which form a part of this specification. In the drawings:
[0006] FIG. 1 is a block diagram of a sentio codec;
[0007] FIG. 2 provides a screen shot useful for illustrating how a hybrid encoding scheme can be used to accomplish low-latency transmission;
[0008] FIG. 3 is a block diagram of a "sentio" codec model that shows events, data streams, and the selection of different encoders based on device capabilities and network feedback;
[0009] FIG. 4 illustrates an exemplary selection of a particular codec based on device capabilities (screen size in the example) and network type (3G vs. Wi-Fi in the example);
[0010] FIG. 5 is a block diagram of a "sentio" codec model that shows events, data streams, and the selection of different encoders, applying specific group analysis services in a massive social environment;
[0011] FIG. 6 illustrates an ensemble of devices interacting and their output streamed to and displayed on a single display; and FIG. 7 illustrates an exemplary architecture of a simple operating system.
DETAILED DESCRIPTION OF THE INVENTION
[0012] The present invention contemplates a variety of experience or "sentio" codecs, and methods and systems for enabling an experience platform, as well as a Quality of Experience (QoE) engine which allows the sentio codec to select a suitable encoding engine or device. As will be described in more detail below, the sentio codec is capable of encoding and transmitting data streams that correspond to participant experiences with a variety of different dimensions and features. (The term "sentio" is Latin roughly corresponding to perception or to perceive with one's senses, hence the original nomenclature "sensio codec".)
[0013] The primary goal of a video codec is to achieve the maximum compression rate for digital video while maintaining high picture quality; audio codecs are similar. But video and audio codecs alone are insufficient to generate and capture a full experience, such as a real-time experience enabled by hybrid encoding and the encoding of other experience aspects such as gestures, emotions, etc.
[0014] Fig. 2 will now be described to provide an example experience showing four layers where video encoding alone is inadequate under constrained network connectivity conditions (low bandwidth, high packet loss or jitter, etc.). A first layer is generated by Autodesk 3ds Max instantiated on a suitable layer source, such as on an experience server or a content server. A second layer is an interactive frame around the 3ds Max layer, and in this example is generated on a client device by an experience agent. A third layer is the black box in the bottom-left corner with the text "FPS" and "bandwidth", and is generated on the client device but pulls data by accessing a service engine available on the service platform. A fourth layer is a red-green-yellow grid which demonstrates an aspect of region-detection code (e.g., different regions being selectively encoded) and is generated and computed on the service platform, and then merged with the 3ds Max layer on the experience server.
[0015] Figures 2, 3 and 4 illustrate how hybrid encoding approaches can be used to accomplish low-latency transmission. The first layer provides an Autodesk 3ds Max image including a rotating teapot, the first layer including moving images, static or nearly static images, and graphic and/or text portions. Rather than encoding all the information with a video encoder alone, a hybrid approach that encodes some regions with a video encoder, other regions with a picture encoder, and other portions as commands yields better transmission results, and can be optimized based on factors such as the state of the network and the capabilities of end devices. These different encoding regions are illustrated by the different coloring of the red-green-yellow grid of layer 4. One example of this low-latency protocol is described in more detail in Vonog et al.'s US Pat. App. 12/569,876, filed September 29, 2009, and incorporated herein by reference for all purposes including the low-latency protocol and related features such as the network engine and network stack arrangement.
[0016] FIG. 3 is a block diagram of a "sentio" codec model that shows events, data streams, and the selection of different encoders based on device capabilities and network feedback. FIG. 4 illustrates an exemplary selection of a particular codec based on device capabilities (screen size in the example) and network type (3G vs. Wi-Fi in the example). FIG. 5 is a block diagram of a "sentio" codec model that shows events, data streams, and the selection of different encoders, applying specific group analysis services in a massive social environment.
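
By way of illustration only, the region-by-region choice among a video encoder, a picture encoder, and commands described in paragraph [0015] might be sketched as follows. The `Region` fields, the `motion` score, and the 0.2 threshold are assumptions introduced for this sketch; they are not taken from the referenced low-latency protocol.

```python
from dataclasses import dataclass
from enum import Enum


class Encoder(Enum):
    VIDEO = "video"      # moving regions
    PICTURE = "picture"  # static or nearly static regions
    COMMAND = "command"  # graphics/text redrawn from commands


@dataclass
class Region:
    name: str
    motion: float            # assumed 0..1 share of pixels changed per frame
    is_vector: bool = False  # assumed flag: describable by drawing commands


def choose_encoder(region, motion_threshold=0.2):
    """Pick an encoder for one region (threshold is illustrative)."""
    if region.is_vector:
        return Encoder.COMMAND
    if region.motion >= motion_threshold:
        return Encoder.VIDEO
    return Encoder.PICTURE


def plan_frame(regions, motion_threshold=0.2):
    """Map each region name to the encoder chosen for it."""
    return {r.name: choose_encoder(r, motion_threshold) for r in regions}


if __name__ == "__main__":
    frame = [
        Region("teapot", 0.8),                   # rotating 3D object
        Region("toolbar", 0.01),                 # nearly static image
        Region("labels", 0.0, is_vector=True),   # text portions
    ]
    for name, encoder in plan_frame(frame).items():
        print(name, encoder.value)
```

A QoE engine could, for example, raise `motion_threshold` when bandwidth is constrained so that more regions fall back to the picture encoder; that tuning policy is likewise an assumption rather than something stated in the text.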
[0017] A video codec alone is inadequate to accomplish the hybrid encoding scheme covering video, pictures and commands. While it is theoretically possible to encode the entire first layer using only a video codec, latency and other issues can prohibit real-time and/or quality experiences. A low-latency protocol can solve this problem by efficiently encoding the data.
[0018] In another example, a multiplicity of video codecs can be used to improve encoding and transmission. For example, H.264 can be used if a hardware decoder is available, thus saving battery life and improving performance, or a better video codec (e.g., low latency) can be used if the device fails to support H.264.
[0019] As yet another example, consider the case of multiple media where an ability to take into account the nature of human perception would be beneficial. For example, assume we have video and audio information. If network quality degrades, it could be better to prioritize audio and allow the video to degrade. Doing so would require using psychoacoustics to improve the QoE.
[0020] Accordingly, the present teaching contemplates an experience or sentio codec capable of encoding and transmitting data streams that correspond to experiences with a variety of different dimensions and features. These dimensions include the known audio and video, but may further include any conceivable element of a participant experience, such as gestures, gestures plus voice commands, "game mechanics" (which can be used to boost QoE when current conditions, such as the network, would not otherwise allow it, e.g., by applying a sound distortion effect specific to a given experience when data loss occurs), and emotions (perhaps as detected via voice or facial expressions, various sensor data, microphone input, etc.).
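
As a minimal sketch of the codec choice in paragraph [0018], and under the assumption that a device can report which hardware decoders it offers, the selection might look like the following. The function name, the `hw_decoders` set, and the `"low_latency"` label are hypothetical; the text does not name the alternative codec.

```python
def select_video_codec(hw_decoders, fallback="low_latency"):
    """Prefer H.264 when the device decodes it in hardware.

    hw_decoders: set of codec names the device decodes in hardware
                 (an assumed capability report).
    fallback:    placeholder name for the alternative low-latency codec.
    """
    if "h264" in hw_decoders:
        # A hardware decoder saves battery life and improves performance.
        return "h264"
    return fallback


if __name__ == "__main__":
    print(select_video_codec({"h264", "aac"}))
    print(select_video_codec(set()))
```

Keeping the decision in one small function makes it straightforward to extend with further inputs, such as the screen size and network type mentioned for FIG. 4.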
[0021] It is also contemplated that virtual experiences can be encoded via the sentio codec. According to one embodiment, virtual goods are evolved into virtual experiences. Virtual experiences expand upon limitations imposed by virtual goods by adding additional dimensions to the virtual goods. By way of example, User A transmits flowers as a virtual good to User B. The transmission of the virtual flowers is enhanced by adding emotion by way of sound, for example. The virtual flowers are also changed to a virtual experience when User B can do something with the flowers, for example User B can affect the flowers through any sort of motion or gesture. User A can also transmit the virtual goods to User B by making a "throwing" gesture using a mobile device, so as to "toss" the virtual goods to User B.
[0022] The sentio codec improves the QoE to a consumer or experience participant on the device of their choice. This is accomplished through a variety of mechanisms, selected and implemented, possibly dynamically, based on the specific application and available resources. In certain embodiments, the sentio codec encodes multi-dimensional data streams in real-time, adapting to network capability. A QoE engine operating within the sentio codec makes decisions on how to use different available codecs. The network stack can be implemented as hybrid, as described above, and in further detail with reference to Vonog et al.'s US Pat. App. 12/569,876.
[0023] The sentio codec can include 1) a variety of codecs for each segment of experience described above, 2) a hybrid network stack with network intelligence, 3) data about available devices, and 4) a QoE engine that makes decisions on how to encode. It will be appreciated that QoE is achieved through various strategies that work differently for each given experience (say a zombie karaoke game vs. a live stadium rock concert experience), adapt in real-time to the network and other available resources, know the devices involved, and take advantage of various psychological tricks to conceal imperfections which inevitably arise, particularly when the provided experience is scaled for many participants and devices.
[0024] Fig. 1 illustrates a block diagram of one embodiment of a sentio codec 200. The sentio codec 200 includes a plurality of codecs such as video codecs 202, audio codecs 204, graphic language codecs 206, sensor data codecs 208, and emotion codecs 210. The sentio codec 200 further includes a quality of experience (QoE) decision engine 212 and a network engine 214. The codecs, the QoE decision engine 212, and the network engine 214 work together to encode one or more data streams and transmit the encoded data according to a low-latency transfer protocol supporting the various encoded data types. One suitable low-latency protocol and more details related to the network engine 214 can be found in Vonog et al.'s U.S. Pat. App. No. 12/569,876.
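The composition of Fig. 1 can be pictured with a short sketch. It is illustrative only: the class and method names (`SentioCodec`, `QoEDecisionEngine.pick`, `NetworkEngine.send`, `transmit`) are assumptions made for this example, `zlib.compress` merely stands in for a real audio codec, and the network engine records packets in memory rather than implementing the low-latency protocol of U.S. Pat. App. No. 12/569,876.

```python
import zlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Encoder = Callable[[bytes], bytes]


class QoEDecisionEngine:
    """Stand-in for element 212: decides which codec handles a stream."""

    def pick(self, kind: str, codecs: Dict[str, Encoder]) -> Encoder:
        if kind not in codecs:
            raise KeyError(f"no codec registered for stream kind {kind!r}")
        return codecs[kind]


class NetworkEngine:
    """Stand-in for element 214: records packets instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, bytes]] = []

    def send(self, kind: str, data: bytes) -> None:
        self.sent.append((kind, data))


@dataclass
class SentioCodec:
    """Bundles per-stream codecs with a QoE engine and a network engine."""
    codecs: Dict[str, Encoder]
    qoe: QoEDecisionEngine = field(default_factory=QoEDecisionEngine)
    network: NetworkEngine = field(default_factory=NetworkEngine)

    def transmit(self, kind: str, payload: bytes) -> None:
        encoder = self.qoe.pick(kind, self.codecs)   # decision step
        self.network.send(kind, encoder(payload))    # encode, then hand off


# Placeholder encoders: compression for audio, pass-through for emotion.
codec = SentioCodec(codecs={"audio": zlib.compress, "emotion": lambda b: b})
codec.transmit("emotion", b"joy")
codec.transmit("audio", b"\x00" * 64)
```

Keeping the decision step separate from the encoders mirrors the block diagram, in which the QoE decision engine 212 and the network engine 214 are distinct from the individual codecs 202 to 210.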
[0025] The sentio codec 200 can be designed to take all aspects of the experience platform into consideration when executing the transfer protocol. The parameters and aspects include available network bandwidth, transmission device characteristics and receiving device characteristics. Additionally, the sentio codec 200 can be implemented to be responsive to commands from an experience composition engine or other outside entity to determine how to prioritize data for transmission. In many applications, because of human response, audio is the most important component of an experience data stream. However, a specific application may desire to emphasize video or gesture commands.
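One way to picture the prioritization described in paragraph [0025] is a bandwidth planner that serves streams in priority order and accepts an override from an outside entity. This is a hedged sketch: the function `plan_transmission`, the kbps figures and the default ordering after audio are assumptions introduced for illustration; only the default emphasis on audio comes from the text.

```python
from typing import Dict, Optional, Sequence

# Assumption: audio first (per the text); the order of the rest is arbitrary.
DEFAULT_PRIORITY = ("audio", "gesture", "video")


def plan_transmission(
    demands_kbps: Dict[str, int],
    available_kbps: int,
    priority: Optional[Sequence[str]] = None,
) -> Dict[str, int]:
    """Grant bandwidth to each stream in priority order until it runs out.

    `priority` models a command from an experience composition engine or
    other outside entity; when omitted, the default order is used.
    """
    order = list(priority or DEFAULT_PRIORITY)
    # Streams not named in the priority list are served last.
    order += sorted(kind for kind in demands_kbps if kind not in order)
    remaining = max(available_kbps, 0)
    plan: Dict[str, int] = {}
    for kind in order:
        if kind not in demands_kbps:
            continue
        granted = min(demands_kbps[kind], remaining)
        plan[kind] = granted
        remaining -= granted
    return plan


demands = {"audio": 64, "video": 800, "gesture": 8}
print(plan_transmission(demands, 500))
print(plan_transmission(demands, 500, priority=("video", "audio")))
```

With the default order, video absorbs the shortfall when the network cannot carry everything; an application that emphasizes video passes its own order and audio degrades instead.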
[0026] The sentio codec provides the capability of encoding data streams corresponding to many different senses or dimensions of an experience. For example, a device 12 may include a video camera capturing video images and audio from a participant. The user image and audio data may be encoded and transmitted directly or, perhaps after some intermediate processing, via the experience composition engine 48, to the service platform 46 where one or a combination of the service engines can analyze the data stream to make a determination about an emotion of the participant. This emotion can then be encoded by the sentio codec and transmitted to the experience composition engine 48, which in turn can incorporate this into a dimension of the experience. Similarly a participant gesture can be captured as a data stream, e.g. by a motion sensor or a camera on device 12, and then transmitted to the service platform 46, where the gesture can be interpreted, and transmitted to the experience composition engine 48 or directly back to one or more devices 12 for incorporation into a dimension of the experience.
[0027] The sentio codec delivers the best QoE to a consumer on the device of their choice over the current network. This is accomplished through a variety of mechanisms, selected and implemented based on the specific application and available resources. In certain embodiments, the sentio codec encodes multi-dimensional data streams in real-time, adapting to network capability. A QoE engine operating within the sentio codec makes decisions on how to use different available codecs. The network stack can be implemented as hybrid, as described above, and in further detail with reference to Vonog et al.'s US Pat. App. 12/569,876.
[0028] Additionally, the following description is related to a simple operating system, which follows generally the fundamental concepts discussed above with further distinctions. In a cloud computing environment, a server communicates with a first device, wherein the first device can detect surrounding devices, and an application program is executable by the server, wherein the application program is controlled by the first device and the output of the application program is directed by the server to one of the devices detected by the first device.
[0029] According to one embodiment, a minimum set of requirements exists in order for the first device to detect and interact with other devices in the cloud computing environment. A traditional operating system is inappropriate for such enablement because the device does not need full operating system capabilities. Instead, a plurality of codecs is sufficient to enable device interaction.
[0030] Figure 6 illustrates an ensemble of devices interacting and their output streamed to and displayed on a single display. Multiple users having devices participate in an activity, for example watching live sports. The video of the live sports is streamed as a layer (layer1) from a content delivery network and displayed to the users. A user having device 1 can play the role of commentator and the audio from device 1 is streamed as a layer (layer2) and rendered to the users. A user having device 2 can, for example, be drawing plays and the drawings are streamed as another layer and displayed to the users. A user having device 3 can, for example, be typing up facts that are streamed as another layer and displayed to the users as a ticker tape. The devices and users make an ensemble in that they have different roles and experiences together while participating in the same activity.
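The ensemble of Figure 6 can be modeled as a list of layers, each tagged with its source and content type. The sketch below is illustrative only: the `Layer` class, the `render_plan` function and the names `layer3` and `layer4` are assumptions (the text names only layer1 and layer2).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Layer:
    name: str    # layer identifier
    source: str  # device or network that produces the layer
    kind: str    # "video", "audio", "drawing" or "text"


def render_plan(layers: List[Layer]) -> Dict[str, List[str]]:
    """Split layers into those shown on the display and those played as audio."""
    plan: Dict[str, List[str]] = {"display": [], "audio": []}
    for layer in layers:
        channel = "audio" if layer.kind == "audio" else "display"
        plan[channel].append(layer.name)
    return plan


ENSEMBLE = [
    Layer("layer1", "content delivery network", "video"),  # live sports
    Layer("layer2", "device 1", "audio"),                  # commentary
    Layer("layer3", "device 2", "drawing"),                # drawn plays
    Layer("layer4", "device 3", "text"),                   # ticker-tape facts
]
print(render_plan(ENSEMBLE))
```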
[0031] Figure 7 illustrates an exemplary architecture of a simple operating system. A simple operating system includes input capabilities, output capabilities, a network stack, a device agent, a plurality of codecs, services routing and an optional user interface shell. The simple operating system receives input including requests for services, and routes the requests for services to the appropriate available computing capabilities.
[0032] According to one embodiment, the simple operating system performs minimal input processing to decipher what services are being requested, only to determine where to route the request. The device agent provides information regarding the location of the best computing capability available for a particular request.
[0033] According to one embodiment, the simple operating system performs no input processing and automatically routes input for processing to another device or to the cloud.
[0034] According to one embodiment, the simple operating system routes requests for services to another device, to a server in the cloud, or to computing capability available locally on the device hosting the simple operating system.
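The routing behavior of paragraphs [0032] and [0034] can be sketched as a small dispatcher. This is a minimal sketch under stated assumptions: `Target`, `DeviceAgent`, `best_target` and `route` are hypothetical names, the request is modeled as a plain dictionary, and falling back to the cloud when nothing better is known is an assumption rather than something the text specifies.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Target(Enum):
    LOCAL = "local"
    PEER_DEVICE = "peer_device"
    CLOUD = "cloud"


class DeviceAgent:
    """Hypothetical device agent: knows where each service is best computed."""

    def __init__(self, best: Dict[str, Target]) -> None:
        self._best = dict(best)

    def best_target(self, service: str) -> Target:
        # Assumption: unknown services default to the cloud.
        return self._best.get(service, Target.CLOUD)


def route(
    request: Dict[str, str],
    agent: DeviceAgent,
    local_services: FrozenSet[str] = frozenset(),
) -> Target:
    """Decide where a request goes without processing its payload.

    Only the service name is read (the minimal input processing of the
    text); the payload itself is never inspected here.
    """
    service = request["service"]
    target = agent.best_target(service)
    # A device with no matching local service cannot serve the request itself.
    if target is Target.LOCAL and service not in local_services:
        return Target.CLOUD
    return target


agent = DeviceAgent({"render": Target.PEER_DEVICE, "decode": Target.LOCAL})
print(route({"service": "render", "payload": "..."}, agent))
print(route({"service": "decode", "payload": "..."}, agent, frozenset({"decode"})))
```

Calling `route` with an empty `local_services` set corresponds to the embodiment in which the simple operating system includes no local services, so no request is ever routed locally.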
[0035] According to one embodiment, the plurality of codecs maintain a network connection and can activate output capabilities.
[0036] According to one embodiment, the simple operating system does not include any local services. All requests are sent to the cloud for services.
[0037] According to one embodiment, a device hosting the simple operating system can also host a traditional operating system.
[0038] Services are defined at the API Layer of the platform. Services are categorized into Dimensions. Dimensions can be recombined into Layers. Layers combine to form features in the user experience.
[0039] In addition to the above mentioned examples, various other modifications and alterations of the invention may be made without departing from the invention. Accordingly, the above disclosure is not to be considered as limiting and the appended claims are to be interpreted as encompassing the true spirit and the entire scope of the invention.

Claims

In the claims: I claim
1. A sentio codec for encoding and decoding a plurality of multi-dimensional data streams for a multi-dimensional experience, the sentio codec comprising:
a plurality of codecs suitable for encoding and decoding multi-dimensional experience data streams related to a multi-dimensional experience;
a QoE decision engine;
a network engine;
wherein the sentio codec implements a low-latency transfer protocol suitable for enabling a multi-dimensional experience.
2. A sentio codec as recited in claim 1, wherein the plurality of codecs includes an audio codec and a video codec.
3. A sentio codec as recited in claim 2, wherein the plurality of codecs further includes a gesture command codec.
4. A sentio codec as recited in claim 2, wherein the plurality of codecs further includes a sensor data codec.
5. A sentio codec as recited in claim 2, wherein the plurality of codecs further includes an emotion data codec.
6. A sentio codec as recited in claim 1, wherein the sentio codec takes available network bandwidth into account when encoding data streams.
7. A sentio codec as recited in claim 1, wherein the sentio codec takes a characteristic of an intended recipient device into account when encoding data streams.
8. A sentio codec as recited in claim 1, wherein the sentio codec takes a characteristic of a transmission device into account when encoding data streams.
9. A sentio codec as recited in claim 1, wherein the sentio codec takes a characteristic of a participant experience into account when encoding data streams.
PCT/US2011/001425 2010-08-12 2011-08-12 EXPERIENCE OR "SENTIO" CODECS, AND METHODS AND SYSTEMS FOR IMPROVING QoE AND ENCODING BASED ON QoE EXPERIENCES WO2012021174A2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US37322910P 2010-08-12 2010-08-12
US37323610P 2010-08-12 2010-08-12
US61/373,229 2010-08-12
US61/373,236 2010-08-12

Publications (2)

Publication Number Publication Date
WO2012021174A2 true WO2012021174A2 (en) 2012-02-16
WO2012021174A3 WO2012021174A3 (en) 2012-05-24

Family

ID=45568103

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/001425 WO2012021174A2 (en) 2010-08-12 2011-08-12 EXPERIENCE OR "SENTIO" CODECS, AND METHODS AND SYSTEMS FOR IMPROVING QoE AND ENCODING BASED ON QoE EXPERIENCES

Country Status (1)

Country Link
WO (1) WO2012021174A2 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002041121A2 (en) * 2000-10-20 2002-05-23 Wavexpress, Inc. Browser including multimedia tool overlay and method of providing a converged multimedia display including user-enhanced data
US7516255B1 (en) * 2005-03-30 2009-04-07 Teradici Corporation Method and apparatus for providing a low-latency connection between a data processor and a remote graphical user interface over a network
US20090183205A1 (en) * 2008-01-16 2009-07-16 Qualcomm Incorporated Intelligent client: multiple channel switching over a digital broadcast network
JP2010016662A (en) * 2008-07-04 2010-01-21 Kddi Corp Transmitter, method and program for controlling layer count of media stream

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9401937B1 (en) 2008-11-24 2016-07-26 Shindig, Inc. Systems and methods for facilitating communications amongst multiple users
US9661270B2 (en) 2008-11-24 2017-05-23 Shindig, Inc. Multiparty communications systems and methods that optimize communications based on mode and available bandwidth
US10542237B2 (en) 2008-11-24 2020-01-21 Shindig, Inc. Systems and methods for facilitating communications amongst multiple users
US9712579B2 (en) 2009-04-01 2017-07-18 Shindig. Inc. Systems and methods for creating and publishing customizable images from within online events
US9947366B2 (en) 2009-04-01 2018-04-17 Shindig, Inc. Group portraits composed using video chat systems
US9779708B2 (en) 2009-04-24 2017-10-03 Shinding, Inc. Networks of portable electronic devices that collectively generate sound
US10271010B2 (en) 2013-10-31 2019-04-23 Shindig, Inc. Systems and methods for controlling the display of content
US9952751B2 (en) 2014-04-17 2018-04-24 Shindig, Inc. Systems and methods for forming group communications within an online event
US9733333B2 (en) 2014-05-08 2017-08-15 Shindig, Inc. Systems and methods for monitoring participant attentiveness within events and group assortments
US9711181B2 (en) 2014-07-25 2017-07-18 Shindig. Inc. Systems and methods for creating, editing and publishing recorded videos
US9734410B2 (en) 2015-01-23 2017-08-15 Shindig, Inc. Systems and methods for analyzing facial expressions within an online classroom to gauge participant attentiveness
US10133916B2 (en) 2016-09-07 2018-11-20 Steven M. Gottlieb Image and identity validation in video chat events

Also Published As

Publication number Publication date
WO2012021174A3 (en) 2012-05-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11816715

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11816715

Country of ref document: EP

Kind code of ref document: A2