CN113141523A - Resource transmission method, device, terminal and storage medium - Google Patents

Resource transmission method, device, terminal and storage medium

Info

Publication number
CN113141523A (application number CN202010054775.2A)
Authority
CN
China
Prior art keywords
code rate
target
moment
frame
terminal
Prior art date
Legal status
Granted
Application number
CN202010054775.2A
Other languages
Chinese (zh)
Other versions
CN113141523B (en)
Inventor
Zhou Chao (周超)
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202010054775.2A priority Critical patent/CN113141523B/en
Priority to PCT/CN2020/133755 priority patent/WO2021143386A1/en
Priority to EP20913374.3A priority patent/EP3952316A4/en
Publication of CN113141523A publication Critical patent/CN113141523A/en
Priority to US17/519,459 priority patent/US11652864B2/en
Application granted granted Critical
Publication of CN113141523B publication Critical patent/CN113141523B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H04L65/80 Responding to QoS
    • H04N21/23805 Controlling the feeding rate to the network, e.g. by controlling the video pump
    • H04L47/38 Flow control; Congestion control by adapting coding or compression rate
    • H04L65/612 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
    • H04L65/613 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for the control of the source by the destination
    • H04L65/70 Media network packetisation
    • H04L65/762 Media network packet handling at the source
    • H04N21/23439 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
    • H04N21/2387 Stream processing in response to a playback request from an end-user, e.g. for trick-play
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H04N21/44004 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving video buffer management, e.g. video decoder buffer or video display buffer
    • H04N21/44209 Monitoring of downstream path of the transmission network originating from a server, e.g. bandwidth variations of a wireless network
    • H04N21/4621 Controlling the complexity of the content stream or additional data, e.g. lowering the resolution or bit-rate of the video stream for a mobile client with a small screen
    • H04N21/47217 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H04N21/64 Addressing
    • H04N21/64738 Monitoring network characteristics, e.g. bandwidth, congestion level
    • H04N21/6581 Reference data, e.g. a movie identifier for ordering a movie or a product identifier in a home shopping application
    • H04N21/6587 Control parameters, e.g. trick play commands, viewpoint selection
    • H04N21/8455 Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream
    • H04N21/8456 Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Abstract

The disclosure relates to a resource transmission method, a resource transmission device, a terminal and a storage medium, and belongs to the technical field of communication. The method includes: at any moment while a multimedia resource is played, determining a target code rate for that moment, the target code rate being the code rate that best matches the playing state at that moment; if the target code rate is not consistent with the current code rate, obtaining target address information of the multimedia resource at the target code rate; and sending a frame acquisition request carrying the target address information to a server, the frame acquisition request instructing the server to return media frames of the multimedia resource at the target code rate. Frame-level media streaming is thereby achieved without transmitting the multimedia resource in segments, which greatly reduces the delay of the resource transmission process, improves the real-time performance of resource transmission, and improves resource transmission efficiency.

Description

Resource transmission method, device, terminal and storage medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a resource transmission method, an apparatus, a terminal, and a storage medium.
Background
With the development of communication technology, users can browse audio and video resources on a terminal anytime and anywhere. Currently, when a server transmits audio and video resources to a terminal (a stage commonly referred to as "streaming"), a segment-based media transmission mode can be adopted.
Common segment-based media transmission modes include DASH (Dynamic Adaptive Streaming over HTTP, an HTTP-based adaptive streaming standard formulated by MPEG, the Moving Picture Experts Group) and HLS (HTTP Live Streaming, an HTTP-based adaptive streaming standard formulated by Apple Inc.). The server divides an audio/video resource into a series of audio/video segments, and each segment can be transcoded into different code rates. When the terminal plays the audio/video resource, it accesses the web addresses of the segments into which the resource is divided one by one; different segments may correspond to the same or different code rates, so the terminal can conveniently switch between versions of the audio/video resource at different code rates. This process is also called adaptive code rate adjustment based on the terminal's bandwidth condition.
In the above process, because resource transmission is performed in units of audio/video segments, the server always has to wait for a complete audio/video segment before it can transmit that segment to the terminal. The segment-based media transmission mode therefore has a high time delay; that is, the resource transmission process has high latency and poor real-time performance.
Disclosure of Invention
The present disclosure provides a resource transmission method, device, terminal and storage medium, to at least solve the problems of high delay and poor real-time performance of the resource transmission process in the related art. The technical solution of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a resource transmission method, including:
at any moment when the multimedia resources are played, determining a target code rate at the moment, wherein the target code rate is the code rate with the highest matching degree with the playing state at the moment;
if the target code rate is not consistent with the current code rate, target address information of the multimedia resource with the target code rate is obtained;
and sending a frame acquisition request carrying the target address information to a server, wherein the frame acquisition request is used for instructing the server to return media frames of the multimedia resource at the target code rate.
In a possible implementation manner, the moment is the start-of-playback moment of the multimedia resource; or, the moment is the moment at which downloading of any group of pictures in the multimedia resource finishes; or, the moment is the playing moment of any media frame in the multimedia resource.
In a possible implementation manner, the determining the target code rate for the time instant includes:
and determining the target code rate based on the bandwidth information of the moment and the media buffer amount of the moment.
In a possible implementation manner, the determining the target bitrate based on the bandwidth information at the time and the media buffer amount at the time includes:
determining a predicted buffer amount for at least one candidate code rate based on the bandwidth information and the media buffer amount, wherein the predicted buffer amount refers to the media buffer amount predicted for the moment at which downloading of the current group of pictures finishes, assuming downloading continues from this moment at the corresponding code rate;
determining the target code rate from the at least one candidate code rate or the current code rate based on the predicted buffer amount of the at least one candidate code rate.
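As an illustration of this buffer-prediction step, the following Python sketch estimates, for each candidate code rate, the media buffer expected once the current group of pictures has finished downloading, and keeps the highest candidate whose predicted buffer stays above a safety floor. The function names, the floor policy, and the numeric values are illustrative assumptions rather than the exact decision rule of the disclosure.

```python
# Minimal sketch of buffer-based code rate selection (illustrative only).

def predict_buffer(buffer_s: float, bandwidth_bps: float,
                   bitrate_bps: float, gop_remaining_s: float) -> float:
    """Predict the media buffer (seconds) at the moment the current group of
    pictures finishes downloading at `bitrate_bps`.

    Downloading `gop_remaining_s` seconds of media encoded at `bitrate_bps`
    over a link of `bandwidth_bps` takes gop_remaining_s * bitrate_bps /
    bandwidth_bps seconds; playback drains the buffer during that time while
    the newly downloaded media refills it.
    """
    download_time_s = gop_remaining_s * bitrate_bps / bandwidth_bps
    return max(0.0, buffer_s - download_time_s + gop_remaining_s)


def pick_target_bitrate(current_bps: int, candidates: list,
                        buffer_s: float, bandwidth_bps: float,
                        gop_remaining_s: float, floor_s: float = 2.0) -> int:
    """Choose the code rate that best matches the playing state: the highest
    candidate whose predicted buffer stays above `floor_s`, otherwise keep
    the current code rate (assumed policy)."""
    viable = [r for r in candidates
              if predict_buffer(buffer_s, bandwidth_bps, r, gop_remaining_s) >= floor_s]
    return max(viable) if viable else current_bps


if __name__ == "__main__":
    candidates = [500_000, 1_000_000, 2_500_000]   # candidate code rates, bps
    target = pick_target_bitrate(current_bps=1_000_000, candidates=candidates,
                                 buffer_s=4.0, bandwidth_bps=3_000_000,
                                 gop_remaining_s=2.0)
    print(f"target code rate: {target} bps")
```

In this sketch, a higher candidate is rejected as soon as downloading the remainder of the current group of pictures at that rate would drain the buffer below the floor.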
In a possible implementation manner, the determining the target code rate for the time instant includes:
and if the moment is the start-of-playback moment of the multimedia resource, determining the start-up code rate specified by the playing service or the default play code rate in the media description file as the target code rate.
In a possible implementation manner, before sending the frame acquisition request carrying the destination address information to the server, the method further includes:
if the target code rate is not consistent with the current code rate, determining target position information, wherein the target position information is used for indicating the starting pull position of the media frames of the multimedia resource;
and embedding the target position information into a frame acquisition request carrying the target address information.
In one possible embodiment, the determining the target location information includes:
if the moment is the start-of-playback moment of the multimedia resource, determining the target position information based on the buffering duration specified by the playing service; or,
if the moment is the moment at which downloading of any group of pictures in the multimedia resource finishes, determining the timestamp of the first frame in the group of pictures following that group as the target position information; or,
if the moment is the playing moment of any media frame in the multimedia resource, determining the timestamp of the first frame in the group of pictures currently being downloaded as the target position information.
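A minimal Python sketch of these three cases follows; the enum and field names, and the negative-offset convention used to mean "the latest N milliseconds of cached media", are assumptions for illustration, and only the three-way case split mirrors the text above.

```python
# Minimal sketch of choosing the starting pull position (illustrative only).

from dataclasses import dataclass
from enum import Enum, auto


class Moment(Enum):
    START_OF_PLAYBACK = auto()      # the start-of-playback moment
    GOP_DOWNLOAD_FINISHED = auto()  # a group of pictures just finished downloading
    FRAME_PLAYING = auto()          # an arbitrary media frame is being played


@dataclass
class Gop:
    first_frame_ts: int      # timestamp of the first frame of the GOP being downloaded (ms)
    next_gop_first_ts: int   # timestamp of the first frame of the following GOP (ms)


def target_position(moment: Moment, service_cache_ms: int, current_gop: Gop) -> int:
    """Return the target position information to embed in the frame acquisition request."""
    if moment is Moment.START_OF_PLAYBACK:
        # Derived from the buffering duration specified by the playing service;
        # a negative value is used here as an assumed shorthand for
        # "the latest service_cache_ms of cached media".
        return -service_cache_ms
    if moment is Moment.GOP_DOWNLOAD_FINISHED:
        return current_gop.next_gop_first_ts
    # A media frame is currently playing: pull from the first frame of the
    # group of pictures that is currently being downloaded.
    return current_gop.first_frame_ts


if __name__ == "__main__":
    gop = Gop(first_frame_ts=120_000, next_gop_first_ts=122_000)
    print(target_position(Moment.GOP_DOWNLOAD_FINISHED, 2_000, gop))  # 122000
```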
In a possible implementation manner, after determining the target bitrate at any time instant of playing the multimedia resource, the method further includes:
and if the target code rate is consistent with the current code rate, ignoring the target code rate and continuing to perform multimedia resource transmission at the current code rate.
In one possible embodiment, the method further comprises:
in the process of playing the multimedia resource, if the same media frame exists in the buffer at multiple code rates, playing the media frame with the highest of those code rates.
According to a second aspect of the embodiments of the present disclosure, there is provided a resource transmission apparatus, including:
the first determining unit is configured to determine a target code rate at any moment when the multimedia resource is played, wherein the target code rate is a code rate with the highest matching degree with the playing state at the moment;
the obtaining unit is configured to execute the step of obtaining target address information of the multimedia resource with the target code rate if the target code rate is inconsistent with the current code rate;
and the sending unit is configured to perform sending a frame acquisition request carrying the target address information to a server, wherein the frame acquisition request is used for instructing the server to return media frames of the multimedia resource at the target code rate.
In a possible implementation manner, the moment is the start-of-playback moment of the multimedia resource; or, the moment is the moment at which downloading of any group of pictures in the multimedia resource finishes; or, the moment is the playing moment of any media frame in the multimedia resource.
In one possible implementation, the first determining unit includes:
a determining subunit configured to perform determining the target bitrate based on the bandwidth information at the time and the media buffer amount at the time.
In one possible embodiment, the determining subunit is configured to perform:
determining a predicted buffer amount for at least one candidate code rate based on the bandwidth information and the media buffer amount, wherein the predicted buffer amount refers to the media buffer amount predicted for the moment at which downloading of the current group of pictures finishes, assuming downloading continues from this moment at the corresponding code rate;
determining the target code rate from the at least one candidate code rate or the current code rate based on the predicted buffer amount of the at least one candidate code rate.
In one possible embodiment, the first determining unit is configured to perform:
and if the moment is the start-of-playback moment of the multimedia resource, determining the start-up code rate specified by the playing service or the default play code rate in the media description file as the target code rate.
In one possible embodiment, the apparatus further comprises:
a second determining unit, configured to determine target location information if the target code rate is inconsistent with the current code rate, where the target location information is used to indicate an initial pull location of a media frame of the multimedia resource;
an embedding unit configured to perform embedding the target location information into a frame acquisition request carrying the target address information.
In one possible embodiment, the second determining unit is configured to perform:
if the moment is the start-of-playback moment of the multimedia resource, determining the target position information based on the buffering duration specified by the playing service; or,
if the moment is the moment at which downloading of any group of pictures in the multimedia resource finishes, determining the timestamp of the first frame in the group of pictures following that group as the target position information; or,
if the moment is the playing moment of any media frame in the multimedia resource, determining the timestamp of the first frame in the group of pictures currently being downloaded as the target position information.
In one possible embodiment, the apparatus further comprises:
and the transmission unit is configured to ignore the target code rate and continue to execute multimedia resource transmission at the current code rate if the target code rate is consistent with the current code rate.
In one possible embodiment, the apparatus further comprises:
and the playing unit is configured to play the media frame with the maximum code rate in the multiple code rates if the same media frame with the multiple code rates exists in the cache region in the process of playing the multimedia resource.
According to a third aspect of the embodiments of the present disclosure, there is provided a terminal, including:
one or more processors;
one or more memories for storing the one or more processor-executable instructions;
wherein the one or more processors are configured to perform the resource transmission method of any of the above first aspect and possible implementations of the first aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium, wherein at least one instruction of the storage medium, when executed by one or more processors of a terminal, enables the terminal to perform the resource transmission method of any one of the above first aspect and possible implementations of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising one or more instructions executable by one or more processors of a terminal to enable the terminal to perform the resource transmission method of any one of the above first aspect and possible implementations of the first aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
By determining the target code rate at any moment while the multimedia resource is played, the target code rate being the code rate that best matches the playing state at that moment, obtaining the target address information of the multimedia resource at the target code rate if the target code rate is not consistent with the current code rate, and sending a frame acquisition request carrying the target address information to the server, where the frame acquisition request instructs the server to return media frames of the multimedia resource at the target code rate, the multimedia resource does not need to be transmitted in segments and transmission does not have to wait for a complete resource segment to arrive. Frame-level media streaming can therefore be achieved, which greatly reduces the delay of the resource transmission process, improves the real-time performance of resource transmission, and improves resource transmission efficiency.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a schematic diagram of an implementation environment of a resource transmission method according to an example embodiment;
FIG. 2 is a schematic diagram of a FAS framework provided by embodiments of the present disclosure;
FIG. 3 is a flow diagram illustrating a method of resource transmission in accordance with an example embodiment;
FIG. 4 is an interaction flow diagram illustrating a method of resource transfer in accordance with an exemplary embodiment;
FIG. 5 is a block diagram illustrating a logical structure of a resource transfer device in accordance with an exemplary embodiment;
FIG. 6 is a block diagram of a terminal according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The user information to which the present disclosure relates may be information authorized by the user or sufficiently authorized by each party.
Hereinafter, terms related to the present disclosure are explained.
First, FLV (Flash Video)
FLV is a streaming media format, a video format that emerged with the release of Flash MX (an animation authoring tool). FLV files are extremely small and load very quickly, which makes it practical to watch video files online (i.e., browse video on the network). It effectively solves the problem that SWF files (a dedicated Flash file format) exported after importing video into Flash are too large to be used well on the network.
Second, Streaming Media
Streaming media uses streaming transmission: a series of multimedia resources are compressed and then sent over the network in resource packets, so that the multimedia resources can be watched in real time as they are transmitted; the technology lets the resource packets be sent like flowing water. Without it, the entire media file would have to be downloaded before use, so the multimedia resource could only be watched offline. Streaming can deliver live multimedia resources or multimedia resources pre-stored on the server; after the resources reach the viewer terminal of a viewer user, they are played by dedicated playing software while the viewer user is watching.
Third, FAS (FLV Adaptive Streaming, an FLV-based adaptive streaming standard)
FAS is a streaming resource transmission standard (or resource transmission protocol) proposed by the present disclosure. Unlike the traditional segment-based media transmission mode, under the FAS standard the terminal sends a frame acquisition request corresponding to a certain start-up code rate to the server at the start-of-playback moment, and the server, in response to the request, transmits the media frames of the multimedia resource to the terminal at the start-up code rate. From then on, transmission between the server and the terminal proceeds at frame level: the server does not need to wait for a complete video segment before sending a resource packet to the terminal, but can send media frames to the terminal frame by frame in real time. After receiving the media frames, the terminal buffers, decodes and renders them, thereby playing them on the terminal. If the code rate needs to be switched during playback, the terminal only needs to resend, at the moment the switch is required, a frame acquisition request corresponding to the code rate to be switched to; the server applies similar processing logic and transmits the media frames of the multimedia resource to the terminal at that code rate, thereby achieving dynamic code rate switching. The start-up code rate and the code rate to be switched to can both be regarded as instances of the target code rate.
In the above process, the terminal may specify target position information in the frame acquisition request to ensure that the media stream is pulled starting from that position. The terminal may also omit the target position information, in which case the server configures it with a default value and the terminal pulls the media stream from that default position. Optionally, the server may package all buffered media frames starting from the target position and send them to the terminal (without segmentation); and if the target position information is greater than or equal to zero, or a real-time stream exists in addition to the buffered media frames, the server may send the media frames of the multimedia resource buffered in real time to the terminal frame by frame.
It should be noted that the code rate to be switched to may be determined based on the terminal's own bandwidth information and media buffer amount. When the bandwidth information or the media buffer amount changes, the terminal can adaptively adjust the code rate to be switched to and resend the corresponding frame acquisition request, so that the code rate of the multimedia resource is adjusted adaptively and the playing state is adjusted to the optimum in time, ensuring the best playing effect and improving the user's viewing experience.
The FAS standard achieves frame-level transmission and reduces end-to-end delay, and it provides an adaptive code rate switching mechanism: a new frame acquisition request needs to be sent only when the code rate is switched, which greatly reduces the number of requests and the communication overhead of the resource transmission process.
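The request pattern described above can be sketched as one long-lived HTTP request per code rate whose body is read as a continuous FLV byte stream rather than as discrete segments. In the Python sketch below, the URL and the fasSpts query parameter used for the start position are assumptions for illustration; only the general frame-level pull pattern follows the FAS description in the text.

```python
# Minimal sketch of a FAS-style frame acquisition request (illustrative only).

from typing import Iterator, Optional

import requests  # third-party: pip install requests


def pull_media_stream(stream_url: str, start_position: Optional[int] = None,
                      chunk_size: int = 16 * 1024) -> Iterator[bytes]:
    """Open one frame acquisition request and yield the streamed FLV bytes.

    The response body is a continuous byte stream; media frames are parsed
    out of it by the demuxer/decoder downstream of this generator.
    """
    params = {}
    if start_position is not None:
        params["fasSpts"] = start_position   # assumed name of the position parameter
    with requests.get(stream_url, params=params, stream=True, timeout=10) as resp:
        resp.raise_for_status()
        for chunk in resp.iter_content(chunk_size=chunk_size):
            if chunk:
                yield chunk


if __name__ == "__main__":
    # Hypothetical address information taken from a media description file;
    # running this requires a reachable FAS-capable server.
    url = "https://example-cdn.com/live/stream_720p.flv"
    try:
        for data in pull_media_stream(url, start_position=-2000):
            print(f"received {len(data)} bytes")  # hand off to the media codec component
            break                                  # demo: stop after the first chunk
    except requests.RequestException as err:
        print(f"could not reach the example server: {err}")
```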
Fourth, live broadcast and video on demand
Live broadcast: the multimedia resource is recorded in real time. The anchor user pushes a media stream to the server through the anchor terminal (push based on streaming transmission); when a viewer user opens the anchor's live interface on the viewer terminal, the media stream is pulled from the server to the viewer terminal (pull based on streaming transmission), and the viewer terminal decodes and plays it, so the video is played in real time.
Video on demand: also called VOD (Video On Demand). The multimedia resources are pre-stored on the server, and the server provides the multimedia resource specified by a viewer user according to the user's requirements. Specifically, the viewer terminal sends an on-demand request to the server, and the server looks up the multimedia resource specified by the request and then sends it to the viewer terminal; in other words, the viewer user can selectively play a particular multimedia resource.
Intuitively, on-demand content can be played back at any pace, while live content cannot: the speed at which live content is played depends on the anchor user's real-time live progress.
Fig. 1 is a schematic diagram of an implementation environment of a resource transmission method according to an exemplary embodiment. Referring to Fig. 1, the implementation environment may include at least one terminal 101 and a server 102, which are described in detail below:
the at least one terminal 101 is configured to perform multimedia resource transmission, and each terminal may be installed with a media codec component configured to perform decoding of a multimedia resource after receiving the multimedia resource (e.g., a resource packet transmitted in a burst, a media frame transmitted at a frame level), and a media playing component configured to perform playing of the multimedia resource after decoding the multimedia resource.
According to the user's identity, the at least one terminal 101 may be divided into anchor terminals and viewer terminals: an anchor terminal corresponds to an anchor user, and a viewer terminal corresponds to a viewer user. It should be noted that the same terminal may be either an anchor terminal or a viewer terminal; for example, the terminal is an anchor terminal when the user is recording a live broadcast and a viewer terminal when the user is watching a live broadcast.
The at least one terminal 101 and the server 102 may be connected through a wired network or a wireless network.
The server 102 is configured to provide multimedia resources to be transmitted, and the server 102 may include at least one of a server, a plurality of servers, a cloud computing platform, or a virtualization center. Alternatively, the server 102 may undertake primary computational work and at least one terminal 101 may undertake secondary computational work; or, the server 102 undertakes the secondary computing work, and at least one terminal 101 undertakes the primary computing work; alternatively, at least one terminal 101 and the server 102 perform cooperative computing by using a distributed computing architecture.
In an exemplary scenario, the server 102 may be a clustered CDN (Content Delivery Network) server, where the CDN server includes a central platform and edge servers deployed in various regions, and through functional modules of the central platform, such as load balancing, Content Delivery, and scheduling, a terminal where a user is located can obtain required Content (i.e., multimedia resources) nearby by using a local edge server, so as to reduce Network congestion and improve response speed and hit rate of terminal access.
In other words, a cache mechanism is added between the terminal and the central platform by the CDN server, and the cache mechanism is an edge server (e.g., a WEB server) deployed in different geographic locations, and when performance is optimized, the central platform schedules the edge server closest to the terminal according to the distance between the terminal and the edge server to provide services to the terminal, so that content can be more effectively delivered to the terminal.
The multimedia resources related to the embodiments of the present disclosure include, but are not limited to: at least one of a video resource, an audio resource, an image resource, or a text resource, and the embodiment of the present disclosure does not specifically limit the type of the multimedia resource. For example, the multimedia resource is a live video stream of a webcast, or a historical on-demand video pre-stored on a server, or a live audio stream of a webcast, or a historical on-demand audio pre-stored on a server.
Optionally, the device type of each of the at least one terminal 101 includes but is not limited to: at least one of a television, a smartphone, a smart speaker, a vehicle-mounted terminal, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop portable computer, or a desktop computer. The following embodiments take a terminal including a smartphone as an example.
Those skilled in the art will appreciate that the number of the at least one terminal 101 may be only one, or the number of the at least one terminal 101 may be several tens or hundreds, or more. The number and the device type of the at least one terminal 101 are not limited by the embodiment of the present disclosure.
Fig. 2 is a schematic diagram of the FAS framework provided by an embodiment of the present disclosure. Referring to Fig. 2, an embodiment of the present disclosure provides a FAS (streaming-based multi-rate adaptive) framework in which multimedia resource transmission is performed between the at least one terminal 101 and the server 102 through the FAS protocol.
Taking any terminal as an example for illustration, an application (also referred to as an FAS client) may be installed on the terminal, and the application is used for browsing multimedia resources, for example, the application may be a short video application, a live broadcast application, a video on demand application, a social contact application, a shopping application, and the like, and the embodiment of the present disclosure does not specifically limit the type of the application.
The user may start the application on the terminal, and the terminal displays a resource push interface (e.g., the application's home page or a function interface) that includes thumbnail information of at least one multimedia resource; the thumbnail information includes at least one of a title, a brief introduction, a publisher, a poster, a trailer, or a highlight. In response to the user's touch operation on the thumbnail information of any multimedia resource, the terminal may jump from the resource push interface to a resource playing interface that includes a play option for that multimedia resource, and, in response to the user's touch operation on the play option, download the Media Presentation Description (MPD) file of the multimedia resource from the server.
Further, based on the media description file, the terminal determines target address information of the multimedia resource at a start-up code rate (or a default play code rate; that is, the target code rate at the start-of-playback moment) and sends a frame acquisition request (also called a FAS request) corresponding to that code rate to the server. The server processes the frame acquisition request according to a certain specification (the processing specification of the FAS request), locates the media frames of the multimedia resource (consecutive media frames form a media stream), and returns the media frames of the multimedia resource to the terminal at the start-up code rate (or the default play code rate), that is, returns the media stream to the terminal at the target code rate. After receiving the media stream, the terminal calls the media codec component to decode it and calls the media playing component to play the decoded media stream.
It should be noted that, after transcoding, the server may hold the multimedia resource at multiple code rates; it therefore assigns different address information to the multimedia resource at different code rates and records the address information for each code rate in the MPD. After downloading the MPD, the terminal can send frame acquisition requests carrying different address information to the server at different moments, so that the server returns the media frames of the corresponding multimedia resource at different code rates.
Further, a code rate adaptation mechanism is provided: as the terminal's current bandwidth information or media buffer amount fluctuates, the playing state of the multimedia resource on the terminal changes accordingly, and the terminal can adaptively adjust the code rate to be switched to (i.e., the target code rate during playback) so that it best matches the current playing state. Specifically, the FAS standard provides target position information in the frame acquisition request, and different target position information specifies different starting pull positions of the multimedia resource. Once the target position information (if omitted, the server configures a default value) and the code rate are specified in the frame acquisition request, the terminal only needs to send a new frame acquisition request when a code rate switch is required during playback, and the server can at any time send the media stream to the terminal from the target position at the other code rate. In other words, the terminal can dynamically pull a media stream of another code rate starting from any initial media frame, which guarantees the dynamic code rate switching mechanism of the FAS framework, that is, seamless switching of adaptive code rates.
In an exemplary scenario, when the code rate needs to be switched, the terminal may disconnect the media stream transmission link at the current code rate, send to the server a frame acquisition request carrying the target address information corresponding to the code rate to be switched to, and establish a media stream transmission link based on that code rate.
Under the FAS framework, frame-level media stream transmission starting from any media frame as the initial pull position can be achieved, and the server does not need to transmit the multimedia resource in segments, which greatly reduces the delay of the resource transmission process, improves the real-time performance of resource transmission, and improves resource transmission efficiency. In addition, a code rate adaptation mechanism is provided, so that seamless code rate switching can be performed adaptively while the multimedia resource is playing.
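The switch sequence in the exemplary scenario above, tearing down the link at the current code rate and then opening a new frame acquisition request at the code rate to switch to, might look like the following Python sketch. The StreamLink wrapper, its method names and the fasSpts parameter are assumptions for illustration; only the disconnect-then-request order follows the text.

```python
# Minimal sketch of switching the media stream link between code rates
# (illustrative only).

from typing import Optional

import requests  # third-party: pip install requests


class StreamLink:
    """Thin wrapper around one long-lived frame acquisition request."""

    def __init__(self, url: str, start_position: Optional[int] = None):
        params = {"fasSpts": start_position} if start_position is not None else {}
        self._resp = requests.get(url, params=params, stream=True, timeout=10)
        self._resp.raise_for_status()

    def frames(self, chunk_size: int = 16 * 1024):
        """Iterate over the streamed FLV bytes of this link."""
        return self._resp.iter_content(chunk_size=chunk_size)

    def close(self) -> None:
        """Disconnect this media stream transmission link."""
        self._resp.close()


def switch_bitrate(current_link: Optional[StreamLink], target_url: str,
                   start_position: Optional[int] = None) -> StreamLink:
    """Disconnect the current link, then establish a link at the target code rate."""
    if current_link is not None:
        current_link.close()
    return StreamLink(target_url, start_position)
```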
The MPD file in the FAS framework described above will be described below:
in some embodiments, the MPD file may include a version number (@ version) and a media description set (@ adaptationSet), and may further include at least one of a service type (@ type), a function option (@ hideouto) for indicating whether to open the adaptive function, or a function option (@ autoDefaultSelect) for indicating whether to open the adaptive function by default at the time of startup, and the content carried by the MPD file is not specifically limited by the embodiments of the present disclosure.
The version number may include at least one of the version number of the media description file or the version number of the resource transmission standard (the FAS standard).
The media description set is used to express meta-information of the multimedia resource and may include multiple pieces of media description meta-information, each corresponding to one code rate of the multimedia resource. Each piece of media description meta-information may include the group-of-pictures length (@gopData) and the attribute information (@representation) of the multimedia resource at the code rate corresponding to that meta-information.
Group of pictures (GOP) length refers to the distance between two key frames. A key frame is an intra-coded picture frame (also called an "I frame") in a video coding sequence; an I frame can be encoded and decoded using only its own information, without reference to other image frames. In contrast, a P frame (predictive-coded picture frame) and a B frame (bidirectionally predictive-coded picture frame) both need to reference other image frames and cannot be encoded or decoded from their own information alone.
Each piece of attribute information may include identification information of the multimedia resource (@id, a unique identifier), the encoding mode of the multimedia resource (@codec, the coding and decoding standard it complies with), a code rate supported by the multimedia resource (@bitrate, the number of data bits transmitted per unit time during resource transmission), and address information (@url, the URL or domain name through which the multimedia resource at a given code rate is provided externally; URL stands for Uniform Resource Locator). Each piece of attribute information may also include a quality type of the multimedia resource (@qualityType, covering quality evaluation indexes such as resolution and frame rate), a hidden option (@hidden, indicating whether the multimedia resource at a given code rate is exposed, i.e., whether the user can manually select the multimedia resource at that code rate), a function option (@enableAdaptive, indicating whether the multimedia resource is visible to the adaptive function, i.e., whether the adaptive function may select the multimedia resource at that code rate), or a default play option (@defaultSelect, indicating whether the multimedia resource at that code rate is played by default at start-up).
The service type is used for specifying the service type of the multimedia resource, and the service type comprises at least one of live broadcast or on-demand broadcast.
In some embodiments, the MPD file may be in JSON (JavaScript Object Notation) format or another script format; the embodiments of the present disclosure do not specifically limit the MPD file format.
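For illustration only, a media description file carrying the fields discussed above might look like the following Python dict mirroring a JSON layout. The field names are assumptions loosely modeled on the attributes listed in this section (with spellings normalized), and the URLs, codecs and values are invented placeholders, not an authoritative FAS MPD.

```python
# Illustrative sketch of an MPD-like media description file (assumed layout).

import json

mpd = {
    "version": "1.0",                 # version number of the media description file
    "type": "live",                   # service type: live or on-demand
    "hideAuto": False,                # whether the adaptive option is hidden
    "autoDefaultSelect": True,        # enable the adaptive function by default at start-up
    "adaptationSet": [
        {
            "gopData": 2000,          # group-of-pictures length, in milliseconds
            "representation": [
                {
                    "id": 1,
                    "codec": "avc1.64001e,mp4a.40.2",
                    "bitrate": 800_000,                                  # bps
                    "qualityType": "SD",
                    "url": "https://example-cdn.com/live/stream_sd.flv",
                    "hidden": False,
                    "enableAdaptive": True,
                    "defaultSelect": False,
                },
                {
                    "id": 2,
                    "codec": "avc1.64001f,mp4a.40.2",
                    "bitrate": 2_500_000,
                    "qualityType": "HD",
                    "url": "https://example-cdn.com/live/stream_hd.flv",
                    "hidden": False,
                    "enableAdaptive": True,
                    "defaultSelect": True,
                },
            ],
        }
    ],
}


def url_for_bitrate(description: dict, bitrate: int) -> str:
    """Look up the address information recorded for a given code rate."""
    for rep in description["adaptationSet"][0]["representation"]:
        if rep["bitrate"] == bitrate:
            return rep["url"]
    raise KeyError(f"no representation found at {bitrate} bps")


if __name__ == "__main__":
    print(json.dumps(mpd, indent=2))
    print(url_for_bitrate(mpd, 2_500_000))
```

The url_for_bitrate helper shows how the address information recorded for each code rate would be looked up when building a frame acquisition request.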
Fig. 3 is a flowchart illustrating a resource transmission method according to an exemplary embodiment, and referring to fig. 3, the resource transmission method is applied to a terminal in the FAS framework related to the above implementation environment, which is described in detail below.
In step 301, at any time when the terminal plays the multimedia resource, the terminal determines a target code rate at that time, where the target code rate is a code rate with the highest matching degree with the playing state at that time.
In step 302, if the target code rate is not consistent with the current code rate, the terminal obtains the target address information of the multimedia resource with the target code rate.
In step 303, the terminal sends a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used for instructing the server to return media frames of the multimedia resource at the target code rate.
In the method provided by the embodiments of the present disclosure, the target code rate is determined at any moment while the multimedia resource is played, the target code rate being the code rate that best matches the playing state at that moment; if the target code rate is not consistent with the current code rate, the target address information of the multimedia resource at the target code rate is obtained, and a frame acquisition request carrying the target address information is sent to the server, the frame acquisition request instructing the server to return media frames of the multimedia resource at the target code rate. The multimedia resource therefore does not need to be transmitted in segments, and transmission does not have to wait for a complete resource segment to arrive, so frame-level media streaming is achieved, the delay of the resource transmission process is greatly reduced, the real-time performance of resource transmission is improved, and resource transmission efficiency is improved.
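Steps 301 to 303 can be strung together on the terminal side as in the sketch below, which only builds the frame acquisition request instead of sending it; the stub rate-selection policy, the fasSpts parameter name and the returned request shape are assumptions for illustration.

```python
# Minimal sketch of one pass through steps 301-303 (illustrative only).

from typing import Dict, Optional, Tuple


def estimate_target_bitrate(bandwidth_bps: float, buffer_s: float,
                            candidates) -> int:
    """Stub for step 301: pick the code rate that best matches the playing state.

    buffer_s is unused in this crude stub; a real policy would use it as in
    the buffer-prediction sketch earlier in this document.
    """
    affordable = [r for r in candidates if r <= 0.8 * bandwidth_bps]
    return max(affordable) if affordable else min(candidates)


def resource_transmission_tick(current_bitrate: int, bandwidth_bps: float,
                               buffer_s: float, address_by_bitrate: Dict[int, str],
                               start_position: Optional[int] = None
                               ) -> Tuple[int, Optional[dict]]:
    """Return the (possibly new) code rate and, if a switch is needed, the
    frame acquisition request to send (as a plain dict of URL and params)."""
    # Step 301: determine the target code rate at this moment.
    target = estimate_target_bitrate(bandwidth_bps, buffer_s,
                                     sorted(address_by_bitrate))
    if target == current_bitrate:
        # Consistent with the current code rate: keep transmitting as-is.
        return current_bitrate, None
    # Step 302: obtain the target address information at the target code rate.
    target_url = address_by_bitrate[target]
    # Step 303: build the frame acquisition request carrying that address; the
    # server, on receiving it, returns media frames at the target code rate.
    request = {"url": target_url,
               "params": {} if start_position is None else {"fasSpts": start_position}}
    return target, request


if __name__ == "__main__":
    urls = {800_000: "https://example-cdn.com/live/stream_sd.flv",
            2_500_000: "https://example-cdn.com/live/stream_hd.flv"}
    print(resource_transmission_tick(800_000, 4_000_000, 4.0, urls, start_position=-2000))
```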
In a possible implementation, the time is a broadcast starting time of the multimedia resource; or, the moment is the moment when the downloading of any picture group in the multimedia resource is finished; or, the time is the playing time of any media frame in the multimedia resource.
In one possible implementation, determining the target code rate at the time includes:
and determining the target code rate based on the bandwidth information at the moment and the media buffer storage amount at the moment.
In one possible embodiment, determining the target bitrate based on the bandwidth information at the time and the media buffer amount at the time comprises:
determining a predicted buffer amount of at least one candidate code rate based on the bandwidth information and the media buffer amount, where the predicted buffer amount refers to the media buffer amount predicted for the moment when downloading of the current picture group, continued from this moment at the corresponding code rate, is finished;
and determining the target code rate from the at least one candidate code rate or the current code rate based on the predicted buffer amount of the at least one candidate code rate.
In one possible implementation, determining the target code rate at the time includes:
and if the moment is the play starting moment of the multimedia resource, determining the play starting code rate specified by the play service or the default play code rate of the media description file as the target code rate.
In a possible implementation manner, before sending the frame acquisition request carrying the target address information to the server, the method further includes:
if the target code rate is not consistent with the current code rate, determining target position information, wherein the target position information is used for expressing the initial pulling position of the media frame of the multimedia resource;
and embedding the target position information into a frame acquisition request carrying the target address information.
In one possible embodiment, determining the target location information comprises:
if the moment is the play starting moment of the multimedia resource, determining the target position information based on the cache duration specified by the playing service; or,
if the moment is the moment when the downloading of any picture group in the multimedia resource is finished, determining the timestamp of the first frame in the next picture group of that picture group as the target position information; or,
if the moment is the playing moment of any media frame in the multimedia resource, determining the timestamp of the first frame in the currently downloaded picture group as the target position information.
In a possible embodiment, after determining the target bitrate at any time when the multimedia resource is played, the method further includes:
if the target code rate is consistent with the current code rate, ignoring the target code rate, and continuing to perform multimedia resource transmission at the current code rate.
In one possible embodiment, the method further comprises:
in the process of playing the multimedia resource, if the same media frame with multiple code rates exists in the cache region, the media frame with the maximum code rate in the multiple code rates is played.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
Fig. 4 is an interaction flow diagram illustrating a resource transfer method according to an exemplary embodiment, which may be used in the FAS framework related to the above-described implementation environment, as shown in fig. 4, and includes the following steps.
In step 401, the terminal plays the multimedia resource in the resource playing interface.
The terminal may have an application installed thereon, and the application is used for browsing multimedia resources, for example, the application may include at least one of a short video application, a live broadcast application, a video-on-demand application, a social application, or a shopping application, and the embodiment of the present disclosure does not specifically limit the type of the application.
The multimedia resources related to the embodiments of the present disclosure include, but are not limited to: at least one of a video resource, an audio resource, an image resource, or a text resource, and the embodiment of the present disclosure does not specifically limit the type of the multimedia resource. For example, the multimedia resource is a live video stream of a webcast, or a historical on-demand video pre-stored on a server, or a live audio stream of a webcast, or a historical on-demand audio pre-stored on a server.
In the foregoing process, a user may start an application program on a terminal, where the application program displays a resource pushing interface; for example, the resource pushing interface may be a home page or a functional interface of the application program, and the embodiments of the present disclosure do not specifically limit the type of the resource pushing interface. The resource pushing interface may include thumbnail information of at least one multimedia resource, where the thumbnail information includes at least one of a title, a brief description, a poster, a trailer, or a highlight of the multimedia resource. While browsing the resource pushing interface, the user may click the thumbnail information of a multimedia resource of interest, and in response to the user's touch operation on the thumbnail information of the multimedia resource, the terminal may jump from the resource pushing interface to the resource playing interface.
The resource playing interface can comprise a playing area and a comment area, the playing area can comprise playing options of the multimedia resource, and the comment area can comprise viewing comments of other users for the multimedia resource.
Optionally, the playing area may further include detail information of the multimedia resource, where the detail information may include at least one of a title, a brief description, keywords, publisher information, or the current popularity of the multimedia resource, and the publisher information may include a publisher nickname, a publisher avatar, a publisher follower count, and the like. The content of the detail information or the publisher information is not specifically limited in the embodiments of the present disclosure.
Optionally, the playing area may further include a bullet-screen input area and bullet-screen setting options. The user may control at least one of whether bullet screens are displayed, the bullet-screen moving speed, the bullet-screen display area, or the bullet-screen display mode (transparency, character size, etc.), and may also input the content to be commented by clicking the bullet-screen input area; the bullet-screen form is not limited to text or emoticon images. The content of the bullet-screen setting options or of the bullet screen input by the user is not specifically limited in the embodiments of the present disclosure.
Optionally, the playing area may further include a collection option and an attention option. If the user clicks the collection option, the terminal may be triggered to send a collection request to the server, and the server responds to the collection request and adds the multimedia resource to the favorites corresponding to the user, so that the user can quickly find the multimedia resource when watching it again later. If the user clicks the attention option, the terminal may be triggered to send an attention request to the server, and the server responds to the attention request and adds the publisher of the multimedia resource to the attention list corresponding to the user, so that the user can conveniently watch content published by the publisher later.
Optionally, the playing area may further include a presentation option for virtual gifts. If the user clicks the presentation option, a selection bar for the category and quantity of the virtual gift may be displayed; after the user selects a certain category and quantity of virtual gifts, clicking a confirmation button may trigger the terminal to send a virtual gift presentation request to the server. The server settles the presentation request, deducting a certain value from the user's account and issuing a certain value to the anchor's account; after settlement is completed, the terminal may display a special animation of the virtual gift as a floating layer in the playing area, thereby providing a more interactive and more interesting playing mode.
The above various possible embodiments provide different exemplary layouts of the resource playing interface, and the resource playing interface may have more or fewer layout manners in practical application, and the embodiments of the present disclosure do not specifically limit the layout manners of the resource playing interface.
After the resource playing interface is displayed, when the user wants to watch the multimedia resource, the user may click the playing option in the playing area. In response to the user's touch operation on the playing option, at the play starting (that is, starting to play) moment of the playing process, the terminal may obtain the media stream of the multimedia resource at a certain code rate through the operations performed in the following steps 402-407, invoke the media encoding and decoding component to decode the media stream, and invoke the media playing component to play the decoded media stream.
In some embodiments, at any playing moment other than the play starting moment, the terminal may still re-acquire a media stream of the multimedia resource at another code rate (a new media stream) through the operations performed in the following steps 402-407, so as to switch from the original media stream to playing the new media stream, thereby implementing seamless switching among multiple code rates. The embodiments of the present disclosure do not specifically limit whether the following steps 402-407 are performed at the play starting moment or at any playing moment during the playing process.
In step 402, the terminal determines a target code rate at any time when playing the multimedia resource, where the target code rate is a code rate with the highest matching degree with the playing state at the time.
Optionally, the moment may be the play starting moment of the multimedia resource; or, the moment may be the moment when the downloading of any group of pictures (GOP) in the multimedia resource is finished; or, the moment may be the playing moment of any media frame in the multimedia resource. The embodiments of the present disclosure do not specifically limit which moment in the playing process is used.
In some embodiments, the terminal may further provide a code rate selection list for the user, when the user clicks any one of the values in the code rate selection list, the user triggers generation of a code rate selection instruction carrying the value, and the terminal determines the value carried by the code rate selection instruction as the target code rate in response to the code rate selection instruction.
In some embodiments, if the time is a play start time of the multimedia resource, at this time, the play service may specify a play start code rate, or a default play code rate exists in the MPD of the multimedia resource, and therefore, the terminal may determine the play start code rate specified by the play service or the default play code rate of the media description file as the target code rate.
The default play code rate is the code rate of the multimedia resource whose @defaultSelect field is set to true in the media description meta information of the MPD file. Since the media playing component cannot play multimedia resources at two code rates by default (playing conflicts would exist), in all media description meta information, at most the multimedia resource of one code rate has its @defaultSelect field set to true.
Optionally, in the above process, the terminal may first traverse the media description meta information of the MPD. If there is a unique multimedia resource whose @defaultSelect field is true, the @bitrate field of that multimedia resource is determined as the target code rate. Otherwise (for example, at least two multimedia resources have @defaultSelect set to true, or no multimedia resource has @defaultSelect set to true), the terminal determines whether the playing service specifies a play starting code rate; if it does, the play starting code rate is determined as the target code rate, and if it does not, the target code rate is automatically determined based on the bandwidth information and the media buffer amount using the following adaptive strategy.
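A minimal sketch of this start-up decision, assuming the MPD layout illustrated earlier and treating the adaptive strategy as an injected callable (illustrative only, not the normative procedure):

```python
def startup_bitrate(mpd, service_start_rate=None, adaptive_fallback=None):
    """Pick the target code rate at the play starting moment.

    Precedence: a unique representation whose @defaultSelect is true, then a
    play starting code rate specified by the playing service, then the
    adaptive strategy (passed in as a callable, e.g. the choose_target_rate
    sketch given further below)."""
    defaults = [rep for rep in mpd["adaptationSet"] if rep.get("@defaultSelect")]
    if len(defaults) == 1:
        return defaults[0]["@bitrate"]      # unique default play code rate
    if service_start_rate is not None:
        return service_start_rate           # start rate specified by the playing service
    if adaptive_fallback is not None:
        return adaptive_fallback()          # fall back to the adaptive strategy
    raise ValueError("no unique default, no service start rate, no adaptive fallback")

# e.g. startup_bitrate(example_mpd) -> 1500 with the MPD sketched earlier
```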
In some embodiments, the terminal may adjust the target code rate through the following adaptive strategy: and the terminal determines the target code rate based on the bandwidth information and the media buffer amount at the moment. Optionally, the bandwidth information may be obtained by sampling at a first sampling interval by the terminal, the media buffer amount may be obtained by sampling at a second sampling interval by the terminal, and the first sampling interval and the second sampling interval may be the same or different, for example, the two sampling intervals may be the same by using the same timer to trigger and execute respective sampling logics.
The bandwidth information may refer to the transmission rate of the multimedia resource. Assuming that each sampling duration of the bandwidth information is T (unit: ms, millisecond) and the amount of data downloaded by the terminal within the sampling duration is S (unit: bytes), a bandwidth sampling point B may be expressed as B = S × 8 / T, where B is in kbps (kilobits per second), T > 0 (for example, T = 500 ms), and S ≥ 0, depending on the actual download amount of the terminal.
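As a worked example of this sampling formula (illustrative only):

```python
def bandwidth_sample_kbps(downloaded_bytes, sample_duration_ms):
    """One bandwidth sampling point B = S * 8 / T in kbps, with S in bytes
    and T in milliseconds, as in the formula above."""
    if sample_duration_ms <= 0:
        raise ValueError("sampling duration T must be positive")
    return downloaded_bytes * 8 / sample_duration_ms

# e.g. 62_500 bytes downloaded within a 500 ms window -> 1000.0 kbps
print(bandwidth_sample_kbps(62_500, 500))
```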
The media buffer amount may refer to the data amount of all media frames of the multimedia resource in the buffer area.
In the above process, the terminal may determine a predicted buffer amount for at least one candidate code rate based on the bandwidth information and the media buffer amount, where the predicted buffer amount refers to the media buffer amount expected at the moment when downloading of the current group of pictures, continued from this moment at the corresponding code rate, is finished; and determine the target code rate from the at least one candidate code rate or the current code rate based on the predicted buffer amount of the at least one candidate code rate.
The terminal may configure two thresholds for the media buffer amount, a first threshold and a second threshold, where the first threshold is greater than the second threshold. The first threshold is the higher threshold: if the media buffer amount is greater than the first threshold, the media buffer amount is sufficient, and an attempt may be made to increase the code rate of the multimedia resource so as to improve its definition and optimize the playing effect; therefore, at least one code rate greater than the current code rate in the set of code rates supported by the multimedia resource may be determined as the at least one candidate code rate. Conversely, the second threshold is the lower threshold: if the media buffer amount is smaller than the second threshold, the media buffer amount is very scarce, and the code rate of the multimedia resource needs to be reduced appropriately to avoid stuttering during playing; therefore, all code rates in the code rate set supported by the multimedia resource may be determined as the at least one candidate code rate, so that the maximum candidate code rate that keeps the media buffer amount from dropping can be screened globally, that is, the highest definition is maintained without stuttering.
The following discusses the case where the media buffer amount is greater than the first threshold or less than the second threshold, respectively:
1): if the media buffer amount is greater than the first threshold, as can be seen from the above analysis, at least one code rate greater than the current code rate in the code rate set is determined as at least one candidate code rate, and the predicted buffer amount of the candidate code rate is calculated for any one candidate code rate, where the predicted buffer amount of the candidate code rate may be represented as follows:
q = q_c + D - d - D × r × 8 / B_est
In the above formula, q is the predicted buffer amount, q_c is the media buffer amount at this moment (the current buffer amount), D is the GOP length (unit: ms), d is the already-downloaded length of the current GOP (unit: ms), r is any candidate code rate, and B_est is the bandwidth information at this moment.
Assume that the first threshold is q_h and the current code rate is r_c. After the terminal calculates the predicted buffer amount q of each candidate code rate r (r > r_c), if no predicted buffer amount is greater than the first threshold (q > q_h), the current code rate r_c may be determined as the target code rate; otherwise, if there is a predicted buffer amount greater than the first threshold, the maximum candidate code rate among the candidate code rates satisfying q > q_h is determined as the target code rate.
2): if the media buffer size is smaller than the second threshold, as can be seen from the above analysis, all the code rates in the code rate set are determined to be at least one candidate code rate, and the predicted buffer size of the candidate code rate is obtained for any candidate code rate according to the operation similar to that in the above example 1), which is not described herein again.
Assume that the second threshold is q_l (q_l < q_h). After the terminal calculates the predicted buffer amount q of each candidate code rate r, if there is a predicted buffer amount greater than or equal to the second threshold (q ≥ q_l), the maximum candidate code rate among the candidate code rates satisfying q ≥ q_l is determined as the target code rate. Conversely, if no predicted buffer amount is greater than or equal to the second threshold, the terminal needs to calculate a comparison buffer amount, where the comparison buffer amount refers to the predicted buffer amount when downloading is continued at the current code rate until the current GOP is completely downloaded; the comparison buffer amount of the current code rate may be expressed by the following formula:
q = q_c + D - d - (D - d) × r_c × 8 / B_est
where r_c represents the current code rate, and q_c, D, d, and B_est have the same meanings as the same symbols in the previous formula, which are not repeated here.
After the comparison buffer amount is calculated, the code rate r* corresponding to the maximum value among the predicted buffer amounts and the comparison buffer amount is determined as the target code rate; in particular, if r* = r_c, the current code rate is determined as the target code rate, and in all other cases the target code rate r* is not equal to the current code rate r_c.
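For illustration, the adaptive decision above can be sketched as follows in Python. The handling of the case where the buffer amount lies between the two thresholds, and the reading that switching re-downloads the whole current GOP (the D versus D - d difference between the two formulas), are assumptions rather than statements of the text.

```python
def predicted_buffer_ms(q_c, gop_ms, downloaded_ms, rate, b_est, restart_gop=True):
    """Predicted media buffer amount (ms) when the current GOP finishes downloading:
    q = q_c + D - d - D*r*8/B_est for a candidate rate (GOP pulled again from its
    first frame), q = q_c + D - d - (D-d)*r_c*8/B_est for the current rate."""
    to_download = gop_ms if restart_gop else (gop_ms - downloaded_ms)
    return q_c + gop_ms - downloaded_ms - to_download * rate * 8 / b_est


def choose_target_rate(q_c, gop_ms, downloaded_ms, b_est, current_rate,
                       rates, q_high, q_low):
    """Sketch of the adaptive strategy; between the two thresholds the current
    code rate is kept (an assumption)."""
    if q_c > q_high:
        # Buffer is ample: try to step up to a higher code rate.
        ok = [r for r in rates if r > current_rate
              and predicted_buffer_ms(q_c, gop_ms, downloaded_ms, r, b_est) > q_high]
        return max(ok) if ok else current_rate
    if q_c < q_low:
        # Buffer is scarce: largest rate whose predicted buffer stays above q_low.
        ok = [r for r in rates
              if predicted_buffer_ms(q_c, gop_ms, downloaded_ms, r, b_est) >= q_low]
        if ok:
            return max(ok)
        # Otherwise compare every predicted buffer with the comparison buffer
        # obtained by keeping the current rate, and take the best option.
        options = {r: predicted_buffer_ms(q_c, gop_ms, downloaded_ms, r, b_est)
                   for r in rates}
        options[current_rate] = predicted_buffer_ms(
            q_c, gop_ms, downloaded_ms, current_rate, b_est, restart_gop=False)
        return max(options, key=options.get)
    return current_rate
```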
In the above process, a mechanism for determining the target code rate based on the bandwidth information and the media buffer amount is provided, that is, a self-adaptive adjustment strategy of the target code rate is provided. It should be noted that, since the time may be a start-up time, a download completion time of any GOP, or a play time of any media frame, optionally, if the play service does not specify a start-up bit rate and a default play bit rate is not specified in the MPD at the start-up time, the terminal may invoke an adaptive adjustment policy to determine a target bit rate at the start-up time; or, when each GOP is downloaded, the terminal may call an adaptive strategy to determine the target bit rate for downloading the next GOP, and this strategy is also called "GOP boundary decision"; alternatively, at the playing time of any media frame, the terminal may invoke an adaptive strategy to determine the target bitrate at the next time, and such a strategy is also referred to as "arbitrary point decision".
In some embodiments, in addition to determining the target bitrate based on the bandwidth information and the media buffer amount, the terminal may determine the target bitrate based on the bandwidth information and the play state information, where the play state information may include at least one of a video buffer amount, an audio buffer amount, a last stuck time, or a frame loss rate within a sampling time, and the content included in the play state information is not specifically limited by the embodiments of the present disclosure, where the video buffer amount and the audio buffer amount may be collectively referred to as the "media buffer amount" described above.
The adaptive strategy dynamically selects the optimal target code rate by combining the network state information (bandwidth information) and the playing state information (media buffer storage) of the terminal, so that the balance among the pause rate, the definition and the smoothness of the playing effect can be obtained, the decision of making the playing state optimal is ensured, namely, the target code rate determined by the adaptive strategy is the highest in matching degree with the playing state of the multimedia resource.
In step 403, if the target code rate is not consistent with the current code rate, the terminal obtains target address information and target location information of the multimedia resource with the target code rate, where the target location information is used to indicate an initial pull location of a media frame of the multimedia resource.
In the above process, if the target code rate is not consistent with the current code rate, at this time, code rate switching needs to be performed, the terminal may query, using the target code rate as an index, in the MPD, to obtain media description meta-information corresponding to the multimedia resource of the target code rate, and extract target address information stored in the @ url field from the media description meta-information.
In some embodiments, the target position information (@fasSpts) is used to indicate the specific frame from which the server starts to send the media stream, and the data type of the target position information may be the int64_t type, though it may of course also be another data type.
It should be noted that the target location information represents a relative location parameter relative to the current time, if the target location information is less than 0, the terminal will pull the cache data in a historical time period, if the target location information is equal to 0, the terminal will pull the real-time media stream from the current time, and if the target location information is greater than 0, the terminal will pull the real-time media stream from a future time.
How to obtain the target location information will be discussed below for different execution time points, respectively:
1): and if the moment is the play starting moment of the multimedia resource, the terminal determines the target position information based on the cache duration appointed by the play service.
The cache duration specified by the playing service is determined by the service party according to the service requirement, for example, for some short videos, the cache duration may be set to 8 seconds, and for some long videos, the cache duration may be set to 1 minute, and the value of the cache duration is not specifically limited in the embodiment of the present disclosure.
When the target position information is determined based on the cache duration, the cache duration may be mapped to the corresponding target position information. For example, if the cache duration is 8 seconds, it may be mapped to @fasSpts = -8000; the target position information is a negative value, representing that cache data in a historical time period is pulled, which means that the terminal's buffer holds at most 8 seconds of cache data, thereby achieving a compromise between delay and network jitter buffering.
In some embodiments, the playing service may also directly specify target position information without specifying the cache duration; the target position information specified by the service party may be marked as "fasSptsInit". The embodiments of the present disclosure do not specifically limit whether the service party specifies the cache duration or the target position information.
2): if the moment is the moment when the downloading of any group of pictures (GOP) in the multimedia resource is finished, the terminal determines the time stamp of the first frame in the next group of pictures of the group of pictures as the target position information.
In the above process, after a GOP is downloaded, the terminal invokes an adaptive policy of GOP boundary decision to determine a target bit rate of a next GOP, and after determining the target bit rate, a PTS (Presentation Time Stamp) of a first frame in the next GOP may be determined as target position information, where the first frame refers to a first I frame in the video resource if the buffer includes the video resource, and the first frame refers to a first audio frame in the audio resource if the buffer does not include the video resource.
3): and if the moment is the playing moment of any media frame in the multimedia resource, the terminal determines the timestamp of the first frame in the currently downloaded picture group as the target position information.
In the above process, the terminal invokes the adaptive strategy of arbitrary-point decision to determine the target code rate at any playing moment. For the same GOP, if the I frame and the P frames (or B frames) in the GOP have different code rates, decoding cannot be performed, so code rate switching can only be performed between GOPs and not inside a GOP; moreover, an arbitrary playing moment is not guaranteed to fall exactly at the download completion moment of a GOP, which is why the media frames are pulled again from the first frame of the currently downloaded picture group.
The scenarios in examples 2) and 3) above are both scenarios in which code rate switching is performed during playing and do not belong to the start-up scenario in 1). The target address information determined in the code rate switching scenario may be denoted as "urlSwitch", and the target position information determined may be denoted as "fasSptsSwitch".
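The three cases can be summarized in a small illustrative helper; the argument names are hypothetical, and the start-up case follows the negative-offset mapping described above:

```python
def target_position_info(moment, cache_duration_ms=None,
                         next_gop_first_pts=None, current_gop_first_pts=None):
    """Illustrative mapping from the three cases above to @fasSpts:
    at start-up a cache duration of e.g. 8 s becomes the offset -8000, so at
    most 8 s of historical data is pulled; at a GOP boundary the PTS of the
    first frame of the next GOP is used; at an arbitrary playing moment the
    PTS of the first frame of the GOP currently being downloaded is used."""
    if moment == "startup":
        return -cache_duration_ms
    if moment == "gop_boundary":
        return next_gop_first_pts
    if moment == "any_point":
        return current_gop_first_pts
    raise ValueError(f"unknown moment: {moment}")
```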
Step 403 describes the processing logic by which the terminal determines the target address information and the target position information when the target code rate is not consistent with the current code rate. In some embodiments, if the target code rate is consistent with the current code rate, the terminal may ignore the target code rate and continue multimedia resource transmission at the current code rate. In this case neither the code rate nor the pull-stream address information changes, so the terminal does not need to generate a new frame acquisition request or send a redundant one; it only needs to continue media stream transmission at the current code rate according to the FAS standard, which reduces the communication traffic when dynamic code rate switching is implemented in the FAS framework and saves communication overhead.
In step 404, the terminal embeds the target location information into a frame acquisition request carrying the target address information.
The frame acquisition request may include an address information field (@url) and an extension field, and the terminal may write the target address information into the address information field of the frame acquisition request and write the target position information into the extension field of the frame acquisition request.
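As an illustration only, request assembly of this kind might look like the following sketch; the query-string layout and the "onlyAudio" parameter name are assumptions and not part of the FAS format described here.

```python
def build_frame_request(target_url, fas_spts=None, only_audio=None):
    """Assemble a frame acquisition request: the target address information
    fills the address field, and the optional extension parameters
    (@fasSpts and a hypothetical audio flag) are appended to it."""
    ext = []
    if fas_spts is not None:
        ext.append(f"fasSpts={fas_spts}")
    if only_audio is not None:
        ext.append(f"onlyAudio={'true' if only_audio else 'false'}")
    return target_url + ("?" + "&".join(ext) if ext else "")

# e.g. a start-up request targeting an 8-second buffer:
# build_frame_request("https://example.com/stream_1500.flv", fas_spts=-8000)
```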
In one exemplary scenario, a viewer user enters the live room of an anchor user in the application (the live interface, which is a resource playing interface). In the initial stage of playing, the terminal needs to pull a certain amount of buffered media stream before starting to play. At the play starting moment, the terminal obtains, according to the initial media stream specified by the service (the media stream at the play starting code rate) or the default initial media stream in the MPD (the media stream at the default play code rate), the target address information (@url) corresponding to the media stream to be requested; the target address information at the play starting moment may be marked as "urlStart". Assuming that the service also specifies a negative value as the target position information (@fasSpts = fasSptsInit < 0), the terminal encapsulates the target address information and the target position information at the play starting moment into a frame acquisition request, and the frame acquisition request at this moment (the FAS request at start-up) may be expressed as "urlStart & (fasSpts = fasSptsInit)".
In another exemplary scenario, during the process of watching a multimedia resource, changes of the bandwidth information and the playing state information cause the target code rate output by the adaptive strategy to be inconsistent with the current code rate, and the terminal needs to perform code rate switching. The target address information in the code rate switching process is recorded as "urlSwitch" and the target position information as "fasSptsSwitch"; both are encapsulated in a frame acquisition request, and the frame acquisition request at this moment (the FAS request during code rate switching) may be expressed as "urlSwitch & (fasSpts = fasSptsSwitch)".
In step 405, the terminal sends a frame acquisition request carrying the target address information and the target location information to the server, where the frame acquisition request is used to instruct the server to return the media frame of the multimedia resource at the target bitrate from the target location information.
The above process corresponds to the terminal encapsulating both the target address information and the target position information in the frame acquisition request. In a possible implementation, the terminal may encapsulate only the target address information in the frame acquisition request, that is, the target position information takes its default value; in this case, the server configures the default value of the target position information according to the FAS specification and starts media stream transmission from that default value.
In some embodiments, the extension field of the frame acquisition request may further carry an audio parameter, where the audio parameter is used to indicate whether the requested media frames are audio frames. If the value is set to true, the media frames pulled by the terminal are audio frames, that is, only a pure audio stream is pulled; otherwise, if the value is set to false, the media frames pulled by the terminal are audio/video frames, that is, an audio stream and a video picture stream are pulled; if the value is not specified, "false" may be used as the default value.
Optionally, when configuring the audio parameter, the terminal may obtain the type of the multimedia resource: if the type of the multimedia resource is video, the audio parameter may be set to "false" or left at the default value, and if the type of the multimedia resource is audio, the audio parameter may be set to "true".
Optionally, when configuring the audio parameter, the terminal may also detect the type of the application program: if the application program is a video application, the audio parameter is set to "false" or left at the default value, and if the application program is an audio application, the audio parameter is set to "true".
Of course, the frame acquisition request may not carry the audio parameter, or may not carry the target location information, or may not carry both the audio parameter and the target location information, and the embodiment of the present disclosure does not specifically limit the content of the extension field.
In step 406, the server returns the media frame of the multimedia resource to the terminal at the target bitrate starting from the target position information in response to the frame acquisition request.
In the above process, after receiving the frame acquisition request, the server may parse the frame acquisition request to obtain the target address information and the target location information, and based on the target address information, the server locates the media frames of the multimedia resource with the target bitrate from the resource library, and according to the sequence of the timestamps from small to large, the server starts to send the media frames of the multimedia resource with the target bitrate to the terminal from the target location information (continuous media frames also form a media stream).
Optionally, if the target position information is not carried in the frame acquisition request, the server may configure a default value of the target position information. Whether parsed from the frame acquisition request or configured by default, the server can determine the target position information and then determine, based on it, the timestamp from which the media frames are to be pulled, so that the server returns the media frames, in the form indicated by the audio parameter, to the terminal at the target code rate starting from the timestamp indicated by the target position information.
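A rough server-side sketch of this behaviour, treating @fasSpts as an offset relative to the current moment as described earlier (the frame data layout is an assumption):

```python
def handle_frame_request(frames, now_pts, fas_spts=None, default_spts=0):
    """Determine the target position information (parsed from the request or
    a configured default), map it to a starting timestamp, and return the
    located media frames in increasing PTS order. `frames` is assumed to be
    a list of (pts, payload) tuples already selected by the target address
    information."""
    spts = default_spts if fas_spts is None else fas_spts
    start_pts = now_pts + spts   # negative: historical cache; zero: from now
    return [f for f in sorted(frames, key=lambda f: f[0]) if f[0] >= start_pts]
```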
In an exemplary scenario, if the server is a CDN server, the target address information may be a domain name. The terminal may send a frame acquisition request to the central platform of the CDN server; the central platform calls a DNS (Domain Name System, essentially a domain name resolution library) to resolve the domain name and obtains the CNAME (alias) record corresponding to the domain name, then resolves the CNAME record again based on the geographical location information of the terminal to obtain the IP (Internet Protocol) address of the edge server closest to the terminal. The central platform then directs the frame acquisition request to that edge server, and the edge server responds to the frame acquisition request and provides the media frames of the multimedia resource to the terminal at the target code rate. In this way the terminal accesses the multimedia resource at the target code rate from a nearby node, which balances the overall load of the CDN servers and makes the performance of the CDN system more stable.
In step 407, if the terminal receives the media frame of the multimedia resource with the target bitrate, the terminal switches to playing the media frame of the multimedia resource with the target bitrate.
In the above process, if the terminal receives a media frame of a multimedia resource with a target code rate (the media frame continuously received may form a media stream), in order to ensure the fluency of playing, the terminal may store the media frame in a buffer, invoke a media encoding and decoding component to decode the media frame, obtain a decoded media frame, and invoke a media playing component to play the media frame in the buffer according to a sequence of timestamps (PTS) from small to large.
In the decoding process, the terminal can determine the coding mode of the multimedia resource from the @ codec field of the media description file, and determine the corresponding decoding mode according to the coding mode, so as to decode the media frame according to the determined decoding mode.
In some code rate switching scenarios, when the terminal sends the frame acquisition request corresponding to the code rate to be switched to, the terminal may choose to disconnect the existing media stream transmission link, or may choose not to disconnect it. If the existing link is not disconnected, two media stream transmission links are effectively established, and the terminal plays based on a main-stream/standby-stream mode: the original media stream serves as the standby stream, the new media stream is played preferentially, and once the new media stream is transmitted abnormally, the standby stream can continue to be played. In this way the code rate of the media stream can be dynamically adjusted during playing, and the playing effect of the terminal is optimized.
That is, in the process of playing the multimedia resource, if the same media frame with multiple code rates exists in the buffer area, the terminal plays the media frame with the largest code rate among the multiple code rates, which can ensure that the media frame with the highest definition is preferentially played under the condition that multiple paths of media streams are buffered, so as to ensure the best playing effect.
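A one-line illustration of this selection rule, assuming the buffered copies of a frame are tracked as (code rate, frame) pairs:

```python
def frame_to_play(buffered_copies):
    """If the buffer holds the same media frame at several code rates (main
    stream plus standby stream), play the copy with the largest code rate."""
    return max(buffered_copies, key=lambda copy: copy[0])[1]
```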
In the above process, the streaming-based media transmission mode can achieve frame-level transmission of multimedia resources, and since the frame acquisition request carries the target location information, seamless switching between media streams with different code rates can be achieved, and the purpose of multi-code-rate self-adaptation is achieved.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
The method provided by the embodiments of the present disclosure determines, at any moment when the multimedia resource is played, the target code rate at that moment, the target code rate being the code rate with the highest matching degree with the playing state at that moment. If the target code rate is not consistent with the current code rate, the target address information of the multimedia resource at the target code rate is acquired, and a frame acquisition request carrying the target address information is sent to the server, the frame acquisition request being used to instruct the server to return the media frames of the multimedia resource at the target code rate. In this way, the multimedia resource does not need to be transmitted in fragments, and resource transmission does not need to wait until a complete resource fragment arrives; media streams are transmitted at the frame level, which greatly reduces the delay of the resource transmission process, improves the real-time performance of resource transmission, and improves the resource transmission efficiency.
Fig. 5 is a block diagram illustrating a logical structure of a resource transfer apparatus according to an example embodiment. Referring to fig. 5, the apparatus includes a first determining unit 501, an acquiring unit 502, and a transmitting unit 503:
a first determining unit 501, configured to determine, at any time when the multimedia resource is played, a target code rate at that time, where the target code rate is a code rate with the highest matching degree with the playing state at that time;
an obtaining unit 502, configured to execute, if the target code rate is not consistent with the current code rate, obtaining target address information of the multimedia resource of the target code rate;
a sending unit 503, configured to execute sending, to a server, a frame obtaining request carrying the target address information, where the frame obtaining request is used to instruct the server to return a media frame of the multimedia resource at the target bitrate.
The device provided by the embodiments of the present disclosure determines, at any moment when the multimedia resource is played, the target code rate at that moment, the target code rate being the code rate with the highest matching degree with the playing state at that moment. If the target code rate is not consistent with the current code rate, the target address information of the multimedia resource at the target code rate is acquired, and a frame acquisition request carrying the target address information is sent to the server, the frame acquisition request being used to instruct the server to return the media frames of the multimedia resource at the target code rate. In this way, the multimedia resource does not need to be transmitted in fragments, and resource transmission does not need to wait until a complete resource fragment arrives; media streams are transmitted at the frame level, which greatly reduces the delay of the resource transmission process, improves the real-time performance of resource transmission, and improves the resource transmission efficiency.
In a possible implementation, the time is a broadcast starting time of the multimedia resource; or, the moment is the moment when the downloading of any picture group in the multimedia resource is finished; or, the time is the playing time of any media frame in the multimedia resource.
In a possible implementation, based on the apparatus composition of fig. 5, the first determining unit 501 includes:
a determining subunit configured to perform determining the target bitrate based on the bandwidth information at the time and the media buffer amount at the time.
In one possible embodiment, the determining subunit is configured to perform:
determining a predicted buffer amount of at least one candidate code rate based on the bandwidth information and the media buffer amount, where the predicted buffer amount refers to the media buffer amount predicted for the moment when downloading of the current picture group, continued from this moment at the corresponding code rate, is finished;
and determining the target code rate from the at least one candidate code rate or the current code rate based on the predicted buffer amount of the at least one candidate code rate.
In one possible implementation, the first determining unit 501 is configured to perform:
and if the moment is the play starting moment of the multimedia resource, determining the play starting code rate specified by the play service or the default play code rate of the media description file as the target code rate.
In a possible embodiment, based on the apparatus composition of fig. 5, the apparatus further comprises:
a second determining unit, configured to determine target location information if the target code rate is inconsistent with the current code rate, where the target location information is used to indicate an initial pull location of a media frame of the multimedia resource;
an embedding unit configured to perform embedding the target location information into a frame acquisition request carrying the target address information.
In one possible embodiment, the second determining unit is configured to perform:
if the moment is the play starting moment of the multimedia resource, determining the target position information based on the cache duration specified by the playing service; or,
if the moment is the moment when the downloading of any picture group in the multimedia resource is finished, determining the timestamp of the first frame in the next picture group of that picture group as the target position information; or,
if the moment is the playing moment of any media frame in the multimedia resource, determining the timestamp of the first frame in the currently downloaded picture group as the target position information.
In a possible embodiment, based on the apparatus composition of fig. 5, the apparatus further comprises:
and the transmission unit is configured to ignore the target code rate and continue to execute multimedia resource transmission at the current code rate if the target code rate is consistent with the current code rate.
In a possible embodiment, based on the apparatus composition of fig. 5, the apparatus further comprises:
and the playing unit is configured to play the media frame with the maximum code rate in the multiple code rates if the same media frame with the multiple code rates exists in the cache region in the process of playing the multimedia resource.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
With regard to the apparatus in the above-mentioned embodiment, the specific manner in which each unit performs operations has been described in detail in the embodiment related to the resource transmission method, and will not be elaborated here.
Fig. 6 shows a block diagram of a terminal according to an exemplary embodiment of the present disclosure. The terminal 600 may be: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. The terminal 600 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, etc.
In general, the terminal 600 includes: a processor 601 and a memory 602.
The processor 601 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, processor 601 may also include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 602 is used to store at least one instruction for execution by the processor 601 to implement the resource transfer method provided by various embodiments in the present disclosure.
In some embodiments, the terminal 600 may further optionally include: a peripheral interface 603 and at least one peripheral. The processor 601, memory 602, and peripheral interface 603 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 603 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 604, a touch screen display 605, a camera assembly 606, an audio circuit 607, a positioning component 608, and a power supply 609.
The peripheral interface 603 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 601 and the memory 602. In some embodiments, the processor 601, memory 602, and peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 604 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 604 communicates with communication networks and other communication devices via electromagnetic signals. The rf circuit 604 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 604 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 604 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 604 may also include NFC (Near Field Communication) related circuits, which are not limited by this disclosure.
The display 605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 605 is a touch display screen, the display screen 605 also has the ability to capture touch signals on or over the surface of the display screen 605. The touch signal may be input to the processor 601 as a control signal for processing. At this point, the display 605 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the display 605 may be one, providing the front panel of the terminal 600; in other embodiments, the display 605 may be at least two, respectively disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display 605 may be a flexible display disposed on a curved surface or on a folded surface of the terminal 600. Even more, the display 605 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The Display 605 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
The camera assembly 606 is used to capture images or video. Optionally, camera assembly 606 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 606 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
Audio circuitry 607 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 601 for processing or inputting the electric signals to the radio frequency circuit 604 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 600. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 607 may also include a headphone jack.
The positioning component 608 is used for positioning the current geographic location of the terminal 600 to implement navigation or LBS (Location Based Service). The positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
Power supply 609 is used to provide power to the various components in terminal 600. The power supply 609 may be ac, dc, disposable or rechargeable. When the power supply 609 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal 600 also includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: acceleration sensor 611, gyro sensor 612, pressure sensor 613, fingerprint sensor 614, optical sensor 615, and proximity sensor 616.
The acceleration sensor 611 may detect the magnitude of acceleration in three coordinate axes of the coordinate system established with the terminal 600. For example, the acceleration sensor 611 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 601 may control the touch screen display 605 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 612 may detect a body direction and a rotation angle of the terminal 600, and the gyro sensor 612 and the acceleration sensor 611 may cooperate to acquire a 3D motion of the user on the terminal 600. The processor 601 may implement the following functions according to the data collected by the gyro sensor 612: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 613 may be disposed on a side frame of the terminal 600 and/or on a lower layer of the touch display screen 605. When the pressure sensor 613 is disposed on the side frame of the terminal 600, a user's holding signal of the terminal 600 can be detected, and the processor 601 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed at the lower layer of the touch display screen 605, the processor 601 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 605. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 614 is used for collecting a fingerprint of a user, and the processor 601 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the identity of the user according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, the processor 601 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 614 may be disposed on the front, back, or side of the terminal 600. When a physical button or vendor Logo is provided on the terminal 600, the fingerprint sensor 614 may be integrated with the physical button or vendor Logo.
The optical sensor 615 is used to collect the ambient light intensity. In one embodiment, the processor 601 may control the display brightness of the touch display screen 605 according to the ambient light intensity collected by the optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 605 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 605 is decreased. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
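As a hedged illustration (not part of the patent text), the sketch below maps an ambient light reading to a display brightness level; the lux range, the brightness scale, and the linear mapping are assumptions made for the example only.

def brightness_from_ambient(lux: float, min_level: float = 0.1, max_level: float = 1.0) -> float:
    # Clamp the reading into an assumed working range, then scale linearly
    # to a brightness fraction between min_level and max_level.
    low, high = 10.0, 1000.0
    lux = max(low, min(high, lux))
    ratio = (lux - low) / (high - low)
    return min_level + ratio * (max_level - min_level)

print(brightness_from_ambient(50.0))    # dim room  -> low brightness
print(brightness_from_ambient(800.0))   # bright room -> high brightness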
The proximity sensor 616, also called a distance sensor, is typically disposed on the front panel of the terminal 600. The proximity sensor 616 is used to collect the distance between the user and the front surface of the terminal 600. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front surface of the terminal 600 gradually decreases, the processor 601 controls the touch display screen 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance between the user and the front surface of the terminal 600 gradually increases, the processor 601 controls the touch display screen 605 to switch from the screen-off state to the screen-on state.
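Again purely for illustration (not part of the patent text), the following sketch captures the approach/retreat behavior described above as a small state update; the distance thresholds and the hysteresis between them are assumed values.

def update_screen_state(previous_cm: float, current_cm: float, screen_on: bool,
                        near_cm: float = 3.0, far_cm: float = 6.0) -> bool:
    if screen_on and current_cm < previous_cm and current_cm <= near_cm:
        return False   # user is approaching the panel: switch the screen off
    if not screen_on and current_cm > previous_cm and current_cm >= far_cm:
        return True    # user is moving away: switch the screen back on
    return screen_on   # otherwise keep the current state

state = update_screen_state(previous_cm=8.0, current_cm=2.5, screen_on=True)   # -> False
state = update_screen_state(previous_cm=2.5, current_cm=7.0, screen_on=state)  # -> True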
Those skilled in the art will appreciate that the structure shown in Fig. 6 does not constitute a limitation on the terminal 600, and the terminal 600 may include more or fewer components than shown, combine some components, or adopt a different arrangement of components.
In an exemplary embodiment, a storage medium including at least one instruction is also provided, for example, a memory including at least one instruction, where the at least one instruction is executable by a processor in a terminal to perform the resource transmission method in the above embodiments. Optionally, the storage medium may be a non-transitory computer-readable storage medium; for example, the non-transitory computer-readable storage medium may include a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM (Compact Disc Read-Only Memory), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, which includes one or more instructions that can be executed by a processor of a terminal to implement the resource transmission method provided by the above embodiments.
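For readers who prefer a concrete outline, the following Python sketch approximates the kind of code-rate selection and switch decision described in the above embodiments; the buffer-prediction model, the candidate list, the safety margin, and the placeholder request at the end are assumptions made for the example and not the patent's exact definitions.

def predicted_buffer_ms(buffer_ms: float, bandwidth_bps: float,
                        rate_bps: float, gop_ms: float) -> float:
    # Time (in ms) to download one group of pictures of duration gop_ms encoded
    # at rate_bps, given the measured bandwidth; playback drains the buffer
    # during the download and the finished group then refills it.
    download_ms = gop_ms * rate_bps / max(bandwidth_bps, 1.0)
    return buffer_ms - download_ms + gop_ms

def pick_target_rate(candidates_bps, current_bps, buffer_ms, bandwidth_bps,
                     gop_ms=2000.0, safety_ms=1500.0):
    # Keep the highest candidate whose predicted buffer stays above a safety
    # margin; otherwise stay at the current code rate.
    viable = [r for r in candidates_bps
              if predicted_buffer_ms(buffer_ms, bandwidth_bps, r, gop_ms) >= safety_ms]
    return max(viable) if viable else current_bps

current_bps = 1_000_000
target_bps = pick_target_rate(candidates_bps=[500_000, 1_000_000, 2_000_000],
                              current_bps=current_bps,
                              buffer_ms=4000.0, bandwidth_bps=1_500_000.0)
if target_bps != current_bps:
    # A client would then look up address information for the target code rate
    # and send a frame acquisition request carrying it; this dict is only a
    # placeholder for such a request, not a real protocol message.
    frame_request = {"target_address": "<address of the stream at target_bps>",
                     "start_position": "<timestamp of the first frame to pull>"}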
Other embodiments of the present disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. The present disclosure is intended to cover any variations, uses, or adaptations that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method for resource transmission, comprising:
determining, at any moment during playback of a multimedia resource, a target code rate for the moment, wherein the target code rate is the code rate that best matches the playing state at the moment;
if the target code rate is inconsistent with the current code rate, obtaining target address information of the multimedia resource at the target code rate;
and sending a frame acquisition request carrying the target address information to a server, wherein the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target code rate.
2. The resource transmission method according to claim 1, wherein the moment is a play start moment of the multimedia resource; or, the moment is a moment at which downloading of any group of pictures in the multimedia resource is completed; or, the moment is a playing moment of any media frame in the multimedia resource.
3. The method of claim 1, wherein the determining the target code rate at the moment comprises:
determining the target code rate based on bandwidth information at the moment and a media buffer amount at the moment.
4. The method of claim 3, wherein the determining the target code rate based on the bandwidth information and the media buffer amount at the moment comprises:
determining a predicted buffer amount for at least one candidate code rate based on the bandwidth information and the media buffer amount, wherein the predicted buffer amount refers to the media buffer amount predicted to be reached, starting from the moment, when downloading of the current group of pictures is completed at the corresponding code rate;
determining the target code rate from the at least one candidate code rate or the current code rate based on the predicted buffer amount of the at least one candidate code rate.
5. The method of claim 1, wherein the determining the target code rate at the moment comprises:
if the moment is a play start moment of the multimedia resource, determining a play start code rate specified by the playing service, or a default playing code rate in the media description file, as the target code rate.
6. The resource transmission method according to any one of claims 1 to 5, wherein before the sending the frame acquisition request carrying the target address information to the server, the method further comprises:
if the target code rate is inconsistent with the current code rate, determining target position information, wherein the target position information indicates a start pull position of media frames of the multimedia resource;
and embedding the target position information into the frame acquisition request carrying the target address information.
7. The method of claim 6, wherein the determining the target position information comprises:
if the moment is a play start moment of the multimedia resource, determining the target position information based on a buffer duration specified by the playing service; or,
if the moment is a moment at which downloading of any group of pictures in the multimedia resource is completed, determining a timestamp of a first frame in a group of pictures following the group of pictures as the target position information; or,
and if the moment is a playing moment of any media frame in the multimedia resource, determining a timestamp of a first frame in the currently downloaded group of pictures as the target position information.
8. An apparatus for resource transmission, comprising:
the first determining unit is configured to determine a target code rate at any moment when the multimedia resource is played, wherein the target code rate is a code rate with the highest matching degree with the playing state at the moment;
the obtaining unit is configured to execute the step of obtaining target address information of the multimedia resource with the target code rate if the target code rate is inconsistent with the current code rate;
and the sending unit is configured to execute sending a frame obtaining request carrying the target address information to a server, wherein the frame obtaining request is used for indicating the server to return the media frame of the multimedia resource at the target code rate.
9. A terminal, comprising:
one or more processors;
one or more memories for storing instructions executable by the one or more processors;
wherein the one or more processors are configured to execute the instructions to implement the resource transmission method of any one of claims 1 to 7.
10. A storage medium, wherein, when at least one instruction in the storage medium is executed by one or more processors of a terminal, the terminal is enabled to perform the resource transmission method of any one of claims 1 to 7.
CN202010054775.2A 2020-01-17 2020-01-17 Resource transmission method, device, terminal and storage medium Active CN113141523B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202010054775.2A CN113141523B (en) 2020-01-17 2020-01-17 Resource transmission method, device, terminal and storage medium
PCT/CN2020/133755 WO2021143386A1 (en) 2020-01-17 2020-12-04 Resource transmission method and terminal
EP20913374.3A EP3952316A4 (en) 2020-01-17 2020-12-04 Resource transmission method and terminal
US17/519,459 US11652864B2 (en) 2020-01-17 2021-11-04 Method and apparatus for transmitting resources and non-transitory storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010054775.2A CN113141523B (en) 2020-01-17 2020-01-17 Resource transmission method, device, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN113141523A (en) 2021-07-20
CN113141523B (en) 2022-07-22

Family

ID=76809529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010054775.2A Active CN113141523B (en) 2020-01-17 2020-01-17 Resource transmission method, device, terminal and storage medium

Country Status (4)

Country Link
US (1) US11652864B2 (en)
EP (1) EP3952316A4 (en)
CN (1) CN113141523B (en)
WO (1) WO2021143386A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114221870B (en) * 2021-12-16 2023-01-20 北京达佳互联信息技术有限公司 Bandwidth allocation method and device for server
CN115396732B (en) * 2022-08-11 2024-02-02 深圳海翼智新科技有限公司 Audio and video data packet transmission method and device, electronic equipment and storage medium
CN116684468B (en) * 2023-08-02 2023-10-20 腾讯科技(深圳)有限公司 Data processing method, device, equipment and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8089892B2 (en) * 2005-12-15 2012-01-03 Thomson Licensing Adaptive joint source and channel coding scheme for H.264 video multicasting over wireless networks
CN101068236B (en) * 2007-04-13 2011-10-26 华为技术有限公司 Streaming media bit rate control method, system and equipment
US9124642B2 (en) * 2009-10-16 2015-09-01 Qualcomm Incorporated Adaptively streaming multimedia
CN102695081A (en) * 2012-06-13 2012-09-26 百视通网络电视技术发展有限责任公司 Video resource scheduling method based on Internet television and television terminal
WO2014011848A2 (en) * 2012-07-12 2014-01-16 Huawei Technologies Co., Ltd. Signaling and processing content with variable bitrates for adaptive streaming
CN103338393A (en) * 2013-06-13 2013-10-02 西安交通大学 Video code rate selecting method driven by user experience under HSPA system
US9386308B2 (en) * 2013-07-16 2016-07-05 Cisco Technology, Inc. Quality optimization with buffer and horizon constraints in adaptive streaming
CN103533386A (en) * 2013-10-21 2014-01-22 腾讯科技(深圳)有限公司 Live broadcasting control method and anchor equipment
CN105025351B (en) * 2014-04-30 2018-06-29 深圳Tcl新技术有限公司 The method and device of DST PLAYER buffering

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102333089A (en) * 2011-09-26 2012-01-25 南京邮电大学 Adaptive control method of multi-rate media stream based on hypertext transfer protocol (HTTP) streaming
CN109040801A (en) * 2018-07-19 2018-12-18 北京达佳互联信息技术有限公司 Media code rate by utilizing adaptive approach, device, computer equipment and storage medium
CN110636346A (en) * 2019-09-19 2019-12-31 北京达佳互联信息技术有限公司 Code rate self-adaptive switching method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086300A (en) * 2022-06-16 2022-09-20 乐视云计算有限公司 Video file scheduling method and device
CN115086300B (en) * 2022-06-16 2023-09-08 乐视云网络技术(北京)有限公司 Video file scheduling method and device

Also Published As

Publication number Publication date
US11652864B2 (en) 2023-05-16
EP3952316A1 (en) 2022-02-09
CN113141523B (en) 2022-07-22
US20220060533A1 (en) 2022-02-24
WO2021143386A1 (en) 2021-07-22
EP3952316A4 (en) 2022-07-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant