CN112930687A - Media stream processing method and device, storage medium and program product

Info

Publication number
CN112930687A
Authority
CN
China
Prior art keywords: media stream, Web front end, playing, frequency, computer
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201880098342.8A
Other languages
Chinese (zh)
Other versions
CN112930687B (en)
Inventor
鲁学研
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bitmain Technologies Inc
Original Assignee
Bitmain Technologies Inc
Application filed by Bitmain Technologies Inc
Publication of CN112930687A
Application granted
Publication of CN112930687B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A media stream processing method and apparatus, a storage medium, and a program product. In the method, a Web (internet web page) front end pulls a media stream and obtains the structured information carried in it (S102); the Web front end then generates display timestamps for the structured information according to the frame structure of the media stream, the timestamps being aligned with the playing time of the media stream (S104); the Web front end performs format encapsulation on the media stream and plays the encapsulated stream with an H5 player (S106); and the Web front end draws the structured information synchronously on the current playing picture according to the display timestamps (S108). The method allows the structured information and the video frames to be drawn synchronously without installing a plug-in, removes the technical barrier that synchronous drawing could not be achieved without a plug-in, and offers greater flexibility.

Description

Media stream processing method and device, storage medium and program product
Technical Field
The present application relates to the field of media streams, and for example, to a method and an apparatus for processing a media stream, a storage medium, and a program product.
Background
With the spread of internet applications, the data transmitted over networks is no longer limited to text and graphics; media streams such as audio and video have become widespread and brought new experiences to users.
Currently, playing a media stream in a browser generally relies on a browser control or plug-in. Specifically, a plug-in is installed in the browser, the media stream is pulled directly through the plug-in and decoded into frame pictures, and the structured information is then drawn in synchronization with those frames, so that the structured information and the video frames are played synchronously.
However, this existing approach requires a plug-in to be installed in the browser, and some browsers restrict the plug-in versions they support. If a traditional plug-in is not used, another way is needed to draw the structured information in synchronization with the video frames.
Disclosure of Invention
The embodiments of the present disclosure provide a media stream processing method and apparatus, a storage medium, and a program product, which make it possible to draw structured information and video frames synchronously without installing a plug-in, remove the technical barrier that synchronous drawing could not be achieved without a plug-in, and offer greater flexibility.
The embodiment of the disclosure provides a media stream processing method, which includes:
a Web front end pulls a media stream and obtains the structured information carried in the media stream;
the Web front end generates a display time stamp PTS of the structured information according to the frame structure of the media stream, and the display time stamp is aligned with the playing time of the media stream;
the Web front end carries out format packaging on the media stream, and plays the packaged media stream by using an H5 player;
and the Web front end synchronously draws the structured information on the current playing picture according to the display time stamp.
An embodiment of the present disclosure further provides a media stream processing apparatus, including:
the acquisition module is used for pulling the media stream and acquiring the structural information carried in the media stream;
a generating module, configured to generate a display timestamp PTS of the structured information according to a frame structure of the media stream, where the display timestamp is aligned with a playing time of the media stream;
the playing module is used for carrying out format packaging on the media stream and playing the packaged media stream by using an H5 player;
and the drawing module is used for synchronously drawing the structural information on the current playing picture according to the display time stamp.
The embodiment of the disclosure also provides a computer, which comprises the media stream processing device.
The embodiment of the disclosure also provides a computer-readable storage medium, which stores computer-executable instructions configured to execute the above media stream processing method.
Embodiments of the present disclosure also provide a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform the above-mentioned media stream processing method.
An embodiment of the present disclosure further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the above-mentioned media stream processing method.
According to the technical solutions provided by the embodiments of the present disclosure, the Web front end pulls the media stream, obtains its structured information, and uses the frame structure of the media stream to generate display timestamps for that information, so that when the media stream is played with an H5 player the structured information is drawn synchronously according to the display timestamps. Synchronous drawing of the structured information and the media stream is thus achieved without installing any plug-in in the browser and without decoding frame pictures, which removes the technical barrier that synchronous drawing could not be achieved without a plug-in and offers greater flexibility.
Drawings
One or more embodiments are illustrated by way of example, and not by way of limitation, in the accompanying drawings, in which elements having the same reference numerals denote like elements.
fig. 1 is a schematic flowchart of a media stream processing method according to an embodiment of the disclosure;
fig. 2 is an interaction flow diagram of a media stream processing method according to an embodiment of the present disclosure;
fig. 3 is a schematic flow chart of another media stream processing method according to an embodiment of the disclosure;
fig. 4 is a schematic flow chart of another media stream processing method according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a media stream processing apparatus according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a computer according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
So that the manner in which the features and elements of the disclosed embodiments can be understood in detail, a more particular description of the disclosed embodiments, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawing.
First, terms related to the embodiments of the present disclosure are specifically described for the sake of understanding.
Web (internet web page) front end: may be embodied as a Web front-end processor.
H5: HTML5, a standard developed by the World Wide Web Consortium (W3C) to replace the earlier HTML 4.01 and XHTML 1.0 standards, so that the standard keeps pace with rapidly developing internet applications. In the broad sense, HTML5 refers to a set of technologies including HyperText Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript.
MSE (Media Source Extensions): a browser interface (Web API) supported by mainstream browsers such as Chrome, Safari, and Edge. MSE conforms to the W3C standard and allows JavaScript to dynamically construct media streams for <video> and <audio> elements; it defines objects that let JavaScript pass media stream fragments to an HTML media element.
PTS (Presentation Time Stamp): a presentation timestamp that serves as the basis on which the player presents information or media; for example, the player determines from the PTS when to present the frame data associated with it.
RTSP (Real Time Streaming Protocol): an application-layer protocol in the Transmission Control Protocol (TCP)/Internet Protocol (IP) suite, published as an IETF RFC standard submitted by Columbia University, Netscape, and RealNetworks.
RTMP (Real Time Messaging Protocol): a TCP-based protocol family comprising the base RTMP protocol and variants such as RTMPT, RTMPS, and RTMPE. RTMP is a network protocol designed for real-time data communication, mainly used for audio, video, and data exchange between the Flash/AIR (Adobe Integrated Runtime) platform and streaming media or interactive servers that support RTMP. Software supporting the protocol includes Adobe Media Server, Ultrant Media Server, Red5, and others.
FLV (Flash Video): a video format that emerged with the introduction of Flash MX. FLV files are small and load quickly, which makes watching video over the network practical, and the format effectively solves the problem that SWF (Shockwave Flash, the format exported by the Flash authoring software) files produced after importing video into Flash were too large to be used well on the network.
http-flv: FLV media stream transmission over the HTTP protocol.
websocket-flv: FLV media stream transmission over the WebSocket protocol.
NPAPI (Netscape Plugin Application Programming Interface): a plug-in interface used by Gecko-engine browsers such as Netscape Navigator, Mozilla Suite, Mozilla SeaMonkey, and Mozilla Firefox, and by WebKit-engine browsers such as Apple Safari and Google Chrome.
PPAPI (Pepper Plugin API): a plug-in interface intended to address the security risks of NPAPI.
These terms are used in the following description of the embodiments of the present disclosure with the meanings given above and are not explained again.
In view of the above problems in the prior art, the embodiments of the present disclosure provide the following solution: a Web front end pulls a media stream and obtains the structured information, display timestamps are added to the structured information, and the structured information is then drawn synchronously while an H5 player plays the media stream, so that synchronized playback is achieved.
The embodiment of the disclosure provides a media stream processing method. Referring to fig. 1, the method includes:
s102, the Web front end pulls the media stream and obtains the structural information carried in the media stream.
S104, the Web front end generates a display time stamp PTS of the structured information according to the frame structure of the media stream, and the display time stamp is aligned with the playing time of the media stream.
S106, the Web front end performs format packaging on the media stream, and plays the packaged media stream by using an H5 player;
and S108, synchronously drawing the structural information on the current playing picture by the Web front end according to the display time stamp.
In the embodiments of the present disclosure, the media stream involved in S102 and S104 is an http-flv stream or a websocket-flv stream, where http and websocket are the transport protocols and flv is the format of the media stream. The format encapsulation in S106 essentially converts the format of the media stream so that the encapsulated stream meets the playing requirements of the H5 player. In one possible design, the encapsulated media stream may be in FMP4 format.
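As an illustration only (not part of the patent disclosure), the following TypeScript sketch shows one way a Web front end could feed FMP4 segments into an H5 <video> element through MSE, which is the kind of playback path S106 describes. The codec string and the onFmp4Segment callback are assumptions introduced for this example.

```typescript
// Illustrative sketch only (not the patent's code): feed FMP4 segments produced by
// the front end's format encapsulation into an H5 <video> element through MSE.
// The codec string and the onFmp4Segment callback are assumptions for this example.
declare function onFmp4Segment(callback: (segment: ArrayBuffer) => void): void;

const video = document.querySelector('video') as HTMLVideoElement;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', () => {
  // Hypothetical codec string; the real value depends on how the stream is encoded.
  const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');
  const pending: ArrayBuffer[] = [];

  // A SourceBuffer accepts one append at a time, so queue segments while it is busy.
  sourceBuffer.addEventListener('updateend', () => {
    if (!sourceBuffer.updating && pending.length > 0) {
      sourceBuffer.appendBuffer(pending.shift()!);
    }
  });

  onFmp4Segment((segment) => {
    if (sourceBuffer.updating || pending.length > 0) {
      pending.push(segment);
    } else {
      sourceBuffer.appendBuffer(segment);
    }
  });
});
```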
Before the Web front end pulls the media stream, that is, before S102 is executed, the streaming media server needs to push the media stream and the algorithm server needs to compute the structured information of the media stream.
Specifically, referring to the interaction diagram shown in fig. 2, before executing S102, the method further includes the following steps:
s1011, the streaming media server pulls the RTSP stream.
S1012, the streaming media server decodes the RTSP stream to obtain a video frame unit.
And S1013, the streaming media server sends the video frame unit to the algorithm server.
S1014, the algorithm server processes the video frame unit to obtain the structured information of the RTSP stream.
And S1015, the algorithm server sends the structured information to the streaming media server.
S1016, the streaming media server encapsulates the RTSP stream and the structural information to obtain the media stream.
And S1017, the streaming media server carries out stream pushing on the media stream.
In the embodiments of the present disclosure there is no particular restriction on whether the streaming media server, the algorithm server, and the Web front end are integrated: the three may be independent of each other, or two or more of them may be integrated into a single device or apparatus.
In addition, the prior art also includes an H5 scheme that plays a media stream without relying on a plug-in: a low-bitrate stream in MPEG-1 format is transmitted over websocket, frame pictures are soft-decoded by the Central Processing Unit (CPU), and canvas is used for synchronous rendering. However, because the CPU decodes the frame pictures, CPU usage is high, the number of streams that can be played simultaneously is limited, the mainstream H.264 coding scheme is not supported, and the streaming media server is required to transcode the stream to MPEG-1.
In contrast, as shown in fig. 2, in the technical solution provided by the embodiments of the present disclosure the streaming media server decodes the RTSP stream to obtain the video frame units, so the Web front end does not need to decode them and does not occupy the CPU. Without that CPU performance cost, multiple video streams can be played at the same time and stuttering is effectively avoided.
As shown in the interaction flow of fig. 2, in S1016 the RTSP stream and the structured information are encapsulated together, and the result of the encapsulation is the http-flv stream or websocket-flv stream referred to in the foregoing embodiments of the present disclosure.
Once the streaming media server pushes the stream, the Web front end can pull the media stream from the push address.
Specifically, the embodiment of the present disclosure provides the following implementation manner of S102:
the Web front end pulls the media stream through XHR2/Fetch or websocket.
XHR2/Fetch or websocket issues a network request for binary data to the streaming media server to obtain the media stream.
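For illustration, a minimal sketch (not taken from the patent) of pulling the FLV byte stream as binary data with Fetch or with WebSocket; the URLs and the onChunk callback are placeholders.

```typescript
// Illustrative sketch only: request the FLV byte stream as binary data, either over
// HTTP with Fetch or over WebSocket. The URLs are placeholders, and onChunk stands
// for whatever consumes the raw FLV bytes (demuxing, SEI extraction, remuxing).
async function pullFlvOverHttp(url: string, onChunk: (chunk: Uint8Array) => void): Promise<void> {
  const response = await fetch(url);
  if (!response.body) throw new Error('streaming response body not supported');
  const reader = response.body.getReader();
  for (;;) {
    const { done, value } = await reader.read(); // read FLV bytes as they arrive
    if (done || !value) break;
    onChunk(value);
  }
}

function pullFlvOverWebSocket(url: string, onChunk: (chunk: Uint8Array) => void): WebSocket {
  const socket = new WebSocket(url);
  socket.binaryType = 'arraybuffer'; // receive binary frames, not text
  socket.onmessage = (event: MessageEvent<ArrayBuffer>) => onChunk(new Uint8Array(event.data));
  return socket;
}

// Placeholder usage:
// pullFlvOverHttp('http://example.com/live/stream.flv', chunk => { /* feed demuxer */ });
// pullFlvOverWebSocket('ws://example.com/live/stream.flv', chunk => { /* feed demuxer */ });
```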
Further, depending on the protocol and format produced by the encapsulation in S1016, the structured information may be placed at different positions in the media stream. In one possible design, if the RTSP stream and the structured information are encapsulated into a media stream that conforms to the H.264 specification, such as an http-flv stream, the structured information may be carried in Supplemental Enhancement Information (SEI) units of the http-flv stream. H.264 is a digital video compression format jointly proposed by the International Organization for Standardization and the International Telecommunication Union.
Correspondingly, because the encapsulation position of the structured information differs across media streams with different protocols and formats, when obtaining the structured information the Web front end does so according to the protocol and format of the media stream.
That is, the protocol and format of the media stream are determined first, the encapsulation position of the structured information is then determined from a preset correspondence between protocol/format and encapsulation position, and the structured information is read from that position. For example, in the design above, since the Web front end pulls an http-flv stream, it can determine that the SEI units of the http-flv stream carry the structured information and therefore obtains the structured information from the SEI units.
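For illustration, a simplified sketch of how structured information carried in SEI units might be located once the Web front end knows it is handling an H.264 http-flv stream. It assumes 4-byte length-prefixed AVC NAL units, as typically carried in FLV video tags, and deliberately omits emulation-prevention bytes and SEI payload-type parsing; it is not the patent's parser.

```typescript
// Simplified sketch (not the patent's parser): scan an H.264/AVC sample whose NAL
// units are 4-byte length-prefixed, as typically carried in FLV video tags, and
// collect the payloads of SEI NAL units (nal_unit_type == 6), where the structured
// information is assumed to reside. Emulation-prevention bytes and SEI payload
// type/size parsing are deliberately omitted.
function extractSeiPayloads(sample: Uint8Array): Uint8Array[] {
  const seiPayloads: Uint8Array[] = [];
  const view = new DataView(sample.buffer, sample.byteOffset, sample.byteLength);
  let offset = 0;
  while (offset + 4 <= sample.byteLength) {
    const nalLength = view.getUint32(offset); // big-endian NAL unit length prefix
    offset += 4;
    if (nalLength === 0 || offset + nalLength > sample.byteLength) break;
    const nalType = sample[offset] & 0x1f;    // low five bits of the NAL header
    if (nalType === 6) {
      // Skip the one-byte NAL header; the rest is the raw SEI payload.
      seiPayloads.push(sample.subarray(offset + 1, offset + nalLength));
    }
    offset += nalLength;
  }
  return seiPayloads;
}
```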
Existing media stream processing relies on plug-ins such as NPAPI, PPAPI, or Adobe Flash plug-ins, all of which decode frame pictures after pulling the media stream and draw the structured information in synchronization with those frames. This not only requires the browser to install a plug-in, but the frame decoding also increases the amount of computation and lowers processing efficiency.
By contrast, in the embodiments of the present disclosure no plug-in needs to be installed in the browser; instead, display timestamps are generated for the structured information carried in the pulled media stream, so the structured information can be drawn synchronously while the H5 player plays the media stream, without decoding the media stream into frame pictures.
The specific implementation process may refer to the flow shown in fig. 3, and at this time, the step S104 may be:
s1042, the Web front end analyzes the SEI unit of the structured data and the video frame unit of the media stream according to the frame structure of the media stream.
And S1044, the Web front end generates a display timestamp for the SEI unit according to the playing time of the video frame unit, so that the SEI unit is aligned with the video frame unit.
In the embodiment of the present disclosure, the final rendering result is that the media stream is played synchronously with the structured information, and therefore, the display timestamp of the SEI unit is required to be aligned with the playing time of the video frame unit.
For example, if the video frame units are played at an interval of 40 ms per frame, then when generating display timestamps for the SEI units, each SEI unit can be given, at the same 40 ms interval, the display timestamp corresponding to the playing time of the video frame unit it belongs to.
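A minimal sketch of that example, under the assumption of a fixed 40 ms frame interval and with illustrative field names not taken from the patent; it simply gives each parsed SEI unit the display timestamp of the video frame unit it accompanies.

```typescript
// Minimal sketch under the example's fixed 40 ms frame interval; the field names
// are illustrative. Each parsed SEI unit is assigned the display timestamp (PTS)
// of the video frame unit it accompanies, so the two stay aligned.
interface StructuredInfoUnit {
  seiPayload: Uint8Array; // structured information parsed from an SEI unit
  pts: number;            // display timestamp in milliseconds
}

const FRAME_INTERVAL_MS = 40; // 40 ms per frame, as in the example above

function assignPts(seiPayloads: Uint8Array[], firstFramePts: number): StructuredInfoUnit[] {
  return seiPayloads.map((seiPayload, frameIndex) => ({
    seiPayload,
    // The SEI unit inherits the PTS of its corresponding video frame unit.
    pts: firstFramePts + frameIndex * FRAME_INTERVAL_MS,
  }));
}
```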
Based on the above setting, the embodiment of the present disclosure uses an H5 player to play the media stream, and simultaneously draws the structured information above the current playing screen according to the generated display timestamp, thereby implementing synchronous playing.
When the structured information is drawn synchronously according to the display timestamps, a three-dimensional or two-dimensional image can be drawn on a canvas as needed. The canvas is a browser DOM (Document Object Model) object, and what it displays is overlaid on the current playing picture; that is, the canvas layer sits closer to the viewer than the picture played in the H5 player.
In one possible design, the two-dimensional image may be drawn with canvas. Canvas is a 2D (two-dimensional) drawing protocol that renders pixel by pixel and performs 2D image drawing through JavaScript. In the above flow, if synchronous drawing of the structured information is implemented with canvas, a canvas element can be added over the H5 player so that 2D images can be drawn on it through JavaScript.
Alternatively, in another design, the three-dimensional image may be rendered with the Web Graphics Library (WebGL) protocol. WebGL is a 3D (three-dimensional) drawing protocol that removes the need to develop page-specific rendering plug-ins; it can be used to create Web pages with complex 3D structure and even to build 3D Web games.
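For illustration, a sketch of the canvas 2D option: a <canvas> overlaid on the H5 player's <video> element, on which boxes and labels from the structured information are drawn. The DrawableInfo shape, including the boxes field, is a hypothetical example of what the algorithm server's structured information might contain.

```typescript
// Illustrative sketch of the 2D canvas option: overlay a <canvas> on the H5 player's
// <video> element and draw boxes and labels from the structured information. The
// DrawableInfo shape, including the boxes field, is a hypothetical example of what
// the algorithm server's structured information might contain.
interface DrawableInfo {
  pts: number;
  boxes: { x: number; y: number; width: number; height: number; label: string }[];
}

function createOverlay(video: HTMLVideoElement): CanvasRenderingContext2D {
  const canvas = document.createElement('canvas');
  canvas.width = video.clientWidth;
  canvas.height = video.clientHeight;
  // Position the canvas directly above the playing picture.
  canvas.style.position = 'absolute';
  canvas.style.left = `${video.offsetLeft}px`;
  canvas.style.top = `${video.offsetTop}px`;
  canvas.style.pointerEvents = 'none'; // let mouse events reach the player underneath
  video.parentElement!.appendChild(canvas);
  return canvas.getContext('2d')!;
}

function drawStructuredInfo(ctx: CanvasRenderingContext2D, info: DrawableInfo): void {
  ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
  ctx.strokeStyle = 'red';
  ctx.fillStyle = 'red';
  ctx.font = '14px sans-serif';
  for (const box of info.boxes) {
    ctx.strokeRect(box.x, box.y, box.width, box.height);
    ctx.fillText(box.label, box.x, box.y - 4);
  }
}
```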
In a concrete implementation of synchronous drawing, the Web front end can read the current playing time of the media stream at a preset frequency, draw the structured information on the canvas according to the display timestamps, and keep the overlay aligned with the current playing picture. The preset frequency can be set as needed; for example, the current time may be read at a frequency of 60 times per second.
However, because the media stream and the drawn structured information may drift out of synchronization during rendering, the embodiments of the present disclosure further provide a possible design, shown in fig. 4, that applies deviation correction to the drawing of the structured information and keeps it synchronized with the media stream.
As shown in fig. 4, the method further includes:
and S110, the Web front end performs deviation rectification processing on the drawing of the structural information according to a first frequency, wherein the first frequency is greater than a second frequency, and the second frequency is the frame playing frequency of the media stream.
In the design shown in fig. 4, the second frequency is the frame playing frequency of the media stream; its time interval is the frame interval of the media stream and depends on the frame structure of the stream. The first frequency is greater than the second frequency, so the time interval of the first frequency is smaller than that of the second frequency, that is, smaller than the frame interval of the media stream. In other words, the drawing of the structured information is corrected at a higher frequency, so that the played media stream and the structured information stay in step.
In addition, since the alignment and drawing themselves take time to compute, a concrete implementation may perform the correction at a time interval slightly longer than the duration of that computation. The correction can be implemented with the Math.ceil function.
Math.ceil is a built-in JavaScript function: Math.ceil(x) returns the smallest integer greater than or equal to x, that is, it rounds a floating-point number up. In an implementation scenario of the embodiments of the present disclosure, this can be expressed as aligning Math.ceil(video.currentTime / 40) * 40 with the PTS, where video.currentTime represents the current playing time of the media stream.
For example, in one possible implementation scenario, if the frame playing interval of the media stream is 40 ms, an interval of 10 ms may be used as the first frequency for the correction processing.
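A minimal sketch of such a correction loop under the example's assumptions (40 ms frame interval, 10 ms correction interval, PTS values in milliseconds). Note that video.currentTime is reported in seconds by the HTML media API, so it is converted to milliseconds here; that unit handling is an assumption, not something the patent spells out. The draw callback could be, for example, the overlay drawer sketched earlier.

```typescript
// Minimal sketch of the deviation-correction loop under the example's assumptions:
// a 40 ms frame interval, a 10 ms correction interval, and PTS values in milliseconds.
// video.currentTime is reported in seconds by the HTML media API, so it is converted
// to milliseconds here; that unit handling is an assumption, not stated in the patent.
function startCorrectionLoop<T>(
  video: HTMLVideoElement,
  infoByPts: Map<number, T>,     // structured information indexed by its PTS (ms)
  draw: (info: T) => void,       // e.g. the canvas overlay drawer sketched earlier
): number {
  const FRAME_INTERVAL_MS = 40;      // second frequency: the frame playing interval
  const CORRECTION_INTERVAL_MS = 10; // first frequency: higher than the frame rate
  return window.setInterval(() => {
    const currentMs = video.currentTime * 1000;
    // Align to the frame grid, as in Math.ceil(video.currentTime / 40) * 40 above.
    const alignedPts = Math.ceil(currentMs / FRAME_INTERVAL_MS) * FRAME_INTERVAL_MS;
    const info = infoByPts.get(alignedPts);
    if (info !== undefined) {
      draw(info); // redraw so the overlay stays aligned with the playing picture
    }
  }, CORRECTION_INTERVAL_MS);
}
```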
As the foregoing description shows, the technical solution provided by the embodiments of the present disclosure requires no plug-in to be installed in the browser, which suits the direction in which browsers are developing. Synchronous rendering is achieved through display timestamps, so the Web front end does not decode frame pictures of the media stream; this effectively reduces the amount of computation, shortens processing time, and leaves more time for the correction processing, ensuring that the structured information and the media stream are played and aligned synchronously.
The embodiment of the disclosure also provides a media stream processing device. Referring to fig. 5, the media stream processing apparatus 500 includes:
an obtaining module 51, configured to pull a media stream and obtain structural information carried in the media stream;
a generating module 52, configured to generate a display time stamp PTS of the structured information according to the frame structure of the media stream, where the display time stamp is aligned with the playing time of the media stream;
the playing module 53 is configured to perform format encapsulation on the media stream, and play the encapsulated media stream by using an H5 player;
and the drawing module 54 is configured to draw the structured information synchronously on the current playing picture according to the display timestamp.
In one possible design, the generating module 52 is specifically configured to:
analyzing an SEI unit of the structured data and a video frame unit of the media stream according to a frame structure of the media stream;
and generating a display time stamp for the SEI unit according to the playing time of the video frame unit, so that the SEI unit is aligned with the video frame unit.
In another possible design, the media stream processing apparatus further includes:
and a deviation rectifying module (not shown in fig. 5) configured to perform deviation rectifying processing on the drawing of the structured information according to a first frequency, where the first frequency is greater than a second frequency, and the second frequency is a frame playing frequency of the media stream.
In another possible design, the rendering module 54 is specifically configured to:
and drawing the three-dimensional image or the two-dimensional image on a canvas according to the display time stamp, wherein the canvas is displayed on the current playing picture in an overlaying mode.
In another possible design, the obtaining module 51 is specifically configured to:
pulling the media stream;
and acquiring the structural information carried in the media stream according to the protocol and the format of the media stream.
In another possible design, the media stream is: http-flv media stream or websocket-flv media stream.
In another possible design, the obtaining module 51 is specifically configured to:
the media stream is pulled through XHR2/Fetch or websocket.
The media stream processing apparatus 500 shown in fig. 5 is provided at the Web front end.
In addition, an embodiment of the present disclosure further provides a computer, please refer to fig. 6, in which the computer 600 includes the media stream processing apparatus 500 described above.
The embodiment of the disclosure also provides a computer-readable storage medium, which stores computer-executable instructions configured to execute the above media stream processing method.
Embodiments of the present disclosure also provide a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform the above-mentioned media stream processing method.
The computer-readable storage medium described above may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
An embodiment of the present disclosure further provides an electronic device, a structure of which is shown in fig. 7, and the electronic device includes:
at least one processor 73 (one processor 73 is shown as an example in fig. 7); and a memory 71; the device may further include a communication interface 72 and a bus. The processor 73, the communication interface 72, and the memory 71 may communicate with one another via the bus. The communication interface 72 may be used for information transfer. The processor 73 may call logic instructions in the memory 71 to perform the media stream processing method of the above embodiments.
In addition, the logic instructions in the memory 71 may be implemented as software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium.
The memory 71 is a computer-readable storage medium that can be used to store software programs and computer-executable programs, such as the program instructions/modules corresponding to the methods in the embodiments of the present disclosure. By running the software programs, instructions, and modules stored in the memory 71, the processor 73 executes functional applications and data processing, that is, implements the media stream processing method in the above method embodiments.
The memory 71 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal device, and the like. Further, the memory 71 may include a high-speed random access memory, and may also include a nonvolatile memory.
The technical solution of the embodiments of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes one or more instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method of the embodiments of the present disclosure. And the aforementioned storage medium may be a non-transitory storage medium comprising: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes, and may also be a transient storage medium.
Although the terms "first," "second," etc. may be used in this application to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without changing the meaning of the description, so long as all occurrences of the "first element" are renamed consistently and all occurrences of the "second element" are renamed consistently. The first and second elements are both elements, but may not be the same element.
The words used in this application are words of description only and not of limitation of the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed items. Furthermore, the terms "comprises" and/or "comprising," when used in this application, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The various aspects, implementations, or features of the described embodiments can be used alone or in any combination. Aspects of the described embodiments may be implemented by software, hardware, or a combination of software and hardware. The described embodiments may also be embodied by a computer-readable medium having computer-readable code stored thereon, the computer-readable code comprising instructions executable by at least one computing device. The computer readable medium can be associated with any data storage device that can store data which can be read by a computer system. Exemplary computer readable media can include read-only memory, random-access memory, CD-ROMs, HDDs, DVDs, magnetic tape, and optical data storage devices, among others. The computer readable medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
The above description of the technology may refer to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration embodiments in which the described embodiments may be practiced. These embodiments, while described in sufficient detail to enable those skilled in the art to practice them, are non-limiting; other embodiments may be utilized and changes may be made without departing from the scope of the described embodiments. For example, the order of operations described in a flowchart is non-limiting, and thus the order of two or more operations illustrated in and described in accordance with the flowchart may be altered in accordance with several embodiments. As another example, in several embodiments, one or more operations illustrated in and described with respect to the flowcharts are optional or may be eliminated. Additionally, certain steps or functions may be added to the disclosed embodiments, or two or more steps may be permuted in order. All such variations are considered to be encompassed by the disclosed embodiments and the claims.
Additionally, terminology is used in the foregoing description of the technology to provide a thorough understanding of the described embodiments. However, no unnecessary detail is required to implement the described embodiments. Accordingly, the foregoing description of the embodiments has been presented for purposes of illustration and description. The embodiments presented in the foregoing description and the examples disclosed in accordance with these embodiments are provided solely to add context and aid in the understanding of the described embodiments. The above description is not intended to be exhaustive or to limit the described embodiments to the precise form disclosed. Many modifications, alternative uses, and variations are possible in light of the above teaching. In some instances, well known process steps have not been described in detail in order to avoid unnecessarily obscuring the described embodiments.

Claims (18)

  1. A method for processing a media stream, comprising:
    a Web (internet web page) front end pulls a media stream and obtains structured information carried in the media stream;
    the Web front end generates a display time stamp PTS of the structured information according to the frame structure of the media stream, and the display time stamp is aligned with the playing time of the media stream;
    the Web front end carries out format packaging on the media stream, and plays the packaged media stream by using an H5 player;
    and the Web front end synchronously draws the structured information on the current playing picture according to the display time stamp.
  2. The method according to claim 1, wherein the Web front end generates the display time stamp PTS of the structured information according to the frame structure of the media stream, comprising:
    the Web front end analyzes an SEI unit of the structured data and a video frame unit of the media stream according to a frame structure of the media stream;
    and the Web front end generates the display timestamp for the SEI unit according to the playing time of the video frame unit, so that the SEI unit is aligned with the video frame unit.
  3. The method according to claim 1 or 2, characterized in that the method further comprises:
    and the Web front end performs deviation rectification processing on the drawing of the structural information according to a first frequency, wherein the first frequency is greater than a second frequency, and the second frequency is the frame playing frequency of the media stream.
  4. The method of claim 1, wherein the Web front end synchronously renders the structured information on a current display screen according to the display timestamp, and the method comprises:
    and the Web front end draws a three-dimensional image or a two-dimensional image on a canvas according to the display timestamp, wherein the canvas is displayed on the current playing picture in an overlaying manner.
  5. The method of claim 1, wherein the Web front end pulling a media stream to obtain the structured information carried in the media stream comprises:
    the Web front end pulls the media stream;
    and the Web front end acquires the structural information carried in the media stream according to the protocol and the format of the media stream.
  6. The method according to claim 1 or 5, wherein the media stream is: http-flv media stream or websocket-flv media stream.
  7. The method of claim 6, wherein the Web front end pulling the media stream comprises:
    the Web front end pulls the media stream through XHR2/Fetch or websocket.
  8. A media stream processing apparatus, comprising:
    the acquisition module is used for pulling the media stream and acquiring the structural information carried in the media stream;
    a generating module, configured to generate a display timestamp PTS of the structured information according to a frame structure of the media stream, where the display timestamp is aligned with a playing time of the media stream;
    the playing module is used for carrying out format packaging on the media stream and playing the packaged media stream by using an H5 player;
    and the drawing module is used for synchronously drawing the structural information on the current playing picture according to the display time stamp.
  9. The apparatus of claim 8, wherein the generating module is specifically configured to:
    analyzing an SEI unit of the structured data and a video frame unit of the media stream according to a frame structure of the media stream;
    and generating the display timestamp for the SEI unit according to the playing time of the video frame unit, so that the SEI unit is aligned with the video frame unit.
  10. The apparatus of claim 8 or 9, further comprising:
    and the deviation rectifying module is used for rectifying the drawing of the structural information according to a first frequency, wherein the first frequency is greater than a second frequency, and the second frequency is the frame playing frequency of the media stream.
  11. The apparatus of claim 8, wherein the rendering module is specifically configured to:
    and drawing a three-dimensional image or a two-dimensional image on a canvas according to the display time stamp, wherein the canvas is displayed on the current playing picture in an overlaying mode.
  12. The apparatus of claim 8, wherein the obtaining module is specifically configured to:
    pulling the media stream;
    and acquiring the structural information carried in the media stream according to the protocol and the format of the media stream.
  13. The apparatus according to claim 8 or 12, wherein the media stream is: http-flv media stream or websocket-flv media stream.
  14. The apparatus of claim 13, wherein the obtaining module is specifically configured to:
    the media stream is pulled through XHR2/Fetch or websocket.
  15. A computer comprising the apparatus of any one of claims 8-14.
  16. An electronic device, comprising:
    at least one processor; and
    a memory communicatively coupled to the at least one processor; wherein,
    the memory stores instructions executable by the at least one processor, the instructions, when executed by the at least one processor, causing the at least one processor to perform the method of any one of claims 1-7.
  17. A computer-readable storage medium having stored thereon computer-executable instructions configured to perform the method of any one of claims 1-7.
  18. A computer program product, characterized in that the computer program product comprises a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to carry out the method of any one of claims 1-7.
CN201880098342.8A 2018-11-15 2018-11-15 Media stream processing method and device, storage medium and program product Active CN112930687B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/115660 WO2020097857A1 (en) 2018-11-15 2018-11-15 Media stream processing method and apparatus, storage medium, and program product

Publications (2)

Publication Number Publication Date
CN112930687A 2021-06-08
CN112930687B 2023-04-28

Family

ID=70730368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880098342.8A Active CN112930687B (en) 2018-11-15 2018-11-15 Media stream processing method and device, storage medium and program product

Country Status (2)

Country Link
CN (1) CN112930687B (en)
WO (1) WO2020097857A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114697303B (en) * 2022-03-16 2023-11-03 北京金山云网络技术有限公司 Multimedia data processing method and device, electronic equipment and storage medium
CN114745361B (en) * 2022-03-25 2024-05-14 朗新数据科技有限公司 Audio and video playing method and system for HTML5 browser
CN115914748A (en) * 2022-10-18 2023-04-04 阿里云计算有限公司 Visual display method and device for visual recognition result and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9712867B2 (en) * 2013-09-16 2017-07-18 Avago Technologies General Ip (Singapore) Pte. Ltd. Application specific policy implementation and stream attribute modification in audio video (AV) media
CN107682715B (en) * 2016-08-01 2019-12-24 腾讯科技(深圳)有限公司 Video synchronization method and device
CN106303430B (en) * 2016-08-21 2019-05-14 贵州大学 The method for playing real time monitoring without plug-in unit in browser

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100005485A1 (en) * 2005-12-19 2010-01-07 Agency For Science, Technology And Research Annotation of video footage and personalised video generation
CN106375793A (en) * 2016-08-29 2017-02-01 东方网力科技股份有限公司 Superposition method and superposition system of video structured information, and user terminal
CN107277004A (en) * 2017-06-13 2017-10-20 重庆扬讯软件技术股份有限公司 A kind of browser is without plug-in unit net cast method
CN107194006A (en) * 2017-06-19 2017-09-22 深圳警翼智能科技股份有限公司 A kind of video features structural management method
CN107832402A (en) * 2017-11-01 2018-03-23 武汉烽火众智数字技术有限责任公司 Dynamic exhibition system and its method during a kind of video structural fructufy

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113573088A (en) * 2021-07-23 2021-10-29 上海芯翌智能科技有限公司 Method and equipment for synchronously drawing identification object for live video stream
CN113573088B (en) * 2021-07-23 2023-11-10 上海芯翌智能科技有限公司 Method and equipment for synchronously drawing identification object for live video stream
CN113360707A (en) * 2021-07-27 2021-09-07 北京睿芯高通量科技有限公司 Video structured information storage method and system
CN113938470A (en) * 2021-10-18 2022-01-14 成都小步创想慧联科技有限公司 Method and device for playing RTSP data source by browser and streaming media server
CN113938470B (en) * 2021-10-18 2023-09-12 成都小步创想慧联科技有限公司 Method and device for playing RTSP data source by browser and streaming media server
WO2023151332A1 (en) * 2022-02-08 2023-08-17 腾讯科技(深圳)有限公司 Multimedia stream processing method and apparatus, devices, computer-readable storage medium, and computer program product

Also Published As

Publication number Publication date
CN112930687B (en) 2023-04-28
WO2020097857A1 (en) 2020-05-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant