CN115811621A - Live stream playing method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN115811621A
Authority
CN
China
Prior art keywords
video
data
format
playing
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111082490.0A
Other languages
Chinese (zh)
Inventor
刘立国
付宇豪
Current Assignee
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202111082490.0A
Publication of CN115811621A
Legal status: Pending

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present disclosure provides a live stream playing method, apparatus, computer device and storage medium. The method comprises: in response to a live viewing request triggered by a browser, acquiring target live stream data; when the format of the target live stream data is a first format, performing data separation processing on the target live stream data to obtain the audio data and video data it contains; and, while playing the video data with a video player in the browser, taking video screenshots of the video data to obtain a plurality of video frame images, rendering and playing the plurality of video frame images on an HTML5 page, and playing the audio data synchronously. The embodiments of the disclosure separate FLV-format live stream data, which the Video tag cannot play, into playable audio data and MP4-format video data, and play the FLV live stream on web clients that do not support the MSE function by synchronously playing the decoded audio data and the captured video frame images.

Description

Live stream playing method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a live stream playing method and apparatus, a computer device, and a storage medium.
Background
With the development of computer technology, video streaming applications have developed rapidly. In a Hypertext Markup Language 5 (HTML5) page in a web browser, playback of a video stream is generally implemented by using the browser's video player (the Video tag). However, the Video tag cannot recognize video data in FLV (Flash Video) format. To solve this problem, FLV-format video data is usually loaded into the web browser for playback by means of Media Source Extensions (MSE).
However, in a web browser that does not support the MSE function, video data in the FLV format cannot be played.
Disclosure of Invention
The embodiment of the disclosure at least provides a live stream playing method and device, computer equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a live streaming playing method, including:
responding to a live broadcast viewing request triggered by a browser, wherein the live broadcast viewing request is used for requesting to play target live broadcast stream data, and acquiring the target live broadcast stream data;
when the format of the target live broadcast stream data is a first format, carrying out data separation processing on the target live broadcast stream data to obtain audio data and video data in the target live broadcast stream data; the video data is obtained by format conversion and is in a second format; the first format is a video format which is not supported by the browser, and the second format is a video format which is supported by the browser;
in the process of playing the video data by using a video player in the browser, performing video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images;
and rendering and playing the plurality of video frame images by utilizing a second image processing tool on a hypertext markup language (HTML5) page, and synchronously playing the audio data.
In an optional implementation manner, when the format of the target live streaming data is a first format, performing data separation processing on the target live streaming data to obtain audio data and video data in the target live streaming data includes:
when the format of the target live broadcast stream data is a first format, decapsulating the target live broadcast stream data to obtain video coded data and audio coded data;
and encapsulating the video coding data according to the second video format to obtain video data in the second video format, and decoding the audio coding data to generate pulse code modulation (PCM) audio data.
In an optional embodiment, in the process of playing the video data by using a video player in the browser, performing a video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images, including:
in the process of playing the video data by using a video player in the browser, performing video screenshot on the video data by using a first image processing tool to obtain video frame images, and acquiring timestamp information corresponding to each video frame image;
based on the timestamp information, the video frame images are rendered on a first canvas in playback order with the first image processing tool.
In an alternative embodiment, the rendering and playing the plurality of video frame images and the synchronous playing of the audio data by using a second image processing tool in a hypertext markup language HTML5 page includes:
sequentially calling the video frame images drawn on the first canvas by using a second image processing tool according to the playing sequence indicated by the timestamp information corresponding to each video frame image;
and drawing the called video frame image on a second canvas on a hypertext markup language (HTML5) page, and synchronously playing the audio data.
In an optional embodiment, after rendering the called video frame image on the second canvas on a hypertext markup language HTML5 page, the method further includes:
and deleting the video frame image on the first canvas corresponding to the called video frame image.
In an optional embodiment, in the process of playing the video data by using a video player in the browser, performing a video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images, including:
determining an accelerated playing speed according to a standard playing speed corresponding to the target live streaming data;
playing the video data by using a video player in the browser according to the accelerated playing speed;
and in the process of playing the video data, performing video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images.
In an optional embodiment, after decapsulating the target live streaming data to obtain video encoded data and audio encoded data, the method further includes:
respectively writing the decapsulated video coded data and audio coded data into a cache;
under the condition that the total delay time corresponding to the video coding data in the cache is greater than a set threshold, determining the video coding data of the latest preset number of target video frames in the video coding data in the cache based on the timestamp information of each video frame in the video coding data in the cache; determining audio coded data matched with the preset number of target video frames from the audio coded data in the cache;
the encapsulating the video encoding data according to the second video format to obtain video data in the second video format, and generating pulse code modulation (PCM) audio data after decoding the audio encoding data, includes:
and encapsulating the video coding data of the preset number of target video frames to obtain video data in the second video format, and decoding the audio coding data matched with the preset number of target video frames to generate pulse code modulation (PCM) audio data.
In a second aspect, an embodiment of the present disclosure further provides a live streaming playing apparatus, including:
the system comprises a first acquisition module, a second acquisition module and a first display module, wherein the first acquisition module is used for responding to a live broadcast viewing request triggered by a browser, and the live broadcast viewing request is used for requesting to play target live broadcast stream data and acquiring the target live broadcast stream data;
the separation module is used for performing data separation processing on the target live streaming data to obtain audio data and video data in the target live streaming data when the format of the target live streaming data is a first format; the video data is obtained by format conversion and is in a second format; the first format is a video format which is not supported by the browser, and the second format is a video format which is supported by the browser;
the screenshot module is used for performing video screenshot on the video data by using a first image processing tool in the process of playing the video data by using a video player in the browser to obtain a plurality of video frame images;
and the playing module is used for rendering and playing the plurality of video frame images by using a second image processing tool on a hypertext markup language (HTML5) page and synchronously playing the audio data.
In a third aspect, an embodiment of the present disclosure further provides a computer device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the computer device is running, the machine-readable instructions when executed by the processor performing the steps of the first aspect described above, or any possible implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, performs the steps in the first aspect or any one of the possible implementation manners of the first aspect.
The live stream playing method, apparatus, computer device and storage medium provided by the embodiments of the present disclosure separate live stream data in a first video format (the FLV format), which the browser does not support, into audio data in PCM format and video data in a second video format (the MP4 format) that the browser does support. While the video data is played by a video player in the browser, video screenshots are taken of it to obtain a plurality of video frame images, and the audio data is played while the video frame images are rendered. Live stream data in the FLV format can thus be played in a web browser that does not support the MSE function.
When the total delay duration corresponding to the cached video coding data is greater than the set threshold, a frame-chasing strategy is adopted: the video coding data of the latest target video frames in the cache is determined and then decoded and rendered, while the long-delayed video coding data is discarded without decoding or rendering. Live delay can thereby be reduced and the real-time performance of the live broadcast ensured.
According to the embodiments of the present disclosure, video screenshots are taken while the video data is played at an accelerated speed, so that the screenshot time is shortened, decoding of the video coding data is completed as soon as possible, live delay is reduced, and the continuity of the live broadcast is ensured.
According to the live stream playing method provided by the embodiments of the present disclosure, after each video frame image is rendered on the second canvas, the corresponding video frame image on the first canvas is deleted, which effectively reduces central processing unit (CPU) usage during decoding and rendering.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required in the embodiments are briefly described below. The drawings, which are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. It is to be understood that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those of ordinary skill in the art may derive other related drawings from them without creative effort.
Fig. 1 shows a flowchart of a live stream playing method provided by an embodiment of the present disclosure;
fig. 2 shows a flowchart of another live stream playing method provided by the embodiment of the present disclosure;
fig. 3 shows a flowchart of a frame tracking method provided by an embodiment of the present disclosure;
fig. 4 is a schematic diagram illustrating a live streaming data playing architecture provided by an embodiment of the present disclosure;
fig. 5 is a schematic diagram illustrating a live stream playing apparatus provided in an embodiment of the present disclosure;
fig. 6 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure; obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the disclosure, provided with the accompanying drawings, is not intended to limit the scope of the disclosure as claimed, but merely represents selected embodiments of the disclosure. All other embodiments obtained by a person skilled in the art from the embodiments of the disclosure without creative effort shall fall within the protection scope of the disclosure.
Currently, when a web browser plays FLV-format video with its video player (the Video tag), it generally first acquires the FLV-format video data to be played; then decapsulates the FLV-format video data to obtain FMP4 (Fragmented MP4) format video data; and finally loads the FMP4-format video data into the Video tag for playback by means of Media Source Extensions (MSE). However, a web browser that does not support the MSE function cannot play video data in the FLV format in this way.
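For illustration, the capability check a web player might perform before choosing the MSE-based path can be sketched as follows. This is a minimal sketch, not part of the disclosure: the name `canUseMsePath`, the `MseHost` shape, and the codec string are illustrative assumptions, with the host object standing in for the browser's `window` so the logic can be exercised outside a browser.

```typescript
// Hypothetical capability probe: decide whether the MSE-based FLV path is
// usable, or whether the fallback described in this disclosure is needed.
// `host` stands in for the browser `window` object.
interface MseHost {
  MediaSource?: {
    isTypeSupported?: (mime: string) => boolean;
  };
}

function canUseMsePath(
  host: MseHost,
  mime = 'video/mp4; codecs="avc1.42E01E"'
): boolean {
  const ms = host.MediaSource;
  // No MediaSource constructor at all: MSE is unavailable (e.g. older iOS).
  if (!ms || typeof ms.isTypeSupported !== "function") return false;
  // MSE exists but may still reject the remuxed MP4 mime type.
  return ms.isTypeSupported(mime);
}
```

A player would take the fallback path described below whenever this probe returns false.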
The present disclosure provides a live stream playing method which separates live stream data in a first video format (the FLV format), which the Video tag cannot play, into audio data in PCM format and video data in a second video format (the MP4 format) that the Video tag can play; takes video screenshots of the video data while it is played with the Video tag, obtaining a plurality of video frame images; and plays the audio data while rendering the video frame images. Live stream data in the FLV format can thus be played in a web browser that does not support the MSE function.
The drawbacks described above were identified by the inventors through practice and careful study. Therefore, the discovery of these problems and the solutions proposed below should both be regarded as the inventors' contribution to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In order to facilitate understanding of this embodiment, a live stream playing method disclosed in the embodiment of the present disclosure is first described in detail, and an execution subject of the live stream playing method provided in the embodiment of the present disclosure is generally a computer device with certain computing capability.
The live streaming playing method provided by the embodiment of the present disclosure is described below by taking an execution subject as a terminal device as an example.
The live stream playing method provided by the embodiments of the present disclosure is mainly applied to web browsers that do not support the MSE function (for example, mobile browsers on the iOS system). When received live stream data is in a format that the Video tag of such a browser cannot recognize (for example, the FLV format), the live stream playing method provided by the embodiments of the present disclosure may be executed.
Referring to fig. 1, a flowchart of a live stream playing method provided in an embodiment of the present disclosure is shown, where the method includes S101 to S104, where:
s101: responding to a live broadcast viewing request triggered by a browser, wherein the live broadcast viewing request is used for requesting to play target live broadcast stream data and acquiring the target live broadcast stream data.
In the embodiments of the present disclosure, the target live stream data may include audio data and video data. When responding to a live viewing request triggered by a browser, the target live stream data corresponding to the request may be acquired through a network request.
S102: when the format of the target live broadcast stream data is a first format, carrying out data separation processing on the target live broadcast stream data to obtain audio data and video data in the target live broadcast stream data; the video data is obtained by format conversion and is in a second format; the first format is a video format which is not supported by the browser, and the second format is a video format which is supported by the browser.
When the format of the target live stream data is the first format, the target live stream data cannot be recognized by the Video tag of the web browser (the Video tag can be regarded as the browser's built-in video player with decapsulation and decoding functions); that is, the data is in a video format the browser does not support. As shown in fig. 2, the target live stream data may be live stream data in the FLV format.
During storage and transmission, target live stream data is encoded and compressed in order to reduce its size in transit. Therefore, before the video data and audio data it contains can be decoded, the target live stream data must first be decapsulated.
Specifically, in this step, when the format of the target live stream data is the first format, the target live stream data is decapsulated to obtain video coded data and audio coded data; the video coded data is then encapsulated according to the second video format to obtain video data in the second video format, and the audio coded data is decoded to generate pulse code modulation (PCM) audio data.
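The decapsulation step can be illustrated with a highly simplified FLV demuxer sketch. The `FlvTag` shape and the function name are assumptions made for illustration; a production demuxer must additionally handle AVC/AAC sequence headers, script data, and malformed input. The tag layout below (1-byte type, 3-byte size, 3+1-byte timestamp, 3-byte stream ID) follows the FLV container format.

```typescript
// Simplified FLV demuxer: walks the tag stream and separates audio tags
// (type 8) from video tags (type 9), mirroring the data separation step.
interface FlvTag {
  type: "audio" | "video" | "script";
  timestampMs: number;
  payload: Uint8Array;
}

function demuxFlv(data: Uint8Array): FlvTag[] {
  // "FLV" signature check
  if (data[0] !== 0x46 || data[1] !== 0x4c || data[2] !== 0x56) {
    throw new Error("not an FLV stream");
  }
  const headerSize =
    (data[5] << 24) | (data[6] << 16) | (data[7] << 8) | data[8];
  const tags: FlvTag[] = [];
  let p = headerSize + 4; // skip file header and PreviousTagSize0
  while (p + 11 <= data.length) {
    const typeByte = data[p];
    const size = (data[p + 1] << 16) | (data[p + 2] << 8) | data[p + 3];
    // 24-bit timestamp plus extended high byte
    const ts =
      (data[p + 4] << 16) | (data[p + 5] << 8) | data[p + 6] | (data[p + 7] << 24);
    const type =
      typeByte === 8 ? "audio" : typeByte === 9 ? "video" : "script";
    tags.push({ type, timestampMs: ts, payload: data.slice(p + 11, p + 11 + size) });
    p += 11 + size + 4; // tag header + body + trailing PreviousTagSize
  }
  return tags;
}
```

After this split, the video payloads would be re-encapsulated as MP4 and the audio payloads decoded to PCM, as described above.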
After the target live stream data is decapsulated, complete video coded data and audio coded data are obtained. Here, the decapsulated video coded data and audio coded data are each written into a cache. When they are decoded, the video coded data and audio coded data can each be decoded in the time order they hold in the cache.
The video coded data, i.e. the raw video stream, may conform to the H.264 video coding standard or the H.265 video coding standard. H.264 video coded data can be processed in a web browser that supports H.264, and H.265 video coded data can be processed in a web browser that supports H.265.
When the video resource data is encoded, that is, before it is encapsulated into the target live stream data, it is divided at key frame images (determined from encoding information during the encoding process) into a plurality of groups of pictures (GOPs). Therefore, when the target live stream data is decapsulated, a plurality of video coded data segments (i.e., GOPs) can be obtained directly, and together these segments form the complete video coded data.
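The GOP segmentation described above can be sketched as splitting the stream of encoded frames at keyframes. The `EncodedFrame` shape is an illustrative assumption, not an actual bitstream structure.

```typescript
// Split a sequence of encoded video frames into GOPs: each GOP starts at a
// keyframe and runs until the frame before the next keyframe.
interface EncodedFrame {
  timestampMs: number;
  keyframe: boolean;
}

function splitIntoGops(frames: EncodedFrame[]): EncodedFrame[][] {
  const gops: EncodedFrame[][] = [];
  for (const f of frames) {
    // A keyframe opens a new GOP; leading non-key frames (if any) fall
    // into an initial partial GOP.
    if (f.keyframe || gops.length === 0) gops.push([]);
    gops[gops.length - 1].push(f);
  }
  return gops;
}
```

Each resulting GOP corresponds to one independently decodable segment that can be re-encapsulated in the second video format.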
Each video coded data segment may be encapsulated according to the second video format (that is, a video format the browser supports and the Video tag can recognize), yielding a plurality of video data segments in the second video format. The second video format may here be the MP4 (MPEG-4 Part 14) format. The video data in the second video format is still encoded and compressed; it is decoded and rendered in subsequent steps. The obtained video data may be stored in memory and converted into a Blob URL from its memory address, to be played with the Video tag in step S103.
In the embodiments of the present disclosure, the audio coded data may be Advanced Audio Coding (AAC) data. By decoding the audio coded data, Pulse Code Modulation (PCM) audio data is obtained.
Timestamp information corresponding to the audio data can also be obtained here, for playback in step S104 in synchronization with the decoded video data; details are not repeated here.
S103: and in the process of playing the video data by using a video player in the browser, performing video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images.
The video player here, i.e. the Video tag, is an offscreen element: it is not rendered visibly on the page while it plays the video data.
Specifically, in the process of playing video data by using a video player, a first image processing tool is used for performing video screenshot on the video data to obtain video frame images, and timestamp information corresponding to each video frame image is obtained; the video frame images are then rendered on the first canvas in play order using the first image processing tool based on the timestamp information.
Here, the video frame images displayed in the video player, together with the timestamp information corresponding to each video frame image, may be obtained continuously by extracting frames at timed intervals.
The timestamp information determines the playing order of the video frame images. When stored, each video frame image is drawn onto a first canvas according to this playing order; when the video frame images are rendered and played in step S104, they are retrieved from the first canvas and rendered and played in the same order. The first image processing tool here may be an offline image processing tool, and the first canvas an offscreen canvas that is not visible on the page.
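The first-canvas staging area can be sketched as a timestamp-ordered frame queue. The `FrameQueue` class and `CapturedFrame` shape are illustrative assumptions; the `image` field stands in for the actual captured bitmap.

```typescript
// Minimal sketch of the first-canvas frame queue: captured frames are
// inserted with their timestamps and consumed strictly in playback order,
// even if the screenshot loop delivers them slightly out of order.
interface CapturedFrame {
  timestampMs: number;
  image: string; // opaque handle standing in for the captured bitmap
}

class FrameQueue {
  private frames: CapturedFrame[] = [];

  push(frame: CapturedFrame): void {
    // keep the queue sorted by timestamp on insert
    const i = this.frames.findIndex((f) => f.timestampMs > frame.timestampMs);
    if (i === -1) this.frames.push(frame);
    else this.frames.splice(i, 0, frame);
  }

  next(): CapturedFrame | undefined {
    // oldest (earliest-timestamp) frame is played first
    return this.frames.shift();
  }
}
```

Step S104 would repeatedly call `next()` and draw each returned frame on the second canvas.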
To ensure that the target live stream data is played continuously, the video frame images need to be obtained as soon as possible (the slower the video frame images are obtained, the longer the playback delay and the less continuous the playback). Therefore, in one implementation, an accelerated playback speed may be determined from the standard playback speed corresponding to the target live stream data; the video data is played by the video player in the browser at the accelerated speed; and while the video data is playing, video screenshots are taken of it with the first image processing tool to obtain a plurality of video frame images.
The accelerated playback speed may be, for example, 2× or 3× speed. When the video data is played at the accelerated speed, decoding it takes roughly 1/2 or 1/3 of the time taken at standard speed, so the video frame images are obtained sooner.
S104: and rendering and playing the plurality of video frame images by utilizing a second image processing tool on a hypertext markup language (HTML 5) page, and synchronously playing the audio data.
A Hypertext Markup Language 5 (HTML5) page is a page displayed in a web browser.
The video frame images may be retrieved from the first canvas using the corresponding image rendering interface (the drawImage API) of a second image processing tool, which may be an online image processing tool. Specifically, the video frame images drawn on the first canvas may be called in sequence by the second image processing tool, in the playing order indicated by the timestamp information corresponding to each video frame image; each called video frame image is rendered on a second canvas on the HTML5 page while the audio data is played synchronously.
As described above, the video frame images were drawn on the first canvas in the playing order indicated by the timestamp information, so calling them from the first canvas in that same order ensures that the video data is played in sequence in the web browser.
Here, the second canvas embedded in the HTML5 page is an online canvas, i.e. it is rendered visibly by the browser. The called video frame images are drawn on the second canvas and displayed in the web browser.
When a video frame image is rendered, the corresponding audio data can be located from the timestamp information, achieving synchronous playback of the audio and video data.
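The timestamp-based audio lookup can be sketched as follows. The `PcmChunk` shape and the function name are illustrative assumptions; a real player would also handle drift and resampling.

```typescript
// Given the timestamp of the frame currently being rendered, find the
// decoded PCM chunk whose time range covers that instant.
interface PcmChunk {
  startMs: number;
  durationMs: number;
  samples: Float32Array;
}

function findChunkFor(
  timestampMs: number,
  chunks: PcmChunk[]
): PcmChunk | undefined {
  return chunks.find(
    (c) => timestampMs >= c.startMs && timestampMs < c.startMs + c.durationMs
  );
}
```

Playing the returned chunk while the frame is drawn keeps audio and video aligned.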
To reduce central processing unit (CPU) memory usage, in one embodiment, each time a video frame image has been rendered on the second canvas, the called video frame image on the first canvas that corresponds to it may be deleted; that is, the width and height of the first canvas are set to zero, so that the CPU memory occupied by the first canvas is reclaimed in time and CPU usage is reduced.
As mentioned above, the slower the video frame images are obtained, the longer the playback delay; even when the video data is played at increased speed, delay may still accumulate from taking the video screenshots, rendering the video frame images, and determining the synchronized audio data. To reduce live delay, in one possible implementation a frame-chasing strategy may be adopted. Specifically, when the total delay duration corresponding to the video coded data in the cache is greater than a set threshold, the video coded data of the latest preset number of target video frames among the cached video coded data is determined based on the timestamp information of each video frame in the cache; and the audio coded data matching the preset number of target video frames is determined from the audio coded data in the cache.
As described above, the encapsulated live stream data includes a plurality of GOPs, and these cannot all be decoded at once after decapsulation; therefore, after decapsulation there is cached GOP data that has not yet been decoded, namely the video coded data in the cache.
Here, it may be determined whether the total delay duration corresponding to the video coded data in the cache is greater than a set threshold. The threshold may be the duration corresponding to one GOP (approximately 2 to 4 seconds; the duration of each GOP may be determined when the GOPs are divided during video encoding). The total delay duration may be the playback duration of the GOP data plus the time taken to decode it. For example, if a GOP has a playback duration of 4 seconds and is played at 2× speed, decoding it takes 2 seconds, so by the time the GOP can be played, 6 seconds have already elapsed. In a specific implementation, the delay duration may further include the time spent searching for the audio data matching the GOP, and the like.
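The delay arithmetic above can be written out directly. The function name and parameters are illustrative; the formula is simply playback duration plus decode time (1/speed of real time), plus any extra overhead such as audio matching.

```typescript
// Total delay contributed by one buffered GOP: its playback duration plus
// the time spent decoding it at the given playback speed, plus optional
// extra overhead (e.g. audio-matching time).
function totalDelayMs(
  gopDurationMs: number,
  playbackSpeed: number,
  extraMs = 0
): number {
  if (playbackSpeed <= 0) throw new Error("playback speed must be positive");
  const decodeMs = gopDurationMs / playbackSpeed;
  return gopDurationMs + decodeMs + extraMs;
}
```

With the text's example of a 4-second GOP decoded at 2× speed, this gives 4000 + 2000 = 6000 ms, matching the 6 seconds stated above; frame chasing triggers when this exceeds the set threshold.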
As shown in fig. 3, when the total delay duration corresponding to the cached video coded data is greater than the set threshold, frame chasing may be triggered: the playback time point of the target live stream data is recalculated from the current time, and the video coded data of the latest preset number of target video frames in the cache is determined from the recalculated time point and the timestamp information of each video frame in the cache. The target video frames may be the key frames used to divide the video resource data during encoding.
After the video coded data of the preset number of target video frames is encapsulated into video data in the second video format, the newly encapsulated video data can be decoded, and the video frame images drawn on the first canvas before the recalculated time point are cleared.
For synchronous playback, audio frame chasing may be performed at the same time: the audio encoded data matching the preset number of target video frames is determined from the audio encoded data in the buffer. Matching here means temporal matching, that is, finding, among the buffered audio encoded data, the audio that is time-synchronized with the video encoded data of the latest preset number of target video frames. Specifically, the matching audio encoded data can be located using the timestamp information of the most recently decoded video frame image.
After decoding the audio encoded data that matches the preset number of target video frames, pulse-code modulation (PCM) audio data is generated. Adopting both the video frame chasing and audio frame chasing strategies reduces the live broadcast delay.
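The two selections described above (keep only the newest video frames, then find the time-synchronized audio) can be sketched as follows. `EncodedSample`, `latestVideoFrames`, and `matchingAudio` are hypothetical names, and real code would compare against the recalculated playback time point rather than a simple count.

```typescript
// A buffered encoded sample with its presentation timestamp (seconds).
interface EncodedSample { pts: number; data?: unknown }

// Keep only the newest `count` video frames, ordered by timestamp.
function latestVideoFrames(buffer: EncodedSample[], count: number): EncodedSample[] {
  return [...buffer].sort((a, b) => a.pts - b.pts).slice(-count);
}

// Audio samples that are time-synchronized with the kept video frames.
function matchingAudio(audio: EncodedSample[], video: EncodedSample[]): EncodedSample[] {
  if (video.length === 0) return [];
  const start = video[0].pts;
  const end = video[video.length - 1].pts;
  return audio.filter((a) => a.pts >= start && a.pts <= end);
}
```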
The embodiment of the present disclosure further provides an architecture diagram for implementing the live stream playing method, as shown in fig. 4. To stay consistent with the native Video tag in a web browser, a virtual Video tag, the MVideo tag, is customized here; it integrates audio decoding, video decoding, synchronized audio and video playback, and related functions. The MVideo tag can be regarded as a customized video player with decapsulation and decoding capabilities.
The MVideo tag may include a TimeLine module, a video renderer (VideoRender), an audio renderer (AudioRender), and a timer (TickTimer) module, i.e., the audio/video synchronization module.
The TimeLine module is used to store custom objects: after live stream data in the first video format (e.g., FLV) is acquired, the TimeLine module stores the live stream data in chronological order.
The video renderer VideoRender decodes and renders the video encoded data separated from the live stream data. The VideoRender module may further include a video time sequence (VideoTimeRange) module, at least one management (Controller) module, a video frame queue (FrameQueue) module, and a video frame rendering (FrameRender) module.
Specifically, the VideoTimeRange module may store, in time order, the video encoded data obtained by decapsulating the live stream data.
The management Controller module may be configured to decode the video encoded data to obtain video frame images. Here, before decoding, the video encoded data may be encapsulated according to the second video format to obtain video data in the second video format, for example MP4. Then, in the process of playing the video data in the second video format with the MVideo tag, an offline first image processing tool takes video screenshots of the video data to obtain video frame images; this is how the video encoded data is decoded. Depending on which video coding standards the browser supports, separating the live stream data may yield video encoded data in either the H.264 or the H.265 video coding standard.
When the video encoded data is decoded, the timestamp information corresponding to each video frame image can also be obtained. The video frame queue FrameQueue module may therefore store the video frames in the playing order indicated by the timestamp information of each video frame image. In a specific implementation, the FrameQueue module may draw the decoded video frame images on an offline canvas (the first canvas).
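A FrameQueue of this kind can be sketched as a timestamp-ordered queue. In the browser each entry would also hold the decoded bitmap drawn on the offline canvas; that part is omitted here, and the class and field names are assumptions.

```typescript
// Decoded video frame plus its presentation timestamp (seconds).
interface DecodedFrame { pts: number }

// Keeps frames ordered by timestamp, even if decode completion
// order differs from playing order.
class FrameQueue {
  private frames: DecodedFrame[] = [];

  push(frame: DecodedFrame): void {
    // Insert while keeping ascending pts order.
    let i = this.frames.length;
    while (i > 0 && this.frames[i - 1].pts > frame.pts) i--;
    this.frames.splice(i, 0, frame);
  }

  // Next frame in playing order; undefined when the queue is drained.
  shift(): DecodedFrame | undefined {
    return this.frames.shift();
  }

  get length(): number { return this.frames.length; }
}
```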
The FrameRender module renders the video frames. Here, the online second image processing tool of the HTML5 page may sequentially retrieve the video frame images from the first canvas in the playing order indicated by their timestamp information, and then draw each retrieved video frame image on an online canvas (the second canvas) on the hypertext markup language HTML5 page.
The audio renderer AudioRender decodes and renders the audio encoded data separated from the live stream data. The AudioRender module may further include an audio time sequence (AudioTimeRange) module, at least one management (Controller) module, an audio PCM queue (PCMQueue) module, and an audio PCM rendering (PCMRender) module.
Specifically, the AudioTimeRange module stores, in time order, the audio encoded data obtained by decapsulating the live stream data. The resulting audio encoded data may be advanced audio coding (AAC) data.
The management Controller module decodes the AAC data to obtain pulse-code modulation (PCM) data. Timestamp information corresponding to the PCM data may also be obtained here.
The PCMQueue module may therefore store the PCM data in the playing order indicated by the timestamp information corresponding to the PCM data.
The audio PCM rendering PCMRender module plays the PCM data in that playing order.
The timer module plays the video frame images and the PCM data synchronously.
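One common way such a timer keeps the two streams in step is to treat the audio clock as the master and, on every tick, present the newest queued frame whose timestamp has been reached, dropping older ones. The sketch below is written under that assumption; the patent does not spell out the exact policy.

```typescript
// Choose the newest frame whose timestamp is not ahead of the audio clock;
// earlier frames are dropped so video never lags the audio master clock.
function frameForAudioTime<T extends { pts: number }>(
  queue: T[],        // frames sorted by ascending pts
  audioTime: number, // current playback position of the PCM audio (seconds)
): T | undefined {
  let chosen: T | undefined;
  while (queue.length > 0 && queue[0].pts <= audioTime) {
    chosen = queue.shift();
  }
  return chosen;
}
```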
It will be understood by those skilled in the art that, in the above method of the present embodiment, the order in which the steps are written does not imply a strict order of execution and imposes no limitation on the implementation; the execution order of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, a live stream playing device corresponding to the live stream playing method is also provided in the embodiments of the present disclosure, and as the principle of solving the problem of the device in the embodiments of the present disclosure is similar to the live stream playing method in the embodiments of the present disclosure, the implementation of the device may refer to the implementation of the method, and repeated details are not described again.
Referring to fig. 5, which is a schematic architecture diagram of a live stream playing apparatus provided in an embodiment of the present disclosure, the apparatus includes: a first obtaining module 501, a separating module 502, a screenshot module 503, and a playing module 504; wherein:
a first obtaining module 501, configured to acquire target live streaming data in response to a live viewing request triggered by a browser, the live viewing request being used to request playback of the target live streaming data;
a separation module 502, configured to, when the format of the target live streaming data is a first format, perform data separation processing on the target live streaming data to obtain audio data and video data in the target live streaming data; the video data is obtained by format conversion and is in a second format; the first format is a video format which is not supported by the browser, and the second format is a video format which is supported by the browser;
a screenshot module 503, configured to perform video screenshot on the video data by using a first image processing tool in a process of playing the video data by using a video player in the browser, so as to obtain multiple video frame images;
the playing module 504 is configured to render and play the plurality of video frame images by using a second image processing tool on a hypertext markup language HTML5 page, and play the audio data synchronously.
In an alternative embodiment, the separation module 502 is specifically configured to:
when the format of the target live streaming data is a first format, performing data separation processing on the target live streaming data to obtain audio data and video data in the target live streaming data, including:
when the format of the target live streaming data is a first format, decapsulating the target live streaming data to obtain video encoding data and audio encoding data;
and encapsulating the video encoded data according to the second video format to obtain video data in the second video format, and decoding the audio encoded data to generate pulse-code modulation (PCM) audio data.
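Decapsulation here amounts to demultiplexing the interleaved FLV tags into the two elementary streams. A minimal sketch follows, assuming the byte stream has already been parsed into tags (the FLV container marks each tag's type with one byte: 8 for audio, 9 for video, 18 for script data); `FlvTag` and `separateStreams` are illustrative names.

```typescript
// FLV body is a sequence of tags; the tag-type byte distinguishes streams
// (per the FLV specification: 8 = audio, 9 = video, 18 = script data).
interface FlvTag { type: 8 | 9 | 18; payload: Uint8Array }

function separateStreams(tags: FlvTag[]): { audio: FlvTag[]; video: FlvTag[] } {
  const audio = tags.filter((t) => t.type === 8); // AAC samples, decoded to PCM
  const video = tags.filter((t) => t.type === 9); // H.264/H.265 samples, remuxed to MP4
  return { audio, video };
}
```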
In an optional implementation manner, the screenshot module 503 is specifically configured to:
in the process of playing the video data by using a video player in the browser, performing video screenshot on the video data by using a first image processing tool to obtain video frame images, and acquiring timestamp information corresponding to each video frame image;
based on the timestamp information, the video frame images are rendered on a first canvas in playback order with the first image processing tool.
In an optional implementation manner, the playing module 504 is specifically configured to:
sequentially calling the video frame images drawn on the first canvas by using the second image processing tool according to the playing sequence indicated by the timestamp information corresponding to each video frame image;
and drawing the called video frame image on a second canvas on a hypertext markup language HTML5 page, and synchronously playing the audio data.
In an optional implementation manner, the playing module 504 is specifically configured to:
and deleting the video frame image on the first canvas corresponding to the called video frame image.
In an optional implementation manner, the screenshot module 503 is specifically configured to:
determining an accelerated playing speed according to a standard playing speed corresponding to the target live streaming data;
playing the video data by using a video player in the browser according to the accelerated playing speed;
and in the process of playing the video data, performing video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images.
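The text does not give a formula for the accelerated playing speed. One plausible policy, shown purely as an assumption, is to raise the rate above the standard speed in proportion to how much undecoded data is buffered, capped so playback stays watchable.

```typescript
// Hypothetical policy: speed up when more than 1 second is buffered,
// adding 10% per buffered second, and never exceed 2x the standard rate.
function acceleratedRate(standardRate: number, bufferedSeconds: number): number {
  const factor = bufferedSeconds > 1 ? Math.min(2, 1 + bufferedSeconds * 0.1) : 1;
  return standardRate * factor;
}
```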
In an alternative embodiment, the apparatus further comprises:
the writing module is used for respectively writing the unpackaged video coding data and audio coding data into a cache;
under the condition that the total delay time corresponding to the video coding data in the cache is greater than a set threshold value, determining the video coding data of the latest preset number of target video frames in the video coding data in the cache based on the timestamp information of each video frame in the video coding data in the cache; determining audio coding data matched with the preset number of target video frames from the audio coding data in the cache;
the separation module 502 is specifically configured to:
and encapsulating the video encoded data of the preset number of target video frames to obtain video data in the second video format, and decoding the audio encoded data matching the preset number of target video frames to generate pulse-code modulation (PCM) audio data.
The description of the processing flow of each module in the apparatus and the interaction flow between the modules may refer to the relevant description in the above method embodiments, and will not be described in detail here.
Based on the same technical concept, an embodiment of the disclosure also provides a computer device. Referring to fig. 6, a schematic structural diagram of a computer device 600 provided in the embodiment of the present disclosure includes a processor 601, a memory 602, and a bus 603. The memory 602 stores execution instructions and includes an internal memory 6021 and an external memory 6022. The internal memory 6021 temporarily stores operation data for the processor 601 and data exchanged with the external memory 6022, such as a hard disk; the processor 601 exchanges data with the external memory 6022 through the internal memory 6021. When the computer device 600 runs, the processor 601 and the memory 602 communicate over the bus 603, causing the processor 601 to execute the following instructions:
responding to a live viewing request triggered by a browser, the live viewing request being used to request playback of target live streaming data, and acquiring the target live streaming data;
when the format of the target live streaming data is a first format, performing data separation processing on the target live streaming data to obtain audio data and video data in the target live streaming data; the video data is obtained by format conversion and is in a second format; the first format is a video format which is not supported by the browser, and the second format is a video format which is supported by the browser;
in the process of playing the video data by using a video player in the browser, performing video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images;
and rendering and playing the plurality of video frame images by utilizing a second image processing tool on a hypertext markup language HTML5 page, and synchronously playing the audio data.
The embodiments of the present disclosure also provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the live stream playing method in the foregoing method embodiments are executed. The storage medium may be a volatile or non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides a computer program product carrying program code; the instructions included in the program code may be used to execute the steps of the live stream playing method in the foregoing method embodiments. For details, refer to the foregoing method embodiments; they are not described herein again.
The computer program product may be implemented by hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, the computer program product is embodied in a software product, such as a software development kit (SDK).
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments and are not described here again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division into units is only a division by logical function, and other divisions are possible in actual implementation, such as combining or integrating multiple units or components into another system, or omitting or not implementing some features. In addition, the coupling, direct coupling, or communication connection between the parts shown or discussed may be through certain communication interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are merely specific embodiments of the present disclosure, used to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may still modify the technical solutions described in the foregoing embodiments, or easily conceive of changes, or make equivalent substitutions for some of the technical features, within the technical scope of the present disclosure; such modifications, changes, and substitutions do not depart from the spirit and scope of the embodiments disclosed herein and should be included in the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (10)

1. A live stream playing method is characterized by comprising the following steps:
responding to a live viewing request triggered by a browser, the live viewing request being used to request playback of target live streaming data, and acquiring the target live streaming data;
when the format of the target live broadcast stream data is a first format, carrying out data separation processing on the target live broadcast stream data to obtain audio data and video data in the target live broadcast stream data; the video data is obtained by format conversion and is in a second format; the first format is a video format which is not supported by the browser, and the second format is a video format which is supported by the browser;
in the process of playing the video data by using a video player in the browser, performing video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images;
and rendering and playing the plurality of video frame images by utilizing a second image processing tool on a hypertext markup language HTML5 page, and synchronously playing the audio data.
2. The method as claimed in claim 1, wherein when the format of the target live streaming data is a first format, performing data separation processing on the target live streaming data to obtain audio data and video data in the target live streaming data includes:
when the format of the target live broadcast stream data is a first format, decapsulating the target live broadcast stream data to obtain video coded data and audio coded data;
and encapsulating the video encoded data according to the second video format to obtain video data in the second video format, and decoding the audio encoded data to generate pulse-code modulation (PCM) audio data.
3. The method of claim 1, wherein performing a video screenshot on the video data with a first image processing tool during playing the video data with a video player in the browser to obtain a plurality of video frame images comprises:
in the process of playing the video data by using a video player in the browser, performing video screenshot on the video data by using a first image processing tool to obtain video frame images, and acquiring timestamp information corresponding to each video frame image;
based on the timestamp information, the video frame images are rendered on a first canvas in playback order with the first image processing tool.
4. The method of claim 3, wherein rendering and playing the plurality of video frame images and synchronously playing the audio data by using a second image processing tool on a hypertext markup language HTML5 page comprises:
sequentially calling the video frame images drawn on the first canvas by using a second image processing tool according to the playing sequence indicated by the timestamp information corresponding to each video frame image;
drawing the called video frame image on a second canvas on a hypertext markup language HTML5 page, and synchronously playing the audio data.
5. The method of claim 4, wherein after drawing the called video frame image on the second canvas on the hypertext markup language HTML5 page, the method further comprises:
and deleting the video frame image on the first canvas corresponding to the called video frame image.
6. The method of claim 1, wherein performing a video screenshot on the video data using a first image processing tool during playing the video data using a video player in the browser to obtain a plurality of video frame images comprises:
determining an accelerated playing speed according to a standard playing speed corresponding to the target live streaming data;
playing the video data by using a video player in the browser according to the accelerated playing speed;
and in the process of playing the video data, performing video screenshot on the video data by using a first image processing tool to obtain a plurality of video frame images.
7. The method of claim 2, wherein after decapsulating the target live streaming data to obtain video encoded data and audio encoded data, further comprising:
respectively writing the decapsulated video coded data and audio coded data into a cache;
under the condition that the total delay time corresponding to the video coding data in the cache is greater than a set threshold value, determining the video coding data of the latest preset number of target video frames in the video coding data in the cache based on the timestamp information of each video frame in the video coding data in the cache; determining audio coded data matched with the preset number of target video frames from the audio coded data in the cache;
the encapsulating the video encoding data according to the second video format to obtain video data in the second video format, and generating pulse modulation coding (PCM) audio data after decoding the audio encoding data, includes:
and encapsulating the video coding data of the preset number of target video frames to obtain video data in the second video format, and decoding the audio coding data matched with the preset number of target video frames to generate pulse modulation coding (PCM) audio data.
8. A live stream playback apparatus, comprising:
a first acquisition module, configured to acquire target live streaming data in response to a live viewing request triggered by a browser, the live viewing request being used to request playback of the target live streaming data;
the separation module is used for performing data separation processing on the target live streaming data to obtain audio data and video data in the target live streaming data when the format of the target live streaming data is a first format; the video data is obtained by format conversion and is in a second format; the first format is a video format which is not supported by the browser, and the second format is a video format which is supported by the browser;
the screenshot module is used for performing video screenshot on the video data by using a first image processing tool in the process of playing the video data by using a video player in the browser to obtain a plurality of video frame images;
and the playing module is used for rendering and playing the plurality of video frame images by using a second image processing tool on a hypertext markup language HTML5 page and synchronously playing the audio data.
9. A computer device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when a computer device is running, the machine-readable instructions when executed by the processor performing the steps of the live stream playback method as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the live streaming method according to any one of claims 1 to 7.
CN202111082490.0A 2021-09-15 2021-09-15 Live stream playing method and device, computer equipment and storage medium Pending CN115811621A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111082490.0A CN115811621A (en) 2021-09-15 2021-09-15 Live stream playing method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115811621A true CN115811621A (en) 2023-03-17

Family

ID=85482045




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: Tiktok vision (Beijing) Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.
