CN108063970A - Method and apparatus for processing a live stream
- Publication number
- CN108063970A (application number CN201711172649.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/4302—Content synchronisation processes, e.g. decoder synchronisation
- H04N21/4307—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/434—Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
- H04N21/4342—Demultiplexing isochronously with video sync, e.g. according to bit-parallel or bit-serial interface formats, as SDI
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Studio Circuits (AREA)
Abstract
An embodiment of the present invention provides a method and apparatus for processing a live stream. The method includes: decoding an original live stream into original audio data and original video data; performing speech recognition on the original audio data to generate text characters corresponding to the original audio data; delaying the original video data according to a first duration consumed by the speech recognition; adding the text characters to the delayed video data to generate target video data; synchronously combining the target video data with the original audio data to generate a target live stream; and playing the target live stream. Embodiments of the present invention make it possible to play live video with subtitles.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for processing a live stream.
Background
Live video is increasingly popular with users because of the diversity of its content. Typically, live video does not display subtitles synchronized with the video.
When the audio of a live video is disturbed, or a speaker's pronunciation is inaccurate or the speech rate is too fast, the sound of the live video becomes unclear. Users then cannot fully understand the program content from the sound alone, which degrades their viewing experience.
Summary of the invention
Embodiments of the present invention aim to provide a method and apparatus for processing a live stream, so as to play live video with subtitles. The specific technical solution is as follows:
In one aspect of the present invention, a method for processing a live stream is provided, the method comprising:
Decoding an original live stream into original audio data and original video data;
Performing speech recognition on the original audio data to generate text characters corresponding to the original audio data;
Delaying the original video data according to a first duration consumed by the speech recognition;
Adding the text characters to the delayed video data to generate target video data;
Synchronously combining the target video data with the original audio data to generate a target live stream;
Playing the target live stream.
Optionally, the step of decoding the original live stream into original audio data and original video data includes:
Decoding an original live stream of a preset duration into original audio data and original video data.
Optionally, the step of decoding the original live stream into original audio data and original video data includes:
Determining, in the original live stream within a preset duration interval, a time point at which speech pauses;
Decoding the not-yet-decoded live stream segment before the time point in the original live stream into original audio data and original video data.
Optionally, the step of delaying the original video data according to the first duration consumed by the speech recognition includes:
Determining the first duration consumed by the speech recognition;
Delaying the timestamps of the original video data by the first duration.
Optionally, after the step of performing speech recognition on the original audio data to generate the text characters corresponding to the original audio data, the method further includes:
Translating the text characters into a preset language and generating a second duration, the second duration being the time consumed in translating the text characters into the preset language;
The step of delaying the timestamps of the original video data by the first duration includes:
Delaying the timestamps of the original video data by the sum of the first duration and the second duration;
The step of adding the text characters to the delayed video data to generate target video data includes:
Adding the translated text characters to the delayed video data to generate target video data.
Optionally, after the step of translating the text characters into the preset language, the method further includes:
Performing error correction on the translated text characters;
Determining a third duration consumed by the error correction;
The step of delaying the timestamps of the original video data by the sum of the first duration and the second duration includes:
Delaying the timestamps of the original video data by the sum of the first duration, the second duration, and the third duration;
The step of adding the text characters to the delayed video data to generate target video data includes:
Adding the translated and corrected text characters to the delayed video data to generate target video data.
Optionally, the step of synchronously combining the target video data with the original audio data to generate a target live stream includes:
Based on a preset reference time axis, synchronously combining the target video data with the original audio data according to the timestamps of the video frames in the target video data and the timestamps of the audio frames in the original audio data, to generate a target live stream.
In another aspect of the present invention, an apparatus for processing a live stream is further provided, the apparatus comprising:
A decoding unit, configured to decode an original live stream into original audio data and original video data;
A recognition unit, configured to perform speech recognition on the original audio data to generate text characters corresponding to the original audio data;
A delay unit, configured to delay the original video data according to a first duration consumed by the speech recognition;
An adding unit, configured to add the text characters to the delayed video data to generate target video data;
A synthesis unit, configured to synchronously combine the target video data with the original audio data to generate a target live stream;
A playing unit, configured to play the target live stream.
Optionally, the decoding unit is specifically configured to decode an original live stream of a preset duration into original audio data and original video data.
Optionally, the decoding unit includes a first determining subunit and a decoding subunit;
The first determining subunit is configured to determine, in the original live stream within a preset duration interval, a time point at which speech pauses;
The decoding subunit is configured to decode the not-yet-decoded live stream segment before the time point in the original live stream into original audio data and original video data.
Optionally, the delay unit includes a second determining subunit and a delay subunit;
The second determining subunit is configured to determine the first duration consumed by the speech recognition;
The delay subunit is configured to delay the timestamps of the original video data by the first duration.
Optionally, the apparatus further includes:
A translation unit, configured to translate the text characters into a preset language and generate a second duration, the second duration being the time consumed in translating the text characters into the preset language;
The delay subunit is specifically configured to delay the timestamps of the original video data by the sum of the first duration and the second duration;
The adding unit is specifically configured to add the translated text characters to the delayed video data to generate target video data.
Optionally, the apparatus further includes:
An error correction unit, configured to perform error correction on the translated text characters;
A determining unit, configured to determine a third duration consumed by the error correction;
The delay subunit is specifically configured to delay the timestamps of the original video data by the sum of the first duration, the second duration, and the third duration;
The adding unit is specifically configured to add the translated and corrected text characters to the delayed video data to generate target video data.
Optionally, the synthesis unit is specifically configured to, based on a preset reference time axis, synchronously combine the target video data with the original audio data according to the timestamps of the video frames in the target video data and the timestamps of the audio frames in the original audio data, to generate a target live stream.
In another aspect of the present invention, a computer-readable storage medium is further provided. The computer-readable storage medium stores instructions which, when run on a computer, cause the computer to perform any of the above methods for processing a live stream.
In another aspect of the present invention, a computer program product containing instructions is further provided which, when run on a computer, causes the computer to perform any of the above methods for processing a live stream.
In the method and apparatus for processing a live stream provided by embodiments of the present invention, an original live stream of a preset duration is first decoded into original audio data and original video data. Speech recognition is then performed on the original audio data to generate text characters corresponding to the original audio data, and the original video data is delayed according to a first duration consumed by the speech recognition. Next, the text characters are added to the delayed video data to generate target video data. Finally, the target video data is synchronously combined with the original audio data to generate a target live stream, and the target live stream is played.
In this way, by adding the text characters corresponding to the audio data to the live video, embodiments of the present invention play synchronized subtitles while the live video is playing, which helps users understand the content of the live video and improves their viewing experience. Of course, implementing any product or method of the present invention does not necessarily require achieving all of the above advantages at the same time.
Brief description of the drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below.
Fig. 1 is a flowchart of a method for processing a live stream according to an embodiment of the present invention;
Fig. 2 is another flowchart of the method for processing a live stream according to an embodiment of the present invention;
Fig. 3 is yet another flowchart of the method for processing a live stream according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a system for processing a live stream according to an embodiment of the present invention;
Fig. 5 is a structural diagram of an apparatus for processing a live stream according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described below with reference to the drawings in the embodiments of the present invention.
At present, live video is increasingly popular with users because of the diversity of its content. Typically, live video does not display subtitles synchronized with the video.
When a user watches live video on a terminal device and the sound is unclear, because the audio is disturbed, a speaker's pronunciation is inaccurate, or the speech rate is too fast, the user cannot see subtitles synchronized with the video. The user therefore cannot fully understand the program content from the sound alone, which degrades the viewing experience.
To solve the above problem, embodiments of the present invention provide a method and apparatus for processing a live stream. By adding the text characters corresponding to the audio data to the live video, synchronized subtitles are played while the live video is playing, which helps users understand the content of the live video and improves their viewing experience.
An embodiment of the present invention provides a method for processing a live stream. Referring to Fig. 1, Fig. 1 is a flowchart of the method for processing a live stream according to an embodiment of the present invention, which includes the following steps:
Step 101, the original live stream is decoded into original audio data and original video data.
In this step, the original live stream can be decoded into original audio data and original video data, so that subtitles can then be added to it.
It should be noted that a live stream is time-sensitive to play, and too much delay degrades the user's viewing experience. Therefore, a segment of the original live stream can first be obtained, decoded, and processed to add subtitles. Once that processing is complete, the processed live video can be played, and the subsequent original live stream is then processed in turn.
In one implementation, step 101 can include:
Decoding an original live stream of a preset duration into original audio data and original video data.
In a specific implementation, the segment to be processed can be an original live stream of a preset length: it is obtained and decoded first, and subtitles are then added. Once that processing is complete, the processed live video can be played, and the subsequent original live stream is then processed in turn. The preset duration can be set according to actual conditions.
In another implementation, step 101 can include:
Determining, in the original live stream within a preset duration interval, a time point at which speech pauses;
Decoding the not-yet-decoded live stream segment before the time point in the original live stream into original audio data and original video data.
Specifically, a duration interval can be preset, for example 30 seconds to 40 seconds. When the speech of the original live stream pauses within the preset duration interval, the time point at which the pause occurs is determined. The not-yet-decoded live stream segment before that time point can then be decoded into original audio data and original video data. This preserves the integrity of the speech in the captured original live stream and avoids splitting a sentence that was originally complete, which both makes subtitles easier to add and gives the user a better viewing experience.
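The pause-based segmentation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `choose_cut_point` is hypothetical, the list of pause times is assumed to come from some pause detector that the source does not describe, and cutting at the end of the interval when no pause is found is an assumed fallback.

```python
from typing import Sequence


def choose_cut_point(
    pause_times: Sequence[float],
    interval_start: float = 30.0,
    interval_end: float = 40.0,
) -> float:
    """Pick where to cut the not-yet-decoded stream, in seconds from its start.

    pause_times is hypothetical detector output: the times at which speech pauses.
    The earliest pause inside [interval_start, interval_end] is returned.
    Assumption (not from the source): with no pause in the interval,
    the cut falls at interval_end.
    """
    for t in sorted(pause_times):
        if interval_start <= t <= interval_end:
            return t
    return interval_end


if __name__ == "__main__":
    # A pause at 33.5 s lies inside the 30 s to 40 s interval.
    print(choose_cut_point([12.0, 33.5, 38.0]))
    # No pause inside the interval, so the assumed fallback applies.
    print(choose_cut_point([5.0, 50.0]))
```

Everything before the returned time point would then be decoded as one segment, so a sentence is less likely to be split across two subtitle segments.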
Step 102, speech recognition is performed on the original audio data to generate text characters corresponding to the original audio data.
In this step, speech recognition can be performed on the original audio data decoded in step 101 to generate the corresponding text characters. The text characters are saved so that a subsequent step can add them to the live stream, enabling playback of live video with subtitles.
Step 103, the original video data is delayed according to the first duration consumed by the speech recognition.
In this step, because the speech recognition in step 102 takes a certain amount of time, the original video data needs to be delayed according to the first duration consumed by the speech recognition. Only when the recognized text characters are added to the delayed video data can the subtitles be kept synchronized with the video.
Optionally, step 103 can include:
Determining the first duration consumed by the speech recognition;
Delaying the timestamps of the original video data by the first duration.
Specifically, the first duration consumed by the speech recognition in step 102 can be obtained first. Typically, when the preset duration of the original live stream obtained in step 101 is 30 seconds, the first duration consumed by the speech recognition may be about 5 seconds. Then, according to the first duration obtained, the timestamps of the original video data are delayed by the first duration, to ensure that the added subtitles are synchronized with the video.
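The timestamp delay in step 103 can be illustrated with a short sketch. The `VideoFrame` type, its `pts` field (a presentation timestamp in seconds), and the function `delay_video` are hypothetical names introduced only for this example; the source does not specify how frames or timestamps are represented.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class VideoFrame:
    """Hypothetical decoded frame; only the timestamp matters here."""
    pts: float            # presentation timestamp, in seconds
    payload: bytes = b""  # stand-in for the decoded picture


def delay_video(frames: List[VideoFrame], first_duration: float) -> List[VideoFrame]:
    """Return copies of the frames with every timestamp moved later by first_duration."""
    if first_duration < 0:
        raise ValueError("a processing duration cannot be negative")
    return [replace(frame, pts=frame.pts + first_duration) for frame in frames]


if __name__ == "__main__":
    segment = [VideoFrame(0.0), VideoFrame(0.5), VideoFrame(1.0)]
    # Suppose speech recognition took about 5 seconds for this segment.
    for frame in delay_video(segment, 5.0):
        print(frame.pts)
```

The input frames are left unchanged and new frames are returned, so the original timestamps remain available if another stage needs them.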
Step 104, the text characters are added to the delayed video data to generate target video data.
In this step, the text characters generated in step 102 are added to the delayed video data generated in step 103, producing target video data with subtitles. It will be understood that the video in the target video data is synchronized with the subtitles.
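One way to picture step 104 is to attach the recognized text to every delayed frame that falls inside the segment's display interval. The sketch below makes several assumptions that are not in the source: the subtitle is carried as frame metadata rather than drawn into the picture, and the names `Frame`, `add_subtitle`, `start`, and `end` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Frame:
    """Hypothetical delayed video frame with an optional subtitle."""
    pts: float          # delayed presentation timestamp, in seconds
    subtitle: str = ""  # text shown with this frame; empty means none


def add_subtitle(frames: List[Frame], text: str, start: float, end: float) -> List[Frame]:
    """Attach text to every frame whose timestamp lies in [start, end)."""
    return [
        replace(frame, subtitle=text) if start <= frame.pts < end else frame
        for frame in frames
    ]


if __name__ == "__main__":
    delayed = [Frame(5.0), Frame(5.5), Frame(6.0)]
    for frame in add_subtitle(delayed, "hello", 5.0, 6.0):
        print(frame.pts, repr(frame.subtitle))
```

A real system would more likely render the text into the picture or emit a separate subtitle track; the metadata form is used here only to keep the example self-contained.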
Step 105, the target video data is synchronously combined with the original audio data to generate a target live stream.
In this step, the target video data generated in step 104 is synchronously combined with the original audio data generated in step 101 to generate a target live stream, so that the live stream carries synchronized subtitles.
Optionally, step 105 can include:
Based on a preset reference time axis, synchronously combining the target video data with the original audio data according to the timestamps of the video frames in the target video data and the timestamps of the audio frames in the original audio data, to generate a target live stream.
Specifically, with the preset reference time axis as the baseline, the target video data and the original audio data can be synchronously combined according to the timestamps of the video frames in the target video data and the timestamps of the audio frames in the original audio data, generating the target live stream.
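The timestamp-based combination in step 105 can be sketched as interleaving two already-ordered packet sequences on a single time axis. This is an illustrative simplification, not the embodiment's synthesizer: the `Packet` tuple layout and the function `mux` are assumptions, and a real container format would also handle time bases, buffering, and drift.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical packet: (timestamp in seconds on the shared reference axis,
# stream kind, payload). The layout is an assumption made for this example.
Packet = Tuple[float, str, bytes]


def mux(audio: Iterable[Packet], video: Iterable[Packet]) -> Iterator[Packet]:
    """Interleave audio and video packets in timestamp order.

    Both inputs must already be sorted by timestamp, which holds for
    frames taken in order from a decoded segment.
    """
    return heapq.merge(audio, video, key=lambda packet: packet[0])


if __name__ == "__main__":
    audio = [(0.0, "audio", b""), (1.0, "audio", b"")]
    video = [(0.5, "video", b""), (1.5, "video", b"")]
    for timestamp, kind, _ in mux(audio, video):
        print(timestamp, kind)
```

`heapq.merge` is lazy, so packets are emitted as they are consumed rather than after both sequences have been fully read.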
Step 106, the target live stream is played.
In this step, the target live stream generated in step 105 is played. The target live stream is live video with subtitles, and subtitles synchronized with the live video help users understand its content and improve their viewing experience.
As can be seen, the method for processing a live stream provided by this embodiment of the present invention adds the text characters corresponding to the audio data to the live video, so that synchronized subtitles are played while the live video is playing. This helps users understand the content of the live video and improves their viewing experience.
For foreign-language live video, translated subtitles can also be added to the live video to make it easier for users to watch. For this application scenario, in one implementation, an embodiment of the present invention provides a method for processing a live stream. Referring to Fig. 2, Fig. 2 is another flowchart of the method for processing a live stream according to an embodiment of the present invention, which includes the following steps:
Step 201, an original live stream of a preset duration is decoded into original audio data and original video data.
For the detailed process and technical effects of this step, refer to step 101 in the method for processing a live stream shown in Fig. 1; details are not repeated here.
Step 202, speech recognition is performed on the original audio data to generate text characters corresponding to the original audio data.
For the detailed process and technical effects of this step, refer to step 102 in the method for processing a live stream shown in Fig. 1; details are not repeated here.
Step 203, the first duration consumed by the speech recognition is determined.
In this step, the first duration consumed by the speech recognition in step 202 can be obtained.
Typically, when the preset duration of the original live stream obtained in step 201 is 60 seconds, the first duration consumed by the speech recognition may be about 10 seconds. When the preset duration of the original live stream obtained in step 201 is 120 seconds, the first duration consumed by the speech recognition may be about 20 seconds.
Step 204, the text characters are translated into a preset language, and a second duration is generated.
The second duration is the time consumed in translating the text characters into the preset language.
In this step, the text characters corresponding to the original audio data generated in step 202 can be translated into the preset language according to actual requirements, and the second duration consumed by the translation is calculated.
For example, for English live video, the text characters corresponding to the original audio data are also in English. The English text characters can then be translated into Chinese to make viewing easier for domestic users.
In this embodiment, in addition to speech recognition of the original audio data, the generated text characters are also translated. Therefore, the original video data needs to be delayed by the total of the first duration consumed by the speech recognition and the second duration consumed by the translation. Only when the resulting text characters are added to the delayed video data can the subtitles be kept synchronized with the video.
Step 205, the timestamps of the original video data are delayed by the sum of the first duration and the second duration.
In this step, the original video data is delayed according to the first duration calculated in step 203 and the second duration calculated in step 204, to ensure that the subtitles are synchronized with the video.
Regarding the execution order of steps 202 to 204, it should be noted that step 202 may be performed before step 204, or step 204 may be performed before step 202.
Step 206, the translated text characters are added to the delayed video data to generate target video data.
In this step, the translated text characters generated in step 204 are added to the delayed video data generated in step 205, producing target video data with translated subtitles. It will be understood that the video in the target video data is synchronized with the translated subtitles.
Step 207, the target video data is synchronously combined with the original audio data to generate a target live stream.
In this step, the target video data generated in step 206 is synchronously combined with the original audio data generated in step 201 to generate a target live stream, so that the live stream carries synchronized translated subtitles.
Step 208, the target live stream is played.
In this step, the target live stream generated in step 207 is played. The target live stream is live video with translated subtitles. Translated subtitles synchronized with the live video help users better understand its content, especially for foreign-language live video, and improve their viewing experience.
As can be seen, for foreign-language live video, the method for processing a live stream provided by this embodiment of the present invention adds the translated text characters corresponding to the audio data to the live video, so that synchronized translated subtitles are played while the live video is playing. This helps users understand the content of foreign-language live video and greatly improves their viewing experience.
In order to improve translation after subtitle accuracy, can also to the subtitle after translation carry out calibration process.For above-mentioned
Application scenarios, in another realization method, the embodiment of the present invention separately provides a kind of method for handling live TV stream, with reference to figure 3, figure
3 be another flow chart of the method for the processing live TV stream of the embodiment of the present invention, is included the following steps:
Step 301: within a preset duration interval of the original live stream, determine a time point at which speech pauses.
Step 302: decode the not-yet-decoded live stream segment that precedes the time point in the original live stream into original audio data and original video data.
For the detailed process and technical effect of steps 301 and 302, refer to the description following step 101 in the method for processing a live stream shown in Fig. 1; details are not repeated here.
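This section does not itself describe how the pause time point is found. Purely as an illustrative sketch, assuming that PCM samples for the interval are available and that a pause can be approximated by a low-energy frame, steps 301 and 302 might be outlined as follows. The names `find_pause_point` and `split_at_pause`, the parameters `frame_ms` and `threshold`, and the energy criterion itself are assumptions made for this example and are not taken from the patent.

```python
from typing import Optional, Sequence, Tuple


def find_pause_point(samples: Sequence[float], sample_rate: int,
                     window_start_s: float, window_end_s: float,
                     frame_ms: int = 20, threshold: float = 0.01) -> Optional[float]:
    """Return the start time (seconds) of the first low-energy frame inside the
    preset interval [window_start_s, window_end_s), or None if none is found."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    start = int(window_start_s * sample_rate)
    end = min(int(window_end_s * sample_rate), len(samples))
    for pos in range(start, end - frame_len + 1, frame_len):
        frame = samples[pos:pos + frame_len]
        energy = sum(x * x for x in frame) / frame_len  # mean squared amplitude
        if energy < threshold:  # hypothetical pause criterion
            return pos / sample_rate
    return None


def split_at_pause(samples: Sequence[float], sample_rate: int,
                   pause_s: float) -> Tuple[Sequence[float], Sequence[float]]:
    """Split into the segment before the pause (processed now) and the rest."""
    cut = round(pause_s * sample_rate)
    return samples[:cut], samples[cut:]
```

Cutting at a pause rather than at a fixed offset keeps a spoken sentence within one segment, which is presumably why the method looks for a pause inside the preset interval.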
Step 303: perform speech recognition on the original audio data to generate the text characters corresponding to the original audio data.
For the detailed process and technical effect of this step, refer to step 102 in the method for processing a live stream shown in Fig. 1; details are not repeated here.
Step 304: determine the first duration spent on the speech recognition.
For the detailed process and technical effect of this step, refer to step 203 in the method for processing a live stream shown in Fig. 2; details are not repeated here.
Step 305: translate the text characters into a preset language and obtain a second duration, namely the time spent on the translation.
For the detailed process and technical effect of this step, refer to step 204 in the method for processing a live stream shown in Fig. 2; details are not repeated here.
Step 306: perform error correction on the translated text characters.
In this step, the translated text characters generated in step 305 may be corrected to ensure the accuracy of the subtitles.
The error correction may be performed manually or by a machine.
Step 307: determine the third duration spent on the error correction.
In this step, the third duration spent in step 306 on correcting the translated text characters is calculated.
In this embodiment, in addition to the speech recognition performed on the original audio data, the generated text characters are also translated and corrected. The original video data can therefore be delayed by the total of the first duration spent on speech recognition, the second duration spent on translation and the third duration spent on error correction. Only when the recognized text characters are added to video data delayed in this way are the subtitles synchronized with the video.
Step 308: delay the timestamps of the original video data by the sum of the first duration, the second duration and the third duration.
In this step, the original video data is delayed according to the first duration determined in step 304, the second duration obtained in step 305 and the third duration determined in step 307, so that the subtitles stay synchronized with the video.
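The timestamp delay of step 308 can be illustrated with a minimal sketch. The `VideoFrame` type, the millisecond unit and the function names below are assumptions introduced for this example; the source only states that the timestamps are delayed by the sum of the three durations.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class VideoFrame:
    pts_ms: int          # presentation timestamp, in milliseconds (assumed unit)
    payload: bytes = b""


def total_delay_ms(first_ms: int, second_ms: int, third_ms: int) -> int:
    """Sum of the recognition, translation and error-correction durations."""
    return first_ms + second_ms + third_ms


def delay_frames(frames: List[VideoFrame], delay_ms: int) -> List[VideoFrame]:
    """Return copies of the frames with every timestamp shifted by delay_ms."""
    return [replace(f, pts_ms=f.pts_ms + delay_ms) for f in frames]
```

For example, with hypothetical durations of 800 ms, 300 ms and 1500 ms, every frame timestamp is shifted by 2600 ms.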
Step 309: add the translated and corrected text characters to the delayed video data to generate target video data.
In this step, the translated and corrected text characters generated in step 306 are added to the video data delayed in step 308, generating target video data with accurately translated subtitles.
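As a rough sketch of step 309, the subtitle text can be attached to those delayed frames whose shifted timestamps fall within the shifted time range of the audio segment that produced the text. The dictionary keys `pts_ms` and `subtitle`, the function name `add_subtitle` and the idea of tagging frames instead of rendering pixels are assumptions for illustration only.

```python
from typing import Dict, List


def add_subtitle(delayed_frames: List[Dict], text: str,
                 seg_start_ms: int, seg_end_ms: int, delay_ms: int) -> List[Dict]:
    """Tag each delayed frame inside the shifted segment with the subtitle text.

    seg_start_ms and seg_end_ms describe the audio segment on the original
    timeline; delay_ms is the delay already applied to the video timestamps.
    """
    lo, hi = seg_start_ms + delay_ms, seg_end_ms + delay_ms
    tagged_frames = []
    for frame in delayed_frames:
        tagged = dict(frame)  # copy so the input list is not mutated
        if lo <= frame["pts_ms"] < hi:
            tagged["subtitle"] = text
        tagged_frames.append(tagged)
    return tagged_frames
```

A real implementation would render the text into the picture or carry it as a subtitle track; the tagging above only shows which frames the text belongs to.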
Step 310: synchronously combine the target video data with the original audio data to generate a target live stream.
In this step, the target video data generated in step 309 is synchronously combined with the original audio data generated in step 302 to produce the target live stream, so that the live stream carries synchronized, accurately translated subtitles.
Step 311: play the target live stream.
In this step, the target live stream generated in step 310 is played. The target live stream is a live video broadcast with accurately translated subtitles. Subtitles that are both synchronized with the broadcast and accurate help the user understand its content easily and correctly, and can considerably improve the viewing experience, especially for foreign-language broadcasts.
It can be seen that the method for processing a live stream provided by this embodiment of the present invention adds the translated text characters corresponding to the audio data to the live video broadcast and corrects the translated text characters, so that synchronized and accurate subtitles are played together with the broadcast. This helps the user understand the content of the broadcast easily and correctly, especially for foreign-language broadcasts, and gives the user a better viewing experience.
An embodiment of the present invention further provides a system for processing a live stream. Referring to Fig. 4, Fig. 4 is a schematic diagram of the system for processing a live stream according to an embodiment of the present invention.
As shown in Fig. 4, the system for processing a live stream includes a stream-pulling module 401, a decoding module 402, a subtitle acquisition module 403, an audio encoding module 404, a video encoding module 405, an encapsulation module 406 and a stream-pushing module 407. The subtitle acquisition module 403 includes a speech recognition submodule 4031, a translation submodule 4032 and a manual correction submodule 4033. The video encoding module 405 includes a video data buffer delay submodule 4051, a subtitle overlay submodule 4052 and a video encoding submodule 4053.
The workflow of the system for processing a live stream is as follows:
First step: the stream-pulling module 401 obtains and downloads an original live stream of a preset duration from a server.
In practical applications, the live video resource to be processed is usually stored on a server of a multimedia website.
Second step: the decoding module 402 decodes the original live stream of the preset duration into original audio data and original video data.
Specifically, the decoding module 402 may decode a segment of the original live stream having the preset duration into original audio data and original video data, so that subtitles can then be added to that segment.
Third step: the original audio data is duplicated; one copy is sent to the subtitle acquisition module 403 and the other is sent to the audio encoding module 404.
Specifically, of the two copies of the original audio data, one is sent to the subtitle acquisition module 403 to generate the subtitles corresponding to the original audio, and the other is sent to the audio encoding module 404.
Fourth step: the audio encoding module 404 compresses the original audio data.
Specifically, because the original audio data is large, the audio encoding module 404 compresses it to facilitate network transmission.
Fifth step: the speech recognition submodule 4031 in the subtitle acquisition module 403 performs speech recognition on the original audio data and generates the text characters corresponding to the original audio data.
Specifically, the speech recognition submodule 4031 may perform speech recognition on the original audio data and generate the corresponding text characters, so that subsequent steps can add the text characters to the live stream and a live video broadcast with subtitles can be played.
Sixth step: the translation submodule 4032 in the subtitle acquisition module 403 translates the text characters into a preset language.
Specifically, the translation submodule 4032 may, according to actual requirements, translate the text characters corresponding to the original audio data into the preset language.
For example, the foreign-language text characters corresponding to the original audio data of a foreign-language live video broadcast may be translated into Chinese. In this way, a user who likes watching foreign-language broadcasts but is not proficient in the foreign language can still follow the content through the translated subtitles, which improves the user experience.
Seventh step: the manual correction submodule 4033 in the subtitle acquisition module 403 performs error correction on the translated text characters.
Specifically, the manual correction submodule 4033 may correct the translated text characters to ensure the accuracy of the subtitles.
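The fifth to seventh steps each consume time that is later used to delay the video. A minimal sketch of how a subtitle acquisition stage could measure those durations is given below. The callables `recognize`, `translate` and `correct` are placeholders supplied by the caller, and the helper names `timed` and `build_subtitle` are assumptions for this example, not interfaces described in the source.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[[Any], Any], arg: Any) -> Tuple[Any, float]:
    """Run fn(arg) and return its result with the elapsed time in seconds."""
    start = time.monotonic()
    result = fn(arg)
    return result, time.monotonic() - start


def build_subtitle(audio: Any,
                   recognize: Callable[[Any], str],
                   translate: Callable[[str], str],
                   correct: Callable[[str], str]) -> Tuple[str, float]:
    """Chain recognition, translation and correction.

    Returns the final subtitle text and the total time spent, i.e. the sum of
    the first, second and third durations.
    """
    text, first = timed(recognize, audio)
    text, second = timed(translate, text)
    text, third = timed(correct, text)
    return text, first + second + third
```

`time.monotonic` is used instead of wall-clock time so that the measured durations are not affected by system clock adjustments.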
Eighth step: the video data buffer delay submodule 4051 in the video encoding module 405 delays the original video data according to the duration spent in the third to fourth steps.
Specifically, the video data buffer delay submodule 4051 may delay the original video data by the total of the duration spent on speech recognition, the duration spent on translation and the duration spent on error correction. Only when the recognized text characters are added to video data delayed in this way are the subtitles synchronized with the video.
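One possible way to picture a buffer-based delay is a queue that holds each frame until the configured delay has elapsed. The class below is only a sketch under that assumption; the name `VideoDelayBuffer`, its methods and the use of arrival times in milliseconds are hypothetical and are not described in the source.

```python
from collections import deque
from typing import Deque, List, Tuple


class VideoDelayBuffer:
    """Hold frames for delay_ms before releasing them in arrival order."""

    def __init__(self, delay_ms: int) -> None:
        self.delay_ms = delay_ms
        self._queue: Deque[Tuple[int, bytes]] = deque()

    def push(self, arrival_ms: int, frame: bytes) -> None:
        """Store a frame together with the time at which it arrived."""
        self._queue.append((arrival_ms, frame))

    def pop_ready(self, now_ms: int) -> List[bytes]:
        """Release every frame that has been buffered for at least delay_ms."""
        ready = []
        while self._queue and now_ms - self._queue[0][0] >= self.delay_ms:
            ready.append(self._queue.popleft()[1])
        return ready
```

With a delay of 2600 ms, a frame pushed at time 0 is released by the first `pop_ready` call made at or after 2600 ms, by which point the subtitle for the corresponding audio is assumed to be available.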
Ninth step: the subtitle overlay submodule 4052 in the video encoding module 405 adds the translated and corrected text characters to the delayed video data to generate target video data.
Specifically, the subtitle overlay submodule 4052 may add the translated and corrected text characters to the delayed video data, generating target video data with accurately translated subtitles.
Tenth step: the video encoding submodule 4053 in the video encoding module 405 compresses the target video data.
Specifically, because the target video data is large, the video encoding submodule 4053 compresses it to facilitate network transmission.
Eleventh step: the encapsulation module 406 synchronously combines the compressed target video data with the compressed original audio data to generate a target live stream.
Specifically, the encapsulation module 406 may synchronously combine the compressed target video data with the compressed original audio data to generate the target live stream, so that the live stream carries synchronized, accurately translated subtitles.
Twelfth step: the stream-pushing module 407 plays the target live stream.
Specifically, the stream-pushing module 407 plays the target live stream. The target live stream is a live video broadcast with accurately translated subtitles. Subtitles that are both synchronized with the broadcast and accurate help the user understand its content easily and correctly, and can considerably improve the viewing experience, especially for foreign-language broadcasts.
It can be seen that the system for processing a live stream provided by this embodiment of the present invention adds the translated text characters corresponding to the audio data to the live video broadcast and corrects the translated text characters, so that synchronized and accurate subtitles are played together with the broadcast. This helps the user understand the content of the broadcast easily and correctly, especially for foreign-language broadcasts, and gives the user a better viewing experience.
An embodiment of the present invention further provides a device for processing a live stream. Referring to Fig. 5, Fig. 5 is a structural diagram of the device for processing a live stream according to an embodiment of the present invention. The device includes:
a decoding unit 501, configured to decode an original live stream into original audio data and original video data;
a recognition unit 502, configured to perform speech recognition on the original audio data and generate the text characters corresponding to the original audio data;
a delay unit 503, configured to delay the original video data according to the first duration spent on the speech recognition;
an adding unit 504, configured to add the text characters to the delayed video data to generate target video data;
a synthesis unit 505, configured to synchronously combine the target video data with the original audio data to generate a target live stream;
a playing unit 506, configured to play the target live stream.
Optionally, the decoding unit 501 is specifically configured to decode an original live stream of a preset duration into original audio data and original video data.
Optionally, the decoding unit 501 includes a first determination subunit and a decoding subunit;
the first determination subunit is configured to determine, within a preset duration interval of the original live stream, a time point at which speech pauses;
the decoding subunit is configured to decode the not-yet-decoded live stream segment that precedes the time point in the original live stream into original audio data and original video data.
Optionally, the delay unit 503 includes a second determination subunit and a delay subunit;
the second determination subunit is configured to determine the first duration spent on the speech recognition;
the delay subunit is configured to delay the timestamps of the original video data by the first duration.
Optionally, the device further includes:
a translation unit, configured to translate the text characters into a preset language and obtain a second duration, the second duration being the time spent translating the text characters into the preset language;
the delay subunit is specifically configured to delay the timestamps of the original video data by the sum of the first duration and the second duration;
the adding unit is specifically configured to add the translated text characters to the delayed video data to generate target video data.
Optionally, the device further includes:
an error correction unit, configured to perform error correction on the translated text characters;
a determination unit, configured to determine the third duration spent on the error correction;
the delay subunit is specifically configured to delay the timestamps of the original video data by the sum of the first duration, the second duration and the third duration;
the adding unit is specifically configured to add the translated and corrected text characters to the delayed video data to generate target video data.
Optionally, the synthesis unit 505 is specifically configured to synchronously combine the target video data with the original audio data, based on a preset reference time axis and according to the timestamps of the video frames in the target video data and the timestamps of the audio frames in the original audio data, to generate the target live stream.
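Timestamp-based combination of the two streams can be sketched as an ordered merge. The example assumes that the video and audio timestamps have already been expressed on the same reference time axis and that each input is sorted by timestamp. The tuple layout and the name `mux_by_timestamp` are assumptions for illustration; the source does not specify a container format or an interleaving rule.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# (timestamp in ms on the shared reference axis, stream kind, payload)
Packet = Tuple[int, str, bytes]


def mux_by_timestamp(video: Iterable[Tuple[int, bytes]],
                     audio: Iterable[Tuple[int, bytes]]) -> Iterator[Packet]:
    """Interleave two timestamp-sorted streams into one ordered packet stream."""
    video_packets = ((ts, "video", data) for ts, data in video)
    audio_packets = ((ts, "audio", data) for ts, data in audio)
    # heapq.merge lazily merges already-sorted inputs, comparing by timestamp.
    return heapq.merge(video_packets, audio_packets, key=lambda p: p[0])
```

Because the merge is lazy, packets can be emitted as soon as both inputs have produced their next element, which suits a continuously running live stream.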
It can be seen that the device for processing a live stream provided by this embodiment of the present invention adds the text characters corresponding to the audio data to the live video broadcast, so that synchronized subtitles are played together with the broadcast. This helps the user understand the content of the broadcast and improves the viewing experience.
An embodiment of the present invention further provides an electronic device. Referring to Fig. 6, Fig. 6 is a schematic diagram of the electronic device according to an embodiment of the present invention. As shown in Fig. 6, the electronic device includes a processor 601, a communication interface 602, a memory 603 and a communication bus 604, where the processor 601, the communication interface 602 and the memory 603 communicate with one another through the communication bus 604;
the memory 603 is configured to store a computer program;
the processor 601 is configured to implement the following steps when executing the program stored in the memory 603:
decoding an original live stream into original audio data and original video data;
performing speech recognition on the original audio data to generate the text characters corresponding to the original audio data;
delaying the original video data according to the first duration spent on the speech recognition;
adding the text characters to the delayed video data to generate target video data;
synchronously combining the target video data with the original audio data to generate a target live stream;
playing the target live stream.
The communication bus of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus and so on. For ease of representation, it is shown in the figure as a single thick line, which does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the above electronic device and other devices.
The memory may include random access memory (RAM) and may also include non-volatile memory, for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment provided by the present invention, a computer-readable storage medium is further provided. The computer-readable storage medium stores instructions which, when run on a computer, cause the computer to perform the method for processing a live stream according to any one of the above embodiments.
In another embodiment provided by the present invention, a computer program product containing instructions is further provided which, when run on a computer, causes the computer to perform the method for processing a live stream according to any one of the above embodiments.
The above embodiments may be implemented wholly or partly in software, hardware, firmware or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (for example coaxial cable, optical fiber or digital subscriber line (DSL)) or a wireless manner (for example infrared, radio or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example a floppy disk, hard disk or magnetic tape), an optical medium (for example a DVD) or a semiconductor medium (for example a solid state disk (SSD)).
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The embodiments in this specification are described in a related manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, its description is relatively brief; for related details, refer to the corresponding description of the method embodiment.
Claims (15)
- 1. A method for processing a live stream, characterized in that the method comprises: decoding an original live stream into original audio data and original video data; performing speech recognition on the original audio data to generate text characters corresponding to the original audio data; delaying the original video data according to a first duration spent on the speech recognition; adding the text characters to the delayed video data to generate target video data; synchronously combining the target video data with the original audio data to generate a target live stream; and playing the target live stream.
- 2. The method according to claim 1, characterized in that the step of decoding the original live stream into original audio data and original video data comprises: decoding an original live stream of a preset duration into original audio data and original video data.
- 3. The method according to claim 1, characterized in that the step of decoding the original live stream into original audio data and original video data comprises: determining, within a preset duration interval of the original live stream, a time point at which speech pauses; and decoding the not-yet-decoded live stream segment that precedes the time point in the original live stream into original audio data and original video data.
- 4. The method according to claim 1, characterized in that the step of delaying the original video data according to the first duration spent on the speech recognition comprises: determining the first duration spent on the speech recognition; and delaying the timestamps of the original video data by the first duration.
- 5. The method according to claim 4, characterized in that, after the step of performing speech recognition on the original audio data to generate the text characters corresponding to the original audio data, the method further comprises: translating the text characters into a preset language and obtaining a second duration, the second duration being the time spent translating the text characters into the preset language; the step of delaying the timestamps of the original video data by the first duration comprises: delaying the timestamps of the original video data by the sum of the first duration and the second duration; and the step of adding the text characters to the delayed video data to generate target video data comprises: adding the translated text characters to the delayed video data to generate target video data.
- 6. The method according to claim 5, characterized in that, after the step of translating the text characters into the preset language, the method further comprises: performing error correction on the translated text characters; and determining a third duration spent on the error correction; the step of delaying the timestamps of the original video data by the sum of the first duration and the second duration comprises: delaying the timestamps of the original video data by the sum of the first duration, the second duration and the third duration; and the step of adding the text characters to the delayed video data to generate target video data comprises: adding the translated and corrected text characters to the delayed video data to generate target video data.
- 7. The method according to claim 1, characterized in that the step of synchronously combining the target video data with the original audio data to generate a target live stream comprises: synchronously combining the target video data with the original audio data, based on a preset reference time axis and according to the timestamps of the video frames in the target video data and the timestamps of the audio frames in the original audio data, to generate the target live stream.
- 8. A device for processing a live stream, characterized in that the device comprises: a decoding unit, configured to decode an original live stream into original audio data and original video data; a recognition unit, configured to perform speech recognition on the original audio data and generate text characters corresponding to the original audio data; a delay unit, configured to delay the original video data according to a first duration spent on the speech recognition; an adding unit, configured to add the text characters to the delayed video data to generate target video data; a synthesis unit, configured to synchronously combine the target video data with the original audio data to generate a target live stream; and a playing unit, configured to play the target live stream.
- 9. The device according to claim 8, characterized in that the decoding unit is specifically configured to decode an original live stream of a preset duration into original audio data and original video data.
- 10. The device according to claim 8, characterized in that the decoding unit comprises a first determination subunit and a decoding subunit; the first determination subunit is configured to determine, within a preset duration interval of the original live stream, a time point at which speech pauses; and the decoding subunit is configured to decode the not-yet-decoded live stream segment that precedes the time point in the original live stream into original audio data and original video data.
- 11. The device according to claim 8, characterized in that the delay unit comprises a second determination subunit and a delay subunit; the second determination subunit is configured to determine the first duration spent on the speech recognition; and the delay subunit is configured to delay the timestamps of the original video data by the first duration.
- 12. The device according to claim 11, characterized in that the device further comprises: a translation unit, configured to translate the text characters into a preset language and obtain a second duration, the second duration being the time spent translating the text characters into the preset language; the delay subunit is specifically configured to delay the timestamps of the original video data by the sum of the first duration and the second duration; and the adding unit is specifically configured to add the translated text characters to the delayed video data to generate target video data.
- 13. The device according to claim 12, characterized in that the device further comprises: an error correction unit, configured to perform error correction on the translated text characters; and a determination unit, configured to determine a third duration spent on the error correction; the delay subunit is specifically configured to delay the timestamps of the original video data by the sum of the first duration, the second duration and the third duration; and the adding unit is specifically configured to add the translated and corrected text characters to the delayed video data to generate target video data.
- 14. The device according to claim 8, characterized in that the synthesis unit is specifically configured to synchronously combine the target video data with the original audio data, based on a preset reference time axis and according to the timestamps of the video frames in the target video data and the timestamps of the audio frames in the original audio data, to generate the target live stream.
- 15. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement the method steps of any one of claims 1-7 when executing the program stored in the memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711172649.1A CN108063970A (en) | 2017-11-22 | 2017-11-22 | A kind of method and apparatus for handling live TV stream |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108063970A true CN108063970A (en) | 2018-05-22 |
Family
ID=62134976
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711172649.1A Pending CN108063970A (en) | 2017-11-22 | 2017-11-22 | A kind of method and apparatus for handling live TV stream |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108063970A (en) |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108401192A (en) * | 2018-04-25 | 2018-08-14 | 腾讯科技(深圳)有限公司 | Video stream processing method, device, computer equipment and storage medium |
CN108419113A (en) * | 2018-05-24 | 2018-08-17 | 广州酷狗计算机科技有限公司 | Caption presentation method and device |
CN108833403A (en) * | 2018-06-11 | 2018-11-16 | 颜彦 | It is a kind of to melt media information publication generation method with embedded code transplanting |
CN110691218A (en) * | 2019-09-09 | 2020-01-14 | 苏州臻迪智能科技有限公司 | Audio data transmission method and device, electronic equipment and readable storage medium |
WO2020024962A1 (en) * | 2018-08-01 | 2020-02-06 | 北京微播视界科技有限公司 | Method and apparatus for processing data |
CN110933449A (en) * | 2019-12-20 | 2020-03-27 | 北京奇艺世纪科技有限公司 | Method, system and device for synchronizing external data and video pictures |
CN111522971A (en) * | 2020-04-08 | 2020-08-11 | 广东小天才科技有限公司 | Method and device for assisting user in attending lessons in live broadcast teaching |
CN112188241A (en) * | 2020-10-09 | 2021-01-05 | 上海网达软件股份有限公司 | Method and system for real-time subtitle generation of live stream |
CN112437337A (en) * | 2020-02-12 | 2021-03-02 | 上海哔哩哔哩科技有限公司 | Method, system and equipment for realizing live broadcast real-time subtitles |
CN112637620A (en) * | 2020-12-09 | 2021-04-09 | 杭州艾耕科技有限公司 | Method and device for identifying and analyzing articles and languages in audio and video stream in real time |
CN113721704A (en) * | 2021-08-30 | 2021-11-30 | 成都华栖云科技有限公司 | Simultaneous interpretation system of video stream and implementation method thereof |
CN113873306A (en) * | 2021-09-23 | 2021-12-31 | 深圳市多狗乐智能研发有限公司 | Method for projecting real-time translation caption superposition picture to live broadcast room through hardware |
CN113992926A (en) * | 2021-10-19 | 2022-01-28 | 北京有竹居网络技术有限公司 | Interface display method and device, electronic equipment and storage medium |
CN115086691A (en) * | 2021-03-16 | 2022-09-20 | 北京有竹居网络技术有限公司 | Subtitle optimization method and device, electronic equipment and storage medium |
CN115150631A (en) * | 2021-03-16 | 2022-10-04 | 北京有竹居网络技术有限公司 | Subtitle processing method, subtitle processing device, electronic equipment and storage medium |
CN116017011A (en) * | 2021-10-22 | 2023-04-25 | 成都极米科技股份有限公司 | Subtitle synchronization method, playing device and readable storage medium for audio and video |
WO2024087732A1 (en) * | 2022-10-25 | 2024-05-02 | 上海哔哩哔哩科技有限公司 | Livestreaming data processing method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103327397A (en) * | 2012-03-22 | 2013-09-25 | 联想(北京)有限公司 | Subtitle synchronous display method and system of media file |
CN105704579A (en) * | 2014-11-27 | 2016-06-22 | 南京苏宁软件技术有限公司 | Real-time automatic caption translation method during media playing and system |
CN105828216A (en) * | 2016-03-31 | 2016-08-03 | 北京奇艺世纪科技有限公司 | Live broadcast video subtitle synthesis system and method |
CN106340294A (en) * | 2016-09-29 | 2017-01-18 | 安徽声讯信息技术有限公司 | Synchronous translation-based news live streaming subtitle on-line production system |
2017-11-22: CN application CN201711172649.1A (publication CN108063970A), status: Pending
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108401192B (en) * | 2018-04-25 | 2022-02-22 | 腾讯科技(深圳)有限公司 | Video stream processing method and device, computer equipment and storage medium |
US11463779B2 (en) | 2018-04-25 | 2022-10-04 | Tencent Technology (Shenzhen) Company Limited | Video stream processing method and apparatus, computer device, and storage medium |
CN108401192A (en) * | 2018-04-25 | 2018-08-14 | 腾讯科技(深圳)有限公司 | Video stream processing method, device, computer equipment and storage medium |
CN108419113A (en) * | 2018-05-24 | 2018-08-17 | 广州酷狗计算机科技有限公司 | Caption presentation method and device |
CN108419113B (en) * | 2018-05-24 | 2021-01-08 | 广州酷狗计算机科技有限公司 | Subtitle display method and device |
CN108833403A (en) * | 2018-06-11 | 2018-11-16 | 颜彦 | It is a kind of to melt media information publication generation method with embedded code transplanting |
WO2020024962A1 (en) * | 2018-08-01 | 2020-02-06 | 北京微播视界科技有限公司 | Method and apparatus for processing data |
CN110691218A (en) * | 2019-09-09 | 2020-01-14 | 苏州臻迪智能科技有限公司 | Audio data transmission method and device, electronic equipment and readable storage medium |
CN110933449A (en) * | 2019-12-20 | 2020-03-27 | 北京奇艺世纪科技有限公司 | Method, system and device for synchronizing external data and video pictures |
CN112437337A (en) * | 2020-02-12 | 2021-03-02 | 上海哔哩哔哩科技有限公司 | Method, system and equipment for realizing live broadcast real-time subtitles |
US11595731B2 (en) | 2020-02-12 | 2023-02-28 | Shanghai Bilibili Technology Co., Ltd. | Implementation method and system of real-time subtitle in live broadcast and device |
CN111522971A (en) * | 2020-04-08 | 2020-08-11 | 广东小天才科技有限公司 | Method and device for assisting user in attending lessons in live broadcast teaching |
CN112188241A (en) * | 2020-10-09 | 2021-01-05 | 上海网达软件股份有限公司 | Method and system for real-time subtitle generation of live stream |
CN112637620A (en) * | 2020-12-09 | 2021-04-09 | 杭州艾耕科技有限公司 | Method and device for identifying and analyzing articles and languages in audio and video stream in real time |
CN115086691A (en) * | 2021-03-16 | 2022-09-20 | 北京有竹居网络技术有限公司 | Subtitle optimization method and device, electronic equipment and storage medium |
CN115150631A (en) * | 2021-03-16 | 2022-10-04 | 北京有竹居网络技术有限公司 | Subtitle processing method, subtitle processing device, electronic equipment and storage medium |
CN113721704A (en) * | 2021-08-30 | 2021-11-30 | 成都华栖云科技有限公司 | Simultaneous interpretation system of video stream and implementation method thereof |
CN113873306A (en) * | 2021-09-23 | 2021-12-31 | 深圳市多狗乐智能研发有限公司 | Method for projecting real-time translation caption superposition picture to live broadcast room through hardware |
CN113992926A (en) * | 2021-10-19 | 2022-01-28 | 北京有竹居网络技术有限公司 | Interface display method and device, electronic equipment and storage medium |
CN113992926B (en) * | 2021-10-19 | 2023-09-12 | 北京有竹居网络技术有限公司 | Interface display method, device, electronic equipment and storage medium |
CN116017011A (en) * | 2021-10-22 | 2023-04-25 | 成都极米科技股份有限公司 | Subtitle synchronization method, playing device and readable storage medium for audio and video |
CN116017011B (en) * | 2021-10-22 | 2024-04-23 | 成都极米科技股份有限公司 | Subtitle synchronization method, playing device and readable storage medium for audio and video |
WO2024087732A1 (en) * | 2022-10-25 | 2024-05-02 | 上海哔哩哔哩科技有限公司 | Livestreaming data processing method and system |
Similar Documents
Publication | Title |
---|---|
CN108063970A (en) | A kind of method and apparatus for handling live TV stream |
US20200336796A1 (en) | Video stream processing method and apparatus, computer device, and storage medium |
US9961398B2 (en) | Method and device for switching video streams |
US20170111414A1 (en) | Video playing method and device |
CN109842795B (en) | Audio and video synchronization performance testing method and device, electronic equipment and storage medium |
CN106973317A (en) | Multimedia data processing method, multimedia data providing method, apparatus and system |
CN112954434B (en) | Subtitle processing method, system, electronic device and storage medium |
WO2017107516A1 (en) | Method and device for playing network video |
CN107659538A (en) | A kind of method and apparatus of Video processing |
CN112437337A (en) | Method, system and equipment for realizing live broadcast real-time subtitles |
CN104918108A (en) | Video accurate positioning device and method based on HLS (HTTP Live Streaming) protocol |
EP2953132A1 (en) | Method and apparatus for processing audio/video file |
CN106385525A (en) | Video play method and device |
CN103648011A (en) | Audio and video synchronization device and method based on HLS protocol |
CN106331820B (en) | Audio and video synchronization processing method and device |
JP6646661B2 (en) | Method and apparatus for transmitting and receiving media data |
CN107371053B (en) | Audio and video stream contrast analysis method and device |
JP2018026778A (en) | Transmission apparatus, transmission method, reception apparatus, and reception method |
CN103491430A (en) | Streaming media data processing method and electronic device |
CN104113778A (en) | Video stream decoding method and device |
CN115623264A (en) | Live stream subtitle processing method and device and live stream playing method and device |
CN106303754A (en) | A kind of audio data play method and device |
JP5852000B2 (en) | Test management apparatus and method for testing interactivity to comply with the Brazilian digital television standard |
CN106162323A (en) | A kind of video data handling procedure and device |
WO2017071642A1 (en) | Media playback method, device and computer storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180522 |