CN106303303A - Method and device for translating subtitles of media file and electronic equipment - Google Patents


Info

Publication number
CN106303303A
CN106303303A (publication number) · CN201610683339.5A (application number)
Authority
CN
China
Prior art keywords
media file
caption
captions
timestamp
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610683339.5A
Other languages
Chinese (zh)
Inventor
田昊 (Tian Hao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Internet Security Software Co Ltd
Original Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Internet Security Software Co Ltd filed Critical Beijing Kingsoft Internet Security Software Co Ltd
Priority to CN201610683339.5A priority Critical patent/CN106303303A/en
Publication of CN106303303A publication Critical patent/CN106303303A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/278Subtitling
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/237Communication with additional data server
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26283Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for associating distribution time parameters to content, e.g. to generate electronic program guide data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Machine Translation (AREA)

Abstract

An embodiment of the invention provides a method and device for translating the subtitles of a media file, and electronic equipment. The method comprises the following steps: obtaining the subtitle text and timestamps of a media file from its associated subtitles, its audio, or subtitle content embedded in its frame pictures; translating the subtitle text into text of a set language; and matching the text of the set language to the timestamps to generate translated subtitle information for the media file. The invention can obtain the subtitle text of any media file for translation and generate subtitles in the set language according to the timestamps, solving the prior-art problem of poor universality in media-file subtitle translation.

Description

Method and device for translating the subtitles of a media file, and electronic equipment
Technical field
The present invention relates to the field of electronic technology, and in particular to a method and device for translating the subtitles of a media file, and to electronic equipment.
Background technology
With the development of network video technology, more and more overseas productions are broadcast across borders over Internet channels. When users watch these productions, some video files provide subtitles, but not necessarily in a language the user is familiar with, while other videos provide no subtitles at all. Because users' foreign-language proficiency is limited, it is difficult for them to understand the subtitles or the meaning of the dialogue, so they cannot follow the video content, which reduces their interest in watching.
To address this, the prior art typically translates the subtitle text in a media file's associated subtitles and generates the translated text according to the timestamps. However, if the associated subtitles of the media file cannot be obtained, no translation can be performed, so the universality of subtitle translation is low.
Accordingly, the following problem needs to be solved: improving the universality of media-file subtitle translation.
Summary of the invention
The present invention proposes a method and device for translating the subtitles of a media file, and electronic equipment, which can translate the subtitle text of media files that have associated subtitles, have subtitles embedded in frame pictures, or have no subtitles at all, and which generate subtitles in a set language according to the timestamps, solving the prior-art problem of poor universality in subtitle translation.
In one aspect, an embodiment of the invention provides a method for translating the subtitles of a media file, the method comprising:
obtaining the subtitle text and timestamps of the media file from its associated subtitles, its audio, or subtitle content embedded in its frame pictures, wherein the priority for obtaining the subtitle text and timestamps is, in order: from the associated subtitles of the media file; from the subtitle content embedded in the frame pictures of the media file; from the audio of the media file;
translating the subtitle text into text of a set language; and
matching the text of the set language to the timestamps to generate the translated subtitle information of the media file.
Specifically, obtaining the subtitle text and timestamps of the media file from its associated subtitles, its audio, or subtitle content embedded in its frame pictures, with the priority order described above, comprises:
Step 1: judging whether the media file has corresponding associated subtitles, performing step 2 when it does and step 3 when it does not;
Step 2: obtaining the subtitle text and timestamps from the associated subtitles;
Step 3: judging whether the media file has subtitle content embedded in its frame pictures, performing step 4 when it does and step 5 when it does not;
Step 4: recognizing the subtitle content in the frame pictures of the media file to obtain the subtitle text, and obtaining the timestamps of the frame pictures;
Step 5: performing speech recognition on the audio of the media file to obtain the subtitle text, and obtaining the timestamps of the audio.
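The five steps above amount to a three-way fallback. A minimal Python sketch of that selection logic (the function and flag names are illustrative, not from the patent):

```python
def choose_subtitle_source(has_associated_subtitles: bool,
                           has_embedded_subtitles: bool) -> str:
    """Pick the subtitle source in the priority order of steps 1-5:
    associated subtitles first, then subtitle content embedded in
    frame pictures, then speech recognition on the audio."""
    if has_associated_subtitles:      # step 1 -> step 2
        return "associated"
    if has_embedded_subtitles:        # step 3 -> step 4
        return "embedded"
    return "audio"                    # step 3 -> step 5
```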
Step 2, obtaining the subtitle text and timestamps from the associated subtitles, is specifically:
filtering out, according to a regular-expression rule, all information in the associated subtitles other than the timestamps and the subtitle text.
Processing the text of the set language to generate the translated subtitle information of the media file is specifically:
matching the text of the set language with the timestamps, and writing in the other information, to generate the translated subtitle information of the media file.
Translating the subtitle text into text of the set language is specifically:
translating the subtitle text with a translation module of the electronic equipment to obtain text of the set language; or
sending the subtitle text to a remote server and receiving the text of the set language obtained by the remote server translating the subtitle text.
Preferably, after processing the text of the set language to generate the translated subtitle information of the media file, the method further comprises:
importing the translated subtitle information into the media file and synchronously displaying the text in the translated subtitle information.
Preferably, after processing the text of the set language to generate the translated subtitle information of the media file, the method further comprises:
sending the translated subtitle information to a remote server so that the remote server reviews, calibrates, and saves it; when the subtitles of the media file need to be translated again, the calibrated subtitle information is retrieved from the remote server.
In another aspect, an embodiment of the invention provides a device for translating the subtitles of a media file, the device comprising an acquisition module, a translation module, and a subtitle generation module.
The acquisition module is configured to obtain the subtitle text and timestamps of the media file from its associated subtitles, its audio, or subtitle content embedded in its frame pictures, wherein the priority for obtaining the subtitle text and timestamps is, in order: from the associated subtitles of the media file; from the subtitle content embedded in the frame pictures of the media file; from the audio of the media file.
The translation module is configured to translate the subtitle text into text of the set language.
The subtitle generation module is configured to match the text of the set language to the timestamps and generate the translated subtitle information of the media file.
The acquisition module comprises a judging unit and an acquiring unit, wherein:
the judging unit comprises a first judging unit and a second judging unit, and the acquiring unit comprises a first acquiring unit, a second acquiring unit, and a third acquiring unit;
the first judging unit is configured to judge whether the media file has corresponding associated subtitles;
the first acquiring unit is configured to obtain the subtitle text and timestamps from the associated subtitles when the first judging unit judges that the media file has corresponding associated subtitles;
the second judging unit is configured to judge whether the media file has subtitle content embedded in its frame pictures when the first judging unit judges that the media file has no corresponding associated subtitles;
the second acquiring unit is configured, when the second judging unit judges that the media file has subtitle content embedded in its frame pictures, to recognize the subtitle content in the frame pictures, obtain the subtitle text of the media file, and obtain the timestamps of the frame pictures;
the third acquiring unit is configured, when the second judging unit judges that the media file has no subtitle content embedded in its frame pictures, to perform speech recognition on the audio of the media file, obtain the subtitle text, and obtain the timestamps of the audio.
The first acquiring unit is configured to filter out, according to a regular-expression rule, all invalid information in the associated subtitles other than the timestamps and the subtitle text.
The subtitle generation module is configured to match the text of the set language with the timestamps and write in the other information, generating the translated subtitle information of the media file.
The translation module is configured to translate the subtitle text to obtain the translated subtitle text; or
to send the subtitle text to a remote server and receive the translated subtitle text obtained by the remote server translating it.
Preferably, the device further comprises a subtitle display module, configured to import the translated subtitle information into the media file and synchronously display the text in the imported translated subtitle information.
Preferably, the device further comprises a review module, configured to send the translated subtitle information to the remote server so that the remote server reviews, calibrates, and saves it; when the subtitles of the media file are translated again, the calibrated subtitle information is retrieved from the remote server.
In another aspect, an embodiment of the invention provides a terminal comprising the device for translating the subtitles of a media file described above.
In another aspect, an embodiment of the invention provides electronic equipment comprising a housing, a processor, a memory, a display screen, a circuit board, and a power circuit, wherein the circuit board is placed in the space enclosed by the housing, the processor and the memory are arranged on the circuit board, and the display screen is embedded in the housing and connected to the circuit board; the power circuit supplies power to each circuit or component of the electronic equipment; the memory stores executable program code and data; and the processor, by reading the executable program code stored in the memory, runs the program corresponding to that code to perform the following steps:
obtaining the subtitle text and timestamps of the media file from its associated subtitles, its audio, or subtitle content embedded in its frame pictures, wherein the priority for obtaining the subtitle text and timestamps is, in order: from the associated subtitles of the media file; from the subtitle content embedded in the frame pictures of the media file; from the audio of the media file;
translating the subtitle text into text of a set language; and
matching the text of the set language to the timestamps to generate the translated subtitle information of the media file.
The above scheme of the present invention has at least the following beneficial effects:
The invention obtains the subtitle text and timestamps of a media file from its associated subtitles, its audio, or subtitle content embedded in its frame pictures; translates the subtitle text into text of a set language; and matches that text to the timestamps to generate the translated subtitle information of the media file. The invention can thus obtain and translate the subtitle text of any media file and generate subtitles in the set language according to the timestamps, solving the prior-art problem of poor universality in media-file subtitle translation.
Brief description of the drawings
Specific embodiments of the present invention are described below with reference to the accompanying drawings, wherein:
Fig. 1 is a schematic diagram of the method for translating media-file subtitles in embodiment one of the present invention;
Fig. 2 is a schematic diagram of the method for translating media-file subtitles in embodiment two of the present invention;
Fig. 3 is a schematic diagram of the method, in embodiment two, of obtaining the subtitle text and timestamps of the media file from its associated subtitles, its audio, or the subtitle content embedded in its frame pictures;
Fig. 4 is a schematic structural diagram of the device for translating media-file subtitles in embodiment three of the present invention;
Fig. 5 is a schematic structural diagram of the acquisition module in embodiment four of the present invention;
Fig. 6 is a schematic structural diagram of the device for translating media-file subtitles in embodiment four of the present invention;
Fig. 7 is a schematic structural diagram of the electronic equipment provided by embodiment five of the present invention.
Detailed description of the invention
To make the technical scheme and advantages of the present invention clearer, exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not an exhaustive list of all embodiments. Where no conflict arises, the embodiments in this description and the features in the embodiments can be combined with one another.
Aiming at the prior-art problem of poor universality in subtitle translation, embodiments of the invention provide a method and device for translating media-file subtitles, and electronic equipment, which can translate the subtitle text of media files that have associated subtitles, have subtitles embedded in frame pictures, or have no subtitles, and generate subtitles in a set language according to the timestamps, solving the prior-art problem of poor universality in subtitle translation.
In embodiments of the invention, the media file may be a video file or a video stream, whose sources include but are not limited to: (1) a video file saved on a storage device; (2) a live video stream, such as a live-television video stream or a network live-broadcast video stream.
In embodiments of the invention, associated subtitles refer to subtitles carried by a media file stored on the electronic equipment or terminal, or subtitles on the Internet or a server that match the media file.
Embodiment one
Fig. 1 is a schematic flow diagram of the first embodiment of the method for translating media-file subtitles provided by the present invention. The method provided by embodiment one of the invention comprises:
Step 101: obtaining the subtitle text and timestamps of the media file from its associated subtitles, its audio, or subtitle content embedded in its frame pictures, wherein the priority for obtaining the subtitle text and timestamps is, in order: from the associated subtitles of the media file; from the subtitle content embedded in the frame pictures of the media file; from the audio of the media file;
Step 102: translating the subtitle text into text of a set language;
Step 103: matching the text of the set language to the timestamps to generate the translated subtitle information of the media file.
In this embodiment of the invention, the subtitle text and timestamps of the media file are obtained from its associated subtitles, its audio, or the subtitle content embedded in its frame pictures; the subtitle text is translated, and subtitles in the set language are generated according to the timestamps, improving the universality of media-file subtitle translation.
Embodiment two
Fig. 2 is a schematic flow diagram of the second embodiment of the method for translating media-file subtitles provided by the present invention. The method provided by embodiment two of the invention comprises:
Step 201: obtaining the subtitle text and timestamps of the media file from its associated subtitles, its audio, or subtitle content embedded in its frame pictures, wherein the priority for obtaining the subtitle text and timestamps is, in order: from the associated subtitles of the media file; from the subtitle content embedded in the frame pictures of the media file; from the audio of the media file.
In this embodiment, before the subtitle text and timestamps are obtained, the method may further include a step of accepting the user's set-language trigger instruction. Specifically, when the media file is played, a set-language option box is displayed to the user at a certain position of the media file, such as but not limited to the upper-right or upper-left corner, and the user selects the target language through this option box.
As shown in Fig. 3, obtaining the subtitle text and timestamps of the media file in this embodiment includes the following steps:
Step 2011: judging whether the media file has corresponding associated subtitles, performing step 2012 when it does and step 2013 when it does not.
Whether the media file has associated subtitles can be judged by searching the device for the associated subtitles of the saved media file, or by searching the Internet or a server for subtitles that match the media file.
Step 2012: obtaining the subtitle text and timestamps from the associated subtitles.
The associated subtitles of the media file may be obtained, but are not limited to being obtained, by: (1) obtaining the associated subtitles of a media file saved on the device; (2) obtaining the associated subtitles of the media file from the Internet or a server.
The subtitle text and timestamps may be obtained from the associated subtitles as follows. On electronic equipment or terminals running the Android system, the subtitle information is first converted into byte-stream information through the methods of a byte-stream class; on the iOS system, through the methods of a byte-stream library; on Windows, the subtitle text file is read directly. Then, according to a regular-expression rule, all other information apart from the timestamps and the subtitle text is filtered out of the byte-stream information or text corresponding to the subtitle information. For example, where the subtitle information contains a pattern such as {subtitle xxxxx}{time XXX}, the two brace groups form one subtitle pair, and the content inside the braces is retained during filtering.
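For associated subtitles in the widely used srt style, the regular-expression filtering just described can be sketched as follows in Python; the pattern and helper name are illustrative, not taken from the patent:

```python
import re

# One srt cue: start timecode, end timecode, then the subtitle text,
# terminated by a blank line or end of input. Sequence numbers and any
# other information fall outside the groups and are filtered out.
SRT_BLOCK = re.compile(
    r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})\s*\n"
    r"(.*?)(?:\n\s*\n|\Z)",
    re.S,
)

def extract_cues(srt_text: str):
    """Keep only (start, end, text) triples from srt-style subtitle data."""
    return [
        (m.group(1), m.group(2), m.group(3).strip())
        for m in SRT_BLOCK.finditer(srt_text)
    ]
```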
Step 2013: judging whether the frame pictures of the media file contain subtitle content, performing step 2014 when the media file has subtitle content embedded in its frame pictures and step 2015 when it does not.
In this embodiment, whether the media file has subtitle content embedded in its frame pictures can be judged as follows: obtain a certain number of frame pictures of the media file and convert them into grayscale images; count the gray value of each pixel to obtain the gray histogram of each frame picture; choose a first threshold and a second threshold of the gray-value range and calculate the local information entropy of each frame picture's gray histogram. If the number of frame pictures whose local information entropy exceeds a third threshold is greater than a fourth threshold, the media file is considered to have subtitle content embedded in its frame pictures; otherwise it is considered not to.
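A minimal sketch of that entropy test, assuming each frame is already a flat list of 8-bit gray values; the four threshold values below are illustrative placeholders, since the patent does not fix them:

```python
import math

def band_entropy(gray_pixels, lo=16, hi=240):
    """Shannon entropy of the gray histogram restricted to the band
    [lo, hi] (lo and hi play the role of the first and second
    thresholds of the gray-value range)."""
    hist = [0] * 256
    for p in gray_pixels:
        hist[p] += 1
    band = hist[lo:hi + 1]
    total = sum(band)
    if total == 0:
        return 0.0
    ent = 0.0
    for count in band:
        if count:
            prob = count / total
            ent -= prob * math.log2(prob)
    return ent

def has_embedded_subtitles(frames, entropy_thresh=3.0, count_thresh=5):
    """A frame counts as subtitle-like when its band entropy exceeds the
    third threshold; the file is judged to carry embedded subtitles when
    the number of such frames exceeds the fourth threshold."""
    hits = sum(1 for frame in frames if band_entropy(frame) > entropy_thresh)
    return hits > count_thresh
```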
Step 2014: recognizing the subtitle content in the frame pictures of the media file to obtain the subtitle text, and obtaining the timestamps of the frame pictures.
In this embodiment of the invention, this is done as follows: capture a frame picture of the media file every fixed number of frames and obtain the timestamp of the frame picture; then use optical character recognition (OCR) to recognize the subtitle content in the frame picture and obtain the subtitle text of the media file.
Of course, in this embodiment the subtitle content and timestamps in the frame pictures can also be obtained as follows: capture a frame picture every fixed number of frames; filter out the to-be-processed frame pictures that contain subtitle content and obtain their timestamps; deduplicate the to-be-processed frame pictures according to their subtitle content to obtain the unique frame picture corresponding to each piece of subtitle content; and recognize each unique frame picture to obtain its corresponding text information.
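The sampling and deduplication just described can be sketched as below; OCR itself is left out, and the function names are illustrative:

```python
def sample_frame_indices(total_frames: int, step: int):
    """Indices of the frame pictures captured every fixed number of frames."""
    return list(range(0, total_frames, step))

def dedup_by_caption(frames_with_text):
    """Collapse consecutive sampled frames that carry the same subtitle
    content, keeping the first timestamp of each run, so each unique
    frame picture is sent to recognition only once."""
    unique = []
    for timestamp, text in frames_with_text:
        if not unique or unique[-1][1] != text:
            unique.append((timestamp, text))
    return unique
```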
Step 2015: performing speech recognition on the audio of the media file to obtain the subtitle text, and obtaining the timestamps of the audio.
Performing speech recognition on the audio of the media file and obtaining its timestamps can be done, but is not limited to being done, as follows: obtain the audio information and time information of the media file; segment the audio information to obtain multiple audio segments; process the segments to obtain target audio information whose end frame contains no voice, and obtain its timestamps; and recognize the target audio information as the corresponding text.
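The segmentation step can be sketched as simple energy-based splitting of PCM samples, so that each emitted segment ends in a voice-free frame; the frame size and silence level are illustrative, and feeding each segment to a speech recognizer is assumed to happen elsewhere:

```python
def split_on_silence(samples, frame_size=4, silence_level=100):
    """Cut the sample stream at frames whose peak amplitude is below
    silence_level, so every emitted segment ends in a voice-free frame
    (the 'target audio information' above); runs of pure silence with no
    preceding speech are dropped."""
    segments, current = [], []
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        current.extend(frame)
        if frame and max(abs(s) for s in frame) < silence_level:
            if len(current) > len(frame):   # segment contains speech frames
                segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments
```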
Step 202: translating the subtitle text into text of a set language.
In this embodiment, translating the subtitle text into text of the set language can use, but is not limited to: (1) calling a background service to translate the subtitle text and obtain text of the set language; (2) sending the subtitle text to a remote server, which translates it to obtain text of the set language, and receiving that text.
Step 203: matching the text of the set language to the timestamps to generate the translated subtitle information of the media file.
In this embodiment, matching the text of the set language to the timestamps to obtain the translated subtitle information is specifically: add the text of the set language to a text file; then, according to the content and timestamps of the text file, write the text into the subtitle information in the format of one timecode followed by one subtitle, generating the translated subtitle information.
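A minimal sketch of writing such "one timecode plus one subtitle" blocks in the srt style used later in this embodiment (millisecond timestamps in, text out; names are illustrative):

```python
def ms_to_timecode(ms: int) -> str:
    """Format milliseconds as the srt timecode HH:MM:SS,mmm."""
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, milli = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{milli:03d}"

def build_srt(cues) -> str:
    """cues: iterable of (start_ms, end_ms, translated_text) triples,
    written out as numbered srt blocks separated by blank lines."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append(
            f"{i}\n{ms_to_timecode(start)} --> {ms_to_timecode(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```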
Certainly, in the present embodiment, filter out the situation of other information for association captions, it is also possible to will filter out other In caption information after information write translation.
The kind of captions has multiple, and the most the more commonly used subtitling format has graphical format and text formatting two class, relatively For graphical format captions, text formatting captions have that size is little, form simple, are easy to make and the feature of amendment, text lattice Formula captions include utf, idx, sub, srt, smi, rt, txt, ssa, aq, jss, js, ass, wherein the text subtitle of srt form Most widely used, it can compatible various common media players, MPC, QQ are audio-visual etc. all can load the type automatically Captions.Therefore, in the present embodiment, caption information uses srt form, and certain the present embodiment does not limit the lattice of caption information Formula, as long as the form of caption information can support used media player.
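The "one timecode followed by one caption" layout of step 203, in the srt format chosen here, can be sketched as follows; the helper names are illustrative assumptions.

```python
# Sketch of writing translated text in srt layout: index line, then
# "start --> end" timecode line, then the caption text, then a blank line.

def ms_to_srt(ms):
    """Format milliseconds as the srt timecode HH:MM:SS,mmm."""
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, milli = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{milli:03d}"

def build_srt(entries):
    """entries: iterable of (start_ms, end_ms, text) -> srt document string."""
    blocks = []
    for i, (start, end, text) in enumerate(entries, 1):
        blocks.append(f"{i}\n{ms_to_srt(start)} --> {ms_to_srt(end)}\n{text}\n")
    return "\n".join(blocks)

srt = build_srt([(0, 1500, "Bonjour"), (2000, 3500, "Le monde")])
print(srt)
```

The (start, end, text) triples are exactly the timestamps and set-language text produced by the earlier steps.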
Step 204: import the translated caption information into the media file, and synchronously display the text in the translated caption information.
In this embodiment, the translated caption information is stored in the folder where the media file resides; when the media file is played, the caption information is synchronously displayed at a fixed position in the media file.
Of course, this embodiment may also display the caption information in different ways for different types of media files: for a media file with associated captions, the associated captions may be directly replaced with the translated caption information, which is displayed synchronously when the media file is played; for a media file whose frame pictures contain caption content, the translated caption information may be superimposed over the caption content during playback; for a media file without caption content, the translated caption information is directly imported and displayed synchronously during playback.
In addition, to optimize the display of the captions, longer sentences in the caption information may be split across multiple display lines.
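The line-splitting optimization above can be sketched as a simple word-boundary wrap; the 40-character limit is an illustrative assumption, and a real player would also cap the number of lines shown at once.

```python
# Sketch: break a long caption into display lines no wider than max_chars,
# splitting only at word boundaries so words are never cut in half.

def wrap_caption(text, max_chars=40):
    lines, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars:
            current = candidate          # word still fits on this line
        else:
            if current:
                lines.append(current)    # flush the full line
            current = word               # start a new line with this word
    if current:
        lines.append(current)
    return lines

print(wrap_caption("one two three four", max_chars=9))  # ['one two', 'three', 'four']
```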
Step 205: send the translated caption information to the remote server, so that the remote server reviews, calibrates, and saves the translated caption information; when the captions of the media file need to be translated again, the calibrated caption information is retrieved from the remote server.
In the embodiment of the present invention, the caption text and timestamps are obtained from the associated captions of the media file; or image-text recognition is used to identify the caption content in the frame pictures of the media file to obtain the caption text; or speech recognition is performed on the audio of the media file to obtain the caption text; the caption text is then translated into text in the set language, which improves the universality of media-file caption translation. In addition, filtering out the invalid information in associated caption information improves the accuracy of caption translation, and retrieving the calibrated caption information from the remote server when the captions need to be translated again improves the speed and accuracy of re-translation.
Based on the same inventive concept, an embodiment of the present invention further provides a device for translating the captions of a media file. Since the principle by which the device solves the problem is similar to that of the method for translating the captions of a media file, the implementation of the device may refer to the implementation of the method, and repeated details are not described again.
As shown in Figure 4, the device may include:
an acquisition module 301, configured to obtain the caption text and timestamps of a media file according to the associated captions, the audio, or the caption content embedded in the frame pictures of the media file, wherein the priority order for obtaining the caption text and timestamps of the media file is: obtaining from the associated captions of the media file, obtaining from the caption content embedded in the frame pictures of the media file, and obtaining from the audio of the media file;
a translation module 302, configured to translate the caption text into text in a set language; and
a caption generation module 303, configured to associate the text in the set language with the timestamps and generate the translated caption information of the media file.
The device for translating the captions of a media file in this embodiment obtains the caption text and timestamps of the media file according to the associated captions, the audio, or the caption content embedded in the frame pictures, translates the caption text, and generates captions in the set language according to the timestamps, improving the universality of media-file caption translation.
As shown in Figure 5, another device for translating the captions of a media file is further provided in an embodiment of the present invention; the device may include:
an acquisition module 401, configured to obtain the caption text and timestamps of a media file according to the associated captions, the audio, or the caption content embedded in the frame pictures of the media file, wherein the priority order for obtaining the caption text and timestamps of the media file is: obtaining from the associated captions of the media file, obtaining from the caption content embedded in the frame pictures of the media file, and obtaining from the audio of the media file.
In this embodiment, the device for translating the captions of a media file may further include a receiving module, configured to receive the user's set-language translation trigger instruction. Specifically, when the media file is played, an option box for the set language is displayed to the user at a certain position of the media file, which may be, but is not limited to, the upper-right or upper-left corner of the media file, and the receiving module receives the user's set-language translation instruction.
As shown in Figure 6, the acquisition module includes a judging unit 4011 and an acquiring unit 4012, wherein:
the judging unit 4011 includes a first judging unit and a second judging unit, and the acquiring unit 4012 includes a first acquiring unit, a second acquiring unit, and a third acquiring unit;
the first judging unit is configured to judge whether the media file has corresponding associated captions;
the first acquiring unit is configured to obtain the caption text and timestamps in the associated captions when the first judging unit judges that the media file has corresponding associated captions;
in this embodiment, the first acquiring unit may, but is not limited to, filter out the other information in the associated captions except the timestamps and caption text according to a regular-expression rule, to obtain the caption text and timestamps in the associated captions;
the second judging unit is configured to judge whether the frame pictures of the media file contain caption content when the first judging unit judges that the media file does not have corresponding associated captions;
the second acquiring unit is configured to, when the second judging unit judges that the media file has caption content embedded in its frame pictures, identify the caption content in the frame pictures of the media file, obtain the caption text of the media file, and obtain the timestamps of the frame pictures; and
the third acquiring unit is configured to, when the second judging unit judges that the media file has no caption content embedded in its frame pictures, perform speech recognition on the audio of the media file, obtain the caption text of the media file, and obtain the timestamps of the audio.
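The regular-expression filtering performed by the first acquiring unit can be sketched as follows for srt-style associated captions; the pattern and helper names are illustrative assumptions, not the patent's rule. Sequence numbers and styling tags are the "other information" discarded here.

```python
# Sketch: extract (start, end, text) triples from srt-style blocks while
# discarding sequence numbers and HTML-like styling tags.
import re

TIMECODE = re.compile(
    r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\n(.+?)(?:\n\n|\Z)",
    re.S,
)

def extract_captions(srt_text):
    """Return (start, end, text) triples with styling tags stripped."""
    triples = []
    for start, end, body in TIMECODE.findall(srt_text):
        text = re.sub(r"<[^>]+>", "", body).strip()  # drop tags like <i>…</i>
        triples.append((start, end, text))
    return triples

sample = (
    "1\n00:00:01,000 --> 00:00:02,000\n<i>Hello</i>\n\n"
    "2\n00:00:03,000 --> 00:00:04,000\nWorld\n"
)
print(extract_captions(sample))
```

The discarded pieces could be retained separately and written back into the translated caption information, as the caption generation module below describes.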
The translation module 402 is configured to translate the caption text into text in the set language.
In this embodiment, the translation module 402 may be configured to translate the caption text to obtain the translated caption text; or
to send the caption text to the remote server and receive the translated caption text obtained by the remote server translating the caption text.
The caption generation module 403 is configured to associate the text in the set language with the timestamps and generate the translated caption information of the media file.
Of course, in this embodiment, when the first acquiring unit filters out, according to the regular-expression rule, the other information in the associated captions except the timestamps and caption text, the caption generation module may also be configured to match the text in the set language with the timestamps and write in the other information, generating the translated caption information of the media file.
The caption display module 404 is configured to import the translated caption information into the media file, and synchronously display the text in the imported translated caption information.
The review module 405 is configured to send the translated caption information to the remote server, so that the remote server reviews, calibrates, and saves the translated caption information; when the captions of the media file are translated again, the calibrated caption information is retrieved from the remote server.
The device for translating the captions of a media file in the embodiment of the present invention obtains the caption text and timestamps in the associated captions of the media file; or uses image-text recognition to identify the caption content in the frame pictures of the media file and obtain the caption text; or uses speech recognition on the audio of the media file to obtain the caption text; it then translates the caption text into text in the set language, improving the universality of media-file caption translation. In addition, filtering out the invalid information in associated caption information improves the accuracy of caption translation, and retrieving the calibrated caption information from the remote server when the captions need to be translated again improves the speed and accuracy of re-translation.
Based on the same inventive concept, as shown in Figure 7, an embodiment of the present invention further provides an electronic device, including: a housing 501, a processor 502, a memory 503, a display screen (not shown), a circuit board 504, and a power supply circuit 505, wherein the circuit board 504 is arranged inside the space enclosed by the housing 501; the processor 502 and the memory 503 are arranged on the circuit board 504; the display screen is embedded in the housing 501 and connected to the circuit board 504; the power supply circuit 505 is configured to supply power to each circuit or component of the electronic device; the memory 503 is configured to store executable program code and data; and the processor 502 runs a program corresponding to the executable program code by reading the executable program code stored in the memory 503, for performing the following steps:
obtaining the caption text and timestamps of a media file according to the associated captions, the audio, or the caption content embedded in the frame pictures of the media file, wherein the priority order for obtaining the caption text and timestamps of the media file is: obtaining from the associated captions of the media file, obtaining from the caption content embedded in the frame pictures of the media file, and obtaining from the audio of the media file;
translating the caption text into text in a set language; and
associating the text in the set language with the timestamps to generate the translated caption information of the media file.
The electronic device in the embodiment of the present invention obtains the caption text and timestamps of the media file according to the associated captions, the audio, or the caption content embedded in the frame pictures, translates the caption text, and generates captions in the set language according to the timestamps, improving the universality of media-file caption translation.
For convenience of description, the parts of the above system are described separately as various modules or units divided by function. Of course, when implementing the present invention, the functions of the modules or units may be realized in one or more pieces of software or hardware.
Those skilled in the art should appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art, once aware of the basic inventive concept, may make further changes and modifications to these embodiments. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.

Claims (10)

1. A method for translating the captions of a media file, applied to an electronic device, characterized by comprising:
obtaining the caption text and timestamps of the media file according to the associated captions, the audio, or the caption content embedded in the frame pictures of the media file, wherein the priority order for obtaining the caption text and timestamps of the media file is: obtaining from the associated captions of the media file, obtaining from the caption content embedded in the frame pictures of the media file, and obtaining from the audio of the media file;
translating the caption text into text in a set language; and
associating the text in the set language with the timestamps to generate the translated caption information of the media file.
2. The method of claim 1, characterized in that obtaining the caption text and timestamps of the media file according to the associated captions, the audio, or the caption content embedded in the frame pictures, wherein the priority order for obtaining the caption text and timestamps of the media file is: obtaining from the associated captions of the media file, obtaining from the caption content embedded in the frame pictures of the media file, and obtaining from the audio of the media file, is specifically:
step 201: judging whether the media file has corresponding associated captions; performing step 202 when it is judged that the media file has corresponding associated captions, and performing step 203 when it is judged that the media file does not have corresponding associated captions;
step 202: obtaining the caption text and timestamps in the associated captions;
step 203: judging whether the media file has caption content embedded in its frame pictures; performing step 204 when it is judged that the media file has caption content embedded in its frame pictures, and performing step 205 when it is judged that the media file does not have caption content embedded in its frame pictures;
step 204: identifying the caption content in the frame pictures of the media file, obtaining the caption text of the media file, and obtaining the timestamps of the frame pictures; and
step 205: performing speech recognition on the audio of the media file, obtaining the caption text of the media file, and obtaining the timestamps of the audio.
3. The method of claim 2, characterized in that step 202, obtaining the caption text and timestamps in the associated captions, is specifically:
filtering out, according to a regular-expression rule, the other information in the associated captions except the timestamps and caption text.
4. The method of claim 3, characterized in that processing the text in the set language to generate the translated caption information of the media file is specifically:
matching the text in the set language with the timestamps, and writing in the other information, to generate the translated caption information of the media file.
5. The method of claim 1, characterized in that translating the caption text into text in the set language is specifically:
translating the caption text by a translation module of the electronic device to obtain the text in the set language; or
sending the caption text to a remote server, and receiving the text in the set language obtained by the remote server translating the caption text.
6. The method of any one of claims 1-5, characterized in that, after processing the text in the set language to generate the translated caption information of the media file, the method further comprises:
importing the translated caption information into the media file, and synchronously displaying the text in the translated caption information.
7. The method of claim 6, characterized in that, after processing the text in the set language to generate the translated caption information of the media file, the method further comprises:
sending the translated caption information to a remote server, so that the remote server reviews, calibrates, and saves the translated caption information; and, when the captions of the media file need to be translated again, retrieving the calibrated caption information from the remote server.
8. A device for translating the captions of a media file, characterized by comprising: an acquisition module, a translation module, and a caption generation module, wherein:
the acquisition module is configured to obtain the caption text and timestamps of the media file according to the associated captions, the audio, or the caption content embedded in the frame pictures of the media file, wherein the priority order for obtaining the caption text and timestamps of the media file is: obtaining from the associated captions of the media file, obtaining from the caption content embedded in the frame pictures of the media file, and obtaining from the audio of the media file;
the translation module is configured to translate the caption text into text in a set language; and
the caption generation module is configured to associate the text in the set language with the timestamps and generate the translated caption information of the media file.
9. The device of claim 8, characterized in that the acquisition module includes a judging unit and an acquiring unit, wherein:
the judging unit includes a first judging unit and a second judging unit, and the acquiring unit includes a first acquiring unit, a second acquiring unit, and a third acquiring unit;
the first judging unit is configured to judge whether the media file has corresponding associated captions;
the first acquiring unit is configured to obtain the caption text and timestamps in the associated captions when the first judging unit judges that the media file has corresponding associated captions;
the second judging unit is configured to judge whether the media file has caption content embedded in its frame pictures when the first judging unit judges that the media file does not have corresponding associated captions;
the second acquiring unit is configured to, when the second judging unit judges that the media file has caption content embedded in its frame pictures, identify the caption content in the frame pictures of the media file, obtain the caption text of the media file, and obtain the timestamps of the frame pictures; and
the third acquiring unit is configured to, when the second judging unit judges that the media file has no caption content embedded in its frame pictures, perform speech recognition on the audio of the media file, obtain the caption text of the media file, and obtain the timestamps of the audio.
10. An electronic device, characterized by comprising: a housing, a processor, a memory, a display screen, a circuit board, and a power supply circuit, wherein the circuit board is arranged inside the space enclosed by the housing; the processor and the memory are arranged on the circuit board; the display screen is embedded in the housing and connected to the circuit board; the power supply circuit is configured to supply power to each circuit or component of the electronic device; the memory is configured to store executable program code and data; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, for performing the following steps:
obtaining the caption text and timestamps of the media file according to the associated captions, the audio, or the caption content embedded in the frame pictures of the media file, wherein the priority order for obtaining the caption text and timestamps of the media file is: obtaining from the associated captions of the media file, obtaining from the caption content embedded in the frame pictures of the media file, and obtaining from the audio of the media file;
translating the caption text into text in a set language; and
associating the text in the set language with the timestamps to generate the translated caption information of the media file.
CN201610683339.5A 2016-08-17 2016-08-17 Method and device for translating subtitles of media file and electronic equipment Pending CN106303303A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610683339.5A CN106303303A (en) 2016-08-17 2016-08-17 Method and device for translating subtitles of media file and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610683339.5A CN106303303A (en) 2016-08-17 2016-08-17 Method and device for translating subtitles of media file and electronic equipment

Publications (1)

Publication Number Publication Date
CN106303303A true CN106303303A (en) 2017-01-04

Family

ID=57678280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610683339.5A Pending CN106303303A (en) 2016-08-17 2016-08-17 Method and device for translating subtitles of media file and electronic equipment

Country Status (1)

Country Link
CN (1) CN106303303A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003049315A1 (en) * 2001-12-05 2003-06-12 Walt Disney Parks And Resorts System and method of wirelessly triggering portable devices
CN103051945A (en) * 2012-12-31 2013-04-17 广东欧珀移动通信有限公司 Method and system for translating subtitles of video playing terminal
CN103067775A (en) * 2013-01-28 2013-04-24 Tcl集团股份有限公司 Subtitle display method for audio/video terminal, audio/video terminal and server
CN103226947A (en) * 2013-03-27 2013-07-31 广东欧珀移动通信有限公司 Mobile terminal-based audio processing method and device
CN103327397A (en) * 2012-03-22 2013-09-25 联想(北京)有限公司 Subtitle synchronous display method and system of media file
CN103984772A (en) * 2014-06-04 2014-08-13 百度在线网络技术(北京)有限公司 Method and device for generating text retrieval subtitle library and video retrieval method and device


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107484002A (en) * 2017-08-25 2017-12-15 四川长虹电器股份有限公司 The method of intelligent translation captions
CN107577676A (en) * 2017-09-15 2018-01-12 北京彩彻区明科技有限公司 Web page translation method, apparatus and system
CN107682739A (en) * 2017-09-20 2018-02-09 成都视达科信息技术有限公司 The generation method and system of a kind of languages captions of video
CN107644016A (en) * 2017-10-19 2018-01-30 维沃移动通信有限公司 A kind of multimedia titles interpretation method, multimedia titles lookup method and device
CN109963092A (en) * 2017-12-26 2019-07-02 深圳市优必选科技有限公司 A kind of processing method of subtitle, device and terminal
EP3787300A4 (en) * 2018-04-25 2021-03-03 Tencent Technology (Shenzhen) Company Limited Video stream processing method and apparatus, computer device and storage medium
US11463779B2 (en) 2018-04-25 2022-10-04 Tencent Technology (Shenzhen) Company Limited Video stream processing method and apparatus, computer device, and storage medium
WO2020024353A1 (en) * 2018-08-01 2020-02-06 平安科技(深圳)有限公司 Video playback method and device, terminal device, and storage medium
CN109361958A (en) * 2018-11-05 2019-02-19 侯清元 Multi-lingual subtitle fabricating method, device, medium and electronic equipment
CN110134973A (en) * 2019-04-12 2019-08-16 深圳壹账通智能科技有限公司 Video caption real time translating method, medium and equipment based on artificial intelligence
CN113794940A (en) * 2021-09-01 2021-12-14 北京百度网讯科技有限公司 Subtitle display method and device, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN106303303A (en) Method and device for translating subtitles of media file and electronic equipment
CN108419141B (en) Subtitle position adjusting method and device, storage medium and electronic equipment
CN109729420B (en) Picture processing method and device, mobile terminal and computer readable storage medium
US20180041796A1 (en) Method and device for displaying information on video image
CN110557678B (en) Video processing method, device and equipment
CN109218629B (en) Video generation method, storage medium and device
Hu et al. Speaker-following video subtitles
CN108012173B (en) Content identification method, device, equipment and computer storage medium
EP4099709A1 (en) Data processing method and apparatus, device, and readable storage medium
CN108182211B (en) Video public opinion acquisition method and device, computer equipment and storage medium
CN111556332B (en) Live broadcast method, electronic device and readable storage medium
JP4621758B2 (en) Content information reproducing apparatus, content information reproducing system, and information processing apparatus
CN102655585B (en) Video conference system and time delay testing method, device and system thereof
CN103327407B (en) Audio-visual content is set to watch level method for distinguishing
CN113052169A (en) Video subtitle recognition method, device, medium, and electronic device
CN109729429B (en) Video playing method, device, equipment and medium
CN109558513A (en) A kind of content recommendation method, device, terminal and storage medium
US20150078729A1 (en) Synchronizing videos with frame-based metadata using video content
CN109151520B (en) Method, device, electronic equipment and medium for generating video
CN107888989A (en) A kind of interactive system and method live based on internet
CN106295592A (en) Method and device for identifying subtitles of media file and electronic equipment
CN113992972A (en) Subtitle display method and device, electronic equipment and readable storage medium
CN113365109A (en) Method and device for generating video subtitles, electronic equipment and storage medium
CN113630620A (en) Multimedia file playing system, related method, device and equipment
CN113784058A (en) Image generation method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20170104