CN107277645A - Error correction method and device for subtitle content - Google Patents

Info

Publication number
CN107277645A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710624479.XA
Other languages
Chinese (zh)
Inventor
王金龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd
Priority to CN201710624479.XA
Publication of CN107277645A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4884 Data services, e.g. news ticker for displaying subtitles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Detection And Prevention Of Errors In Transmission (AREA)

Abstract

Embodiments of the invention disclose a method and a device for correcting subtitle content. The method comprises: extracting first text information corresponding to a target subtitle strip in a video file; recognizing the audio information of the target subtitle strip to obtain corresponding second text information; and comparing the first text information with the second text information to correct errors, and outputting an error correction result. This achieves intelligent correction of subtitle content and solves the problems of low efficiency and high input cost of manual correction.

Description

Error correction method and device for subtitle content
Technical field
Embodiments of the present invention relate to multimedia technology, and in particular to an error correction method and device for subtitle content.
Background
When subtitles are produced for audio or video, the subtitle text is usually typed in while watching the video or listening to the audio. Whether the entered subtitle text is consistent with, or corresponds to, the audio content of the video affects the user's experience of watching the video or listening to the audio.
In the prior art, checking is typically done manually: the subtitles are cross-checked by hand to find problems. Manual error correction is inefficient and costly.
Summary of the invention
Embodiments of the present invention provide an error correction method and device for subtitle content that correct subtitle content intelligently and solve the problems of low efficiency and high input cost of manual error correction.
In a first aspect, an embodiment of the present invention provides an error correction method for subtitle content, the method comprising:
extracting first text information corresponding to a target subtitle strip in a video file;
recognizing the audio information of the target subtitle strip to obtain corresponding second text information;
performing error correction by text comparison of the first text information with the second text information, and outputting an error correction result.
Further, extracting the first text information of the target subtitle strip in the video file comprises:
determining whether the current image frame contains a subtitle and, if so, determining the position of the subtitle strip and the start frame and end frame of the subtitle strip;
extracting the first text information of the subtitle strip.
Further, recognizing the second text information corresponding to the audio information of the target subtitle strip comprises:
determining a time interval from the start frame and the end frame;
parsing and cutting the audio information in the video according to the time interval;
comparing the parsed and cut audio information with a preset text library to recognize the second text information corresponding to the audio information.
Further, performing error correction by text comparison of the first text information with the second text information and outputting the error correction result comprises:
comparing the first text information with the second text information one by one, in units of characters or words;
recording the characters or words in the second text that differ from the first text;
outputting those characters or words as the error correction result.
Further, the preset text library is stored in a server connected to a speech recognition module.
In a second aspect, an embodiment of the present invention provides an error correction device for subtitle content, the device comprising:
an information extraction module, configured to extract first text information corresponding to a target subtitle strip in a video file;
an information recognition module, configured to recognize the audio information of the target subtitle strip to obtain corresponding second text information;
an information comparison module, configured to perform error correction by text comparison of the first text information with the second text information, and to output an error correction result.
Further, the information extraction module is specifically configured to:
determine whether the current image frame contains a subtitle and, if so, determine the position of the subtitle strip and the start frame and end frame of the subtitle strip;
extract the first text information of the subtitle strip.
Further, the information recognition module is specifically configured to:
determine a time interval from the start frame and the end frame;
parse and cut the audio information in the video according to the time interval;
compare the parsed and cut audio information with a preset text library to recognize the second text information corresponding to the audio information.
Further, the information comparison module is specifically configured to:
compare the first text information with the second text information one by one, in units of characters or words;
record the characters or words in the second text that differ from the first text;
output those characters or words as the error correction result.
Further, the preset text library is stored in a server connected to a speech recognition module.
In the embodiments of the present invention, first text information corresponding to a target subtitle strip in a video file is extracted; the audio information of the target subtitle strip is recognized to obtain corresponding second text information; and error correction is performed by text comparison of the first text information with the second text information, and the error correction result is output. This achieves intelligent correction of subtitle content and solves the problems of low efficiency and high input cost of manual error correction.
Brief description of the drawings
Fig. 1 is a flow chart of an error correction method for subtitle content in Embodiment one of the present invention;
Fig. 2 is a flow chart of an error correction method for subtitle content in Embodiment two of the present invention;
Fig. 3 is a flow chart of an error correction method for subtitle content in Embodiment three of the present invention;
Fig. 4 is a schematic structural diagram of an error correction device for subtitle content in Embodiment four of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1 is a flow chart of an error correction method for subtitle content provided by Embodiment one of the present invention. This embodiment is applicable to correcting errors in subtitle content. The method can be performed by the error correction device for subtitle content provided by an embodiment of the present invention, and the device can be implemented in software and/or hardware. Referring to Fig. 1, the method may specifically include the following steps:
S110: Extract first text information corresponding to a target subtitle strip in a video file.
Specifically, when watching a video, a user relies on the subtitle information in the video together with the audio information heard to appreciate the pictures. The subtitle strip is usually located in the lower-middle part of the screen. Multiple subtitle strips appear during playback; at least one of them is determined to be the target subtitle strip according to the user's needs, and the first text information corresponding to the target subtitle strip in the video file is extracted. The first text information corresponds one-to-one with the subtitles in the target subtitle strip.
Optionally, the first text information corresponding to the target subtitle strip is extracted using a texture denoising method. The procedure is as follows: compute the average image of the subtitle strip region over the luminance images of multiple image frames showing the same subtitle; segment the average image using the maximum between-class variance method (Otsu's method), producing a subtitle region image that contains only black and white connected domains; determine which of the two colors in the segmented image is the text region; finally, remove non-text noise.
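The averaging and segmentation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: images are plain lists of 8-bit luminance rows, and the names `average_image`, `otsu_threshold` and `binarize` are hypothetical. Deciding which color is the text and removing non-text noise are not shown.

```python
from typing import List, Sequence

Image = List[List[int]]  # assumed format: rows of 8-bit luminance values (0..255)


def average_image(frames: Sequence[Image]) -> Image:
    """Pixel-wise mean of several same-sized crops showing the same subtitle."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[r][c] for f in frames) // len(frames) for c in range(cols)]
        for r in range(rows)
    ]


def otsu_threshold(image: Image) -> int:
    """Return the threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image: Image, threshold: int) -> Image:
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [[255 if v > threshold else 0 for v in row] for row in image]
```

Averaging several frames of the same subtitle suppresses the moving background while the static text stays sharp, which is why the threshold is computed on the average image rather than on a single frame.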
S120: Recognize the audio information of the target subtitle strip to obtain corresponding second text information.
Speech recognition is performed on the audio information corresponding to the target subtitle strip, and the recognition result is labeled as the second text information. The second text information corresponds to the audio information of the target subtitle strip.
S130: Perform error correction by text comparison of the first text information with the second text information, and output an error correction result.
Specifically, error correction is performed on the first text information and the second text information by text comparison. Optionally, since the second text information is obtained by speech recognition of the audio information, the second text information can be taken as the target text information, and the first text information is compared against it. In the comparison result, the parts that differ between the two pieces of text information are identified as the erroneous parts, that is, the error correction result, which is then output.
In this embodiment of the present invention, first text information corresponding to a target subtitle strip in a video file is extracted; the audio information of the target subtitle strip is recognized to obtain corresponding second text information; and error correction is performed by text comparison of the first text information with the second text information, and the error correction result is output. This achieves intelligent correction of subtitle content and solves the problems of low efficiency and high input cost of manual error correction.
On the basis of the above technical solution, "performing error correction by text comparison of the first text information with the second text information, and outputting an error correction result" may specifically be:
comparing the first text information with the second text information one by one, in units of characters or words; recording the characters or words in the second text information that differ from the first text information; and outputting those characters or words as the error correction result.
Optionally, in a specific implementation of text error correction, the first text information and the second text information may be compared one by one in units of characters or words. In a specific example, a word may be short or long; the specific word length is not limited. Note that the shorter the word, the more accurate the comparison result. The characters or words that differ are recorded, and the record is output as the error correction result.
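A unit-by-unit comparison of this kind can be sketched with Python's standard `difflib` module. The function name `diff_units` and the shape of the returned tuples are assumptions made for illustration; the patent does not prescribe a particular comparison algorithm.

```python
import difflib
from typing import List, Sequence, Tuple


def diff_units(first: Sequence[str],
               second: Sequence[str]) -> List[Tuple[int, List[str], List[str]]]:
    """Compare two texts unit by unit (characters or words).

    Returns one tuple per differing span:
    (position in the first text, units in the first text, units in the second text).
    """
    matcher = difflib.SequenceMatcher(a=first, b=second, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((i1, list(first[i1:i2]), list(second[j1:j2])))
    return differences


# Word-level comparison: pass lists of words.
word_diff = diff_units("the cat sat".split(), "the bat sat".split())

# Character-level comparison: pass the strings themselves.
char_diff = diff_units("字幕纠错", "字幕纠措")
```

Passing a string compares character by character, which suits Chinese subtitles; passing a list of words compares word by word. Using an alignment rather than a position-by-position check keeps one missing character from marking everything after it as different.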
Embodiment two
Fig. 2 is a flow chart of an error correction method for subtitle content provided by Embodiment two of the present invention. On the basis of the above embodiment, this embodiment refines "extracting the first text information of the target subtitle strip in the video file". Referring to Fig. 2, the method may specifically include the following steps:
S210: Determine whether the current image frame contains a subtitle; if so, perform S220; if not, return to S210.
Specifically, the current image frame is determined from the video being played, and it is determined whether the current image frame contains a subtitle. If there is no subtitle, the process returns and continues checking whether the current image frame contains a subtitle, until a subtitle appears.
S220: Determine the position of the subtitle strip and the start frame and end frame of the subtitle strip.
Specifically, to determine the position of the subtitle strip, the luminance image of the image frame is first acquired and texture maps are generated. The horizontal projection of the vertical texture map is differenced to determine first the upper and lower borders of a horizontal subtitle strip and then its left and right borders, which fixes the position of the horizontal subtitle strip. The position of a vertical subtitle strip is then determined: the vertical projection of the horizontal texture map is differenced to determine first the left and right borders and then the upper and lower borders of the vertical subtitle strip. Finally, subtitle strip denoising is performed and the position of the subtitle strip is determined.
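The projection idea for a horizontal subtitle strip can be sketched as below. This is a simplified illustration, not the procedure of the patent: the texture map is assumed to be the absolute difference between horizontally adjacent pixels, a fixed threshold on the projections stands in for the differencing step, and vertical strips and denoising are not handled. All function names and the default threshold are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # assumed format: rows of 8-bit luminance values


def texture_map(lum: Image) -> Image:
    """Absolute difference between horizontally adjacent pixels."""
    return [[abs(row[c + 1] - row[c]) for c in range(len(row) - 1)] for row in lum]


def _span(profile: List[int], threshold: int) -> Optional[Tuple[int, int]]:
    """First and last index whose projection value exceeds the threshold."""
    hits = [i for i, v in enumerate(profile) if v > threshold]
    return (hits[0], hits[-1]) if hits else None


def locate_strip(lum: Image, threshold: int = 50) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, bottom, left, right) of the textured band, or None."""
    tex = texture_map(lum)
    row_profile = [sum(row) for row in tex]  # projection onto the vertical axis
    rows = _span(row_profile, threshold)
    if rows is None:
        return None
    top, bottom = rows
    # Projection onto the horizontal axis, restricted to the detected rows.
    col_profile = [sum(tex[r][c] for r in range(top, bottom + 1))
                   for c in range(len(tex[0]))]
    cols = _span(col_profile, threshold)
    if cols is None:
        return None
    left, right = cols
    return top, bottom, left, right + 1  # a gradient at column c spans pixels c and c+1
```

Text regions have many sharp luminance transitions, so rows and columns that cross the subtitle produce large projection values while flat background produces values near zero.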
If a subtitle strip is present and the current image frame is a subtitle strip key frame, the start frame of the subtitle strip is determined between the previous key frame and this subtitle strip key frame. The subtitle strip region of this key frame is then matched against the following key frames in turn: while the match is consistent, matching continues; once a match is inconsistent, the end frame of the subtitle strip is determined between the previous key frame and the current key frame.
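The matching logic can be sketched as follows, under simplifying assumptions: every frame is examined rather than only key frames, each frame is represented by a flattened, binarized subtitle strip region (or `None` when no subtitle was detected), and two regions match when at least 90% of their pixels agree. The names `regions_match` and `subtitle_intervals` and the 0.9 ratio are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Region = Tuple[int, ...]  # assumed: flattened, binarized subtitle strip pixels


def regions_match(a: Region, b: Region, min_ratio: float = 0.9) -> bool:
    """Two regions match when enough of their pixels are identical."""
    if len(a) != len(b) or not a:
        return False
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / len(a) >= min_ratio


def subtitle_intervals(regions: Sequence[Optional[Region]]) -> List[Tuple[int, int]]:
    """Group consecutive frames showing the same subtitle into (start, end) pairs."""
    intervals: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, region in enumerate(regions):
        if start is not None:
            if region is not None and regions_match(regions[start], region):
                continue                      # same subtitle, keep extending
            intervals.append((start, i - 1))  # the previous frame was the end frame
            start = None
        if region is not None:
            start = i                         # a new subtitle begins at this frame
    if start is not None:
        intervals.append((start, len(regions) - 1))
    return intervals
```

A key-frame variant would apply the same comparison to key frames first and then search between the last matching and first non-matching key frame for the exact boundary.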
S230: Extract the first text information of the subtitle strip.
S240: Recognize the audio information of the target subtitle strip to obtain corresponding second text information.
S250: Perform error correction by text comparison of the first text information with the second text information, and output an error correction result.
In this embodiment of the present invention, it is determined whether the current image frame contains a subtitle; if so, the position of the subtitle strip and its start frame and end frame are determined; if not, the check is repeated until a subtitle is detected. Determining the start frame and end frame of the subtitle strip enables extraction of the subtitle information in the subtitle strip.
Embodiment three
Fig. 3 is a flow chart of an error correction method for subtitle content provided by Embodiment three of the present invention. On the basis of the above embodiments, this embodiment refines "recognizing the second text information corresponding to the audio information of the target subtitle strip". Referring to Fig. 3, the method may specifically include the following steps:
S310: Determine whether the current image frame contains a subtitle; if so, perform S320; if not, return to S310.
S320: Determine the position of the subtitle strip and the start frame and end frame of the subtitle strip.
S330: Extract the first text information of the subtitle strip.
S340: Determine a time interval from the start frame and the end frame.
Specifically, a time interval is determined from the start frame and the end frame. Denote this time interval T; that is, the time from the start frame to the end frame of the same subtitle strip is T.
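As a minimal sketch, the interval T can be computed from frame indices once a frame rate is known. The frame rate `fps` and the convention that the end frame stays on screen for one full frame period are assumptions; the patent states only that T spans the start frame to the end frame.

```python
from typing import Tuple


def frame_interval(start_frame: int, end_frame: int, fps: float) -> Tuple[float, float, float]:
    """Return (start_seconds, end_seconds, T) for one subtitle strip."""
    if fps <= 0 or end_frame < start_frame:
        raise ValueError("invalid frame range or frame rate")
    start_s = start_frame / fps
    end_s = (end_frame + 1) / fps  # assumed: the end frame is shown for one frame period
    return start_s, end_s, end_s - start_s
```

For example, a subtitle shown from frame 50 to frame 99 of a 25 fps video covers the two seconds from 2.0 s to 4.0 s.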
S350: Parse and cut the audio information in the video according to the time interval.
The audio information in the video is parsed and segmented on the basis of the determined time interval. In a specific example, the audio in the video is divided into several segments of audio information on the basis of the time interval T, and the segmented audio information is parsed.
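Cutting by time interval can be sketched on already decoded audio. The example assumes mono PCM samples held in memory and intervals given in seconds; decoding the audio track from the video container is outside the sketch, and the name `cut_audio` is hypothetical.

```python
from typing import List, Sequence, Tuple


def cut_audio(samples: Sequence[int], sample_rate: int,
              intervals: Sequence[Tuple[float, float]]) -> List[List[int]]:
    """Slice decoded PCM samples into one segment per (start_s, end_s) interval."""
    segments = []
    for start_s, end_s in intervals:
        first = max(0, round(start_s * sample_rate))
        last = min(len(samples), round(end_s * sample_rate))  # clamp to the track length
        segments.append(list(samples[first:last]))
    return segments
```

Each returned segment lines up with one subtitle strip, so the text recognized from a segment can be compared directly with the text extracted from that strip.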
S360: Compare the parsed and cut audio information with a preset text library to recognize the second text information corresponding to the audio information.
Specifically, the parsed and cut audio information is compared with the preset text library. Optionally, the preset text library can be obtained through a speech recognition function; in a specific example, it can be obtained by calling the open-source interface of iFLYTEK (科大讯飞) speech recognition. The preset text library stores the correspondence between each piece of audio content and its text information. The parsed and cut audio information is compared with the preset text library to recognize the second text information corresponding to the audio information.
Optionally, the preset text library is stored in a server connected to a speech recognition module.
The speech recognition module is connected to the server, and the preset text library is stored in the server. Because the preset text library is held on the server, it can be called in real time as needed.
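The patent does not say how audio is compared with the preset text library, so the following is only a toy stand-in for the stored audio-to-text correspondence. It assumes each audio segment has already been reduced to a numeric feature vector and returns the text of the nearest stored entry. The class `PresetTextLibrary`, its methods, the feature representation and the distance limit are all hypothetical; a real system would call a speech recognition service instead.

```python
from typing import List, Optional, Sequence, Tuple


class PresetTextLibrary:
    """In-memory stand-in for the server-side audio-to-text correspondence."""

    def __init__(self) -> None:
        self._entries: List[Tuple[Tuple[float, ...], str]] = []

    def add(self, features: Sequence[float], text: str) -> None:
        """Store one (audio features, text) correspondence."""
        self._entries.append((tuple(features), text))

    def lookup(self, features: Sequence[float], max_distance: float = 1.0) -> Optional[str]:
        """Return the text of the closest stored entry, or None if none is close enough."""
        best_text, best_dist = None, max_distance
        for stored, text in self._entries:
            if len(stored) != len(features):
                continue  # incomparable entry
            dist = sum((a - b) ** 2 for a, b in zip(stored, features)) ** 0.5
            if dist <= best_dist:
                best_text, best_dist = text, dist
        return best_text
```

Keeping the library behind a small interface like this mirrors the description: the recognition module only needs `lookup`, whether the data lives in memory or on a connected server.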
S370: Perform error correction by text comparison of the first text information with the second text information, and output an error correction result.
In this embodiment of the present invention, a time interval is determined from the start frame and the end frame of the image frames, the audio information in the video is parsed and cut according to the time interval, and the parsed and cut audio information is compared with the preset text library to recognize the second text information corresponding to the audio information. This achieves recognition of the second text information corresponding to the audio information.
Embodiment four
Fig. 4 is a schematic structural diagram of an error correction device for subtitle content provided by Embodiment four of the present invention. The device is suitable for performing the error correction method for subtitle content provided by the embodiments of the present invention. As shown in Fig. 4, the device may specifically include:
an information extraction module 410, configured to extract first text information corresponding to a target subtitle strip in a video file;
an information recognition module 420, configured to recognize the audio information of the target subtitle strip to obtain corresponding second text information;
an information comparison module 430, configured to perform error correction by text comparison of the first text information with the second text information, and to output an error correction result.
Further, the information extraction module 410 is specifically configured to:
determine whether the current image frame contains a subtitle and, if so, determine the position of the subtitle strip and the start frame and end frame of the subtitle strip;
extract the first text information of the subtitle strip.
Further, the information recognition module 420 is specifically configured to:
determine a time interval from the start frame and the end frame;
parse and cut the audio information in the video according to the time interval;
compare the parsed and cut audio information with a preset text library to recognize the second text information corresponding to the audio information.
Further, the information comparison module 430 is specifically configured to:
compare the first text information with the second text information one by one, in units of characters or words;
record the characters or words in the second text that differ from the first text;
output those characters or words as the error correction result.
Further, the preset text library is stored in a server connected to a speech recognition module.
The error correction device for subtitle content provided by the embodiments of the present invention can perform the error correction method for subtitle content provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the method.
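How the three modules fit together can be sketched as a small pipeline in which each module is an injected callable. The class name `SubtitleErrorCorrector`, the callable signatures and the stub implementations in the usage example are assumptions for illustration; they are not part of the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Interval = Tuple[int, int]  # assumed: (start frame, end frame) of the subtitle strip


@dataclass
class SubtitleErrorCorrector:
    extract: Callable[[str], Tuple[str, Interval]]  # role of extraction module 410
    recognize: Callable[[str, Interval], str]       # role of recognition module 420
    compare: Callable[[str, str], List[str]]        # role of comparison module 430

    def run(self, video_path: str) -> List[str]:
        """Extract the subtitle text, recognize the audio, and return the differences."""
        first_text, interval = self.extract(video_path)
        second_text = self.recognize(video_path, interval)
        return self.compare(first_text, second_text)


# Usage with stub modules standing in for real OCR and speech recognition.
corrector = SubtitleErrorCorrector(
    extract=lambda path: ("the cat sat", (50, 99)),
    recognize=lambda path, interval: "the bat sat",
    compare=lambda a, b: [w2 for w1, w2 in zip(a.split(), b.split()) if w1 != w2],
)
result = corrector.run("demo.mp4")
```

Injecting the modules as callables keeps the sketch agnostic about whether each one is implemented in software, hardware, or a remote service, in line with the description.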
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them; it may include other equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims (10)

1. An error correction method for subtitle content, characterized by comprising:
extracting first text information corresponding to a target subtitle strip in a video file;
recognizing the audio information of the target subtitle strip to obtain corresponding second text information;
performing error correction by text comparison of the first text information with the second text information, and outputting an error correction result.
2. The method according to claim 1, characterized in that extracting the first text information of the target subtitle strip in the video file comprises:
determining whether the current image frame contains a subtitle and, if so, determining the position of the subtitle strip and the start frame and end frame of the subtitle strip;
extracting the first text information of the subtitle strip.
3. The method according to claim 2, characterized in that recognizing the second text information corresponding to the audio information of the target subtitle strip comprises:
determining a time interval from the start frame and the end frame;
parsing and cutting the audio information in the video according to the time interval;
comparing the parsed and cut audio information with a preset text library to recognize the second text information corresponding to the audio information.
4. The method according to claim 1, characterized in that performing error correction by text comparison of the first text information with the second text information and outputting the error correction result comprises:
comparing the first text information with the second text information one by one, in units of characters or words;
recording the characters or words in the second text that differ from the first text;
outputting those characters or words as the error correction result.
5. The method according to claim 3, characterized in that the preset text library is stored in a server connected to a speech recognition module.
6. An error correction device for subtitle content, characterized by comprising:
an information extraction module, configured to extract first text information corresponding to a target subtitle strip in a video file;
an information recognition module, configured to recognize the audio information of the target subtitle strip to obtain corresponding second text information;
an information comparison module, configured to perform error correction by text comparison of the first text information with the second text information, and to output an error correction result.
7. The device according to claim 6, characterized in that the information extraction module is specifically configured to:
determine whether the current image frame contains a subtitle and, if so, determine the position of the subtitle strip and the start frame and end frame of the subtitle strip;
extract the first text information of the subtitle strip.
8. The device according to claim 7, characterized in that the information recognition module is specifically configured to:
determine a time interval from the start frame and the end frame;
parse and cut the audio information in the video according to the time interval;
compare the parsed and cut audio information with a preset text library to recognize the second text information corresponding to the audio information.
9. The device according to claim 6, characterized in that the information comparison module is specifically configured to:
compare the first text information with the second text information one by one, in units of characters or words;
record the characters or words in the second text that differ from the first text;
output those characters or words as the error correction result.
10. The device according to claim 8, characterized in that the preset text library is stored in a server connected to a speech recognition module.
CN201710624479.XA 2017-07-27 2017-07-27 Error correction method and device for subtitle content Pending CN107277645A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710624479.XA CN107277645A (en) 2017-07-27 2017-07-27 Error correction method and device for subtitle content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710624479.XA CN107277645A (en) 2017-07-27 2017-07-27 Error correction method and device for subtitle content

Publications (1)

Publication Number Publication Date
CN107277645A true CN107277645A (en) 2017-10-20

Family

ID=60079240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710624479.XA Pending CN107277645A (en) 2017-07-27 2017-07-27 Error correction method and device for subtitle content

Country Status (1)

Country Link
CN (1) CN107277645A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8544049B2 (en) * 2006-09-27 2013-09-24 Hitachi, Ltd. Contents receiving system and client
CN101448100A (en) * 2008-12-26 2009-06-03 西安交通大学 Method for extracting video captions quickly and accurately
CN101464896A (en) * 2009-01-23 2009-06-24 安徽科大讯飞信息科技股份有限公司 Voice fuzzy retrieval method and apparatus
CN106529529A (en) * 2016-10-31 2017-03-22 腾讯科技(深圳)有限公司 Video subtitle identification method and system
CN106604125A (en) * 2016-12-29 2017-04-26 北京奇艺世纪科技有限公司 Video subtitle determining method and video subtitle determining device
CN106782560A (en) * 2017-03-06 2017-05-31 海信集团有限公司 Determine the method and device of target identification text

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108377416A (en) * 2018-02-27 2018-08-07 维沃移动通信有限公司 A kind of video broadcasting method and mobile terminal
CN108833403A (en) * 2018-06-11 2018-11-16 颜彦 It is a kind of to melt media information publication generation method with embedded code transplanting
CN111968649A (en) * 2020-08-27 2020-11-20 腾讯科技(深圳)有限公司 Subtitle correction method, subtitle display method, device, equipment and medium
CN111968649B (en) * 2020-08-27 2023-09-15 腾讯科技(深圳)有限公司 Subtitle correction method, subtitle display method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171020