CN109887492A - Data processing method, apparatus, and electronic device - Google Patents
- Publication number: CN109887492A (application CN201811497640.2A)
- Authority: CN (China)
- Prior art keywords
- text
- punctuation
- splicing
- word segment
- symbol
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
An embodiment of the invention provides a data processing method, apparatus, and electronic device. The method includes: obtaining the current speech recognition text; splicing the current speech recognition text with the previous N output texts to obtain a spliced text, where N is a positive integer; adding punctuation to the spliced text; and extracting from the punctuated spliced text the data other than the previous N output texts as the current output text, and outputting it. By combining the two texts before and after a pause, embodiments of the invention can determine the punctuation at the end of the text preceding the pause, solving the problem of punctuation errors caused by pauses and thereby improving the accuracy of punctuation.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a data processing method, apparatus, and electronic device.
Background technique
Artificial intelligence is a very broad science made up of different fields, such as machine learning, computer vision, and so on. Generally speaking, the main goal of artificial-intelligence research is to enable machines to be competent at complex work that normally requires human intelligence. Since the birth of artificial intelligence, its theory and technology have grown increasingly mature and its fields of application have kept expanding. One example is machine translation, for instance translating Chinese into English or English into Chinese.
As machine translation technology matures, machine-based simultaneous interpretation has emerged. Simultaneous interpretation may include speech recognition and machine translation, as shown in Figure 1. The speech recognition part includes several stages: collecting voice data, VAD (Voice Activity Detection) segmentation, speech recognition, and text punctuation. VAD segmentation cuts the audio into multiple speech fragments according to silent intervals; text punctuation adds punctuation marks to the speech recognition text corresponding to each speech fragment. For example, the recognition text "hello everyone I am Li Lei" becomes "hello everyone, I am Li Lei" after punctuation.
A user may pause in the middle of a sentence. For example, the user first says "We eagerly anticipate", pauses briefly, and then continues "this new technology". When the pause exceeds a threshold, the sentence is split into multiple parts: if the user paused 20 ms between the two phrases, the audio of this single complete sentence may be cut into two speech fragments, one corresponding to "We eagerly anticipate" and one corresponding to "this new technology". When punctuation is then added to the recognition text of each speech fragment separately, a mark may be placed at the end of each piece; for instance, a full stop may be appended after "We eagerly anticipate", yielding "We eagerly anticipate." Clearly no punctuation mark should follow "We eagerly anticipate" here, so the punctuation is wrong.
Summary of the invention
Embodiments of the present invention provide a data processing method to improve the accuracy of punctuation. Correspondingly, embodiments of the invention also provide a data processing apparatus and an electronic device to guarantee the implementation and application of the above method.
To solve the above problems, an embodiment of the invention discloses a data processing method that specifically includes: obtaining the current speech recognition text; splicing the current speech recognition text with the previous N output texts to obtain a spliced text, where N is a positive integer; adding punctuation to the spliced text; and extracting from the punctuated spliced text the data other than the previous N output texts as the current output text, and outputting it.
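The disclosure ties this flow to no concrete code; as a rough illustration only, it can be sketched in Python, where `punctuate` is a placeholder for the punctuation model and the prefix-alignment step is a simplifying assumption (a real system would align by tokens):

```python
def splice_and_emit(current_text, prev_outputs, punctuate, n=1):
    """Steps of the claimed method: splice the current speech recognition
    text with the previous N output texts, punctuate the splice, and
    extract only the data other than those previous outputs."""
    context = "".join(prev_outputs[-n:])
    punctuated = punctuate(context + current_text)
    # Simplifying assumption: the model leaves the already-output context
    # unchanged, so the current output is everything after it (it may begin
    # with a new mark that terminates the previous output text).
    assert punctuated.startswith(context), "model rewrote the context"
    return punctuated[len(context):]

# Toy punctuator standing in for the real model (hypothetical rule).
toy_punctuate = lambda s: s.replace("Li Lei very", "Li Lei, very")
```

With the previous output "hello everyone, I am Li Lei" and the new recognition text " very", the call returns ", very": the comma that terminates the previous output plus the new text, as in the embodiments below.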
Optionally, adding punctuation to the spliced text includes: performing word segmentation on the spliced text to obtain multiple word segments; determining, according to a symbol matching model, the symbol label corresponding to each word segment; and, if the symbol label of a word segment is a preset label, adding that symbol label after the segment's text in the spliced text.
Optionally, the symbol matching model includes a first symbol matching model and a second symbol matching model, and determining the symbol label corresponding to each word segment according to the symbol matching model includes: inputting each word segment in turn into the first symbol matching model to obtain first probability information, i.e. the probability of each symbol label for each word segment; inputting each word segment in turn into the second symbol matching model to obtain second probability information for each symbol label; and, for a given word segment, determining its symbol label according to its first and second probability information.
Optionally, determining the symbol label of a word segment according to its first and second probability information includes: computing first variance information from the segment's first probability information over the symbol labels; computing second variance information from its second probability information; if the first variance information is greater than the second, choosing the symbol label with the largest first probability as the segment's symbol label; and if the second variance information is greater than the first, choosing the symbol label with the largest second probability as the segment's symbol label.
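As an illustrative, non-normative sketch of this variance rule, assuming each model returns a label-to-probability mapping (the label names "," "." "$" are placeholders, and tie behavior is unspecified by the text, so a tie falls to the second model here):

```python
from statistics import pvariance

def pick_symbol_label(first_probs, second_probs):
    """Fuse two symbol matching models as described above: compute the
    variance of each model's probability distribution over the symbol
    labels and trust the model whose variance is larger (a larger
    variance means a more peaked, i.e. more confident, distribution)."""
    v1 = pvariance(first_probs.values())   # first variance information
    v2 = pvariance(second_probs.values())  # second variance information
    chosen = first_probs if v1 > v2 else second_probs  # tie: second model
    return max(chosen, key=chosen.get)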
Optionally, the method further includes: if the end of the punctuated spliced text carries no punctuation and the current speech recognition text is the last speech recognition text of the voice data, adding a preset punctuation mark at the end of the punctuated spliced text; and if the end of the punctuated spliced text carries punctuation but the current speech recognition text is not the last speech recognition text of the voice data, deleting the punctuation at the end of the punctuated spliced text.
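A minimal sketch of this end-of-text handling, under the assumptions that the mark set is ",.;!?" and the preset closing mark is "." (both illustrative, not specified by the disclosure):

```python
PUNCTUATION = set(",.;!?")   # illustrative mark set

def fix_splice_tail(punctuated, is_last_text, default_mark="."):
    """The punctuated splice for the last recognition text of the audio
    must end with a mark (a preset one is appended if missing); for any
    earlier text a trailing mark is deleted, since that decision is
    deferred until the next recognition text arrives."""
    has_tail_mark = bool(punctuated) and punctuated[-1] in PUNCTUATION
    if is_last_text and not has_tail_mark:
        return punctuated + default_mark
    if not is_last_text and has_tail_mark:
        return punctuated[:-1]
    return punctuated
```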
Optionally, splicing the current speech recognition text with the previous N output texts to obtain the spliced text includes: if N is 1, obtaining the final segment of the previous output text, the final segment being the text after the last punctuation mark in that output text; and splicing the current speech recognition text after the final segment to obtain the spliced text.
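Extracting the final segment can be sketched as follows (the mark set is again an illustrative assumption):

```python
import re

_PUNCT_RE = re.compile(r"[,.;!?]")   # illustrative mark set

def final_segment(prev_output):
    """Return the final segment of the previous output text: the text
    after its last punctuation mark. When N = 1 this is the only context
    that needs to be stored for splicing, which saves storage space."""
    return _PUNCT_RE.split(prev_output)[-1].lstrip()
```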
Optionally, the method is applied to the field of simultaneous interpretation.
An embodiment of the invention also discloses a data processing apparatus that specifically includes: a text obtaining module, configured to obtain the current speech recognition text; a text splicing module, configured to splice the current speech recognition text with the previous N output texts to obtain a spliced text, where N is a positive integer; and a punctuation adding module, configured to add punctuation to the spliced text, extract from the punctuated spliced text the data other than the previous N output texts as the current output text, and output it.
Optionally, the punctuation adding module includes: a word segmentation submodule, configured to perform word segmentation on the spliced text to obtain multiple word segments; a punctuation determining submodule, configured to determine the symbol label of each word segment according to a symbol matching model; and a symbol adding submodule, configured to add a word segment's symbol label after the segment's text in the spliced text when that label is a preset label.
Optionally, the symbol matching model includes a first symbol matching model and a second symbol matching model, and the punctuation determining submodule includes: a first information determining unit, configured to input each word segment in turn into the first symbol matching model and obtain the first probability information of each symbol label for each segment; a second information determining unit, configured to input each word segment in turn into the second symbol matching model and obtain the second probability information of each symbol label for each segment; and a symbol determining unit, configured to determine, for a given word segment, its symbol label according to its first and second probability information.
Optionally, the symbol determining unit is configured to: compute first variance information from the segment's first probability information over the symbol labels; compute second variance information from its second probability information; if the first variance information is greater than the second, choose the symbol label with the largest first probability as the segment's symbol label; and if the second variance information is greater than the first, choose the symbol label with the largest second probability as the segment's symbol label.
Optionally, the apparatus further includes: an end punctuation adding module, configured to add a preset punctuation mark at the end of the punctuated spliced text when that end carries no punctuation and the current speech recognition text is the last speech recognition text of the voice data; and an end punctuation removing module, configured to delete the punctuation at the end of the punctuated spliced text when that end carries punctuation but the current speech recognition text is not the last speech recognition text of the voice data.
Optionally, the text splicing module is configured to, if N is 1, obtain the final segment of the previous output text, that is, the text after the last punctuation mark in that output text, and splice the current speech recognition text after the final segment to obtain the spliced text.
Optionally, the apparatus is applied to the field of simultaneous interpretation.
An embodiment of the invention also discloses a readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the data processing method of any embodiment of the invention.
An embodiment of the invention also discloses an electronic device, including memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, and include instructions for: obtaining the current speech recognition text; splicing the current speech recognition text with the previous N output texts to obtain a spliced text, where N is a positive integer; adding punctuation to the spliced text; and extracting from the punctuated spliced text the data other than the previous N output texts as the current output text, and outputting it.
Optionally, adding punctuation to the spliced text includes: performing word segmentation on the spliced text to obtain multiple word segments; determining, according to a symbol matching model, the symbol label corresponding to each word segment; and, if the symbol label of a word segment is a preset label, adding that symbol label after the segment's text in the spliced text.
Optionally, the symbol matching model includes a first symbol matching model and a second symbol matching model, and determining the symbol label corresponding to each word segment according to the symbol matching model includes: inputting each word segment in turn into the first symbol matching model to obtain the first probability information of each symbol label for each segment; inputting each word segment in turn into the second symbol matching model to obtain the second probability information of each symbol label for each segment; and, for a given word segment, determining its symbol label according to its first and second probability information.
Optionally, determining the symbol label of a word segment according to its first and second probability information includes: computing first variance information from the segment's first probability information over the symbol labels; computing second variance information from its second probability information; if the first variance information is greater than the second, choosing the symbol label with the largest first probability as the segment's symbol label; and if the second variance information is greater than the first, choosing the symbol label with the largest second probability as the segment's symbol label.
Optionally, the electronic device further includes instructions for: if the end of the punctuated spliced text carries no punctuation and the current speech recognition text is the last speech recognition text of the voice data, adding a preset punctuation mark at the end of the punctuated spliced text; and if the end of the punctuated spliced text carries punctuation but the current speech recognition text is not the last speech recognition text of the voice data, deleting the punctuation at the end of the punctuated spliced text.
Optionally, splicing the current speech recognition text with the previous N output texts to obtain the spliced text includes: if N is 1, obtaining the final segment of the previous output text, the final segment being the text after the last punctuation mark in that output text; and splicing the current speech recognition text after the final segment to obtain the spliced text.
Optionally, the electronic device is applied to the field of simultaneous interpretation.
Embodiments of the present invention include the following advantages:
In an embodiment of the invention, after the current speech recognition text is obtained, it can be spliced with the previous N output texts to obtain a spliced text, and punctuation is then added to the spliced text; the punctuation at the end of the previous output text can thus be determined in combination with the following text. The data other than the previous N output texts are then extracted from the punctuated spliced text as the current output text and output, so the punctuation at the end of the previous output text is given when the next speech recognition text is output. By combining the two texts before and after a pause, embodiments of the invention can determine the punctuation at the end of the text preceding the pause, solving the problem of punctuation errors caused by pauses and thereby improving the accuracy of punctuation.
Description of the drawings
Fig. 1 is a flow chart of the steps of an embodiment of a data processing method of the invention;
Fig. 2 is a flow chart of the steps of an alternative embodiment of a data processing method of the invention;
Fig. 3 is a structural block diagram of an embodiment of a data processing apparatus of the invention;
Fig. 4 is a structural block diagram of an alternative embodiment of a data processing apparatus of the invention;
Fig. 5 is a structural block diagram of an electronic device for data processing according to an exemplary embodiment;
Fig. 6 is a structural schematic diagram of an electronic device for data processing according to another exemplary embodiment of the invention.
Specific embodiment
In order to make the above objectives, features, and advantages of the present invention clearer and easier to understand, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
One of the core ideas of the embodiments of the invention is that, after a speech recognition text is received, the punctuation at the end of the previous output text is determined by combining this recognition text with the previous several output texts, and that end punctuation is given when this recognition text is output. By combining the texts before and after a pause, the punctuation at the end of the text preceding the pause is determined, solving the problem of punctuation errors caused by pauses and thereby improving the accuracy of punctuation.
Referring to Fig. 1, a flow chart of the steps of an embodiment of a data processing method of the invention is shown, which may specifically include the following steps:
Step 102: obtain the current speech recognition text.
Step 104: splice the current speech recognition text with the previous N output texts to obtain a spliced text.
Step 106: add punctuation to the spliced text, extract from the punctuated spliced text the data other than the previous N output texts as the current output text, and output it.
In an embodiment of the invention, during speech recognition the audio can be cut into multiple speech fragments according to silent intervals; speech recognition is then performed on each fragment in turn to obtain a speech recognition text, punctuation is added to that text, and it is output. Other processing may subsequently be applied to the output text, for example displaying it directly, or first translating it into text in another language and then displaying that translation, and so on.
In an embodiment of the invention, after a speech recognition text is obtained, it can be determined whether it is the first speech recognition text of the voice data. If it is, punctuation can be added to it directly, where punctuation may be added between the words of the text but not at its end; the punctuated speech recognition text (which may be called an output text) is then output. If the speech recognition text is not the first speech recognition text of the voice data, it may be called the current speech recognition text; it is then spliced with the output texts before it, and punctuation is added to the spliced text. During speech, a single sentence may be split into two outputs, or into more than two; therefore, to guarantee the accuracy of the added punctuation, an embodiment of the invention obtains the previous N output texts of the current speech recognition text, where N is a positive integer, and splices the current speech recognition text with those N output texts to obtain the spliced text. The data other than the previous N output texts are then extracted from the punctuated spliced text as the current output text and output. If, in the punctuated spliced text, there is punctuation between the previous output text and the current speech recognition text, the current output text may include that punctuation together with the current speech text; if there is no punctuation between them, the current output text includes only the punctuated current speech recognition text. In this way, the punctuation at the end of the previous output text can be given when the next speech recognition text is output.
It should be noted that punctuation may or may not need to be added between the words of a given speech recognition text (or spliced text). After punctuation is added, the text may therefore differ from the original, when marks were inserted between its words, or be identical to the original, when no marks were inserted.
In one embodiment of the invention, take the sentence "hello everyone, I am Li Lei, very glad to meet everyone". Suppose the user first says "hello everyone I am Li Lei", then after a 20 ms interval says "very", after another 20 ms says "glad", after another 20 ms says "to meet", and after another 20 ms says "everyone"; the voice data corresponding to "hello everyone I am Li Lei", "very", "glad", "to meet", and "everyone" are then each treated as one speech fragment. Speech recognition is first performed on the fragment "hello everyone I am Li Lei", giving the recognition text "hello everyone I am Li Lei"; since this is the first speech recognition text of the voice data, punctuation can be added to it, giving the punctuated text "hello everyone, I am Li Lei", which is then output. Speech recognition is then performed on the fragment "very", giving the recognition text "very"; since this is not the first speech recognition text of the voice data, it may be called the current speech recognition text and is spliced with the previous output text, giving the spliced text "hello everyone, I am Li Lei very". Punctuation is added to the spliced text, giving "hello everyone, I am Li Lei, very"; the data other than the previous output text are then extracted from the punctuated spliced text as the current output text, here extracting ", very" and outputting it. Speech recognition is then performed on the fragment "glad", giving the recognition text "glad"; since this is not the first speech recognition text of the voice data, it may be called the current speech recognition text and is spliced with the previous output text, giving the spliced text ", very glad". Punctuation is added, giving ", very glad"; the data other than the previous output text are extracted as the current output text, here extracting "glad" and outputting it. Speech recognition is then performed on the fragment "to meet", giving the recognition text "to meet"; since it is not the first speech recognition text of the voice data, it may be called the current speech recognition text and is spliced with the previous two output texts (", very" and "glad"), giving the spliced text ", very glad to meet". Punctuation is added, giving ", very glad to meet"; the data other than the previous two output texts are extracted as the current output text, here extracting "to meet" and outputting it. Speech recognition is then performed on the fragment "everyone", giving the recognition text "everyone"; since it is not the first speech recognition text of the voice data, it may be called the current speech recognition text and is spliced with the previous three output texts (", very", "glad", and "to meet"), giving the spliced text ", very glad to meet everyone". Punctuation is added, giving ", very glad to meet everyone"; the data other than the previous three output texts are extracted as the current output text, here extracting "everyone" and outputting it.
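The walkthrough above can be replayed end to end with a toy stub in place of the punctuation model; the helper names, the lookup-table rules, and the space-joined English splicing are all hypothetical simplifications:

```python
def run_stream(fragments, punctuate, marks=",.;!?"):
    """Replay the walkthrough: the context kept between rounds is
    everything emitted since the last punctuation mark; each recognition
    text is spliced onto it, re-punctuated, and only the new part is
    output."""
    outputs, context = [], ""
    for fragment in fragments:
        spliced = context + " " + fragment if context else fragment
        punctuated = punctuate(spliced)
        outputs.append(punctuated[len(context):])  # data beyond the context
        tail = punctuated
        for mark in marks:                 # final segment = text after the
            tail = tail.split(mark)[-1]    # last punctuation mark
        context = tail.strip()
    return outputs

# Stub punctuator: fixed answers for the splices seen in this walkthrough.
RULES = {
    "hello everyone I am Li Lei": "hello everyone, I am Li Lei",
    "I am Li Lei very": "I am Li Lei, very",
}
stub = lambda s: RULES.get(s, s)

outputs = run_stream(
    ["hello everyone I am Li Lei", "very", "glad", "to meet", "everyone"], stub)
```

Concatenating the five emitted pieces reproduces the fully punctuated sentence, which is exactly the point of deferring each end mark until the next fragment arrives.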
In one embodiment of the invention, take the sentence "I arrived in London in June". Suppose the user first says "I arrived in June", then after a 30 ms interval says "London"; the voice data corresponding to "I arrived in June" is treated as one speech fragment and the voice data corresponding to "London" as another. Speech recognition is first performed on the fragment "I arrived in June", giving the recognition text "I arrived in June"; since this is the first speech recognition text of the voice data, punctuation can be added to it, giving the punctuated text "I arrived in June", which is then output. Speech recognition is then performed on the fragment "London", giving the recognition text "London"; since this is not the first speech recognition text of the voice data, it may be called the current speech recognition text and is spliced with the previous output text, giving the spliced text for the whole sentence. Punctuation is added to the spliced text, and no mark is inserted between the two parts; the data other than the previous output text are then extracted from the punctuated spliced text as the current output text, here extracting "London" and outputting it.
In an embodiment of the invention, after the current speech recognition text is obtained, it can be spliced with the previous N output texts to obtain a spliced text, and punctuation is then added to the spliced text; the punctuation at the end of the previous output text can thus be determined in combination with the following text. The data other than the previous N output texts are then extracted from the punctuated spliced text as the current output text and output, so the punctuation at the end of the previous output text is given when the next speech recognition text is output. By combining the two texts before and after a pause, embodiments of the invention can determine the punctuation at the end of the text preceding the pause, solving the problem of punctuation errors caused by pauses and thereby improving the accuracy of punctuation.
In another embodiment of the invention, a symbol matching model can be used to determine the punctuation between the words of the spliced text, thereby adding punctuation to the spliced text; the symbol matching model is used to determine the punctuation that follows each word segment, and may be, for example, a language model (an N-gram model), a neural network model, and so on.
Referring to Fig. 2, a flow chart of the steps of an alternative embodiment of a data processing method of the invention is shown, which may specifically include the following steps:
Step 202: obtain the current speech recognition text.
Step 204: splice the current speech recognition text with the previous N output texts to obtain a spliced text.
In an embodiment of the invention, after a speech fragment is recognized, the resulting speech recognition text is obtained; if it is the first speech recognition text of the voice data, punctuation can be added to it and it is output, where no punctuation is added at its end. If it is not the first speech recognition text of the voice data, it may be called the current speech recognition text, spliced with the previous N output texts, and punctuation is then added to the spliced text.
Among the output texts before the current speech recognition text, some may be long and some short. If the several output texts before the current speech recognition text are short, those previous several output texts can be obtained and the current speech recognition text spliced with them (N is greater than 1 in this case); if the previous output text of the current speech recognition text is long, that single previous output text can be obtained and spliced with the current speech recognition text (N is 1 in this case).
When the current speech recognition text is spliced with the previous output text, punctuation may or may not have been added between the words of the previous output text. When punctuation has been added to the previous output text, the final segment of the previous output text can be saved while the previous output text is output, the final segment being the text after the last punctuation mark in the previous output text; this saves storage space. During splicing, the final segment of the previous output text can then be obtained, and the current speech recognition text is spliced after the final segment to obtain the spliced text. Of course, no punctuation may have been added between the words of the previous output text; in that case the whole text can be stored while the previous output text is output, and during splicing the current speech recognition text is spliced after the previous output text to obtain the spliced text.
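The splicing with a cached final segment (the N = 1 case above) can be sketched as follows. This is an illustrative sketch, not the patent's implementation; the function name, the punctuation set and the example strings are all hypothetical:

```python
def splice(current_text, prev_output, prev_has_punct):
    """Splice the current speech recognition text after the relevant part
    of the previous output text (the N = 1 case)."""
    if prev_has_punct:
        # Keep only the final segment: the text after the last punctuation
        # mark of the previous output text (this is what gets cached,
        # saving storage space).
        last = max(prev_output.rfind(p) for p in "、,。;!?")
        prefix = prev_output[last + 1:]
    else:
        # No punctuation was added: keep the whole previous output text.
        prefix = prev_output
    return prefix + current_text

print(splice("今天天气不错", "我们开始吧,大家好", True))  # -> "大家好今天天气不错"
```

Only the text after the last punctuation mark of the previous output needs to be carried forward, which also keeps the spliced text short for the punctuation model.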
Step 206: perform word segmentation processing on the spliced text to obtain corresponding multiple participle segments.
Step 208: determine the symbol identifier corresponding to each participle segment according to the symbol matching model.
In the embodiment of the present invention, the symbol matching model can be used to add punctuation to the spliced text. The spliced text can be divided into multiple participle segments, and each participle segment is then input into the symbol matching model in turn to obtain the symbol identifier corresponding to each participle segment. The symbol identifier may be of various kinds, for example punctuation-mark identifiers such as ",", "。", ";", "!" and "?", or a non-punctuation identifier such as " ", which characterizes the case where no punctuation mark follows. Of course, the punctuation-mark identifiers may also include others such as "…", and the non-punctuation identifier may also be another symbol such as "$"; the invention is not limited in this regard.
In the embodiment of the present invention, the symbol matching model may include a first symbol matching model and a second symbol matching model, where the first symbol matching model can be one of a language model and a neural network model, and the second symbol matching model can be the other of the language model and the neural network model. Of course, the first symbol matching model and the second symbol matching model may also be other models; the embodiment of the present invention imposes no restriction on this.
In an example of the invention, one way of determining the symbol identifier corresponding to each participle segment according to the symbol matching model may include the following sub-steps:
Sub-step S2: input each participle segment into the first symbol matching model in turn to obtain the first probability information of each symbol identifier for each participle segment.
Sub-step S4: input each participle segment into the second symbol matching model in turn to obtain the second probability information of each symbol identifier for each participle segment.
Sub-step S6: for a participle segment, determine the symbol identifier corresponding to the participle segment according to the first probability information and the second probability information of the symbol identifiers for that participle segment.
In the embodiment of the present invention, each participle segment can be input into the first symbol matching model in turn, according to the order of the participle segments in the spliced text; after a participle segment is input into the first symbol matching model, the first symbol matching model processes it. For each participle segment, the first symbol matching model can calculate the first probability that each symbol identifier follows the participle segment. Correspondingly, each participle segment can be input into the second symbol matching model in turn, and the second symbol matching model calculates the second probability that each symbol identifier follows the participle segment.
Then, for each participle segment, the symbol identifier corresponding to the participle segment can be determined according to the first probability information and the second probability information of the participle segment, specifically as follows: calculate first variance information according to the first probabilities of the symbol identifiers, and calculate second variance information according to the second probabilities of the symbol identifiers. If the first variance information is greater than the second variance information, choose the symbol identifier with the largest first probability as the symbol identifier corresponding to the participle segment; if the second variance information is greater than the first variance information, choose the symbol identifier with the largest second probability as the symbol identifier corresponding to the participle segment. The larger the variance information, the larger the differences among the probabilities of the symbol identifiers, which indicates that the corresponding symbol matching model is more credible; determining the participle segment's symbol identifier from the group of probabilities with the larger variance therefore improves the credibility of the added punctuation, and further improves the accuracy of adding punctuation.
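The variance-based choice between the two models can be sketched as follows. This is an illustrative example under the assumption that the "variance information" is the ordinary population variance of each model's probabilities; the function and variable names are hypothetical:

```python
from statistics import pvariance

def pick_symbol(probs_model1, probs_model2):
    """Given two dicts mapping symbol identifiers to probabilities (one per
    symbol matching model), trust the model whose probabilities have the
    larger variance, i.e. whose distribution is more peaked/discriminative."""
    v1 = pvariance(probs_model1.values())
    v2 = pvariance(probs_model2.values())
    chosen = probs_model1 if v1 > v2 else probs_model2
    # Return the symbol identifier with the largest probability in the
    # chosen model's distribution.
    return max(chosen, key=chosen.get)

p1 = {",": 0.5, "。": 0.3, "": 0.2}   # first model: fairly flat distribution
p2 = {",": 0.1, "。": 0.8, "": 0.1}   # second model: peaked -> larger variance
print(pick_symbol(p1, p2))              # -> "。"
```

A flat distribution signals that the model cannot discriminate among the candidate symbols, so the more peaked (higher-variance) distribution is trusted.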
In another example of the invention, a way of determining the symbol identifier corresponding to each participle segment according to the symbol matching model may also be to execute the above sub-step S2 without executing sub-steps S4-S6: for each participle segment, the corresponding symbol identifier is determined directly according to the first probability information of the participle segment, choosing the symbol identifier with the largest first probability as the symbol identifier corresponding to the participle segment.
In yet another example of the invention, a way of determining the symbol identifier corresponding to each participle segment may also be to execute sub-step S4 directly, without executing sub-steps S2 and S6: for each participle segment, the corresponding symbol identifier is determined directly according to the second probability information of the participle segment, choosing the symbol identifier with the largest second probability as the symbol identifier corresponding to the participle segment.
Correspondingly, splicing using only the text after the last punctuation mark of the previous output text, as described above, reduces the number of participle segments the symbol matching model must process, and thus also improves the efficiency of adding punctuation to the spliced text.
Step 210: if the symbol identifier of a participle segment is a set identifier, add the symbol identifier after the text corresponding to the participle segment in the spliced text.
In the embodiment of the present invention, the symbol identifier of a participle segment may be a punctuation-mark identifier or a non-punctuation identifier; therefore, for each participle segment, it can be judged whether the symbol identifier of the participle segment is a set identifier. If the symbol identifier of the participle segment is a set identifier, the symbol identifier is added after the text corresponding to the participle segment in the spliced text; if it is not a set identifier, no punctuation is added after the text corresponding to the participle segment, and it is directly judged whether the symbol identifier of the next participle segment is a set identifier. The set identifier may include the punctuation-mark identifiers, and can be configured as required.
When the symbol identifier of the participle segment is determined to be a set identifier and the participle segment is not the last segment of the spliced text, the symbol identifier corresponding to the participle segment can be added between the text corresponding to the participle segment and the text corresponding to the next participle segment in the spliced text.
Of course, when the symbol identifier of the participle segment is determined to be a set identifier and the participle segment is the last segment of the spliced text, the corresponding symbol identifier may be added after the text corresponding to the participle segment (i.e. at the end of the spliced text), or may not be added there; this can be configured as required, and the embodiment of the present invention imposes no restriction on it.
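Step 210 can be sketched as assembling the punctuated spliced text from the participle segments and their symbol identifiers. This is a hypothetical sketch, not the patent's implementation; `NONE` stands in for the non-punctuation identifier, and whether the last segment receives punctuation is left configurable, as the text describes:

```python
NONE = ""  # hypothetical non-punctuation identifier

def add_punctuation(segments, identifiers, punctuate_end=False):
    """segments: participle segments of the spliced text, in order.
    identifiers: the symbol identifier chosen for each segment.
    A punctuation identifier is appended after its segment; the last
    segment is punctuated only if punctuate_end is True."""
    parts = []
    for i, (seg, ident) in enumerate(zip(segments, identifiers)):
        parts.append(seg)
        is_last = i == len(segments) - 1
        if ident != NONE and (not is_last or punctuate_end):
            parts.append(ident)
    return "".join(parts)

print(add_punctuation(["你好", "请坐", "谢谢"], [",", "。", "。"]))
# -> "你好,请坐。谢谢"
```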
Step 212: if there is no punctuation at the end of the spliced text with added punctuation, and the current speech recognition text is the last speech recognition text of the voice data, add a set punctuation mark at the end of the spliced text with added punctuation.
Step 214: if there is punctuation at the end of the spliced text with added punctuation, and the current speech recognition text is not the last speech recognition text of the voice data, delete the punctuation at the end of the spliced text with added punctuation.
In the embodiment of the present invention, except for the output text corresponding to the last speech recognition text of the voice data, the punctuation at the end of the output text corresponding to each speech recognition text can be supplied at the beginning of the next output text; consequently, the end of every other output text need not include punctuation, while the end of the output text corresponding to the last speech recognition text may include punctuation. Therefore, after punctuation is added to the spliced text, if the current speech recognition text is not the last text of the voice data, it can be judged whether there is punctuation at the end of the spliced text with added punctuation: if there is, that punctuation is deleted; if there is not, step 216 can be executed. If the current speech recognition text is the last text of the voice data, it can likewise be judged whether there is punctuation at the end: if there is not, a set punctuation mark is added at the end of the spliced text with added punctuation; if there is, step 216 can be executed. The set punctuation mark can be configured as required, and may include a punctuation mark indicating sentence completion.
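Steps 212 and 214 amount to normalizing the end of the punctuated text, which can be sketched as follows. The punctuation set and the choice of the Chinese full stop as the "set punctuation" are illustrative assumptions:

```python
END_PUNCT = "。!?;,、"  # punctuation marks considered at the end (illustrative)
SET_PUNCT = "。"          # assumed set punctuation for statement completion

def normalize_end(punctuated_text, is_last_text):
    """Steps 212/214: intermediate output texts must not end with punctuation
    (the next output supplies it); the last output text must end with one."""
    ends_with_punct = punctuated_text.endswith(tuple(END_PUNCT))
    if is_last_text:
        return punctuated_text if ends_with_punct else punctuated_text + SET_PUNCT
    return punctuated_text[:-1] if ends_with_punct else punctuated_text

print(normalize_end("今天到此结束", True))   # -> "今天到此结束。"
print(normalize_end("我们接着说,", False))  # -> "我们接着说"
```

Deferring the end punctuation of intermediate texts lets the next splicing round decide it with more context, which is the core idea of the method.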
Step 216: extract the data other than the previous N output texts from the spliced text with added punctuation as the current output text, and output it.
The data other than the previous N output texts is extracted from the spliced text with added punctuation as the current output text and output. Therefore, if the current speech recognition text is not the last speech recognition text of the voice data, the end of the current output text carries no punctuation; if the current speech recognition text is the last speech recognition text of the voice data, the end of the current output text may include punctuation.
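Step 216 can be sketched as follows. The sketch assumes that the spliced text began with the cached part of the previous output text(s) and that the punctuation model may have inserted marks anywhere, including inside that cached prefix; the names are hypothetical:

```python
def extract_output(punctuated_spliced, cached_prefix):
    """Step 216: strip the characters belonging to the previous output
    text(s) so only the newly recognized portion is output. Punctuation
    inserted immediately after the prefix stays in the current output:
    it is the end punctuation of the previous text, supplied now."""
    i = j = 0
    while j < len(cached_prefix):
        if punctuated_spliced[i] == cached_prefix[j]:
            j += 1          # consumed one character of the cached prefix
        i += 1              # skip punctuation inserted inside the prefix
    return punctuated_spliced[i:]

print(extract_output("大家好,今天天气不错", "大家好"))  # -> ",今天天气不错"
```

Note how the output begins with the comma: this is the punctuation determined for the end of the previous output text, delivered at the beginning of the next output, exactly as the text describes.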
The embodiment of the present invention can be applied to the field of simultaneous interpretation. For example, after a voice capture device obtains voice data, it can send the voice data to a speech recognition service; the speech recognition service can divide the voice data into multiple speech segments according to the time intervals at which the voice data is received, recognize each speech segment, and then execute the above steps 202-216. The speech recognition service can output the current output text to a machine translation service, which translates the current output text into a translated text in another language. The translated text can then, on the one hand, be sent to a display device for display, and on the other hand be sent to a voice conversion service, which converts the translated text into speech in the corresponding language and outputs it to a voice playing device for playback, thereby realizing simultaneous interpretation. The speech recognition service, the machine translation service and the voice conversion service can be deployed in the same device or in different devices, which can be set as required; the embodiment of the present invention imposes no restriction on this.
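The simultaneous interpretation flow described above can be sketched as a simple orchestration loop. Every callable here is an injected stand-in for the named services, not a real API:

```python
def run_pipeline(segments, recognize, punctuate, translate, display, speak):
    """Hypothetical orchestration: each speech segment is recognized,
    punctuated with the previous output as context (steps 202-216),
    translated, displayed, and spoken."""
    prev = ""
    shown = []
    for k, seg in enumerate(segments):
        current = recognize(seg)                        # speech recognition service
        out = punctuate(current, prev,
                        is_last=(k == len(segments) - 1))  # steps 202-216
        prev = out
        text = translate(out)                           # machine translation service
        display(text)                                   # display device
        speak(text)                                     # voice conversion + playback
        shown.append(text)
    return shown
```

The three services may run in one process or be remote calls; the loop itself is independent of that deployment choice, matching the text's remark that deployment can be set as required.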
In the embodiment of the present invention, after the current speech recognition text is obtained, the current speech recognition text can be spliced with the previous N output texts to obtain a spliced text, and punctuation is then added to the spliced text; the data other than the previous N output texts is extracted from the spliced text with added punctuation as the current output text and output. In this way, the punctuation at the end of the previous output text is supplied when the next speech recognition text is output, which raises the accuracy of adding punctuation at the end of the previous output text. In addition, adding punctuation to the current speech recognition text in combination with the previous output text improves the accuracy of adding punctuation between the words of the current output text.
Secondly, after the embodiment of the present invention adds punctuation to the spliced text, if there is no punctuation at the end of the spliced text with added punctuation and the current speech recognition text is the last speech recognition text of the voice data, a set punctuation mark is added at the end of the spliced text with added punctuation; this guarantees that the end of the output text corresponding to the last speech recognition text of the voice data carries punctuation, guarantees the integrity of the output text, and improves the user experience.
Further, in the embodiment of the present invention, the spliced text can undergo word segmentation processing, the symbol identifier corresponding to each participle segment is then determined according to the symbol matching model, and, if the symbol identifier of a participle segment is a set identifier, the symbol identifier is added after the text corresponding to the participle segment in the spliced text. Each participle segment can be input into the first symbol matching model in turn to obtain the first probability information of each symbol identifier for each participle segment, and into the second symbol matching model in turn to obtain the second probability information of each symbol identifier for each participle segment; for a participle segment, the corresponding symbol identifier is determined according to the first probability information and the second probability information of the symbol identifiers for that participle segment. Determining the result from the outputs of two symbol matching models improves the accuracy of the symbol identifier determined for each participle segment, and thus further improves the accuracy of adding punctuation.
Further, in the embodiment of the present invention, first variance information is calculated according to the first probabilities of the symbol identifiers, and second variance information is calculated according to the second probabilities of the symbol identifiers. If the first variance information is greater than the second variance information, the symbol identifier with the largest first probability is chosen as the symbol identifier corresponding to the participle segment; if the second variance information is greater than the first variance information, the symbol identifier with the largest second probability is chosen. The larger the variance information, the larger the differences among the probabilities of the symbol identifiers, which indicates that the corresponding symbol matching model is more credible; the credibility of the added punctuation can therefore be improved, which further increases the accuracy of adding punctuation.
Again, during splicing in the embodiment of the present invention, if N is 1, the final segment of the previous output text can be obtained, the final segment being the text after the last punctuation mark in the previous output text; the current speech recognition text is then spliced after the final segment to obtain the spliced text. Thus only the text after the last punctuation mark of one output text needs to be saved, saving storage space, and the efficiency of subsequently adding punctuation to the spliced text can also be improved.
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of action combinations; however, those skilled in the art should understand that the embodiments of the present invention are not limited by the described sequence of actions, because according to the embodiments of the present invention some steps may be performed in other sequences or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 3, a structural block diagram of an embodiment of a data processing device of the invention is shown, which may specifically include the following modules:
a text obtaining module 302, for obtaining a current speech recognition text;
a text splicing module 304, for splicing the current speech recognition text with the previous N output texts to obtain a spliced text, where N is a positive integer;
a punctuation adding module 306, for adding punctuation to the spliced text, extracting the data other than the previous N output texts from the spliced text with added punctuation as the current output text, and outputting it.
Referring to Fig. 4, a structural block diagram of an alternative embodiment of a data processing device of the invention is shown.
In an optional embodiment of the present invention, the device further includes:
an end punctuation adding module 308, for adding a set punctuation mark at the end of the spliced text with added punctuation if there is no punctuation at the end of the spliced text with added punctuation and the current speech recognition text is the last speech recognition text of the voice data.
In an optional embodiment of the present invention, the device further includes:
an end punctuation removing module 310, for deleting the punctuation at the end of the spliced text with added punctuation if there is punctuation at the end of the spliced text with added punctuation and the current speech recognition text is not the last speech recognition text of the voice data.
In an optional embodiment of the present invention, the punctuation adding module 306 includes:
a word segmentation submodule 3062, for performing word segmentation processing on the spliced text to obtain corresponding multiple participle segments;
a punctuation determining submodule 3064, for determining the symbol identifier corresponding to each participle segment according to the symbol matching model;
a symbol adding submodule 3066, for adding the symbol identifier after the text corresponding to the participle segment in the spliced text if the symbol identifier of the participle segment is a set identifier.
In an optional embodiment of the present invention, the symbol matching model includes a first symbol matching model and a second symbol matching model, and the punctuation determining submodule 3064 includes:
a first information determining unit 30642, for inputting each participle segment into the first symbol matching model in turn to obtain the first probability information of each symbol identifier for each participle segment;
a second information determining unit 30644, for inputting each participle segment into the second symbol matching model in turn to obtain the second probability information of each symbol identifier for each participle segment;
a symbol determining unit 30646, for determining, for a participle segment, the symbol identifier corresponding to the participle segment according to the first probability information and the second probability information of the symbol identifiers for that participle segment.
In an optional embodiment of the present invention, the symbol determining unit 30646 is configured to calculate first variance information according to the first probability information of the symbol identifiers for the participle segment, and to calculate second variance information according to the second probability information of the symbol identifiers for the participle segment; if the first variance information is greater than the second variance information, the symbol identifier with the largest first probability is chosen as the symbol identifier corresponding to the participle segment; if the second variance information is greater than the first variance information, the symbol identifier with the largest second probability is chosen as the symbol identifier corresponding to the participle segment.
In an optional embodiment of the present invention, the text splicing module 304 is configured, if N is 1, to obtain the final segment of the previous output text, the final segment being the text after the last punctuation mark in the previous output text, and to splice the current speech recognition text after the final segment to obtain the spliced text.
In an optional embodiment of the present invention, the device is applied to the field of simultaneous interpretation.
In the embodiment of the present invention, after the current speech recognition text is obtained, the current speech recognition text can be spliced with the previous N output texts to obtain a spliced text, and punctuation is then added to the spliced text, so that the punctuation at the end of the previous output text can be determined in combination with the following context. The data other than the previous N output texts is subsequently extracted from the spliced text with added punctuation as the current output text and output, so that the punctuation at the end of the previous output text is supplied when the next speech recognition text is output. By combining the two texts before and after a pause, the embodiment of the present invention can determine the punctuation at the end of the text before the pause, solving the problem of mispunctuation caused by pauses and thereby improving the accuracy of adding punctuation.
As for the device embodiments, since they are basically similar to the method embodiments, the description is relatively simple; for relevant points, refer to the corresponding description of the method embodiments.
Fig. 5 is a structural block diagram of an electronic equipment 500 for data processing shown according to an exemplary embodiment. For example, the electronic equipment 500 can be a mobile phone, a computer, a digital broadcasting terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, etc.
Referring to Fig. 5, the electronic equipment 500 may include one or more of the following components: a processing component 502, a memory 504, a power component 506, a multimedia component 508, an audio component 510, an input/output (I/O) interface 512, a sensor component 514, and a communication component 516.
The processing component 502 usually controls the overall operation of the electronic equipment 500, such as operations associated with display, telephone calls, data communication, camera operation and recording. The processing component 502 may include one or more processors 520 to execute instructions, so as to perform all or part of the steps of the methods described above. In addition, the processing component 502 may include one or more modules to facilitate interaction between the processing component 502 and other components. For example, the processing component 502 may include a multimedia module to facilitate interaction between the multimedia component 508 and the processing component 502.
The memory 504 is configured to store various types of data to support operation at the equipment 500. Examples of such data include instructions of any application or method for operating on the electronic equipment 500, contact data, telephone directory data, messages, pictures, videos, etc. The memory 504 can be realized by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
The power component 506 provides power for the various components of the electronic equipment 500. The power component 506 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the electronic equipment 500.
The multimedia component 508 includes a screen providing an output interface between the electronic equipment 500 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors can not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 508 includes a front camera and/or a rear camera. When the electronic equipment 500 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 510 is configured to output and/or input audio signals. For example, the audio component 510 includes a microphone (MIC), which is configured to receive external audio signals when the electronic equipment 500 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals can be further stored in the memory 504 or sent via the communication component 516. In some embodiments, the audio component 510 further includes a loudspeaker for outputting audio signals.
The I/O interface 512 provides an interface between the processing component 502 and peripheral interface modules, which can be a keyboard, a click wheel, buttons, etc. These buttons may include, but are not limited to: a home button, volume buttons, a start button and a lock button.
The sensor component 514 includes one or more sensors for providing state assessments of various aspects of the electronic equipment 500. For example, the sensor component 514 can detect the open/closed state of the equipment 500 and the relative positioning of components, such as the display and keypad of the electronic equipment 500; the sensor component 514 can also detect a position change of the electronic equipment 500 or of one of its components, the presence or absence of user contact with the electronic equipment 500, the orientation or acceleration/deceleration of the electronic equipment 500, and a temperature change of the electronic equipment 500. The sensor component 514 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 514 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 514 may also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 516 is configured to facilitate wired or wireless communication between the electronic equipment 500 and other equipment. The electronic equipment 500 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 516 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 516 further includes a near-field communication (NFC) module to promote short-range communication. For example, the NFC module can be realized based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the electronic equipment 500 can be realized by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, for executing the above methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is additionally provided, for example a memory 504 including instructions, and the above instructions can be executed by the processor 520 of the electronic equipment 500 to complete the above methods. For example, the non-transitory computer-readable storage medium can be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
A non-transitory computer-readable storage medium: when the instructions in the storage medium are executed by the processor of an electronic equipment, the electronic equipment is enabled to carry out a data processing method, the method comprising: obtaining a current speech recognition text; splicing the current speech recognition text with the previous N output texts to obtain a spliced text, where N is a positive integer; adding punctuation to the spliced text, extracting the data other than the previous N output texts from the spliced text with added punctuation as the current output text, and outputting it.
Optionally, described to add punctuate in the splicing text, comprising: word segmentation processing is carried out to the splicing text,
Obtain corresponding multiple participle segments;According to Symbol matching model, the corresponding symbol logo of each participle segment is determined;If described point
The symbol logo of word segment is setting identification, then described in the participle segment described in the splicing text correspond to and is added after text
Symbol logo.
Optionally, the symbol matching model comprises a first symbol matching model and a second symbol matching model, and determining the symbol identifier corresponding to each word segment according to the symbol matching model comprises: inputting each word segment in turn into the first symbol matching model to obtain first probability information of each word segment corresponding to each symbol identifier; inputting each word segment in turn into the second symbol matching model to obtain second probability information of each word segment corresponding to each symbol identifier; and, for a given word segment, determining the symbol identifier corresponding to the word segment according to the first probability information and the second probability information of that word segment for each symbol identifier.
Optionally, determining the symbol identifier corresponding to the word segment according to the first probability information and the second probability information of each symbol identifier comprises: calculating first variance information from the first probability information of the word segment for each symbol identifier; calculating second variance information from the second probability information of the word segment for each symbol identifier; if the first variance information is greater than the second variance information, choosing the symbol identifier with the largest first probability as the symbol identifier corresponding to the word segment; and, if the second variance information is greater than the first variance information, choosing the symbol identifier with the largest second probability as the symbol identifier corresponding to the word segment.
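The variance-based selection rule can be sketched directly: compute the variance of each model's probability distribution over symbol identifiers and trust the model with the larger variance (intuitively, the more peaked and therefore more confident distribution). The probability values below are made up for illustration.

```python
from statistics import pvariance

def pick_symbol(probs1, probs2):
    # probs1 / probs2 map symbol identifiers to probabilities predicted by
    # the first / second symbol matching model for one word segment.
    v1 = pvariance(list(probs1.values()))  # first variance information
    v2 = pvariance(list(probs2.values()))  # second variance information
    source = probs1 if v1 > v2 else probs2
    # Choose the identifier with the largest probability from the chosen model.
    return max(source, key=source.get)

p1 = {"": 0.2, ",": 0.3, ".": 0.5}    # flatter distribution, smaller variance
p2 = {"": 0.05, ",": 0.05, ".": 0.9}  # peakier distribution, larger variance
print(pick_symbol(p1, p2))  # prints "."
```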
Optionally, the method further comprises: if the end of the punctuated spliced text has no punctuation and the current speech recognition text is the last speech recognition text of the voice data, adding a set punctuation mark at the end of the punctuated spliced text; and, if the end of the punctuated spliced text has punctuation and the current speech recognition text is not the last speech recognition text of the voice data, deleting the punctuation at the end of the punctuated spliced text.
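These two end-of-text rules can be sketched as follows. The punctuation set and the default closing mark are assumptions for illustration; the patent only specifies "a set punctuate".

```python
PUNCT = set(",.!?;")  # assumed punctuation set; a real system would include full-width marks

def fix_ending(text, is_last_segment, default="."):
    ends_with_punct = bool(text) and text[-1] in PUNCT
    if is_last_segment and not ends_with_punct:
        return text + default   # utterance is finished: close it with a set mark
    if not is_last_segment and ends_with_punct:
        return text[:-1]        # more speech is coming: drop the premature mark
    return text

print(fix_ending("see you tomorrow", True))    # prints "see you tomorrow."
print(fix_ending("and then we will,", False))  # prints "and then we will"
```

The second rule is what prevents a pause in mid-sentence from being frozen into a wrong sentence boundary: the trailing mark is removed and re-decided once the next recognition text arrives.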
Optionally, splicing the current speech recognition text with the previous N output texts to obtain the spliced text comprises: if N is 1, obtaining the last text segment of the previous output text, the last text segment being the text after the last punctuation mark in the previous output text; and splicing the current speech recognition text after the last text segment to obtain the spliced text.
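A sketch of this N = 1 splicing rule: keep only the tail of the previous output (the part after its last punctuation mark) and append the current recognition text to it. The punctuation set is an assumption for illustration.

```python
PUNCT = ",.!?;"  # assumed punctuation set

def splice(previous_output, current_text):
    # Find the position of the last punctuation mark in the previous output.
    cut = max((previous_output.rfind(p) for p in PUNCT), default=-1)
    tail = previous_output[cut + 1:]  # the last text segment, after that mark
    return tail + current_text

print(splice("Good morning, everyone. Today we", " will discuss"))
```

Only the unpunctuated tail is re-examined, so already-committed punctuation in earlier output is never revisited, while the boundary around the pause is decided with context from both sides.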
Optionally, the method is applied to the field of simultaneous interpretation.
Fig. 6 is a structural schematic diagram of an electronic equipment 600 for data processing according to another exemplary embodiment of the present invention. The electronic equipment 600 may be a server, which may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 622 (for example, one or more processors), a memory 632, and one or more storage media 630 (such as one or more mass storage devices) storing application programs 642 or data 644. The memory 632 and the storage medium 630 may provide transient or persistent storage. The programs stored in the storage medium 630 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 622 may be configured to communicate with the storage medium 630 and execute, on the server, the series of instruction operations in the storage medium 630.
The server may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, one or more keyboards 656, and/or one or more operating systems 641, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
An electronic equipment comprises a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations: obtaining a current speech recognition text; splicing the current speech recognition text with the previous N output texts to obtain a spliced text, wherein N is a positive integer; and adding punctuation to the spliced text, extracting the data other than the previous N output texts from the punctuated spliced text as the current output text, and outputting it.
Optionally, adding punctuation to the spliced text comprises: performing word segmentation on the spliced text to obtain a plurality of corresponding word segments; determining, according to a symbol matching model, the symbol identifier corresponding to each word segment; and, if the symbol identifier of a word segment is a set identifier, adding that symbol identifier after the text corresponding to the word segment in the spliced text.
Optionally, the symbol matching model comprises a first symbol matching model and a second symbol matching model, and determining the symbol identifier corresponding to each word segment according to the symbol matching model comprises: inputting each word segment in turn into the first symbol matching model to obtain first probability information of each word segment corresponding to each symbol identifier; inputting each word segment in turn into the second symbol matching model to obtain second probability information of each word segment corresponding to each symbol identifier; and, for a given word segment, determining the symbol identifier corresponding to the word segment according to the first probability information and the second probability information of that word segment for each symbol identifier.
Optionally, determining the symbol identifier corresponding to the word segment according to the first probability information and the second probability information of each symbol identifier comprises: calculating first variance information from the first probability information of the word segment for each symbol identifier; calculating second variance information from the second probability information of the word segment for each symbol identifier; if the first variance information is greater than the second variance information, choosing the symbol identifier with the largest first probability as the symbol identifier corresponding to the word segment; and, if the second variance information is greater than the first variance information, choosing the symbol identifier with the largest second probability as the symbol identifier corresponding to the word segment.
Optionally, the electronic equipment also includes instructions for performing the following operations: if the end of the punctuated spliced text has no punctuation and the current speech recognition text is the last speech recognition text of the voice data, adding a set punctuation mark at the end of the punctuated spliced text; and, if the end of the punctuated spliced text has punctuation and the current speech recognition text is not the last speech recognition text of the voice data, deleting the punctuation at the end of the punctuated spliced text.
Optionally, splicing the current speech recognition text with the previous N output texts to obtain the spliced text comprises: if N is 1, obtaining the last text segment of the previous output text, the last text segment being the text after the last punctuation mark in the previous output text; and splicing the current speech recognition text after the last text segment to obtain the spliced text.
Optionally, the method is applied to the field of simultaneous interpretation.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to each other.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of guiding a computer or another programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured article including a command device, the command device realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing terminal device, so that a series of operation steps are executed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the preferred embodiments of the present invention have been described, once persons skilled in the art learn of the basic creative concept, additional changes and modifications can be made to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the embodiments of the present invention.
Finally, it should be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another entity or operation, without necessarily requiring or implying any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or terminal device including a series of elements not only includes those elements but may also include other elements not explicitly listed, or further include elements intrinsic to the process, method, article, or terminal device. In the absence of further restrictions, an element limited by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or terminal device that includes the element.
The data processing method, data processing device, and electronic equipment provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the present invention, and the description of the above embodiments is merely intended to help understand the method of the present invention and its core idea. Meanwhile, for persons of ordinary skill in the art, there will be changes in the specific implementation and scope of application according to the idea of the present invention. In conclusion, the content of this specification should not be construed as limiting the present invention.
Claims (10)
1. A data processing method, characterized by comprising:
obtaining a current speech recognition text;
splicing the current speech recognition text with the previous N output texts to obtain a spliced text, wherein N is a positive integer;
adding punctuation to the spliced text, and extracting the data other than the previous N output texts from the punctuated spliced text as the current output text and outputting it.
2. The method according to claim 1, characterized in that adding punctuation to the spliced text comprises:
performing word segmentation on the spliced text to obtain a plurality of corresponding word segments;
determining, according to a symbol matching model, the symbol identifier corresponding to each word segment;
if the symbol identifier of a word segment is a set identifier, adding the symbol identifier after the text corresponding to the word segment in the spliced text.
3. The method according to claim 2, characterized in that the symbol matching model comprises a first symbol matching model and a second symbol matching model, and determining the symbol identifier corresponding to each word segment according to the symbol matching model comprises:
inputting each word segment in turn into the first symbol matching model to obtain first probability information of each word segment corresponding to each symbol identifier;
inputting each word segment in turn into the second symbol matching model to obtain second probability information of each word segment corresponding to each symbol identifier;
for a given word segment, determining the symbol identifier corresponding to the word segment according to the first probability information and the second probability information of that word segment for each symbol identifier.
4. The method according to claim 3, characterized in that determining the symbol identifier corresponding to the word segment according to the first probability information and the second probability information of each symbol identifier comprises:
calculating first variance information from the first probability information of the word segment for each symbol identifier;
calculating second variance information from the second probability information of the word segment for each symbol identifier;
if the first variance information is greater than the second variance information, choosing the symbol identifier with the largest first probability as the symbol identifier corresponding to the word segment;
if the second variance information is greater than the first variance information, choosing the symbol identifier with the largest second probability as the symbol identifier corresponding to the word segment.
5. The method according to claim 1, characterized in that the method further comprises:
if the end of the punctuated spliced text has no punctuation and the current speech recognition text is the last speech recognition text of the voice data, adding a set punctuation mark at the end of the punctuated spliced text;
if the end of the punctuated spliced text has punctuation and the current speech recognition text is not the last speech recognition text of the voice data, deleting the punctuation at the end of the punctuated spliced text.
6. The method according to claim 1, characterized in that splicing the current speech recognition text with the previous N output texts to obtain the spliced text comprises:
if N is 1, obtaining the last text segment of the previous output text, the last text segment being the text after the last punctuation mark in the previous output text;
splicing the current speech recognition text after the last text segment to obtain the spliced text.
7. The method according to any one of claims 1-6, characterized in that the method is applied to the field of simultaneous interpretation.
8. A data processing device, characterized by comprising:
a text obtaining module, for obtaining a current speech recognition text;
a text splicing module, for splicing the current speech recognition text with the previous N output texts to obtain a spliced text, wherein N is a positive integer;
a punctuation adding module, for adding punctuation to the spliced text, extracting the data other than the previous N output texts from the punctuated spliced text as the current output text, and outputting it.
9. A readable storage medium, characterized in that, when the instructions in the storage medium are executed by a processor of an electronic equipment, the electronic equipment is able to carry out the data processing method according to any one of claims 1-7.
10. An electronic equipment, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
obtaining a current speech recognition text;
splicing the current speech recognition text with the previous N output texts to obtain a spliced text, wherein N is a positive integer;
adding punctuation to the spliced text, and extracting the data other than the previous N output texts from the punctuated spliced text as the current output text and outputting it.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811497640.2A CN109887492B (en) | 2018-12-07 | 2018-12-07 | Data processing method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811497640.2A CN109887492B (en) | 2018-12-07 | 2018-12-07 | Data processing method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109887492A true CN109887492A (en) | 2019-06-14 |
CN109887492B CN109887492B (en) | 2021-02-12 |
Family
ID=66925008
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811497640.2A Active CN109887492B (en) | 2018-12-07 | 2018-12-07 | Data processing method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109887492B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111261162A (en) * | 2020-03-09 | 2020-06-09 | 北京达佳互联信息技术有限公司 | Speech recognition method, speech recognition apparatus, and storage medium |
CN112466286A (en) * | 2019-08-19 | 2021-03-09 | 阿里巴巴集团控股有限公司 | Data processing method and device and terminal equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0259855A (en) * | 1988-08-24 | 1990-02-28 | Ricoh Co Ltd | Japanese sentence preparing device |
CN102231278A (en) * | 2011-06-10 | 2011-11-02 | 安徽科大讯飞信息科技股份有限公司 | Method and system for realizing automatic addition of punctuation marks in speech recognition |
CN107291704A (en) * | 2017-05-26 | 2017-10-24 | 北京搜狗科技发展有限公司 | Treating method and apparatus, the device for processing |
CN108564953A (en) * | 2018-04-20 | 2018-09-21 | 科大讯飞股份有限公司 | A kind of punctuate processing method and processing device of speech recognition text |
CN108597517A (en) * | 2018-03-08 | 2018-09-28 | 深圳市声扬科技有限公司 | Punctuation mark adding method, device, computer equipment and storage medium |
- 2018-12-07 CN CN201811497640.2A patent/CN109887492B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0259855A (en) * | 1988-08-24 | 1990-02-28 | Ricoh Co Ltd | Japanese sentence preparing device |
CN102231278A (en) * | 2011-06-10 | 2011-11-02 | 安徽科大讯飞信息科技股份有限公司 | Method and system for realizing automatic addition of punctuation marks in speech recognition |
CN107291704A (en) * | 2017-05-26 | 2017-10-24 | 北京搜狗科技发展有限公司 | Treating method and apparatus, the device for processing |
CN108597517A (en) * | 2018-03-08 | 2018-09-28 | 深圳市声扬科技有限公司 | Punctuation mark adding method, device, computer equipment and storage medium |
CN108564953A (en) * | 2018-04-20 | 2018-09-21 | 科大讯飞股份有限公司 | A kind of punctuate processing method and processing device of speech recognition text |
Non-Patent Citations (1)
Title |
---|
LI Yakun et al.: "Chinese Word Segmentation and Punctuation Prediction Based on Improved Multi-layer BLSTM", Journal of Computer Applications (《计算机应用》) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112466286A (en) * | 2019-08-19 | 2021-03-09 | 阿里巴巴集团控股有限公司 | Data processing method and device and terminal equipment |
CN111261162A (en) * | 2020-03-09 | 2020-06-09 | 北京达佳互联信息技术有限公司 | Speech recognition method, speech recognition apparatus, and storage medium |
CN111261162B (en) * | 2020-03-09 | 2023-04-18 | 北京达佳互联信息技术有限公司 | Speech recognition method, speech recognition apparatus, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109887492B (en) | 2021-02-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110210310B (en) | Video processing method and device for video processing | |
CN107992812A (en) | A kind of lip reading recognition methods and device | |
CN108345581A (en) | A kind of information identifying method, device and terminal device | |
CN111583919B (en) | Information processing method, device and storage medium | |
CN107564526B (en) | Processing method, apparatus and machine-readable medium | |
CN109471919B (en) | Zero pronoun resolution method and device | |
JP2022522551A (en) | Image processing methods and devices, electronic devices and storage media | |
CN108509412A (en) | A kind of data processing method, device, electronic equipment and storage medium | |
EP3734472A1 (en) | Method and device for text processing | |
CN105355195A (en) | Audio frequency recognition method and audio frequency recognition device | |
CN109002184A (en) | A kind of association method and device of input method candidate word | |
WO2021208531A1 (en) | Speech processing method and apparatus, and electronic device | |
CN109977426A (en) | A kind of training method of translation model, device and machine readable media | |
CN111831806A (en) | Semantic integrity determination method and device, electronic equipment and storage medium | |
CN110930984A (en) | Voice processing method and device and electronic equipment | |
CN111160047A (en) | Data processing method and device and data processing device | |
CN108628819A (en) | Treating method and apparatus, the device for processing | |
CN107424612B (en) | Processing method, apparatus and machine-readable medium | |
JP2022537865A (en) | Object counting method, device, electronic device, storage medium and program | |
CN112735396A (en) | Speech recognition error correction method, device and storage medium | |
CN116166843A (en) | Text video cross-modal retrieval method and device based on fine granularity perception | |
CN110069143A (en) | A kind of information is anti-error to entangle method, apparatus and electronic equipment | |
CN109887492A (en) | A kind of data processing method, device and electronic equipment | |
CN113936697B (en) | Voice processing method and device for voice processing | |
CN111739535A (en) | Voice recognition method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||