CN109558600A - Translation processing method and device

Info

Publication number
CN109558600A
Authority
CN
China
Prior art keywords
translated
content
result
translation
processing method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811350498.9A
Other languages
Chinese (zh)
Other versions
CN109558600B (en
Inventor
陈小帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN201811350498.9A
Publication of CN109558600A
Application granted
Publication of CN109558600B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present disclosure provides a translation processing method and device. The method includes: obtaining content to be translated and extracting language features of the content; inputting the language features into a pre-trained error recognition model for processing, to obtain a recognition result for the content; determining, according to the recognition result, whether the content contains an error and, if so, correcting the content to generate a correction result; and translating the correction result, and obtaining and displaying a corresponding first translation result. Errors in the content to be translated are thus identified by the error recognition model and corrected, and the correction result is then translated, so that the user can obtain the desired translation result without manually correcting the content. This improves the efficiency with which the user uses the translation tool and improves the user experience.

Description

Translation processing method and device
Technical field
The present disclosure relates to the field of translation technology, and in particular to a translation processing method and device.
Background
When using a translation function, a user inputs the content to be translated, and the translation tool translates the input into a target language for the user. However, input errors may occur while the user is typing. For example, the user intends to input "this year's college entrance examination chemistry is very difficult" for Chinese-to-English translation but mistakenly inputs "this year's college entrance examination skiing is very difficult", so that the translation result deviates from what the user expects.
In the related art, when the content to be translated that the user inputs contains an error, the user has to find the error and correct it manually, which is cumbersome and reduces the efficiency with which the user uses the translation tool.
Summary of the invention
The present disclosure provides a translation processing method and device to solve the technical problem that, when the content to be translated that a user inputs contains an error, the user has to find and manually correct the error, which is cumbersome.
To this end, one aspect of the present disclosure provides a translation processing method that identifies errors in the content to be translated by means of an error recognition model, corrects them, and then translates the correction result, so that the user can obtain the desired translation result without manually correcting the content to be translated. This improves the efficiency with which the user uses the translation tool and improves the user experience.
Another aspect of the present disclosure provides a translation processing device.
Another aspect of the present disclosure provides an electronic device.
Yet another aspect of the present disclosure provides a computer-readable storage medium.
An embodiment of the first aspect of the present disclosure provides a translation processing method, including:
obtaining content to be translated, and extracting language features of the content to be translated;
inputting the language features into a pre-trained error recognition model for processing, to obtain a recognition result for the content to be translated;
determining, according to the recognition result, whether the content to be translated contains an error, and if so, correcting the content to be translated to generate a correction result; and
translating the correction result, and obtaining and displaying a corresponding first translation result.
The translation processing method of the embodiments of the present disclosure obtains content to be translated and extracts its language features, inputs the language features into a pre-trained error recognition model for processing, and obtains a recognition result for the content. It then determines, according to the recognition result, whether the content contains an error; if so, it corrects the content to generate a correction result, translates the correction result, and obtains and displays a corresponding first translation result. In this way, when the content to be translated that the user inputs contains an error, the error is intelligently analyzed and corrected, and the correction result is then translated, so that the user can obtain the desired translation result without manually correcting the content. This improves the efficiency with which the user uses the translation tool and improves the user experience.
In addition, the translation processing method according to the above embodiment of the present disclosure may have the following additional technical features:
Optionally, extracting the language features of the content to be translated includes: performing word segmentation on the content to be translated to obtain a word segmentation result; and extracting, according to the word segmentation result, a language model value and preceding-context features and/or following-context features for each segment position. Obtaining the recognition result for the content to be translated includes: obtaining an error probability for each segment position.
Optionally, correcting the content to be translated to generate the correction result includes: obtaining the word at each segment position whose error probability is greater than a preset threshold, and constructing a candidate list for the word; and scoring the candidate list and replacing the word according to the scoring result, to generate the correction result.
Optionally, correcting the content to be translated to generate the correction result includes: inputting the content to be translated into a pre-trained end-to-end generation model that maps erroneous content to a correction result, and obtaining the correction result corresponding to the content to be translated.
Optionally, before the language features are input into the pre-trained error recognition model for processing, the method further includes: obtaining a training set of erroneous corpora and corresponding correct corpora; and training the parameters of a preset model on the training set to generate the error recognition model.
Optionally, after determining, according to the recognition result, whether the content to be translated contains an error, the method further includes: if not, translating the content to be translated, and obtaining and displaying a corresponding second translation result.
Optionally, after translating the correction result and obtaining and displaying the corresponding first translation result, the method further includes: translating the content to be translated, and obtaining and displaying a corresponding second translation result.
An embodiment of the second aspect of the present disclosure provides a translation processing device, including:
an obtaining module, configured to obtain content to be translated and extract language features of the content to be translated;
a processing module, configured to input the language features into a pre-trained error recognition model for processing, to obtain a recognition result for the content to be translated;
a correction module, configured to determine, according to the recognition result, whether the content to be translated contains an error, and if so, correct the content to be translated to generate a correction result; and
a translation module, configured to translate the correction result, and obtain and display a corresponding first translation result.
The translation processing device of the embodiments of the present disclosure obtains content to be translated and extracts its language features, inputs the language features into a pre-trained error recognition model for processing, and obtains a recognition result for the content. It then determines, according to the recognition result, whether the content contains an error; if so, it corrects the content to generate a correction result, translates the correction result, and obtains and displays a corresponding first translation result. In this way, when the content to be translated that the user inputs contains an error, the error is intelligently analyzed and corrected, and the corrected content is then translated, so that the user can obtain the desired translation result without manually correcting the content, which improves the user experience.
In addition, the translation processing device according to the above embodiment of the present disclosure may have the following additional technical features:
Optionally, the obtaining module is specifically configured to: perform word segmentation on the content to be translated to obtain a word segmentation result; and extract, according to the word segmentation result, a language model value and preceding-context features and/or following-context features for each segment position. The processing module is specifically configured to: obtain an error probability for each segment position.
Optionally, the correction module is specifically configured to: obtain the word at each segment position whose error probability is greater than a preset threshold, and construct a candidate list for the word; and score the candidate list and replace the word according to the scoring result, to generate the correction result.
Optionally, the correction module is specifically configured to: input the content to be translated into a pre-trained end-to-end generation model that maps erroneous content to a correction result, and obtain the correction result corresponding to the content to be translated.
Optionally, the device further includes a training module, configured to: obtain a training set of erroneous corpora and corresponding correct corpora; and train the parameters of a preset model on the training set to generate the error recognition model.
Optionally, the translation module is further configured to: translate the content to be translated, and obtain and display a corresponding second translation result.
An embodiment of the third aspect of the present disclosure provides an electronic device including a processor and a memory, wherein the processor reads executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the translation processing method according to the embodiment of the first aspect.
An embodiment of the fourth aspect of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the translation processing method according to the embodiment of the first aspect.
Additional aspects and advantages of the present disclosure will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the present disclosure.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a translation processing method provided by an embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of another translation processing method provided by an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of a translation processing device provided by an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of another translation processing device provided by an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of an electronic device suitable for implementing embodiments of the present disclosure;
Fig. 6 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present disclosure.
Detailed description
Embodiments of the present disclosure are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present disclosure; they should not be construed as limiting it.
The translation processing method and device of the embodiments of the present disclosure are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a translation processing method provided by an embodiment of the present disclosure. As shown in Fig. 1, the method includes:
Step 101: obtain content to be translated, and extract language features of the content to be translated.
In this embodiment, the content to be translated is obtained first when translation processing is performed. For example, the content that a user inputs when using a translation function may be obtained as the content to be translated.
As an example, after the content to be translated is obtained, its language features may be extracted so that whether the content contains an error can be identified from the language features. The language features include, but are not limited to, language model values, context features, and word frequency.
Step 102: input the language features into a pre-trained error recognition model for processing, to obtain a recognition result for the content to be translated.
In one embodiment of the present disclosure, a training set of erroneous corpora and corresponding correct corpora may be obtained, and the parameters of a preset model may be trained on the training set to generate the error recognition model. The language features are then input into the pre-trained error recognition model for processing, to obtain the recognition result for the content to be translated.
As an example, a training set of erroneous corpora and corresponding correct corpora may be obtained, for example "xenogenesis translation processing method" and "a kind of translation processing method" as training data, and the parameters of the preset model are trained to generate the error recognition model, such that the model takes language features as input and outputs a score. The language features of the content to be translated "xenogenesis translation processing method" are then extracted and input into the pre-trained error recognition model for processing, to obtain the score of "xenogenesis translation processing method".
As another example, a training set of erroneous corpora and corresponding correct corpora may be obtained, and the parameters of the preset model are trained to generate the error recognition model, such that the model takes as input the language features of each word in the content to be translated and outputs the error probability of that word. The language features of each word in the content to be translated are then extracted and input into the pre-trained error recognition model for processing, to obtain the error probability of each word.
The error recognition model includes, but is not limited to, a logistic regression model, a random forest model, a recurrent neural network model, a convolutional neural network model, a deep neural network model, and the like.
Step 103: determine, according to the recognition result, whether the content to be translated contains an error, and if so, correct the content to be translated to generate a correction result.
As an example, the recognition result is a score A, which is compared with a preset threshold B. When A is greater than B, it is determined that the content to be translated contains no error; when A is less than or equal to B, it is determined that the content to be translated contains an error.
The preset threshold may be determined from a large amount of experimental data or may be set as needed, which is not limited here.
In one embodiment of the present disclosure, when the content to be translated contains an error, the content is corrected to generate a correction result.
The correction result can be generated in several ways. In one possible implementation, erroneous sentences and correct sentences are obtained as training samples to train an end-to-end generation model from erroneous sentences to correct sentences, and the content to be translated is then input into the end-to-end generation model to generate the correction result. For example, "xenogenesis translation processing method" is input into the pre-trained end-to-end generation model for processing, and the correction result "a kind of translation processing method" is generated.
In one embodiment of the present disclosure, if it is determined that the content to be translated contains no error, the content to be translated is translated, and the corresponding translation result is obtained and displayed.
Step 104: translate the correction result, and obtain and display a corresponding first translation result.
In this embodiment, the correction result for the content to be translated may be obtained and then translated, so as to obtain and display the first translation result corresponding to the correction result. For example, when translating from Chinese to English, the correction result in Chinese may be translated into a first translation result in English and displayed.
In one embodiment of the present disclosure, the content to be translated may also be translated, and a second translation result corresponding to the content to be translated may be obtained and displayed.
It should be noted that the above ways of obtaining and displaying translation results are only exemplary and may be configured as needed. For example, taking the case where the error recognition model outputs a score, a preset strategy may be set based on the score: when the score is small, only the correction result is translated and displayed; when the score is large, both the correction result and the content to be translated are translated and displayed.
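The score-based strategy can be sketched as follows. This is only an illustration under stated assumptions: the function name `plan_display` and the second cut-off `low_score` are hypothetical and do not appear in the disclosure, and the sketch combines the threshold comparison described for step 103 (a score at or below the preset threshold indicates an error) with the display strategy described above.

```python
from typing import List


def plan_display(score: float, error_threshold: float, low_score: float) -> List[str]:
    """Decide which texts to translate and display for a given model score.

    Assumptions (hypothetical, for illustration only):
    - a score greater than error_threshold means no error was found;
    - low_score is a second cut-off below error_threshold; at or below it,
      only the correction result is shown.
    """
    if score > error_threshold:
        # No error: translate the content to be translated as entered.
        return ["original"]
    if score <= low_score:
        # Small score: translate and display only the correction result.
        return ["correction"]
    # Larger score that still indicates an error: show both translations.
    return ["correction", "original"]
```

With `error_threshold=0.5` and `low_score=0.2`, a score of 0.1 would show only the translated correction, while a score of 0.4 would show both translations side by side.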
The translation processing method of the embodiments of the present disclosure obtains content to be translated and extracts its language features, inputs the language features into a pre-trained error recognition model for processing, and obtains a recognition result for the content. It then determines, according to the recognition result, whether the content contains an error; if so, it corrects the content to generate a correction result, translates the correction result, and obtains and displays a corresponding first translation result. In this way, when the content to be translated that the user inputs contains an error, the error is intelligently analyzed and corrected, and the corrected content is then translated, so that the user can obtain the desired translation result without manually correcting the content, which improves the user experience.
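Steps 101 to 104 can be pictured as a small pipeline. The following is a minimal sketch, not the disclosed implementation: every function name is hypothetical, and the four stages are passed in as callables so that any feature extractor, error recognition model, corrector, and translator could be plugged in. The toy stand-ins exist only so that the sketch runs end to end.

```python
from typing import Callable, List


def translate_with_correction(
    text: str,
    extract_features: Callable[[str], List[float]],
    has_error: Callable[[List[float]], bool],
    correct: Callable[[str], str],
    translate: Callable[[str], str],
) -> str:
    """Extract features, recognize errors, correct if needed, then translate."""
    features = extract_features(text)   # step 101
    if has_error(features):             # steps 102 and 103
        text = correct(text)            # correction result
    return translate(text)              # step 104 (or plain translation)


# Toy stand-ins (hypothetical); a real system would use trained models.
def toy_features(text: str) -> List[float]:
    return [1.0 if "skiing" in text else 0.0]


def toy_has_error(features: List[float]) -> bool:
    return features[0] > 0.5


def toy_correct(text: str) -> str:
    return text.replace("skiing", "chemistry")


def toy_translate(text: str) -> str:
    # Placeholder for a translation engine: it only tags the text.
    return "<translated>" + text + "</translated>"
```

Calling `translate_with_correction("exam skiing is hard", toy_features, toy_has_error, toy_correct, toy_translate)` would correct the input before handing it to the translator, while an input without the flagged word passes through unchanged.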
Based on the above embodiment, the case where error recognition is implemented on the basis of word segmentation is further described below.
Fig. 2 is a schematic flowchart of another translation processing method provided by an embodiment of the present disclosure. As shown in Fig. 2, after the content to be translated is obtained, the method includes:
Step 201: perform word segmentation on the content to be translated to obtain a word segmentation result.
In this embodiment, the content to be translated may be segmented into words so that the error recognition model can perform error recognition at each segment position.
As an example, the content to be translated is "this year's college entrance examination skiing is very difficult", and it is segmented by a relevant word segmentation method to obtain the word segmentation result "this year", "college entrance examination", "skiing", "very difficult".
Word segmentation methods include, but are not limited to, methods based on string matching, methods based on understanding, and methods based on statistics.
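As an illustration of the string-matching family of segmentation methods, the sketch below implements forward maximum matching against a dictionary. The disclosure does not say which matching algorithm is used, so this choice is an assumption; the function name, the toy vocabulary, and the unspaced English input are likewise hypothetical stand-ins for unsegmented text.

```python
from typing import List, Optional, Set


def forward_max_match(text: str, vocab: Set[str], max_len: Optional[int] = None) -> List[str]:
    """Segment text greedily, always taking the longest dictionary word first."""
    if max_len is None:
        max_len = max((len(word) for word in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink until a dictionary word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so that the loop advances.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy dictionary and input.
TOY_VOCAB = {"this", "thisyear", "exam", "ski", "skiing", "hard"}
segments = forward_max_match("thisyearexamskiinghard", TOY_VOCAB)
```

Greedy longest-first matching is simple and fast, but it can split ambiguous strings incorrectly, which is one reason statistical segmenters are also listed as an option.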
Step 202: extract, according to the word segmentation result, a language model value and preceding-context features and/or following-context features for each segment position.
In one embodiment of the present disclosure, the language model value may be an N-gram language model feature value. An N-gram language model can assess whether a sentence is plausible. For example, according to the word segmentation result "this year", "college entrance examination", "skiing", "very difficult", the N-gram language model feature values of the four segment positions "this year", "college entrance examination", "skiing", and "very difficult" are extracted respectively.
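One way to obtain a per-position language model value is a bigram model with add-one smoothing, sketched below. The choice of N = 2, the smoothing scheme, the tiny corpus, and all names are assumptions made for illustration; the disclosure only states that an N-gram feature value may be used.

```python
import math
from collections import Counter
from typing import List, Tuple


def train_bigram(corpus: List[List[str]]) -> Tuple[Counter, Counter, int]:
    """Count bigrams and their left contexts over segmented sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sentence in corpus:
        padded = ["<s>"] + sentence
        vocab.update(sentence)
        context_counts.update(padded[:-1])
        bigram_counts.update(zip(padded, padded[1:]))
    return context_counts, bigram_counts, len(vocab)


def language_model_values(tokens: List[str], model: Tuple[Counter, Counter, int]) -> List[float]:
    """Return one smoothed log-probability per segment position."""
    context_counts, bigram_counts, vocab_size = model
    padded = ["<s>"] + tokens
    return [
        math.log((bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size))
        for prev, word in zip(padded, padded[1:])
    ]


# Hypothetical toy corpus of correctly written, already segmented sentences.
TOY_CORPUS = [
    ["this year", "exam", "chemistry", "very difficult"],
    ["this year", "exam", "physics", "very difficult"],
]
model = train_bigram(TOY_CORPUS)
values = language_model_values(["this year", "exam", "skiing", "very difficult"], model)
```

In this toy setting the value at the "skiing" position comes out lower than the value at the "exam" position, because "skiing" never follows "exam" in the corpus; that drop is the signal a downstream error recognition model could pick up.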
In one embodiment of the present disclosure, the preceding-context features and/or following-context features of each segment position may also be extracted. For example, according to the word segmentation result "this year", "college entrance examination", "skiing", "very difficult", the following-context feature extracted for "this year" is "college entrance examination", and for "skiing" the preceding-context feature is "college entrance examination" and the following-context feature is "very difficult".
It should be noted that each segment position may have one group of context features or several groups; for example, "college entrance examination" and "skiing" may both be extracted as following-context features of "this year". This is not limited here.
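Context extraction of this kind amounts to slicing a window around each segment position. The sketch below is illustrative only; the function name, the dictionary keys, and the `window` parameter are hypothetical.

```python
from typing import Dict, List


def context_features(tokens: List[str], index: int, window: int = 1) -> Dict[str, List[str]]:
    """Collect up to `window` words before and after the given segment position."""
    return {
        "preceding": tokens[max(0, index - window):index],
        "following": tokens[index + 1:index + 1 + window],
    }


tokens = ["this year", "college entrance examination", "skiing", "very difficult"]
skiing_context = context_features(tokens, 2)
first_context = context_features(tokens, 0, window=2)
```

With `window=1`, "skiing" gets "college entrance examination" before it and "very difficult" after it; with `window=2`, "this year" has no preceding context and gets both "college entrance examination" and "skiing" as following context, matching the multi-group case described above.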
Step 203: input the language features of each segment position into the pre-trained error recognition model for processing, to obtain the error probability of each segment position.
In one embodiment of the present disclosure, a training set of erroneous corpora and corresponding correct corpora may be obtained, and the parameters of a preset model may be trained on the training set to generate the error recognition model. The language features of each segment position are then input into the error recognition model for processing, to obtain the error probability of each segment position.
As an example, a training set of erroneous corpora and corresponding correct corpora may be obtained, for example "this year's college entrance examination skiing is very difficult" and "this year's college entrance examination chemistry is very difficult" as training data. The erroneous corpus is segmented, a correct/error label is obtained for the word at each segment position, and a binary (correct/error) classification model is trained on each segment position of each piece of training data, such that the model takes as input the language model value and the preceding-context features and/or following-context features of each segment position and outputs the error probability of that segment position.
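A binary correct/error classifier of this kind can be sketched with logistic regression, one of the model types listed for the error recognition model. The implementation below is a plain stochastic-gradient version written for illustration; the two numeric features (a language model value and a context-match indicator), the training data, and the hyperparameters are all hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(
    samples: List[List[float]], labels: List[int], epochs: int = 500, lr: float = 0.1
) -> Tuple[List[float], float]:
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
            grad = p - y  # derivative of the log loss with respect to the logit
            weights = [w - lr * grad * v for w, v in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


def error_probability(x: List[float], weights: List[float], bias: float) -> float:
    """Probability that the word at a segment position is erroneous."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


# Hypothetical per-position training data:
# features = [language model value, context-match indicator]; label 1 = error.
SAMPLES = [[-0.5, 1.0], [-0.8, 1.0], [-3.0, 0.0], [-2.5, 0.0]]
LABELS = [0, 0, 1, 1]
weights, bias = train_logistic(SAMPLES, LABELS)
```

After training on this toy data, a position with a low language model value and no context match should receive an error probability above 0.5, and a position with a high value and a context match should fall below it. A production system would use far more data and could equally use any of the other listed model types.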
Then, the language model value and following-context features of "this year" are input, and the error probability A1 of "this year" is output; the language model value, preceding-context features, and following-context features of "skiing" are input, and the error probability A2 of "skiing" is output.
Step 204: determine whether the content to be translated contains an error, and if so, obtain the word at each segment position whose error probability is greater than a preset threshold, and construct a candidate list for the word.
As an example, the error probability of each segment position may be compared with the preset threshold. When the error probability is greater than the preset threshold, it is determined that the word at that segment position is erroneous; when the error probability is less than or equal to the preset threshold, it is determined that the word at that segment position is correct. When the content to be translated has at least one erroneous segment position, it is determined that the content to be translated contains an error.
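The per-position comparison can be written in a few lines. The function names and the example probabilities below are hypothetical; the comparison rule (strictly greater than the threshold means erroneous) follows the description above.

```python
from typing import List


def find_error_positions(error_probs: List[float], threshold: float) -> List[int]:
    """Return the indices of segment positions whose error probability exceeds the threshold."""
    return [i for i, p in enumerate(error_probs) if p > threshold]


def content_has_error(error_probs: List[float], threshold: float) -> bool:
    """The content contains an error if at least one segment position is flagged."""
    return len(find_error_positions(error_probs, threshold)) > 0


# Hypothetical error probabilities for four segment positions.
flagged = find_error_positions([0.05, 0.10, 0.92, 0.20], threshold=0.5)
```

Here only the third position is flagged, so only the word at that position would go on to candidate construction.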
In this embodiment, a candidate list may be constructed for each word that is determined to be erroneous.
As an example, the candidate list may be constructed based on pronunciation. For example, if "skiing" is determined to be erroneous, a candidate list including words such as "chemistry", "snow melting", and "Hua Xue" may be constructed.
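A pronunciation-based candidate list can be built by indexing a lexicon by pronunciation key and looking up the other words that share the key of the erroneous word. The sketch below assumes a toy lexicon that maps English glosses to toneless romanized keys; the lexicon, the keys, and the function names are hypothetical, and a real system would index the source-language words themselves.

```python
from collections import defaultdict
from typing import Dict, List


def build_pronunciation_index(lexicon: Dict[str, str]) -> Dict[str, List[str]]:
    """Group words by their pronunciation key."""
    index = defaultdict(list)
    for word, key in lexicon.items():
        index[key].append(word)
    return index


def candidates_for(word: str, lexicon: Dict[str, str], index: Dict[str, List[str]]) -> List[str]:
    """Return the other words that share the erroneous word's pronunciation key."""
    key = lexicon.get(word)
    if key is None:
        return []
    return [other for other in index[key] if other != word]


# Hypothetical toy lexicon: word -> pronunciation key.
TOY_LEXICON = {
    "skiing": "huaxue",
    "chemistry": "huaxue",
    "snow melting": "huaxue",
    "physics": "wuli",
}
pron_index = build_pronunciation_index(TOY_LEXICON)
skiing_candidates = candidates_for("skiing", TOY_LEXICON, pron_index)
```

Matching on an exact key is the simplest option; a looser match, for example one that tolerates similar-sounding keys, would produce a longer candidate list at the cost of more scoring work.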
Step 205: score the candidate list, and replace the word according to the scoring result, to generate the correction result.
As an example, the candidate list may be scored based on a language model. For example, "chemistry", "snow melting", and "Hua Xue" are each scored based on the language model, and the word with the highest score is chosen to generate the correction result.
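Language-model scoring of candidates can be sketched as substituting each candidate into the sentence and keeping the highest-scoring result. The scorer below is a deliberately crude stand-in that counts adjacent word pairs found in a hypothetical set of known pairs; in practice the score would come from a trained language model. All names are illustrative.

```python
from typing import Callable, List, Tuple


def best_candidate(
    tokens: List[str],
    position: int,
    candidates: List[str],
    score: Callable[[List[str]], float],
) -> Tuple[str, List[str]]:
    """Substitute each candidate at the erroneous position and keep the best-scoring one."""
    def substituted(candidate: str) -> List[str]:
        return tokens[:position] + [candidate] + tokens[position + 1:]

    best = max(candidates, key=lambda c: score(substituted(c)))
    return best, substituted(best)


# Hypothetical stand-in for a language model: counts known adjacent pairs.
KNOWN_PAIRS = {
    ("this year", "exam"),
    ("exam", "chemistry"),
    ("chemistry", "very difficult"),
}


def toy_lm_score(tokens: List[str]) -> float:
    return float(sum(1 for pair in zip(tokens, tokens[1:]) if pair in KNOWN_PAIRS))


word, corrected = best_candidate(
    ["this year", "exam", "skiing", "very difficult"],
    2,
    ["chemistry", "snow melting", "Hua Xue"],
    toy_lm_score,
)
```

Scoring the whole substituted sentence, rather than the candidate word in isolation, lets the surrounding context decide between candidates that sound alike.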
As another example, the erroneous word in the content to be translated may be replaced with each word in the candidate list to generate candidate sentences, the error probability of each candidate sentence is obtained from the error recognition model, and the candidates are sorted by error probability. For example, "skiing" in the content to be translated is replaced to generate the candidate sentences "this year's college entrance examination chemistry is very difficult", "this year's college entrance examination snow melting is very difficult", and "this year's college entrance examination Hua Xue is very difficult", which are each input into the error recognition model for processing to obtain the error probabilities of the corresponding segment positions. The candidates are then sorted by error probability in ascending order, and the candidate sentence with the smallest error probability is chosen as the correction result, or the N candidate sentences with the smallest error probabilities are chosen as multiple correction results.
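The ranking variant can be sketched as follows. The error recognition model is represented by a callable that returns one error probability per candidate sentence; the lookup-table stand-in, its probabilities, and all names are hypothetical.

```python
from typing import Callable, List


def rank_corrections(
    tokens: List[str],
    position: int,
    candidates: List[str],
    error_prob: Callable[[List[str]], float],
    top_n: int = 1,
) -> List[List[str]]:
    """Build one candidate sentence per candidate word and keep the top_n least likely to be wrong."""
    sentences = [tokens[:position] + [c] + tokens[position + 1:] for c in candidates]
    # Ascending order: the smallest error probability comes first.
    return sorted(sentences, key=error_prob)[:top_n]


# Hypothetical stand-in for the error recognition model.
TOY_PROBS = {"chemistry": 0.05, "snow melting": 0.60, "Hua Xue": 0.85}


def toy_error_prob(sentence: List[str]) -> float:
    # Use the highest per-word probability as the sentence-level value.
    return max(TOY_PROBS.get(word, 0.0) for word in sentence)


ranked = rank_corrections(
    ["this year", "exam", "skiing", "very difficult"],
    2,
    ["chemistry", "snow melting", "Hua Xue"],
    toy_error_prob,
    top_n=2,
)
```

Returning the top N sentences rather than a single one corresponds to the option of offering several correction results.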
Optionally, it can also be shown to result is corrected.
The translation processing method of the embodiment of the present disclosure obtains word segmentation result, and root by segmenting to content to be translated The language model value of each participle position, and feature and/or following traits above are extracted according to word segmentation result.It in turn, will be each The language feature of participle position, which is input in wrong identification model trained in advance, to be handled, and the mistake of each participle position is obtained Accidentally probability further judges content to be translated with the presence or absence of mistake, and if it exists, obtains the participle that error probability is greater than preset threshold The word of position, and construct the candidate list of word, further gives a mark to candidate list, according to marking result to word into Row replacement, to generate correction result.Hereby it is achieved that carrying out wrong identification to content to be translated based on participle, and generate correction As a result, realize the mistake when the content to be translated error of user's input, in intellectual analysis content to be translated and corrected, To further be translated to correction content, so that user does not have to manual correction content to be translated and can obtain desired translation As a result, promoting the usage experience of user.
In order to realize above-described embodiment, the disclosure also proposes a kind of translation processing unit.
Fig. 3 is a kind of structural schematic diagram for translating processing unit provided by the embodiment of the present disclosure, as shown in figure 3, the dress Setting includes: to obtain module 10, and processing module 20 corrects module 30, translation module 40.
Wherein, module 10 is obtained, for obtaining content to be translated, and extracts the language feature of content to be translated.
Processing module 20 is handled for language feature to be input in wrong identification model trained in advance, is obtained The recognition result of content to be translated.
Module 30 is corrected, for judging content to be translated with the presence or absence of mistake according to recognition result, and if it exists, then treat and turn over Content is translated to carry out correcting generation correction result.
Translation module 40 obtains corresponding first translation result and simultaneously shows for translating to correction result.
On the basis of Fig. 3, translation processing unit shown in Fig. 4 further include: training module 50.
Wherein, training module 50, for obtaining the training set of wrong corpus with corresponding correct corpus;It is assembled for training according to training Practice the parameter of preset model, generation error identification model.
Further, the acquisition module 10 is specifically configured to: segment the content to be translated to obtain a word segmentation result; and extract, according to the word segmentation result, a language model score and preceding-context features and/or following-context features for each segmented-word position. The processing module 20 is specifically configured to obtain the error probability of each segmented-word position.
Further, the correction module 30 is specifically configured to: obtain the words at segmented-word positions whose error probability exceeds a preset threshold, and construct a candidate list for each such word; and score the candidate list and replace the word according to the scoring result to generate the correction result.
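A minimal sketch of candidate construction, scoring, and replacement is given below. The vocabulary, the bigram counts, the use of string similarity from Python's standard `difflib` module to build candidates, and the scoring formula are all assumptions chosen for illustration. The disclosure does not prescribe how candidates are generated or scored.

```python
import difflib
from collections import Counter

# Hypothetical vocabulary and bigram counts used only for this illustration.
VOCABULARY = ["apple", "ample", "maple", "orange", "eat", "an", "want", "to", "i"]
BIGRAM_COUNTS = Counter({("an", "apple"): 5, ("an", "ample"): 1, ("a", "maple"): 2})

def build_candidates(word, limit=3):
    """Candidate list: vocabulary entries that are close to the suspect word."""
    return difflib.get_close_matches(word, VOCABULARY, n=limit, cutoff=0.6)

def score(candidate, original, prev_word):
    """Score = string similarity plus a small bonus for fitting the left context."""
    similarity = difflib.SequenceMatcher(None, original, candidate).ratio()
    context = BIGRAM_COUNTS[(prev_word, candidate)]
    return similarity + 0.1 * context

def correct(tokens, error_probs, threshold=0.5):
    """Replace each token whose error probability exceeds the threshold
    with its highest-scoring candidate, if any candidate exists."""
    corrected = list(tokens)
    for i, (word, prob) in enumerate(zip(tokens, error_probs)):
        if prob <= threshold:
            continue
        candidates = build_candidates(word)
        if not candidates:
            continue
        prev_word = corrected[i - 1] if i > 0 else "<s>"
        corrected[i] = max(candidates, key=lambda c: score(c, word, prev_word))
    return corrected

tokens = ["i", "want", "to", "eat", "an", "aple"]
error_probs = [0.03, 0.03, 0.03, 0.03, 0.29, 0.82]  # hypothetical model output
result = correct(tokens, error_probs)
```

Here "apple", "ample", and "maple" are equally similar to "aple" as strings, so the context bonus decides the choice. This is why scoring the candidate list, rather than taking the nearest string, matters.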
Further, the correction module 30 is specifically configured to: input the content to be translated into a pre-trained end-to-end generation model that maps erroneous content to a correction result, to obtain the correction result corresponding to the content to be translated.
Further, the translation module 40 is also configured to: when it is judged that no error exists, or after the first translation result has been obtained and displayed, translate the content to be translated, and obtain and display the corresponding second translation result.
It should be noted that the explanation of the translation processing method in the foregoing embodiments also applies to the translation processing device of this embodiment, and is not repeated here.
The translation processing device of this embodiment of the present disclosure obtains the content to be translated, extracts its language features, and inputs the language features into a pre-trained error recognition model for processing to obtain a recognition result of the content to be translated. It then judges, according to the recognition result, whether the content to be translated contains an error and, if so, corrects the content to generate a correction result, translates the correction result, and obtains and displays the corresponding first translation result. Thus, when the content input by the user contains errors, the errors are automatically analyzed and corrected and the corrected content is translated, so that the user obtains the desired translation result without manually correcting the input, which improves the user experience.
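The cooperation of the four modules can be sketched as a small Python class. The class name, the callable-based interfaces, and the choice to always produce the second translation result (covering both the no-error case and the case where it is shown after the first result) are assumptions of this sketch, not details taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class TranslationOutput:
    first_result: Optional[str]  # translation of the corrected text, if an error was found
    second_result: str           # translation of the original text

class TranslationProcessor:
    """Hypothetical wiring of the acquisition, processing, correction and translation modules."""

    def __init__(
        self,
        extract_features: Callable[[str], list],
        recognize: Callable[[list], List[float]],
        correct: Callable[[str, List[float]], str],
        translate: Callable[[str], str],
        threshold: float = 0.5,
    ):
        self.extract_features = extract_features
        self.recognize = recognize
        self.correct = correct
        self.translate = translate
        self.threshold = threshold

    def process(self, text: str) -> TranslationOutput:
        features = self.extract_features(text)             # acquisition module
        error_probs = self.recognize(features)             # processing module
        first = None
        if any(p > self.threshold for p in error_probs):   # correction module
            corrected = self.correct(text, error_probs)
            first = self.translate(corrected)              # translation module
        second = self.translate(text)                      # translation of the uncorrected input
        return TranslationOutput(first, second)

# Usage with trivial stand-in components (upper-casing plays the role of translation).
processor = TranslationProcessor(
    extract_features=lambda t: t.split(),
    recognize=lambda toks: [0.9 if tok == "aple" else 0.1 for tok in toks],
    correct=lambda t, probs: t.replace("aple", "apple"),
    translate=lambda t: t.upper(),
)
output = processor.process("an aple")
```

Keeping the translation of the uncorrected input available lets the user fall back to it if the automatic correction was not what they intended.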
To implement the above embodiments, the present disclosure further provides an electronic device.
Referring now to Fig. 5, a schematic structural diagram of an electronic device 800 suitable for implementing the embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players), and vehicle-mounted terminals (for example, vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in Fig. 5, the electronic device 800 may include a processing device (for example, a central processing unit or a graphics processor) 801, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage device 808 into a random access memory (RAM) 803. The RAM 803 also stores the various programs and data required for the operation of the electronic device 800. The processing device 801, the ROM 802, and the RAM 803 are connected to one another through a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
In general, the following devices can be connected to the I/O interface 805: an input device 806 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output device 807 including, for example, a liquid crystal display (LCD), speaker, or vibrator; a storage device 808 including, for example, a magnetic tape or hard disk; and a communication device 809. The communication device 809 allows the electronic device 800 to communicate with other devices, wirelessly or by wire, to exchange data. Although Fig. 5 shows an electronic device 800 with various devices, it should be understood that not all of the devices shown need to be implemented or present. More or fewer devices may alternatively be implemented or present.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 809, installed from the storage device 808, or installed from the ROM 802. When the computer program is executed by the processing device 801, the above functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in connection with an instruction execution system, apparatus, or device. Also in the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to electric wire, optical cable, RF (radio frequency), or any suitable combination of the above.
The computer-readable medium may be included in the electronic device described above, or it may exist separately without being assembled into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain at least two Internet Protocol addresses; send to a node evaluation device a node evaluation request that includes the at least two Internet Protocol addresses, wherein the node evaluation device selects an Internet Protocol address from the at least two Internet Protocol addresses and returns it; and receive the Internet Protocol address returned by the node evaluation device; wherein the obtained Internet Protocol address indicates an edge node in a content distribution network.
Alternatively, the computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive a node evaluation request that includes at least two Internet Protocol addresses; select an Internet Protocol address from the at least two Internet Protocol addresses; and return the selected Internet Protocol address; wherein the received Internet Protocol address indicates an edge node in a content distribution network.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit does not, under certain circumstances, limit the unit itself. For example, the first acquisition unit may also be described as "a unit that obtains at least two Internet Protocol addresses".
To implement the above embodiments, the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the translation processing method described in the foregoing embodiments.
Fig. 6 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present disclosure. As shown in Fig. 6, non-transitory computer-readable instructions 310 are stored on the computer-readable storage medium 300 according to the embodiment of the present disclosure. When the non-transitory computer-readable instructions 310 are run by a processor, all or some of the steps of the translation processing method of the foregoing embodiments of the present disclosure are performed.
To implement the above embodiments, the present disclosure further provides a computer program product; when the instructions in the computer program product are executed by a processor, the translation processing method described in the foregoing embodiments is implemented.
Although embodiments of the present disclosure have been shown and described above, it is to be understood that the above embodiments are exemplary and should not be construed as limiting the present disclosure. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present disclosure.

Claims (10)

1. A translation processing method, comprising:
obtaining content to be translated, and extracting language features of the content to be translated;
inputting the language features into a pre-trained error recognition model for processing, to obtain a recognition result of the content to be translated;
judging, according to the recognition result, whether the content to be translated contains an error and, if so, correcting the content to be translated to generate a correction result; and
translating the correction result, and obtaining and displaying a corresponding first translation result.
2. The translation processing method according to claim 1, wherein extracting the language features of the content to be translated comprises:
segmenting the content to be translated to obtain a word segmentation result; and
extracting, according to the word segmentation result, a language model score and preceding-context features and/or following-context features for each segmented-word position;
and wherein obtaining the recognition result of the content to be translated comprises:
obtaining the error probability of each segmented-word position.
3. The translation processing method according to claim 2, wherein correcting the content to be translated to generate the correction result comprises:
obtaining the words at segmented-word positions whose error probability exceeds a preset threshold, and constructing a candidate list for each such word; and
scoring the candidate list, and replacing the word according to the scoring result to generate the correction result.
4. The translation processing method according to claim 1, wherein correcting the content to be translated to generate the correction result comprises:
inputting the content to be translated into a pre-trained end-to-end generation model that maps erroneous content to a correction result, to obtain the correction result corresponding to the content to be translated.
5. The translation processing method according to claim 1, further comprising, before inputting the language features into the pre-trained error recognition model for processing:
obtaining a training set of erroneous corpora and corresponding correct corpora; and
training the parameters of a preset model on the training set to generate the error recognition model.
6. The translation processing method according to claim 1, further comprising, after judging according to the recognition result whether the content to be translated contains an error:
if no error exists, translating the content to be translated, and obtaining and displaying a corresponding second translation result.
7. The translation processing method according to claim 1, further comprising, after translating the correction result and obtaining and displaying the corresponding first translation result:
translating the content to be translated, and obtaining and displaying a corresponding second translation result.
8. A translation processing device, comprising:
an acquisition module, configured to obtain content to be translated and to extract language features of the content to be translated;
a processing module, configured to input the language features into a pre-trained error recognition model for processing, to obtain a recognition result of the content to be translated;
a correction module, configured to judge, according to the recognition result, whether the content to be translated contains an error and, if so, to correct the content to be translated to generate a correction result; and
a translation module, configured to translate the correction result, and to obtain and display a corresponding first translation result.
9. An electronic device, comprising a processor and a memory;
wherein the processor reads executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the translation processing method according to any one of claims 1-7.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the translation processing method according to any one of claims 1-7.
CN201811350498.9A 2018-11-14 2018-11-14 Translation processing method and device Active CN109558600B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811350498.9A CN109558600B (en) 2018-11-14 2018-11-14 Translation processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811350498.9A CN109558600B (en) 2018-11-14 2018-11-14 Translation processing method and device

Publications (2)

Publication Number Publication Date
CN109558600A true CN109558600A (en) 2019-04-02
CN109558600B CN109558600B (en) 2023-06-30

Family

ID=65866348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811350498.9A Active CN109558600B (en) 2018-11-14 2018-11-14 Translation processing method and device

Country Status (1)

Country Link
CN (1) CN109558600B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110362401A (en) * 2019-06-20 2019-10-22 深圳壹账通智能科技有限公司 Data run the member host in batch method, apparatus, storage medium and cluster
CN111339790A (en) * 2020-02-25 2020-06-26 北京字节跳动网络技术有限公司 Text translation method, device, equipment and computer readable storage medium
CN112905869A (en) * 2021-03-26 2021-06-04 北京儒博科技有限公司 Adaptive training method and device for language model, storage medium and equipment
CN113361511A (en) * 2020-03-05 2021-09-07 顺丰科技有限公司 Method, device and equipment for establishing correction model and computer readable storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050289168A1 (en) * 2000-06-26 2005-12-29 Green Edward A Subject matter context search engine
WO2008131509A1 (en) * 2007-04-30 2008-11-06 Fireswirl Systems Inc. Systems and methods for improving translation systems
CN101593173A (en) * 2008-05-28 2009-12-02 中国科学院自动化研究所 A kind of reverse Chinese-English transliteration method and device
US20100138210A1 (en) * 2008-12-02 2010-06-03 Electronics And Telecommunications Research Institute Post-editing apparatus and method for correcting translation errors
US20110307241A1 (en) * 2008-04-15 2011-12-15 Mobile Technologies, Llc Enhanced speech-to-speech translation system and methods
US8296124B1 (en) * 2008-11-21 2012-10-23 Google Inc. Method and apparatus for detecting incorrectly translated text in a document
US8326598B1 (en) * 2007-03-26 2012-12-04 Google Inc. Consensus translations from multiple machine translation systems
US20130144592A1 (en) * 2006-09-05 2013-06-06 Google Inc. Automatic Spelling Correction for Machine Translation
CN106649288A (en) * 2016-12-12 2017-05-10 北京百度网讯科技有限公司 Translation method and device based on artificial intelligence
CN107122346A (en) * 2016-12-28 2017-09-01 平安科技(深圳)有限公司 The error correction method and device of a kind of read statement
CN107451127A (en) * 2017-07-04 2017-12-08 广东小天才科技有限公司 A kind of word translation method and system based on image, mobile device
CN108241614A (en) * 2016-12-27 2018-07-03 北京搜狗科技发展有限公司 Information processing method and device, the device for information processing
CN108491392A (en) * 2018-03-29 2018-09-04 广州视源电子科技股份有限公司 Modification method, system, computer equipment and the storage medium of word misspelling
CN108563634A (en) * 2018-03-29 2018-09-21 广州视源电子科技股份有限公司 Recognition methods, system, computer equipment and the storage medium of word misspelling

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
A.DE GISPERT et al.: "N-gram posterior probability confidence measures for statistical machine translation: an empirical study", 《SPRINGER》, 28 August 2012 (2012-08-28) *
HEIKE ADEL et al.: "Syntactic and Semantic Features For Code-Switching Factored Language Models", 《ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING》, 4 October 2017 (2017-10-04) *
IFTEKHAR NAIM et al.: "Feature-based Decipherment for Machine Translation", 《COMPUTATIONAL LINGUISTICS》, 18 July 2018 (2018-07-18) *
SHAMIL CHOLLAMPATT et al.: "A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction", 《THE THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-18)》, 28 February 2018 (2018-02-28) *
党百振: "一种二级级联纠错编码的设计与分析", 《移动通信》, no. 14, 30 July 2013 (2013-07-30) *
刘磊 et al.: "英语学习者书面语法错误自动检测研究综述", 《中文信息学报》, no. 01, 15 January 2018 (2018-01-15) *
姚树杰 et al.: "基于句对质量和覆盖度的统计机器翻译训练语料选取", 《中文信息学报》, no. 02, 15 March 2011 (2011-03-15) *
陈欢 et al.: "基于话题翻译模型的双语文本纠错", 计算机应用与软件, no. 03 *
龚慧敏 et al.: "自纠正词对齐", 《计算机科学》, no. 12, 15 December 2017 (2017-12-15) *

Also Published As

Publication number Publication date
CN109558600B (en) 2023-06-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: Tiktok vision (Beijing) Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Tiktok vision (Beijing) Co.,Ltd.

Address before: Room B0035, 2nd floor, No. 3 Courtyard, 30 Shixing Street, Shijingshan District, Beijing, 100041

Applicant before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

GR01 Patent grant