CN107423293A - Method and apparatus for data translation

Info

Publication number
CN107423293A
CN107423293A CN201710589392.3A CN201710589392A CN107423293A CN 107423293 A CN107423293 A CN 107423293A CN 201710589392 A CN201710589392 A CN 201710589392A CN 107423293 A CN107423293 A CN 107423293A
Authority
CN
China
Prior art keywords
translation
fragment
input content
user
preset format
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710589392.3A
Other languages
Chinese (zh)
Inventor
吴闯
叶娜
蔡东风
张桂平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Aerospace University
Original Assignee
Shenyang Aerospace University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Aerospace University
Priority to CN201710589392.3A
Publication of CN107423293A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/58: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/42: Data-driven translation
    • G06F40/47: Machine-assisted translation, e.g. using translation memory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method and apparatus for data translation. The method includes: providing, by a first preset model and according to a user's input content, translation candidates for each translation unit of the input content; and inputting positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, into a second preset model, which computes and generates the translation. The invention solves the technical problem in the prior art of inaccurate translation caused by machine translation.

Description

Method and apparatus for data translation
Technical field
The present invention relates to the field of computer technology applications, and in particular to a method and apparatus for data translation.
Background
From a research perspective, machine translation can be divided into rule-based machine translation and corpus-based machine translation. Corpus-based methods can in turn be divided into example-based methods and statistical methods. At present, statistical methods have become mainstream. Companies such as Baidu, Google, Microsoft, Youdao, and Alibaba have all invested heavily in research on machine translation technology.
Statistical machine translation has greatly improved translation quality, and these systems are widely used by internet users in daily life. However, many problems remain. From an application standpoint, they are mainly the following:
(1) Translation quality is not high enough: For certain language pairs and domains, machine translation has reached an acceptable level (reading level); users can largely grasp the main content of the original text with the help of a machine translation system, for example the French-to-English and Arabic-to-English translation (news domain) provided by Google. In the general case, however, the quality of machine translation still cannot meet users' needs. English-Chinese and Chinese-English translation are typical examples. Chinese-English is among the most studied language pairs in machine translation research, and its corpora are extremely large, on the order of millions to tens of millions of sentence pairs, yet system performance remains unsatisfactory: translations that are not fluent, or whose meaning cannot be understood at all, are still common.
(2) Translation results are not reliable enough: For many users, even when machine translation accuracy is not high, a system that could accurately indicate which translation results are trustworthy and which are not would still save a great deal of time and money. Current machine translation systems cannot do this, so correct and incorrect results are mixed together and the usability of machine translation drops considerably. Moreover, even in some narrow domains, machine translation cannot achieve high reliability.
Therefore, in applications with strict requirements on translation quality, human translation or computer-aided translation remains irreplaceable.
No effective solution has yet been proposed for the above problem of inaccurate translation caused by machine translation in the prior art.
Summary of the invention
The embodiments of the present invention provide a method and apparatus for data translation, so as to at least solve the technical problem in the prior art of inaccurate translation caused by machine translation.
According to one aspect of the embodiments of the present invention, a method of data translation is provided, including: providing, by a first preset model and according to a user's input content, translation candidates for each translation unit of the input content; and inputting positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, into a second preset model, which computes and generates the translation.
Optionally, before the translation candidates for each translation unit of the input content are provided by the first preset model according to the user's input content, the method further includes: converting the input content from a first preset format to a second preset format, where the second preset format is the format used for translating data.
Further, optionally, converting the input content from the first preset format to the second preset format includes: in the case where the second preset format includes text, parsing the first preset format of the input content, where the first preset format includes picture, sound, and text; in the case where the first preset format is a picture, parsing the picture by image recognition and converting the information in the picture into text; in the case where the first preset format is sound, extracting the speech in the sound by voice conversion and converting the speech into text; and in the case where the first preset format is text, taking the text as the text to be translated.
Optionally, before the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, are input into the second preset model, the method further includes: obtaining the translation features of the input content. The translation features include positive fragment candidates, negative fragments, and unoperated fragments. A positive fragment is a source-language-to-target-language fragment pair generated when the user selects words interactively; a target-language fragment may be generated by selecting directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate. A negative fragment is a source-language-to-target-language fragment pair, generated after the user operates the interactive interface, into which the source-language fragment must not be translated. An unoperated fragment is a fragment that remains after the user's positive and negative operations, that is, one on which the user performed neither.
Optionally, before the input content is converted from the first preset format to the second preset format, the method further includes: receiving the input content entered by the user.
According to another aspect of the embodiments of the present invention, an apparatus for data translation is provided, including: a candidate word acquisition module, configured to provide, by a first preset model and according to a user's input content, translation candidates for each translation unit of the input content; and a translation module, configured to input positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, into a second preset model, which computes and generates the translation.
Optionally, the apparatus further includes: a conversion module, configured to convert the input content from a first preset format to a second preset format before the translation candidates for each translation unit of the input content are provided by the first preset model, where the second preset format is the format used for translating data.
Further, optionally, the conversion module includes: a first parsing unit, configured to parse the first preset format of the input content in the case where the second preset format includes text, where the first preset format includes picture, sound, and text; a first conversion unit, configured to parse the picture by image recognition and convert the information in the picture into text in the case where the first preset format is a picture; a second conversion unit, configured to extract the speech in the sound by voice conversion and convert the speech into text in the case where the first preset format is sound; and a third conversion unit, configured to take the text as the text to be translated in the case where the first preset format is text.
Optionally, the apparatus further includes: a translation feature acquisition module, configured to obtain the translation features of the input content before the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, are input into the second preset model. The translation features include positive fragment candidates, negative fragments, and unoperated fragments. A positive fragment is a source-language-to-target-language fragment pair generated when the user selects words interactively; a target-language fragment may be generated by selecting directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate. A negative fragment is a source-language-to-target-language fragment pair, generated after the user operates the interactive interface, into which the source-language fragment must not be translated. An unoperated fragment is a fragment that remains after the user's positive and negative operations, that is, one on which the user performed neither.
Optionally, the apparatus further includes: a receiving module, configured to receive the input content entered by the user before the input content is converted from the first preset format to the second preset format.
According to yet another aspect of the embodiments of the present invention, a storage medium is provided. The storage medium includes a stored program, where the program, when run, controls the device on which the storage medium resides to perform the above method of data translation.
According to yet another aspect of the embodiments of the present invention, a processor is provided. The processor is configured to run a program, where the program, when run, performs the above method of data translation.
In the embodiments of the present invention, translation candidates for each translation unit of the input content are provided by the first preset model according to the user's input content; the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, are input into the second preset model, which computes and generates the translation. This achieves the purpose of improving translation precision, realizes the technical effect of improving translation accuracy, and thereby solves the technical problem in the prior art of inaccurate translation caused by machine translation.
Brief description of the drawings
The accompanying drawings described herein are provided for further understanding of the present invention and form a part of this application. The exemplary embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic flowchart of the method of data translation according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of obtaining the Chunk candidate list T corresponding to S in the method of data translation according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method of data translation according to an embodiment of the present invention;
Fig. 3a is a schematic diagram of the candidate list in a method of data translation according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the apparatus for data translation according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the invention without creative effort shall fall within the scope of protection of the invention.
It should be noted that the terms "first", "second", and the like in the description, claims, and drawings of the present invention are used to distinguish similar objects and not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product, or device.
Embodiment one
According to an embodiment of the present invention, a method embodiment of data translation is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, such as one running a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.
Fig. 1 is a schematic flowchart of the method of data translation according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102, according to the user's input content, providing, by the first preset model, translation candidates for each translation unit of the input content;
Step S104, inputting the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, into the second preset model, which computes and generates the translation.
In the method of data translation provided by the embodiments of the present application, translation candidates for each translation unit of the input content are provided by the first preset model according to the user's input content; the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, are input into the second preset model, which computes and generates the translation. This achieves the purpose of improving translation precision, realizes the technical effect of improving translation accuracy, and thereby solves the technical problem in the prior art of inaccurate translation caused by machine translation.
Optionally, before the translation candidates for each translation unit of the input content are provided by the first preset model according to the user's input content in step S102, the method of data translation provided by the present application further includes:
Step S101, converting the input content from the first preset format to the second preset format, where the second preset format is the format used for translating data.
Further, optionally, converting the input content from the first preset format to the second preset format in step S101 includes:
Step1, in the case where the second preset format includes text, parsing the first preset format of the input content, where the first preset format includes picture, sound, and text;
Step2, in the case where the first preset format is a picture, parsing the picture by image recognition and converting the information in the picture into text;
Step3, in the case where the first preset format is sound, extracting the speech in the sound by voice conversion and converting the speech into text;
Step4, in the case where the first preset format is text, taking the text as the text to be translated.
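The format dispatch in Step1 to Step4 can be sketched as follows. This is a minimal illustration only, not the applicant's implementation: the names `InputFormat`, `UserInput`, and `to_text` are hypothetical, and the image-recognition and voice-conversion engines are passed in as plain callables because the description does not name any particular engine.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class InputFormat(Enum):
    """The three first-preset-format cases named in the description."""
    PICTURE = "picture"
    SOUND = "sound"
    TEXT = "text"


@dataclass
class UserInput:
    fmt: InputFormat
    payload: object  # bytes for picture/sound, str for text (assumption)


def to_text(
    user_input: UserInput,
    recognize_image: Callable[[bytes], str],
    transcribe_audio: Callable[[bytes], str],
) -> str:
    """Convert input content to text, the assumed second preset format."""
    if user_input.fmt is InputFormat.PICTURE:
        return recognize_image(user_input.payload)
    if user_input.fmt is InputFormat.SOUND:
        return transcribe_audio(user_input.payload)
    if user_input.fmt is InputFormat.TEXT:
        # Text input is used directly as the text to be translated.
        return str(user_input.payload)
    raise ValueError(f"unsupported input format: {user_input.fmt}")


# Stand-in recognizers so the sketch runs without real OCR/ASR engines.
def fake_ocr(data: bytes) -> str:
    return data.decode("utf-8")


def fake_asr(data: bytes) -> str:
    return data.decode("utf-8").lower()
```

Passing the recognizers as parameters keeps the dispatch logic independent of whichever recognition engines an implementation actually uses.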
Optionally, before the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, are input into the second preset model in step S104, the method of data translation provided by the present application further includes:
Step S103, obtaining the translation features of the input content; the translation features include positive fragment candidates, negative fragments, and unoperated fragments;
A positive fragment is a source-language-to-target-language fragment pair generated when the user selects words interactively. A target-language fragment may be generated by selecting directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate;
A negative fragment is a source-language-to-target-language fragment pair, generated after the user operates the interactive interface, into which the source-language fragment must not be translated;
An unoperated fragment is a fragment that remains after the user's positive and negative operations, that is, one on which the user performed neither.
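One possible way to represent these three feature sets is sketched below. The labels `enum_candidate`, `enum_combine`, `enum_del`, and `enum_none` are taken from the examples given later in this description; `enum_custom` and `enum_modify` are hypothetical labels introduced here for the two generation modes the description names without a label. The class and function names are likewise assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Op(Enum):
    CANDIDATE = "enum_candidate"  # chosen directly from the candidates
    COMBINE = "enum_combine"      # built by combining several candidates
    CUSTOM = "enum_custom"        # hypothetical label: user-defined addition
    MODIFY = "enum_modify"        # hypothetical label: partial edit of a candidate
    DELETE = "enum_del"           # user rejected this source/target pairing
    NONE = "enum_none"            # user did not operate on this fragment


# The four generation modes that yield a positive fragment.
POSITIVE_OPS = {Op.CANDIDATE, Op.COMBINE, Op.CUSTOM, Op.MODIFY}


@dataclass(frozen=True)
class FragmentPair:
    source: str
    target: str
    op: Op


@dataclass
class TranslationFeatures:
    positive: List[FragmentPair] = field(default_factory=list)
    negative: List[FragmentPair] = field(default_factory=list)
    unoperated: List[FragmentPair] = field(default_factory=list)


def build_features(pairs: List[FragmentPair]) -> TranslationFeatures:
    """Sort interaction results into the three feature sets."""
    features = TranslationFeatures()
    for pair in pairs:
        if pair.op in POSITIVE_OPS:
            features.positive.append(pair)
        elif pair.op is Op.DELETE:
            features.negative.append(pair)
        else:
            features.unoperated.append(pair)
    return features
```

Keeping the operation label on every pair preserves how each positive fragment was produced, in case a downstream model wants to weight the modes differently.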
Optionally, before the input content is converted from the first preset format to the second preset format in step S101, the method further includes:
Step S100, receiving the input content entered by the user.
In summary, the method of data translation provided by the embodiments of the present application is as follows:
(1) The user's input is obtained by the user input capture module and finally converted into the text form S-IN = {CH1, CH2, ..., CHi-2, CHi-1, CHi, CHi+1, ..., CHn} (voice input is converted into text by the voice conversion module; picture input is recognized as text by image recognition; direct text input is not converted);
Specifically, the conversion to text form in step (1) corresponds to step S101 in the embodiments of the present application: the collected user input data is converted from the first preset format to the second preset format, where the first preset format is the initial format of the collected user input data, such as sound (voice), picture, or text. When the initial format of the collected data is text, the text is used directly as the text to be translated.
(2) According to the text S, the Chunk candidate list T corresponding to S is obtained by the first candidate generation module M1;
Specifically, obtaining the Chunk candidate list T corresponding to S in step (2) corresponds to step S102 in the embodiments of the present application: the character string in the data converted into the second preset format is obtained, and then the candidate list corresponding to that character string. Fig. 2 is a schematic diagram of obtaining the Chunk candidate list T corresponding to S in the method of data translation according to an embodiment of the present invention, as shown in Fig. 2.
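The description does not specify how M1 segments S into chunks or ranks candidates. As a toy stand-in only, the sketch below uses greedy longest-match lookup against a small phrase table; the table contents, the function name `chunk_candidates`, and the `max_len` parameter are all hypothetical, and whitespace-separated tokens stand in for the characters CH1 ... CHn.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase table: source chunk -> ranked target candidates.
PHRASE_TABLE: Dict[Tuple[str, ...], List[str]] = {
    ("machine", "translation"): ["MT-a", "MT-b"],
    ("machine",): ["M-a"],
    ("quality",): ["Q-a", "Q-b"],
}


def chunk_candidates(
    tokens: List[str],
    table: Dict[Tuple[str, ...], List[str]],
    max_len: int = 3,
) -> List[Tuple[str, List[str]]]:
    """Greedy longest-match chunking; returns [(chunk, candidates), ...]."""
    result = []
    i = 0
    while i < len(tokens):
        # Try the longest span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            chunk = tuple(tokens[i:i + n])
            if chunk in table:
                result.append((" ".join(chunk), list(table[chunk])))
                i += n
                break
        else:
            # Unknown token: keep it as its own chunk with no candidates.
            result.append((tokens[i], []))
            i += 1
    return result
```

A real first preset model would presumably score candidates statistically; the point here is only the shape of the output, a list T with one ranked candidate list per chunk.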
(3) The user's word-selection behavior is captured by the user behavior capture module to obtain the user's word-selection list SELECT-LIST = {T2, …, Ti};
Specifically, capturing the user's word-selection behavior and obtaining the user's word-selection list in step (3) corresponds to step S103 in the embodiments of the present application: the user's previously obtained historical word-selection behavior is parsed to obtain the corresponding user word-selection list.
(4) The second candidate generation module M2 generates the translation S-OUT from the user's S-IN and SELECT-LIST.
In summary, Fig. 3 is a schematic flowchart of a method of data translation according to an embodiment of the present invention, as shown in Fig. 3.
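The four steps above can be wired together as in the following sketch. It is an outline under stated assumptions rather than the patented implementation: each module is injected as a callable, the type aliases and parameter names are hypothetical, and M2 is also handed the candidate list T, which the description lists as an input to the second preset model.

```python
from typing import Callable, Dict, List, Tuple

Candidates = List[Tuple[str, List[str]]]  # [(chunk, ranked candidates), ...]
Selection = Dict[str, str]                # chunk -> target chosen by the user


def translate(
    raw_input: str,
    to_text: Callable[[str], str],
    m1: Callable[[str], Candidates],
    capture_selection: Callable[[Candidates], Selection],
    m2: Callable[[str, Candidates, Selection], str],
) -> str:
    s_in = to_text(raw_input)            # step (1): convert input to text S-IN
    t = m1(s_in)                         # step (2): Chunk candidate list T
    select_list = capture_selection(t)   # step (3): user's SELECT-LIST
    return m2(s_in, t, select_list)      # step (4): translation S-OUT


# Toy stand-ins so the flow can be exercised end to end.
def demo_m1(s: str) -> Candidates:
    return [(w, [w.upper(), w.title()]) for w in s.split()]


def demo_select(t: Candidates) -> Selection:
    # Pretend the user picked the second candidate for the first chunk.
    return {t[0][0]: t[0][1][1]} if t else {}


def demo_m2(s: str, t: Candidates, sel: Selection) -> str:
    return " ".join(sel.get(chunk, cands[0]) for chunk, cands in t)
```

With these stand-ins, `translate("good morning", str.strip, demo_m1, demo_select, demo_m2)` keeps the user's choice for the first chunk and the top candidate for the second.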
For example, suppose the unit to be translated is: "all these criterions are obviously based on the optimistic prediction to maximal rate." The candidate list corresponding to the data in the second format is then as shown in Fig. 3a, which is a schematic diagram of the candidate list in a method of data translation according to an embodiment of the present invention.
Step1: Obtain the translation features.
Step2: Features. The input features in this part comprise three parts:
Positive fragments (set): A positive fragment is a source-language-to-target-language fragment pair generated when the user selects words interactively. A target-language fragment may be generated by selecting directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate.
For example, " obviously ", obviously, enum_candidate;
" maximum ", maximum, enum_combine;
……
Negative fragments (set): A negative fragment is a source-language-to-target-language fragment pair, generated after the user operates the interactive interface, into which the source-language fragment must not be translated;
For example, large, root swelling, enum_del;
Unoperated fragments (set): the fragments that remain after the user's positive and negative operations, that is, those on which the user performed neither.
Such as:
(all, " ", enum_none)
(these, " ", enum_none)
……
(prediction, predication, enum_none)
The features produced in this step are the focus of the present application. Concretely, they are features produced by human-computer interaction behavior.
The features produced in Step2 are input into the second preset translation model to generate the candidate translation.
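The description does not disclose the internals of the second preset model. Purely to illustrate how the three feature sets could constrain generation, the toy rule below uses the user's positive choice where one exists and otherwise takes the best-ranked candidate that is not in the negative set. The function name and data shapes are hypothetical, and a real system would presumably use a statistical decoder rather than this rule.

```python
from typing import Dict, List, Set, Tuple


def generate_translation(
    candidates: List[Tuple[str, List[str]]],
    positive: Dict[str, str],
    negative: Set[Tuple[str, str]],
) -> List[str]:
    """Pick one target fragment per chunk under the user's constraints."""
    output = []
    for chunk, ranked in candidates:
        if chunk in positive:
            # Positive fragment: the user's choice is kept as-is.
            output.append(positive[chunk])
            continue
        # Negative fragments: drop pairings the user rejected.
        allowed = [c for c in ranked if (chunk, c) not in negative]
        # Fall back to the source chunk when every candidate was rejected.
        output.append(allowed[0] if allowed else chunk)
    return output
```

Unoperated fragments need no special handling in this toy rule: they simply fall through to the top remaining candidate.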
The method of data translation provided by the embodiments of the present application targets translation tasks with high quality requirements (such as engineering translation, manual translation, and document translation). The cost of purely human translation is rising rapidly (translation is currently slow and expensive), and computer-aided translation systems are widely used in translators' work. Traditional computer-aided translation mostly relies on translation memory, and post-editing based on machine translation is now being adopted by more and more companies. The method provided by the embodiments of the present application makes full use of the computer's strong memory capability: it first performs CHUNK-level recognition on the input, obtains a high-precision CHUNK translation-pair table from the translator's selection behavior, and then uses the automatic translation generation module, together with the user's word-selection results already obtained, to generate a more precise translation, so that the machine translation output is more accurate and matches the translator's wording habits.
Embodiment two
According to another aspect of the embodiments of the present invention, an apparatus for data translation is also provided. Fig. 4 is a schematic structural diagram of the apparatus for data translation according to an embodiment of the present invention. As shown in Fig. 4, the apparatus includes:
a candidate word acquisition module 42, configured to provide, by the first preset model and according to the user's input content, translation candidates for each translation unit of the input content; and a translation module 44, configured to input the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, into the second preset model, which computes and generates the translation.
In the apparatus for data translation provided by the embodiments of the present application, translation candidates for each translation unit of the input content are provided by the first preset model according to the user's input content; the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, are input into the second preset model, which computes and generates the translation. This achieves the purpose of improving translation precision, realizes the technical effect of improving translation accuracy, and thereby solves the technical problem in the prior art of inaccurate translation caused by machine translation.
Optionally, the apparatus for data translation provided by the embodiments of the present application further includes: a conversion module, configured to convert the input content from the first preset format to the second preset format before the translation candidates for each translation unit of the input content are provided by the first preset model according to the user's input content, where the second preset format is the format used for translating data.
Further, optionally, the conversion module includes: a first parsing unit, configured to parse the first preset format of the input content in the case where the second preset format includes text, where the first preset format includes picture, sound, and text; a first conversion unit, configured to parse the picture by image recognition and convert the information in the picture into text in the case where the first preset format is a picture; a second conversion unit, configured to extract the speech in the sound by voice conversion and convert the speech into text in the case where the first preset format is sound; and a third conversion unit, configured to take the text as the text to be translated in the case where the first preset format is text.
Optionally, the apparatus for data translation provided by the embodiments of the present application further includes: a translation feature acquisition module, configured to obtain the translation features of the input content before the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from the user's interactive operations, together with the translation candidates for each translation unit of the input content, are input into the second preset model. The translation features include positive fragment candidates, negative fragments, and unoperated fragments. A positive fragment is a source-language-to-target-language fragment pair generated when the user selects words interactively; a target-language fragment may be generated by selecting directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate. A negative fragment is a source-language-to-target-language fragment pair, generated after the user operates the interactive interface, into which the source-language fragment must not be translated. An unoperated fragment is a fragment that remains after the user's positive and negative operations, that is, one on which the user performed neither.
Optionally, the apparatus for data translation provided by the embodiments of the present application further includes: a receiving module, configured to receive the input content entered by the user before the input content is converted from the first preset format to the second preset format.
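The module structure of the apparatus, with its two required modules and three optional ones, might be composed as in the sketch below. The class name `TranslationDevice`, the field names, and the callable signatures are assumptions made for illustration; only the module roles and the reference numerals 42 and 44 come from the description.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Candidates = List[Tuple[str, List[str]]]  # [(chunk, ranked candidates), ...]


@dataclass
class TranslationDevice:
    candidate_module: Callable[[str], Candidates]          # module 42
    translation_module: Callable[[Candidates, dict], str]  # module 44
    receiving_module: Optional[Callable[[], str]] = None   # optional
    conversion_module: Optional[Callable[[str], str]] = None  # optional
    feature_module: Optional[Callable[[Candidates], dict]] = None  # optional

    def run(self, content: Optional[str] = None) -> str:
        if content is None:
            if self.receiving_module is None:
                raise ValueError("no input content and no receiving module")
            content = self.receiving_module()
        if self.conversion_module is not None:
            content = self.conversion_module(content)
        candidates = self.candidate_module(content)
        # Without a feature module, the translation module gets no user features.
        features = self.feature_module(candidates) if self.feature_module else {}
        return self.translation_module(candidates, features)
```

Modeling the optional modules as `None` by default mirrors the "optionally, the apparatus further includes" wording: the device still runs with only modules 42 and 44 present.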
Embodiment three
According to yet another aspect of the embodiments of the present invention, a storage medium is provided. The storage medium includes a stored program, where the program, when run, controls the device on which the storage medium resides to perform the above method of data translation.
Embodiment four
According to yet another aspect of the embodiments of the present invention, a processor is provided. The processor is configured to run a program, where the program, when run, performs the above method of data translation.
The numbering of the above embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above is only the preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and modifications without departing from the principles of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

  1. A data translation method, characterized by comprising:
    providing, according to the input content of a user and by means of a first preset model, translation candidate words for each translation unit corresponding to the input content;
    inputting positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from user interaction, together with the translation candidate words of each translation unit corresponding to the input content, into a second preset model, and generating a translation through calculation by the second preset model.
  2. The method according to claim 1, characterized in that, before providing, according to the input content of the user and by means of the first preset model, the translation candidate words for each translation unit corresponding to the input content, the method further comprises:
    converting the input content from a first preset format to a second preset format, wherein the second preset format is the format used for translating the data.
  3. The method according to claim 2, characterized in that converting the input content from the first preset format to the second preset format comprises:
    in the case where the second preset format includes text, parsing the first preset format of the input content, wherein the first preset format includes: picture, sound, and text;
    in the case where the first preset format is the picture, parsing the picture by image recognition and converting the information in the picture into the text;
    in the case where the first preset format is the sound, extracting the pronunciation in the sound by speech conversion and converting the pronunciation into the text;
    in the case where the first preset format is the text, determining the text to be the text to be translated.
  4. The method according to claim 1, characterized in that, before inputting the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from user interaction, together with the translation candidate words of each translation unit corresponding to the input content, into the second preset model, the method further comprises:
    obtaining translation features of the input content, the translation features including: the positive fragment candidates, the negative fragments, and the unoperated fragments;
    wherein the positive fragment is a source-language to target-language fragment pair generated when the user selects words interactively; the ways of generating the target-language fragment include: direct selection from the candidates, combination of multiple candidates, user-defined addition, and partial modification of an existing candidate;
    the negative fragment is a fragment pair, generated after the user operates on the interactive interface, that pairs a source-language fragment with a target-language fragment it must not be translated into;
    the unoperated fragment is a fragment that remains after the user's positive and negative operations.
  5. The method according to claim 2, characterized in that, before converting the input content from the first preset format to the second preset format, the method further comprises:
    receiving the input content entered by the user.
  6. A data translation device, characterized by comprising:
    a candidate word acquisition module, configured to provide, according to the input content of a user and by means of a first preset model, translation candidate words for each translation unit corresponding to the input content;
    a translation module, configured to input positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from user interaction, together with the translation candidate words of each translation unit corresponding to the input content, into a second preset model, and to generate a translation through calculation by the second preset model.
  7. The device according to claim 6, characterized in that the device further comprises:
    a conversion module, configured to convert the input content from a first preset format to a second preset format before the translation candidate words for each translation unit corresponding to the input content are provided, according to the input content of the user, by means of the first preset model, wherein the second preset format is the format used for translating the data.
  8. The device according to claim 7, characterized in that the conversion module comprises:
    a first parsing unit, configured to parse the first preset format of the input content in the case where the second preset format includes text, wherein the first preset format includes: picture, sound, and text;
    a first conversion unit, configured to parse the picture by image recognition and convert the information in the picture into the text in the case where the first preset format is the picture;
    a second conversion unit, configured to extract the pronunciation in the sound by speech conversion and convert the pronunciation into the text in the case where the first preset format is the sound;
    a third conversion unit, configured to determine the text to be the text to be translated in the case where the first preset format is the text.
  9. The device according to claim 6, characterized in that the device further comprises:
    a translation feature acquisition module, configured to obtain translation features of the input content before the positive fragment candidates, negative fragments, and unoperated fragments obtained in advance from user interaction, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model; the translation features include: the positive fragment candidates, the negative fragments, and the unoperated fragments; wherein the positive fragment is a source-language to target-language fragment pair generated when the user selects words interactively; the ways of generating the target-language fragment include: direct selection from the candidates, combination of multiple candidates, user-defined addition, and partial modification of an existing candidate; the negative fragment is a fragment pair, generated after the user operates on the interactive interface, that pairs a source-language fragment with a target-language fragment it must not be translated into; the unoperated fragment is a fragment that remains after the user's positive and negative operations.
  10. The device according to claim 7, characterized in that the device further comprises:
    a receiving module, configured to receive the input content entered by the user before the input content is converted from the first preset format to the second preset format.
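
As an illustration only, the flow recited in claims 1 to 3 (format conversion, candidate generation by a first model, translation generation by a second model) can be sketched in Python. Nothing below comes from the application itself: the function names, the injected `ocr` and `speech_to_text` callables, and the toy lexicon-based models are hypothetical stand-ins for the image recognition, speech conversion, and preset models that the claims leave unspecified.

```python
def to_text(content, fmt, ocr, speech_to_text):
    """Convert input in the first preset format into text (the second preset format)."""
    if fmt == "picture":
        return ocr(content)             # image recognition, supplied by the caller
    if fmt == "sound":
        return speech_to_text(content)  # speech conversion, supplied by the caller
    if fmt == "text":
        return content                  # already the text to be translated
    raise ValueError(f"unsupported input format: {fmt!r}")


def translate(text, first_model, second_model, forward, negative, unoperated):
    """Two-stage flow: candidates from the first model, translation from the second."""
    candidates = first_model(text)
    return second_model(candidates, forward, negative, unoperated)


def toy_first_model(text):
    """Hypothetical first model: look up candidate words for each whitespace-separated unit."""
    lexicon = {"hello": ["bonjour", "salut"], "world": ["monde", "univers"]}
    return [(unit, lexicon.get(unit, [unit])) for unit in text.split()]


def toy_second_model(candidates, forward, negative, unoperated):
    """Hypothetical second model: honor positive pairs, skip negative pairs.

    forward maps a source unit to the target the user accepted; negative is a
    set of (source, target) pairs the user ruled out. Unoperated units simply
    take the first candidate that has not been ruled out.
    """
    words = []
    for unit, options in candidates:
        if unit in forward:
            words.append(forward[unit])
            continue
        allowed = [w for w in options if (unit, w) not in negative]
        words.append(allowed[0] if allowed else unit)
    return " ".join(words)
```

A real second model would score whole translations rather than pick words independently; the toy version only shows where the positive, negative, and unoperated fragments enter the computation.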
CN201710589392.3A 2017-07-18 2017-07-18 The method and apparatus of data translation Pending CN107423293A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710589392.3A CN107423293A (en) 2017-07-18 2017-07-18 The method and apparatus of data translation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710589392.3A CN107423293A (en) 2017-07-18 2017-07-18 The method and apparatus of data translation

Publications (1)

Publication Number Publication Date
CN107423293A true CN107423293A (en) 2017-12-01

Family

ID=60430184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710589392.3A Pending CN107423293A (en) 2017-07-18 2017-07-18 The method and apparatus of data translation

Country Status (1)

Country Link
CN (1) CN107423293A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108682201A (en) * 2018-03-30 2018-10-19 华北水利水电大学 A kind of English teaching system based on intelligent terminal
CN109558597A (en) * 2018-12-17 2019-04-02 北京百度网讯科技有限公司 Text interpretation method and device, equipment and storage medium
CN109558597B (en) * 2018-12-17 2022-05-24 北京百度网讯科技有限公司 Text translation method and device, equipment and storage medium

Similar Documents

Publication Publication Date Title
KR102577514B1 (en) Method, apparatus for text generation, device and storage medium
CN108287858B (en) Semantic extraction method and device for natural language
WO2020186778A1 (en) Error word correction method and device, computer device, and storage medium
CN111241237B (en) Intelligent question-answer data processing method and device based on operation and maintenance service
CN110750959A (en) Text information processing method, model training method and related device
CN114580382A (en) Text error correction method and device
CN107729313A (en) The method of discrimination and device of multitone character pronunciation based on deep neural network
CN108124477A (en) Segmenter is improved based on pseudo- data to handle natural language
CN111445898B (en) Language identification method and device, electronic equipment and storage medium
CN111599340A (en) Polyphone pronunciation prediction method and device and computer readable storage medium
CN110211562B (en) Voice synthesis method, electronic equipment and readable storage medium
CN114757176A (en) Method for obtaining target intention recognition model and intention recognition method
CN112463942A (en) Text processing method and device, electronic equipment and computer readable storage medium
CN115545041B (en) Model construction method and system for enhancing semantic vector representation of medical statement
JP2020135135A (en) Dialog content creation assisting method and system
CN110968725A (en) Image content description information generation method, electronic device, and storage medium
CN115631261A (en) Training method of image generation model, image generation method and device
CN107423293A (en) The method and apparatus of data translation
CN116913278B (en) Voice processing method, device, equipment and storage medium
CN106484660A (en) Title treating method and apparatus
CN110516125A (en) Method, device and equipment for identifying abnormal character string and readable storage medium
CN110162615A (en) A kind of intelligent answer method, apparatus, electronic equipment and storage medium
CN111090720B (en) Hot word adding method and device
CN113204966A (en) Corpus augmentation method, apparatus, device and storage medium
KR101543024B1 (en) Method and Apparatus for Translating Word based on Pronunciation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171201