CN107423293A - Method and apparatus for data translation - Google Patents
Method and apparatus for data translation
- Publication number: CN107423293A
- Application number: CN201710589392.3A
- Authority: CN (China)
- Prior art keywords: translation, fragment, input content, user, preset format
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
- G06F40/42—Data-driven translation
- G06F40/47—Machine-assisted translation, e.g. using translation memory
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a method and apparatus for data translation. The method includes: according to a user's input content, providing, through a first preset model, translation candidate words for each translation unit corresponding to the input content; and inputting the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, into a second preset model, which computes and generates the translation. The invention solves the technical problem of inaccurate translation caused by machine translation in the prior art.
Description
Technical field
The present invention relates to the field of computer technology applications, and in particular to a method and apparatus for data translation.
Background
From a research perspective, machine translation can be divided into rule-based machine translation and corpus-based machine translation. Corpus-based methods can in turn be divided into example-based methods and statistics-based methods. At present, statistics-based methods have become the mainstream. Companies such as Baidu, Google, Microsoft, Youdao, and Alibaba have all invested substantial funds in researching machine translation technology.
Statistical machine translation has greatly improved translation quality, and these machine translation systems are widely used in the daily lives of Internet users. From an application standpoint, however, they still face a number of problems, mainly the following:
(1) Translation quality is not high enough. For certain specific language pairs and domains, machine translation has reached an acceptable (readable) level, and users can largely understand the main content of the original text with the help of a machine translation system, for example the French-to-English and Arabic-to-English translation (in the news domain) provided by Google. In the general case, however, the quality of machine translation still cannot meet users' needs. English-Chinese and Chinese-English translation are typical examples. Chinese-English is almost the most studied language pair in machine translation research, and its corpora are extremely large, on the order of millions to tens of millions of sentence pairs, yet the performance of machine translation systems remains unsatisfactory: translations that are not fluent, or whose meaning cannot be understood at all, are still common.
(2) Translation results are not trustworthy enough. For many users, even when machine translation accuracy is not high, a system that could accurately indicate which translation results are reliable and which are not would still save a great deal of time and money. Current machine translation systems, however, can do nothing in this respect, so correct and incorrect results are mixed together and the usability of machine translation is greatly reduced. Moreover, even in some narrow domains, machine translation has not yet achieved high reliability.
Therefore, in application fields with strict requirements on translation quality, human translation or computer-aided translation still cannot be replaced.
No effective solution has yet been proposed for the above problem of inaccurate translation caused by machine translation in the prior art.
Summary of the invention
Embodiments of the present invention provide a method and apparatus for data translation, so as to at least solve the technical problem of inaccurate translation caused by machine translation in the prior art.
According to one aspect of the embodiments of the present invention, a method of data translation is provided, including: according to a user's input content, providing, through a first preset model, translation candidate words for each translation unit corresponding to the input content; and inputting the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, into a second preset model, which computes and generates the translation.
Optionally, before the translation candidate words of each translation unit corresponding to the input content are provided through the first preset model according to the user's input content, the method further includes: converting the input content from a first preset format into a second preset format, where the second preset format is the format used for translating data.
Further, optionally, converting the input content from the first preset format into the second preset format includes: in the case where the second preset format includes text, parsing the first preset format of the input content, where the first preset format includes picture, sound, and text; in the case where the first preset format is a picture, parsing the picture through image recognition and converting the information in the picture into text; in the case where the first preset format is sound, extracting the speech in the sound through voice conversion and converting the speech into text; and in the case where the first preset format is text, determining the text to be the text to be translated.
Optionally, before the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model, the method further includes: obtaining translation features of the input content. The translation features include positive fragment candidates, negative fragments, and unoperated fragments. A positive fragment is a source-language fragment/target-language fragment pair generated when the user interactively selects words; the target-language fragment may be generated by choosing directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate. A negative fragment is a pair, generated after the user operates on the interactive interface, of a source-language fragment and a target-language fragment that the source-language fragment must not be translated into. An unoperated fragment is a fragment that remains after the user's positive and negative operations, that is, one on which the user performed neither.
Optionally, before the input content is converted from the first preset format into the second preset format, the method further includes: receiving the input content entered by the user.
According to another aspect of the embodiments of the present invention, an apparatus for data translation is provided, including: a candidate word acquisition module, configured to provide, according to a user's input content and through a first preset model, translation candidate words for each translation unit corresponding to the input content; and a translation module, configured to input the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, into a second preset model, which computes and generates the translation.
Optionally, the apparatus further includes: a conversion module, configured to convert the input content from a first preset format into a second preset format before the translation candidate words of each translation unit corresponding to the input content are provided through the first preset model according to the user's input content, where the second preset format is the format used for translating data.
Further, optionally, the conversion module includes: a first parsing unit, configured to parse the first preset format of the input content in the case where the second preset format includes text, where the first preset format includes picture, sound, and text; a first conversion unit, configured to parse the picture through image recognition and convert the information in the picture into text in the case where the first preset format is a picture; a second conversion unit, configured to extract the speech in the sound through voice conversion and convert the speech into text in the case where the first preset format is sound; and a third conversion unit, configured to determine the text to be the text to be translated in the case where the first preset format is text.
Optionally, the apparatus further includes: a translation feature acquisition module, configured to obtain translation features of the input content before the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model. The translation features include positive fragment candidates, negative fragments, and unoperated fragments. A positive fragment is a source-language fragment/target-language fragment pair generated when the user interactively selects words; the target-language fragment may be generated by choosing directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate. A negative fragment is a pair, generated after the user operates on the interactive interface, of a source-language fragment and a target-language fragment that the source-language fragment must not be translated into. An unoperated fragment is a fragment that remains after the user's positive and negative operations, that is, one on which the user performed neither.
Optionally, the apparatus further includes: a receiving module, configured to receive the input content entered by the user before the input content is converted from the first preset format into the second preset format.
According to yet another aspect of the embodiments of the present invention, a storage medium is provided. The storage medium includes a stored program, where, when the program runs, the device on which the storage medium resides is controlled to perform the above method of data translation.
According to yet another aspect of the embodiments of the present invention, a processor is provided. The processor is configured to run a program, where the program, when running, performs the above method of data translation.
In the embodiments of the present invention, translation candidate words for each translation unit corresponding to the input content are provided through the first preset model according to the user's input content; the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model, which computes and generates the translation. This achieves the purpose of improving translation precision, thereby realizing the technical effect of improving translation accuracy and solving the technical problem of inaccurate translation caused by machine translation in the prior art.
Brief description of the drawings
The accompanying drawings described here are provided for further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic flowchart of a method of data translation according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of obtaining the Chunk candidate list T corresponding to S in the method of data translation according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method of data translation according to an embodiment of the present invention;
Fig. 3a is a schematic diagram of a candidate list in a method of data translation according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an apparatus for data translation according to an embodiment of the present invention.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and are not used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Embodiment one
According to an embodiment of the present invention, a method embodiment of data translation is provided. It should be noted that the steps illustrated in the flowcharts of the drawings may be performed in a computer system, such as one executing a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.
Fig. 1 is a schematic flowchart of a method of data translation according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102: according to the user's input content, provide, through the first preset model, translation candidate words for each translation unit corresponding to the input content;
Step S104: input the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, into the second preset model, which computes and generates the translation.
In the method of data translation provided by the embodiments of the present application, translation candidate words for each translation unit corresponding to the input content are provided through the first preset model according to the user's input content; the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model, which computes and generates the translation. This achieves the purpose of improving translation precision, thereby realizing the technical effect of improving translation accuracy and solving the technical problem of inaccurate translation caused by machine translation in the prior art.
Optionally, before the translation candidate words of each translation unit corresponding to the input content are provided through the first preset model according to the user's input content in step S102, the method of data translation provided by the present application further includes:
Step S101: convert the input content from the first preset format into the second preset format, where the second preset format is the format used for translating data.
Further, optionally, converting the input content from the first preset format into the second preset format in step S101 includes:
Step1: in the case where the second preset format includes text, parse the first preset format of the input content, where the first preset format includes picture, sound, and text;
Step2: in the case where the first preset format is a picture, parse the picture through image recognition and convert the information in the picture into text;
Step3: in the case where the first preset format is sound, extract the speech in the sound through voice conversion and convert the speech into text;
Step4: in the case where the first preset format is text, determine the text to be the text to be translated.
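The format dispatch in Step1 to Step4 can be illustrated with a minimal Python sketch. The names `InputFormat`, `UserInput`, and `to_text`, and the two recognizer callables, are hypothetical and are not taken from the application, which does not specify the image recognition or voice conversion components; the recognizers are passed in as parameters so that any engine could be substituted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Union


class InputFormat(Enum):
    """The three first preset formats named in Step1 (hypothetical enum)."""
    PICTURE = "picture"
    SOUND = "sound"
    TEXT = "text"


@dataclass
class UserInput:
    fmt: InputFormat
    payload: Union[str, bytes]  # str for text, raw bytes for picture or sound


def to_text(
    user_input: UserInput,
    recognize_image: Callable[[bytes], str],   # assumed image recognition engine
    recognize_speech: Callable[[bytes], str],  # assumed voice conversion engine
) -> str:
    """Convert the input content into the second preset format (text)."""
    if user_input.fmt is InputFormat.PICTURE:
        return recognize_image(user_input.payload)   # Step2
    if user_input.fmt is InputFormat.SOUND:
        return recognize_speech(user_input.payload)  # Step3
    if user_input.fmt is InputFormat.TEXT:
        return user_input.payload                    # Step4: used as-is
    raise ValueError(f"unsupported input format: {user_input.fmt}")
```

Passing the recognizers as arguments keeps the sketch independent of any particular recognition library; direct text input bypasses both, matching Step4.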
Optionally, before the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model in step S104, the method of data translation provided by the present application further includes:
Step S103: obtain the translation features of the input content. The translation features include positive fragment candidates, negative fragments, and unoperated fragments.
A positive fragment is a source-language fragment/target-language fragment pair generated when the user interactively selects words. The target-language fragment may be generated by choosing directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate.
A negative fragment is a pair, generated after the user operates on the interactive interface, of a source-language fragment and a target-language fragment that the source-language fragment must not be translated into.
An unoperated fragment is a fragment that remains after the user's positive and negative operations, that is, one on which the user performed neither.
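One possible way to represent the three feature sets of step S103 is sketched below in Python. The labels `enum_candidate`, `enum_combine`, `enum_del`, and `enum_none` are the ones used in the worked example later in this description; `enum_custom` and `enum_modify` are hypothetical labels introduced here for the two generation modes that the example does not name. The class and function names are likewise assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, List


class Op(Enum):
    CANDIDATE = "enum_candidate"  # chosen directly from the candidates
    COMBINE = "enum_combine"      # generated by combining several candidates
    CUSTOM = "enum_custom"        # hypothetical label: user-defined addition
    MODIFY = "enum_modify"        # hypothetical label: partial modification
    DELETE = "enum_del"           # user rejected this target fragment
    NONE = "enum_none"            # user did not operate on this fragment


POSITIVE_OPS = {Op.CANDIDATE, Op.COMBINE, Op.CUSTOM, Op.MODIFY}


@dataclass(frozen=True)
class FragmentPair:
    source: str  # source-language fragment
    target: str  # target-language fragment
    op: Op


@dataclass
class TranslationFeatures:
    positive: List[FragmentPair] = field(default_factory=list)
    negative: List[FragmentPair] = field(default_factory=list)
    unoperated: List[FragmentPair] = field(default_factory=list)


def partition(pairs: Iterable[FragmentPair]) -> TranslationFeatures:
    """Split fragment pairs into the three feature sets by operation type."""
    features = TranslationFeatures()
    for pair in pairs:
        if pair.op in POSITIVE_OPS:
            features.positive.append(pair)
        elif pair.op is Op.DELETE:
            features.negative.append(pair)
        else:
            features.unoperated.append(pair)
    return features
```

Keeping the operation type on each pair means the same record can be logged from the interactive interface and later regrouped without losing how the target fragment was produced.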
Optionally, before the input content is converted from the first preset format into the second preset format in step S101, the method further includes:
Step S100: receive the input content entered by the user.
In summary, the method of data translation provided by the embodiments of the present application proceeds as follows:
(1) The user's input is obtained by a user input capture module and is ultimately converted into the textual form S-IN = {CH1, CH2, ..., CHi-2, CHi-1, CHi, CHi+1, ..., CHn}. (Voice input is converted into text by a voice conversion module; picture input is converted into text by image recognition; direct text input undergoes no conversion.)
Specifically, the conversion into textual form in step (1) above corresponds to step S102 in the embodiments of the present application. That is, the collected user input data is converted from the first preset format into the second preset format, where the first preset format is the initial format of the collected user input data, such as sound (voice), picture, or text. In the case where the initial format of the collected data is text, the text is used directly as the text to be translated.
(2) According to the text S, the Chunk candidate list T corresponding to S is obtained by a first candidate generation module M1.
Specifically, obtaining the Chunk candidate list T corresponding to S in step (2) above corresponds to step S104 in the embodiments of the present application. That is, the candidate list corresponding to the data converted into the second preset format is obtained: the character strings in the data converted into the second preset format are obtained, and the candidate list corresponding to those character strings is obtained. Fig. 2 is a schematic diagram of obtaining the Chunk candidate list T corresponding to S in the method of data translation according to an embodiment of the present invention; see Fig. 2 for details.
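The application does not disclose how the first candidate generation module M1 builds the Chunk candidate list. Purely as an illustration, the Python sketch below assumes a small phrase table and greedy longest-match chunking over pre-tokenized input; the function name `chunk_candidates`, the `PhraseTable` type, and the `max_len` parameter are all hypothetical, and a real M1 would presumably be a statistical model rather than a dictionary lookup.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase table: a tuple of source tokens maps to ranked candidates.
PhraseTable = Dict[Tuple[str, ...], List[str]]


def chunk_candidates(
    tokens: List[str], table: PhraseTable, max_len: int = 3
) -> List[Tuple[str, List[str]]]:
    """Return (chunk, candidates) pairs using greedy longest-match lookup."""
    result: List[Tuple[str, List[str]]] = []
    i = 0
    while i < len(tokens):
        # Try the longest span first, then progressively shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in table:
                result.append((" ".join(key), list(table[key])))
                i += n
                break
        else:
            # No entry covers this token: emit it as a chunk with no candidates.
            result.append((tokens[i], []))
            i += 1
    return result
```

Each returned pair corresponds to one row of a candidate list of the kind shown to the user for word selection; chunks with an empty candidate list would need user-defined input.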
(3) The user's word-selection behavior is captured by a user behavior capture module, yielding the user word-selection list SELECT-LIST = {T2, ..., Ti}.
Specifically, capturing the user's word-selection behavior through the user behavior capture module and obtaining the user word-selection list in step (3) above corresponds to step S106 in the embodiments of the present application. That is, the previously obtained historical word-selection behavior of the user is parsed to obtain the corresponding user word-selection list.
(4) A second candidate generation module M2 generates the translation S-OUT according to the user's input S-IN and SE-LIST.
In summary, Fig. 3 is a schematic flowchart of a method of data translation according to an embodiment of the present invention; see Fig. 3 for details.
For example, let the unit to be translated be: "All these criteria are obviously based on an optimistic prediction of the maximum speed." The candidate list corresponding to the data in the second format is then as shown in Fig. 3a, which is a schematic diagram of a candidate list in a method of data translation according to an embodiment of the present invention.
Step1: Obtain the translation features.
Step2: Features. The input features of this part consist of three parts:
Positive fragments (set): a positive fragment is a source-language fragment/target-language fragment pair generated when the user interactively selects words. The target-language fragment may be generated by choosing directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate.
For example:
("obviously", obviously, enum_candidate)
("maximum", maximum, enum_combine)
……
Negative fragments (set): a negative fragment is a pair, generated after the user operates on the interactive interface, of a source-language fragment and a target-language fragment that the source-language fragment must not be translated into.
For example:
(large, root swelling, enum_del)
Unoperated fragments (set): the fragments that remain after the user's positive and negative operations, that is, those on which the user performed neither.
For example:
(all, " ", enum_none)
(these, " ", enum_none)
……
(prediction, predication, enum_none)
The features produced in this step are the focus of the present application; specifically, they are features produced by human-computer interaction behavior.
The features produced in Step2 are input into the second preset translation model, which generates the candidate translation.
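The application does not describe the internals of the second preset translation model. To make the role of the three feature sets concrete, the following Python sketch uses a simple rule-based stand-in, which is an assumption and not the disclosed model: a positive fragment forces its target, a negative fragment removes a candidate from consideration, and an unoperated chunk falls back to its highest-ranked remaining candidate. The function name and argument shapes are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def generate_translation(
    chunks: List[Tuple[str, List[str]]],  # (source chunk, ranked candidates)
    positive: Dict[str, str],             # source chunk -> target fixed by user
    negative: Set[Tuple[str, str]],       # (source, target) pairs user rejected
) -> str:
    """Rule-based stand-in for the second preset model (illustrative only)."""
    output: List[str] = []
    for source, candidates in chunks:
        if source in positive:
            # The user's choice always overrides the model's ranking.
            output.append(positive[source])
            continue
        # Drop every candidate the user explicitly rejected for this chunk.
        allowed = [t for t in candidates if (source, t) not in negative]
        # Fall back to the untranslated source if nothing usable remains.
        output.append(allowed[0] if allowed else source)
    return " ".join(output)
```

In this sketch the chunks are emitted in source order; a real model would also have to handle reordering and fluency, which the sketch deliberately leaves out.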
The method of data translation provided by the embodiments of the present application applies to translation tasks with high requirements on translation quality (such as engineering translation, manual translation, and document translation). As the cost of purely human translation rises rapidly (translation is currently slow and expensive), computer-aided translation systems are commonly used in translators' work. Traditional computer-aided translation mostly relies on translation memory, while post-editing based on machine translation is now being adopted by more and more companies. The method of data translation provided by the embodiments of the present application makes full use of the strong memory capability of computers: for the first time, it performs CHUNK-level recognition on the input, obtains a high-precision CHUNK-level bilingual translation table from the translator's selection behavior, and then uses an automatic translation generation module, together with the user's word-selection results already obtained, to generate a more precise translation, so that the machine translation result is more accurate and matches the translator's wording habits.
Embodiment two
According to another aspect of the embodiments of the present invention, an apparatus for data translation is also provided. Fig. 4 is a schematic structural diagram of an apparatus for data translation according to an embodiment of the present invention. As shown in Fig. 4, the apparatus includes:
a candidate word acquisition module 42, configured to provide, according to a user's input content and through a first preset model, translation candidate words for each translation unit corresponding to the input content; and a translation module 44, configured to input the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, into a second preset model, which computes and generates the translation.
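The two-module structure of Fig. 4 can be sketched as a small Python class that wires a candidate word acquisition module to a translation module. The class name `TranslationApparatus` and the callable signatures are assumptions made for illustration; the application does not prescribe any programming interface for modules 42 and 44.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class TranslationApparatus:
    # Stands in for module 42: input content -> candidates per translation unit.
    candidate_word_module: Callable[[str], List[Any]]
    # Stands in for module 44: (candidates, interaction features) -> translation.
    translation_module: Callable[[List[Any], Any], str]

    def translate(self, input_content: str, features: Any) -> str:
        """Run the first model, then feed its output and the features to the second."""
        candidates = self.candidate_word_module(input_content)
        return self.translation_module(candidates, features)
```

Treating each module as an injected callable mirrors the later remark in this description that the division into units is only a logical one and that units may be combined or integrated differently in an actual implementation.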
In the apparatus for data translation provided by the embodiments of the present application, translation candidate words for each translation unit corresponding to the input content are provided through the first preset model according to the user's input content; the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model, which computes and generates the translation. This achieves the purpose of improving translation precision, thereby realizing the technical effect of improving translation accuracy and solving the technical problem of inaccurate translation caused by machine translation in the prior art.
Optionally, the apparatus for data translation provided by the embodiments of the present application further includes: a conversion module, configured to convert the input content from a first preset format into a second preset format before the translation candidate words of each translation unit corresponding to the input content are provided through the first preset model according to the user's input content, where the second preset format is the format used for translating data.
Further, optionally, the conversion module includes: a first parsing unit, configured to parse the first preset format of the input content in the case where the second preset format includes text, where the first preset format includes picture, sound, and text; a first conversion unit, configured to parse the picture through image recognition and convert the information in the picture into text in the case where the first preset format is a picture; a second conversion unit, configured to extract the speech in the sound through voice conversion and convert the speech into text in the case where the first preset format is sound; and a third conversion unit, configured to determine the text to be the text to be translated in the case where the first preset format is text.
Optionally, the apparatus for data translation provided by the embodiments of the present application further includes: a translation feature acquisition module, configured to obtain translation features of the input content before the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from the user's interactive operations, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model. The translation features include positive fragment candidates, negative fragments, and unoperated fragments. A positive fragment is a source-language fragment/target-language fragment pair generated when the user interactively selects words; the target-language fragment may be generated by choosing directly from the candidates, by combining multiple candidates, by user-defined addition, or by partially modifying an existing candidate. A negative fragment is a pair, generated after the user operates on the interactive interface, of a source-language fragment and a target-language fragment that the source-language fragment must not be translated into. An unoperated fragment is a fragment that remains after the user's positive and negative operations, that is, one on which the user performed neither.
Optionally, the apparatus for data translation provided by the embodiments of the present application further includes: a receiving module, configured to receive the input content entered by the user before the input content is converted from the first preset format into the second preset format.
Embodiment three
According to yet another aspect of the embodiments of the present invention, a storage medium is provided. The storage medium includes a stored program, where, when the program runs, the device on which the storage medium resides is controlled to perform the above method of data translation.
Embodiment four
According to yet another aspect of the embodiments of the present invention, a processor is provided. The processor is configured to run a program, where the program, when running, performs the above method of data translation.
The above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in each embodiment of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.
Claims (10)
- 1. A data translation method, characterized by comprising: providing, according to input content of a user and by means of a first preset model, translation candidate words for each translation unit corresponding to the input content; and inputting the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from user interaction, together with the translation candidate words of each translation unit corresponding to the input content, into a second preset model, and computing and generating a translation by means of the second preset model.
- 2. The method according to claim 1, characterized in that, before providing, according to the input content of the user and by means of the first preset model, the translation candidate words for each translation unit corresponding to the input content, the method further comprises: converting the input content from a first preset format to a second preset format, wherein the second preset format is a format used for translating the data.
- 3. The method according to claim 2, characterized in that converting the input content from the first preset format to the second preset format comprises: in the case where the second preset format includes text, parsing the first preset format of the input content, wherein the first preset format includes picture, sound, and text; in the case where the first preset format is the picture, parsing the picture by image recognition and converting the information in the picture into the text; in the case where the first preset format is the sound, extracting the pronunciation in the sound by voice conversion and converting the pronunciation into the text; and in the case where the first preset format is the text, determining the text as the text to be translated.
- 4. The method according to claim 1, characterized in that, before inputting the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from user interaction, together with the translation candidate words of each translation unit corresponding to the input content, into the second preset model, the method further comprises: obtaining translation features of the input content, the translation features including the positive fragment candidates, the negative fragments, and the unoperated fragments; wherein a positive fragment is a source-language-to-target-language fragment pair generated when the user interactively selects words, and the ways of generating the target-language fragment include: choosing directly from the candidates, combining multiple candidates, user-defined addition, and partially modifying an existing candidate; a negative fragment is a source-language-to-target-language fragment pair, generated after the user operates on the interactive interface, in which the source language must not be translated into that target fragment; and the unoperated fragments are the remaining fragments on which the user has performed neither a positive nor a negative operation.
- 5. The method according to claim 2, characterized in that, before converting the input content from the first preset format to the second preset format, the method further comprises: receiving the input content keyed in by the user.
- 6. A data translation device, characterized by comprising: a candidate word acquisition module, configured to provide, according to input content of a user and by means of a first preset model, translation candidate words for each translation unit corresponding to the input content; and a translation module, configured to input the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from user interaction, together with the translation candidate words of each translation unit corresponding to the input content, into a second preset model, and to compute and generate a translation by means of the second preset model.
- 7. The device according to claim 6, characterized in that the device further comprises: a conversion module, configured to convert the input content from a first preset format to a second preset format before the translation candidate words for each translation unit corresponding to the input content are provided according to the input content of the user by means of the first preset model, wherein the second preset format is a format used for translating the data.
- 8. The device according to claim 7, characterized in that the conversion module comprises: a first parsing unit, configured to parse the first preset format of the input content in the case where the second preset format includes text, wherein the first preset format includes picture, sound, and text; a first conversion unit, configured to parse the picture by image recognition and convert the information in the picture into the text in the case where the first preset format is the picture; a second conversion unit, configured to extract the pronunciation in the sound by voice conversion and convert the pronunciation into the text in the case where the first preset format is the sound; and a third conversion unit, configured to determine the text as the text to be translated in the case where the first preset format is the text.
- 9. The device according to claim 6, characterized in that the device further comprises: a translation feature acquisition module, configured to obtain translation features of the input content before the previously obtained positive fragment candidates, negative fragments, and unoperated fragments resulting from user interaction, together with the translation candidate words of each translation unit corresponding to the input content, are input into the second preset model; the translation features including the positive fragment candidates, the negative fragments, and the unoperated fragments; wherein a positive fragment is a source-language-to-target-language fragment pair generated when the user interactively selects words, and the ways of generating the target-language fragment include: choosing directly from the candidates, combining multiple candidates, user-defined addition, and partially modifying an existing candidate; a negative fragment is a source-language-to-target-language fragment pair, generated after the user operates on the interactive interface, in which the source language must not be translated into that target fragment; and the unoperated fragments are the remaining fragments on which the user has performed neither a positive nor a negative operation.
- 10. The device according to claim 7, characterized in that the device further comprises: a receiving module, configured to receive the input content keyed in by the user before the input content is converted from the first preset format to the second preset format.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710589392.3A CN107423293A (en) | 2017-07-18 | 2017-07-18 | The method and apparatus of data translation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107423293A true CN107423293A (en) | 2017-12-01 |
Family
ID=60430184
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710589392.3A Pending CN107423293A (en) | 2017-07-18 | 2017-07-18 | The method and apparatus of data translation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107423293A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108682201A (en) * | 2018-03-30 | 2018-10-19 | 华北水利水电大学 | A kind of English teaching system based on intelligent terminal |
CN109558597A (en) * | 2018-12-17 | 2019-04-02 | 北京百度网讯科技有限公司 | Text interpretation method and device, equipment and storage medium |
CN109558597B (en) * | 2018-12-17 | 2022-05-24 | 北京百度网讯科技有限公司 | Text translation method and device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102577514B1 (en) | Method, apparatus for text generation, device and storage medium | |
CN108287858B (en) | Semantic extraction method and device for natural language | |
WO2020186778A1 (en) | Error word correction method and device, computer device, and storage medium | |
CN111241237B (en) | Intelligent question-answer data processing method and device based on operation and maintenance service | |
CN110750959A (en) | Text information processing method, model training method and related device | |
CN114580382A (en) | Text error correction method and device | |
CN107729313A (en) | The method of discrimination and device of multitone character pronunciation based on deep neural network | |
CN108124477A (en) | Segmenter is improved based on pseudo- data to handle natural language | |
CN111445898B (en) | Language identification method and device, electronic equipment and storage medium | |
CN111599340A (en) | Polyphone pronunciation prediction method and device and computer readable storage medium | |
CN110211562B (en) | Voice synthesis method, electronic equipment and readable storage medium | |
CN114757176A (en) | Method for obtaining target intention recognition model and intention recognition method | |
CN112463942A (en) | Text processing method and device, electronic equipment and computer readable storage medium | |
CN115545041B (en) | Model construction method and system for enhancing semantic vector representation of medical statement | |
JP2020135135A (en) | Dialog content creation assisting method and system | |
CN110968725A (en) | Image content description information generation method, electronic device, and storage medium | |
CN115631261A (en) | Training method of image generation model, image generation method and device | |
CN107423293A (en) | The method and apparatus of data translation | |
CN116913278B (en) | Voice processing method, device, equipment and storage medium | |
CN106484660A (en) | Title treating method and apparatus | |
CN110516125A (en) | Method, device and equipment for identifying abnormal character string and readable storage medium | |
CN110162615A (en) | A kind of intelligent answer method, apparatus, electronic equipment and storage medium | |
CN111090720B (en) | Hot word adding method and device | |
CN113204966A (en) | Corpus augmentation method, apparatus, device and storage medium | |
KR101543024B1 (en) | Method and Apparatus for Translating Word based on Pronunciation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20171201 |