CN112487831A - Split type artificial intelligence translation method - Google Patents

Split type artificial intelligence translation method

Info

Publication number
CN112487831A
Authority
CN
China
Prior art keywords
sentence
single sentence
translation
artificial intelligence
document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011352378.XA
Other languages
Chinese (zh)
Inventor
单杰
王璐
杨丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Sunyu Information Technology Co ltd
Original Assignee
Jiangsu Sunyu Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Sunyu Information Technology Co ltd filed Critical Jiangsu Sunyu Information Technology Co ltd
Priority to CN202011352378.XA priority Critical patent/CN112487831A/en
Publication of CN112487831A publication Critical patent/CN112487831A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a split artificial intelligence translation method, which belongs to the technical field of intelligent translation and comprises the following steps: S1, importing a document; S2, traversing the document and splitting it into a plurality of sub-documents; S3, breaking each sub-document into single sentences and marking each single sentence with a sequence number in order; S4, extracting the nouns in each single sentence and giving them a preliminary paraphrase to obtain the single-sentence keywords; S5, translating each single sentence to obtain a paraphrase sentence; S6, performing proximity adjustment on each paraphrase sentence according to the single-sentence keywords of the preceding sentence; S7, arranging all the adjusted paraphrase sentences to obtain the translation and displaying it.

Description

Split type artificial intelligence translation method
Technical Field
The invention belongs to the technical field of intelligent translation, and particularly relates to a split type artificial intelligence translation method.
Background
In today's society, international communication has become an everyday matter. The volume of translation work it brings keeps growing, and so does the word count of the documents involved.
With the current popularity of CAT (computer-aided translation) technology, translation speed has improved greatly. However, the preparation before translation often takes a considerable amount of time. For example, when a translation company distributes a document of about 30,000 words among 10 people, splitting and distributing the document used to take half a day or more. Worse, when the text is very long, visual confusion during segmentation easily leads to incorrect splits. Many documents also contain a large number of identical paragraphs, which are then translated repeatedly to no purpose. This quietly adds to the translation cost.
However, the splitting tools on the market are all designed to break a whole file into pieces so that it is easier to carry, and their algorithms split by byte stream. This approach is essentially useless for splitting text. As a result, when a file with a large amount of text must be translated, the translation industry often needs several people to translate at the same time, and a great deal of time is spent splitting the text before translation.
Therefore, a translation tool is needed that splits documents directly while keeping the split translations related to one another, so that the translation stays semantically consistent from one part to the next.
Disclosure of Invention
1. Technical problem to be solved by the invention
The object of the present invention is to solve the above drawbacks.
2. Technical scheme
In order to achieve the purpose, the technical scheme provided by the invention is as follows:
The invention discloses a split type artificial intelligence translation method, which comprises the following steps:
S1, importing a document;
S2, traversing the document and splitting it into a plurality of sub-documents;
S3, breaking each sub-document into single sentences and marking each single sentence with a sequence number in order;
S4, extracting the nouns in each single sentence and giving them a preliminary paraphrase to obtain the single-sentence keywords;
S5, translating each single sentence to obtain a paraphrase sentence;
S6, performing proximity adjustment on each paraphrase sentence according to the single-sentence keywords of the preceding sentence;
and S7, arranging all the adjusted paraphrase sentences to obtain the translation and displaying it.
Preferably, the numbering method of step S3 is n-k, where n denotes the number of the sub-document, k denotes the number of the single sentence in the sub-document, and n and k are both natural numbers.
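The n-k numbering can be illustrated with a minimal Python sketch. The sentence-boundary rule (a sentence ends at `.`, `!`, `?` or the CJK marks `。！？`) and the function names are assumptions made for illustration; the patent does not specify how sentence breaking is performed.

```python
import re

# Assumption: a sentence is a run of text followed by ., !, ? or 。！？
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*")

def split_sentences(text):
    """Break a sub-document into single sentences (step S3)."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]

def label_sentences(sub_documents):
    """Map each 'n-k' label to its sentence; n and k both start at 1."""
    labels = {}
    for n, sub_doc in enumerate(sub_documents, start=1):
        for k, sentence in enumerate(split_sentences(sub_doc), start=1):
            labels[f"{n}-{k}"] = sentence
    return labels
```

With this labeling, the second sentence of the first sub-document is stored under the key `1-2`.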
Preferably, after step S5, there is the following step:
S5.1, listing the multiple paraphrases of each single-sentence keyword and selecting them one at a time;
S5.2, comparing the selected paraphrase of the keyword with the paraphrase of the next single sentence and judging whether a logical relation exists; if so, performing step S6; otherwise, returning to step S5.1, until all paraphrases of the keyword have been judged.
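The loop formed by steps S5.1 and S5.2 might look like the following Python sketch. The patent does not define how a "logical relation" is judged, so the `is_related` predicate is passed in by the caller, and `shares_word` is only a hypothetical stand-in for it.

```python
def choose_paraphrase(keyword_paraphrases, next_sentence_paraphrase, is_related):
    """Try the keyword's candidate paraphrases in order (S5.1) and return the
    first one judged to be related to the next sentence's paraphrase (S5.2).
    Returns None when every candidate has been judged without a match."""
    for candidate in keyword_paraphrases:
        if is_related(candidate, next_sentence_paraphrase):
            return candidate
    return None

def shares_word(candidate, sentence):
    """Hypothetical relation test: the candidate occurs as a word in the sentence."""
    return candidate.lower() in sentence.lower().split()
```

Returning `None` corresponds to the case in which all paraphrases have been judged and none fits; what the system does next in that case is not stated in the source.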
Preferably, after step S6, there is the following step:
S6.1, underlining each single sentence that has undergone proximity adjustment;
S6.2, producing a multi-meaning translation of each underlined single sentence;
S6.3, forming a hyperlink from the multi-meaning translation and binding the hyperlink to the underline.
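One possible way to realize steps S6.1 to S6.3 is to render each adjusted sentence as underlined HTML with a link that carries the alternative renderings. The patent does not name a markup format, so HTML, the `#alt-` anchor scheme and the function name are all assumptions of this sketch.

```python
from html import escape

def render_adjusted(label, adjusted, alternatives):
    """Underline an adjusted sentence (S6.1) and bind a hyperlink to the
    underline (S6.3) whose tooltip lists the alternative renderings (S6.2)."""
    tooltip = escape(" | ".join(alternatives), quote=True)
    anchor = escape(label, quote=True)
    return f'<u><a href="#alt-{anchor}" title="{tooltip}">{escape(adjusted)}</a></u>'
```

Escaping every piece of text keeps the markup well-formed even when a sentence contains `<`, `&` or quotation marks.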
Preferably, step S7 is preceded by the following steps:
S6.4, extracting the subjects that appear more than twice in each sub-document;
S6.5, judging whether subjects that are identical in the original text have the same translation result in the translated text; if so, displaying them normally; otherwise, performing step S6.6;
and S6.6, displaying the subjects whose translation results differ in an editable state with a background color added.
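The consistency check in steps S6.4 to S6.6 can be sketched in Python as follows. The input format (pairs of source subject and translated subject, already extracted) is an assumption, as is reading "more than two times" as at least three occurrences; subject extraction itself is not shown.

```python
from collections import defaultdict

def find_inconsistent_subjects(subject_pairs, min_count=3):
    """subject_pairs: (source_subject, translated_subject) tuples.
    Returns {source: sorted translations} for subjects that occur at least
    min_count times (S6.4) and were translated in more than one way (S6.5).
    These are the subjects that S6.6 would show as editable and highlighted."""
    counts = defaultdict(int)
    translations = defaultdict(set)
    for source, target in subject_pairs:
        counts[source] += 1
        translations[source].add(target)
    return {
        source: sorted(targets)
        for source, targets in translations.items()
        if counts[source] >= min_count and len(targets) > 1
    }
```

A subject that is frequent but always translated the same way is left out of the result, which matches the "display normally" branch of S6.5.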
Preferably, after step S7, there is the following step:
S8, judging whether the operator has modified any text in the translation; if so, packaging the modified part together with the original text to obtain a summary document; otherwise, performing no operation;
and S8.1, uploading the summary document to the cloud.
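Step S8 could be sketched as a comparison between the displayed translation and the operator's edited version. The JSON layout, the field names and the use of n-k labels as keys are assumptions for illustration; the upload in S8.1 is left out because the source names no cloud service or interface.

```python
import json

def build_summary_document(original, displayed, edited):
    """Each argument maps 'n-k' labels to text. Returns a JSON string holding
    every sentence the operator changed together with its original text, or
    None when nothing was modified (the 'no operation' branch of S8)."""
    changes = [
        {"label": label, "original": original[label],
         "machine": displayed[label], "operator": edited[label]}
        for label in displayed
        if edited.get(label, displayed[label]) != displayed[label]
    ]
    if not changes:
        return None
    return json.dumps({"changes": changes}, ensure_ascii=False)
```

Keeping the machine output next to the operator's correction is a design choice of this sketch; it would let a later learning step see what was changed, but the patent only requires the modified part and the original text.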
Preferably, in step S6, if the sentence being translated is the first sentence, it is translated directly.
Preferably, the splitting method in step S2 is to determine whether the document contains multiple paragraphs; if so, the document is split with the paragraph as the unit; otherwise, it is split with the sentence as the unit.
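The paragraph-or-sentence rule for step S2 can be sketched in Python. The paragraph delimiter (line breaks) and the sentence terminators are assumptions, since the source does not say how paragraphs or sentences are detected.

```python
import re

def split_document(document):
    """Split by paragraph when the document has several paragraphs,
    otherwise by sentence (step S2)."""
    # Assumption: paragraphs are separated by one or more line breaks.
    paragraphs = [p.strip() for p in re.split(r"\n+", document) if p.strip()]
    if len(paragraphs) > 1:
        return paragraphs
    # Assumption: a sentence ends at ., !, ? or the CJK marks 。！？
    sentences = re.findall(r"[^.!?。！？]+[.!?。！？]*", document)
    return [s.strip() for s in sentences if s.strip()]
```

For a single-paragraph document, each sentence becomes its own sub-document under this reading of the rule.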
Preferably, the proximity adjustment in step S6 selects, from among the matching meanings, the word sense that is used more often according to cloud big data.
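Selecting the more frequently used sense reduces to a maximum over usage counts. In the sketch below, `usage_counts` is a hypothetical stand-in for the statistics that the cloud big data would supply; how those statistics are gathered is not described in the source.

```python
def pick_word_sense(matching_senses, usage_counts):
    """Return the matching sense with the highest usage count. Senses missing
    from usage_counts count as 0, and ties keep the earlier sense."""
    if not matching_senses:
        return None
    return max(matching_senses, key=lambda sense: usage_counts.get(sense, 0))
```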
3. Advantageous effects
Compared with the prior art, the technical scheme provided by the invention has the following beneficial effects:
(1) According to the split artificial intelligence translation method, the document is split into a plurality of sub-documents that are translated simultaneously, which saves translation time.
(2) According to the split artificial intelligence translation method, when a sub-document is translated, the translation of every sentence except the first undergoes proximity adjustment according to the key nouns of the preceding sentence, which improves the accuracy of the translation as well as its faithfulness, expressiveness and elegance.
(3) According to the split artificial intelligence translation method, after all sub-documents have been translated, the system adjusts and unifies the translations according to the subjects appearing in the sub-documents; combined with manual intervention, this improves accuracy and the system's learning ability, and the results are fed back to the cloud big data.
Drawings
FIG. 1 is a flowchart of the split type artificial intelligence translation method of the present invention.
Detailed Description
In order to facilitate an understanding of the invention, the invention will now be described more fully hereinafter with reference to the accompanying drawings, in which several embodiments of the invention are shown, but which may be embodied in many different forms and are not limited to the embodiments described herein, but rather are provided for the purpose of providing a more thorough disclosure of the invention.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present; when an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present; the terms "vertical," "horizontal," "left," "right," and the like as used herein are for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs; the terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention; as used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Referring to FIG. 1, the split artificial intelligence translation method according to this embodiment includes the following steps:
S1, importing a document;
S2, traversing the document and splitting it into a plurality of sub-documents;
S3, breaking each sub-document into single sentences and marking each single sentence with a sequence number in order;
S4, extracting the nouns in each single sentence and giving them a preliminary paraphrase to obtain the single-sentence keywords;
S5, translating each single sentence to obtain a paraphrase sentence;
S6, performing proximity adjustment on each paraphrase sentence according to the single-sentence keywords of the preceding sentence;
and S7, arranging all the adjusted paraphrase sentences to obtain the translation and displaying it.
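The flow of steps S4 to S7 can be sketched as a loop over sub-documents and their sentences. The callables `translate`, `extract_keywords` and `adjust` are hypothetical placeholders supplied by the caller; the patent does not specify a translation engine or an adjustment algorithm. The sketch also processes sub-documents one after another for simplicity, whereas the method describes translating them simultaneously.

```python
def translate_document(sub_documents, translate, extract_keywords, adjust):
    """sub_documents: list of lists of single sentences (the result of S2/S3).
    Returns the adjusted paraphrase sentences in document order."""
    translation = []
    for sentences in sub_documents:
        previous_keywords = None  # reset at the start of every sub-document
        for sentence in sentences:
            paraphrase = translate(sentence)                        # S5
            if previous_keywords is not None:                       # not the first sentence
                paraphrase = adjust(paraphrase, previous_keywords)  # S6
            translation.append(paraphrase)
            previous_keywords = extract_keywords(sentence)          # S4
    return translation                                              # arranged for S7
```

Passing the three operations in as functions keeps the control flow visible without committing to any particular model or dictionary.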
The labeling method of step S3 in this embodiment is n-k, where n denotes the number of the sub-document, k denotes the number of the single sentence in the sub-document, and n and k are both natural numbers.
The following steps also exist after step S5 of the present embodiment:
S5.1, listing the multiple paraphrases of each single-sentence keyword and selecting them one at a time;
S5.2, comparing the selected paraphrase of the keyword with the paraphrase of the next single sentence and judging whether a logical relation exists; if so, performing step S6; otherwise, returning to step S5.1, until all paraphrases of the keyword have been judged.
The following steps also exist after step S6 of the present embodiment:
S6.1, underlining each single sentence that has undergone proximity adjustment;
S6.2, producing a multi-meaning translation of each underlined single sentence;
S6.3, forming a hyperlink from the multi-meaning translation and binding the hyperlink to the underline.
The following steps also exist before step S7 of the present embodiment:
S6.4, extracting the subjects that appear more than twice in each sub-document;
S6.5, judging whether subjects that are identical in the original text have the same translated content in the translated text; if so, displaying them normally; otherwise, performing step S6.6;
and S6.6, displaying the subjects whose translated content differs in an editable state with a background color added.
The following steps also exist after step S7 of the present embodiment:
S8, judging whether the operator has modified any text in the translation; if so, packaging the modified part together with the original text to obtain a summary document; otherwise, performing no operation;
and S8.1, uploading the summary document to the cloud.
In step S6 of the present embodiment, if the sentence being translated is the first sentence of its sub-document, i.e., the sentence with sequence number n-1, it is translated directly.
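Under the n-k numbering, finding the sentence whose keywords drive the adjustment is simple arithmetic on the label. The helper below is a sketch with an assumed name; it returns `None` for the first sentence of a sub-document, which is the case that is translated directly.

```python
def context_label(label):
    """Return the label of the preceding sentence in the same sub-document,
    or None when the sentence is the first one (k == 1)."""
    n, k = (int(part) for part in label.split("-"))
    return f"{n}-{k - 1}" if k > 1 else None
```

For sentence `2-3` this yields `2-2`, and for `2-1` it yields `None`, so sentences never borrow context from a different sub-document.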
In step S2 of this embodiment, the splitting method is to determine whether the document contains multiple paragraphs; if so, the document is split with the paragraph as the unit; otherwise, it is split with the sentence as the unit.
In step S6 of this embodiment, the proximity adjustment selects, from among the matching meanings, the word sense that is used more often according to cloud big data.
The above embodiments are illustrated as follows:
Example 1: a document contains 7 paragraphs of 5 sentences each. After traversal, the document is split into seven sub-documents numbered 1 to 7, and the sentences in each sub-document are numbered 1 to 5. Consider translating the third sentence of the second sub-document, i.e., sentence 2-3. The system extracts the nouns of sentence 2-2 as keywords, translates 2-3, and adjusts the paraphrase of 2-3 according to the keywords of 2-2. The adjusted paraphrase sentence of 2-3 is underlined, and the hyperlink bound to the underline contains the original meaning of 2-3 before adjustment and the common meaning indicated by big data. Every sentence in every sub-document is translated and adjusted in this way to obtain the translation. The system then extracts the nouns in each sub-document and judges whether identical nouns are translated identically; where they are not, the operator adjusts them by manual intervention. Finally, the translation is displayed, and the results of the operator's manual intervention are packaged and transmitted to the cloud for big data integration.
Example 2: a document contains only 1 paragraph, and the paragraph contains only 1 sentence. After the document is traversed, the sentence is labeled 1-1. When 1-1 is translated, it is determined to be the first sentence, so it is translated directly and the result is displayed. If the operator manually modifies some paraphrases, the modified content is integrated, packaged and transmitted to the cloud.
The above-mentioned embodiments only express a certain implementation mode of the present invention, and the description thereof is specific and detailed, but not construed as limiting the scope of the present invention; it should be noted that, for those skilled in the art, without departing from the concept of the present invention, several variations and modifications can be made, which are within the protection scope of the present invention; therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (9)

1. A split type artificial intelligence translation method, characterized by comprising the following steps:
S1, importing a document;
S2, traversing the document and splitting it into a plurality of sub-documents;
S3, breaking each sub-document into single sentences and marking each single sentence with a sequence number in order;
S4, extracting the nouns in each single sentence and giving them a preliminary paraphrase to obtain the single-sentence keywords;
S5, translating each single sentence to obtain a paraphrase sentence;
S6, performing proximity adjustment on each paraphrase sentence according to the single-sentence keywords of the preceding sentence;
and S7, arranging all the adjusted paraphrase sentences to obtain the translation and displaying it.
2. The split artificial intelligence translation method of claim 1, wherein: the labeling method of step S3 is n-k, where n denotes the number of the sub-document, k denotes the number of the single sentence in the sub-document, and both n and k are natural numbers.
3. The split artificial intelligence translation method according to claim 1, wherein step S5 is further followed by the following steps:
S5.1, listing the multiple paraphrases of each single-sentence keyword and selecting them one at a time;
S5.2, comparing the selected paraphrase of the keyword with the paraphrase of the next single sentence and judging whether a logical relation exists; if so, performing step S6; otherwise, returning to step S5.1, until all paraphrases of the keyword have been judged.
4. The split artificial intelligence translation method according to claim 1, wherein step S6 is further followed by the following steps:
S6.1, underlining each single sentence that has undergone proximity adjustment;
S6.2, producing a multi-meaning translation of each underlined single sentence;
and S6.3, forming a hyperlink from the multi-meaning translation and binding the hyperlink to the underline.
5. The split artificial intelligence translation method according to claim 1, wherein step S7 is preceded by the following steps:
S6.4, extracting the subjects that appear more than twice in each sub-document;
S6.5, judging whether subjects that are identical in the original text have the same translation result in the translated text; if so, displaying them normally; otherwise, performing step S6.6;
and S6.6, displaying the subjects whose translation results differ in an editable state with a background color added.
6. The split artificial intelligence translation method according to claim 1, wherein step S7 is further followed by the following steps:
S8, judging whether the operator has modified any text in the translation; if so, packaging the modified part together with the original text to obtain a summary document; otherwise, performing no operation;
and S8.1, uploading the summary document to the cloud.
7. The split artificial intelligence translation method of claim 1, wherein: in step S6, if the sentence being translated is the first sentence, it is translated directly.
8. The split artificial intelligence translation method of claim 1, wherein: the splitting method in step S2 is to determine whether the document contains multiple paragraphs; if so, the document is split with the paragraph as the unit; otherwise, it is split with the sentence as the unit.
9. The split artificial intelligence translation method of claim 1, wherein: the proximity adjustment in step S6 selects, from among the matching meanings, the word sense that is used more often according to cloud big data.
CN202011352378.XA 2020-11-27 2020-11-27 Split type artificial intelligence translation method Pending CN112487831A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011352378.XA CN112487831A (en) 2020-11-27 2020-11-27 Split type artificial intelligence translation method

Publications (1)

Publication Number Publication Date
CN112487831A (en) 2021-03-12

Family

ID=74935627

Country Status (1)

Country Link
CN (1) CN112487831A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110245358A (en) * 2018-03-09 2019-09-17 北京搜狗科技发展有限公司 A kind of machine translation method and relevant apparatus
CN110837742A (en) * 2019-11-15 2020-02-25 广州市汇泉翻译服务有限公司 Man-machine combined translation batch processing translation method containing artificial intelligence
CN111666774A (en) * 2020-04-24 2020-09-15 北京大学 Machine translation method and device based on document context
CN111680523A (en) * 2020-06-09 2020-09-18 语联网(武汉)信息技术有限公司 Man-machine collaborative translation system and method based on context semantic comparison

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"使用trados翻译时减少术语不一致的技巧" (Tips for reducing terminology inconsistency when translating with Trados), Retrieved from the Internet <URL: zhuanlan.zhihu.com/p/32884675> *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination