CN109241542A - A kind of text data processing method for English Translation - Google Patents

Info

Publication number
CN109241542A
Authority
CN
China
Prior art keywords
translation
sentence
data
word
english
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810993789.3A
Other languages
Chinese (zh)
Inventor
王萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiujiang University
Original Assignee
Jiujiang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiujiang University
Priority to CN201810993789.3A
Publication of CN109241542A
Legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/58: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a text data processing method for English translation, comprising the following steps: the text to be translated is divided into sentences; feature items in the text are then identified using a retrieval model; if a feature item is present, the sentence is sent to a manual translation module or a special translation module for translation; if not, the sentence is divided, according to preset English translation rules, into several modules made up of words and phrases, and the ordering rule of each word and phrase within the sentence is obtained; the words and phrases are used as query conditions to look up the corresponding translation data in a local database, the retrieved translations are arranged according to the ordering rule, and the sentence translation is output; finally, the translations of sentences without feature items and of sentences with feature items are integrated and output to the client. The invention avoids tense and word-order errors during translation and reduces the system's translation blind spots.

Description

A text data processing method for English translation
Technical Field
The present invention relates to the technical field of English translation, and in particular to a text data processing method for English translation.
Background
As international exchange continues to deepen, demand for translating English documents keeps growing, which has prompted the appearance of a large number of English translation tools. These tools generally come in online and local versions, and both kinds translate by searching a database for a matching translation. Their appearance has largely met users' translation needs and has contributed to improving translation efficiency and promoting social progress.
However, because English has numerous grammatical rules, the database of a translation tool does not necessarily contain an exact match for the sentence to be translated. In that case the sentence is essentially translated word by word, so tense and word-order errors occur frequently and the translation is stiff, falling short of the "faithfulness, expressiveness, elegance" standard often cited in translation. A user with a foundation in English then has to proofread sentence by sentence, putting the word order right, adjusting tense, and reorganizing the language according to their own knowledge of grammar. Users with a weak English foundation are left helpless.
Summary of the Invention
To solve the above problems, the present invention provides a text data processing method for English translation.
To achieve the above object, the invention adopts the following technical solution:
A text data processing method for English translation, comprising the following steps:
S1. Segment the data to be translated: identify the punctuation in the data and, using full stops as split points, obtain text information in units of sentences;
S2. Identify feature items in the text information using a preset retrieval model; if a feature item is present, send that text to the manual translation module or the special translation module for translation; if no feature item is present, proceed to step S3;
S3. Based on preset English translation rules, divide the text information into several modules made up of words and phrases, and obtain the ordering rule of each word and phrase in the text information;
S4. Using the words and phrases as query conditions, look up the corresponding translation data in the local database; then, based on the ordering rule of each word and phrase in the text information, arrange the translations of the retrieved data and output the sentence translation;
S5. Repeat steps S3-S4 until all sentences without feature items have been translated; then integrate the translations of the sentences without feature items with the manual translation module's or special translation module's translations of the sentences with feature items, and output the result to the client.
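As a rough illustration of steps S1, S3 and S4, the following Python sketch splits text at full stops, segments each sentence into known phrases and words, and rearranges the looked-up translations. The dictionary contents, the greedy longest-match segmentation, and the single toy ordering rule are all assumptions made for illustration; the patent does not specify how the "preset English translation rules" or the ordering rules are implemented.

```python
import re

# Hypothetical local database: English words/phrases -> Chinese translations.
LOCAL_DB = {
    "i": "我",
    "like": "喜欢",
    "green tea": "绿茶",
    "very much": "非常",
}

# Hypothetical set of adverbials that Chinese places before the verb.
ADVERBIALS = {"very much"}


def split_sentences(text):
    """S1: split at full stops (naive: ignores abbreviations and decimals)."""
    return [s.strip() for s in text.split(".") if s.strip()]


def segment(sentence, db, max_len=3):
    """S3: greedy longest-match segmentation into known phrases and words."""
    tokens = re.findall(r"[A-Za-z']+", sentence.lower())
    units, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            # Accept a known phrase, or fall back to the single word.
            if n == 1 or candidate in db:
                units.append(candidate)
                i += n
                break
    return units


def ordering_rule(units):
    """S3: toy ordering rule that moves a trailing adverbial after the subject."""
    order = list(range(len(units)))
    if len(units) > 2 and units[-1] in ADVERBIALS:
        order.insert(1, order.pop())
    return order


def translate_sentence(sentence, db=LOCAL_DB):
    """S4: look up each unit and arrange the results by the ordering rule."""
    units = segment(sentence, db)
    order = ordering_rule(units)
    # Units missing from the database are passed through unchanged.
    return "".join(db.get(units[k], units[k]) for k in order)
```

Under these assumptions, `translate_sentence("I like green tea very much")` looks up four units and emits them in the order subject, adverbial, verb, object, which is the kind of word-order adjustment the ordering rule is meant to capture.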
Further, the retrieval model is one of a probabilistic retrieval model and a Boolean logic model.
Further, the feature item is an English grammatical construction that requires special translation.
Further, the method also includes the step of storing sentences that have feature items, together with their corresponding manual translation results, in a special translation material database.
Further, the special translation module uses the sentence with the feature item as the query condition to search the special translation material database for the closest data.
Further, the similarity between the closest data and the query condition must not be lower than 80 percent; if it is lower than 80 percent, the manual translation module is started.
Further, step S5 integrates the translation results based on the position of each sentence.
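The closest-match lookup with its 80 percent floor, and the storing of manual results, might look like the sketch below. The similarity measure is an assumption: the patent does not say how similarity is computed, so Python's standard `difflib.SequenceMatcher` ratio is used here as a stand-in. The `manual_translate` callback is a hypothetical placeholder for the manual translation module.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.80  # the 80 percent floor stated in the text


def similarity(a, b):
    """Character-level similarity in [0, 1]; an assumed stand-in measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def special_translate(sentence, special_db, manual_translate):
    """Return (translation, source), where source is 'special' or 'manual'.

    special_db maps previously translated feature sentences to their
    translations; manual_translate is a hypothetical callback standing in
    for the manual translation module.
    """
    best_src, best_score = None, 0.0
    for src in special_db:
        score = similarity(sentence, src)
        if score > best_score:
            best_src, best_score = src, score

    if best_src is not None and best_score >= SIMILARITY_THRESHOLD:
        return special_db[best_src], "special"

    # Below the threshold: fall back to manual translation and store the
    # result so that later queries can reuse it.
    result = manual_translate(sentence)
    special_db[sentence] = result
    return result, "manual"
```

With this design the special translation material database grows each time the manual module is used, so a sentence that needed a human once can be answered from the database on the next request.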
The invention has the following beneficial effects:
The data to be translated is first split into parts containing English grammar that requires special translation and parts suited to conventional translation; the conventional parts are then split into words and phrases according to preset English translation rules, which avoids tense and word-order errors during translation. Meanwhile, the special-translation parts can be translated by the manual translation module, which on the one hand makes the translation more accurate and on the other hand updates and fills the data in the database.
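The split into special and conventional parts, followed by integration according to each sentence's position, can be sketched as follows. The three callbacks (`has_feature`, `translate_plain`, `translate_special`) are hypothetical placeholders for the retrieval model, the word/phrase pipeline, and the special or manual translation module; only the routing and the order-preserving merge are shown.

```python
def translate_document(text, has_feature, translate_plain, translate_special):
    """Route each sentence by feature item, then merge results by position."""
    # Split at full stops (naive splitting, for illustration only).
    sentences = [s.strip() for s in text.split(".") if s.strip()]

    results = [None] * len(sentences)  # one slot per original position
    pending = []                       # sentences without feature items

    for pos, sentence in enumerate(sentences):
        if has_feature(sentence):
            # Sentences with a feature item go to the special/manual path.
            results[pos] = translate_special(sentence)
        else:
            pending.append((pos, sentence))

    # Translate all sentences without feature items (the S3-S4 loop).
    for pos, sentence in pending:
        results[pos] = translate_plain(sentence)

    # Each result sits at its sentence's original index, so the merged
    # output preserves document order.
    return results
```

Keeping a slot per original position means the two translation paths can run in any order, or even concurrently, without disturbing the order of the final output.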
Detailed Description
To make the objects and advantages of the present invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and are not intended to limit it.
Embodiment 1
A text data processing method for English translation, comprising the following steps:
S1. Segment the data to be translated: identify the punctuation in the data and, using full stops as split points, obtain text information in units of sentences;
S2. Identify feature items in the text information using a preset probabilistic retrieval model; if a feature item is present, send that text to the special translation module for translation; the special translation module uses the sentence with the feature item as the query condition to search the special translation material database for the closest data, where the similarity between the closest data and the query condition must not be lower than 80 percent, and the manual translation module is started if it is lower than 80 percent; if no feature item is present, proceed to step S3;
S3. Based on preset English translation rules, divide the text information into several modules made up of words and phrases, and obtain the ordering rule of each word and phrase in the text information;
S4. Using the words and phrases as query conditions, look up the corresponding translation data in the local database; then, based on the ordering rule of each word and phrase in the text information, arrange the translations of the retrieved data and output the sentence translation;
S5. Repeat steps S3-S4 until all sentences without feature items have been translated; then integrate the translations of the sentences without feature items with the manual translation module's or special translation module's translations of the sentences with feature items, and output the result to the client.
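The patent does not describe the internals of the probabilistic retrieval model used in S2. One plausible reading, shown here purely as an assumption, is a binary independence model: each cue word carries a log-odds weight, and a sentence is flagged when the summed weight of the cues it contains reaches a threshold. The cue words, their probabilities, and the threshold below are invented for illustration.

```python
import math
import re

# Hypothetical statistics per cue word:
#   p = P(word present | sentence needs special translation)
#   q = P(word present | ordinary sentence)
TERM_STATS = {
    "were": (0.60, 0.05),
    "wish": (0.40, 0.02),
    "had": (0.50, 0.10),
    "would": (0.70, 0.10),
}


def term_weight(p, q):
    """Log-odds term weight of the binary independence model."""
    return math.log((p * (1 - q)) / (q * (1 - p)))


def feature_score(sentence):
    """Sum the weights of the cue words that occur in the sentence."""
    tokens = set(re.findall(r"[a-z']+", sentence.lower()))
    return sum(
        term_weight(p, q)
        for term, (p, q) in TERM_STATS.items()
        if term in tokens
    )


def has_feature(sentence, threshold=4.0):
    """S2 (probabilistic variant): flag sentences scoring at or above threshold."""
    return feature_score(sentence) >= threshold
```

A single cue such as "would" scores about 3.0 and stays below the assumed threshold, while "were" together with "would" scores about 6.4 and is flagged, so ordinary polite requests are not routed to the special translation module.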
Embodiment 2
A text data processing method for English translation, comprising the following steps:
S1. Segment the data to be translated: identify the punctuation in the data and, using full stops as split points, obtain text information in units of sentences;
S2. Identify feature items in the text information using a preset Boolean logic model; if a feature item is present, send that text to the special translation module for translation; the special translation module uses the sentence with the feature item as the query condition to search the special translation material database for the closest data, where the similarity between the closest data and the query condition must not be lower than 80 percent, and the manual translation module is started if it is lower than 80 percent; if no feature item is present, proceed to step S3;
S3. Based on preset English translation rules, divide the text information into several modules made up of words and phrases, and obtain the ordering rule of each word and phrase in the text information;
S4. Using the words and phrases as query conditions, look up the corresponding translation data in the local database; then, based on the ordering rule of each word and phrase in the text information, arrange the translations of the retrieved data and output the sentence translation;
S5. Repeat steps S3-S4 until all sentences without feature items have been translated; then integrate the translations of the sentences without feature items with the manual translation module's or special translation module's translations of the sentences with feature items, and output the result to the client.
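For the Boolean logic model of this embodiment, a minimal sketch is to express each feature item as a Boolean query (AND, OR, NOT over cue words) and evaluate it against the set of words in the sentence. The query format and the two example feature items are assumptions; the patent does not list the actual queries.

```python
import re


def tokens_of(sentence):
    """Lower-cased set of words in the sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def evaluate(expr, tokens):
    """Evaluate a nested Boolean query against a token set.

    A query is either a word (true if present) or a tuple
    ("AND" | "OR" | "NOT", operand, ...).
    """
    if isinstance(expr, str):
        return expr in tokens
    op, *args = expr
    if op == "AND":
        return all(evaluate(a, tokens) for a in args)
    if op == "OR":
        return any(evaluate(a, tokens) for a in args)
    if op == "NOT":
        return not evaluate(args[0], tokens)
    raise ValueError(f"unknown operator: {op}")


# Hypothetical feature items expressed as Boolean queries.
FEATURE_QUERIES = {
    "subjunctive": ("AND", "if", ("OR", "were", "had"), "would"),
    "wish clause": ("AND", "wish", ("OR", "were", "had", "could")),
}


def find_features(sentence):
    """S2 (Boolean variant): names of all feature items the sentence matches."""
    toks = tokens_of(sentence)
    return [name for name, q in FEATURE_QUERIES.items() if evaluate(q, toks)]
```

A sentence is routed to the special translation module when `find_features` returns a non-empty list. Compared with the probabilistic variant, the Boolean variant is exact and easy to audit but cannot rank borderline sentences.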
The above is only a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principle of the invention, and these improvements and refinements should also be regarded as falling within the protection scope of the invention.

Claims (7)

1. A text data processing method for English translation, characterized by comprising the following steps:
S1. Segment the data to be translated: identify the punctuation in the data and, using full stops as split points, obtain text information in units of sentences;
S2. Identify feature items in the text information using a preset retrieval model; if a feature item is present, send that text to the manual translation module or the special translation module for translation; if no feature item is present, proceed to step S3;
S3. Based on preset English translation rules, divide the text information into several modules made up of words and phrases, and obtain the ordering rule of each word and phrase in the text information;
S4. Using the words and phrases as query conditions, look up the corresponding translation data in the local database; then, based on the ordering rule of each word and phrase in the text information, arrange the translations of the retrieved data and output the sentence translation;
S5. Repeat steps S3-S4 until all sentences without feature items have been translated; then integrate the translations of the sentences without feature items with the manual translation module's or special translation module's translations of the sentences with feature items, and output the result to the client.
2. The text data processing method for English translation according to claim 1, characterized in that the retrieval model is one of a probabilistic retrieval model and a Boolean logic model.
3. The text data processing method for English translation according to claim 1, characterized in that the feature item is an English grammatical construction that requires special translation.
4. The text data processing method for English translation according to claim 1, characterized by further comprising the step of storing sentences that have feature items, together with their corresponding manual translation results, in a special translation material database.
5. The text data processing method for English translation according to claim 1, characterized in that the special translation module uses the sentence with the feature item as the query condition to search the special translation material database for the closest data.
6. The text data processing method for English translation according to claim 5, characterized in that the similarity between the closest data and the query condition must not be lower than 80 percent, and the manual translation module is started if it is lower than 80 percent.
7. The text data processing method for English translation according to claim 1, characterized in that step S5 integrates the translation results based on the position of each sentence.
CN201810993789.3A 2018-08-20 2018-08-20 A kind of text data processing method for English Translation Withdrawn CN109241542A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810993789.3A, CN109241542A (en) | 2018-08-20 | 2018-08-20 | A kind of text data processing method for English Translation

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201810993789.3A, CN109241542A (en) | 2018-08-20 | 2018-08-20 | A kind of text data processing method for English Translation

Publications (1)

Publication Number | Publication Date
CN109241542A (en) | 2019-01-18

Family

ID=65068869

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201810993789.3A (Withdrawn), CN109241542A (en) | A kind of text data processing method for English Translation | 2018-08-20 | 2018-08-20

Country Status (1)

Country | Link
CN (1) | CN109241542A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109829173A (en) * | 2019-01-21 | 2019-05-31 | 中国测绘科学研究院 | A kind of English place name interpretation method and device
CN109829173B (en) * | 2019-01-21 | 2023-09-29 | 中国测绘科学研究院 | English place name translation method and device
CN109992753A (en) * | 2019-03-22 | 2019-07-09 | 维沃移动通信有限公司 | A kind of translation processing method and terminal device
CN109992753B (en) * | 2019-03-22 | 2023-09-08 | 维沃移动通信有限公司 | Translation processing method and terminal equipment
CN110705321A (en) * | 2019-10-16 | 2020-01-17 | 榆林学院 | Computer aided translation system
CN111143074A (en) * | 2019-12-30 | 2020-05-12 | 文思海辉智科科技有限公司 | Method and device for distributing translation files
CN111143074B (en) * | 2019-12-30 | 2024-04-09 | 文思海辉智科科技有限公司 | Method and device for distributing translation files

Similar Documents

Publication Publication Date Title
CN109241542A (en) A kind of text data processing method for English Translation
AU2017317878B2 (en) Error correction method and device for search term
US9342499B2 (en) Round-trip translation for automated grammatical error correction
US20030061023A1 (en) Automatic extraction of transfer mappings from bilingual corpora
US9904672B2 (en) Machine-translation based corrections
US8346819B2 (en) Enhanced data conversion framework
JP2005535007A (en) Synthesizing method of self-learning system for knowledge extraction for document retrieval system
Al-Jumaily et al. A real time Named Entity Recognition system for Arabic text mining
WO2008103894A1 (en) Automated word-form transformation and part of speech tag assignment
Hollenstein et al. Compilation of a Swiss German dialect corpus and its application to PoS tagging
Ahmed et al. Revised n-gram based automatic spelling correction tool to improve retrieval effectiveness
CN105320650B (en) A kind of machine translation method and its system based on corpus matching and syntactic analysis
CN102486787B (en) Method and device for extracting document structure
CN110019749B (en) Method, apparatus, device and computer readable medium for generating VQA training data
Alhanini et al. The enhancement of arabic stemming by using light stemming and dictionary-based stemming
Pinnis Context independent term mapper for European languages
Hauser et al. Unsupervised learning of edit distance weights for retrieving historical spelling variations
CN106569994A (en) Elevator remote control device
CN102609410B (en) Authority file auxiliary writing system and authority file generating method
CN102890723A (en) Example sentence searching method and system
CN103854521A (en) Suffix induction learning system for English words with vowels or very few consonants serving as initials
Srinivasagan et al. An automated system for tamil named entity recognition using hybrid approach
Makhija A study of different stemmer for sindhi language based on devanagari script
Cook ACRONYM: Acronym CReatiON for You and Me
CN111897958B (en) Ancient poetry classification method based on natural language processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20190118