CN111767709A - Logic method for carrying out error correction and syntactic analysis on English text - Google Patents

Publication number
CN111767709A
Authority
CN
China
Prior art keywords
english
text
error correction
sentence
english text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910238788.2A
Other languages
Chinese (zh)
Inventor
戴翰波
李辉
王丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Huiren Information Technology Co ltd
Original Assignee
Wuhan Huiren Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Huiren Information Technology Co ltd filed Critical Wuhan Huiren Information Technology Co ltd
Priority to CN201910238788.2A priority Critical patent/CN111767709A/en
Publication of CN111767709A publication Critical patent/CN111767709A/en
Pending legal-status Critical Current

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention provides a method for error correction and syntactic analysis of English text. For sentences that contain errors, the method corrects the errors and gives error prompts; for correct sentences, it analyzes their grammatical phenomena and outputs information such as the basic sentence pattern, sentence composition structure, phrase dependency relations, modifiers, and fixed collocations. The invention can help English beginners improve their writing more effectively by correcting errors in a targeted way and by learning the structure of well-written sentences.

Description

Logic method for carrying out error correction and syntactic analysis on English text
Technical Field
The invention belongs to the technical field of natural language processing and covers two research aspects: automatic detection of errors in English text, and grammatical analysis of English syntax. It is mainly applied to English writing tutoring, and the two techniques can also be used independently.
Background
English is the most widely used language in the world and a necessary skill for international communication, and written English is the main medium of that communication, especially in academic papers and business correspondence. Improving English writing ability has therefore become a need for more and more people. Most people try to improve by reading well-written English texts on related topics, but the quality of material on the Internet is uneven and beginners cannot choose well. A method for error correction and syntactic analysis of English text can help English beginners improve their writing effectively.
Existing related technology is largely limited to automatic error detection in English text and does not address the structural and grammatical analysis of correct sentences. For example, CN 108519974 A provides a method for automatically detecting and analyzing grammar errors in English compositions: it performs sentence segmentation, word segmentation, and spell checking on the composition, tags parts of speech with the Stanford analyzer, corrects the part-of-speech tags, constructs a negative-example rule flow chart, and returns the result. CN 101814065 B proposes a syntactic analysis apparatus and method that uses regular expressions to describe syntactic analysis rules. In addition, the applications Grammarly, Ginger, and WhiteSmoke released abroad, and the correction website commonly used by students in China, also correct English texts.
Existing error correction methods and software focus only on flagging grammar errors in English text; they neglect the analysis of correct, well-written sentences, which would also help English learners improve their writing. Moreover, most of them can handle general text correction and basic grammar correction, but because they emphasize generality they ignore differences in language ability among different groups of users, so many prompts are not targeted and errors typical of a specific group may go largely uncorrected. For example, the common problems in primary school pupils' English writing clearly differ from the focus of error correction in scientific and academic papers: the former is more concerned with the use of grammatical structures and vocabulary, the latter with the accuracy and intelligibility of technical vocabulary.
Disclosure of Invention
In view of the above, the present invention provides a method for error correction and syntactic analysis of English text. The method processes an English text, corrects sentences that contain errors and gives error prompts, and, for correct sentences, analyzes their grammatical phenomena and outputs the basic sentence pattern, sentence composition structure, phrase dependency relations, modifiers, and fixed collocations.
The set of English writing scenes, and the recognition and correction of the corresponding grammar error patterns, can be extended continuously according to user requirements, and the method can be used in any English education tutoring program.
Compared with the prior art, the main innovations of the invention are:
1. By adopting a suitable logical structure, the method not only corrects errors in English text but also outputs the grammatical phenomena of correct sentences. Most importantly, learners can use the syntactic analysis of correct sentences (including well-written ones) to study the composition structure and word collocations of example sentences and so improve their writing;
2. Error correction is extended on the basis of the LanguageTool toolkit: common error rule patterns are supported, and additional error patterns are added as rules for different application scenes and different users, packaged as user modes for each scene.
The specific technical route of the logic method for error correction and syntactic analysis of English text is as follows:
The input text is routed to one of two modules according to whether a sentence contains errors: a sentence with errors goes to the text error correction module, a sentence without errors goes to the syntactic analysis module, and a sentence whose only errors are misspelled words also goes to the syntactic analysis module.
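The routing decision described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `ErrorKind`, `DetectedError`, and `route_sentence` are hypothetical, and the sketch assumes that an upstream checker has already labeled each detected error as either a spelling error or a grammar error.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class ErrorKind(Enum):
    # Hypothetical two-way labeling of detected errors.
    SPELLING = "spelling"
    GRAMMAR = "grammar"


@dataclass
class DetectedError:
    kind: ErrorKind
    message: str


def route_sentence(errors: list[DetectedError]) -> str:
    """Return the name of the module that should handle the sentence."""
    # No errors at all, or spelling errors only: syntactic analysis.
    # all() over an empty list is True, so the error-free case is covered.
    if all(e.kind is ErrorKind.SPELLING for e in errors):
        return "syntactic_analysis"
    # At least one non-spelling error: text error correction.
    return "error_correction"
```

For example, `route_sentence([])` and a sentence carrying only `ErrorKind.SPELLING` errors both select the syntactic analysis module, while any sentence with a grammar error selects the error correction module.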
(I) The text error correction module proceeds according to the following steps:
1) Recognize common grammar errors in the English text, using any error correction toolkit;
2) Add error rule patterns. Scenes are defined along two dimensions: the learning stage of English composition in China and the application genre of the English text. The learning stages comprise three scenes: primary school students, junior high school students, and above. The application genres comprise three scenes: letters, description and narration, and scientific research papers. Error patterns are added for the different scenes in each dimension.
Further, for step 2), the set of English writing scenes and the recognition and correction of the corresponding grammar error patterns can be extended continuously according to user requirements, and the method can be used in any English education tutoring program.
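One way to keep the scene-specific rules extensible, as step 2) requires, is a small registry keyed by dimension and scene. The following Python sketch is only an assumption about how such a registry might look: the class `RuleRegistry`, the scene identifiers, and the rule IDs are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

# Scene identifiers paraphrase the two dimensions in the text (assumed names).
SCENES = {
    "stage": {"primary_school", "junior_high", "above"},
    "genre": {"letter", "description_narrative", "research_paper"},
}


class RuleRegistry:
    """Maps (dimension, scene) pairs to the rule IDs enabled for them."""

    def __init__(self) -> None:
        self._rules: dict[tuple[str, str], list[str]] = {}

    def register(self, dimension: str, scene: str, rule_id: str) -> None:
        # Reject scenes that are not defined for the given dimension.
        if scene not in SCENES.get(dimension, set()):
            raise ValueError(f"unknown scene {dimension}/{scene}")
        self._rules.setdefault((dimension, scene), []).append(rule_id)

    def active_rules(self, stage: str | None = None,
                     genre: str | None = None) -> list[str]:
        """Collect rules for the selected stage and/or genre, stage first."""
        selected: list[str] = []
        for key in (("stage", stage), ("genre", genre)):
            selected.extend(self._rules.get(key, []))
        return selected


registry = RuleRegistry()
# Hypothetical rule IDs; new rules can be registered at any time.
registry.register("stage", "primary_school", "PLAY_INSTRUMENT_MISSING_THE")
registry.register("genre", "letter", "OVERLONG_CLAUSE")
```

Calling `registry.active_rules(stage="primary_school", genre="letter")` then yields the rules from both dimensions, and adding a rule for a new user group is a single `register` call.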
(II) The syntactic analysis module is implemented mainly with the automatic English text analysis technology of Patent I.
Drawings
FIG. 1 is a general logic flow diagram of the method of the present invention;
FIG. 2 is a detailed process flow diagram of the text correction module;
FIG. 3 is a detailed process flow diagram of the text parsing module.
Detailed Description
The method for error correction and syntactic analysis of English text consists mainly of a text error correction module and a syntactic analysis module; their logical relationship is shown in FIG. 1.
The text error correction module first corrects common grammar errors and then applies additional corrections according to the selected application scene. The syntactic analysis module provides syntactic analysis for correct sentences, including information such as the basic sentence pattern, sentence composition structure, syntactic order, phrase dependency relations, modifiers, and fixed collocations.
2.2.1 Module one: text error correction module
The specific flow is shown in FIG. 2 and consists of two parts: the first part uses the LanguageTool toolkit to correct common problems, and the second part adds error rules specific to particular fields or levels according to the application scene.
For the first part, the recognition of common grammar errors in English text, we call the LanguageTool toolkit to correct common errors. Note that other text correction toolkits give essentially the same effect and stay within the process described here; only the way they are called differs.
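The point that the toolkit is interchangeable and only the calling convention differs can be illustrated with a thin adapter interface. The Python sketch below is an assumption for illustration only: `Issue`, `GrammarChecker`, and `correct_common_errors` are hypothetical names, and `RepeatedWordChecker` is a toy stand-in backend, not LanguageTool and not a reproduction of its API. A real deployment would wrap LanguageTool (or another toolkit) in a class with the same `check` method.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Issue:
    offset: int   # start position of the flagged span in the text
    length: int   # length of the flagged span
    message: str  # prompt shown to the learner


class GrammarChecker(Protocol):
    """Anything with a check(text) method can serve as the backend."""

    def check(self, text: str) -> list[Issue]: ...


class RepeatedWordChecker:
    """Toy stand-in backend: flags an immediately repeated word."""

    _pattern = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

    def check(self, text: str) -> list[Issue]:
        return [
            Issue(m.start(), m.end() - m.start(),
                  f"Repeated word: {m.group(1)!r}")
            for m in self._pattern.finditer(text)
        ]


def correct_common_errors(text: str, backend: GrammarChecker) -> list[Issue]:
    # The pipeline depends only on the interface, so the backend can be
    # swapped without changing the surrounding processing flow.
    return backend.check(text)
```

With this shape, replacing the backend changes one constructor call while the rest of the correction flow stays the same.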
For the second part, the addition of error rule patterns, scenes are defined along two dimensions: the learning stage of English composition in China and the application genre of the English text. The learning stages comprise three scenes: primary school students, junior high school students, and above. The application genres comprise three scenes: letters, description and narration, and scientific research papers. The emphasis of the added error patterns differs between scenes in the two dimensions, as follows:
A. For scenes divided by learning stage, rules are added mainly through the recognition of predefined rule patterns, covering grammatical phrase collocations, common writing errors, and the like for the corresponding stage.
Any rule-adding method can be used, such as the error pattern extension supported by the LanguageTool toolkit or any other negative-example-based method; error pattern matching rules are the main mechanism. For example, primary school students often produce the error pattern play + musical instrument, in which the word "the" is missing before the instrument; a negative-example rule for it can be formed from the word combination alone (or by regular-expression matching). The error pattern help sb. to do, in which the preposition to is redundant, requires the words help + sb. + to together with the part-of-speech tag VB (base-form verb). Errors in noun number depend entirely on combinations of part-of-speech tags, such as NNS (plural noun) + VBP (non-third-person-singular verb).
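The three kinds of negative-example rule mentioned above (word-only, word plus tag, and tag-only) can be sketched in Python as follows. This is a minimal illustration, not the LanguageTool rule format: the function names and the short instrument list are assumptions, and the input is assumed to be a sentence already tokenized and tagged with Penn Treebank tags as a list of (word, tag) pairs.

```python
# Illustrative word lists (assumptions, not from the source).
INSTRUMENTS = {"piano", "guitar", "violin"}
PLAY_FORMS = {"play", "plays", "played", "playing"}
HELP_FORMS = {"help", "helps", "helped", "helping"}


def play_instrument_missing_the(tagged):
    """Word-only rule: 'play' directly followed by an instrument name."""
    words = [w.lower() for w, _ in tagged]
    return [i for i in range(len(words) - 1)
            if words[i] in PLAY_FORMS and words[i + 1] in INSTRUMENTS]


def help_sb_to_do(tagged):
    """Word + tag rule: help + one-token object + 'to' + base-form verb."""
    hits = []
    for i in range(len(tagged) - 3):
        first_word = tagged[i][0].lower()
        third_word = tagged[i + 2][0].lower()
        fourth_tag = tagged[i + 3][1]
        if first_word in HELP_FORMS and third_word == "to" and fourth_tag == "VB":
            hits.append(i)
    return hits


def tag_sequence(tagged, pattern):
    """Tag-only rule: start positions where consecutive tags equal `pattern`."""
    tags = [t for _, t in tagged]
    n = len(pattern)
    return [i for i in range(len(tags) - n + 1)
            if tuple(tags[i:i + n]) == tuple(pattern)]
```

Each function returns the token positions where its pattern starts, so a caller can attach the corresponding prompt to that span. The single-token object in `help_sb_to_do` is a simplification; longer noun phrases would need chunking or a dependency parse.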
B. For scenes divided by application genre, the added rules mainly exploit differences between genres, that is, differences in how sentences are composed. For letters, the emphasis is on whether sentences are concise: the average number of words per sentence and the difficulty and commonness of the words are recorded, where difficulty and commonness are obtained by comparison with a self-compiled list of 5000 common words, and a reminder to split is given for overlong clauses. For scientific research papers, the main concern is the accuracy of technical vocabulary, checked by matching fixed expressions of technical terms: the technical vocabulary of the field is compiled, together with the words that commonly co-occur with it in the same text (that is, words that occur frequently in texts of the field but rarely in other fields); the frequencies of these words are then counted for the given text, and a replacement reminder is given for inappropriate words. For description and narration, the frequency of each word is recorded: when a word occurs too often, a reminder to replace it with a synonym is given; when the average number of words per sentence is low, a reminder to merge sentences is given; and when there are too few conjunctions between sentences, a reminder to add conjunctions is given.
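The statistics behind the letter and description/narration checks can be sketched with the Python standard library. The sketch below is illustrative only: the function names, the naive sentence splitter, the default conjunction list, and any threshold passed as `max_share` are assumptions, and the 5000-word common-word list mentioned in the text is represented by whatever set the caller supplies. The research-paper check is not covered here.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def words(text):
    return re.findall(r"[A-Za-z']+", text.lower())


def average_sentence_length(text):
    """Average number of words per sentence (0.0 for empty text)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(words(s)) for s in sentences) / len(sentences)


def uncommon_words(text, common_words):
    """Words in the text that are absent from the supplied common-word list."""
    return sorted(set(words(text)) - set(common_words))


def overused_words(text, max_share):
    """Words whose share of all tokens exceeds max_share (assumed threshold)."""
    counts = Counter(words(text))
    total = sum(counts.values())
    return sorted(w for w, c in counts.items() if total and c / total > max_share)


def conjunction_count(text, conjunctions=("and", "but", "because", "so",
                                          "however", "therefore")):
    """Number of conjunction tokens; the default list is illustrative."""
    counts = Counter(words(text))
    return sum(counts[c] for c in conjunctions)
```

A caller would compare these values against scene-specific thresholds to decide whether to issue a split, merge, synonym, or conjunction reminder; suitable thresholds are not given in the text and would need to be chosen per scene.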
It is worth noting that the rule extension process includes constructing negative-example rules through sentence segmentation, word segmentation, part-of-speech tagging, and tag correction of the text. However, the claims of the prior patents concern that construction process, and the protection they seek does not lie in the implementation of the process.
2.2.2 Module two: syntactic analysis module
The general processing flow is shown in FIG. 3, and a more detailed description of the parsing method is given in Patent I. It comprises the entire data preprocessing module (module I), the entire syntactic analysis process (module II), and the entire result output module (module III).

Claims (5)

1. A logic method for error correction and syntactic analysis of English text, characterized in that the processing of the English text comprises two modules: one performs text error correction, and the other provides syntactic analysis for correct sentences.
2. The method of claim 1, wherein the text error correction module comprises the following two parts:
1) recognizing common English grammar errors by using a common English text error correction tool;
2) adding additional error rule patterns.
3. The method of claim 1, wherein the syntactic analysis module obtains the syntactic analysis from regular-expression matching on the English text and from dependency parsing results, the syntactic analysis including information such as the basic sentence pattern, sentence composition structure, syntactic order, phrase dependency relations, modifiers, and fixed collocations.
4. The method of claim 2, wherein adding additional error rule patterns comprises adding error rules specific to the corresponding field or level according to the application scene.
5. The method of claim 4, wherein the scenes are defined along two dimensions, the learning stage of English composition in China and the application genre of the English text; the learning stages comprise three scenes: primary school students, junior high school students, and above; and the application genres comprise three scenes: letters, description and narration, and scientific research papers.
CN201910238788.2A 2019-03-27 2019-03-27 Logic method for carrying out error correction and syntactic analysis on English text Pending CN111767709A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910238788.2A CN111767709A (en) 2019-03-27 2019-03-27 Logic method for carrying out error correction and syntactic analysis on English text

Publications (1)

Publication Number Publication Date
CN111767709A true CN111767709A (en) 2020-10-13

Family

ID=72717962

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910238788.2A Pending CN111767709A (en) 2019-03-27 2019-03-27 Logic method for carrying out error correction and syntactic analysis on English text

Country Status (1)

Country Link
CN (1) CN111767709A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101814065A (en) * 2009-02-23 2010-08-25 富士通株式会社 Syntactic analysis device and syntactic analysis method
CN107783958A (en) * 2016-08-31 2018-03-09 科大讯飞股份有限公司 A kind of object statement recognition methods and device
CN107807915A (en) * 2017-09-27 2018-03-16 北京百度网讯科技有限公司 Error correcting model method for building up, device, equipment and medium based on error correction platform
CN109376355A (en) * 2018-10-08 2019-02-22 上海起作业信息科技有限公司 English word and sentence screening technique, device, storage medium and electronic equipment

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112036135A (en) * 2020-11-06 2020-12-04 腾讯科技(深圳)有限公司 Text processing method and related device
CN112667208A (en) * 2020-12-22 2021-04-16 深圳壹账通智能科技有限公司 Translation error recognition method and device, computer equipment and readable storage medium
CN112988995A (en) * 2021-03-05 2021-06-18 广州大学 English composition reading system and method
CN113205084A (en) * 2021-07-05 2021-08-03 北京一起教育科技有限责任公司 English dictation correction method and device and electronic equipment
CN113205084B (en) * 2021-07-05 2021-10-08 北京一起教育科技有限责任公司 English dictation correction method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination