CN114692606A - English composition analysis scoring system, method and storage medium - Google Patents

English composition analysis scoring system, method and storage medium

Info

Publication number
CN114692606A
Authority
CN
China
Prior art keywords
scored
english
module
english composition
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011645278.6A
Other languages
Chinese (zh)
Inventor
周启贤
梁子仲
王可泽
陈添水
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DMAI Guangzhou Co Ltd
Original Assignee
DMAI Guangzhou Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by DMAI Guangzhou Co Ltd
Priority to CN202011645278.6A
Publication of CN114692606A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/44 Statistical methods, e.g. probability models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/47 Machine-assisted translation, e.g. using translation memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses an English composition analysis scoring system, method and storage medium. The system comprises: a recognition module for acquiring an English composition to be scored; an error correction module for identifying and correcting errors in the composition using a combination of a grammatical error correction algorithm, a neural machine translation network model and a statistical language model; a multi-dimensional feature statistics module for statistically analyzing the words, phrases and sentences of the composition; a topic analysis module for performing topic analysis on the composition; a highlight matching module for identifying highlights in the composition to be scored; and a scoring module for obtaining a score from the analysis results of the error correction module, the multi-dimensional feature statistics module, the topic analysis module and the highlight matching module. By implementing the invention, the English composition to be scored can be scored comprehensively, so that students gain a clear understanding of the strengths and weaknesses of their writing and can effectively improve their writing ability.

Description

English composition analysis scoring system, method and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an English composition analysis scoring system, an English composition analysis scoring method and a storage medium.
Background
English composition is a common written-output exercise and an important applied exercise in English learning. Unlike spoken language, which is real-time and spontaneous, writing requires students to describe the course of an event or to argue a point of view in written language. Written work usually uses more complex and varied expressions, together with a deliberate organization and layout of the text, in order to express a viewpoint or describe something systematically and comprehensively.
English composition is an important part of English learning from primary school through university. At present, correction relies mainly on manual review by teachers, who look for grammatical errors in the composition and judge whether the word choice and sentence construction are correct. However, manual correction is time-consuming and labor-intensive, and many of the grammatical errors in compositions are common and recurring, so correcting them by hand wastes a large amount of manpower and material resources and is inefficient.
Disclosure of Invention
In view of this, embodiments of the present invention provide an English composition analysis scoring system, method and storage medium, so as to solve the technical problem in the prior art that manual composition correction is inefficient.
The technical scheme provided by the invention is as follows:
A first aspect of the embodiments of the present invention provides an English composition analysis scoring system, including: a recognition module for acquiring an English composition to be scored; an error correction module for identifying and correcting errors in the acquired English composition to be scored using a combination of a grammatical error correction algorithm, a neural machine translation network model and a statistical language model; a multi-dimensional feature statistics module for statistically analyzing the words, phrases and sentences in the English composition to be scored; a topic analysis module for performing topic analysis on the English composition to be scored; a highlight matching module for identifying highlights in the English composition to be scored; and a scoring module for calculating the score of the English composition to be scored according to the analysis results of the error correction module, the multi-dimensional feature statistics module, the topic analysis module and the highlight matching module.
Optionally, the recognition module comprises: an electronic recognition unit for acquiring an English composition to be scored that has been input electronically; and a handwriting recognition unit for acquiring a picture of a handwritten English composition to be scored and obtaining the text of the composition with a deep convolutional neural network.
Optionally, the English composition analysis scoring system further comprises: a preprocessing module for splitting the English composition to be scored into paragraphs and sentences before error recognition and correction, statistical analysis, topic analysis and highlight matching are performed on it.
Optionally, the error correction module includes: a fixed error recognition unit for performing error recognition on the English composition to be scored according to a grammatical error correction algorithm based on grammar rules, to obtain sentences containing grammatical errors; a deep network error correction unit for correcting the English composition to be scored, based on the sentences containing grammatical errors, according to the neural machine translation network model, to obtain an error correction result; and a post-processing unit for performing a spell check on the error correction result to obtain an optimized correct sentence.
Optionally, the deep network error correction unit includes: a word segmentation unit for segmenting the sentences containing grammatical errors according to a byte pair encoding (BPE) algorithm; and an error correction subunit for correcting the segmented sentences, using a multi-layer Transformer as the encoder-decoder structure combined with an attention mechanism, to obtain an error correction result.
Optionally, the error correction module further includes: a statistical error correction unit for performing error recognition on the English composition to be scored according to the statistical language model and replacing erroneous sentences with correct sentences.
Optionally, the topic analysis module is used for extracting features of the English composition to be scored based on a meta-learning framework and a SimNet matching model, and matching the extracted features with predefined topic words or with topic words in a model essay to obtain a matching result, wherein the features of the English composition to be scored are enhanced by self-attention and guided-attention mechanisms during feature extraction, and the topic words in the model essay are extracted by the SIFRank algorithm.
Optionally, the scoring module comprises: a grading unit for dividing English compositions to be scored into a plurality of grades according to preset requirements; and a scoring unit for obtaining the score of an English composition within its grade by weighted calculation according to the analysis results of the error correction module, the multi-dimensional feature statistics module, the topic analysis module and the highlight matching module.
A second aspect of the embodiments of the present invention provides an English composition analysis scoring method, which comprises the following steps: acquiring an English composition to be scored; identifying and correcting errors in the acquired English composition to be scored using a combination of a grammatical error correction algorithm, a neural machine translation network model and a statistical language model; statistically analyzing the words, phrases and sentences in the English composition to be scored; performing topic analysis on the English composition to be scored; identifying highlights in the English composition to be scored; and calculating the score of the English composition to be scored.
A third aspect of the embodiments of the present invention provides a computer-readable storage medium storing computer instructions, where the computer instructions are configured to cause a computer to execute the English composition analysis scoring method according to the second aspect of the embodiments of the present invention.
The technical scheme provided by the invention has the following effects:
The English composition analysis scoring system provided by the embodiment of the invention includes an error correction module, a multi-dimensional feature statistics module, a topic analysis module and a highlight matching module, so that error recognition and correction, statistical analysis of words, phrases and sentences, topic analysis and highlight identification can all be performed on the English composition to be scored. The composition can therefore be evaluated and scored comprehensively, so that students gain a clear understanding of the strengths and weaknesses of their writing and can effectively improve their writing ability. In addition, the error correction module combines a grammatical error correction algorithm, a neural machine translation network model and a statistical language model to identify and correct errors, which improves the accuracy of error recognition. The error correction module can also output the erroneous words and sentences together with their corrected versions, which helps students learn from their mistakes and further raises their level of knowledge.
The English composition analysis scoring method and the storage medium provided by the embodiment of the invention perform error recognition and correction, statistical analysis of words, phrases and sentences, topic analysis and highlight identification on the English composition to be scored, so that the composition can be evaluated and scored comprehensively, students gain a clear understanding of the strengths and weaknesses of their writing, and their writing ability can be effectively improved. In addition, the method combines a grammatical error correction algorithm, a neural machine translation network model and a statistical language model to identify and correct errors, which improves the accuracy of error recognition, and it can output the erroneous words and sentences together with their corrected versions, which helps students learn from their mistakes and further raises their level of knowledge.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a structural block diagram of an English composition analysis scoring system according to an embodiment of the present invention;
FIG. 2 is a block diagram of a recognition module according to an embodiment of the present invention;
FIG. 3 is a block diagram of an error correction module according to an embodiment of the present invention;
FIG. 4 is a block diagram of an error correction module according to another embodiment of the present invention;
FIG. 5 is a block diagram of a scoring module according to an embodiment of the present invention;
FIG. 6 is a flowchart of an English composition analysis scoring method according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an English composition analysis scoring system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides an English composition analysis scoring system. As shown in FIG. 1, the system comprises: a recognition module 10 for acquiring an English composition to be scored; an error correction module 20 for identifying and correcting errors in the acquired English composition to be scored using a combination of a grammatical error correction algorithm, a neural machine translation network model and a statistical language model; a multi-dimensional feature statistics module 30 for statistically analyzing the words, phrases and sentences in the English composition to be scored; a topic analysis module 40 for performing topic analysis on the English composition to be scored; a highlight matching module 50 for identifying highlights in the English composition to be scored; and a scoring module 60 for calculating the score of the English composition to be scored according to the analysis results of the error correction module 20, the multi-dimensional feature statistics module 30, the topic analysis module 40 and the highlight matching module 50.
Because the English composition analysis scoring system provided by the embodiment of the invention includes the error correction module, the multi-dimensional feature statistics module, the topic analysis module and the highlight matching module, it can perform error recognition and correction, statistical analysis of words, phrases and sentences, topic analysis and highlight identification on the English composition to be scored. The composition can therefore be evaluated and scored comprehensively, so that students gain a clear understanding of the strengths and weaknesses of their writing and can effectively improve their writing ability. In addition, the error correction module combines a grammatical error correction algorithm, a neural machine translation network model and a statistical language model to identify and correct errors, which improves the accuracy of error recognition. The error correction module can also output the erroneous words and sentences together with their corrected versions, which helps students learn from their mistakes and further raises their level of knowledge.
In one embodiment, as shown in FIG. 2, the recognition module 10 includes: an electronic recognition unit 11 for acquiring an English composition to be scored that has been input electronically; and a handwriting recognition unit 12 for acquiring a picture of a handwritten English composition to be scored and obtaining the text of the composition with a deep convolutional neural network. The English composition analysis scoring system provided by the embodiment of the invention can therefore handle not only compositions entered electronically, for example typed directly on a terminal device such as a computer, mobile phone or tablet, but also compositions handwritten on paper. For a handwritten composition, for example, an image capture device built into the system photographs the page, picture features are extracted with a deep convolutional neural network model, and all handwritten characters are recognized in two stages, localization and recognition, so that the other modules can then analyze the English composition to be scored.
In one embodiment, the English composition analysis scoring system further includes: a preprocessing module for splitting the English composition to be scored into paragraphs and sentences before error recognition and correction, statistical analysis, topic analysis and highlight matching are performed on it. Specifically, after the English composition to be scored is acquired, it can be split into paragraphs and sentences with basic natural language processing algorithms, which facilitates the subsequent analysis by the other modules.
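The paragraph and sentence splitting described above can be sketched as follows. This is a minimal illustration only: the function names (`split_paragraphs`, `split_sentences`, `preprocess`) and the regular expressions are assumptions, since the patent does not say which natural language processing algorithm is used, and a production system would need a splitter that handles abbreviations and quotations.

```python
import re
from typing import List


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty paragraphs (assumed convention)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [s for s in parts if s]


def preprocess(text: str) -> List[List[str]]:
    """Return the composition as a list of paragraphs, each a list of sentences."""
    return [split_sentences(p) for p in split_paragraphs(text)]
```

For example, `preprocess("I like tea. Do you?\n\nIt rains!")` yields two paragraphs, the first with two sentences and the second with one.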
In one embodiment, the multi-dimensional feature statistics module analyzes the composition along multiple dimensions by extracting basic lexical features from it. In one embodiment, the features extracted by the multi-dimensional feature statistics module include: average word length; number of words; proportion of unique words, which measures lexical richness; word frequency distribution based on a graded vocabulary list; part-of-speech distribution, which measures the range of word classes the student has mastered; average sentence length, which measures the ability to write long and difficult sentences; number of clauses, which measures the ability to use clauses; number of sentences; number of grammatical errors; number of spelling errors; number of conjunctions; and so on. With the multi-dimensional feature statistics module, the quality of the composition can be analyzed comprehensively, making the final scoring result more accurate and objective.
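A few of the simpler features listed above can be computed as in the following sketch. The function name `basic_features`, the tokenization rule and the dictionary keys are assumptions made for illustration; features such as part-of-speech distribution or clause counts would need a tagger or parser and are not shown.

```python
import re
from typing import Dict


def basic_features(text: str) -> Dict[str, float]:
    """Compute a handful of surface statistics for one composition."""
    # Assumed tokenization: runs of letters and apostrophes, lowercased.
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Assumed sentence boundary: '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    n_words = len(words)
    n_sentences = len(sentences)
    return {
        "word_count": n_words,
        "sentence_count": n_sentences,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "unique_word_ratio": len(set(words)) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / n_sentences if n_sentences else 0.0,
    }
```

The ratio of unique words to total words is the usual type-token ratio; it is sensitive to text length, so comparing it across compositions of very different lengths should be done with care.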
In one embodiment, as shown in FIG. 3, the error correction module 20 includes: a fixed error recognition unit 21 for performing error recognition on the English composition to be scored according to a grammatical error correction algorithm based on grammar rules, to obtain sentences containing grammatical errors; a deep network error correction unit 22 for correcting the English composition to be scored, based on the sentences containing grammatical errors, according to the neural machine translation network model, to obtain an error correction result; and a post-processing unit 23 for performing a spell check on the error correction result to obtain an optimized correct sentence.
In one embodiment, the rule-based grammatical error correction algorithm relies on a number of predefined grammar rules, built from elements such as specific keywords and phrases, parts of speech and wildcards. With this algorithm, common grammatical errors can be corrected, such as subject-verb agreement errors, noun plural errors and errors involving commonly confused words.
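As a rough illustration of keyword and wildcard rules of this kind, the sketch below encodes two rules as regular expressions. The rule set, the rule names and the function `find_errors` are hypothetical and far smaller than any practical rule base; the patent does not disclose its actual rules.

```python
import re

# Each rule: (name, compiled pattern, replacement template). Hypothetical examples.
RULES = [
    ("subject_verb_agreement",
     re.compile(r"\b(he|she|it)\s+have\b", re.IGNORECASE), r"\1 has"),
    ("noun_plural",
     re.compile(r"\b(two|three|many|several)\s+(book|student|apple)\b", re.IGNORECASE),
     r"\1 \2s"),
]


def find_errors(sentence):
    """Return (rule name, start, end, suggestion) for every rule match."""
    hits = []
    for name, pattern, template in RULES:
        for match in pattern.finditer(sentence):
            # Match.expand fills the \1, \2 group references in the template.
            hits.append((name, match.start(), match.end(), match.expand(template)))
    return hits
```

Calling `find_errors("She have two book.")` reports one agreement error with the suggestion "She has" and one plural error with the suggestion "two books"; a sentence that matches no rule returns an empty list.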
In one embodiment, the neural machine translation network model may be an LSTM model, a Transformer model, or the like. The model treats grammatical error correction as a machine translation task: the correction process is modeled as translation, a mapping from erroneous sentences to correct sentences is learned through training, and the correction is carried out with that mapping. Taking the Transformer model as an example, a sentence containing grammatical errors is first segmented with the byte pair encoding (BPE) algorithm. BPE splits each word into smaller subword units; because English words are usually composed of such units, BPE segmentation mitigates, to some extent, the negative effect of out-of-vocabulary words on correction quality. The segmented sentence is then input into the Transformer model, which uses a multi-layer Transformer as its encoder-decoder structure to map an erroneous sentence to a correct one. Unlike ordinary translation, grammatical error correction leaves most sentences and words unchanged, so an attention mechanism over the encoding result is added to the Transformer model: the encoding result is passed through one Transformer layer to obtain a probability distribution over the input and a weight for the parts that do not need to be changed, and this weighting influences the final decoding result. Specifically, the encoding result and the decoding result may be multiplied, that is, the distribution over the vocabulary is determined through one Transformer layer, and the product is then added to the decoding result to obtain the error correction result.
In this way, a sentence can be corrected while keeping as much of the original sentence as possible, and the method works well on out-of-vocabulary words because it can retain them as far as possible. After the error correction result is obtained, it may be post-processed to further correct particular error types. For example, a spell check may be performed on the error correction result to produce several potentially better sentences, which are then scored with a language model to select the best one.
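To make the subword idea behind BPE concrete, the sketch below learns merge operations from a toy word-frequency table and applies them to an unseen word. It follows the commonly published form of the algorithm rather than anything disclosed in the patent; the function names, the `</w>` end-of-word marker and the toy corpus are assumptions.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with its concatenation."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merges from a {word: frequency} table."""
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word, merges):
    """Segment a (possibly unseen) word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)
```

With a table such as `{"low": 5, "lower": 2, "newest": 6, "widest": 3}`, the unseen word "lowest" is still split into known subword units instead of being mapped to a single unknown token, which is the property the description relies on.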
In the English composition analysis scoring system provided by the embodiment of the invention, the neural machine translation network model is an end-to-end sequence-to-sequence model. Compared with other sequence-to-sequence models, such as a two-layer unidirectional LSTM model or an ordinary Transformer model, the model used here produces more stable results because of the added attention mechanism over the encoding result, and it handles out-of-vocabulary words more efficiently because of BPE segmentation. In addition, during model training, error types are extracted by rules, grammatical errors are injected into a large number of collected correct sentences to generate erroneous sentence samples, the model is trained on these samples, and the model is then optimized with real pairs of correct and erroneous sentences, so that the final error correction result is better.
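The generation of erroneous sentence samples from correct sentences can be illustrated with the sketch below. The corruption rules and the function name `make_training_pairs` are hypothetical; the patent does not list the error types it extracts, and a real pipeline would typically sample corruptions at random rather than apply every rule once.

```python
import re

# Hypothetical corruption rules: (error type, pattern in a correct sentence, wrong form).
CORRUPTIONS = [
    ("subject_verb_agreement", re.compile(r"\bhas\b"), "have"),
    ("article", re.compile(r"\ban\b"), "a"),
]


def make_training_pairs(correct_sentences):
    """Return (erroneous, correct) pairs, one per rule that changes a sentence."""
    pairs = []
    for sentence in correct_sentences:
        for _error_type, pattern, wrong in CORRUPTIONS:
            corrupted = pattern.sub(wrong, sentence, count=1)
            if corrupted != sentence:
                pairs.append((corrupted, sentence))
    return pairs
```

Each pair has the erroneous sentence as the source side and the original sentence as the target side, matching the translation-style framing of the correction task.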
In one embodiment, as shown in FIG. 4, the error correction module 20 further includes: a statistical error correction unit 24 for performing error recognition on the English composition to be scored according to the statistical language model and replacing erroneous sentences with correct sentences. In one embodiment, the statistical language model may be an n-gram language model, used together with confusion sets built from known error-prone words. For example, students may confuse the two words 'there' and 'then', so a confusion set can be built from 'there' and 'then'. When error recognition is performed with the statistical language model, words in the English composition to be scored are replaced with the alternatives from the pre-built confusion set, and the n-gram model is then used to score the resulting sentences to determine whether a confusable word has been misused.
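The confusion-set check can be sketched with a small bigram model, as below. The class `BigramLM`, the add-one smoothing, the whitespace tokenization and the toy training corpus are all assumptions made for illustration; the patent only states that an n-gram model scores the candidate replacements.

```python
import math
from collections import Counter

# Confusion set taken from the example in the description.
CONFUSION_SETS = [{"there", "then"}]


class BigramLM:
    """Bigram language model with add-one smoothing (illustrative only)."""

    def __init__(self, corpus):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in corpus:
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def log_prob(self, tokens):
        tokens = ["<s>"] + tokens + ["</s>"]
        total = 0.0
        for a, b in zip(tokens, tokens[1:]):
            total += math.log(
                (self.bigrams[(a, b)] + 1) / (self.unigrams[a] + self.vocab_size)
            )
        return total


def correct_confusions(sentence, lm):
    """For each confusable word, keep the alternative the model scores highest."""
    tokens = sentence.lower().split()  # note: this sketch discards capitalization
    for i, token in enumerate(tokens):
        for confusion_set in CONFUSION_SETS:
            if token in confusion_set:
                tokens[i] = max(
                    sorted(confusion_set),
                    key=lambda w: lm.log_prob(tokens[:i] + [w] + tokens[i + 1:]),
                )
    return " ".join(tokens)
```

A word is flagged as misused when the best-scoring alternative differs from the word the student wrote; in practice a score margin would probably be required before replacing, to avoid over-correction.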
In one embodiment, the topic analysis module 40 is used for extracting features of the English composition to be scored based on a meta-learning framework and a SimNet matching model, and matching the extracted features with predefined topic words or with topic words in a model essay to obtain a matching result, wherein the features of the English composition to be scored are enhanced by self-attention and guided-attention mechanisms during feature extraction, and the topic words in the model essay are extracted by the SIFRank algorithm.
In one embodiment, topic analysis may be treated as a matching task: the input composition is matched against the topic words, and the degree of matching determines whether the composition is on topic. The embodiment of the invention uses a SimNet-based matching model that extracts features from the topic words and from the composition separately and then fuses them for matching. During feature extraction, self-attention and guided-attention mechanisms are used to make the extracted features more effective, and dot-product fusion is used in the fusion and matching step. The self-attention mechanism multiplies the English composition to be scored with itself, and the guided-attention mechanism multiplies the English composition to be scored with the model essay to be matched, so that certain features of the composition are strengthened and features irrelevant to the topic, such as the word 'and', are weakened.
In addition, the MAML meta-learning framework can be combined with the SimNet matching model during matching. Matching articles against one topic word is treated as one meta-learning task, so that the model learns the ability to match and its generalization improves, and the model is fine-tuned for each input topic and its corresponding model essays, which improves its topic-specific matching ability. For example, when the English composition to be scored is matched against model essays, the model obtained from the meta-learning tasks can be fine-tuned on those model essays so that a more accurate matching result is obtained. Furthermore, before the English composition to be scored is matched against the model essays, the SIFRank automatic topic word extraction method can be used to extract the topic words of the model essays, replacing manual extraction of topic words.
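The guided-attention and dot-product fusion steps can be illustrated in miniature as follows. This toy is not SimNet and involves no learned parameters or meta-learning: the token vectors are assumed to be given, and the function names (`guided_pool`, `match_score`) are hypothetical. It only shows how attention weights derived from a topic vector can emphasize on-topic tokens before a dot-product match.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def guided_pool(tokens: List[Vector], guide: Vector) -> Vector:
    """Attention-weighted average of token vectors, with weights guided by `guide`."""
    weights = softmax([dot(t, guide) for t in tokens])
    return [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(len(guide))]


def match_score(essay_tokens: List[Vector], topic: Vector) -> float:
    """Dot-product fusion of the pooled essay representation with the topic vector."""
    return dot(guided_pool(essay_tokens, topic), topic)
```

Tokens whose vectors align with the topic vector receive larger weights, so function words that carry little topical content contribute less to the pooled representation.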
In one embodiment, the highlight matching module 50 may perform full-text matching on the English composition to be scored based on highlight matching rules constructed in advance. In one embodiment, the highlight collocations include common highlight patterns such as 'it is adj.'. With the highlight matching module, highlights along multiple dimensions, such as highlight sentences, words, phrases and conjunctions in the English composition, can be output at the sentence level.
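Rule-based highlight matching at the sentence level might look like the sketch below. The three patterns and the name `find_highlights` are hypothetical stand-ins, since the patent does not list its highlight rules; note also that `\w+` matches any word, so the second pattern only approximates an adjective slot.

```python
import re

# Hypothetical highlight rules; a real rule base would be much larger.
HIGHLIGHT_RULES = {
    "not_only_but_also": re.compile(r"\bnot only\b.+\bbut also\b", re.IGNORECASE),
    "it_is_adj_that": re.compile(r"\bit is \w+ (?:that|to)\b", re.IGNORECASE),
    "conjunction": re.compile(
        r"\b(?:however|moreover|therefore|furthermore)\b", re.IGNORECASE
    ),
}


def find_highlights(sentences):
    """Return (sentence index, rule name) for every rule that matches a sentence."""
    results = []
    for index, sentence in enumerate(sentences):
        for name, pattern in HIGHLIGHT_RULES.items():
            if pattern.search(sentence):
                results.append((index, name))
    return results
```

Because results are keyed by sentence index, the output can be shown to the student next to the sentence that contains each highlight.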
In one embodiment, as shown in FIG. 5, the scoring module 60 includes: a grading unit 61 for dividing English compositions to be scored into a plurality of grades according to preset requirements; and a scoring unit 62 for obtaining the score of an English composition within its grade by weighted calculation according to the analysis results of the error correction module 20, the multi-dimensional feature statistics module 30, the topic analysis module 40 and the highlight matching module 50. In one embodiment, English compositions may be graded according to basic requirements such as the number of words, the number of sentences and the grammar; for example, compositions may be divided into five grades according to these requirements, with each grade corresponding to a score range. After the grade of the English composition to be scored is determined, a score is determined for each module from that module's analysis results, and the module scores are combined by weighted summation to obtain the total score of the composition. The total score falls within the score range corresponding to the grade of the English composition to be scored.
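The two-step scoring (grade first, then a weighted sum placed inside the grade's range) can be sketched as follows. Every number here is an assumption: the thresholds in `assign_grade`, the five score ranges, the module weights and the convention that each module score lies between 0 and 1 are chosen only to make the example concrete.

```python
# Assumed score range for each of the five grades.
GRADE_RANGES = {1: (0, 20), 2: (20, 40), 3: (40, 60), 4: (60, 80), 5: (80, 100)}

# Assumed module weights; they sum to 1.
WEIGHTS = {"error": 0.4, "features": 0.3, "topic": 0.2, "highlights": 0.1}


def assign_grade(word_count, sentence_count, grammar_errors):
    """Grade 1 to 5: one step up for each basic requirement met (assumed thresholds)."""
    met = 0
    met += word_count >= 80
    met += sentence_count >= 6
    met += grammar_errors <= 5
    met += grammar_errors <= 2
    return 1 + met


def total_score(grade, module_scores):
    """Weighted sum of module scores in [0, 1], mapped into the grade's score range."""
    low, high = GRADE_RANGES[grade]
    weighted = sum(WEIGHTS[name] * module_scores[name] for name in WEIGHTS)
    weighted = min(max(weighted, 0.0), 1.0)  # clamp so the total stays in range
    return low + weighted * (high - low)
```

Mapping the weighted sum linearly into the grade's range is one simple way to guarantee that the total score falls inside the range for that grade, as the description requires.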
The embodiment of the invention also provides an English composition analysis scoring method. As shown in FIG. 6, the method comprises the following steps:
Step S101: acquiring the English composition to be scored.
Step S102: performing error recognition and correction on the acquired English composition to be scored by combining a grammar error correction algorithm, a neural machine translation network model, and a statistical language model.
Step S103: performing statistical analysis on the words, phrases, and sentences in the English composition to be scored.
Step S104: performing theme analysis on the English composition to be scored.
Step S105: performing highlight recognition on the English composition to be scored.
Step S106: calculating the score of the English composition to be scored.
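The order of the steps can be sketched as a small pipeline. This is only an outline under stated assumptions: the function name `run_pipeline`, its parameters, and the report keys are hypothetical, and each analysis step is passed in as a callable rather than implemented here.

```python
def run_pipeline(text, correct, statistics, theme, highlights, score):
    """Run steps S102 to S106 on a composition already acquired in step S101.

    Each argument after `text` is a callable standing in for one module;
    `score` receives the whole report so it can combine the other results.
    """
    report = {"text": text}
    report["corrections"] = correct(text)      # S102: error recognition and correction
    report["statistics"] = statistics(text)    # S103: word, phrase and sentence statistics
    report["theme"] = theme(text)              # S104: theme analysis
    report["highlights"] = highlights(text)    # S105: highlight recognition
    report["score"] = score(report)            # S106: score calculation
    return report
```

For example, calling `run_pipeline(essay, correct=my_corrector, statistics=my_stats, theme=my_theme, highlights=my_highlights, score=my_scorer)` with user-supplied functions returns one dictionary holding every intermediate result alongside the final score, which matches the idea of reporting errors, statistics, and highlights to the student together with the score.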
According to the English composition analysis scoring method provided by the embodiment of the invention, error recognition and correction, statistical analysis of words, phrases, and sentences, theme analysis, and highlight recognition are performed on the English composition to be scored, so that the composition can be measured and scored comprehensively. Students can thus gain a deeper understanding of the strengths and weaknesses of their compositions, and their writing level can be effectively improved. Meanwhile, the method combines a grammar error correction algorithm, a neural machine translation network model, and a statistical language model to recognize and correct errors, which can improve the accuracy of error recognition. The method can also output both the erroneous words and sentences and the corrected words and sentences, which helps students learn from the errors in their compositions and further improves their knowledge level.
For a detailed description of the functions of the English composition analysis scoring method provided by the embodiment of the invention, refer to the description of the English composition analysis scoring system in the foregoing embodiment.
An embodiment of the present invention further provides a storage medium, as shown in FIG. 7, on which a computer program 601 is stored; when executed by a processor, the instructions of the computer program implement the steps of the English composition analysis scoring method in the foregoing embodiment. The storage medium also stores audio and video stream data, characteristic frame data, interactive request signaling, encrypted data, preset data sizes, and the like. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD), or a Solid State Drive (SSD); the storage medium may also comprise a combination of memories of the kinds described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kind described above.
The English composition analysis scoring system provided by the embodiment of the present invention, as shown in FIG. 8, includes a processor 51 and a memory 52. The processor 51 and the memory 52 may be connected through a bus or in another manner; FIG. 8 takes connection through a bus as an example.
The processor 51 may be a Central Processing Unit (CPU). The Processor 51 may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, or combinations thereof.
The memory 52, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the embodiments of the present invention. By running the non-transitory software programs, instructions, and modules stored in the memory 52, the processor 51 executes its various functional applications and data processing, that is, implements the English composition analysis scoring method in the above method embodiment.
The memory 52 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor 51, and the like. Further, the memory 52 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 52 may optionally include memory located remotely from the processor 51, and these remote memories may be connected to the processor 51 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 52 and, when executed by the processor 51, perform the English composition analysis scoring method of the embodiment shown in FIG. 6.
The details of the above English composition analysis scoring system can be understood by referring to the corresponding description and effects of the embodiment shown in FIG. 6, and are not repeated here.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. An English composition analysis scoring system, comprising:
the recognition module is used for acquiring an English composition to be scored;
the error correction module is used for carrying out error recognition and correction on the acquired English composition to be scored by combining a grammar error correction algorithm, a neural machine translation network model and a statistical language model;
the multi-dimensional feature statistical module is used for performing statistical analysis on words, phrases and sentences in the English composition to be scored;
the theme analysis module is used for carrying out theme analysis on the English composition to be scored;
the highlight matching module is used for carrying out highlight recognition on the English composition to be scored;
and the scoring module is used for calculating the score of the English composition to be scored according to the analysis results of the error correction module, the multi-dimensional feature statistical module, the theme analysis module and the highlight matching module.
2. The English composition analysis scoring system according to claim 1, wherein the recognition module comprises:
the electronic recognition unit is used for acquiring an English composition to be scored that is input electronically;
and the handwriting recognition unit is used for acquiring a handwritten picture of the English composition to be scored and obtaining the text of the English composition to be scored by means of a deep convolutional neural network.
3. The English composition analysis scoring system according to claim 1, further comprising:
and the preprocessing module is used for dividing the English composition to be scored into paragraphs and sentences before error recognition and correction, statistical analysis, theme analysis and highlight matching are performed on it.
4. The English composition analysis scoring system according to claim 1, wherein the error correction module comprises:
the fixed error recognition unit is used for carrying out error recognition on the English composition to be scored according to a grammar error correction algorithm based on grammar rules to obtain sentences containing grammatical errors;
the deep network error correction unit is used for correcting the acquired English composition to be scored, based on the sentences containing grammatical errors, according to the neural machine translation network model to obtain an error correction result;
and the post-processing unit is used for carrying out spelling check on the error correction result to obtain optimized correct sentences.
5. The English composition analysis scoring system according to claim 4, wherein the deep network error correction unit comprises:
the word segmentation unit is used for segmenting the sentences containing grammatical errors into tokens according to a byte pair encoding (BPE) algorithm;
and the error correction subunit is used for correcting errors in the segmented sentences, using a multi-layer Transformer as the encoder-decoder structure in combination with an attention mechanism, to obtain an error correction result.
6. The English composition analysis scoring system according to claim 1, wherein the error correction module further comprises:
and the statistical error correction unit is used for carrying out error recognition on the English composition to be scored according to the statistical language model and replacing erroneous sentences with correct sentences.
7. The English composition analysis scoring system according to claim 1, wherein the theme analysis module is used for: extracting features of the English composition to be scored based on a meta-learning framework and a simnet matching model, and matching the extracted features with predefined subject words or with subject words in model essays to obtain a matching result, wherein the features of the English composition to be scored are enhanced by a self-attention and guided-attention mechanism when they are extracted, and the subject words in the model essays are extracted through the SIFRank algorithm.
8. The English composition analysis scoring system of claim 7, wherein the scoring module comprises:
the grading unit is used for dividing English compositions to be scored into a plurality of grades according to preset requirements;
and the scoring unit is used for obtaining the score of the English composition of each grade by weighted calculation according to the analysis results of the error correction module, the multi-dimensional feature statistical module, the theme analysis module and the highlight matching module.
9. An English composition analysis scoring method, characterized by comprising the following steps:
acquiring an English composition to be scored;
carrying out error recognition and correction on the acquired English composition to be scored by combining a grammar error correction algorithm, a neural machine translation network model and a statistical language model;
performing statistical analysis on words, phrases and sentences in the English composition to be scored;
performing theme analysis on the English composition to be scored;
carrying out highlight recognition on the English composition to be scored;
and calculating the score of the English composition to be scored.
10. A computer-readable storage medium storing computer instructions for causing a computer to perform the English composition analysis scoring method according to claim 9.
CN202011645278.6A 2020-12-31 2020-12-31 English composition analysis scoring system, method and storage medium Pending CN114692606A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011645278.6A CN114692606A (en) 2020-12-31 2020-12-31 English composition analysis scoring system, method and storage medium

Publications (1)

Publication Number Publication Date
CN114692606A true CN114692606A (en) 2022-07-01

Family

ID=82135665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011645278.6A Pending CN114692606A (en) 2020-12-31 2020-12-31 English composition analysis scoring system, method and storage medium

Country Status (1)

Country Link
CN (1) CN114692606A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294660A (en) * 2012-02-29 2013-09-11 张跃 Automatic English composition scoring method and system
US20150199913A1 (en) * 2014-01-10 2015-07-16 LightSide Labs, LLC Method and system for automated essay scoring using nominal classification
CN110851599A (en) * 2019-11-01 2020-02-28 中山大学 Automatic scoring method and teaching and assisting system for Chinese composition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
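Tan Yongmei (谭咏梅): "Corpus-based method for checking and correcting grammatical errors in English articles" (基于语料库的英语文章语法错误检查及纠正方法), Journal of Beijing University of Posts and Telecommunications (北京邮电大学学报), 31 August 2016 (2016-08-31), pages 92 - 97 *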

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination