CN110851599A - Automatic scoring method and teaching and assisting system for Chinese composition - Google Patents

Automatic scoring method and teaching and assisting system for Chinese composition

Info

Publication number
CN110851599A
CN110851599A (application number CN201911059419.3A)
Authority
CN
China
Prior art keywords
composition
scoring
scored
chinese
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911059419.3A
Other languages
Chinese (zh)
Other versions
CN110851599B (en)
Inventor
夏俐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN201911059419.3A priority Critical patent/CN110851599B/en
Publication of CN110851599A publication Critical patent/CN110851599A/en
Application granted granted Critical
Publication of CN110851599B publication Critical patent/CN110851599B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides an automatic scoring method and a teaching and assisting system for Chinese compositions. The method comprises the following steps: a composition acquisition step for obtaining the composition to be scored; a shallow feature extraction step for extracting shallow features of the composition to be scored; a deep semantic feature extraction step for extracting deep semantic features of the composition to be scored, the deep semantic features comprising wrongly written character features and grammatical error features; and a scoring step for combining the extracted shallow features and deep semantic features and fitting them with a random forest to obtain the scoring result of the composition to be scored. The method also comprises a pinyin conversion step and a topic extraction step. By combining the shallow features and the deep semantic features of the composition, the method achieves high scoring accuracy, obtains satisfactory evaluation results when trained on small samples, and effectively improves sample utilization. In addition, functions such as wrongly written character recognition and correction, pinyin recognition and conversion, and grammatical error recognition and correction are added, providing multi-dimensional feedback to tutor the user's writing and enhancing the user experience.

Description

Automatic scoring method and teaching and assisting system for Chinese composition
Technical Field
The invention relates to natural language processing technology in the field of artificial intelligence, and in particular to an automatic scoring method for Chinese compositions and a teaching and assisting system.
Background
Brief introduction to automatic composition scoring System
An automatic essay scoring (AES) system is an educational aid based on intelligent algorithms that has emerged with the development of artificial intelligence and deep learning. Compared with manual scoring, an automatic composition scoring system is more objective, timely, efficient, and low-cost, so it has received increasing attention and research, and its development has gradually become a trend. Traditional automatic composition scoring systems mainly model and analyze texts through shallow features and ignore the deep semantic features of the text, whereas deep learning techniques use recurrent neural networks to extract the deep semantic features of the text, making the scoring result more objective.
Challenge of Chinese composition automatic scoring system
In natural language processing, most current research is based on English. Because of the characteristics of the Chinese language, processing Chinese is technically much more complex than processing English, and Chinese processing is relatively underdeveloped in practical applications, with many difficulties and challenges. Existing automatic composition scoring systems mainly handle English compositions, and their results on Chinese compositions are not satisfactory. The invention therefore provides an automatic scoring method and a teaching and assisting system specifically for Chinese compositions.
Traditional automatic composition scoring systems require manually designed text features, which is costly and cannot capture the deep semantics of the text, while deep learning techniques for extracting deep semantic features depend on large corpora; Chinese composition corpora have traditionally been small, so improving the effective utilization of samples is very important. Meanwhile, how to design features on small-scale samples, how to recognize and correct the wrongly written characters, pinyin, and grammatical errors that appear in Chinese compositions, how to combine the extracted features for training, and how to ensure the accuracy of the writing-tutoring feedback are a series of problems that must be solved when designing an automatic scoring system for Chinese compositions.
Prior art implementation
When designing an automatic composition scoring system, non-patent document 1 trains a CNN-LSTM model on an English composition data set. Non-patent document 2 extracts lexical and syntactic features of a composition and trains a multiple linear regression model on the extracted features. Patent document 3 provides a composition scoring method in which two neural networks are designed; feature vectors and word vectors of the composition text are used as inputs to the neural networks, and the composition score is calculated from the outputs of the two networks. Patent document 4 provides a composition scoring method based on an attention mechanism, which adopts a neural-network attention framework with a word-sentence-document three-layer structure and fuses manually extracted features with the document layer to set the attention weights of the document layer. Patent document 5 acquires a large number of compositions on a given topic, analyzes the content of each composition to obtain its writing pattern, trains a time-series model of the compositions, tests the user's composition with the model, and scores it according to its novelty.
Non-patent document 1: Taghipour K, Ng H T. A Neural Approach to Automated Essay Scoring [C]// Conference on Empirical Methods in Natural Language Processing, 2016.
Non-patent document 2: Research on automatic composition scoring for the Chinese-as-a-second-language test [D]. Beijing Language and Culture University, 2006.
Patent document 3: CN108519975A composition scoring method, device and storage medium
Patent document 4: CN107133211A composition scoring method based on attention mechanism
Patent document 5: CN109635087A composition scoring method and family education equipment
Disadvantages of the prior art
The deep learning technique represented by non-patent document 1 depends on large-scale samples during training and cannot achieve a satisfactory training effect on small samples. The machine learning technique represented by non-patent document 2 does not fully extract the deep semantic features of the composition, and the fitting capability of a multiple linear regression model is limited, so the scoring accuracy is low. The methods represented by patent documents 3 and 4 are based on neural networks: patent document 3 predicts text scores by designing multiple neural networks, and patent document 4 adopts an attention mechanism at the output of the neural network to improve scoring accuracy; however, these methods have low sample utilization and cannot obtain satisfactory training results on small samples. Patent document 5 trains a neural network under a specific topic, which results in insufficient generalization capability of the trained network, and it takes only the novelty of a composition as the judging standard without considering other dimensions of the composition, which leads to low scoring accuracy.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method for constructing an automatic Chinese composition scoring system, an automatic Chinese composition scoring method, a teaching and assisting system, a computer-readable storage medium, and a computer program product.
In the technical solution of the invention, the shallow features and the deep semantic features of the composition are combined, which improves the scoring accuracy and the sample utilization rate and achieves satisfactory results when training on small samples.
In order to achieve the above object, an embodiment of the first aspect of the present invention provides a method for constructing an automatic scoring system for Chinese compositions, the method comprising the following steps:
a corpus construction step, which is used for constructing a Chinese composition corpus;
a shallow feature extraction step, namely extracting shallow features of the composition based on the corpus;
a deep semantic feature extraction step, wherein deep semantic features of the composition are extracted based on the corpus, and the deep semantic features comprise wrongly written character features and grammatical error features;
and a regression step, which is used for combining the extracted shallow layer characteristics and deep layer semantic characteristics and adopting random forest fitting to obtain the scoring result of the composition.
Further, the extraction of the wrongly written character features specifically comprises: segmenting the composition with a probabilistic word segmentation model; comparing the composition text with the wrongly written character recognition corpus according to the segmentation result to obtain a suspicious word set; comparing the suspicious word set with the wrongly written character correction corpus to obtain a candidate word set; and calculating the semantic perplexity of the candidate word set and taking the word with the lowest perplexity as the wrongly written character correction result.
Further, the extraction of the grammatical error features specifically comprises: training word vectors on the corpus, inputting the word vectors into a Bi-LSTM neural network model, and training to obtain a labeling sequence, i.e., the grammatical error result.
Furthermore, the method also comprises a pinyin conversion step, which is used for identifying the pinyin in the text to be scored and converting the pinyin into corresponding Chinese characters.
Further, the method also comprises a topic extraction step for extracting the topics implicit in the text to be scored.
The embodiment of the second aspect of the invention provides a Chinese composition automatic scoring method, which comprises the following steps:
acquiring a composition to be scored: acquiring a composition picture to be scored, and performing Chinese recognition to obtain a composition text; or directly acquiring the composition text to be evaluated;
shallow layer feature extraction: processing the composition text to be scored to obtain word segmentation results of the composition text; according to the word segmentation result, counting shallow features of the composition to be scored;
deep semantic feature extraction: extracting deep semantic features of the composition to be scored, wherein the deep semantic features comprise wrongly written character features and grammatical error features;
grading: and combining the extracted shallow layer features and deep layer semantic features and adopting random forest fitting to obtain a scoring result of the composition to be scored.
Further, the extraction of the wrongly written character features specifically comprises: processing the composition text to be scored to obtain its word segmentation result; comparing the composition text to be scored with the wrongly written character recognition corpus according to the segmentation result to obtain a suspicious word set; comparing the suspicious word set with the wrongly written character correction corpus to obtain a candidate word set; and calculating the semantic perplexity of the candidate word set and taking the word with the lowest perplexity as the wrongly written character correction result.
Further, the extraction of the grammatical error features specifically comprises: processing the composition text to be scored to obtain its word vectors; and inputting the word vectors into a Bi-LSTM neural network model and training to obtain a labeling sequence, i.e., the grammatical error result.
Furthermore, the method also comprises a pinyin conversion step, which is used for identifying the pinyin in the text to be scored and converting the pinyin into corresponding Chinese characters.
Further, the method also comprises a topic extraction step for extracting the topics implicit in the text to be scored.
Further, the shallow features specifically include the number of sentences, the average sentence length, the full-text word count, the number of metaphor words, the number of pinyin occurrences, and the vocabulary level.
Further, the grammatical error features specifically include four types: redundant words, missing words, wrong word selection, and word-order errors.
The embodiment of the third aspect of the invention provides a Chinese composition automatic scoring system, which comprises the following modules:
the composition to be scored acquisition module: acquiring a composition picture to be scored, and performing Chinese recognition to obtain a composition text; or directly acquiring the composition text to be evaluated;
shallow layer feature extraction module: the system is used for processing the composition texts to be scored to obtain word segmentation results of the composition texts; according to the word segmentation result, counting shallow features of the composition to be scored;
the deep semantic feature extraction module: used for extracting deep semantic features of the composition to be scored, wherein the deep semantic features comprise wrongly written character features and grammatical error features;
a scoring module: and the method is used for combining the extracted shallow layer characteristics and deep layer semantic characteristics and adopting random forest fitting to obtain a scoring result of the composition to be scored.
Further, the extraction of the wrongly written character features specifically comprises: processing the composition text to be scored to obtain its word segmentation result; comparing the composition text to be scored with the wrongly written character recognition corpus according to the segmentation result to obtain a suspicious word set; comparing the suspicious word set with the wrongly written character correction corpus to obtain a candidate word set; and calculating the semantic perplexity of the candidate word set and taking the word with the lowest perplexity as the wrongly written character correction result.
Further, the extraction of the grammatical error features specifically comprises: processing the composition text to be scored to obtain its word vectors; and inputting the word vectors into a Bi-LSTM neural network model and training to obtain a labeling sequence, i.e., the grammatical error result.
Furthermore, the system also comprises a pinyin conversion module which is used for identifying pinyin in the text to be scored and converting the pinyin into corresponding Chinese characters.
Further, the system also comprises a topic extraction module for extracting the topics implicit in the text to be scored.
The embodiment of the fourth aspect of the invention provides a Chinese composition automatic scoring system, which is constructed according to the construction method of the Chinese composition automatic scoring system.
An embodiment of the fifth aspect of the present invention provides an automatic Chinese composition scoring teaching and assisting system, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; or the teaching and assisting system comprises a terminal and a cloud server connected to the terminal and storing a computer program, wherein the computer program, when executed, implements the above automatic Chinese composition scoring method.
An embodiment of the sixth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, the computer program, when executed, implementing the above automatic scoring method for Chinese compositions.
An embodiment of the seventh aspect of the present invention provides a computer program product which, when executed, implements the above automatic scoring method for Chinese compositions.
Composition scoring methods that consider only shallow features have low scoring accuracy, while methods that consider only deep semantic features require a large corpus for sample training. By combining the shallow features and the deep semantic features of the composition, the invention improves the scoring accuracy and effectively improves sample utilization, thereby solving a series of problems in the prior art.
Compared with existing Chinese composition scoring software, the automatic composition scoring method and teaching and assisting system of the invention have the following advantages: by combining the shallow features and the deep semantic features of the composition, the technical solution achieves high scoring accuracy, obtains satisfactory evaluation results when trained on small samples, and effectively improves sample utilization; meanwhile, functions such as wrongly written character recognition and correction, pinyin recognition and conversion, and grammatical error recognition and correction are added, providing multi-dimensional writing-tutoring feedback and enhancing the user experience.
Drawings
FIG. 1 is a schematic diagram of the working principle of the automatic Chinese composition scoring method and the teaching and assisting system according to the present invention.
FIG. 2 is a schematic diagram illustrating the principle of shallow feature extraction according to the present invention.
FIG. 3 is a schematic diagram illustrating the principle of extracting the syntax error feature according to the present invention.
FIG. 4 is a schematic diagram of an implementation of the Chinese composition automatic scoring tutoring system of the present invention.
FIG. 5 is one of the UI interfaces of the automatic Chinese composition scoring system constructed by the present invention: an OCR recognition interface schematic.
Fig. 6 is a second UI interface of the automatic scoring system for Chinese compositions constructed in the present invention: a schematic diagram of the score display interface.
FIG. 7 is a diagram illustrating key steps in the method for automatically scoring Chinese compositions according to the present invention.
Figs. 8-10 are diagrams illustrating an embodiment of the automatic scoring method for Chinese compositions according to the present invention, wherein fig. 8 shows an image of a composition to be scored, fig. 9 shows the Chinese character recognition, and fig. 10 shows the result of scoring with the automatic scoring method for Chinese compositions according to the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
FIG. 1 is a schematic diagram of the working principle of the automatic Chinese composition scoring method and the teaching and assisting system according to the present invention. As shown in FIG. 1, the technical solution of the invention combines the composition's shallow features and deep semantic features, improving the scoring accuracy and the sample utilization rate and achieving satisfactory results on small samples. First, a Chinese composition corpus is constructed; shallow features of the composition, which are mainly statistical features, are extracted based on the corpus; deep semantic features of the composition, including the wrongly written character features and grammatical error features, are extracted based on the corpus; finally, the shallow features and deep semantic features are combined and fitted with a random forest to obtain the score of the composition. The scheme achieves high scoring accuracy when trained on small samples and effectively improves sample utilization.
Compared with existing Chinese composition scoring software, the automatic Chinese composition scoring teaching and assisting system provided by the invention adds functions such as wrongly written character correction, pinyin correction, and grammatical error recognition, and provides multi-dimensional writing-tutoring feedback.
Method for constructing automatic Chinese composition scoring system
The construction method of the automatic Chinese composition scoring system of the present invention is described in detail below.
First, a Chinese composition corpus is constructed. 1000 composition pictures are collected, professional scoring teachers are hired to score the compositions, the Chinese characters are recognized with a networked cloud OCR service and proofread manually, and an electronic Chinese composition corpus is built. Grade one to grade six word banks are constructed from the People's Education Press primary-school Chinese textbooks, containing 174, 536, 1132, 1737, 2172, and 2655 words for grades one to six respectively. It should be noted that the Chinese composition corpus can also be constructed from composition pictures collected elsewhere (the source of the compositions is not limited by the invention), or the composition texts can be obtained directly; the word banks can likewise be built from the textbook systems of other publishers or from other sources independent of the textbooks.
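As a rough illustration of how the matching degree between a composition and each grade-level word bank might be computed, the following Python sketch counts, for each grade, the fraction of the composition's tokens covered by that grade's word bank. The file names, the use of the jieba segmenter, and this particular definition of "matching degree" are illustrative assumptions rather than details given in the patent.

```python
# Sketch: per-grade lexicon matching degree (assumed definition: token coverage ratio).
# File names and the use of jieba are illustrative assumptions.
import jieba

def load_word_bank(path):
    """Load one word per line into a set."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

def lexicon_match_degrees(text, bank_paths):
    """Return {grade: fraction of composition tokens found in that grade's word bank}."""
    tokens = [t for t in jieba.cut(text) if t.strip()]
    degrees = {}
    for grade, path in bank_paths.items():
        bank = load_word_bank(path)
        hits = sum(1 for t in tokens if t in bank)
        degrees[grade] = hits / len(tokens) if tokens else 0.0
    return degrees

# Example usage (paths are hypothetical):
# banks = {g: f"wordbank_grade{g}.txt" for g in range(1, 7)}
# print(lexicon_match_degrees("我的妈妈好像一朵花。", banks))
```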
And then shallow feature extraction is carried out. The number of sentences in the composition, the average number of words per sentence, the full-text word count, the number of metaphor words, the number of pinyin occurrences, and the matching degree between the composition and each grade-level lexicon are counted, and the results are taken as the shallow features of the composition. The metaphor feature words are Chinese comparative markers (words meaning "like", "as if", "as though", and so on). When counting the shallow features, a probabilistic word segmentation model is used. As shown in fig. 2, the segmentation tags S, B, M, E denote a single-character word and the beginning, middle, and end of a multi-character word, respectively; each character is represented as a visible state o_t and its segmentation tag as a hidden state s_t, and the best segmentation is the tag combination that maximizes P(o_1, o_2, …, o_n | s_1, s_2, …, s_n). Define λ as the model parameters, a as the state transition probability matrix, b as the observation probability matrix, and δ_t(i) as the maximum probability over all single paths that end in state i at time t:

δ_t(i) = max P(i_t = i, i_{t-1}, …, i_1, o_t, …, o_1 | λ), i = 1, 2, …, N.

Define ψ_t(i) as the state at time t-1 on the maximum-probability path ending in state i at time t:

ψ_t(i) = argmax_{1 ≤ j ≤ N} [ δ_{t-1}(j) · a_{ji} ].

At termination,

P* = max_{1 ≤ i ≤ N} δ_T(i),  i*_T = argmax_{1 ≤ i ≤ N} δ_T(i),

and the optimal path is recovered by backtracking,

i*_t = ψ_{t+1}(i*_{t+1}), t = T-1, T-2, …, 1,

which yields the optimal word segmentation combination.
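The Viterbi recursion just described can be sketched as follows in Python. The transition matrix a, emission matrix b, and initial distribution pi are assumed to have been estimated from the corpus; this minimal sketch only illustrates the decoding step, not the patent's actual implementation.

```python
TAGS = ["S", "B", "M", "E"]  # single-character word, begin, middle, end of a word

def viterbi_segment(chars, pi, a, b):
    """Decode the most likely S/B/M/E tag sequence for a character string.
    pi[s]    : initial log-probability of tag s
    a[s][s2] : log transition probability s -> s2
    b[s][ch] : log emission probability of character ch under tag s
    """
    delta = [{s: pi[s] + b[s].get(chars[0], -1e9) for s in TAGS}]
    psi = [{}]
    for t in range(1, len(chars)):
        delta.append({})
        psi.append({})
        for s in TAGS:
            best_prev, best_score = max(
                ((j, delta[t - 1][j] + a[j][s]) for j in TAGS),
                key=lambda x: x[1],
            )
            delta[t][s] = best_score + b[s].get(chars[t], -1e9)
            psi[t][s] = best_prev
    # Termination: pick the best final tag, then backtrack through psi.
    last = max(TAGS, key=lambda s: delta[-1][s])
    tags = [last]
    for t in range(len(chars) - 1, 0, -1):
        tags.append(psi[t][tags[-1]])
    tags.reverse()
    return tags
```

Words are then recovered by cutting the character sequence after every S and E tag.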
Deep semantic feature extraction is then performed, extracting deep semantic features of the composition such as the wrongly written character features. First, the composition is segmented with the probabilistic word segmentation model, and the segmentation result is compared with the wrongly written character recognition corpus to obtain a suspicious word set. The wrongly written character recognition corpus can include, but is not limited to, a manually defined dictionary, a confusion-set dictionary, and a People's Daily dictionary; in the embodiment the manually defined dictionary contains 177 entries, the confusion-set dictionary 759, and the People's Daily dictionary 584,429. The suspicious word set is then compared with the wrongly written character correction corpus to obtain a candidate word set. The correction corpus includes, but is not limited to, a common-word dictionary, a same-component/same-radical set, and a same-pinyin set; in the embodiment the common-word dictionary contains 3502 entries, the same-pinyin dictionary 3431, and the similar-character dictionary 1664. A perplexity model is trained on the People's Daily corpus: with w_i a word of the text, the perplexity PP of a sentence S = w_1 w_2 … w_N is

PP(S) = P(w_1 w_2 … w_N)^(-1/N) = ( ∏_{i=1}^{N} 1 / P(w_i | w_1, …, w_{i-1}) )^(1/N).

The perplexity of each element of the candidate word set is calculated with this model, and the element with the lowest perplexity is taken as the wrongly written character correction result.
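A minimal sketch of the perplexity-based selection among candidate corrections is given below, assuming a simple bigram language model in place of the trained perplexity model; the bigram_prob scoring function and the best_correction helper are illustrative assumptions, not the patent's exact model.

```python
import math

def sentence_perplexity(words, bigram_prob):
    """PP(S) = P(w1..wN)^(-1/N) under a bigram approximation.
    bigram_prob(prev, w) must return a smoothed probability > 0."""
    log_p = 0.0
    prev = "<s>"
    for w in words:
        log_p += math.log(bigram_prob(prev, w))
        prev = w
    return math.exp(-log_p / len(words))

def best_correction(sentence_words, pos, candidates, bigram_prob):
    """Replace the suspicious word at index `pos` with each candidate and
    keep the candidate giving the lowest perplexity."""
    scored = []
    for cand in candidates:
        trial = sentence_words[:pos] + [cand] + sentence_words[pos + 1:]
        scored.append((sentence_perplexity(trial, bigram_prob), cand))
    return min(scored)[1]
```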
The deep semantic feature extraction also includes grammatical error feature extraction. As shown in FIG. 3, word vectors are first trained on a microblog corpus: with w denoting a word of the composition text, the learning objective is to maximize the likelihood L = Σ log p(w | Context(w)), and the trained word vectors are used as the input of the neural network model. Bi-LSTM is adopted as the neural network model; define c as the cell state, a as the cell output, w as the weights, and σ as the activation function, with sigmoid selected. An LSTM cell operates through three gates. The first is the forget gate, which selectively forgets the previous cell's output and state:

f_t = σ(w_f · [a_{t-1}, w_t] + b_f).

Next, it must be determined what new information is stored in the cell state, in two parts: a sigmoid layer decides the update values and a tanh layer creates a new candidate vector,

u_t = σ(w_u · [a_{t-1}, w_t] + b_u),  c̃_t = tanh(w_c · [a_{t-1}, w_t] + b_c).

When the cell state is updated, part of the old information is discarded and the new information is added, giving the next cell state,

c_t = f_t · c_{t-1} + u_t · c̃_t.

Finally, a sigmoid layer decides which part of the state to output, and the cell state is passed through tanh to obtain the desired output:

o_t = σ(w_o · [a_{t-1}, w_t] + b_o),  a_t = o_t · tanh(c_t).

The output of the Bi-LSTM network is processed by a conditional random field (CRF), which takes the dependencies between adjacent positions into account to produce a high-accuracy labeling sequence; the labeling sequence gives the part of speech and the grammatical-error label of each character. The labels R, M, S, W correspond to four types of grammatical errors: redundant words (R), missing words (M), wrong word selection (S), and word-order errors (W). The grammatical error features may include, but are not limited to, one or more of these four types. The Bi-LSTM is trained with batch size 64, 200 epochs, embedding dimension 100, RNN hidden dimension 200, maximum LSTM sequence length 300, and dropout 0.25 on the data set provided by the CGED (Chinese Grammatical Error Diagnosis) shared task, finally reaching an accuracy of 0.861, and the trained Bi-LSTM model is used to extract grammatical error features from the composition set.
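As an illustration of the kind of Bi-LSTM tagger described above, the following PyTorch sketch uses the stated hyperparameters (embedding dimension 100, hidden dimension 200, dropout 0.25). The CRF decoding layer, the CGED training loop, and the vocabulary handling are omitted or assumed, so this is only a sketch and not the patent's implementation.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Tags each character with one of {O, R, M, S, W} (grammatical-error labels).
    A CRF layer over the emissions, as in the patent, is omitted for brevity."""
    def __init__(self, vocab_size, tagset_size=5,
                 embedding_dim=100, hidden_dim=200, dropout=0.25):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(2 * hidden_dim, tagset_size)

    def forward(self, char_ids):                 # (batch, seq_len)
        x = self.embed(char_ids)                 # (batch, seq_len, 100)
        out, _ = self.lstm(x)                    # (batch, seq_len, 400)
        return self.fc(self.dropout(out))        # (batch, seq_len, tagset_size)

# Usage sketch: emissions.argmax(-1) gives a per-character tag sequence;
# a CRF would instead decode the jointly most likely sequence.
model = BiLSTMTagger(vocab_size=6000)
emissions = model(torch.randint(1, 6000, (2, 30)))
tags = emissions.argmax(dim=-1)
```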
And finally, a regression step, namely combining the extracted shallow layer features and deep layer semantic features and adopting random forest fitting to obtain a scoring result of the composition. The random forest firstly resamples the sample data, randomly extracts N samples in the original N training samples in a returning way each time, and constructs a decision tree by taking a plurality of obtained sample sets as training samples. When a decision tree is constructed, m features in the candidate features are randomly extracted to serve as candidate features for decision under the current node, and the best combination is selected from the candidate features. And after a group of decision trees are obtained, voting is carried out on the output of the group of decision trees, and the class with the most votes is used as the decision of the random forest. In the embodiment of the invention, 100 decision trees are selected for training each time, the average error of the scores under the percentage score is 2.78 points, and the consistency evaluation standard quadratic weighted kappa value is 0.759.
The embodiment of the invention can also comprise a pinyin conversion step and a topic extraction step. The pinyin conversion step converts pinyin in the user's text into the corresponding Chinese characters: using the same approach as the probabilistic word segmentation model, the pinyin syllables are treated as visible states and the Chinese characters sharing each pinyin as hidden states, and solving the model gives the best conversion result. The topic extraction step extracts the topics implicit in the user's composition. Assume the article is generated from K topics, the k-th topic being a distribution over words, and construct an LDA (Latent Dirichlet Allocation) model: for any composition d, the topic distribution θ_d follows a Dirichlet distribution, θ_d ~ Dirichlet(α), where α is a K-dimensional Dirichlet hyperparameter; for any topic k, the word distribution β_k follows a Dirichlet distribution, β_k ~ Dirichlet(η). The conditional probability that word w_i belongs to topic k, given all other topic assignments, is

p(z_i = k | z_{-i}, w) ∝ (n_{d,k} + α_k) · (n_{k,w_i} + η_{w_i}) / Σ_v (n_{k,v} + η_v),

where the counts n exclude the current word. Gibbs sampling over this conditional probability yields a topic for each word; K is set to 5 in the embodiment of the present invention. This completes the design of the automatic Chinese composition scoring system.
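A sketch of the topic extraction step with scikit-learn's LatentDirichletAllocation and K = 5 topics, as in the embodiment, is given below; the jieba segmentation and the use of variational inference instead of the Gibbs sampling described above are illustrative substitutions.

```python
import jieba
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def extract_topics(compositions, k=5, top_n=8):
    """Fit LDA on segmented compositions and return the top words of each topic.
    Note: scikit-learn's LDA uses variational inference rather than the Gibbs
    sampling described in the patent; the model and priors are otherwise the same."""
    segmented = [" ".join(jieba.cut(text)) for text in compositions]
    # Keep single-character Chinese tokens, which the default pattern would drop.
    vectorizer = CountVectorizer(token_pattern=r"(?u)\b\w+\b")
    counts = vectorizer.fit_transform(segmented)
    lda = LatentDirichletAllocation(n_components=k, random_state=0)
    lda.fit(counts)
    vocab = vectorizer.get_feature_names_out()
    topics = []
    for comp in lda.components_:          # one row of word weights per topic
        top = comp.argsort()[::-1][:top_n]
        topics.append([vocab[i] for i in top])
    return topics
```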
The schematic diagram of the automatic Chinese composition scoring teaching and assisting system constructed by the above method for constructing the automatic Chinese composition scoring system is shown in fig. 4, wherein the cloud server and the terminal are both in the prior art, and are not described herein again. The Chinese composition automatic scoring teaching auxiliary system is realized through a computer program, the computer program is stored on a cloud server, the cloud server is connected with a terminal, and after an authorized user downloads the computer program from the cloud server through the terminal, the program is executed on the terminal, so that the automatic scoring of compositions is realized. The UI system interface includes an OCR recognition interface and a score display interface, as shown in fig. 5 and 6, where fig. 5 is a schematic diagram of the OCR recognition interface, and fig. 6 is a schematic diagram of the score display interface. The teaching and assisting system can also be designed to include a memory, a processor, and a computer program stored on the memory and executable on the processor; the computer program is executed to implement automatic scoring of a composition.
Automatic scoring method for Chinese composition
The automatic Chinese composition scoring method of the present invention is described below. As shown in fig. 5, in the OCR recognition interface the user submits a picture of a handwritten composition from the local terminal, clicks the upload-picture button to obtain the OCR recognition result, and clicks the start-correction button to obtain the composition review result, as shown in fig. 6. The composition review result may include, but is not limited to, the composition score, keywords, lexicon matching degree, pinyin conversion result, wrongly written character recognition and correction results, and grammatical error results; the content displayed on the interface may be increased or decreased as required in a specific implementation.
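The OCR step of this workflow might look roughly as follows; the patent relies on a networked cloud OCR service, so the open-source Tesseract engine used in this sketch is only an illustrative stand-in.

```python
# Sketch: recognizing the composition text from an uploaded picture.
# The patent uses a cloud OCR service; Tesseract with Simplified Chinese
# data ("chi_sim") is used here only as an open-source stand-in.
from PIL import Image
import pytesseract

def recognize_composition(image_path):
    """Return the recognized Chinese text of a handwritten composition picture."""
    image = Image.open(image_path)
    text = pytesseract.image_to_string(image, lang="chi_sim")
    # In the described workflow the OCR output is then proofread manually
    # before being passed to the scoring pipeline.
    return text.replace(" ", "").strip()
```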
Specifically, the Chinese composition automatic scoring method comprises the following steps:
acquiring a composition to be scored: acquiring a composition picture to be scored, and performing Chinese recognition to obtain a composition text; or directly acquiring the composition text to be evaluated;
shallow layer feature extraction: processing the composition text to be scored to obtain word segmentation results of the composition text; according to the word segmentation result, counting shallow features of the composition to be scored;
deep semantic feature extraction: extracting deep semantic features of the composition to be scored, wherein the deep semantic features comprise wrongly written character features and grammatical error features;
grading: and combining the extracted shallow layer features and deep layer semantic features and adopting random forest fitting to obtain a scoring result of the composition to be scored.
Fig. 7 illustrates the key steps of the above method. For the shallow feature extraction step, the composition text to be scored is processed with a probabilistic word segmentation model to obtain its segmentation result, and the shallow features of the composition to be scored are counted from that result; the shallow features include, but are not limited to, the number of sentences, the average sentence length, the full-text word count, the number of metaphor words, the number of pinyin occurrences, and the vocabulary level. The probabilistic word segmentation model is the one shown in FIG. 2: the segmentation tags S, B, M, E denote a single-character word and the beginning, middle, and end of a multi-character word respectively; each character is represented as a visible state o_t and its segmentation tag as a hidden state s_t, and the best segmentation is the tag combination that maximizes P(o_1, o_2, …, o_n | s_1, s_2, …, s_n). With λ the model parameters, a the state transition probability matrix, b the observation probability matrix, δ_t(i) = max P(i_t = i, i_{t-1}, …, i_1, o_t, …, o_1 | λ) (i = 1, 2, …, N) the maximum probability over single paths ending in state i at time t, and ψ_t(i) = argmax_{1 ≤ j ≤ N} [ δ_{t-1}(j) · a_{ji} ] the predecessor on that path, the termination step gives P* = max_{1 ≤ i ≤ N} δ_T(i) and i*_T = argmax_{1 ≤ i ≤ N} δ_T(i), and backtracking i*_t = ψ_{t+1}(i*_{t+1}) for t = T-1, …, 1 recovers the optimal word segmentation combination.
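As a concrete illustration of how the shallow statistical features listed above might be counted once the text has been segmented, the following Python sketch computes several of them. The sentence-delimiter set, the metaphor-word list, and the pinyin regular expression are illustrative assumptions, not values specified by the patent.

```python
import re

SENT_END = "。！？"                         # assumed sentence delimiters
METAPHOR_WORDS = {"好像", "仿佛", "如同"}    # assumed metaphor markers
PINYIN_RE = re.compile(r"[a-zA-Z]+")         # pinyin written in Latin letters

def shallow_features(text, tokens):
    """tokens: the word list produced by the segmentation step."""
    sentences = [s for s in re.split("[" + SENT_END + "]", text) if s.strip()]
    return {
        "sentence_count": len(sentences),
        "avg_sentence_len": sum(len(s) for s in sentences) / max(len(sentences), 1),
        "word_count": len(tokens),
        "metaphor_count": sum(1 for t in tokens if t in METAPHOR_WORDS),
        "pinyin_count": len(PINYIN_RE.findall(text)),
    }
```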
For the wrongly written character feature extraction step, the composition text to be scored is processed with the probabilistic word segmentation model to obtain its segmentation result. According to the segmentation result, the composition text to be scored is compared with the wrongly written character recognition corpus, and the unmatched words are collected into a suspicious word set. The wrongly written character recognition corpus can include, but is not limited to, a manually defined dictionary, a confusion-set dictionary, and a People's Daily dictionary; in the embodiment the manually defined dictionary contains 177 entries, the confusion-set dictionary 759, and the People's Daily dictionary 584,429. The suspicious word set is then compared with the wrongly written character correction corpus to obtain a candidate word set, the semantic perplexity of each candidate is calculated, the candidate with the lowest perplexity is taken as the correction result, and the original word is reported as the wrongly written character. The correction corpus may include, but is not limited to, a common-word dictionary, a same-component/same-radical set, and a same-pinyin set; in the embodiment the common-word dictionary contains 3502 entries, the same-pinyin dictionary 3431, and the similar-character dictionary 1664. The semantic perplexity is computed with the trained perplexity model: with w_i a word of the composition to be scored, the perplexity PP of a sentence S = w_1 w_2 … w_N is

PP(S) = P(w_1 w_2 … w_N)^(-1/N) = ( ∏_{i=1}^{N} 1 / P(w_i | w_1, …, w_{i-1}) )^(1/N).

The semantic perplexity of each element of the candidate word set is calculated with this model, and the word with the lowest perplexity is taken as the wrongly written character correction result.
For the grammatical error feature extraction step, the composition text to be scored is processed to obtain its word vectors, which are input into the Bi-LSTM neural network model; training yields a labeling sequence, and the words labeled R, M, S, or W are the grammatical error result. Bi-LSTM is adopted as the neural network model; define c as the cell state, a as the cell output, w as the weights, and σ as the activation function, with sigmoid selected. An LSTM cell operates through three gates. The first is the forget gate, which selectively forgets the previous cell's output and state:

f_t = σ(w_f · [a_{t-1}, w_t] + b_f).

Next, the new information to be stored in the cell state is determined in two parts: a sigmoid layer decides the update values and a tanh layer creates a new candidate vector,

u_t = σ(w_u · [a_{t-1}, w_t] + b_u),  c̃_t = tanh(w_c · [a_{t-1}, w_t] + b_c).

When the cell state is updated, part of the old information is discarded and the new information is added, giving the next cell state,

c_t = f_t · c_{t-1} + u_t · c̃_t.

Finally, a sigmoid layer decides which part of the state to output, and the cell state is passed through tanh to obtain the desired output:

o_t = σ(w_o · [a_{t-1}, w_t] + b_o),  a_t = o_t · tanh(c_t).

The output of the Bi-LSTM network is processed by a conditional random field (CRF), which takes the dependencies between adjacent positions into account to produce a high-accuracy labeling sequence; the labeling sequence gives the part of speech and the grammatical-error label of each character. The labels R, M, S, W correspond to four types of grammatical errors: redundant words (R), missing words (M), wrong word selection (S), and word-order errors (W). The grammatical error features may include, but are not limited to, one or more of these four types.
In the scoring step, i.e. the regression step, the extracted shallow features and deep semantic features (including the wrongly written character and grammatical error features) are combined and a random forest is trained to obtain the final score of the composition to be scored. The random forest first resamples the sample data: each time, N samples are drawn with replacement from the original N training samples, and each resulting sample set is used to train one decision tree. When a decision tree is built, m of the candidate features are randomly selected as the candidate features for the decision at the current node, and the best split is chosen among them. After a group of decision trees is obtained, their outputs are aggregated by voting, and the result with the most votes is taken as the decision of the random forest. In the embodiment of the invention, 100 decision trees are trained each time; the average scoring error on a 100-point scale is 2.78 points, and the quadratic weighted kappa consistency measure is 0.759.
The automatic Chinese composition scoring method can also comprise a pinyin conversion step and a topic extraction step. The pinyin conversion step converts pinyin in the user's text into the corresponding Chinese characters: using the same approach as the probabilistic word segmentation model, the pinyin syllables are treated as visible states and the Chinese characters sharing each pinyin as hidden states, and solving the model gives the best conversion result. The topic extraction step extracts the topics implicit in the user's composition. Assume the article is generated from K topics, the k-th topic being a distribution over words, and construct an LDA (Latent Dirichlet Allocation) model: for any composition d, the topic distribution θ_d ~ Dirichlet(α), where α is a K-dimensional Dirichlet hyperparameter; for any topic k, the word distribution β_k ~ Dirichlet(η). The conditional probability that word w_i belongs to topic k, given all other topic assignments, is

p(z_i = k | z_{-i}, w) ∝ (n_{d,k} + α_k) · (n_{k,w_i} + η_{w_i}) / Σ_v (n_{k,v} + η_v),

where the counts n exclude the current word.
Gibbs sampling is performed on the conditional probability to obtain a topic of each word, and K is set to be 5 in the embodiment of the present invention.
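For the pinyin conversion step described above, the following sketch applies the same Viterbi idea, treating the pinyin syllables as observations and the Chinese characters sharing each syllable as hidden states; the candidate table and the bigram scoring function are illustrative assumptions.

```python
def pinyin_to_hanzi(syllables, candidates, bigram_logp):
    """Viterbi decoding over the character lattice defined by `candidates`.
    candidates[s]         : list of Chinese characters sharing pinyin syllable s
    bigram_logp(prev, ch) : log transition score between consecutive characters
    """
    # Each layer maps a candidate character to (best path score, back-pointer).
    layers = [{ch: (0.0, None) for ch in candidates[syllables[0]]}]
    for syl in syllables[1:]:
        new_layer = {}
        for ch in candidates[syl]:
            best_prev, best = max(
                ((p, layers[-1][p][0] + bigram_logp(p, ch)) for p in layers[-1]),
                key=lambda x: x[1],
            )
            new_layer[ch] = (best, best_prev)
        layers.append(new_layer)
    # Backtrack from the best final character.
    last = max(layers[-1], key=lambda ch: layers[-1][ch][0])
    out = [last]
    for layer in reversed(layers[1:]):
        out.append(layer[out[-1]][1])
    out.reverse()
    return "".join(out)
```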
Figs. 8-10 show an embodiment of composition scoring using the automatic Chinese composition scoring method of the present invention: fig. 8 is a schematic diagram of the acquired picture of the composition to be scored, fig. 9 is a schematic diagram of the Chinese character recognition, and fig. 10 shows the result of scoring with the automatic Chinese composition scoring method of the invention.
The embodiment of the invention also comprises a Chinese composition automatic scoring system, and each module of the system corresponds to each step of the Chinese composition automatic scoring method one by one. The system comprises the following modules:
the composition to be scored acquisition module: acquiring a composition picture to be scored, and performing Chinese recognition to obtain a composition text; or directly acquiring the composition text to be evaluated;
shallow layer feature extraction module: the system is used for processing the composition texts to be scored to obtain word segmentation results of the composition texts; according to the word segmentation result, counting shallow features of the composition to be scored;
the deep semantic feature extraction module: used for extracting deep semantic features of the composition to be scored, wherein the deep semantic features comprise wrongly written character features and grammatical error features;
a scoring module: and the method is used for combining the extracted shallow layer characteristics and deep layer semantic characteristics and adopting random forest fitting to obtain a scoring result of the composition to be scored.
The embodiment of the invention also comprises an automatic Chinese composition scoring teaching and assisting system, which comprises a memory, a processor, and a computer program stored on the memory and executable on the processor; or the teaching and assisting system comprises a terminal and a cloud server connected to the terminal and storing a computer program, wherein the computer program, when executed, implements the automatic Chinese composition scoring method of the invention.
Embodiments of the present invention also include a computer-readable storage medium on which a computer program is stored, the computer program, when executed, implementing the automatic Chinese composition scoring method of the invention.
Embodiments of the present invention also include a computer program product which, when executed, implements the automatic Chinese composition scoring method of the invention.
Composition scoring methods that consider only shallow features have low scoring accuracy, while methods that consider only deep semantic features require a large corpus for sample training. By combining the shallow features and the deep semantic features of the composition, the invention improves the scoring accuracy and effectively improves sample utilization, thereby solving a series of problems in the prior art.
Compared with existing Chinese composition scoring software, the automatic composition scoring method and teaching and assisting system of the invention have the following advantages: by combining the shallow features and the deep semantic features of the composition, the technical solution achieves high scoring accuracy, obtains satisfactory evaluation results when trained on small samples, and effectively improves sample utilization; meanwhile, functions such as wrongly written character recognition and correction, pinyin recognition and conversion, and grammatical error recognition and correction are added, providing multi-dimensional information feedback and enhancing the user experience.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and substitutions can be made without departing from the technical principle of the present invention, and these modifications and substitutions should also be regarded as the protection scope of the present invention.

Claims (10)

1. A method for constructing an automatic Chinese composition scoring system, characterized in that the method comprises the following steps:
a corpus construction step, which is used for constructing a Chinese composition corpus;
a shallow feature extraction step, namely extracting shallow features of the composition based on the corpus;
a deep semantic feature extraction step, wherein deep semantic features of the composition are extracted based on the corpus, and the deep semantic features comprise wrongly written character features and grammatical error features;
and a regression step, which is used for combining the extracted shallow layer characteristics and deep layer semantic characteristics and adopting random forest fitting to obtain the scoring result of the composition.
2. The method for constructing the automatic Chinese composition scoring system according to claim 1, characterized in that the extraction of the wrongly written character features specifically comprises: segmenting the composition with a probabilistic word segmentation model; comparing the composition text with the wrongly written character recognition corpus according to the segmentation result to obtain a suspicious word set; comparing the suspicious word set with the wrongly written character correction corpus to obtain a candidate word set; and calculating the semantic perplexity of the candidate word set and taking the word with the lowest perplexity as the wrongly written character correction result.
3. The method for constructing the automatic Chinese composition scoring system according to claim 1, characterized in that the extraction of the grammatical error features specifically comprises: training word vectors on the corpus, inputting the word vectors into a Bi-LSTM neural network model, and training to obtain a labeling sequence, i.e., the grammatical error result.
4. A Chinese composition automatic scoring method is characterized in that: the method comprises the following steps:
acquiring a composition to be scored: acquiring a composition picture to be scored, and performing Chinese recognition to obtain a composition text; or directly acquiring the composition text to be evaluated;
shallow layer feature extraction: processing the composition text to be scored to obtain word segmentation results of the composition text; according to the word segmentation result, counting shallow features of the composition to be scored;
deep semantic feature extraction: extracting deep semantic features of the composition to be scored, wherein the deep semantic features comprise wrongly written character features and grammatical error features;
grading: and combining the extracted shallow layer features and deep layer semantic features and adopting random forest fitting to obtain a scoring result of the composition to be scored.
5. The method for automatically scoring Chinese compositions as claimed in claim 4, characterized in that the extraction of the wrongly written character features specifically comprises: processing the composition text to be scored to obtain its word segmentation result; comparing the composition text to be scored with the wrongly written character recognition corpus according to the segmentation result to obtain a suspicious word set; comparing the suspicious word set with the wrongly written character correction corpus to obtain a candidate word set; and calculating the semantic perplexity of the candidate word set and taking the word with the lowest perplexity as the wrongly written character correction result.
6. The method for automatically scoring Chinese compositions as claimed in claim 4, characterized in that the extraction of the grammatical error features specifically comprises: processing the composition text to be scored to obtain its word vectors; and inputting the word vectors into a Bi-LSTM neural network model and training to obtain a labeling sequence, i.e., the grammatical error result.
7. The method for automatically scoring Chinese compositions as claimed in claim 4, further comprising a pinyin conversion step for identifying the pinyin in the text to be scored and converting it into the corresponding Chinese characters.
8. The method for automatically scoring Chinese compositions as claimed in claim 4, further comprising a topic extraction step for extracting the topics implicit in the text to be scored.
9. An automatic scoring system for Chinese composition is characterized in that: the system comprises the following modules:
the composition to be scored acquisition module: acquiring a composition picture to be scored, and performing Chinese recognition to obtain a composition text; or directly acquiring the composition text to be evaluated;
shallow layer feature extraction module: the system is used for processing the composition texts to be scored to obtain word segmentation results of the composition texts; according to the word segmentation result, counting shallow features of the composition to be scored;
the deep semantic feature extraction module: used for extracting deep semantic features of the composition to be scored, wherein the deep semantic features comprise wrongly written character features and grammatical error features;
a scoring module: and the method is used for combining the extracted shallow layer characteristics and deep layer semantic characteristics and adopting random forest fitting to obtain a scoring result of the composition to be scored.
10. An automatic Chinese composition scoring teaching and assisting system, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; or comprising a terminal and a cloud server connected to the terminal and storing a computer program; characterized in that the computer program, when executed, implements the automatic scoring method for Chinese compositions according to any one of claims 4 to 8.
CN201911059419.3A 2019-11-01 2019-11-01 Automatic scoring method for Chinese composition and teaching assistance system Active CN110851599B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911059419.3A CN110851599B (en) 2019-11-01 2019-11-01 Automatic scoring method for Chinese composition and teaching assistance system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911059419.3A CN110851599B (en) 2019-11-01 2019-11-01 Automatic scoring method for Chinese composition and teaching assistance system

Publications (2)

Publication Number Publication Date
CN110851599A true CN110851599A (en) 2020-02-28
CN110851599B CN110851599B (en) 2023-04-28

Family

ID=69598489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911059419.3A Active CN110851599B (en) 2019-11-01 2019-11-01 Automatic scoring method for Chinese composition and teaching assistance system

Country Status (1)

Country Link
CN (1) CN110851599B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1442804A (en) * 2002-03-01 2003-09-17 何万贯 Automatic composition comment education system
WO2005045786A1 (en) * 2003-10-27 2005-05-19 Educational Testing Service Automatic essay scoring system
CN105045778A (en) * 2015-06-24 2015-11-11 江苏科技大学 Chinese homonym error auto-proofreading method
CN110069768A (en) * 2018-01-22 2019-07-30 北京博智天下信息技术有限公司 A kind of English argumentative writing automatic scoring method based on the structure of an article
CN108595410A (en) * 2018-03-19 2018-09-28 小船出海教育科技(北京)有限公司 The automatic of hand-written composition corrects method and device
CN109614623A (en) * 2018-12-12 2019-04-12 广东小天才科技有限公司 A kind of composition processing method and system based on syntactic analysis
CN109948152A (en) * 2019-03-06 2019-06-28 北京工商大学 A kind of Chinese text grammer error correcting model method based on LSTM
CN110264792A (en) * 2019-06-17 2019-09-20 上海元趣信息技术有限公司 One kind is for pupil's composition intelligent tutoring system
CN110276077A (en) * 2019-06-25 2019-09-24 上海应用技术大学 The method, device and equipment of Chinese error correction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Chen Yile: "Research on Automatic Scoring Technology for Chinese Compositions Based on Regression Analysis", China Excellent Doctoral and Master's Theses Full-text Database (Master's), Social Sciences II *
Chen Shanshan: "Research on Automatic Essay Scoring Models and Methods", China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581379B (en) * 2020-04-28 2022-03-25 电子科技大学 Automatic composition scoring method based on the composition's degree of topic adherence
CN111581379A (en) * 2020-04-28 2020-08-25 电子科技大学 Automatic composition scoring method based on the composition's degree of topic adherence
CN112380830A (en) * 2020-06-18 2021-02-19 达而观信息科技(上海)有限公司 Method, system and computer readable storage medium for matching related sentences in different documents
CN112380830B (en) * 2020-06-18 2024-05-17 达观数据有限公司 Matching method, system and computer readable storage medium for related sentences in different documents
CN111832281A (en) * 2020-07-16 2020-10-27 平安科技(深圳)有限公司 Composition scoring method and device, computer equipment and computer readable storage medium
WO2021139265A1 (en) * 2020-07-16 2021-07-15 平安科技(深圳)有限公司 Composition scoring method and apparatus, computer device, and computer readable storage medium
CN111914544A (en) * 2020-08-18 2020-11-10 科大讯飞股份有限公司 Metaphor sentence recognition method, device, equipment and storage medium
CN112199946A (en) * 2020-09-15 2021-01-08 北京大米科技有限公司 Data processing method and device, electronic equipment and readable storage medium
CN112199946B (en) * 2020-09-15 2024-05-07 北京大米科技有限公司 Data processing method, device, electronic equipment and readable storage medium
CN112183065A (en) * 2020-09-16 2021-01-05 北京思源智通科技有限责任公司 Text evaluation method and device, computer readable storage medium and terminal equipment
CN112287921A (en) * 2020-10-15 2021-01-29 泰州锐比特智能科技有限公司 Composition evaluation system and method based on wrong word identification
CN112364990A (en) * 2020-10-29 2021-02-12 北京语言大学 Method and system for realizing grammar error correction and less sample field adaptation through meta-learning
CN112364990B (en) * 2020-10-29 2021-06-04 北京语言大学 Method and system for realizing grammar error correction and less sample field adaptation through meta-learning
CN112686020A (en) * 2020-12-29 2021-04-20 科大讯飞股份有限公司 Composition scoring method and device, electronic equipment and storage medium
CN112686020B (en) * 2020-12-29 2024-06-04 科大讯飞股份有限公司 Composition scoring method and device, electronic equipment and storage medium
CN114692606A (en) * 2020-12-31 2022-07-01 暗物智能科技(广州)有限公司 English composition analysis scoring system, method and storage medium
CN114519345B (en) * 2022-01-17 2023-11-07 广东南方网络信息科技有限公司 Content checking method and device, mobile terminal and storage medium
CN114519345A (en) * 2022-01-17 2022-05-20 广东南方网络信息科技有限公司 Content proofreading method and device, mobile terminal and storage medium

Also Published As

Publication number Publication date
CN110851599B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN110851599B (en) Automatic scoring method for Chinese composition and teaching assistance system
CN108363743B (en) Intelligent problem generation method and device and computer readable storage medium
CN110852087B (en) Chinese error correction method and device, storage medium and electronic device
CN109783657B (en) Multi-step self-attention cross-media retrieval method and system based on limited text space
CN110147436B (en) Education knowledge map and text-based hybrid automatic question-answering method
Dong et al. Automatic features for essay scoring–an empirical study
CN110083710B (en) Word definition generation method based on cyclic neural network and latent variable structure
CN110442841B (en) Resume identification method and device, computer equipment and storage medium
CN106599032B (en) Text event extraction method combining sparse coding and structure sensing machine
CN110134954B (en) Named entity recognition method based on Attention mechanism
CN110750959A (en) Text information processing method, model training method and related device
CN111475629A (en) Knowledge graph construction method and system for math tutoring question-answering system
CN108717413B (en) Open field question-answering method based on hypothetical semi-supervised learning
CN108345583B (en) Event identification and classification method and device based on multilingual attention mechanism
CN107544958B (en) Term extraction method and device
Jin et al. Combining CNNs and pattern matching for question interpretation in a virtual patient dialogue system
CN110276069A Automatic Chinese Braille error detection method, system and storage medium
CN110222344B (en) Composition element analysis algorithm for composition tutoring of pupils
CN113268576B (en) Deep learning-based department semantic information extraction method and device
CN110781681A (en) Translation model-based elementary mathematic application problem automatic solving method and system
CN114528919A (en) Natural language processing method and device and computer equipment
CN110968708A (en) Method and system for labeling education information resource attributes
Ortiz-Zambrano et al. Overview of ALexS 2020: First workshop on lexical analysis at SEPLN
CN115455167A (en) Geographic examination question generation method and device based on knowledge guidance
CN114579706B (en) Automatic subjective question review method based on BERT neural network and multi-task learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant