CN113392187A

CN113392187A - Automatic scoring and error correction recommendation method for subjective questions

Info

Publication number: CN113392187A
Application number: CN202110672735.9A
Authority: CN
Inventors: 马黎
Original assignee: Shanghai Publishing and Printing College
Current assignee: Shanghai Publishing and Printing College
Priority date: 2021-06-17
Filing date: 2021-06-17
Publication date: 2021-09-14

Abstract

The invention provides an automatic scoring and error correction recommendation method for subjective questions, which comprises the following steps: step S1, establishing a question bank, wherein the question bank comprises questions, corresponding standard answers, knowledge point labels and question numbers; step S2, establishing a multi-bit inverted index table for the question numbers, and establishing a multi-bit inverted index total library; step S3, receiving an answer sheet picture to be scored, and dividing it into a question area and an answer area according to a target segmentation algorithm; step S4, recognizing the question text in the question area by using OCR technology to obtain the test paper question, finding the matching question in the question bank according to the multi-bit search algorithm and the question matching algorithm, and extracting the standard answer and the knowledge point label; step S5, recognizing the answer text in the answer area by using OCR technology to obtain the test paper answer, calculating the similarity between the test paper answer and the standard answer according to the answer matching algorithm, and reporting where the test paper answer falls short of the standard answer; and step S6, providing similar questions according to the recommendation strategy for knowledge point consolidation.

Description

Automatic scoring and error correction recommendation method for subjective questions

Technical Field

The invention belongs to the technical field of automatic scoring, error correction and recommendation, and particularly relates to an automatic scoring and error correction recommendation method for subjective questions.

Background

In recent years, the burden on students has grown and their learning tasks have multiplied. When doing homework, and especially when facing a question they cannot solve, students at school can turn to a teacher for help, but at home, where parents' ability to help is limited, they are often left without a solution.

With the development of science and technology, researchers can use big data and artificial intelligence to correct homework automatically. The prior art with application number 202010603637.5 discloses a method, a device, an electronic device and a storage medium for automatically correcting homework, wherein the method comprises the following steps: receiving a homework picture to be corrected sent by an intelligent terminal; inputting the homework picture into a pre-trained text detection model to generate question information and answer information of a target question; performing OCR recognition on the question information and the answer information respectively to obtain a question text and an answer text; searching a resource library according to the question text to obtain the answer analysis corresponding to the original question; comparing the answer analysis with the answer text to obtain their similarity; and when the similarity is greater than or equal to a preset threshold, marking the answer to the target question as correct, and when the similarity is less than the preset threshold, marking the answer to the target question as wrong and returning the answer analysis to the intelligent terminal. Although this technology can correct homework automatically, it gives no specific algorithm for the searching and comparing steps; for objective questions it only reports whether the answer is right or wrong without giving the reason, so students are still left confused; in addition, its effectiveness on subjective questions is limited.

Disclosure of Invention

The present invention is made to solve the above problems, and an object of the present invention is to provide an automatic scoring and error correction recommendation method for subjective questions.

The invention provides an automatic scoring and error correction recommendation method for subjective questions, which is characterized by comprising the following steps of:

step S1, establishing a question bank, wherein the question bank comprises questions, standard answers corresponding to the questions, knowledge point labels and question numbers;

step S2, establishing a multi-bit inverted index table for the question numbers in the question bank, and establishing a multi-bit inverted index total bank;

step S3, receiving an answer sheet picture to be scored, and dividing a question area and an answer area according to a target segmentation algorithm;

step S4, recognizing the question text in the question area by using an OCR technology to obtain the test paper question, finding the question matching the test paper question in the question bank according to a multi-bit search algorithm and a question matching algorithm, and correspondingly extracting the standard answer and the knowledge point label;

step S5, recognizing the answer text of the answer area by using an OCR technology to obtain the answer of the test paper, calculating the similarity between the answer of the test paper and the standard answer according to an answer matching algorithm, and providing the shortage of the answer of the test paper compared with the standard answer;

step S6, searching questions similar to the questions of the test paper from the question bank according to the recommendation strategy to consolidate the knowledge points,

in step S5, the answer matching algorithm includes the following steps:

step S5-1, recognizing answer texts in answer areas by using an OCR technology to obtain test paper answers, marking the test paper answers as Daan, and marking the standard answers returned in the step S4 as Biaozhun;

step S5-2, generating sentence vectors of the test paper answer Daan and the standard answer Biaozhun by using a BERT pre-trained model or an XLNet pre-trained model, and calculating the similarity Sim₁ between the vectors by using a cosine similarity algorithm, 0 ≤ Sim₁ ≤ 1;

step S5-3, extracting a keyword set G₁ from the test paper answer Daan by using the textrank algorithm, extracting a keyword set G₂ from the standard answer Biaozhun by using the textrank algorithm, and calculating the similarity Sim₂ of the two keyword sets, 0 ≤ Sim₂ ≤ 1, wherein the similarity Sim₂ is given by formula (1);

step S5-4, fusing the similarity Sim₁ and the similarity Sim₂ to obtain the similarity Sim, calculated as follows:

Sim = Sim₁ × a + Sim₂ × b (2)

step S5-5, mapping the similarity Sim onto the score set for the test paper question and returning the score of the test paper answer; in addition, the elements that are in keyword set G₂ but not in keyword set G₁ are also returned, representing the features in which the test paper answer falls short of the standard answer,

in formula (2), a and b are the weights of the similarity Sim₁ and the similarity Sim₂, respectively, satisfying a + b = 1 and a ≥ b.
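The answer matching steps S5-2 to S5-5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the sentence vectors and keyword sets are taken as inputs (in practice they would come from a BERT or XLNet model and from textrank), formula (1) is not reproduced in this text so a Jaccard overlap is assumed in its place, negative cosine values are clipped to 0 so that 0 ≤ Sim₁ ≤ 1, the mapping from Sim to a score is assumed to be linear, and the weights a = 0.6, b = 0.4 and all function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sentence vectors, clipped to [0, 1] (assumption)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0:
        return 0.0
    return min(1.0, max(0.0, dot / norm))


def keyword_similarity(g1, g2):
    """Assumed stand-in for formula (1): Jaccard overlap of the keyword sets."""
    if not g1 and not g2:
        return 0.0
    return len(g1 & g2) / len(g1 | g2)


def grade(vec_daan, vec_biaozhun, g1, g2, full_score, a=0.6, b=0.4):
    """Return (Sim, score, missing keywords) for one test paper answer."""
    if abs(a + b - 1.0) > 1e-9 or a < b:
        raise ValueError("weights must satisfy a + b = 1 and a >= b")
    sim1 = cosine_similarity(vec_daan, vec_biaozhun)   # step S5-2
    sim2 = keyword_similarity(g1, g2)                  # step S5-3 (assumed formula)
    sim = sim1 * a + sim2 * b                          # step S5-4, formula (2)
    score = round(sim * full_score, 1)                 # step S5-5, assumed linear mapping
    missing = sorted(g2 - g1)                          # in G2 but not in G1
    return sim, score, missing
```

For example, an answer whose vector coincides with the standard answer but whose keywords cover only two of the three standard keywords would receive less than the full score, and the third keyword would be returned as the missing feature.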

The automatic scoring and error correction recommendation method for the subjective questions provided by the invention can also have the following characteristics: in step S1, the questions, the standard answers, and the knowledge point labels are derived from various books and internet resources, and the question numbers are generated from the questions, the specific generation steps being as follows:

step S1-1, performing word segmentation on the text of the question;

step S1-2, using the MD5 algorithm as a pseudo-random number generator, and using the TF-IDF algorithm or the BM25 algorithm to calculate the weight of each word obtained by the segmentation;

and step S1-3, generating a hash value corresponding to the text of the question by using a 128-bit simhash algorithm according to the pseudo-random number generator and the word weights, and taking the hash value as the question number.
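Steps S1-1 to S1-3 can be sketched as follows. This is a hedged illustration only: whitespace splitting stands in for a real word segmenter, raw term frequency stands in for the TF-IDF or BM25 weights, and the function names `token_bits`, `simhash128` and `question_number` are hypothetical.

```python
import hashlib
from collections import Counter

HASH_BITS = 128  # width of the question number, per step S1-3


def token_bits(token):
    """Map a word to a 128-bit integer, using MD5 as the pseudo-random generator."""
    return int.from_bytes(hashlib.md5(token.encode("utf-8")).digest(), "big")


def simhash128(weights):
    """Combine weighted per-word hashes into a single 128-bit simhash value."""
    totals = [0.0] * HASH_BITS
    for token, weight in weights.items():
        bits = token_bits(token)
        for i in range(HASH_BITS):
            # A set bit votes +weight for position i, a clear bit votes -weight.
            totals[i] += weight if (bits >> i) & 1 else -weight
    value = 0
    for i, total in enumerate(totals):
        if total > 0:
            value |= 1 << i
    return value


def question_number(text):
    # Assumption: whitespace splitting replaces real word segmentation (step S1-1).
    tokens = text.lower().split()
    # Assumption: term frequency replaces TF-IDF or BM25 weights (step S1-2).
    weights = {token: float(count) for token, count in Counter(tokens).items()}
    return simhash128(weights)
```

Because each bit is decided by a weighted vote over all words, two texts that share most of their heavily weighted words tend to produce hash values that differ in only a few bits, which is the property the later Hamming distance comparison relies on.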

The automatic scoring and error correction recommendation method for the subjective questions provided by the invention can also have the following characteristics: in step S2, the specific steps of establishing the multi-bit inverted index table and establishing the multi-bit inverted index total library are as follows:

step S2-1, denoting the question number of a question in the question bank as hash, and dividing the question number hash into M segments to obtain M sub-question numbers, as shown in formula (3),

hash＝[hash₁,hash₂,…,hash_i,…,hash_M],i∈[1,M] (3)

step S2-2, taking combinations of (M-α) segments from the M sub-question numbers hash_i, i∈[1,M], of the question number hash to establish a multi-bit inverted index table, wherein α is the similarity threshold and M > α; each question number hash will then have C(M, M-α) indexes pointing to it, where C(M, M-α) = M! / ((M-α)! × α!) is the number of ways of choosing (M-α) segments out of M, and the indexes are labelled index 1, index 2, ..., index C(M, M-α) from top to bottom;

the multi-bit inverted index table when α = 3 and M = 4 is shown in formula (4),

the multi-bit inverted index table when α = 3 and M = 5 is shown in formula (5),

step S2-3, summarizing the multi-bit inverted index tables of all question numbers in the question bank, constructing to obtain a multi-bit inverted index total bank,

wherein, in formula (3), hash_i, i∈[1,M], denotes a sub-question number of the question number hash.
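The construction in steps S2-1 to S2-3 can be sketched as follows. The reasoning behind it is that if two 128-bit question numbers differ in at most α bits, then at most α of their M segments can differ, so at least (M-α) segments are identical and the two numbers share at least one index key. The sketch is illustrative only: the segment layout (most significant segment first, widths differing by at most one bit when M does not divide 128), the order in which indexes are numbered, and all function names are assumptions.

```python
from collections import defaultdict
from itertools import combinations

HASH_BITS = 128


def split_segments(hash_value, m, bits=HASH_BITS):
    """Split a hash into m contiguous segments, most significant first (assumed layout)."""
    base, extra = divmod(bits, m)
    segments, shift = [], bits
    for i in range(m):
        width = base + (1 if i < extra else 0)
        shift -= width
        segments.append((hash_value >> shift) & ((1 << width) - 1))
    return segments


def index_keys(hash_value, m, alpha, bits=HASH_BITS):
    """One key per choice of (m - alpha) segments: (index number, segment values)."""
    segments = split_segments(hash_value, m, bits)
    keys = []
    for number, positions in enumerate(combinations(range(m), m - alpha), start=1):
        keys.append((number, tuple(segments[p] for p in positions)))
    return keys


def build_total_index(question_numbers, m, alpha, bits=HASH_BITS):
    """Merge the index tables of all question numbers into one total library."""
    total = defaultdict(set)
    for qn in question_numbers:
        for key in index_keys(qn, m, alpha, bits):
            total[key].add(qn)
    return total
```

With α = 3 this yields 4 keys per question number for M = 4 and 10 keys for M = 5, matching the index counts stated for those two cases.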

The automatic scoring and error correction recommendation method for the subjective questions provided by the invention can also have the following characteristics: in step S3, the target segmentation algorithm includes the following specific steps:

step S3-1, collecting a plurality of answer sheet pictures, and manually marking question areas and answer areas on the answer sheet pictures, wherein during manual marking the areas of printed text are taken as the question areas and the areas of handwritten text are taken as the answer areas;

step S3-2, using the collected answer sheet pictures and the manually marked information as training data, and training a binary classification model that distinguishes printed text from handwritten text by using a deep learning technology together with a transfer learning technology;

step S3-3, inputting the answer sheet picture to be scored into the trained binary classification model, and dividing the picture according to printed text and handwritten text to obtain the question area and the answer area,

wherein the deep learning technology is a convolutional neural network (CNN) or a recurrent neural network (RNN, LSTM or GRU).

The automatic scoring and error correction recommendation method for the subjective questions provided by the invention can also have the following characteristics: in step S4, the multi-bit search algorithm includes the following specific steps:

step S4-1-1, recognizing the question text in the question area by using an OCR technology to obtain the test paper question, processing the test paper question according to the specific generation steps of the question number in step S1 to obtain the test paper question number, and recording the test paper question number as Thash;

step S4-1-2, according to the specific steps of establishing the multi-bit inverted index table in step S2, dividing the test paper question number Thash into M segments in the same way, and establishing the multi-bit inverted index table of the test paper question number Thash;

step S4-1-3, according to the multi-bit inverted index table of the test paper question number Thash, searching the multi-bit inverted index total library for question numbers that have the same index number and the same index value as the test paper question number Thash, to obtain a question number set,

the specific steps of the question matching algorithm are as follows:

step S4-2-1, calculating the Hamming distance H between the test paper question number Thash and each question number in the question number set, one by one;

step S4-2-2, if the Hamming distance H of only one question number in the question number set satisfies H ≤ α, taking the standard answer and the knowledge point label corresponding to that question number as return values;

step S4-2-3, if the Hamming distances H of a plurality of question numbers in the question number set satisfy H ≤ α, taking the standard answer and the knowledge point label corresponding to the question number with the smallest Hamming distance H as return values;

step S4-2-4, if the Hamming distances H of a plurality of question numbers in the question number set satisfy H ≤ α, and several question numbers share the smallest Hamming distance H, arbitrarily taking the standard answer and the knowledge point label corresponding to one of them as return values;

and step S4-2-5, if no question number matching the test paper question number is found, outputting an exception message stating that no correct answer is currently available for the question, recording the corresponding test paper question, and adding it to the question bank after a question bank expert provides the correct standard answer and knowledge point label.
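The multi-bit search and question matching steps can be sketched together as follows. This is a self-contained illustration under assumptions: the segment layout and index numbering are the same hypothetical ones used when building the index, ties at the smallest Hamming distance are broken by taking the numerically smallest question number (the text allows an arbitrary choice), and `None` stands in for the exception message of step S4-2-5.

```python
from itertools import combinations


def segments(value, m, bits=128):
    """Split a hash into m contiguous segments, most significant first (assumed layout)."""
    base, extra = divmod(bits, m)
    out, shift = [], bits
    for i in range(m):
        width = base + (1 if i < extra else 0)
        shift -= width
        out.append((value >> shift) & ((1 << width) - 1))
    return out


def index_keys(value, m, alpha, bits=128):
    """(index number, segment values) for every choice of (m - alpha) segments."""
    segs = segments(value, m, bits)
    return [(n, tuple(segs[p] for p in pos))
            for n, pos in enumerate(combinations(range(m), m - alpha), start=1)]


def build_total_index(question_numbers, m, alpha, bits=128):
    total = {}
    for qn in question_numbers:
        for key in index_keys(qn, m, alpha, bits):
            total.setdefault(key, set()).add(qn)
    return total


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def match_question(thash, total_index, m, alpha, bits=128):
    """Return the best-matching question number, or None if none is within alpha."""
    candidates = set()
    for key in index_keys(thash, m, alpha, bits):       # step S4-1-3
        candidates |= total_index.get(key, set())
    within = [(hamming(thash, qn), qn) for qn in candidates]   # step S4-2-1
    within = [pair for pair in within if pair[0] <= alpha]
    if not within:
        return None                                      # step S4-2-5
    return min(within)[1]                                # steps S4-2-2 to S4-2-4
```

The index lookup only narrows the search to candidates that share at least (M-α) segments with Thash; the Hamming distance check is still needed because sharing those segments does not by itself guarantee H ≤ α.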

The automatic scoring and error correction recommendation method for the subjective questions provided by the invention can also have the following characteristics: in step S6, the specific steps of the recommendation strategy are as follows:

step S6-1, the knowledge point label returned in step S4 is recorded as Tags;

step S6-2, searching the question bank for questions whose knowledge point labels are similar to the knowledge point label Tags, and randomly returning one such question for knowledge point consolidation.
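The recommendation strategy can be sketched as follows. The text does not define when two knowledge point labels count as similar, so this sketch assumes that sharing at least one label is enough; the record layout with `id` and `tags` keys and the `exclude_id` parameter are likewise hypothetical.

```python
import random


def recommend(tags, question_bank, exclude_id=None, rng=random):
    """Return one random question sharing a label with tags, or None if there is none."""
    wanted = set(tags)
    pool = [q for q in question_bank
            if q["id"] != exclude_id and wanted & set(q["tags"])]
    return rng.choice(pool) if pool else None
```

Passing the identifier of the question just answered as `exclude_id` avoids recommending the same question back to the student.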

Action and Effect of the invention

According to the automatic scoring and error correction recommendation method for subjective questions, in the aspect of question matching, a simhash algorithm is used to generate the question numbers of the questions so as to ensure retrieval accuracy, and a multi-bit inverted index table is used for retrieval, which can effectively improve retrieval efficiency; in the aspect of answer matching, similarity is calculated by using a pre-trained model such as BERT or XLNet together with a cosine similarity algorithm, and the textrank keyword algorithm is used to supplement the similarity measure so as to ensure the accuracy of the answer matching result, while the features in which the student's test paper answer falls short of the standard answer can also be given, helping the student identify his or her own problems; in the aspect of the recommendation strategy, similar questions are recommended to students according to the knowledge point labels, helping them consolidate the knowledge points. Therefore, the method can relieve students who are at a loss when facing difficult questions alone, address the time cost of question matching and the unreliability of answer matching, and help students improve their knowledge by using big data and artificial intelligence algorithms.

Drawings

Fig. 1 is a flowchart of an automatic scoring and error correction recommendation method for subjective questions according to an embodiment of the present invention;

FIG. 2 is a flowchart of step S1 in an embodiment of the present invention;

fig. 3 is a schematic diagram of a multi-bit inverted index table established when α is 3 and M is 4 in an embodiment of the present invention;

fig. 4 is a schematic diagram of a multi-bit inverted index table established when α is 3 and M is 5 in an embodiment of the present invention;

fig. 5 is a flowchart of step S4 in an embodiment of the present invention.

Detailed Description

In order to make the technical means and functions of the present invention easy to understand, the present invention is specifically described below with reference to the embodiments and the accompanying drawings.

< example >

Fig. 1 is a flowchart of an automatic scoring and error correction recommendation method for subjective questions according to an embodiment of the present invention.

As shown in fig. 1, the automatic scoring and error correction recommendation method for subjective questions of this embodiment includes the following steps:

step S1, establishing a question bank, wherein the question bank comprises questions, standard answers corresponding to the questions, knowledge point labels and question numbers.

Fig. 2 is a flowchart of step S1 in an embodiment of the present invention.

As shown in fig. 2, in step S1, the questions, the standard answers, and the knowledge point labels are derived from various books and internet resources, and the question numbers are generated from the questions, the specific generation steps being as follows:

step S1-1, performing word segmentation on the text of the question;

Step S2, establishing a multi-bit inverted index table for the question numbers in the question bank, and establishing a multi-bit inverted index total bank.

In step S2, the specific steps of establishing the multi-bit inverted index table and establishing the multi-bit inverted index total library are as follows:

hash＝[hash₁,hash₂,…,hash_i,…,hash_M],i∈[1,M] (3)

step S2-2, taking combinations of (M-α) segments from the M sub-question numbers hash_i, i∈[1,M], of the question number hash to establish a multi-bit inverted index table, wherein α is the similarity threshold and M > α; each question number hash will then have C(M, M-α) indexes pointing to it, C(M, M-α) being the number of ways of choosing (M-α) segments out of M.

fig. 3 is a schematic diagram of a multi-bit inverted index table established when α is 3 and M is 4 in an embodiment of the present invention.

As shown in fig. 3, when α = 3 and M = 4, the question number is divided into 4 sub-question numbers, and a total of 4 indexes, index 1 to index 4, point to the question number.

fig. 4 is a schematic diagram of a multi-bit inverted index table established when α is 3 and M is 5 in an embodiment of the present invention.

As shown in fig. 4, when α = 3 and M = 5, the question number is divided into 5 sub-question numbers, and a total of 10 indexes, index 1 to index 10, point to the question number.

Step S3, receiving the answer sheet picture to be scored, and dividing it into a question area and an answer area according to a target segmentation algorithm.

In step S3, the target segmentation algorithm includes the following specific steps:

step S3-1, collecting a plurality of answer sheet pictures, and manually marking question areas and answer areas on the answer sheet pictures, wherein during manual marking the areas of printed text are taken as the question areas and the areas of handwritten text are taken as the answer areas, and the question areas and the answer areas are mainly rectangular;

Fig. 5 is a flowchart of step S4 in an embodiment of the present invention.

As shown in fig. 5, in step S4, OCR technology is used to recognize the question text in the question area to obtain the test paper question, the question matching the test paper question is found in the question bank according to the multi-bit search algorithm and the question matching algorithm, and the standard answer and the knowledge point label are extracted correspondingly.

In step S4, the multi-bit search algorithm includes the following steps:

the specific steps of the question matching algorithm are as follows:

Step S5, recognizing the answer text in the answer area by using an OCR technology to obtain the test paper answer, calculating the similarity between the test paper answer and the standard answer according to an answer matching algorithm, and reporting where the test paper answer falls short of the standard answer.

In step S5, the answer matching algorithm includes the following steps:

Sim＝Sim₁×a+Sim₂×b (2)

Step S6, searching question similar to the question of the test paper from the question bank according to the recommendation strategy to consolidate the knowledge point.

In step S6, the specific steps of the recommendation strategy are as follows:

step S6-1, the knowledge point label returned in step S4 is recorded as Tags;

Actions and effects of the embodiment

According to the automatic scoring and error correction recommendation method for subjective questions of this embodiment, in the aspect of question matching, a simhash algorithm is used to generate the question numbers of the questions so as to ensure retrieval accuracy, and a multi-bit inverted index table is used for retrieval, which can effectively improve retrieval efficiency; in the aspect of answer matching, similarity is calculated by using a pre-trained model such as BERT or XLNet together with a cosine similarity algorithm, and the textrank keyword algorithm is used to supplement the similarity measure so as to ensure the accuracy of the answer matching result, while the features in which the student's test paper answer falls short of the standard answer can also be given, helping the student identify his or her own problems; in the aspect of the recommendation strategy, similar questions are recommended to students according to the knowledge point labels, helping them consolidate the knowledge points. Therefore, the method can relieve students who are at a loss when facing difficult questions alone, address the time cost of question matching and the unreliability of answer matching, and help students improve their knowledge by using big data and artificial intelligence algorithms.

The above embodiments are preferred examples of the present invention, and are not intended to limit the scope of the present invention.

Claims

1. An automatic scoring and error correction recommendation method for subjective questions is characterized by comprising the following steps:

step S1, establishing a question bank, wherein the question bank comprises questions, and standard answers, knowledge point labels and question numbers corresponding to the questions;

step S4, recognizing the question text in the question area by using an OCR technology to obtain a test paper question, finding the question matching the test paper question in the question bank according to a multi-bit search algorithm and a question matching algorithm, and correspondingly extracting the standard answer and the knowledge point label;

step S5, recognizing the answer text of the answer area by using an OCR technology to obtain a test paper answer, calculating the similarity between the test paper answer and the standard answer according to an answer matching algorithm, and providing the deficiency of the test paper answer compared with the standard answer;

step S6, searching the question bank for the question similar to the question of the test paper according to the recommendation strategy to consolidate the knowledge point,

in step S5, the answer matching algorithm includes the following specific steps:

step S5-1, recognizing the answer text of the answer area by using an OCR technology, obtaining the answer of the test paper, marking as Daan, and recording the standard answer returned in the step S4 as Biaozhun;

Sim＝Sim₁×a+Sim₂×b (2)

step S5-5, mapping the similarity Sim onto the score set for the test paper question and returning the score of the test paper answer; in addition, the elements that are in keyword set G₂ but not in keyword set G₁ are also returned, representing the features in which the test paper answer falls short of the standard answer,

2. The automatic scoring and error correction recommendation method for subjective questions according to claim 1, wherein:

in step S1, the questions, the standard answers, and the knowledge point labels are derived from various books and internet resources, and the question numbers are generated from the questions, the specific generation steps being as follows:

step S1-1, performing word segmentation on the text of the question;

3. The automatic scoring and error correction recommendation method for subjective questions according to claim 2, wherein:

step S2-1, denoting the question number of a question in the question bank as hash, and dividing the question number hash into M segments to obtain M sub-question numbers, as shown in formula (3),

hash＝[hash₁,hash₂,…,hash_i,…,hash_M],i∈[1,M] (3)

step S2-2, taking combinations of (M-α) segments from the M sub-question numbers hash_i, i∈[1,M], of the question number hash to establish the multi-bit inverted index table, wherein α is the similarity threshold and M > α; each question number hash then has C(M, M-α) indexes pointing to it, C(M, M-α) being the number of ways of choosing (M-α) segments out of M;

step S2-3, summarizing the multi-bit inverted index tables of all the question numbers in the question bank, constructing to obtain the multi-bit inverted index total bank,

4. The automatic scoring and error correction recommendation method for subjective questions according to claim 1, wherein:

step S3-1, collecting a plurality of answer sheet pictures, and manually marking the question areas and the answer areas on the answer sheet pictures, wherein during manual marking the areas of printed text are taken as the question areas and the areas of handwritten text are taken as the answer areas;

step S3-3, inputting the answer sheet picture to be scored into the trained binary classification model, and dividing the answer sheet picture according to printed text and handwritten text to obtain the question area and the answer area,

5. The automatic scoring and error correction recommendation method for subjective questions according to claim 1, wherein:

in step S4, the multi-bit search algorithm includes the following specific steps:

step S4-1-1, recognizing the question text in the question area by using an OCR technology to obtain the test paper question, processing the test paper question according to the specific generation steps of the question number in step S1 to obtain a test paper question number, and recording the test paper question number as Thash;

step S4-1-2, according to the specific steps of establishing the multi-bit inverted index table in step S2, dividing the test paper question number Thash into M segments in the same way, and establishing the multi-bit inverted index table of the test paper question number Thash;

step S4-1-3, according to the multi-bit inverted index table of the test paper question number Thash, searching the multi-bit inverted index total library for question numbers that have the same index number and the same index value as the test paper question number Thash, to obtain a question number set,

the question matching algorithm comprises the following specific steps:

step S4-2-3, if the Hamming distances H of a plurality of question numbers in the question number set satisfy H ≤ α, taking the standard answer and the knowledge point label corresponding to the question number with the smallest Hamming distance H as return values;

step S4-2-4, if the Hamming distances H of a plurality of question numbers in the question number set satisfy H ≤ α, and several question numbers share the smallest Hamming distance H, arbitrarily taking the standard answer and the knowledge point label corresponding to one of them as return values;

and step S4-2-5, if no question number matching the test paper question number is found, outputting an exception message stating that no correct answer is currently available for the question, recording the corresponding test paper question, and adding it to the question bank after a question bank expert provides the correct standard answer and knowledge point label.

6. The automatic scoring and error correction recommendation method for subjective questions according to claim 1, wherein:

in step S6, the recommendation strategy includes the following specific steps:

step S6-1, recording the knowledge point label returned in step S4 as Tags;

step S6-2, searching the question bank for questions whose knowledge point labels are similar to the knowledge point label Tags, and randomly returning one such question for knowledge point consolidation.