CN113392187A - Automatic scoring and error correction recommendation method for subjective questions - Google Patents

Info

Publication number
CN113392187A
CN113392187A CN202110672735.9A CN202110672735A CN113392187A CN 113392187 A CN113392187 A CN 113392187A CN 202110672735 A CN202110672735 A CN 202110672735A CN 113392187 A CN113392187 A CN 113392187A
Authority
CN
China
Prior art keywords
question
answer
test paper
algorithm
title
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110672735.9A
Other languages
Chinese (zh)
Inventor
马黎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Publishing and Printing College
Original Assignee
Shanghai Publishing and Printing College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Publishing and Printing College
Priority to CN202110672735.9A
Publication of CN113392187A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/319 Inverted lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention provides an automatic scoring and error correction recommendation method for subjective questions, comprising the following steps: step S1, establishing a question bank that contains questions together with their standard answers, knowledge point labels and question numbers; step S2, establishing a multi-bit inverted index table for each question number and building a multi-bit inverted index total library; step S3, receiving a picture of the answer sheet to be scored and dividing it into a question area and an answer area according to a target segmentation algorithm; step S4, recognizing the question text in the question area using OCR to obtain the test paper question, finding the matching question in the question bank according to a multi-bit search algorithm and a question matching algorithm, and extracting its standard answer and knowledge point label; step S5, recognizing the answer text in the answer area using OCR to obtain the test paper answer, calculating the similarity between the test paper answer and the standard answer according to an answer matching algorithm, and reporting where the test paper answer falls short of the standard answer; and step S6, providing similar questions according to a recommendation strategy so that the knowledge points can be consolidated.

Description

Automatic scoring and error correction recommendation method for subjective questions
Technical Field
The invention belongs to the technical field of automatic scoring, error correction and recommendation, and particularly relates to an automatic scoring and error correction recommendation method for subjective questions.
Background
In recent years, students' workloads have grown and they face more and more learning tasks. When doing homework, and especially when they meet a question they cannot solve, students at school can turn to a teacher for help; at home, however, where the parents' ability to help may be limited, they are often left at a loss.
With the development of science and technology, researchers can use big data and artificial intelligence to correct homework automatically. The prior art, in application number 202010603637.5, discloses a method, a device, an electronic device and a storage medium for automatically correcting homework. The method comprises: receiving a homework picture to be corrected from an intelligent terminal; inputting the homework picture into a pre-trained text detection model to generate the question information and answer information of a target question; performing OCR on the question information and the answer information respectively to obtain a question text and an answer text; searching a resource library according to the question text to obtain the answer analysis corresponding to the original question; comparing the answer analysis with the answer text to obtain their similarity; and, when the similarity is greater than or equal to a preset threshold, marking the answer to the target question as correct, or, when the similarity is below the preset threshold, marking it as wrong and returning the answer analysis to the intelligent terminal. Although this technology can correct homework automatically, it gives no specific algorithm for steps such as searching and comparing; for objective questions it reports only whether the answer is right or wrong without giving the reason, so students are still left confused; and its effectiveness on subjective questions is limited.
Disclosure of Invention
The present invention is made to solve the above problems, and an object of the present invention is to provide an automatic scoring and error correction recommendation method for subjective questions.
The invention provides an automatic scoring and error correction recommendation method for subjective questions, characterized by comprising the following steps:
step S1, establishing a question bank, wherein the question bank comprises questions and, for each question, the corresponding standard answer, knowledge point label and question number;
step S2, establishing a multi-bit inverted index table for each question number in the question bank, and building a multi-bit inverted index total library;
step S3, receiving a picture of the answer sheet to be scored, and dividing it into a question area and an answer area according to a target segmentation algorithm;
step S4, recognizing the question text in the question area using OCR to obtain the test paper question, finding the question that matches the test paper question in the question bank according to a multi-bit search algorithm and a question matching algorithm, and extracting the corresponding standard answer and knowledge point label;
step S5, recognizing the answer text in the answer area using OCR to obtain the test paper answer, calculating the similarity between the test paper answer and the standard answer according to an answer matching algorithm, and reporting where the test paper answer falls short of the standard answer;
step S6, searching the question bank for questions similar to the test paper question according to a recommendation strategy, so as to consolidate the knowledge points,
wherein, in step S5, the answer matching algorithm comprises the following specific steps:
step S5-1, recognizing the answer text in the answer area using OCR to obtain the test paper answer, denoted Daan, and denoting the standard answer returned in step S4 as Biaozhun;
step S5-2, using a BERT pre-trained model or an XLNet pre-trained model to generate sentence vectors for the test paper answer Daan and the standard answer Biaozhun, and using the cosine similarity algorithm to calculate the similarity Sim_1 between the vectors, where 0 ≤ Sim_1 ≤ 1;
step S5-3, using the TextRank algorithm to extract a keyword set G_1 from the test paper answer Daan and a keyword set G_2 from the standard answer Biaozhun, and calculating the similarity Sim_2 of the two keyword sets, where 0 ≤ Sim_2 ≤ 1, Sim_2 being calculated by formula (1):
[Formula (1), the expression for Sim_2, is an image in the original publication and is not reproduced in this text.]
step S5-4, fusing the similarity Sim_1 and the similarity Sim_2 to obtain the similarity Sim, calculated as follows:
Sim = Sim_1 × a + Sim_2 × b (2)
step S5-5, mapping the similarity Sim onto the score assigned to the test paper question and returning the score of the test paper answer; in addition, the elements that are in keyword set G_2 but not in keyword set G_1 are also returned, representing the shortcomings of the test paper answer compared with the standard answer,
wherein, in formula (2), a and b are the weights of the similarity Sim_1 and the similarity Sim_2 respectively, satisfying a + b = 1 and a ≥ b.
The automatic scoring and error correction recommendation method for subjective questions provided by the invention may also have the following feature: in step S1, the questions, standard answers and knowledge point labels are drawn from various books and internet resources, and the question number is generated from the question by the following specific steps:
step S1-1, performing word segmentation on the text of the question;
step S1-2, using the MD5 algorithm as a pseudo-random number generator, and using the TF-IDF algorithm or the BM25 algorithm to calculate the weight of each word obtained by segmentation;
step S1-3, generating the hash value of the question text with a 128-bit simhash algorithm based on the pseudo-random number generator and the word weights, and taking the hash value as the question number.
The automatic scoring and error correction recommendation method for subjective questions provided by the invention may also have the following feature: in step S2, the specific steps of establishing the multi-bit inverted index table and building the multi-bit inverted index total library are as follows:
step S2-1, letting the question number of a question in the question bank be hash, dividing the question number hash into M segments to obtain M sub-question numbers, as shown in formula (3),
hash = [hash_1, hash_2, …, hash_i, …, hash_M], i ∈ [1, M] (3)
step S2-2, forming combinations of (M - α) segments from the M sub-question numbers hash_i, i ∈ [1, M], of the question number hash to establish the multi-bit inverted index table, wherein α is the similarity threshold and M > α, so that each question number hash has C(M, M - α) indexes pointing to it, labeled in order from top to bottom as index 1, index 2, …, index C(M, M - α), where C(M, M - α) denotes the number of ways of choosing (M - α) of the M segments.
The multi-bit inverted index table for α = 3 and M = 4 is shown in formula (4),
[Formula (4) is an image in the original publication and is not reproduced in this text.]
the multi-bit inverted index table for α = 3 and M = 5 is shown in formula (5),
[Formula (5) is an image in the original publication and is not reproduced in this text.]
step S2-3, aggregating the multi-bit inverted index tables of all question numbers in the question bank to build the multi-bit inverted index total library,
wherein, in formula (3), hash_i, i ∈ [1, M], denotes a sub-question number of the question number hash.
The automatic scoring and error correction recommendation method for subjective questions provided by the invention may also have the following feature: in step S3, the specific steps of the target segmentation algorithm are as follows:
step S3-1, collecting a plurality of answer sheet pictures and manually annotating the question areas and answer areas on them, wherein, during annotation, regions of printed text are taken as question areas and regions of handwritten text are taken as answer areas;
step S3-2, taking the collected answer sheet pictures and the manual annotations as training data, and training a binary classification model that distinguishes printed text from handwritten text using a deep learning technique together with transfer learning;
step S3-3, inputting the answer sheet picture to be scored into the trained binary classification model, and dividing it into the question area and the answer area according to whether the text is printed or handwritten,
wherein the deep learning technique is a convolutional neural network (CNN), a recurrent neural network (RNN), an LSTM or a GRU.
The automatic scoring and error correction recommendation method for subjective questions provided by the invention may also have the following feature: in step S4, the specific steps of the multi-bit search algorithm are as follows:
step S4-1-1, recognizing the question text in the question area using OCR to obtain the test paper question, processing the test paper question in the same way as in the question number generation steps of step S1 to obtain the test paper question number, and denoting the test paper question number as Thash;
step S4-1-2, following the specific steps for establishing the multi-bit inverted index table in step S2, dividing the test paper question number Thash into M segments in the same way and establishing the multi-bit inverted index table of the test paper question number Thash;
step S4-1-3, according to the multi-bit inverted index table of the test paper question number Thash, searching the multi-bit inverted index total library for question numbers that have the same index value as Thash under the same index number, to obtain a question number set,
and the specific steps of the question matching algorithm are as follows:
step S4-2-1, calculating the Hamming distance H between the test paper question number Thash and each question number in the question number set, one by one;
step S4-2-2, if exactly one question number in the question number set has a Hamming distance H satisfying H ≤ α, taking the standard answer and knowledge point label corresponding to that question number as the return value;
step S4-2-3, if several question numbers in the question number set have a Hamming distance H satisfying H ≤ α, taking the standard answer and knowledge point label corresponding to the question number with the smallest Hamming distance H as the return value;
step S4-2-4, if several question numbers in the question number set have a Hamming distance H satisfying H ≤ α and more than one of them attains the smallest Hamming distance H, taking the standard answer and knowledge point label corresponding to any one of those question numbers as the return value;
step S4-2-5, if no question number matching the test paper question number is found, outputting an exception message indicating that no correct answer is currently available for the question, recording the corresponding test paper question, and adding it to the question bank after a question bank expert has provided the correct standard answer and knowledge point label.
The automatic scoring and error correction recommendation method for subjective questions provided by the invention may also have the following feature: in step S6, the specific steps of the recommendation strategy are as follows:
step S6-1, denoting the knowledge point label returned in step S4 as Tags;
step S6-2, searching the question bank for questions whose knowledge point labels are similar to the knowledge point label Tags, and returning one of them at random to consolidate the knowledge points.
Effects of the Invention
According to the automatic scoring and error correction recommendation method for subjective questions of the invention, in question matching, the simhash algorithm is used to generate the question numbers so as to ensure retrieval accuracy, and the multi-bit inverted index table is used for retrieval, which effectively improves retrieval efficiency. In answer matching, similarity is calculated with an advanced pre-trained model such as BERT or XLNet and the cosine similarity algorithm, and the TextRank keyword algorithm supplements the similarity measure so as to ensure the accuracy of the answer matching result; at the same time, the shortcomings of the student's test paper answer relative to the standard answer are reported, which helps students identify their own problems. In the recommendation strategy, similar questions are recommended to students according to the knowledge point labels, which helps them consolidate the knowledge points. The method can therefore relieve the situation in which students are at a loss when facing difficult questions on their own, avoid time-consuming question matching and unreasonable answer matching, and use big data and artificial intelligence algorithms to help students improve their knowledge.
Drawings
Fig. 1 is a flowchart of the automatic scoring and error correction recommendation method for subjective questions according to an embodiment of the present invention;
Fig. 2 is a flowchart of step S1 in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the multi-bit inverted index table established when α = 3 and M = 4 in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the multi-bit inverted index table established when α = 3 and M = 5 in an embodiment of the present invention;
Fig. 5 is a flowchart of step S4 in an embodiment of the present invention.
Detailed Description
In order to make the technical means and functions of the present invention easy to understand, the present invention is specifically described below with reference to the embodiments and the accompanying drawings.
<Example>
Fig. 1 is a flowchart of an automatic scoring and error correction recommendation method for subjective questions according to an embodiment of the present invention.
As shown in fig. 1, the automatic scoring and error correction recommendation method for subjective questions of this embodiment includes the following steps:
Step S1, establishing a question bank, wherein the question bank comprises questions and, for each question, the corresponding standard answer, knowledge point label and question number.
Fig. 2 is a flowchart of step S1 in an embodiment of the present invention.
As shown in Fig. 2, in step S1, the questions, standard answers and knowledge point labels are drawn from various books and internet resources, and the question number is generated from the question by the following specific steps:
step S1-1, performing word segmentation on the text of the question;
step S1-2, using the MD5 algorithm as a pseudo-random number generator, and using the TF-IDF algorithm or the BM25 algorithm to calculate the weight of each word obtained by segmentation;
step S1-3, generating the hash value of the question text with a 128-bit simhash algorithm based on the pseudo-random number generator and the word weights, and taking the hash value as the question number.
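By way of illustration only, steps S1-2 and S1-3 can be sketched in Python as follows. The sketch assumes that word segmentation and the TF-IDF or BM25 weighting have already produced a mapping from words to weights; the function names `md5_bits` and `simhash128` and the sample weights are hypothetical and do not appear in the original disclosure.

```python
import hashlib
from typing import Dict


def md5_bits(word: str) -> int:
    """Map a word to a 128-bit integer; MD5 plays the role of the pseudo-random number generator."""
    return int.from_bytes(hashlib.md5(word.encode("utf-8")).digest(), "big")


def simhash128(weights: Dict[str, float]) -> int:
    """Combine per-word weights into a 128-bit simhash (assumed inputs: word -> weight)."""
    totals = [0.0] * 128
    for word, weight in weights.items():
        bits = md5_bits(word)
        for i in range(128):
            # Each word votes +weight where its MD5 bit is 1 and -weight where it is 0.
            if (bits >> i) & 1:
                totals[i] += weight
            else:
                totals[i] -= weight
    value = 0
    for i, total in enumerate(totals):
        if total > 0:  # a positive vote total sets bit i of the question number
            value |= 1 << i
    return value


# Hypothetical weights, standing in for the output of TF-IDF or BM25.
weights = {"solve": 1.2, "equation": 2.5, "quadratic": 3.1}
question_number = simhash128(weights)
print(f"{question_number:032x}")
```

Because each bit is decided by a weighted vote, question texts that share most of their heavily weighted words tend to receive question numbers that differ in only a few bits, which is the property the Hamming distance test in step S4 relies on.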
Step S2, establishing a multi-bit inverted index table for each question number in the question bank, and building a multi-bit inverted index total library.
In step S2, the specific steps of establishing the multi-bit inverted index table and building the multi-bit inverted index total library are as follows:
step S2-1, letting the question number of a question in the question bank be hash, dividing the question number hash into M segments to obtain M sub-question numbers, as shown in formula (3),
hash = [hash_1, hash_2, …, hash_i, …, hash_M], i ∈ [1, M] (3)
step S2-2, forming combinations of (M - α) segments from the M sub-question numbers hash_i, i ∈ [1, M], of the question number hash to establish the multi-bit inverted index table, wherein α is the similarity threshold and M > α, so that each question number hash has C(M, M - α) indexes pointing to it, labeled in order from top to bottom as index 1, index 2, …, index C(M, M - α), where C(M, M - α) denotes the number of ways of choosing (M - α) of the M segments.
The multi-bit inverted index table for α = 3 and M = 4 is shown in formula (4),
[Formula (4) is an image in the original publication and is not reproduced in this text.]
Fig. 3 is a schematic diagram of the multi-bit inverted index table established when α = 3 and M = 4 in an embodiment of the present invention.
As shown in Fig. 3, when α = 3 and M = 4, the question number is divided into 4 sub-question numbers, and a total of 4 indexes (index 1, index 2, index 3 and index 4) point to the question number.
The multi-bit inverted index table for α = 3 and M = 5 is shown in formula (5),
[Formula (5) is an image in the original publication and is not reproduced in this text.]
Fig. 4 is a schematic diagram of the multi-bit inverted index table established when α = 3 and M = 5 in an embodiment of the present invention.
As shown in Fig. 4, when α = 3 and M = 5, the question number is divided into 5 sub-question numbers, and a total of 10 indexes (index 1 to index 10) point to the question number.
Step S2-3, aggregating the multi-bit inverted index tables of all question numbers in the question bank to build the multi-bit inverted index total library,
wherein, in formula (3), hash_i, i ∈ [1, M], denotes a sub-question number of the question number hash.
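The construction in steps S2-1 to S2-3 can be sketched as follows. This is a minimal illustration that assumes the hash is cut into M segments of equal width (the disclosure does not state how the bits are divided) and that an index key is the pair of chosen segment positions and their values; `split_segments`, `index_keys` and `build_total_index` are hypothetical names.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple

# An index key: (positions of the chosen segments, values of those segments).
Key = Tuple[Tuple[int, ...], Tuple[int, ...]]


def split_segments(h: int, m: int, bits: int = 128) -> List[int]:
    """Split a `bits`-bit hash into m equal-width segments (assumes bits % m == 0)."""
    width = bits // m
    mask = (1 << width) - 1
    # Segment 0 is the most significant part of the hash.
    return [(h >> (width * (m - 1 - i))) & mask for i in range(m)]


def index_keys(h: int, m: int, alpha: int, bits: int = 128) -> List[Key]:
    """Multi-bit inverted index table of one hash: every choice of (m - alpha) segments."""
    segs = split_segments(h, m, bits)
    return [
        (pos, tuple(segs[p] for p in pos))
        for pos in combinations(range(m), m - alpha)
    ]


def build_total_index(
    hashes: Iterable[int], m: int, alpha: int, bits: int = 128
) -> Dict[Key, Set[int]]:
    """Aggregate the index tables of all question numbers into one total library."""
    total: Dict[Key, Set[int]] = defaultdict(set)
    for h in hashes:
        for key in index_keys(h, m, alpha, bits):
            total[key].add(h)
    return total
```

With M = 4 and α = 3 each question number yields 4 keys, and with M = 5 and α = 3 it yields 10, in agreement with Figs. 3 and 4. Two hashes whose Hamming distance is at most α differ in at most α segments, so they agree on at least (M - α) segments and therefore share at least one key.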
Step S3, receiving a picture of the answer sheet to be scored, and dividing it into a question area and an answer area according to a target segmentation algorithm.
In step S3, the specific steps of the target segmentation algorithm are as follows:
step S3-1, collecting a plurality of answer sheet pictures and manually annotating the question areas and answer areas on them, wherein, during annotation, regions of printed text are taken as question areas and regions of handwritten text are taken as answer areas, the question areas and answer areas being mainly rectangular;
step S3-2, taking the collected answer sheet pictures and the manual annotations as training data, and training a binary classification model that distinguishes printed text from handwritten text using a deep learning technique together with transfer learning;
step S3-3, inputting the answer sheet picture to be scored into the trained binary classification model, and dividing it into the question area and the answer area according to whether the text is printed or handwritten,
wherein the deep learning technique is a convolutional neural network (CNN), a recurrent neural network (RNN), an LSTM or a GRU.
Fig. 5 is a flowchart of step S4 in an embodiment of the present invention.
As shown in Fig. 5, in step S4, the question text in the question area is recognized using OCR to obtain the test paper question, the question matching the test paper question is found in the question bank according to the multi-bit search algorithm and the question matching algorithm, and the corresponding standard answer and knowledge point label are extracted.
In step S4, the specific steps of the multi-bit search algorithm are as follows:
step S4-1-1, recognizing the question text in the question area using OCR to obtain the test paper question, processing the test paper question in the same way as in the question number generation steps of step S1 to obtain the test paper question number, and denoting the test paper question number as Thash;
step S4-1-2, following the specific steps for establishing the multi-bit inverted index table in step S2, dividing the test paper question number Thash into M segments in the same way and establishing the multi-bit inverted index table of the test paper question number Thash;
step S4-1-3, according to the multi-bit inverted index table of the test paper question number Thash, searching the multi-bit inverted index total library for question numbers that have the same index value as Thash under the same index number, to obtain a question number set,
and the specific steps of the question matching algorithm are as follows:
step S4-2-1, calculating the Hamming distance H between the test paper question number Thash and each question number in the question number set, one by one;
step S4-2-2, if exactly one question number in the question number set has a Hamming distance H satisfying H ≤ α, taking the standard answer and knowledge point label corresponding to that question number as the return value;
step S4-2-3, if several question numbers in the question number set have a Hamming distance H satisfying H ≤ α, taking the standard answer and knowledge point label corresponding to the question number with the smallest Hamming distance H as the return value;
step S4-2-4, if several question numbers in the question number set have a Hamming distance H satisfying H ≤ α and more than one of them attains the smallest Hamming distance H, taking the standard answer and knowledge point label corresponding to any one of those question numbers as the return value;
step S4-2-5, if no question number matching the test paper question number is found, outputting an exception message indicating that no correct answer is currently available for the question, recording the corresponding test paper question, and adding it to the question bank after a question bank expert has provided the correct standard answer and knowledge point label.
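The search in steps S4-1-2 and S4-1-3 and the selection in steps S4-2-1 to S4-2-5 can be sketched as follows. The sketch assumes equal-width segments and a total library stored as a dictionary from (segment positions, segment values) to sets of question numbers; the names `keys_for`, `hamming`, `candidates` and `best_match` are hypothetical, and returning `None` stands in for the exception path of step S4-2-5.

```python
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

Key = Tuple[Tuple[int, ...], Tuple[int, ...]]


def keys_for(h: int, m: int, alpha: int, bits: int = 128) -> List[Key]:
    """Index keys of a hash: every choice of (m - alpha) of its m equal-width segments."""
    width = bits // m
    mask = (1 << width) - 1
    segs = [(h >> (width * (m - 1 - i))) & mask for i in range(m)]
    return [(pos, tuple(segs[p] for p in pos))
            for pos in combinations(range(m), m - alpha)]


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def candidates(thash: int, total_index: Dict[Key, Set[int]],
               m: int, alpha: int, bits: int = 128) -> Set[int]:
    """Step S4-1-3: question numbers sharing at least one index key with Thash."""
    found: Set[int] = set()
    for key in keys_for(thash, m, alpha, bits):
        found |= total_index.get(key, set())
    return found


def best_match(thash: int, cands: Set[int], alpha: int) -> Optional[int]:
    """Steps S4-2-1 to S4-2-5: closest question number with H <= alpha, else None."""
    within = [(hamming(thash, h), h) for h in cands if hamming(thash, h) <= alpha]
    if not within:
        return None  # no match: the caller records the question for expert review
    # Ties on the smallest distance may be broken arbitrarily; here the smaller number wins.
    return min(within)[1]
```

The candidate set is usually far smaller than the question bank, so the Hamming distance only has to be computed for a handful of question numbers rather than for every entry.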
Step S5, recognizing the answer text in the answer area using OCR to obtain the test paper answer, calculating the similarity between the test paper answer and the standard answer according to an answer matching algorithm, and reporting where the test paper answer falls short of the standard answer.
In step S5, the specific steps of the answer matching algorithm are as follows:
step S5-1, recognizing the answer text in the answer area using OCR to obtain the test paper answer, denoted Daan, and denoting the standard answer returned in step S4 as Biaozhun;
step S5-2, using a BERT pre-trained model or an XLNet pre-trained model to generate sentence vectors for the test paper answer Daan and the standard answer Biaozhun, and using the cosine similarity algorithm to calculate the similarity Sim_1 between the vectors, where 0 ≤ Sim_1 ≤ 1;
step S5-3, using the TextRank algorithm to extract a keyword set G_1 from the test paper answer Daan and a keyword set G_2 from the standard answer Biaozhun, and calculating the similarity Sim_2 of the two keyword sets, where 0 ≤ Sim_2 ≤ 1, Sim_2 being calculated by formula (1):
[Formula (1), the expression for Sim_2, is an image in the original publication and is not reproduced in this text.]
step S5-4, fusing the similarity Sim_1 and the similarity Sim_2 to obtain the similarity Sim, calculated as follows:
Sim = Sim_1 × a + Sim_2 × b (2)
step S5-5, mapping the similarity Sim onto the score assigned to the test paper question and returning the score of the test paper answer; in addition, the elements that are in keyword set G_2 but not in keyword set G_1 are also returned, representing the shortcomings of the test paper answer compared with the standard answer,
wherein, in formula (2), a and b are the weights of the similarity Sim_1 and the similarity Sim_2 respectively, satisfying a + b = 1 and a ≥ b.
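Steps S5-2 to S5-5 can be sketched as follows, under several stated assumptions. The sentence vectors and keyword sets are taken as given (in the method they would come from a BERT or XLNet model and from TextRank). Since formula (1) is not available in this text, `keyword_overlap` is only a stand-in that measures the share of standard-answer keywords found in the test paper answer; it is not claimed to be formula (1). The linear mapping from Sim to a score, the default weights, and all function names are likewise hypothetical.

```python
import math
from typing import List, Sequence, Set, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two sentence vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def keyword_overlap(g1: Set[str], g2: Set[str]) -> float:
    """Stand-in for formula (1): fraction of standard-answer keywords G_2 present in G_1."""
    return len(g1 & g2) / len(g2) if g2 else 0.0


def grade(sim1: float, g1: Set[str], g2: Set[str], full_marks: float,
          a: float = 0.6, b: float = 0.4) -> Tuple[float, List[str]]:
    """Fuse the similarities per formula (2) and list the keywords the answer lacks."""
    if abs(a + b - 1.0) > 1e-9 or a < b:
        raise ValueError("weights must satisfy a + b = 1 and a >= b")
    sim1 = min(1.0, max(0.0, sim1))          # keep Sim_1 inside [0, 1]
    sim = sim1 * a + keyword_overlap(g1, g2) * b
    score = sim * full_marks                 # assumed linear mapping onto the score
    missing = sorted(g2 - g1)                # in G_2 but not in G_1
    return score, missing


# Toy vectors and keyword sets, standing in for model and TextRank output.
sim1 = cosine_similarity([0.2, 0.7, 0.1], [0.25, 0.6, 0.2])
score, missing = grade(sim1, {"force", "mass"}, {"force", "mass", "acceleration"}, 10.0)
print(round(score, 2), missing)
```

Returning the missing keywords alongside the score is what allows the method to tell the student which points of the standard answer were not covered, rather than only reporting a mark.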
Step S6, searching the question bank for questions similar to the test paper question according to a recommendation strategy, so as to consolidate the knowledge points.
In step S6, the specific steps of the recommendation strategy are as follows:
step S6-1, denoting the knowledge point label returned in step S4 as Tags;
step S6-2, searching the question bank for questions whose knowledge point labels are similar to the knowledge point label Tags, and returning one of them at random to consolidate the knowledge points.
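A minimal sketch of the recommendation strategy is given below. The disclosure does not define when two knowledge point labels count as similar, so the sketch assumes that each question carries a set of labels and treats any shared label as similar; the function `recommend`, the `exclude` parameter and the sample bank are hypothetical.

```python
import random
from typing import Dict, Optional, Set


def recommend(tags: Set[str], bank: Dict[str, Set[str]],
              exclude: Optional[str] = None,
              rng: Optional[random.Random] = None) -> Optional[str]:
    """Return the id of a random question sharing at least one label with `tags`."""
    rng = rng or random.Random()
    pool = [qid for qid, qtags in bank.items()
            if qid != exclude and qtags & tags]   # assumed notion of "similar"
    return rng.choice(pool) if pool else None


# Hypothetical question bank: question id -> knowledge point labels.
bank = {
    "q1": {"fractions"},
    "q2": {"geometry"},
    "q3": {"fractions", "ratios"},
}
print(recommend({"fractions"}, bank, exclude="q1"))
```

Excluding the question that was just answered keeps the method from recommending the same question back to the student; a stricter similarity rule could replace the set intersection without changing the rest of the sketch.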
Effects of the Embodiment
According to the automatic scoring and error correction recommendation method for subjective questions of this embodiment, in question matching, the simhash algorithm is used to generate the question numbers so as to ensure retrieval accuracy, and the multi-bit inverted index table is used for retrieval, which effectively improves retrieval efficiency. In answer matching, similarity is calculated with an advanced pre-trained model such as BERT or XLNet and the cosine similarity algorithm, and the TextRank keyword algorithm supplements the similarity measure so as to ensure the accuracy of the answer matching result; at the same time, the shortcomings of the student's test paper answer relative to the standard answer are reported, which helps students identify their own problems. In the recommendation strategy, similar questions are recommended to students according to the knowledge point labels, which helps them consolidate the knowledge points. The method can therefore relieve the situation in which students are at a loss when facing difficult questions on their own, avoid time-consuming question matching and unreasonable answer matching, and use big data and artificial intelligence algorithms to help students improve their knowledge.
The above embodiments are preferred examples of the present invention, and are not intended to limit the scope of the present invention.

Claims (6)

1. An automatic scoring and error correction recommendation method for subjective questions is characterized by comprising the following steps:
step S1, establishing a question bank, wherein the question bank comprises questions, and standard answers, knowledge point labels and question numbers corresponding to the questions;
step S2, establishing a multi-bit inverted index table for the question numbers in the question bank, and establishing a multi-bit inverted index total bank;
step S3, receiving an answer sheet picture to be scored, and dividing a question area and an answer area according to a target segmentation algorithm;
step S4, recognizing the question text in the question area by using an OCR technology to obtain a test paper question, finding the question matching the test paper question in the question bank according to a multi-bit search algorithm and a question matching algorithm, and extracting the corresponding standard answer and knowledge point label;
step S5, recognizing the answer text in the answer area by using an OCR technology to obtain a test paper answer, calculating the similarity between the test paper answer and the standard answer according to an answer matching algorithm, and reporting the deficiencies of the test paper answer compared with the standard answer;
step S6, searching the question bank for the question similar to the question of the test paper according to the recommendation strategy to consolidate the knowledge point,
in step S5, the answer matching algorithm includes the following specific steps:
step S5-1, recognizing the answer text of the answer area by using an OCR technology, obtaining the answer of the test paper, marking as Daan, and recording the standard answer returned in the step S4 as Biaozhun;
step S5-2, generating sentence vectors of the test paper answer Daan and the standard answer Biaozhun by using a BERT pre-training model or an XLNet pre-training model, and calculating the similarity $Sim_1$ between the vectors by using a cosine similarity algorithm, where $0 \le Sim_1 \le 1$;
step S5-3, extracting a keyword set $G_1$ from the test paper answer Daan by using a TextRank algorithm, extracting a keyword set $G_2$ from the standard answer Biaozhun by using the TextRank algorithm, and calculating the similarity $Sim_2$ of the two keyword sets, where $0 \le Sim_2 \le 1$, the calculation formula of the similarity $Sim_2$ being as follows:
Figure FDA0003120001590000021
step S5-4, fusing the similarity $Sim_1$ and the similarity $Sim_2$ to obtain the similarity $Sim$, the calculation formula being as follows:
$Sim = Sim_1 \times a + Sim_2 \times b \quad (2)$
step S5-5, mapping the similarity $Sim$ onto the score set for the test paper question and returning the score of the test paper answer, and in addition returning the elements that are in the keyword set $G_2$ but not in the keyword set $G_1$, these elements representing the features that the test paper answer lacks compared with the standard answer,
wherein, in formula (2), a and b are the weights of the similarity $Sim_1$ and the similarity $Sim_2$ respectively, and satisfy $a + b = 1$ and $a \ge b$.
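The answer matching steps S5-2 to S5-5 can be sketched as below, under several stated assumptions. The sentence vectors and keyword sets are taken as inputs, because running a BERT or XLNet model and TextRank is outside the scope of a short example. Formula (1) is available only as an image, so the keyword similarity used here (the share of standard-answer keywords found in the student answer) is a stand-in and not the patent's formula. The linear mapping from similarity to score, the default weights 0.7 and 0.3, the clamping of the cosine value to [0, 1], and all function names are likewise assumptions of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two vectors, clamped to [0, 1] (clamping is an assumption)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0:
        return 0.0
    return min(1.0, max(0.0, dot / norm))


def keyword_overlap(g1, g2):
    """Stand-in for formula (1): share of standard keywords G2 present in G1."""
    return len(g1 & g2) / len(g2) if g2 else 0.0


def grade(vec_daan, vec_biaozhun, g1, g2, full_score, a=0.7, b=0.3):
    """Fuse Sim1 and Sim2 as in formula (2) and report missing keywords."""
    if abs(a + b - 1.0) > 1e-9 or a < b:
        raise ValueError("weights must satisfy a + b = 1 and a >= b")
    sim1 = cosine_similarity(vec_daan, vec_biaozhun)
    sim2 = keyword_overlap(g1, g2)
    sim = sim1 * a + sim2 * b                      # formula (2)
    return {
        "sim": sim,
        "score": sim * full_score,                 # linear mapping (assumption)
        "missing": sorted(g2 - g1),                # in G2 but not in G1
    }
```

For example, identical vectors with keyword sets `{"a", "b"}` and `{"a", "b", "c", "d"}` give Sim1 = 1 and, under the stand-in overlap, Sim2 = 0.5, so the fused similarity is 0.7 + 0.15 = 0.85 and the missing features are `c` and `d`.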
2. The automatic scoring and correction recommendation method for subjective questions according to claim 1, wherein:
in step S1, the questions, the standard answers, and the knowledge point labels are derived from various books and internet resources, and the question numbers are generated from the questions, the specific generation steps being as follows:
step S1-1, performing word segmentation on the text of the question;
step S1-2, using the MD5 algorithm as a pseudo-random number generator, and using the TF-IDF algorithm or the BM25 algorithm to calculate the word weight of each word obtained by the word segmentation;
and step S1-3, generating a hash value corresponding to the text of the question by using a 128-bit simhash algorithm according to the pseudo-random number generator and the word weights, and taking the hash value as the question number.
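A 128-bit simhash driven by MD5, as described in steps S1-2 and S1-3, can be sketched as follows. The sketch assumes that word segmentation and TF-IDF or BM25 weighting have already been done, so it takes a mapping from word to weight as input. The function names `md5_int` and `simhash128` are hypothetical.

```python
import hashlib


def md5_int(word):
    """128-bit integer derived from the MD5 digest of a word."""
    return int.from_bytes(hashlib.md5(word.encode("utf-8")).digest(), "big")


def simhash128(weighted_words):
    """Compute a 128-bit simhash from a {word: weight} mapping.

    For every bit position, a word adds its weight if its MD5 bit is 1 and
    subtracts it otherwise; the sign of the total decides the output bit.
    """
    totals = [0.0] * 128
    for word, weight in weighted_words.items():
        h = md5_int(word)
        for bit in range(128):
            if (h >> bit) & 1:
                totals[bit] += weight
            else:
                totals[bit] -= weight
    fingerprint = 0
    for bit in range(128):
        if totals[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


# Hypothetical weights; in the method they would come from TF-IDF or BM25.
question_number = simhash128({"triangle": 2.5, "area": 1.8, "calculate": 0.4})
```

Because each output bit depends only on a weighted vote, two questions that share most of their heavily weighted words tend to receive question numbers that differ in few bits, which is what the later Hamming distance comparison relies on.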
3. The automatic scoring and correction recommendation method for subjective questions according to claim 2, wherein:
in step S2, the specific steps of establishing the multi-bit inverted index table and establishing the multi-bit inverted index total library are as follows:
step S2-1, denoting the question number of a question in the question bank as hash, and dividing the question number hash into M segments to obtain M sub-question numbers, as shown in formula (3),
$hash = [hash_1, hash_2, \ldots, hash_i, \ldots, hash_M],\ i \in [1, M] \quad (3)$
step S2-2, performing (M−α)-bit permutation and combination on the M sub-question numbers $hash_i$, $i \in [1, M]$, of the question number hash to establish the multi-bit inverted index table, wherein α is the similarity threshold and M > α, each question number hash has [Figure FDA0003120001590000031] indexes pointing to it, and the indexes are labelled in order from top to bottom as index 1, index 2, ..., index [Figure FDA0003120001590000032],
the multi-bit inverted index table when α is 3 and M is 4 is shown in formula (4),
Figure FDA0003120001590000041
the multi-bit inverted index table when α is 3 and M is 5 is shown in equation (5),
Figure FDA0003120001590000042
step S2-3, summarizing the multi-bit inverted index tables of all the question numbers in the question bank, constructing to obtain the multi-bit inverted index total bank,
wherein, in formula (3), $hash_i$, $i \in [1, M]$, denotes a sub-question number of the question number hash.
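The index construction of steps S2-1 to S2-3 can be sketched as follows. The rationale is a pigeonhole argument: if two question numbers differ in at most α bits, at most α of their M segments can differ, so at least M−α segments are identical and at least one (M−α)-segment key is shared. The number of indexes per question number appears in the source only as an image; this sketch assumes it is the number of ways to choose M−α segments out of M, which gives 4 keys for M = 4, α = 3 and 10 keys for M = 5, α = 3. The function names and the near-equal split of 128 bits when M does not divide 128 are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def split_segments(h, m, bits=128):
    """Split a `bits`-wide hash into m consecutive segments of near-equal length."""
    s = format(h, f"0{bits}b")
    bounds = [bits * i // m for i in range(m + 1)]
    return [s[bounds[i]:bounds[i + 1]] for i in range(m)]


def index_keys(h, m, alpha, bits=128):
    """All (M - alpha)-segment keys of one question number.

    Each key records which segment positions were chosen and their values.
    """
    segs = split_segments(h, m, bits)
    return [
        (pos, tuple(segs[i] for i in pos))
        for pos in combinations(range(m), m - alpha)
    ]


def build_total_index(hashes, m, alpha, bits=128):
    """Multi-bit inverted index total library: key -> set of question numbers."""
    total = defaultdict(set)
    for h in hashes:
        for key in index_keys(h, m, alpha, bits):
            total[key].add(h)
    return total
```

Storing the segment positions inside each key keeps a value found in segment 1 from colliding with the same value found in segment 3, which corresponds to matching both the index number and the index value.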
4. The automatic scoring and correction recommendation method for subjective questions according to claim 1, wherein:
in step S3, the target segmentation algorithm includes the following specific steps:
step S3-1, collecting a plurality of answer sheet pictures, and manually marking the question areas and the answer areas on the answer sheet pictures, wherein, during manual marking, the areas formed by printed text are taken as the question areas and the areas formed by handwritten text are taken as the answer areas;
step S3-2, taking the collected answer sheet pictures and the manually marked information as training data, and training a binary classification model for printed text and handwritten text by using a deep learning technology together with a transfer learning technology;
step S3-3, inputting the answer sheet picture to be scored into the trained binary classification model, and dividing the answer sheet picture according to printed text and handwritten text to obtain the question area and the answer area,
wherein the deep learning technology is a convolutional neural network (CNN), a recurrent neural network (RNN), an LSTM, or a GRU.
5. The automatic scoring and correction recommendation method for subjective questions according to claim 1, wherein:
in step S4, the multi-bit search algorithm includes the following specific steps:
step S4-1-1, recognizing the question text in the question area by using an OCR technology to obtain the test paper question, processing the test paper question according to the specific generation steps of the question number in step S1 to obtain a test paper question number, and recording the test paper question number as Thash;
step S4-1-2, according to the specific steps for establishing the multi-bit inverted index table in step S2, likewise dividing the test paper question number Thash into M segments, and establishing the multi-bit inverted index table of the test paper question number Thash;
step S4-1-3, according to the multi-bit inverted index table of the test paper question number Thash, searching the multi-bit inverted index total library for the question numbers whose index numbers and index values are the same as those of the test paper question number Thash, to obtain a question number set,
the question matching algorithm comprises the following specific steps:
step S4-2-1, calculating the Hamming distance H between the test paper question number Thash and each question number in the question number set, one by one;
step S4-2-2, if only one question number in the question number set has a Hamming distance H satisfying H ≤ α, taking the standard answer and the knowledge point label corresponding to that question number as return values;
step S4-2-3, if a plurality of question numbers in the question number set have a Hamming distance H satisfying H ≤ α, taking the standard answer and the knowledge point label corresponding to the question number with the smallest Hamming distance H as return values;
step S4-2-4, if a plurality of question numbers in the question number set have a Hamming distance H satisfying H ≤ α and a plurality of question numbers share the smallest Hamming distance H, taking the standard answer and the knowledge point label corresponding to any one of those question numbers as return values;
and step S4-2-5, if no question number matching the test paper question number is found, outputting exception information indicating that no correct answer is currently available for the question, recording the corresponding test paper question, and adding the correct standard answer and knowledge point label to the question bank after a question bank expert provides them.
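The search and matching steps of this claim can be sketched end to end as below. The sketch is self-contained and therefore rebuilds a small index itself; the key layout, the near-equal segment split, and the names `keys`, `build`, `hamming`, and `match` are assumptions. Where the claim allows any one of several equally close question numbers to be chosen, the sketch picks the numerically smallest one, which is one deterministic way to make an arbitrary choice. Returning `None` stands in for the exception information of step S4-2-5.

```python
from itertools import combinations


def keys(h, m, alpha, bits=128):
    """(M - alpha)-segment keys of a question number (position tuple, value tuple)."""
    s = format(h, f"0{bits}b")
    bounds = [bits * i // m for i in range(m + 1)]
    segs = [s[bounds[i]:bounds[i + 1]] for i in range(m)]
    return [
        (pos, tuple(segs[i] for i in pos))
        for pos in combinations(range(m), m - alpha)
    ]


def build(bank, m, alpha, bits=128):
    """Index every question number in `bank` (a dict keyed by question number)."""
    total = {}
    for h in bank:
        for key in keys(h, m, alpha, bits):
            total.setdefault(key, set()).add(h)
    return total


def hamming(a, b):
    """Number of differing bits between two question numbers."""
    return bin(a ^ b).count("1")


def match(thash, total_index, bank, m, alpha, bits=128):
    """Return the bank entry closest to `thash`, or None if none is within alpha."""
    candidates = set()
    for key in keys(thash, m, alpha, bits):          # steps S4-1-2 and S4-1-3
        candidates |= total_index.get(key, set())
    within = [
        (hamming(thash, h), h) for h in candidates   # step S4-2-1
        if hamming(thash, h) <= alpha
    ]
    if not within:
        return None                                  # step S4-2-5
    _, best = min(within)                            # steps S4-2-2 to S4-2-4
    return bank[best]


# Toy 16-bit example with hypothetical answers; real question numbers are 128-bit.
bank = {0x0000: "answer A", 0xFFFF: "answer B"}
index = build(bank, 4, 3, bits=16)
found = match(0x0003, index, bank, 4, 3, bits=16)
```

The index lookup only narrows the candidates; the Hamming distance check is still needed because sharing one segment does not guarantee that the whole question number is within the threshold.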
6. The automatic scoring and correction recommendation method for subjective questions according to claim 1, wherein:
in step S6, the recommendation strategy includes the following specific steps:
step S6-1, recording the knowledge point label returned in the step S4 as Tags;
step S6-2, searching the question bank for the questions whose knowledge point labels are similar to the knowledge point label Tags, and randomly returning one of the questions for knowledge point consolidation.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110672735.9A CN113392187A (en) 2021-06-17 2021-06-17 Automatic scoring and error correction recommendation method for subjective questions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110672735.9A CN113392187A (en) 2021-06-17 2021-06-17 Automatic scoring and error correction recommendation method for subjective questions

Publications (1)

Publication Number Publication Date
CN113392187A true CN113392187A (en) 2021-09-14

Family

ID=77621762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110672735.9A Pending CN113392187A (en) 2021-06-17 2021-06-17 Automatic scoring and error correction recommendation method for subjective questions

Country Status (1)

Country Link
CN (1) CN113392187A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108172050A (en) * 2017-12-26 2018-06-15 科大讯飞股份有限公司 Mathematics subjective item answer result corrects method and system
CN110363194A (en) * 2019-06-17 2019-10-22 深圳壹账通智能科技有限公司 Intelligently reading method, apparatus, equipment and storage medium based on NLP
CN111310458A (en) * 2020-03-20 2020-06-19 广东工业大学 Subjective question automatic scoring method based on multi-feature fusion
CN111753767A (en) * 2020-06-29 2020-10-09 广东小天才科技有限公司 Method and device for automatically correcting operation, electronic equipment and storage medium
CN111897982A (en) * 2020-06-17 2020-11-06 昆明理工大学 Medical CT image storage and retrieval method
CN112560429A (en) * 2020-12-23 2021-03-26 信雅达科技股份有限公司 Intelligent training detection method and system based on deep learning

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115774996A (en) * 2022-12-05 2023-03-10 英仕互联(北京)信息技术有限公司 Question-following generation method and device for intelligent interview and electronic equipment
CN116595129A (en) * 2023-06-12 2023-08-15 广州市南方人力资源评价中心有限公司 Subjective question scoring method and device based on knowledge point labeling
CN116595129B (en) * 2023-06-12 2023-10-27 广州市南方人力资源评价中心有限公司 Subjective question scoring method and device based on knowledge point labeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210914