CN112800182A - Test question generation method and device - Google Patents


Info

Publication number
CN112800182A
Authority
CN
China
Prior art keywords
question
information
test question
test
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110185141.5A
Other languages
Chinese (zh)
Inventor
湛志强
张柳新
高菁华
常新峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN202110185141.5A
Publication of CN112800182A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The present disclosure relates to a test question generation method and device. The method includes: performing first natural language processing on the question information of a first test question to determine the knowledge points examined in the first test question; performing second natural language processing on the question stem information of the first test question to obtain entity information of the first test question, the entity information being the core words contained in the first test question; and generating at least one second test question based on the knowledge points and the entity information, wherein the second test question has the same question type as the first test question. The method can generate different question stem descriptions while the examined knowledge points remain unchanged, and thus fully examines students' understanding and mastery of the knowledge points; meanwhile, because different test questions are described by different question stems, cheating in the test can be effectively prevented, and the fairness and justness of test results are guaranteed to the greatest extent.

Description

Test question generation method and device
Technical Field
The disclosure relates to the technical field of information processing, in particular to a test question generation method and device.
Background
In the prior art, in order to prevent students from cheating in an examination, different test papers are formed by randomly scrambling test question numbers (or candidate answer sequences). Although the existing anti-cheating method can play a certain role, the question stem description of the question is generally kept unchanged, so that the students can still copy according to the question and the candidate answers. In addition, in the prior art, the question stem of the question has a single description form, and the understanding and mastering conditions of students on knowledge points cannot be fully examined.
Disclosure of Invention
The embodiment of the disclosure provides a test question generation method and device, which can solve the problems that understanding and mastering conditions of students on knowledge points cannot be fully examined and cheating of the students cannot be effectively prevented in the prior art.
According to one aspect of the present disclosure, there is provided a test question generation method, including:
performing first natural language processing on the question information of a first test question to determine knowledge points examined in the first test question;
performing second natural language processing on the question stem information of the first test question to obtain entity information of the first test question; the entity information is a core word contained in the first test question;
and generating at least one second test question based on the knowledge points and the entity information, wherein the second test question has the same question type as the first test question.
In some embodiments, performing first natural language processing on the question information of a first test question to determine knowledge points examined in the first test question comprises:
determining keywords of the question information;
comparing the keywords with knowledge points of a preset category, and determining the semantic similarity between the keywords and the knowledge points of the preset category;
and when the semantic similarity between the keywords and the knowledge points of the preset category is greater than a preset semantic threshold, determining the knowledge point of the preset category with the maximum semantic similarity as the knowledge point examined in the first test question.
In some embodiments, determining the keywords of the question information comprises:
performing word segmentation processing on the question information of the first test question to determine candidate keywords;
and determining synonyms of the candidate keywords, and taking the candidate keywords and the synonyms as the keywords of the question information.
In some embodiments, determining the keywords of the question information comprises:
determining the question type of the first test question;
determining the information categories contained in the question information of the first test question based on the question type of the first test question;
and determining the keywords of the question information according to the information categories.
In some embodiments, when the first test question is a choice question, the question information of the first test question includes question stem information and candidate answer information, and determining the keywords of the question information according to the information categories includes:
determining keywords in the question stem information and in the candidate answer information of the first test question respectively, and determining the keywords of the question stem information that match the keywords of the candidate answer information as the keywords of the first test question; or,
when the first test question is a question-and-answer question or an analysis question, the question information of the first test question includes question stem information and question information (that is, the question posed), and determining the keywords of the question information according to the information categories includes:
determining keywords in the question stem information and in the question information of the first test question respectively, and determining the keywords of the question stem information that match the keywords of the question information as the keywords of the first test question.
In some embodiments, generating at least one second question based on the knowledge points and the entity information comprises:
determining an associated word corresponding to the core word based on the text features of the core word and the syntactic features of the stem information;
and generating, based on the knowledge points, the core words and the associated words, question stem information of at least one second test question that conforms to a grammatical structure.
In some embodiments, determining the associated word corresponding to the core word based on the text features of the core word and the syntactic features of the stem information comprises:
determining the semantics of the core word as expressed in the stem information based on the text features of the core word and the syntactic features of the stem information;
performing semantic expansion on the core word based on the semantics of the core word as expressed in the stem information to obtain a first associated word; and/or
determining a context associated word corresponding to the core word in the stem information based on the text features of the core word and the syntactic features of the stem information, and performing semantic expansion on the context associated word to obtain a second associated word.
In some embodiments, the text features of a core word include at least one of its category, its part of speech, its position in the stem information, and its dependency relationships with other core words.
In some embodiments, the method further comprises:
determining a difficulty coefficient of the first test question;
and generating at least one second test question based on the knowledge points, the entity information and the difficulty coefficient.
According to one aspect of the present disclosure, there is provided a test question generating apparatus, including:
a determining module configured to perform first natural language processing on the question information of a first test question and determine the knowledge points examined in the first test question;
an acquisition module configured to perform second natural language processing on the question stem information of the first test question to acquire entity information of the first test question, the entity information being the core words contained in the first test question;
and a generating module configured to generate at least one second test question based on the knowledge points and the entity information, wherein the second test question has the same question type as the first test question.
According to one of the aspects of the present disclosure, an electronic device is further provided, which includes a processor and a memory, where the memory is used to store computer-executable instructions, and the processor executes the computer-executable instructions to implement the test question generation method.
According to one aspect of the present disclosure, a computer-readable storage medium is provided, on which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the test question generation method is implemented.
The test question generation method and the device provided by various embodiments of the disclosure determine knowledge points examined by a first test question from question information of the first test question to be generated by using a natural language processing technology, extract entity information in the first test question from question stem information of the first test question, and then regenerate at least one second test question by using the knowledge points and the entity information, so that different question stem descriptions can be generated on the premise of ensuring that the examined knowledge points are not changed, and the understanding and mastering abilities of students on the knowledge points are fully examined; meanwhile, different test questions are described by different question stems, so that barriers can be effectively created for copying among students, cheating behaviors in the examination can be effectively prevented, and fairness and justness of examination results can be guaranteed to the maximum extent.
Drawings
FIG. 1 shows a flow chart of a test question generation method of an embodiment of the present disclosure;
FIG. 2 illustrates an example of a first test question used for generation in an embodiment of the present disclosure;
FIG. 3 illustrates an exemplary diagram of a second test question generated in an embodiment of the present disclosure;
FIG. 4 shows another flow chart of a test question generation method of an embodiment of the present disclosure;
FIG. 5 shows yet another flow chart of a test question generation method of an embodiment of the present disclosure;
FIG. 6 illustrates yet another flowchart of a test question generation method of an embodiment of the present disclosure;
FIG. 7 shows a schematic structural diagram of a test question generation apparatus according to an embodiment of the present disclosure.
Detailed Description
Various aspects and features of the disclosure are described herein with reference to the drawings.
It will be understood that various modifications may be made to the embodiments of the present application. Accordingly, the foregoing description should not be construed as limiting, but merely as exemplifications of embodiments. Other modifications will occur to those skilled in the art within the scope and spirit of the disclosure.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the disclosure and, together with a general description of the disclosure given above, and the detailed description of the embodiments given below, serve to explain the principles of the disclosure.
These and other characteristics of the present disclosure will become apparent from the following description of preferred forms of embodiment, given as non-limiting examples, with reference to the attached drawings.
It is also to be understood that although the present disclosure has been described with reference to certain specific examples, those skilled in the art will be able to ascertain many other equivalents to the present disclosure.
The above and other aspects, features and advantages of the present disclosure will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present disclosure are described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely exemplary of the disclosure, which may be embodied in various forms. Well-known and/or repeated functions and structures have not been described in detail so as not to obscure the present disclosure with unnecessary or redundant detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present disclosure in virtually any appropriately detailed structure.
FIG. 1 shows a flowchart of a test question generation method of an embodiment of the present disclosure. As shown in FIG. 1, the present disclosure provides a test question generation method, including:
S1: performing first natural language processing on the question information of the first test question to determine the knowledge points examined in the first test question.
The question information is all of the information contained in the first test question, and first test questions of different question types have different question information. For example, as shown in FIG. 2, the first test question is a single-choice question, and its question information includes question stem information and candidate answer information.
The first natural language processing is text analysis based on text recognition, text classification and the like, used to accurately determine the knowledge points examined by the first test question. For example, as shown in FIG. 2, the "relational database management system" is the knowledge point determined for the first test question. Here the knowledge point is explicitly stated in the question stem information of the first test question and can be determined directly through text recognition.
In other embodiments, the knowledge point is not explicitly stated in the question information of the first test question and must be obtained through reasoning and summarization; it is therefore determined through a combination of methods such as text recognition and text classification.
S2: performing second natural language processing on the question stem information of the first test question to obtain entity information of the first test question; and the entity information is a core word contained in the first test question.
Specifically, the core words are the words in the first test question that play a core role in forming the sentences of the question stem information, such as keywords, logical relation words, indicator words, and directional words. As shown in FIG. 2, the "relational database system", "management", and "relationship" in the stem information may be core words.
Since the entity information is the specific content included in the first test question, the second natural language processing is mainly text analysis based on text recognition to recognize and extract the core word from the first test question.
S3: generating at least one second test question based on the knowledge points and the entity information, wherein the second test question has the same question type as the first test question.
After the knowledge points and the entity information are obtained, they can be combined for further text analysis processing to generate at least one second test question of the same question type as the first test question. For example, the second test question shown in FIG. 3 may be generated based on the knowledge points and the entity information determined from the first test question shown in FIG. 2; both the first test question and the generated second test question are single-choice questions.
Here, generating at least one second test question means generating the question stem information of the second test question. For example, when the first test question is a fill-in-the-blank question or a judgment (true/false) question, the question information includes only question stem information, and what is generated for the second test question is its question stem information. When the first test question is a choice question, a question-and-answer question, an analysis question or the like, the question information includes not only the question stem information but also candidate answer information or question information; since the entity information is obtained from the question stem information of the first test question, it is likewise the question stem information of the second test question that is generated based on the knowledge points and the entity information.
The test question generating method provided by the embodiment of the disclosure determines a knowledge point examined by a first test question from question information of the first test question to be generated by using a natural language processing technology, extracts entity information in the first test question from question stem information of the first test question, and then regenerates at least one second test question by using the knowledge point and the entity information, so that different question stem descriptions can be generated on the premise of ensuring that the examined knowledge point is not changed, and understanding and mastering abilities of students on the knowledge point are fully examined; meanwhile, different test questions are described by different question stems, so that barriers can be effectively created for copying among students, cheating behaviors in the examination can be effectively prevented, and fairness and justness of examination results can be guaranteed to the maximum extent.
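The data flow of steps S1 to S3 can be sketched as follows. This is only an illustrative skeleton under stated assumptions, not the disclosed implementation: the `Question` class, the function `generate_second_questions`, and the three callables passed in are hypothetical names, and reusing the candidate answers of the first test question unchanged is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Question:
    """Hypothetical container for the question information of a test question."""
    question_type: str            # e.g. "single_choice", "fill_in_blank"
    stem: str                     # question stem information
    candidate_answers: List[str]  # empty for question types without options


def generate_second_questions(
    first: Question,
    find_knowledge_point: Callable[[Question], str],            # S1
    extract_core_words: Callable[[str], List[str]],             # S2
    rewrite_stems: Callable[[str, List[str], str], List[str]],  # S3
) -> List[Question]:
    """Chain S1, S2 and S3; each stage is supplied by the caller."""
    knowledge_point = find_knowledge_point(first)   # uses all question information
    core_words = extract_core_words(first.stem)     # uses only the stem
    new_stems = rewrite_stems(knowledge_point, core_words, first.stem)
    # Assumption: the question type is kept and the options are copied as-is.
    return [
        Question(first.question_type, stem, list(first.candidate_answers))
        for stem in new_stems
    ]
```

Passing the three stages in as callables keeps the sketch independent of any particular natural language processing technique, which the disclosure leaves open.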
In some embodiments, as shown in FIG. 4, performing first natural language processing on the question information of the first test question in step S1 to determine the knowledge points examined in the first test question includes:
S11: determining keywords of the question information;
S12: comparing the keywords with knowledge points of a preset category, and determining the semantic similarity between the keywords and the knowledge points of the preset category;
S13: when the semantic similarity between the keywords and the knowledge points of the preset category is greater than a preset semantic threshold, determining the knowledge point of the preset category with the maximum semantic similarity as the knowledge point examined in the first test question.
Since the knowledge points examined by the first test question are the key points of the whole test question, the keywords in the question information of the first test question need to be determined first. The keywords may be determined using text recognition methods. For example, the question information may be split into sentences or segments, and the keywords are then determined from the resulting parts. The number of keywords may be one or more. For example, in FIG. 2, the "relational database management system" is a keyword of the question information of the first test question.
After determining the keyword, the keyword may be compared with the knowledge points of the preset categories in the preset knowledge point library, and semantic similarity between the keyword and the knowledge points of the preset categories is calculated, where a certain number of accurate knowledge points of known categories are stored in the knowledge point library.
The knowledge points of the preset category may be knowledge points that share the semantic environment of the keyword. For example, the subject and textbook section to which the "relational database management system" belongs may be determined and used as the semantic environment of the keyword; the knowledge points in that subject and section are then extracted from the preset knowledge point library and compared with the keyword as the knowledge points of the preset category. When the semantic similarity between the keyword and a knowledge point of the preset category is greater than the preset semantic threshold, that knowledge point is determined to be a candidate knowledge point of the first test question. After the candidate knowledge points are determined, the candidate with the largest semantic similarity is determined as the knowledge point finally examined in the first test question.
When several knowledge points of the preset category satisfy the preset semantic threshold, some of them are non-key knowledge points; removing these ensures that the knowledge point finally selected is the one examined in the first test question. For example, when the examined knowledge point is "management of the relational database management system", the candidate knowledge points determined from the keyword "relational database management system" may be subdivided knowledge points such as "management of the relational database management system" and "classification of the relational database management system". In this case, "management of the relational database management system", which has the largest semantic similarity, is determined as the knowledge point examined in the first test question, so that a more accurate knowledge point is obtained.
The specific semantic similarity calculation method can be a statistical method based on word co-occurrence, and mainly carries out statistics on word frequency in a sentence, such as TF-IDF algorithm and the like; or a corpus training feature extraction method based on a neural network, and the specific calculation method is not particularly limited in the present disclosure.
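Steps S12 and S13 can be illustrated with a small word-frequency sketch. It is one possible instance of the TF-IDF option mentioned above, not the method fixed by the disclosure. The function names, the whitespace tokenization (a real system for Chinese text would use a word segmenter), the smoothed IDF formula, and the threshold values are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Optional


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each token by term frequency times a smoothed inverse document frequency."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in df}
    return [
        {tok: cnt / len(doc) * idf[tok] for tok, cnt in Counter(doc).items()}
        for doc in docs
    ]


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_knowledge_point(
    keyword: str, preset_points: List[str], threshold: float
) -> Optional[str]:
    """S12: score every preset knowledge point; S13: keep those above the
    threshold and return the one with the largest similarity, if any."""
    docs = [p.lower().split() for p in preset_points] + [keyword.lower().split()]
    vectors = tfidf_vectors(docs)
    keyword_vec = vectors[-1]
    scored = [(cosine(keyword_vec, vec), p) for vec, p in zip(vectors, preset_points)]
    candidates = [(score, p) for score, p in scored if score > threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]
```

Returning `None` when no preset knowledge point exceeds the threshold is a design choice of this sketch; the text does not say what happens in that case.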
In some embodiments, determining the keywords of the question information in step S11 includes:
S111: performing word segmentation processing on the question information of the first test question to determine candidate keywords;
S112: determining synonyms of the candidate keywords, and taking the candidate keywords and the synonyms as the keywords of the question information.
The word segmentation processing comprises vocabulary splitting, punctuation filtering and the like. A plurality of word segmentation segments are obtained through word segmentation processing, and the word segmentation segments can be single characters, single words or composite words formed by a plurality of words.
After word segmentation processing, judging the probability of each word segmentation segment appearing in a preset keyword library according to a word frequency statistical method and the like, and if the probability of the word segmentation segment appearing in the preset keyword library meets a preset threshold value, selecting the word segmentation segment as a candidate keyword; and if the probability of the word segmentation segment appearing in the preset keyword library does not meet the preset threshold value, determining that the word segmentation segment is not the candidate keyword. In a specific implementation, the candidate keywords may also be determined by judging whether the word segmentation segments exist in a preset keyword library.
After the candidate keywords are determined, synonyms of the candidate keywords are extracted from the preset keyword library, and the candidate keywords together with their synonyms are used as the keywords of the question information of the first test question, so that more comprehensive and accurate keywords are obtained. For example, in FIG. 2, RDBMS is the English abbreviation of relational database management system. When the keyword is determined to be the "relational database management system", RDBMS, or its full English name, which is synonymous with the "relational database management system", needs to be selected from the preset keyword library or an existing dictionary. If only the keyword appearing in the question information were used, the comparison between RDBMS and the preset knowledge points could be missed, which would reduce the accuracy of the knowledge point determination.
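Steps S111 and S112 might look like the following minimal sketch. It uses the simpler variant described above, in which a segment is a candidate keyword if it exists in the preset keyword library. The regular-expression tokenizer is a stand-in for a real word segmenter and only handles ASCII text; the function names, the n-gram search for composite words, and the sample library contents are assumptions for illustration.

```python
import re
from typing import Dict, List, Set


def segment(text: str) -> List[str]:
    """Lowercase, filter punctuation, and split into single-word segments."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidate_keywords(text: str, keyword_library: Set[str], max_len: int = 4) -> List[str]:
    """S111: collect single words and composite words (up to max_len words)
    that exist in the preset keyword library, longest phrases first."""
    tokens = segment(text)
    found: List[str] = []
    for n in range(max_len, 0, -1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in keyword_library and phrase not in found:
                found.append(phrase)
    return found


def expand_with_synonyms(candidates: List[str], synonyms: Dict[str, List[str]]) -> List[str]:
    """S112: the final keywords are the candidates plus their synonyms."""
    keywords = list(candidates)
    for candidate in candidates:
        for synonym in synonyms.get(candidate, []):
            if synonym not in keywords:
                keywords.append(synonym)
    return keywords
```

With a synonym table that maps "relational database management system" to "rdbms", the abbreviation is added to the keyword list even though it never appears in the question text, which is the situation the RDBMS example describes.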
In other embodiments, determining the keywords of the question information in step S11 includes:
S113: determining the question type of the first test question;
S114: determining the information categories contained in the question information of the first test question based on the question type of the first test question;
S115: determining the keywords of the question information according to the information categories.
The information categories contained in the question information differ between question types. For example, when the first test question is a choice question, the question information includes question stem information and candidate answer information; if the keywords were obtained only through steps S111 and S112 above, interference information that appears frequently in the candidate answers might also be taken as keywords, for example when the same word appears in several distractor options of a single-choice question. Conversely, although the keywords of the question information can be determined from the question stem information, determining them from the question stem information alone may omit important keywords that appear in the correct option, which reduces the accuracy of keyword extraction. Therefore, to ensure that the keywords of the question information are determined accurately, in steps S113 to S115 the information categories contained in the question information of the first test question are determined according to its question type, and the keywords of the question information are then determined.
In a specific embodiment, when the first test question is a choice question, the question information of the first test question includes question stem information and candidate answer information, and determining the keyword of the question information according to the information category includes:
and respectively determining keywords in the question stem information and the candidate answer information of the first test question, and determining the keywords of the question stem information matched with the keywords of the candidate answer information as the keywords of the first test question.
That is, when the first test question is a choice question, the information category of the question information includes question stem information and candidate answer information, the question stem information and the candidate answer information are subjected to word segmentation processing respectively to obtain keywords in the question stem information and the candidate answer information, then the first keyword in the determined question stem information is matched with the second keyword in the candidate answer information, if the first keyword and the second keyword both contain the same keyword, the keyword is indicated to be both the keyword in the question stem information and the keyword in the candidate answer, so that the influence of the keyword in the interference option on the keyword determination of the question information can be eliminated, and the accuracy of the keyword determination of the question information is ensured.
In another specific embodiment, when the first test question is a question-and-answer question or an analysis question, the question information of the first test question includes question stem information and question information, and determining the keywords of the question information according to the information category includes:
determining keywords in the question stem information and in the question information of the first test question respectively, and taking those keywords of the question stem information that match keywords of the question information as the keywords of the first test question.
The keywords of a question-and-answer question or an analysis question are determined in a similar way to those of a choice question, which is not repeated here.
Steps S111 to S112 and steps S113 to S115 may be executed independently or in combination. For example, steps S111 to S112 may be used within step S115 to identify the keywords of the question information.
In some embodiments, as shown in fig. 5, the generating at least one second test question based on the knowledge points and the entity information in step S3 includes:
S31: determining an associated word corresponding to the core word based on the text features of the core word and the syntactic features of the question stem information;
S32: generating, based on the knowledge points, the core words and the associated words, question stem information of at least one second test question that conforms to a grammar structure.
The text features of a core word include at least one of: the category of the core word, its part of speech, its position in the question stem information, and its dependency relationship with other core words. The category of a core word may be a proper noun such as a person name, an organization name or a place name, an action, a meaningful time, and the like. The part of speech of a core word may be noun, verb, adjective, adverb, and the like. The position of a core word in the question stem information may be the beginning, the end, and so on, and the word order of the core words can be determined from their positions. The dependency relationship with other core words includes a contextual relationship, a subordinate relationship, a modifying relationship, whether the core word must be used together with other core words, and the like. The syntactic features of the question stem information include the sentence structure and the logical-semantic relationships among the segmented words; the sentence structure may be, for example, the subject-predicate or subject-predicate-object structure of the question stem information as a whole or of individual phrases within it.
For example, in Fig. 2, the question stem information "the relationship managed by the relational database management system is" can be divided by word segmentation into "relational database management system", "manage", "relationship" and "is". The core words "relational database management system", "manage" and "relationship" act as subject, predicate and object respectively, and "manage" and "relationship" need to be preceded by a term.
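One way to hold these text features is a small record per core word. The sketch below is a minimal illustration, not the data model of the disclosure: the `CoreWord` class, its field names and the category label "term" are assumptions made for the example, while the segmented words come from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoreWord:
    """Text features of one core word (field names are illustrative)."""
    text: str
    category: str            # e.g. "proper noun", "action"
    part_of_speech: str      # e.g. "noun", "verb"
    position: int            # index among the segmented words of the stem
    depends_on: List[str] = field(default_factory=list)  # related core words


# Segmentation result taken from the example in the text.
segments = ["relational database management system", "manage", "relationship", "is"]

# Core words listed out of order on purpose; "term" is a made-up category label.
core_words = [
    CoreWord("relationship", "term", "noun", 2, depends_on=["manage"]),
    CoreWord("relational database management system", "proper noun", "noun", 0),
    CoreWord("manage", "action", "verb", 1,
             depends_on=["relational database management system"]),
]

# The word order of the core words follows from their positions in the stem.
ordered = sorted(core_words, key=lambda word: word.position)
roles = dict(zip(("subject", "predicate", "object"), (w.text for w in ordered)))
```

Sorting by `position` recovers the subject, predicate, object order described above, which is the information a later recombination step would need.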
By analyzing the text features of the core words and the syntactic features of the question stem information, the associated words corresponding to the core words can be obtained, which yields more words related to the core words for recombining sentences. Once the associated words are obtained, the core words and the associated words are screened and combined, so that for the same knowledge point a larger number of second test questions with more accurate semantic expression can be obtained.
It should be noted that conforming to a grammar structure, as mentioned above, means being a complete sentence that follows conventional, general grammar.
In some embodiments, in step S31, determining the associated word corresponding to the core word based on the text features of the core word and the syntactic features of the question stem information includes:
S311: determining the semantics that the core word expresses in the question stem information, based on the text features of the core word and the syntactic features of the question stem information;
S312: performing semantic expansion on the core word based on the semantics it expresses in the question stem information, to obtain a first associated word; and/or
S313: determining, based on the text features of the core word and the syntactic features of the question stem information, a context associated word corresponding to the core word in the question stem information, and performing semantic expansion on the context associated word to obtain a second associated word.
Steps S311 and S312 obtain the expanded semantic words of the core word. Step S311 accurately determines the semantics that the core word expresses in the question stem information, which prevents semantic misunderstanding, in particular when the same core word expresses different semantics in different contexts. For example, the "relationship" in "relational database management system" acts as a modifier of "database management system" and specifies the type of the database management system, whereas the "relationship" in "managed relationship" is a noun. The semantics expressed by each core word must therefore be identified accurately in order to obtain an accurate first associated word. Once the semantics are determined, the core words are expanded to obtain more words for generating second test questions, so that more second test questions can be generated.
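The dependence of the expansion on the role a core word plays can be illustrated with a lookup keyed by word and role. This is a hedged sketch only: the `EXPANSIONS` table and the `first_associated_words` function are hypothetical, the role tags are assumed to have been produced already by step S311, and only the words "RDBMS", "relational data" and "store" are taken from the example in this description.

```python
# Hypothetical expansion table. Keys are tuples of (core word, role in the stem);
# only "RDBMS", "relational data" and "store" come from the example in the text.
EXPANSIONS = {
    (("relational database management system", "proper noun"),):
        ["RDBMS", "relational data"],
    (("manage", "verb"), ("relationship", "noun")):
        ["store"],
}


def first_associated_words(tagged_core_words, table=EXPANSIONS):
    """S312 sketch: expand core words according to the role found in S311."""
    return list(table.get(tuple(tagged_core_words), []))


system_words = first_associated_words(
    [("relational database management system", "proper noun")]
)
object_words = first_associated_words(
    [("manage", "verb"), ("relationship", "noun")]
)
# "relationship" used as a modifier has no entry, so nothing is expanded.
modifier_words = first_associated_words([("relationship", "modifier")])
```

Keying on the role rather than on the bare word is the design point here: the same surface word yields different expansions, or none, depending on the sense identified for it.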
For example, in Fig. 2, the question stem information is "the relationship managed by the relational database management system is". The core words "relational database management system", "manage" and "relationship" are a noun, a verb and a noun respectively, and the syntactic feature of the question stem information is a standard subject-predicate-object structure. As noted above, "relational database management system" is a proper noun that can be used on its own, so it can be expanded directly by synonymy to its English abbreviation "RDBMS", or semantic reasoning can be applied to it to infer that what it manages is relational data, giving an expanded semantic word such as "relational data" shown in Fig. 3. As another example, "managed relationship" in the question stem information is a verb-object relationship, and from the positions of "manage" and "relationship", or from the dependency between them, it is determined that what should be filled in after "is" is a data type; an expanded semantic word such as "store" shown in Fig. 3 can therefore be obtained from the two core words "manage" and "relationship" together with the syntactic features of the question stem information. The screened knowledge points, core words and first associated words are then recombined according to common grammatical or idiomatic expressions to obtain a second test question.
For example, "store" is a verb, and the expressions that commonly follow it are "stored in" or "stored as", not "stored is". "Stored in" is generally followed by a noun denoting a place, whereas "stored as" can be followed by a data type, so "stored as" is chosen for the second test question. The core word "manage" used in the first test question can continue to be used, and the core word "relational database management system" can be used several times; when it is used again it can be shortened to "system". Because the knowledge point in this embodiment is the same as one of the core words, the two can be merged. Finally, the question stem information of the first test question, "the relationship managed by the relational database management system is", is regenerated as "in the RDBMS, the relationship managed by the system is stored as".
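The recombination in this example can be sketched with a fixed sentence template. Everything in the block is an assumption made for illustration: the `VERB_PHRASES` table, the function names and the template are hypothetical, and the place-taking phrase "stored in" is an assumed counterpart to "stored as"; the disclosure itself refers to a test question generation model rather than a template.

```python
# What kind of noun each verb phrase expects next. "stored as" taking a data
# type follows the example in the text; "stored in" is an assumed counterpart.
VERB_PHRASES = {"stored in": "place", "stored as": "data type"}


def pick_verb_phrase(expected_answer_type, phrases=VERB_PHRASES):
    """Choose the verb phrase whose following noun matches the blank to fill."""
    for phrase, takes in phrases.items():
        if takes == expected_answer_type:
            return phrase
    raise ValueError(f"no verb phrase takes a {expected_answer_type!r}")


def regenerate_stem(abbreviation, short_name, verb, obj, answer_type):
    """Recombine knowledge point, core words and associated words into a stem."""
    phrase = pick_verb_phrase(answer_type)
    return f"In the {abbreviation}, the {obj} {verb} by the {short_name} is {phrase}"


second_stem = regenerate_stem(
    abbreviation="RDBMS",      # synonym expansion of the proper noun
    short_name="system",       # shortened form used on second mention
    verb="managed",            # core word reused from the first test question
    obj="relationship",        # core word reused from the first test question
    answer_type="data type",   # what the blank after the stem expects
)
```

The call reproduces the regenerated stem from the example; passing `answer_type="place"` would select the other phrase instead.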
In some embodiments, when the question information further includes information of other information categories, step S31 further includes:
determining the associated word corresponding to the core word based on the text features of the core word, the syntactic features of the question stem information, and the information in the question information other than the question stem information.
Specifically, from the candidate answers of the single-choice question shown in Fig. 2, it can be determined that what is to be filled in after "is" in the question stem information "the relationship managed by the relational database management system is" is a file type. Therefore, once the expanded semantic word "store" is determined, it can be decided directly that "stored as" is used in the second test question, which improves the efficiency of test question generation.
Before step S311, the method further includes:
S3101: judging whether the semantics expressed by the core word are single;
S3102: if so, performing no semantic expansion on the core word; if not, extracting the core words whose expressed semantics are not single and executing step S311.
That is, when the semantics a core word expresses in the question stem information are single, the core word can be regarded as an independent word whose meaning is unique in any context, so no semantic expansion is needed, which improves the efficiency of text analysis.
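The pre-check in S3101 and S3102 amounts to partitioning the core words before any expansion is attempted. In the minimal sketch below, the `sense_inventory` dictionary is a hypothetical resource listing the senses known for each word; how such an inventory is built is not specified in this description.

```python
def split_by_ambiguity(core_words, sense_inventory):
    """Separate single-sense core words from those needing step S311."""
    single, ambiguous = [], []
    for word in core_words:
        senses = sense_inventory.get(word, [])
        # Words with at most one known sense skip semantic expansion.
        (single if len(senses) <= 1 else ambiguous).append(word)
    return single, ambiguous


# Hypothetical sense inventory for the running example.
sense_inventory = {
    "relational database management system": ["proper noun"],
    "manage": ["verb"],
    "relationship": ["modifier", "noun"],
}
core_words = ["relational database management system", "manage", "relationship"]
single, ambiguous = split_by_ambiguity(core_words, sense_inventory)
```

Only the words returned in `ambiguous` would be passed on to step S311, which is where the saving in analysis effort comes from.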
After the extended semantic words of the core words are obtained, the knowledge points, the core words and the extended semantic words can be input into a test question generation model containing a preset grammar structure to generate a second test question.
It should be noted that, in the test question generating process, the core words and the expanded semantic words thereof may be replaced with each other, and may be used repeatedly in the process of arranging and combining the sentences according to actual needs, and a specific use manner is not specifically limited in this embodiment.
In specific implementation, different core words and the expanded semantic words are used as far as possible in different test questions (including the first test question and the generated second test question), so that the expression of the test questions is richer, and cheating is effectively prevented.
Step S313 determines the context associated words of the core word and expands them. Core words are usually important words such as nouns and verbs, so a second test question generated only from the knowledge points, the core words and the expanded semantic words of the core words may read stiffly or not smoothly enough. To make the generated second test question more natural and fluent, step S313 obtains the context associated words of the core word in the question stem information. A context associated word may be another core word or a non-core word such as a pronoun or a conjunction.
After the context associated words are determined, they are classified into core words and non-core words. Core words among the context associated words can be expanded as described in steps S311 and S312, which is not repeated here. Non-core words can likewise be expanded to obtain their expanded semantic words. The context associated words and their expanded semantic words are then used as second associated words and combined with the knowledge points, the entity information and/or the first associated words to obtain second test questions that are semantically accurate, natural and fluent. Obtaining the context associated words and their expanded semantic words also allows more second test questions to be generated, with richer semantic expression.
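The classification and expansion of context associated words can be sketched as follows. The function name, the two expander callables and the sample expansions ("through" for "by") are hypothetical and serve only to show the routing of core and non-core words; the actual expansion method is not detailed in this description.

```python
def second_associated_words(context_words, core_words, expand_core, expand_non_core):
    """Collect context words plus their expansions, without duplicates."""
    result = []
    for word in context_words:
        # Core words are expanded as in S311/S312, non-core words separately.
        expander = expand_core if word in core_words else expand_non_core
        for candidate in [word, *expander(word)]:
            if candidate not in result:
                result.append(candidate)
    return result


# Hypothetical expansion lookups for the running example.
core_expansions = {"manage": ["store"]}
non_core_expansions = {"by": ["through"]}

second_words = second_associated_words(
    context_words=["manage", "by", "is"],
    core_words={"manage", "relationship"},
    expand_core=lambda w: core_expansions.get(w, []),
    expand_non_core=lambda w: non_core_expansions.get(w, []),
)
```

Passing the expanders in as callables keeps the sketch independent of any particular thesaurus or model.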
In some embodiments, as shown in fig. 6, the method further comprises:
S4: determining a difficulty coefficient of the first test question;
S33: generating at least one second test question based on the knowledge points, the entity information and the difficulty coefficient.
Students differ in how well they understand different question stem information. When the question stem information is expressed fluently, students can easily grasp its meaning and answer quickly. When it is expressed awkwardly, students must analyze it before they can understand it, which lengthens the answering time and easily leads to misreading of the stem; it then becomes difficult to examine accurately how well students understand and master the knowledge point, and the fairness of the examination suffers. Therefore, by determining the difficulty coefficient of the first test question and then generating, based on the knowledge points and the entity information, a second test question whose difficulty coefficient is the same as or similar to that of the first test question, students' understanding and mastery of the knowledge point can be examined accurately and the fairness of the examination can be ensured.
In a specific implementation, a difficulty coefficient model can be obtained by training in advance on a sufficient number of first test questions, second test questions and the answer results for them. After the difficulty coefficient of the first test question is obtained from the difficulty coefficient model, the knowledge points, the entity information and the difficulty coefficient are input into a preset text generation model that takes the difficulty coefficient into account, and at least one second test question with the same difficulty coefficient as the first test question is obtained.
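The idea of keeping only second test questions of comparable difficulty can be illustrated without a trained model. The sketch below swaps in a much simpler stand-in, the fraction of incorrect responses, for the pre-trained difficulty coefficient model described here; the function names, the tolerance value and the candidate difficulties are all assumptions made for the example.

```python
def difficulty_coefficient(responses):
    """Fraction of incorrect responses: 0.0 means everyone answered correctly."""
    if not responses:
        raise ValueError("at least one response is required")
    return 1 - sum(responses) / len(responses)


def select_matching(first_difficulty, candidates, tolerance=0.1):
    """Keep candidate stems whose estimated difficulty is close to the first."""
    return [
        stem
        for stem, difficulty in candidates
        if abs(difficulty - first_difficulty) <= tolerance
    ]


# Hypothetical answer record for the first test question (True = correct).
first_difficulty = difficulty_coefficient([True, True, False, True])

# Hypothetical generated stems with made-up difficulty estimates.
candidates = [("stem A", 0.3), ("stem B", 0.6), ("stem C", 0.2)]
selected = select_matching(first_difficulty, candidates)
```

In the disclosure the difficulty coefficient is an input to the text generation model itself; filtering after generation, as done here, is only one simple way to show the same constraint.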
The test question generation method above mainly describes the steps for generating the question stem information of the second test question. When the question information of the first test question includes information of other categories besides the question stem information, the corresponding information of the second test question can be generated at the same time. For example, for a choice question, candidate answers can be generated together with the question stem information; the specific method for generating candidate answers is similar to that for generating question stem information and is not repeated here.
Fig. 7 shows a schematic structural diagram of a test question generation apparatus according to an embodiment of the present disclosure. As shown in fig. 7, an embodiment of the present disclosure provides a test question generating apparatus, including:
the determination module 10 is configured to perform first natural language processing on the question information of a first test question, and determine knowledge points examined in the first test question;
the obtaining module 20 is configured to perform second natural language processing on the question stem information of the first test question to obtain entity information of the first test question; the entity information is a core word contained in the first test question;
a generating module 30 configured to generate at least one second test question based on the knowledge points and the entity information, wherein the second test question has the same question type as the first test question.
The test question generating device provided by the embodiment of the present disclosure uses natural language processing technology to determine, from the question information of a first test question, the knowledge points examined by the first test question, extracts the entity information of the first test question from its question stem information, and then uses the knowledge points and the entity information to generate at least one second test question. Different question stem descriptions can thus be generated while the examined knowledge points remain unchanged, so that students' understanding and mastery of the knowledge points are fully examined. At the same time, because different test questions are described by different question stems, copying among students is made harder, cheating in examinations is effectively prevented, and the fairness and impartiality of examination results are ensured to the greatest extent.
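The division of the device into three modules can be sketched as a class that wires three callables together. This is a structural illustration only: the class name, the dictionary keys and the stub functions are hypothetical, and the stubs return fixed values instead of performing any natural language processing.

```python
class QuestionGenerationDevice:
    """Wires together the determination, obtaining and generating modules."""

    def __init__(self, determine, obtain, generate):
        self.determine = determine  # question information -> knowledge point
        self.obtain = obtain        # question stem -> core words
        self.generate = generate    # (knowledge point, core words) -> stems

    def run(self, first_question):
        knowledge_point = self.determine(first_question["question_info"])
        core_words = self.obtain(first_question["stem"])
        stems = self.generate(knowledge_point, core_words)
        # Every second test question keeps the question type of the first.
        return [{"type": first_question["type"], "stem": s} for s in stems]


# Stub modules standing in for the real processing steps.
CORE_VOCABULARY = ["relational database management system", "managed", "relationship"]
device = QuestionGenerationDevice(
    determine=lambda info: "relational database management system",
    obtain=lambda stem: [w for w in CORE_VOCABULARY if w in stem],
    generate=lambda point, words: [
        "In the RDBMS, the relationship managed by the system is stored as"
    ],
)

first_question = {
    "type": "single choice",
    "stem": "the relationship managed by the relational database management system is",
    "question_info": "the relationship managed by the relational database management system is",
}
second_questions = device.run(first_question)
```

Injecting the modules as callables mirrors the modular structure of the device and lets each step be replaced independently.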
The test question generation device provided in the embodiment of the present disclosure corresponds to the test question generation method in the above embodiment, and based on the above test question generation method, a person skilled in the art can understand the specific implementation manner and various variations of the test question generation device in the embodiment of the present disclosure, and any optional items in the embodiment of the test question generation method are also applicable to the test question generation device, and are not described herein again.
An embodiment of the present disclosure further provides an electronic device including a processor and a memory, wherein the memory is used for storing computer-executable instructions, and the processor executes the computer-executable instructions to implement the test question generation method described above.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The memory may include random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory.
The embodiment of the present disclosure also provides a computer-readable storage medium, on which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the method for generating test questions is implemented.
The above embodiments are merely exemplary embodiments of the present disclosure, which is not intended to limit the present disclosure, and the scope of the present disclosure is defined by the claims. Various modifications and equivalents of the disclosure may occur to those skilled in the art within the spirit and scope of the disclosure, and such modifications and equivalents are considered to be within the scope of the disclosure.

Claims (10)

1. A test question generation method, comprising:
performing first natural language processing on the question information of a first test question to determine knowledge points examined in the first test question;
performing second natural language processing on the question stem information of the first test question to obtain entity information of the first test question; the entity information is a core word contained in the first test question;
and generating at least one second test question based on the knowledge points and the entity information, wherein the second test question has the same question type as the first test question.
2. The method of claim 1, wherein performing first natural language processing on the question information of a first test question to determine knowledge points examined in the first test question comprises:
determining keywords of the question information;
comparing the keyword with a preset category of knowledge points, and determining semantic similarity between the keyword and the preset category of knowledge points;
and when the semantic similarity between the keyword and the knowledge points of the preset category is greater than a preset semantic threshold, determining the knowledge points of the preset category with the maximum semantic similarity as the knowledge points examined in the first test question.
3. The method of claim 2, wherein determining keywords of the question information comprises:
performing word segmentation processing on the question information of the first test question to determine candidate keywords;
and determining synonyms of the candidate keywords, and taking the candidate keywords and the synonyms as the keywords of the question information.
4. The method of claim 2, wherein determining keywords of the question information comprises:
determining the question type of the first test question;
determining the information category contained in the question information of the first test question based on the question type of the first test question;
and determining the keywords of the question information according to the information category.
5. The method of claim 4, wherein when the first test question is a choice question, the question information of the first test question comprises question stem information and candidate answer information, and determining the keyword of the question information according to the information category comprises:
determining keywords in the question stem information and the candidate answer information of the first test question respectively, and determining the keywords of the question stem information matched with the keywords of the candidate answer information as the keywords of the first test question; or,
when the first test question is a question and answer question or an analysis question, the question information of the first test question comprises question stem information and question information, and the keyword of the question information is determined according to the information category, which comprises the following steps:
and respectively determining the question stem information of the first test question and the keywords in the question information, and determining the keywords of the question stem information matched with the keywords of the question information as the keywords of the first test question.
6. The method of claim 1, wherein generating at least one second test question based on the knowledge points and the entity information comprises:
determining an associated word corresponding to the core word based on the text features of the core word and the syntactic features of the stem information;
and generating, based on the knowledge points, the core words and the associated words, question stem information of at least one second test question that conforms to a grammar structure.
7. The method of claim 6, wherein determining the associated word corresponding to the core word based on the text features of the core word and the syntactic features of the stem information comprises:
determining the semantics of the core words expressed in the stem information based on the text features of the core words and the syntactic features of the stem information;
performing semantic expansion on the core word based on the semantic meaning of the core word expressed in the stem information to obtain a first associated word; and/or
determining a context associated word corresponding to the core word in the stem information based on the text features of the core word and the syntactic features of the stem information, and performing semantic expansion on the context associated word to obtain a second associated word.
8. The method of claim 6, wherein the text features of the core word comprise at least one of the category of the core word, its part of speech, its position in the stem information, and its dependency relationship with other core words.
9. The method of claim 1, wherein the method further comprises:
determining a difficulty coefficient of the first test question;
and generating at least one second test question based on the knowledge points, the entity information and the difficulty coefficient.
10. A test question generating apparatus comprising:
a determination module configured to perform first natural language processing on the question information of a first test question and determine the knowledge points examined in the first test question;
an obtaining module configured to perform second natural language processing on the question stem information of the first test question to obtain entity information of the first test question; the entity information is a core word contained in the first test question;
and a generating module configured to generate at least one second test question based on the knowledge points and the entity information, wherein the second test question has the same question type as the first test question.
CN202110185141.5A 2021-02-10 2021-02-10 Test question generation method and device Pending CN112800182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110185141.5A CN112800182A (en) 2021-02-10 2021-02-10 Test question generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110185141.5A CN112800182A (en) 2021-02-10 2021-02-10 Test question generation method and device

Publications (1)

Publication Number Publication Date
CN112800182A true CN112800182A (en) 2021-05-14

Family

ID=75815087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110185141.5A Pending CN112800182A (en) 2021-02-10 2021-02-10 Test question generation method and device

Country Status (1)

Country Link
CN (1) CN112800182A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201349159A (en) * 2012-05-31 2013-12-01 Han Lin Publishing Co Ltd Method for generating learning test questions and system thereof
CN106409041A (en) * 2016-11-22 2017-02-15 深圳市鹰硕技术有限公司 Generation method and system for gap filling test question and grading method and system for gap filling test paper
CN107273490A (en) * 2017-06-14 2017-10-20 北京工业大学 A kind of combination mistake topic recommendation method of knowledge based collection of illustrative plates
CN108334493A (en) * 2018-01-07 2018-07-27 深圳前海易维教育科技有限公司 A kind of topic knowledge point extraction method based on neural network
CN109359290A (en) * 2018-08-20 2019-02-19 国政通科技有限公司 The knowledge point of examination question text determines method, electronic equipment and storage medium
CN111311459A (en) * 2020-03-16 2020-06-19 宋继华 Interactive question setting method and system for international Chinese teaching
CN111815274A (en) * 2020-07-03 2020-10-23 北京字节跳动网络技术有限公司 Information processing method and device and electronic equipment
CN112069295A (en) * 2020-09-18 2020-12-11 科大讯飞股份有限公司 Similar question recommendation method and device, electronic equipment and storage medium
CN112101017A (en) * 2020-04-02 2020-12-18 上海迷因网络科技有限公司 Method for generating questions for rapid expressive force test
CN112164261A (en) * 2020-09-24 2021-01-01 浙江太学科技集团有限公司 Intelligent assessment method
CN112287659A (en) * 2019-07-15 2021-01-29 北京字节跳动网络技术有限公司 Information generation method and device, electronic equipment and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505195A (en) * 2021-06-24 2021-10-15 作业帮教育科技(北京)有限公司 Knowledge base, construction method and retrieval method thereof, and question setting method and system based on knowledge base
CN114201613A (en) * 2021-11-30 2022-03-18 北京百度网讯科技有限公司 Test question generation method, test question generation device, electronic device, and storage medium
CN114201613B (en) * 2021-11-30 2022-10-21 北京百度网讯科技有限公司 Test question generation method, test question generation device, electronic device, and storage medium

Similar Documents

Publication Publication Date Title
Martinc et al. Supervised and unsupervised neural approaches to text readability
Hovy et al. Question Answering in Webclopedia.
US9575955B2 (en) Method of detecting grammatical error, error detecting apparatus for the method, and computer-readable recording medium storing the method
US10339168B2 (en) System and method for generating full questions from natural language queries
US9754504B2 (en) Generating multiple choice questions and answers based on document text
US8473278B2 (en) Systems and methods for identifying collocation errors in text
US20190278812A1 (en) Model generation device, text search device, model generation method, text search method, data structure, and program
US20080126319A1 (en) Automated short free-text scoring method and system
CN109271524B (en) Entity linking method in knowledge base question-answering system
US10339167B2 (en) System and method for generating full questions from natural language queries
JP2011118689A (en) Retrieval method and system
CN112800182A (en) Test question generation method and device
Burman et al. USFD at KBP 2011: Entity linking, slot filling and temporal bounding
Malhar et al. Deep learning based Answering Questions using T5 and Structured Question Generation System’
Grivaz Automatic extraction of causal knowledge from natural language texts
CN113987141A (en) Question-answering system answer reliability instant checking method based on recursive query
Chughtai et al. A lecture centric automated distractor generation for post-graduate software engineering courses
CN108573025B (en) Method and device for extracting sentence classification characteristics based on mixed template
Aldabe et al. A study on the automatic selection of candidate sentences distractors
CN111930911A (en) Rapid field question-answering method and device
Athanaselis et al. A corpus based technique for repairing ill-formed sentences with word order errors using co-occurrences of n-grams
Thenmozhi et al. An open information extraction for question answering system
Llorens et al. Data-driven approach based on semantic roles for recognizing temporal expressions and events in Chinese
Yang et al. Informal features in English academic writing: Mismatch between prescriptive advice and actual practice
US11854432B1 (en) Developing an e-rater advisory to detect babel-generated essays

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination