CN112487183A - Labeled test question knowledge point classification method and system

Info

Publication number: CN112487183A
Application number: CN202011244070.3A
Authority: CN
Inventors: 华敏
Original assignee: Jiangsu Leyixue Education Technology Co ltd
Current assignee: Jiangsu Leyixue Education Technology Co ltd
Priority date: 2020-11-10
Filing date: 2020-11-10
Publication date: 2021-03-12

Abstract

The invention discloses a labeled test question knowledge point classification method and system. The classification method comprises the following steps: S1, converting the information of a target test question to be classified into text in a preset standard format; S2, extracting keywords from the converted text; S3, determining, according to the keywords, the knowledge nodes corresponding to the target test question in a pre-constructed multi-dimensional model of knowledge points; S4, establishing labels for the target test question, wherein the labels comprise one or more keyword labels and one or more knowledge node labels; S5, comparing the keyword labels of the target test question with the test questions in the database of each corresponding knowledge node, and calculating the association degree between the target test question and each corresponding knowledge node; and S6, if the association degree between the target test question and exactly one knowledge node exceeds a preset association degree threshold, assigning the target test question to the test question database of that knowledge node. The invention labels test questions, helps to screen and filter massive learning content, and provides a basis for adaptive teaching.

Description

Labeled test question knowledge point classification method and system

Technical Field

The invention relates to the field of artificial intelligence, in particular to a labeled test question knowledge point classification method and system.

Background

With the popularization of online education, more and more people choose to learn online. However, as the demand for learning resources grows, the resources available on the network expand greatly, which makes it challenging for learners to find suitable ones. How to find appropriate resources among massive learning resources according to learners' needs, and thereby effectively improve both the utilization of learning resources and the learning efficiency of learners, is a problem of continuing interest. Learning materials, whether courseware or test questions, each have a certain focus, certain knowledge points and a suitable audience. Classifying them reasonably and marking them simply is what is meant by labeling. Labels are closely tied to resources; on this basis a label-based recommendation mechanism gradually forms, combining labels with learning content.

In primary school teaching in particular, labels on courseware and test questions help to mine the actual content of a database, so that a user can grasp the essential content without selecting blindly. Labels help to screen and filter massive content and are also one of the core elements of adaptive teaching.

Disclosure of Invention

The invention provides a labeled test question knowledge point classification method and system, aiming to solve the problem that the prior art cannot classify and label the knowledge of language test questions. The technical scheme is as follows:

In one aspect, a labeled test question knowledge point classification method is provided, comprising the following steps:

S1, converting the information of a target test question to be classified into text in a preset standard format;

S2, extracting keywords from the converted text;

S3, determining, according to the keywords, the knowledge nodes corresponding to the target test question in a pre-constructed multi-dimensional model of knowledge points, wherein the multi-dimensional model of knowledge points is constructed as follows: decomposing knowledge points into minimum particles according to the teaching material catalog and/or domain knowledge; taking a minimum particle, or a connection of several particles, as a knowledge node, and/or connecting a knowledge node with a minimum particle or with other knowledge nodes to form a new knowledge node; wherein each knowledge node comprises a database for storing its corresponding test questions, and the knowledge nodes together form the multi-dimensional model of knowledge points;

S4, establishing labels for the target test question, wherein the labels comprise one or more keyword labels and one or more knowledge node labels;

S5, comparing the keyword labels of the target test question with the test questions in the database of each corresponding knowledge node, and calculating the association degree between the target test question and each corresponding knowledge node;

and S6, if the association degree between the target test question and exactly one knowledge node exceeds a preset association degree threshold, assigning the target test question to the test question database of that knowledge node.

Further, the association degree between the target test question and the corresponding knowledge node in step S5 is calculated through the following steps:

calculating the association degree between the ith keyword label and a knowledge node according to the formula KiA1 × QiA1, wherein KiA1 is the frequency with which the ith keyword occurs in the knowledge node, and QiA1 is the frequency with which test questions containing the ith keyword occur in the knowledge node;

and averaging the association degrees between each keyword label of the target test question and the knowledge node to obtain the association degree between the target test question and the knowledge node.

Further, the frequency with which the ith keyword occurs in the knowledge node is calculated as: (number of test questions in the knowledge node that contain the keyword) / (number of all test questions in the knowledge node);

the frequency with which test questions containing the ith keyword occur in the knowledge node is calculated as: (number of test questions in the knowledge node that contain the keyword) / (number of all test questions that contain the keyword).
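The two frequencies and their combination can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes the association degree of a keyword is the product of the two frequencies, represents each test question as a set of keywords, and uses hypothetical function names and sample data.

```python
from statistics import mean

def keyword_frequency_in_node(keyword, node_questions):
    """K: share of the node's test questions that contain the keyword."""
    if not node_questions:
        return 0.0
    hits = sum(1 for q in node_questions if keyword in q)
    return hits / len(node_questions)

def node_share_of_keyword(keyword, node_questions, all_questions):
    """Q: share of all keyword-bearing test questions that sit in this node."""
    total = sum(1 for q in all_questions if keyword in q)
    if total == 0:
        return 0.0
    hits = sum(1 for q in node_questions if keyword in q)
    return hits / total

def association_degree(keywords, node_questions, all_questions):
    """Average of K * Q over the keyword labels of the target test question."""
    if not keywords:
        return 0.0
    return mean(
        keyword_frequency_in_node(k, node_questions)
        * node_share_of_keyword(k, node_questions, all_questions)
        for k in keywords
    )

# Hypothetical data: each test question is reduced to its set of keywords.
node_a = [{"triangle", "area"}, {"triangle"}, {"circle"}, {"triangle", "angle"}]
elsewhere = [{"triangle"}, {"area"}]
everything = node_a + elsewhere

degree = association_degree(["triangle", "area"], node_a, everything)
```

With this sample data, "triangle" scores 0.75 × 0.75 and "area" scores 0.25 × 0.5, so the averaged association degree is 0.34375.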

Further, the corresponding knowledge node in step S3 is determined by:

searching for the first keyword of the target test question in the multi-dimensional model of knowledge points, and determining one or more knowledge nodes as possible positions of the first keyword;

determining respective possible positions of the remaining keywords in sequence;

and comparing the possible positions of all the keywords, and selecting the knowledge node with the highest repetition degree as the knowledge node corresponding to the target test question.
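The selection of the node with the highest repetition degree can be sketched as a simple vote count. This is only an illustration under assumptions: each knowledge node is represented by a hypothetical set of vocabulary terms, and ties are returned as a list rather than resolved.

```python
from collections import Counter

def select_knowledge_nodes(keywords, node_vocabulary):
    """Return the node(s) that appear most often among the keywords' possible positions.

    node_vocabulary maps a node name to the set of keywords found under it
    (a hypothetical stand-in for searching the multi-dimensional model).
    """
    counts = Counter()
    for keyword in keywords:
        for node, vocabulary in node_vocabulary.items():
            if keyword in vocabulary:      # this node is a possible position
                counts[node] += 1
    if not counts:
        return []
    best = max(counts.values())
    return sorted(node for node, c in counts.items() if c == best)

# Hypothetical model fragment.
vocabulary = {
    "isosceles triangle area": {"isosceles", "triangle", "area"},
    "circle area": {"circle", "area"},
    "triangle perimeter": {"triangle", "perimeter"},
}
selected = select_knowledge_nodes(["isosceles", "triangle", "area"], vocabulary)
```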

Further, if the association degree between the target test question and several knowledge nodes, or no knowledge node, exceeds the preset association degree threshold, the target test question is placed in a database of undetermined knowledge nodes.

Further, the occurrence frequency, proportion, degree of correlation and degree of association of each label of the knowledge nodes are updated periodically.

Further, after step S6, the method comprises: providing a reference basis for subsequent personalized recommendation, screening or filtering of teaching resources.

Further, in step S1, the question, answer, question solving step and auxiliary method of the target test question to be classified are converted into a text in a preset standard format.

In another aspect, the present invention provides a labeled test question knowledge point classification system, which includes the following modules:

the format conversion module is used for converting the information of the target test questions to be classified into a preset text with a standard format;

the keyword extraction module is used for extracting keywords from the converted text;

a knowledge node determining module, configured to determine, according to the keyword, a knowledge node corresponding to the target test question in a pre-constructed multi-dimensional model of a knowledge point, where a construction method of the multi-dimensional model of the knowledge point includes: decomposing knowledge points into minimum particles according to teaching material catalogues and/or domain knowledge, connecting the minimum particles or a plurality of particles as knowledge nodes, and/or connecting the knowledge nodes and the minimum particles or the knowledge nodes to form new knowledge nodes, wherein each knowledge node comprises a database for storing corresponding test questions, and the sum of the knowledge nodes forms a multi-dimensional model of the knowledge points;

the label establishing module is used for establishing labels of the target test questions, wherein the labels comprise one or more keyword labels and one or more knowledge node labels;

the comparison module is used for comparing the keyword labels of the target test questions with the test questions in the database of the corresponding knowledge nodes and calculating the association degree of the target test questions and the corresponding knowledge nodes;

and the classification module is used for attributing the target test question to the database of the test question corresponding to the knowledge node on the premise that the association degree of the target test question with only one knowledge node exceeds a preset association degree threshold value.

Further, the comparison module comprises a correlation calculation unit for:

The technical scheme provided by the invention has the following beneficial effects:

a) based on the multi-dimensional knowledge model, knowledge nodes corresponding to the target test question can be matched;

b) in a plurality of possible knowledge nodes, the relevance between the target test question and all the possible knowledge nodes is compared through a comparison model, so that a classification result is obtained;

c) the database regularly updates information such as the occurrence frequency of the labels of the knowledge nodes, and the like, and follows the change of big data test questions to ensure the accuracy of the labels.

Drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a data flow diagram of a labeled test question knowledge point classification method according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or device.

The method collects the knowledge points and vocabularies of different disciplines to form a multi-dimensional model of the knowledge points of each discipline. The target test question to be labeled (that is, to be classified) is written as text in a standard format. Keywords are extracted from the text, and the subject vocabulary and knowledge points to which the test question belongs are determined from the association between the keywords and the multi-dimensional model. Labels corresponding to the question bank are then established; labels awaiting verification are checked manually and their information is modified where necessary. In addition, the system periodically verifies the labels to ensure their accuracy: it recalculates the association degree between each test question and the knowledge node where it is located (the association degree can change as the number of test questions in the knowledge node grows), and labels whose association degree falls below a standard value are marked as to be verified and wait for manual verification. The labels generated by the method provide a basis for subsequent personalized recommendation and a reference for screening or filtering teaching resources.

In an embodiment of the present invention, a labeled test question knowledge point classification method is provided, and referring to fig. 1, the classification method includes the following steps:

S1, converting the information of the target test question to be classified into text in a preset standard format, wherein the information of the target test question specifically comprises the question, the answer, the solution steps and the auxiliary method of the target test question.

For example, many standard-format text templates are available, such as "___, ___ intersect at X", "___ intersect at ___, ___ at X, Y", "___, ___ intersect at X", "___, ___, intersect at X", "___, ___, ___ intersect at X", "cross point a makes straight line cross ___ at X, cross point ___ at Y", "cross point a, makes straight line cross ___ with you B at X, cross point ___ at Y"; the template with the highest similarity to the target test question is selected.
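Selecting the most similar template can be sketched with the standard-library `difflib` module. The source does not say which similarity measure is used, so `SequenceMatcher.ratio()` is an assumption here, and the two templates below are hypothetical simplifications of the examples above.

```python
from difflib import SequenceMatcher

def best_template(question_text, templates):
    """Return the template whose character-level similarity to the text is highest."""
    return max(
        templates,
        key=lambda t: SequenceMatcher(None, question_text, t).ratio(),
    )

# Hypothetical templates in the spirit of the examples in the description.
templates = [
    "___ and ___ intersect at X",
    "through point A draw a straight line",
]
chosen = best_template("AB and CD intersect at X", templates)
```

A production system would likely normalise symbols and blanks before comparing; that step is omitted here.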

S2, extracting keywords from the converted text;

S3, determining, according to the keywords, the knowledge nodes corresponding to the target test question in the pre-constructed multi-dimensional model of knowledge points, wherein the multi-dimensional model of knowledge points is constructed as follows: decomposing the knowledge points into minimum particles according to a teaching material catalog and/or domain knowledge; taking a minimum particle, or a connection of several particles, as a knowledge node, and/or connecting a knowledge node with a minimum particle or with several knowledge nodes to form a new knowledge node; wherein each knowledge node comprises a database for storing its corresponding test questions, and the knowledge nodes together form the multi-dimensional model of knowledge points.

Specifically, the multi-dimensional knowledge model decomposes primary-school knowledge points into minimum particles. The first dimension is conceptual. For example, primary-school mathematics can be divided into numbers and calculation, quantity and measurement, space and geometry, ratio and proportion, statistics and algebra. Space and geometry can be divided into space and geometry; geometry can be divided into solid figures and plane figures; plane figures can be divided into line segments, straight lines, circles, triangles, quadrilaterals, polygons and the like; triangles can be decomposed into ordinary triangles and isosceles triangles; and isosceles triangles can be divided into ordinary isosceles triangles and equilateral triangles. The second dimension classifies and decomposes the attributes common to all solid and plane figures, such as volume, area, perimeter and angle; angles can be divided into acute, right, obtuse, straight and full angles. A minimum particle can serve as a knowledge node; connecting particles forms a new knowledge node, and connecting a knowledge node with a particle forms yet another knowledge node. For example, the ordinary isosceles triangle under triangles is connected with the right angle under angles to form the isosceles right triangle, and the isosceles right triangle can in turn be connected with area to form the knowledge node for the area of the isosceles right triangle. Knowledge nodes close to the primary-school syllabus are selected as important nodes, thereby realizing node classification.

The third dimension builds on the first two. Once the first two dimensions are fully established, a third dimension, application and method, is established for each knowledge node. Taking the area of an isosceles triangle as an example, application and method can be divided into a forward case and a reverse case: in the forward case, the basic quantities of the isosceles triangle are worked out from the conditions given in the question and the area is calculated with the area formula; in the reverse case, the area is obtained through equivalent substitution or a reverse method.
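The way particles and nodes are connected into new knowledge nodes can be sketched as a small data structure. This is an illustrative assumption about representation only: the class, its fields and the helper names are hypothetical, and a plain list stands in for each node's test question database.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeNode:
    name: str
    parts: tuple = ()                                # particles or nodes this node connects
    questions: list = field(default_factory=list)    # stands in for the node's database

def connect(name, *parts):
    """Form a new knowledge node by connecting particles and/or existing nodes."""
    return KnowledgeNode(name, parts)

def minimum_particles(node):
    """Flatten a node back into the names of the minimum particles it is built from."""
    if not node.parts:
        return {node.name}
    result = set()
    for part in node.parts:
        result |= minimum_particles(part)
    return result

# Minimum particles taken from the example in the description.
isosceles = KnowledgeNode("ordinary isosceles triangle")
right_angle = KnowledgeNode("right angle")
area = KnowledgeNode("area")

# Particle + particle, then node + particle.
isosceles_right = connect("isosceles right triangle", isosceles, right_angle)
isosceles_right_area = connect("area of isosceles right triangle", isosceles_right, area)
```

Keeping the constituent parts on each node makes it possible to search the model both from general concepts downward and from particles upward.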

Specifically, the corresponding knowledge node is determined by:

First, the first keyword of the target test question is searched for in the multi-dimensional knowledge model from top to bottom and then from bottom to top; the one or more knowledge nodes found in both searches are possible positions of the first keyword. The possible positions of the remaining keywords are then determined in turn. Finally, the possible positions of all keywords are compared, and all repeated knowledge nodes are possible knowledge nodes of the target test question.

Specifically, the association degree between the target test question and the corresponding knowledge node is calculated through the following steps:

calculating the association degree between the ith keyword label and a knowledge node according to the formula KiA1 × QiA1, wherein KiA1 is the frequency with which the ith keyword occurs in the knowledge node (the number of test questions in the knowledge node that contain the ith keyword / the number of all test questions in the knowledge node), and QiA1 is the frequency with which test questions containing the ith keyword occur in the knowledge node (the number of test questions in the knowledge node that contain the ith keyword / the number of all test questions that contain the ith keyword); and averaging the association degrees between each keyword label of the target test question and the knowledge node to obtain the association degree between the target test question and the knowledge node.

Specifically, each knowledge node includes a database in which a large number of test questions are stored. First, the frequency K1A1 with which the first keyword of the target test question occurs in the first possible knowledge node (test questions in the knowledge node that contain the keyword / all test questions in the knowledge node) and the frequency Q1A1 with which test questions containing the keyword occur in the first possible knowledge node (test questions in the knowledge node that contain the keyword / all test questions that contain the keyword) are calculated, so that the association degree between the first keyword and the first possible knowledge node is K1A1 × Q1A1. Similarly, the association degree between the second keyword and the first possible knowledge node is K2A1 × Q2A1, and the association degree between the third keyword and the first possible knowledge node is K3A1 × Q3A1. After all keywords have been compared with the first possible knowledge node, the association degree KQ1 between the target test question and the first possible knowledge node (the average of the association degrees between each keyword and the knowledge node) is obtained.

Next, the association degree between the target test question and the second possible knowledge node is calculated by the same logic: the frequency K1A2 with which the first keyword of the target test question occurs in the second possible knowledge node (test questions in the knowledge node that contain the keyword / all test questions in the knowledge node) and the frequency Q1A2 with which test questions containing the keyword occur in the second possible knowledge node (test questions in the knowledge node that contain the keyword / all test questions that contain the keyword) are calculated, so that the association degree between the first keyword and the second possible knowledge node is K1A2 × Q1A2. Similarly, the association degree between the second keyword and the second possible knowledge node is K2A2 × Q2A2, and the association degree between the third keyword and the second possible knowledge node is K3A2 × Q3A2. After all keywords have been compared with the second possible knowledge node, the association degree KQ2 between the target test question and the second possible knowledge node is obtained.

The calculation is repeated to obtain the association degrees between the target test question and all corresponding knowledge nodes.
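As a worked illustration with hypothetical counts (none of these numbers come from the source), and reading the association degree of a keyword as the product of its two frequencies, suppose the first possible knowledge node holds 40 test questions, 30 of which contain the first keyword and 10 of which contain the second; across all nodes, 50 test questions contain the first keyword and 20 contain the second. Then:

```latex
\begin{aligned}
K_{1A1} &= \tfrac{30}{40} = 0.75, & Q_{1A1} &= \tfrac{30}{50} = 0.6, & K_{1A1}\,Q_{1A1} &= 0.45,\\
K_{2A1} &= \tfrac{10}{40} = 0.25, & Q_{2A1} &= \tfrac{10}{20} = 0.5, & K_{2A1}\,Q_{2A1} &= 0.125,\\
KQ_1 &= \tfrac{0.45 + 0.125}{2} = 0.2875.
\end{aligned}
```

The same arithmetic, applied to the counts of the second possible knowledge node, gives KQ2.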

If the association degree between the target test question and several knowledge nodes, or no knowledge node, exceeds the preset association degree threshold, the target test question is placed in the database of undetermined knowledge nodes to await manual analysis, checking and classification by an expert.
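The decision rule (exactly one node above the threshold is accepted; several or none goes to the undetermined database) can be sketched as follows. The function name, the `PENDING` marker and the sample values are hypothetical.

```python
PENDING = "undetermined"   # hypothetical marker for the undetermined-node database

def classify(degrees, threshold):
    """Assign a test question given its association degree with each candidate node.

    degrees maps a candidate knowledge node to its association degree.
    """
    above = [node for node, degree in degrees.items() if degree > threshold]
    if len(above) == 1:
        return above[0]    # exactly one node exceeds the threshold
    return PENDING         # zero or several nodes: wait for manual review

# Hypothetical association degrees for one target test question.
result = classify({"node A": 0.40, "node B": 0.10}, threshold=0.30)
```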

In a preferred embodiment of the invention, the occurrence frequency, proportion, degree of correlation and degree of association of each label of the knowledge nodes are updated periodically; a reference basis is thereby provided for subsequent personalized recommendation, screening or filtering of teaching resources.
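The periodic check can be sketched as a pass over the stored assignments that flags any label whose recalculated association degree has dropped below a standard value. The scoring function is passed in as a parameter; its name, the flagging function and the sample scores are hypothetical, and scheduling is left out.

```python
def labels_to_verify(assignments, score, standard_value):
    """Return the (question, node) pairs whose recalculated degree is below the standard.

    assignments: iterable of (question_id, node) pairs currently stored.
    score: callable returning the current association degree of a pair.
    """
    return [
        (question_id, node)
        for question_id, node in assignments
        if score(question_id, node) < standard_value
    ]

# Hypothetical recalculated degrees after new test questions were added to the node.
current = {("q1", "node A"): 0.50, ("q2", "node A"): 0.10}
flagged = labels_to_verify(current.keys(), lambda q, n: current[(q, n)], 0.30)
```

Flagged pairs would then wait for manual verification rather than being reassigned automatically.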

In one embodiment of the invention, the invention provides a labeled test question knowledge point classification system, which comprises the following modules:

Further, the comparison module comprises a correlation calculation unit for:

The invention carries out labeling on the test questions, helps screening and filtering mass learning contents, and provides a basis for self-adaptive teaching.

It should be noted that: in the labeled test question knowledge point classification system provided in this embodiment, only the division of the above function modules is used for illustration when performing labeled test question knowledge point classification, and in practical applications, the function distribution may be completed by different function modules as needed, that is, the internal structure of the labeled test question knowledge point classification system is divided into different function modules to complete all or part of the above described functions. In addition, the embodiment of the labeled test question knowledge point classification system provided in this embodiment and the embodiment of the labeled test question knowledge point classification method provided in the above embodiments belong to the same concept, and the specific implementation process thereof is described in detail in the embodiment of the method.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A labeled test question knowledge point classification method is characterized by comprising the following steps:

S2, extracting keywords from the converted text;

2. The labeled test question knowledge point classification method according to claim 1, wherein the association degree between the target test question and the corresponding knowledge node in step S5 is calculated by the following steps:

3. The labeled test question knowledge point classification method according to claim 2, wherein the frequency with which the ith keyword occurs in the knowledge node is calculated as: (number of test questions in the knowledge node that contain the keyword) / (number of all test questions in the knowledge node);

4. The labeled test question knowledge point classification method according to claim 1, wherein the corresponding knowledge node in step S3 is determined by:

5. The labeled test question knowledge point classification method according to claim 1, wherein if the association degree between the target test question and a plurality of or zero knowledge nodes exceeds a preset association degree threshold, the target test question is placed in an undetermined knowledge node database.

6. The labeled test question knowledge point classification method according to claim 1, characterized in that the occurrence frequency, proportion, degree of correlation and degree of association of each label of the knowledge nodes are updated periodically.

7. The labeled test question knowledge point classification method according to claim 1, further comprising, after step S6: providing a reference basis for subsequent personalized recommendation, screening or filtering of teaching resources.

8. The labeled test question knowledge point classification method according to claim 1, wherein in step S1, the question, answer, solution step and auxiliary method of the target test question to be classified are converted into a preset text with a standard format.

9. A labeled test question knowledge point classification system is characterized by comprising the following modules:

10. The labeled test question knowledge point classification system according to claim 9, wherein the comparison module comprises an association degree calculation unit configured to: