CN112347366B

CN112347366B - Method for pushing preparatory Chinese exercises based on learner portrait similarity and exercise similarity

Info

Publication number: CN112347366B
Application number: CN202011408926.6A
Authority: CN
Inventors: 王华珍; 赵荐轩; 赵毅飞
Original assignee: Huaqiao University
Current assignee: Huaqiao University
Priority date: 2020-12-04
Filing date: 2020-12-04
Publication date: 2022-07-08
Anticipated expiration: 2040-12-04
Also published as: CN112347366A

Abstract

An embodiment of the invention discloses a method for pushing preparatory Chinese exercises based on learner portrait similarity and exercise similarity. The method comprises: constructing a learner portrait from the language family of the user's native language and the user's historical exercise trajectory; constructing a knowledge point multi-way tree and calculating exercise similarity using a lowest common ancestor (LCA) mechanism; fusing learner portrait similarity and exercise similarity to generate a candidate exercise queue; and generating a pushed exercise queue in real time based on user interaction data. The embodiment targets preparatory-program students and constructs a deep, vertical learner portrait of the user; it makes full use of the multi-layer semantic information of the exercises, improving the accuracy and interpretability of exercise pushing; and it uses real-time interaction data to provide real-time guidance under the 'i + 1' teaching theory.

Description

Method for pushing preparatory Chinese exercises based on learner portrait similarity and exercise similarity

Technical Field

The invention relates to the technical field of recommendation algorithms, and in particular to a method for pushing preparatory Chinese exercises based on learner portrait similarity and exercise similarity.

Background

Owing to the distinctive advantages of online education, schools have been developing online learning platforms for international students. Preparatory Chinese learning is an important component of these platforms: it provides online intelligent learning guidance to students preparing for the preparatory-education examinations organized by the education department, helping them pass the examinations and obtain high scores. Such a platform covers practical functions such as Chinese proficiency testing, practice on past examination questions, and simulated examinations.

At present, users of a traditional preparatory Chinese learning platform can obtain exercises only by knowledge point, from their wrong-exercise book, or purely at random. This is in essence a passive, non-personalized pushing mechanism: the system cannot push exercises to learners according to multidimensional information, which leads to low learning efficiency, a poor learning experience, and low utilization of the platform's exercises. In view of these problems and deficiencies, there is a strong need for an active, personalized exercise pushing algorithm that combines multiple dimensions and serves the broad population of preparatory Chinese learners.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a method for pushing preparatory Chinese exercises based on learner portrait similarity and exercise similarity.

To solve this technical problem, the invention is realized as follows:

An embodiment of this specification provides a method for pushing preparatory Chinese exercises based on learner portrait similarity and exercise similarity, comprising the following steps:

step 10, constructing a learner portrait based on the language family of the user's native language and the user's historical exercise trajectory;

step 20, constructing a knowledge point multi-way tree, and then calculating the similarity of the exercises using an LCA mechanism;

step 30, generating a candidate exercise queue based on the learner portrait similarity of the user and the exercise similarity;

and step 40, traversing the candidate exercise queue based on user interaction data to generate a pushed exercise queue.

Further, the step 10 specifically includes:

step 11, mapping the native language information L of the user to language family data LF covering the nine major language families and language isolates, wherein LF ∈ {0,1,2,3,4,5,6,7,8,9};

step 12, acquiring a historical exercise trajectory data set H of the user and constructing a user exercise trajectory vector X, wherein the length of X is the number of first-level knowledge points in the preparatory Chinese knowledge outline, and each value of X is the answer accuracy under the corresponding first-level knowledge point;

step 13, combining the native language information L, the language family data LF and the user exercise trajectory vector X of the user to form a learner portrait UP = (L, LF, X);

step 14, for any two users a and b whose learner portraits are Ua = (L1, LF1, X1) and Ub = (L2, LF2, X2), calculating the similarity of the learner portraits of users a and b by the following formula:

wherein n is the number of first-level knowledge points in the preparatory Chinese knowledge outline, X_1i is the answer accuracy of user a under the i-th first-level knowledge point of the outline, and X_2i is the answer accuracy of user b under the i-th first-level knowledge point of the outline.

Further, the step 20 specifically includes:

step 21, constructing a knowledge point multi-way tree T according to the preparatory Chinese knowledge outline, wherein the root node R of the tree T is a virtual node with depth h = 0, the nodes at depth h = 1 correspond to the first-level knowledge points of the outline, and lower-level child nodes correspond to lower-level knowledge points of the outline; the path weight function of the knowledge point multi-way tree T is defined as:

d(h) = 0.5^h

step 22, for any two exercises e_i, e_j (i ≠ j; i, j = 1, 2, ..., q, where q is the number of exercises in the exercise library EL), taking their corresponding knowledge point groups K_ei = {k_ei1, k_ei2, ..., k_ein} and K_ej = {k_ej1, k_ej2, ..., k_ejm}, and calculating the LCA nodes of all knowledge point pairs generated from K_ei and K_ej, wherein for two nodes u and v of the rooted tree T, the lowest (nearest) common ancestor LCA is the node x such that x is an ancestor of both u and v and the depth of x is as large as possible; the average, over all knowledge point pairs, of the sum of edge weights from the LCA node to the root node is the exercise similarity of e_i and e_j, given by the following formula:

$$\mathrm{Simex}(e_i, e_j) = \frac{1}{L_{ei}\,L_{ej}} \sum_{o=1}^{L_{ei}} \sum_{p=1}^{L_{ej}} \mathrm{Dist}\left(k_{LCA(k_{eio},\,k_{ejp})}\right)$$

wherein k_LCA(k_eio, k_ejp) is the LCA node of knowledge points k_eio and k_ejp, Dist(k) is the sum of path weights from node k to the root node, L_ei is the length of the knowledge point group K_ei of exercise e_i, and L_ej is the length of the knowledge point group K_ej of exercise e_j.

Further, the step 30 specifically includes:

step 31, extracting the wrong-exercise set WE of the user, sorting it by timestamp, and generating a wrong-exercise queue WEL = [we₁, we₂, ...];

step 32, traversing the total exercise library EL; when any exercise e in the total exercise library EL satisfies

and an exercise we can be found in the wrong-exercise set WE such that Simex(e, we) > 0, adding the exercise e to the similar exercise set SEL, thereby obtaining SEL = {(e_i, r_i), i = 1, 2, ..., q_sel}, wherein r_i is the value Simex(e_i, we) recorded when exercise e_i joins the set; when several exercises in the wrong-exercise set WE satisfy Simex > 0 with e_i, the highest value is taken as the similarity data r_i recorded on joining;

step 33, calculating the similarity between every user in the user library U and the current user a, sorting by similarity in descending order, taking the top 1% of users as the similar group SU, and then acquiring the exercise set SUE of the similar group SU;

step 34, performing double-condition sorting on the similar exercise set SEL according to the following rules:

for any two exercises (e1, e2):

when e1, e2 ∈ SUE or e1, e2 ∉ SUE,

the two are ordered by the similarity data recorded when they were added to the similar exercise set SEL, from high to low;

when one of e1, e2 belongs to SUE and the other does not, the exercise belonging to SUE is placed before the exercise not belonging to SUE;

the result of the double-condition sorting of the similar exercise set SEL is the candidate exercise queue CEL.

Further, the step 40 specifically includes:

step 41, classifying the user's historical exercise set by first-level knowledge point, and then calculating the user's answer accuracy at the three difficulty levels easy, medium and difficult, (r_ke, r_kn, r_kh), k ∈ {1, 2, ..., t};

generating, for each first-level knowledge point, push probabilities at the three levels easy, medium and difficult:

(p_ke, p_kn, p_kh)

wherein p_ke is the push probability of easy exercises, p_kn is the push probability of medium-difficulty exercises, and p_kh is the push probability of difficult exercises;

then generating a t × 3 knowledge point push probability matrix M, wherein t is the number of first-level knowledge points of the Chinese knowledge outline;

step 42, traversing the generated candidate exercise queue CEL in order; for any exercise e with corresponding knowledge point group {k₁, k₂, ..., k_m}, selecting the push probability of each knowledge point according to the difficulty of exercise e, and aggregating the push probabilities of all its knowledge points to obtain the final push probability:

and determining, based on a probability threshold criterion, whether exercise e is added to the pushed exercise queue FEL; ending the traversal when the length of FEL meets the requirement, and returning all exercises in FEL as recommendations to the user.

The invention has the following advantages:

a deep, vertical learner portrait is constructed for preparatory-program students; the multi-layer semantic information of the exercises is fully used, improving the accuracy and interpretability of exercise pushing; and real-time interaction data is used to provide real-time guidance under the 'i + 1' teaching theory.

Drawings

The invention is further described below with reference to the embodiments and the accompanying drawings.

FIG. 1 is a flow chart of a method performed by an embodiment of the present invention;

FIG. 2 is a flow chart of generating a user's learner portrait in accordance with an embodiment of the present invention;

FIG. 3 is a partial schematic diagram of a multi-way tree generated based on knowledge points according to an embodiment of the present invention;

FIG. 4 is a diagram illustrating the double-condition sorting process for a similar exercise set SEL according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the present invention is further explained with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The general idea of the invention is as follows: the language family of the student's native language and the user's trajectory in the existing system serve as the basic data for generating the learner portrait; a knowledge point multi-way tree is constructed from the knowledge point information of the exercises, and the LCA nodes of the knowledge points corresponding to different exercises are calculated to obtain similarity data among exercises. After the candidate exercise queue is obtained from these two dimensions, a push probability is generated for each exercise according to the 'i + 1' theory ('i' denotes the learner's current level of language knowledge, and '1' denotes the gap between the learner's current state of language knowledge and the next stage), and the final pushed exercise queue is generated. The invention thus provides an 'i + 1' pushing algorithm for preparatory Chinese exercises based on learner portrait similarity and exercise similarity, together with a two-dimensional candidate exercise queue generation algorithm based on the same two similarities, which improves the pertinence and interpretability of exercise pushing. In addition, the push probability generation algorithm based on the 'i + 1' teaching theory embodies the idea of tiered teaching, helps improve learners' interest and motivation, and improves their learning experience.

An embodiment of the invention discloses a method for pushing preparatory Chinese exercises based on learner portrait similarity and exercise similarity. Referring to FIG. 1, the method comprises the following steps:

S1, constructing a learner portrait based on the language family of the user's native language and the user's historical exercise trajectory;

S11, mapping the native language information L of a user (for example, a preparatory-program student) to language family data LF covering the nine major language families and language isolates, wherein LF ∈ {0,1,2,3,4,5,6,7,8,9}; users whose native languages belong to the same one of the nine major language families are considered highly similar with respect to native language family; for a language that is a language isolate (LF = 9), similarity is determined using the native language itself.

S12, acquiring a historical exercise trajectory data set H of the user and constructing from it a user exercise trajectory vector X, wherein the length of X is the number of first-level knowledge points in the preparatory Chinese knowledge outline, and each value of X is the answer accuracy under the corresponding first-level knowledge point.

S13, combining the user's native language information L, language family data LF and user exercise trajectory data X to form the learner portrait, denoted UP = (L, LF, X).

S14, for any two users a and b whose learner portraits are Ua = (L1, LF1, X1) and Ub = (L2, LF2, X2), respectively, the similarity of the learner portraits of users a and b is calculated as follows:

wherein n is the number of first-level knowledge points in the preparatory Chinese knowledge outline, X_1i is the answer accuracy of user a under the i-th first-level knowledge point of the outline, and X_2i is the answer accuracy of user b under the i-th first-level knowledge point of the outline;

a flow chart of learner portrait generation is shown in FIG. 2.
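Steps S11 to S13 can be sketched roughly as follows. This is only an illustrative sketch: the language table, the `(first_level_index, is_correct)` history layout, the zero accuracy for unattempted knowledge points, and all function names are assumptions made here, not details given in the description (only the pairing English → 0 and the isolate code 9 come from the text). The similarity formula of S14 is not reproduced in this text, so it is not implemented.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical language-to-family table. Only "English" -> 0 comes from the
# worked example; "ExampleIsolate" is a placeholder for a language isolate.
LANGUAGE_FAMILY: Dict[str, int] = {"English": 0, "ExampleIsolate": 9}
ISOLATE = 9  # LF value used for language isolates (step S11)


@dataclass
class LearnerPortrait:
    native_language: str   # L
    language_family: int   # LF, in 0..9
    accuracy: List[float]  # X, one entry per first-level knowledge point


def build_portrait(native_language: str,
                   history: Iterable[Tuple[int, bool]],
                   n_points: int) -> LearnerPortrait:
    """Build UP = (L, LF, X) from (first_level_index, is_correct) records."""
    attempts = [0] * n_points
    correct = [0] * n_points
    for idx, ok in history:
        attempts[idx] += 1
        correct[idx] += int(ok)
    # Assumption: accuracy is 0.0 where the user has no attempts yet.
    x = [c / a if a else 0.0 for c, a in zip(correct, attempts)]
    return LearnerPortrait(native_language, LANGUAGE_FAMILY[native_language], x)


def same_language_group(a: LearnerPortrait, b: LearnerPortrait) -> bool:
    """S11 rule: same family counts as similar; isolates compare L itself."""
    if a.language_family == ISOLATE or b.language_family == ISOLATE:
        return a.native_language == b.native_language
    return a.language_family == b.language_family
```

For instance, `build_portrait("English", [(0, True), (0, False), (1, True)], 3)` yields a portrait with LF = 0 and accuracy vector `[0.5, 1.0, 0.0]`.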

S2, constructing a knowledge point multi-way tree, and then calculating the similarity of the exercises using an LCA mechanism;

S21, constructing a knowledge point multi-way tree T according to the preparatory Chinese knowledge outline, wherein the root node R is a virtual node with depth h = 0; the nodes at depth h = 1 are the first-level knowledge points of the outline, and so on, with lower-level child nodes being lower-level knowledge points of the outline; the edge weights d at the same depth of the multi-way tree T are equal and depend on the depth, and the path weight function of the multi-way tree is defined as:

d(h) = 0.5^h

S22, for any two exercises e_i, e_j (i ≠ j; i, j = 1, 2, ..., q, where q is the number of exercises in the exercise library EL), taking their corresponding knowledge point groups K_ei = {k_ei1, k_ei2, ..., k_ein} and K_ej = {k_ej1, k_ej2, ..., k_ejm}, and calculating the LCA nodes of all knowledge point pairs generated from K_ei and K_ej. The LCA node is defined as follows: for two nodes u, v of the rooted tree T, the lowest (nearest) common ancestor LCA is the node x such that x is an ancestor of both u and v and the depth of x is as large as possible. Here, a node may also be its own ancestor. The average, over all knowledge point pairs, of the sum of edge weights from the LCA node to the root node is then calculated; this is the exercise similarity of e_i and e_j. The following definitions are thus used:

the LCA node of any two knowledge points:

k_LCA(k1,k2)

the sum of the path weights from any node to the root node:

Dist(k)

the similarity of any two exercises e₁, e₂, with knowledge point groups of lengths L_e1 and L_e2:

$$\mathrm{Simex}(e_1, e_2) = \frac{1}{L_{e1}\,L_{e2}} \sum_{o=1}^{L_{e1}} \sum_{p=1}^{L_{e2}} \mathrm{Dist}\left(k_{LCA(k_{e1o},\,k_{e2p})}\right)$$
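The path-weight sum Dist(k) has a simple closed form. The derivation below assumes that the edge entering a node at depth h carries the weight d(h) = 0.5^h; this is an interpretation rather than a statement in the text, although it agrees with the values Dist(1) = 0.5 and Dist(1.2) = 0.75 used in the worked example of this description. For a node k at depth h:

```latex
\mathrm{Dist}(k) \;=\; \sum_{j=1}^{h} d(j) \;=\; \sum_{j=1}^{h} 0.5^{\,j} \;=\; 1 - 0.5^{\,h}
```

Under this reading the virtual root (h = 0) gives Dist = 0, deeper common ancestors give larger values, and the similarity of two exercises, being an average of Dist values, always lies in [0, 1).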

S3, generating a candidate exercise queue based on the learner portrait similarity of the user and the exercise similarity;

S31, extracting the wrong-exercise set WE of the user, sorting it by timestamp in descending order, and generating a wrong-exercise queue WEL = [we₁, we₂, ...];

S32, traversing the total exercise library EL; if and only if an exercise e, e ∈ EL, satisfies

and an exercise we can be found in the wrong-exercise set WE such that Simex(e, we) > 0, the exercise is added to the similar exercise set SEL, finally giving SEL = {(e_i, r_i), i = 1, 2, ..., q_sel}, wherein r_i is the value Simex(e_i, we) recorded when exercise e_i joins the set. When several exercises in the wrong-exercise set WE satisfy Simex > 0 with e_i, the highest value is taken as the similarity data r_i recorded on joining;

S33, calculating, according to the Simstu function, the similarity between every user in the user library U and the current user a, taking the 1% of users with the highest similarity as the similar group SU, and acquiring the exercise set SUE of the similar group SU;

S34, performing double-condition sorting on the similar exercise set SEL according to the following rules:

for any two exercises (e1, e2):

(1) when e1, e2 ∈ SUE or e1, e2 ∉ SUE,

the two are ordered by the similarity data recorded when they were added to the similar exercise set SEL, from high to low;

(2) when one of e1, e2 belongs to SUE and the other does not, the exercise belonging to SUE is placed before the exercise not belonging to SUE;

the result of the double-condition sorting of the similar exercise set SEL is the candidate exercise queue CEL.
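The construction of SEL and the double-condition sort of S34 can be sketched as below. The function names and the `simex` callable are hypothetical, and the additional membership condition of S32 whose formula is not reproduced in this text is deliberately left out, so this is a sketch of the stated rules rather than a complete implementation. The sample values are taken from the worked example later in this description.

```python
from typing import Callable, Iterable, List, Set, Tuple


def build_sel(exercise_ids: Iterable[int],
              wrong_ids: List[int],
              simex: Callable[[int, int], float]) -> List[Tuple[int, float]]:
    """Keep (e, r) where r is the highest Simex(e, we) over WE and r > 0."""
    sel = []
    for e in exercise_ids:
        r = max((simex(e, we) for we in wrong_ids), default=0.0)
        if r > 0:
            sel.append((e, r))
    return sel


def double_condition_sort(sel: List[Tuple[int, float]],
                          sue: Set[int]) -> List[int]:
    """Exercises in SUE come first; within each group, higher r comes first."""
    ordered = sorted(sel, key=lambda er: (er[0] not in sue, -er[1]))
    return [e for e, _ in ordered]


# Three entries of SEL_#0 and part of SUE_#0 from the worked example.
sel = [(5910, 0.6922667), (100, 0.546306), (2303, 0.08529)]
sue = {100, 7203, 15976}
print(double_condition_sort(sel, sue))  # [100, 5910, 2303]
```

Sorting on the tuple `(not in SUE, -r)` encodes both rules in one key: membership in SUE decides first, and the recorded similarity breaks ties within each group.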

S4, traversing the candidate exercise queue based on user interaction data to generate a pushed exercise queue;

S41, classifying the user's historical exercise set by first-level knowledge point, and then calculating the answer accuracy at the three difficulty levels easy, medium and difficult, (r_ke, r_kn, r_kh) (k ∈ {1, 2, ..., t}, where t is the number of first-level knowledge points of the Chinese knowledge outline);

according to the 'i + 1' teaching theory, for the knowledge point numbered k (k ∈ {1, 2, ..., t}, where t is the number of first-level knowledge points of the Chinese knowledge outline) and for easy exercises: when the user's accuracy on easy exercises is low, the probability of pushing easy exercises should be high; the push probability p_ke of easy exercises is therefore set as:

for medium-difficulty exercises: when the user's accuracy on easy exercises is high and the accuracy on difficult exercises is low, the probability of pushing medium-difficulty exercises should be high; the push probability p_kn of medium-difficulty exercises is therefore set as:

for difficult exercises: when the user's accuracy on easy and medium-difficulty exercises is high, the probability of pushing difficult exercises should be high; the push probability p_kh of difficult exercises is therefore set as:

this gives the push probabilities of the knowledge point numbered k at the three levels easy, medium and difficult:

(p_ke, p_kn, p_kh)

and finally a t × 3 knowledge point push probability matrix M is generated, wherein t is the number of first-level knowledge points of the Chinese knowledge outline.

S42, traversing in order the candidate exercise queue CEL generated in S3; for any exercise e with corresponding knowledge point group {k₁, k₂, ..., k_m}, selecting the push probability of each knowledge point according to the difficulty (easy, medium or difficult) of exercise e, and aggregating the push probabilities of all corresponding knowledge points to obtain the final push probability:

the closer the value of p is to 1, the better the exercise conforms to pushing under the 'i + 1' teaching theory. Whether exercise e joins the final push queue FEL is determined by a probability threshold criterion: a random number between 0 and 1 is generated, and when it is smaller than the final push probability p of exercise e, the exercise is added to the final push queue. When the length of the final queue meets the requirement, the traversal ends, and all exercises in the final push queue FEL are returned and recommended to the user.
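The threshold traversal of S42 can be sketched as follows. Because the formula that aggregates knowledge point probabilities into the final push probability is not reproduced in this text, the sketch takes it as a caller-supplied function; `final_push_queue`, `push_probability` and `target_length` are hypothetical names introduced for illustration.

```python
import random
from typing import Callable, List, Optional, Sequence


def final_push_queue(cel: Sequence[int],
                     push_probability: Callable[[int], float],
                     target_length: int,
                     rng: Optional[random.Random] = None) -> List[int]:
    """Walk CEL in order, admitting each exercise with its push probability."""
    rng = rng or random.Random()
    fel: List[int] = []
    for e in cel:
        if len(fel) >= target_length:
            break  # the queue length requirement is met
        # Admit e when a uniform draw in [0, 1) falls below its probability p.
        if rng.random() < push_probability(e):
            fel.append(e)
    return fel
```

Passing a seeded `random.Random` makes a run reproducible, which is convenient when checking a pushing result by hand. If CEL is exhausted first, the returned queue is simply shorter than `target_length`.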

The following is a specific example:

The first step: learner portrait generation

The learner portrait uses the language family as explicit data and the user's exercise record as implicit data. The process of generating the learner portrait for the student numbered #0 is as follows:

(1) Language family data:

The user's native language is mapped to a language family, using the nine major language families generally recognized in linguistics, together with language isolates, as the mapping result set, to obtain the language family data LF. The correspondence between languages and language family data adopted in this embodiment is shown in Table 1:

Table 1. Correspondence between common languages and the nine major language families

(2) User exercise trajectory data

A historical exercise trajectory data set H of the user is acquired and used to construct a user exercise trajectory vector X, wherein the length of X is the number of first-level knowledge points in the preparatory Chinese knowledge outline, and each value of X is the answer accuracy under the corresponding first-level knowledge point.

(3) Construction of the learner portrait data tuple

The learner's native language information L, the language family data LF and the user exercise trajectory data X are combined to form the learner portrait (L, LF, X).

The specific steps for generating the learner portrait of the user numbered #0 are as follows:

First, the native language information of user #0 and its language family mapping are:

L_#0 = English

LF_#0 = 0

Then, part of the user's exercise trajectory data is shown in Table 2, where "question knowledge point group" denotes the knowledge points corresponding to the question, and the elements of the group are separated by commas. For example, {1.6.2,2.2.2,4.7} refers to the three knowledge points numbered 1.6.2, 2.2.2 and 4.7.

Table 2. Partial exercise trajectory information of user #0

Finally, the learner portrait generated for user #0 is:

UP_#0 = {English, 0, [0.9339195828342755,0.7251881382959092,0.5208145598192523,0.8616284113349806,0.6810263255880479,0.5914522872981305,0.3744961440723096,0.7861996558752264,0.9296202080857866,0.8481496339940807]}

The second step: calculating exercise similarity based on the knowledge point multi-way tree and the LCA mechanism

(1) A knowledge point multi-way tree T is constructed according to the preparatory Chinese knowledge outline; the root node R is a virtual node with depth 0. The nodes at depth 1 are the first-level knowledge points of the outline, and so on, with lower-level child nodes being lower-level knowledge points of the outline. For any two knowledge points k_i, k_j (i ≠ j; i, j = 1, 2, ..., s), the edge weight d_ij of the two nodes is constructed. Edges at the same depth of the multi-way tree T have the same weight, and the weight depends on the depth; a schematic diagram of the multi-way tree is shown in FIG. 3.

(2) For any two exercises e_i, e_j (i ≠ j; i, j = 1, 2, ..., q, where q is the number of exercises in the exercise library EL), their corresponding knowledge point groups K_ei = {k_ei1, k_ei2, ..., k_ein} and K_ej = {k_ej1, k_ej2, ..., k_ejm} are taken, the LCA nodes of all knowledge point pairs generated from K_ei and K_ej are calculated, and the edge weight d_ij^LCA from the LCA nodes to the root node is then calculated, which gives the exercise similarity of e_i and e_j.

The similarity of the exercises with IDs 413 and 424 is calculated as follows:

First, the knowledge point group of exercise e_#413 (ID 413) is K_#413 = {6.1.4.1, 1.2}, and the knowledge point group of exercise e_#424 (ID 424) is K_#424 = {1.1.2, 1.2.4}; all knowledge point pairs are (6.1.4.1, 1.1.2), (6.1.4.1, 1.2.4), (1.2, 1.1.2) and (1.2, 1.2.4).

Secondly, the distance from the LCA node of each knowledge point pair to the root node is calculated (node 0 denotes the root):

the LCA node of (6.1.4.1, 1.1.2) is 0, Dist(0) = 0;

the LCA node of (6.1.4.1, 1.2.4) is 0, Dist(0) = 0;

the LCA node of (1.2, 1.1.2) is 1, Dist(1) = 0.5;

the LCA node of (1.2, 1.2.4) is 1.2, Dist(1.2) = 0.75.

Finally, the average of the distances from the LCA nodes to the root node over all knowledge point pairs is calculated and recorded as the similarity of exercises e_#413 and e_#424: Simex(e_#413, e_#424) = (0 + 0 + 0.5 + 0.75) / 4 = 0.3125.
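The calculation for exercises 413 and 424 can be checked with a short script. The sketch assumes that a dotted knowledge point ID spells out the path from the root (so the LCA of two IDs is their longest common prefix) and that the edge entering depth h weighs 0.5^h; both are interpretations that agree with the values listed above, and the function names are illustrative only.

```python
from itertools import product
from typing import List, Tuple


def lca(a: str, b: str) -> Tuple[str, ...]:
    """Longest common prefix of two dotted IDs; () is the virtual root."""
    common = []
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        common.append(x)
    return tuple(common)


def dist(node: Tuple[str, ...]) -> float:
    """Sum of edge weights d(h) = 0.5 ** h on the path from the root to node."""
    return sum(0.5 ** h for h in range(1, len(node) + 1))


def simex(group_i: List[str], group_j: List[str]) -> float:
    """Average Dist of the LCA over all knowledge point pairs."""
    pairs = list(product(group_i, group_j))
    return sum(dist(lca(a, b)) for a, b in pairs) / len(pairs)


K_413 = ["6.1.4.1", "1.2"]
K_424 = ["1.1.2", "1.2.4"]
print(simex(K_413, K_424))  # 0.3125
```

The four pair values are 0, 0, 0.5 and 0.75, so the average is 1.25 / 4 = 0.3125.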

The third step: fusing learner portrait similarity and exercise similarity to generate the candidate exercise queue

The wrong-exercise set WE of the user is extracted and sorted by timestamp in descending order, generating a wrong-exercise queue WEL = [we₁, we₂, ...].

The total exercise library EL is traversed; if and only if an exercise e, e ∈ EL, satisfies

and an exercise we can be found in the wrong-exercise set WE such that Simex(e, we) > 0, the exercise is added to the similar exercise set SEL, finally giving SEL = {(e_i, r_i), i = 1, 2, ..., q_sel}, wherein r_i is the value Simex(e_i, we) recorded when exercise e_i joins the set. When several exercises in the wrong-exercise set WE satisfy Simex > 0 with e_i, the highest value is taken as the similarity data r_i recorded on joining.

According to the Simstu function, the similarity between every user in the user library U and the current user a is calculated, the 1% of users with the highest similarity are taken as the similar group SU, and the exercise set SUE of the similar group SU is acquired.

Double-condition sorting is performed on the similar exercise set SEL to generate the candidate exercise queue CEL; the detailed process of the double-condition sorting is shown in FIG. 4.

The specific steps for generating the candidate exercise queue CEL for the user numbered #0 are as follows:

First, the wrong-exercise set WE_#0 of user #0 is acquired; part of the user's wrong-exercise information is shown in the following table:

Table 3. Partial wrong-exercise information of user #0

Exercise ID	Timestamp
412	1580039206
416	1590283268
417	1584495190
419	1583084544
424	1582961296
958	1586444357
762	1588219986
530	1582961296
785	1586444357
949	1588692389

After sorting by timestamp in descending order, the following queue is generated:

WEL_#0 = [412,416,762,958,785,417,419,424,530,1772,1986,949,1998,1975,1560,1205,1780,1273,1265,1572,1707,1492,1617,1595,1051,1910,1114,1764,1233,1845…]

The total exercise library EL is traversed, and all pairs (e_i, r_i) that satisfy the conditions are added to the similar exercise set SEL_#0: SEL_#0 = {(5910,0.6922667),(4753,0.5354832),(2303,0.08529),(100,0.546306),(1551,0.633632),(4743,0.9098956),(1638,0.9329645),(4488,0.0512871),(2516,0.6642648),(7398,0.0963061),…}

The similar group SU_#0 of user #0 is then obtained; part of the similar-group user information is shown in the following table:

Table 4. Partial similar-group user information of user #0

The exercise set of the similar group SU_#0 is obtained:

SUE_#0 = {7203,15976,100,17615,24555,13904,14894,20497,233,24707,6001,27445,3917,23742,22673,26421…}

Finally, double-condition sorting is performed on the similar exercise set SEL_#0. The ordering of two specific pairs of elements of SEL_#0 proceeds as follows:

for (5910, 0.6922667) and (100, 0.546306):

the exercise with ID 5910 does not belong to SUE_#0 and the exercise with ID 100 belongs to SUE_#0, so the exercise with ID 100 is placed before the exercise with ID 5910;

for (5910, 0.6922667) and (2303, 0.08529):

neither the exercise with ID 5910 nor the exercise with ID 2303 belongs to SUE_#0; since 0.6922667 is greater than 0.08529, the exercise with ID 5910 is placed before the exercise with ID 2303;

The candidate exercise queue finally generated is:

CEL_#0 = [5769,3921,9135,5804,5112,1121,4936,7477,4645,100,3939,8843,2179,2838,1582,4628,2048,9172,8493,7160…]

The fourth step: real-time generation of 'i + 1' push exercise queue based on user interaction data

Classifying the historical exercise set of the user according to the first-level knowledge point, and further calculating the answer accuracy (r) according to three difficulty levels of simple, medium and difficult_ke,r_kn,r_kh) (k ∈ 1,2.. t, t is the number of first-level knowledge points of the Chinese knowledge schema).

According to an i +1 teaching theory, pushing probabilities under three levels of simple, medium and difficult are generated aiming at each primary knowledge point, and finally a t multiplied by 3 knowledge point pushing probability matrix M is obtained, wherein t is the number of the primary knowledge points of the Chinese knowledge outline.

Sequentially traversing the problem candidate queue CEL generated in the third step, and for any problem e, the final push probability is set as { k ] from the corresponding knowledge point group₁，k₂，...，k_mAnd selecting corresponding knowledge point difficulty pushing probability in the knowledge point pushing probability matrix M according to the difficulty. When the exercises meet the pushing conditions, adding the exercises into a final pushing queue FEL; and when the length of the final pushing queue FEL meets the requirement, ending traversal and returning all the questions in the final pushing queue.

First, the historical exercise set of the user numbered #0 is classified by first-level knowledge point. Under the first-level knowledge point 'part of speech', the user's accuracy at the simple, medium and difficult levels is (0.66, 0.5, 0.3);

according to the teaching theory of 'i + 1', the push probability of generating the simple difficulty problem under the first-level knowledge point of the 'part of speech' is as follows:

the push probability of the medium-difficulty exercises under the first-level knowledge point of the 'part of speech' is as follows:

the pushing probability of the difficult difficulty problem under the first-level knowledge point of the 'part of speech' is as follows:

thus, for the user numbered #0, the push probabilities at the three difficulty levels simple, medium and difficult under the first-level knowledge point 'part of speech' are:

(0.6, 0.68, 0.58)

by analogy, push probabilities at the three difficulty levels simple, medium and difficult are generated for all nine first-level knowledge points, finally giving the knowledge point push probability matrix M_#0 of the user numbered #0, whose size is 9 × 3.

For the exercise with ID 100 in the exercise candidate queue CEL_#0, whose knowledge point group is K_#100 = {7.1.4.1, 1.2, 3.9.7} and whose difficulty is medium, the final push probability is:

finally, a random number ran in the range 0 to 1 is generated once; if ran ≤ 0.68, the exercise with ID 100 is added to the final push queue FEL_#0. Otherwise, the following exercises in the candidate queue CEL_#0 are selected in order, until the length of FEL_#0 meets the length requirement of the current exercise push.
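The traversal described in this fourth step can be sketched as below. This is an illustrative sketch under assumptions: `build_push_queue` and its parameters are hypothetical names, and `push_prob` stands in for the final push probability of an exercise (for example, 0.68 for the exercise with ID 100), whose computation from the matrix M is not reproduced here.

```python
import random


def build_push_queue(cel, push_prob, required_len, rng=None):
    """Traverse the candidate queue `cel` in order and build the push queue.

    An exercise is kept when a fresh random number in [0, 1) is less than
    or equal to its push probability. Traversal stops once `required_len`
    exercises have been collected or the candidates are exhausted.
    """
    rng = rng or random.Random()
    fel = []
    for exercise_id in cel:
        if len(fel) >= required_len:
            break
        ran = rng.random()  # one draw per candidate exercise
        if ran <= push_prob(exercise_id):
            fel.append(exercise_id)
    return fel


# First entries of CEL_#0 from the text; the constant probability is a placeholder.
cel_0 = [5769, 3921, 9135, 5804, 5112]
fel_0 = build_push_queue(cel_0, lambda _eid: 0.68, required_len=3,
                         rng=random.Random(0))
print(fel_0)
```

Passing the random generator as a parameter makes the sketch reproducible in tests; a production system would likely draw from an unseeded source.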

Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims

1. A pre-science Chinese exercise pushing method based on the similarity between a learner portrait and exercises, characterized by comprising the following steps:

step 40, traversing the candidate problem queue based on the user interaction data to generate a pushing problem queue;

wherein, the step 10 specifically comprises:

step 12, acquiring the user's historical exercise trajectory data set H and constructing the user exercise trajectory vector X, wherein the length of X is the number of first-level knowledge points of the pre-science student Chinese knowledge outline, and each value of X is the exercise accuracy on the corresponding first-level knowledge point;

step 13, summarizing the user's native language information L, language family data LF and user exercise trajectory vector X to form the learner portrait UP = {L, LF, X};

wherein n is the number of first-level knowledge points of the pre-science student Chinese knowledge outline, X_1i is the exercise accuracy of user a on the i-th first-level knowledge point of the pre-science student Chinese knowledge outline, and X_2i is the exercise accuracy of user b on the i-th first-level knowledge point of the pre-science student Chinese knowledge outline.

2. The method of claim 1, wherein: the step 20 specifically includes:

d(h)＝0.5^h

step 22, for any two exercises e_i, e_j, where i ≠ j, i, j = 1, 2, ..., q, and q is the number of exercises in the question bank EL, determining from the exercise numbers the knowledge point groups K_ei = {k_ei1, k_ei2, ..., k_ein} and K_ej = {k_ej1, k_ej2, ..., k_ejm} corresponding to the two exercises, and calculating the LCA nodes of all knowledge point pairs generated from K_ei and K_ej, wherein the LCA is defined as follows: for two nodes u and v of a rooted tree T, the nearest common ancestor (LCA) is the node x such that x is an ancestor of both u and v and the depth of x is as large as possible; calculating the average, over all knowledge point pairs, of the edge weights from the LCA node to the root node to obtain the exercise similarity of e_i and e_j, the formula of which is as follows:

3. The method of claim 1, wherein: the step 30 specifically includes:

step 31, extracting the user's wrong-exercise set WE, sorting it by timestamp, and generating the wrong-exercise queue WEL = [we₁, we₂, ...];

when an exercise we can be found in the wrong-exercise set WE such that Simex(e, we) > 0, adding the exercise e to the similar exercise set SEL, obtaining the similar exercise set SEL = {(e_i, r_i), i = 1, 2, ..., q_sel}, where r_i is the Simex(e_i, we) value recorded when the exercise e_i joins the queue; when several exercises in the wrong-exercise set WE satisfy Simex > 0 with e_i, the highest value is taken as the similarity data r_i recorded on joining the queue;

and step 34, performing double-condition sequencing on the similar problem set SEL according to the following sequencing rule:

for any two exercises (e1, e2):

when e1, e2 ∈ SUE or e1, e2 ∉ SUE, the two are sorted from high to low by the similarity data recorded when they were added to the similar exercise set SEL;

4. The method of claim 1, wherein: the step 40 specifically includes:

(p_ke，p_kn，p_kh)

and determining whether the problem e is added into a pushing problem queue FEL or not based on a probability threshold criterion, finishing traversal when the length of the pushing problem queue FEL meets the requirement, and returning all problems in the pushing problem queue FEL to recommend to the user.