CN112347366B - Preparatory-course Chinese exercise pushing method based on learner portrait and exercise similarity - Google Patents

Info

Publication number
CN112347366B
CN112347366B CN202011408926.6A CN202011408926A CN112347366B CN 112347366 B CN112347366 B CN 112347366B CN 202011408926 A CN202011408926 A CN 202011408926A CN 112347366 B CN112347366 B CN 112347366B
Authority
CN
China
Prior art keywords
user
knowledge point
knowledge
similarity
exercise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011408926.6A
Other languages
Chinese (zh)
Other versions
CN112347366A (en)
Inventor
王华珍
赵荐轩
赵毅飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaqiao University
Original Assignee
Huaqiao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaqiao University filed Critical Huaqiao University
Priority to CN202011408926.6A priority Critical patent/CN112347366B/en
Publication of CN112347366A publication Critical patent/CN112347366A/en
Application granted granted Critical
Publication of CN112347366B publication Critical patent/CN112347366B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/383 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Computational Linguistics (AREA)
  • Educational Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Educational Administration (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of the invention discloses a method for pushing preparatory-course Chinese exercises based on learner portrait and exercise similarity. The method comprises: constructing a learner portrait from the language family of the user's native language and the user's exercise history; constructing a knowledge point multi-way tree and calculating exercise similarity with an LCA mechanism; fusing learner portrait similarity and exercise similarity to generate a candidate exercise queue; and finally generating the pushing exercise queue in real time from user interaction data. The embodiment targets preparatory students and constructs a deep, vertical learner portrait of the user; it makes full use of the multi-layer semantic information of the exercises, improving the accuracy and interpretability of exercise pushing; and it uses real-time interaction data to realize real-time guidance under the 'i + 1' teaching theory.

Description

Preparatory-course Chinese exercise pushing method based on learner portrait and exercise similarity
Technical Field
The invention relates to the technical field of recommendation algorithms, in particular to a method for pushing preparatory-course Chinese exercises based on learner portrait and exercise similarity.
Background
Owing to the distinctive advantages of online education, many schools have developed online learning platforms for preparatory students. Preparatory Chinese study is an important component of these platforms: it can provide online intelligent learning guidance to students preparing for the preparatory-education examinations set by the education department, helping them pass the examinations smoothly and obtain high scores. Such platforms cover practical functions such as Chinese proficiency testing, practice with past examination questions, and mock examinations.
At present, on a traditional preparatory Chinese learning platform, users can obtain exercises only by knowledge point, from their wrong-exercise book, or purely at random. This is in essence a passive, non-personalized pushing mechanism: the system cannot push exercises to learners according to multidimensional information, which leads to low learning efficiency, a poor learning experience, and low utilization of the platform's exercises. In view of these problems and deficiencies, there is a strong need for an active, personalized exercise pushing algorithm that combines multiple dimensions and serves the large population of preparatory Chinese learners.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for pushing preparatory-course Chinese exercises based on learner portrait and exercise similarity.
In order to solve the technical problem, the invention is realized as follows:
an embodiment of the specification provides a method for pushing preparatory-course Chinese exercises based on learner portrait and exercise similarity, which comprises the following steps:
step 10, constructing a learner portrait based on the language family of the user's native language and the user's historical exercise trajectory;
step 20, constructing a knowledge point multi-way tree, and then calculating the similarity of the exercises by adopting an LCA mechanism;
step 30, generating a candidate exercise queue based on the learner portrait similarity of the user and the exercise similarity;
and step 40, traversing the candidate exercise queue based on the user interaction data to generate a pushing exercise queue.
Further, the step 10 specifically includes:
step 11, mapping the native language information L of the user to language family data LF over the nine major language families plus the isolated languages, wherein LF ∈ {0,1,2,3,4,5,6,7,8,9};
step 12, acquiring a historical exercise trajectory data set H of the user, and constructing a user exercise trajectory vector X, wherein the length of X is the number of first-level knowledge points in the preparatory students' Chinese knowledge outline, and each value of X is the exercise accuracy under the corresponding first-level knowledge point;
step 13, summarizing the native language information L, the language family data LF and the user exercise trajectory vector X of the user to form a learner portrait UP = (L, LF, X);
step 14, for any two users a and b whose learner portraits are Ua = (L1, LF1, X1) and Ub = (L2, LF2, X2), calculating the similarity of the learner portraits of users a and b by the following formula:
Figure GDA0003650128180000021
wherein n is the number of first-level knowledge points in the preparatory students' Chinese knowledge outline, X_1i is the exercise accuracy of user a under the i-th first-level knowledge point of that outline, and X_2i is the exercise accuracy of user b under the i-th first-level knowledge point of that outline.
Further, the step 20 specifically includes:
step 21, constructing a knowledge point multi-way tree T according to the preparatory Chinese knowledge outline, wherein the root node R of the knowledge point multi-way tree T is a virtual node with depth h = 0, the nodes with depth h = 1 correspond to the first-level knowledge points of the Chinese knowledge outline, and lower-level child nodes correspond to lower-level knowledge points of the Chinese knowledge outline; the path weight function of the knowledge point multi-way tree T is defined as:
$$d(h) = 0.5^{h}$$
step 22, for any two exercises e_i, e_j (i ≠ j; i, j = 1, 2, ..., q, where q is the number of exercises in the exercise library EL), determining the knowledge point groups K_ei = {k_ei1, k_ei2, ..., k_ein} and K_ej = {k_ej1, k_ej2, ..., k_ejm} respectively corresponding to the two exercises, and calculating the LCA nodes of all knowledge point pairs generated from K_ei and K_ej, the LCA node being defined as follows: for two nodes u and v of the rooted tree T, the nearest common ancestor (LCA) node is the node x such that x is an ancestor of both u and v and the depth of x is as large as possible; then calculating the average, over all knowledge point pairs, of the sum of edge weights from the LCA node to the root node, to obtain the exercise similarity of e_i and e_j by the following formula:
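$$\mathrm{Simex}(e_i, e_j) = \frac{1}{L_{ei}\, L_{ej}} \sum_{o=1}^{L_{ei}} \sum_{p=1}^{L_{ej}} \mathrm{Dist}\big(k_{LCA(k_{eio},\, k_{ejp})}\big)$$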
wherein k_LCA(k_eio, k_ejp) is the LCA node of the knowledge points k_eio and k_ejp, Dist(k) is the sum of the path weights from node k to the root node, L_ei is the length of the knowledge point group K_ei of exercise e_i, and L_ej is the length of the knowledge point group K_ej of exercise e_j.
Further, the step 30 specifically includes:
step 31, extracting the wrong-exercise set WE of the user, sorting it by timestamp, and generating a wrong-exercise queue WEL = [we_1, we_2, ...];
step 32, traversing the total exercise library EL, and for any exercise e in the total exercise library EL that satisfies
Figure GDA0003650128180000032
and for which some exercise we in the wrong-exercise set WE satisfies Simex(e, we) > 0, adding the exercise e to the similar exercise set SEL, to obtain SEL = {(e_i, r_i), i = 1, 2, ..., q_sel}, wherein r_i is the value Simex(e_i, we) recorded when exercise e_i is added; when several exercises in the wrong-exercise set WE satisfy Simex > 0 with e_i, the highest value is taken as the similarity data r_i recorded on addition;
Step 33, calculating the similarity between all users in the user library U and the current user a, sorting according to the similarity value from large to small, taking the first 1% of users as a similar group SU, and then acquiring a problem set SUE of the similar group SU;
step 34, performing double-condition sorting on the similar exercise set SEL according to the following sorting rules:
for any two exercises (e1, e2):
when e1, e2 ∈ SUE or e1, e2 ∉ SUE, the two are sorted in descending order of the similarity data recorded when they were added to the similar exercise set SEL;
when one of e1, e2 belongs to SUE and the other does not, the exercise belonging to SUE is placed before the exercise not belonging to SUE;
the result of the double-condition sorting of the similar exercise set SEL is the exercise candidate queue CEL.
Further, the step 40 specifically includes:
step 41, classifying the historical exercise set of the user by first-level knowledge point, and then calculating the answer accuracy of the user at three difficulty levels (simple, medium and difficult): (r_ke, r_kn, r_kh), k ∈ {1, 2, ..., t};
generating, for each first-level knowledge point, pushing probabilities at the three levels simple, medium and difficult:
(p_ke, p_kn, p_kh)
wherein p_ke represents the pushing probability of simple exercises, p_kn represents the pushing probability of medium-difficulty exercises, and p_kh represents the pushing probability of difficult exercises;
then generating a t × 3 knowledge point pushing probability matrix M, wherein t is the number of first-level knowledge points of the Chinese knowledge outline;
step 42, traversing the generated exercise candidate queue CEL in sequence, wherein for any exercise e the corresponding knowledge point group is {k_1, k_2, ..., k_m}; selecting the corresponding knowledge point pushing probability according to the difficulty of the exercise e, and aggregating the pushing probabilities of all its knowledge points to obtain the final pushing probability:
Figure GDA0003650128180000042
and determining whether the problem e is added into a pushing problem queue FEL or not based on a probability threshold criterion, finishing traversal when the length of the pushing problem queue FEL meets the requirement, and returning all the problems in the pushing problem queue FEL to recommend to a user.
The invention has the following advantages:
constructing a deep, vertical learner portrait for preparatory students; making full use of the multi-layer semantic information of the exercises, which improves the accuracy and interpretability of exercise pushing; and realizing real-time guidance under the 'i + 1' teaching theory by using real-time interaction data.
Drawings
The invention will be further described with reference to the following examples with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method performed by an embodiment of the present invention;
FIG. 2 is a flow chart of learner representation generation by a user in accordance with an embodiment of the present invention;
FIG. 3 is a partial schematic diagram of a multi-way tree generated based on knowledge points according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a double-sequencing process for a similar problem set SEL according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the present invention is further explained with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The general idea of the invention is as follows: the language family of the student's native language and the user's traces in the existing system serve as the basic data for generating the learner portrait; a knowledge point multi-way tree is constructed from the knowledge point information of the exercises, and the LCA nodes of the knowledge points corresponding to different exercises are calculated to obtain similarity data among the exercises; after the candidate exercise queue is obtained from these two dimensions, a pushing probability is generated for each exercise according to the 'i + 1' theory ('i' represents the learner's current level of language knowledge, and '1' represents the gap between the learner's current state of language knowledge and the next stage), and the final pushing exercise queue is generated. The invention thus introduces an 'i + 1' pushing algorithm for preparatory Chinese exercises based on learner portrait and exercise similarity, including a two-dimensional candidate queue generation algorithm based on learner portrait similarity and exercise similarity, which improves the pertinence and interpretability of exercise pushing. Meanwhile, the pushing probability generation algorithm based on the 'i + 1' teaching theory embodies the idea of layered teaching, effectively improves learners' interest and motivation, and optimizes their learning experience.
An embodiment of the invention discloses a method for pushing preparatory-course Chinese exercises based on learner portrait and exercise similarity which, referring to FIG. 1, comprises the following steps:
S1, constructing a learner portrait based on the language family of the user's native language and the user's exercise history;
S11, mapping the native language information L of a user (such as a preparatory student) to language family data LF over the nine major language families plus the isolated languages, wherein LF ∈ {0,1,2,3,4,5,6,7,8,9}; users whose native languages belong to the same one of the nine major language families are considered to have high similarity in native language family; for a language belonging to the isolated languages (LF = 9), the similarity determination uses the native language itself.
S12, acquiring a historical exercise trajectory data set H of the user, and constructing from it a user exercise trajectory vector X, wherein the length of X is the number of first-level knowledge points in the preparatory students' Chinese knowledge outline, and each value of X is the exercise accuracy under the corresponding first-level knowledge point.
S13, summarizing the user's native language information L, language family data LF and user exercise trajectory data X to form a learner portrait denoted UP = (L, LF, X).
S14, for any two users a and b whose learner portraits are Ua = (L1, LF1, X1) and Ub = (L2, LF2, X2) respectively, the similarity of the learner portraits of users a and b is calculated by the following formula:
Figure GDA0003650128180000061
wherein n is the number of first-level knowledge points in the preparatory students' Chinese knowledge outline, X_1i is the exercise accuracy of user a under the i-th first-level knowledge point of that outline, and X_2i is the exercise accuracy of user b under the i-th first-level knowledge point of that outline;
A flow chart of learner portrait generation is shown in FIG. 2.
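The construction of the tuple (L, LF, X) in S12 and S13 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record format of the history (a pair of first-level knowledge point index and a correct/incorrect flag), the zero-based indexing, the default of 0.0 for untried knowledge points, and the names `LearnerPortrait`, `trajectory_vector` and `build_portrait` are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class LearnerPortrait:
    native_language: str   # L
    language_family: int   # LF, an integer code in 0..9
    accuracy: List[float]  # X, one entry per first-level knowledge point


def trajectory_vector(history: Iterable[Tuple[int, bool]], n: int) -> List[float]:
    """Build X from (first-level knowledge point index, answered correctly) records.

    Assumption: indices run 0..n-1 and a point with no attempts gets 0.0.
    """
    attempts = [0] * n
    correct = [0] * n
    for point, is_correct in history:
        attempts[point] += 1
        correct[point] += int(is_correct)
    return [c / a if a else 0.0 for c, a in zip(correct, attempts)]


def build_portrait(native_language: str, language_family: int,
                   history: Iterable[Tuple[int, bool]], n: int) -> LearnerPortrait:
    """Assemble UP = (L, LF, X) for one user."""
    return LearnerPortrait(native_language, language_family,
                           trajectory_vector(history, n))
```

For instance, a history with one correct and one wrong answer under knowledge point 0 and one correct answer under knowledge point 2 gives X = [0.5, 0.0, 1.0] for n = 3.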
S2, constructing a knowledge point multi-way tree, and then calculating the similarity of the exercises by adopting an LCA mechanism;
S21, constructing a knowledge point multi-way tree T according to the preparatory Chinese knowledge outline, wherein the root node R is a virtual node with depth h = 0; the nodes with depth h = 1 are the first-level knowledge points of the Chinese knowledge outline and, by analogy, lower-level child nodes are lower-level knowledge points of the Chinese knowledge outline; the edge weights d at the same depth of the multi-way tree T are equal and depend on the depth, and the path weight function of the multi-way tree is defined as follows:
$$d(h) = 0.5^{h}$$
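As a short worked derivation, assume the weight function is read as 0.5 raised to the power h, where h is the depth of the node an edge leads into. This reading is an assumption, but it reproduces the values Dist(1) = 0.5 and Dist(1.2) = 0.75 used in the specific example later in the description. The sum of path weights from a node k at depth H to the root is then a finite geometric series:

```latex
\mathrm{Dist}(k) \;=\; \sum_{h=1}^{H} d(h) \;=\; \sum_{h=1}^{H} 0.5^{h} \;=\; 1 - 0.5^{H},
\qquad
H = 0 \Rightarrow 0,\quad H = 1 \Rightarrow 0.5,\quad H = 2 \Rightarrow 0.75 .
```

Under this reading Dist(k) grows with depth but stays below 1, so a deeper common ancestor always yields a larger similarity contribution.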
S22, for any two exercises e_i, e_j (i ≠ j; i, j = 1, 2, ..., q, where q is the number of exercises in the exercise library EL), determining the corresponding knowledge point groups K_ei = {k_ei1, k_ei2, ..., k_ein} and K_ej = {k_ej1, k_ej2, ..., k_ejm}, and calculating the LCA nodes of all knowledge point pairs generated from K_ei and K_ej, wherein the LCA node is defined as follows: for two nodes u, v of the rooted tree T, the nearest common ancestor (LCA) node is the node x such that x is an ancestor of both u and v and the depth of x is as large as possible. Here, a node may also be its own ancestor. The average, over all knowledge point pairs, of the sum of edge weights from the LCA node to the root node is then calculated and taken as the exercise similarity of e_i and e_j. The following are thus defined:
the LCA node of any two knowledge points k_1, k_2:
k_LCA(k_1, k_2)
the sum of the path weights from any node k to the root node:
Dist(k)
the similarity of any two exercises e_i, e_j is given by the following formula:
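$$\mathrm{Simex}(e_i, e_j) = \frac{1}{L_{ei}\, L_{ej}} \sum_{o=1}^{L_{ei}} \sum_{p=1}^{L_{ej}} \mathrm{Dist}\big(k_{LCA(k_{eio},\, k_{ejp})}\big)$$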
wherein k_LCA(k_eio, k_ejp) is the LCA node of the knowledge points k_eio and k_ejp, Dist(k) is the sum of the path weights from node k to the root node, L_ei is the length of the knowledge point group K_ei of exercise e_i, and L_ej is the length of the knowledge point group K_ej of exercise e_j.
S3, generating a candidate exercise queue based on the learner portrait similarity of the user and the exercise similarity;
S31, extracting the wrong-exercise set WE of the user, sorting it by timestamp in descending order, and generating a wrong-exercise queue WEL = [we_1, we_2, ...];
S32, traversing the total exercise library EL; if and only if an exercise e ∈ EL satisfies
Figure GDA0003650128180000081
and some exercise we in the wrong-exercise set WE satisfies Simex(e, we) > 0, the exercise e is added to the similar exercise set SEL, finally giving SEL = {(e_i, r_i), i = 1, 2, ..., q_sel}, wherein r_i is the value Simex(e_i, we) recorded when exercise e_i is added. When several exercises in the wrong-exercise set WE satisfy Simex > 0 with e_i, the highest value is taken as the similarity data r_i recorded on addition.
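The construction of SEL in S32 can be sketched as below. The condition shown only as a formula image in the text is not recoverable here; the sketch assumes it means that e is not already in the wrong-exercise set WE, which is a guess and is marked as such in the code. The function name `build_similar_set` and the `simex` callable are hypothetical.

```python
from typing import Callable, Hashable, Iterable, List, Tuple


def build_similar_set(
    exercise_library: Iterable[Hashable],
    wrong_set: Iterable[Hashable],
    simex: Callable[[Hashable, Hashable], float],
) -> List[Tuple[Hashable, float]]:
    """Return SEL as a list of (exercise, r) pairs, r being the best Simex value."""
    wrong = list(wrong_set)
    wrong_ids = set(wrong)
    similar = []
    for e in exercise_library:
        # ASSUMPTION: the omitted condition is read as "e is not in WE".
        if e in wrong_ids:
            continue
        # Keep the highest similarity to any wrong exercise, as S32 prescribes.
        best = max((simex(e, we) for we in wrong), default=0.0)
        if best > 0:
            similar.append((e, best))
    return similar
```

Taking the maximum inside the loop means each exercise appears at most once in SEL, carrying the single value r_i that the later sorting step uses.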
S33, calculating the similarity between all users in the user base U and the current user a according to the Simstu function, taking 1% of users with the highest similarity as a similar group SU, and acquiring a problem set SUE of the similar group SU;
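Selecting the similar group SU in S33 amounts to ranking users by similarity and keeping the top 1%. The sketch below assumes the similarities to user a have already been computed (the Simstu formula itself is not reproduced here); rounding the group size up and keeping at least one user are assumptions, and `similar_group` is a hypothetical name.

```python
from typing import Dict, Hashable, List


def similar_group(similarity_to_a: Dict[Hashable, float], percent: int = 1) -> List[Hashable]:
    """Return the users most similar to user a, best first.

    similarity_to_a maps every other user to its similarity with user a.
    ASSUMPTION: the group size is rounded up and is at least one user.
    """
    ranked = sorted(similarity_to_a, key=similarity_to_a.get, reverse=True)
    if not ranked:
        return []
    size = max(1, (len(ranked) * percent + 99) // 100)  # integer ceiling
    return ranked[:size]
```

The exercise set SUE would then be the union of the exercises done by the users this function returns.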
S34, performing double-condition sorting on the similar exercise set SEL according to the following sorting rules:
for any two exercises (e1, e2):
(1) when e1, e2 ∈ SUE or e1, e2 ∉ SUE, the two are sorted in descending order of the similarity data recorded when they were added to the similar exercise set SEL;
(2) when one of e1, e2 belongs to SUE and the other does not, the exercise belonging to SUE is placed before the exercise not belonging to SUE;
the result of the double-condition ordering of the similar problem set SEL is the problem candidate queue CEL.
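The two sorting rules can be expressed as one sort with a composite key: membership in SUE first, then recorded similarity in descending order. A minimal sketch, with `candidate_queue` as a hypothetical name and SEL represented as a list of (exercise, similarity) pairs:

```python
from typing import Hashable, List, Set, Tuple


def candidate_queue(similar_set: List[Tuple[Hashable, float]],
                    sue: Set[Hashable]) -> List[Hashable]:
    """Order SEL into CEL: exercises in SUE first, then by similarity, high to low."""
    ordered = sorted(
        similar_set,
        # False sorts before True, so exercises in SUE come first;
        # negating the similarity gives descending order within each group.
        key=lambda item: (item[0] not in sue, -item[1]),
    )
    return [exercise for exercise, _ in ordered]
```

With SEL = [("a", 0.2), ("b", 0.9), ("c", 0.5), ("d", 0.7)] and SUE = {"a", "c"}, the resulting queue is c, a, b, d: the two exercises done by similar users lead, each group ordered by similarity.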
S4, traversing the candidate exercise queue based on the user interaction data to generate a pushing exercise queue;
S41, classifying the historical exercise set of the user by first-level knowledge point, and then calculating the answer accuracy at the three difficulty levels simple, medium and difficult: (r_ke, r_kn, r_kh), where k ∈ {1, 2, ..., t} and t is the number of first-level knowledge points of the Chinese knowledge outline;
according to the 'i + 1' teaching theory, for the knowledge point numbered k (k ∈ {1, 2, ..., t}), simple exercises should be pushed with high probability when the user's accuracy on simple exercises is low; the pushing probability p_ke of simple exercises is therefore set as:
Figure GDA0003650128180000091
for medium-difficulty exercises, the pushing probability should be high when the user's accuracy on simple exercises is high and the accuracy on difficult exercises is low; the pushing probability p_kn of medium-difficulty exercises is therefore set as:
Figure GDA0003650128180000092
for difficult exercises, the pushing probability should be high when the user's accuracy on simple and medium-difficulty exercises is high; the pushing probability p_kh of difficult exercises is therefore set as:
Figure GDA0003650128180000093
thereby generating the pushing probabilities of the knowledge point numbered k at the three levels simple, medium and difficult:
(p_ke, p_kn, p_kh)
and finally generating a t × 3 knowledge point pushing probability matrix M, wherein t is the number of first-level knowledge points of the Chinese knowledge outline.
S42, traversing the exercise candidate queue CEL generated in S3 in sequence, wherein for any exercise e the corresponding knowledge point group is {k_1, k_2, ..., k_m}; selecting the corresponding knowledge point pushing probability according to the difficulty (simple, medium or difficult) of the exercise e, and aggregating the pushing probabilities of all corresponding knowledge points to obtain the final pushing probability:
Figure GDA0003650128180000094
The closer the value of p is to 1, the better the exercise fits pushing under the 'i + 1' teaching theory. Whether exercise e joins the final push queue FEL is decided by a probability threshold criterion: a random number between 0 and 1 is generated, and when it is smaller than the final pushing probability p of exercise e, the exercise is added to the final push queue. When the length of the final queue meets the requirement, the traversal ends and all exercises in the final push queue FEL are returned and recommended to the user.
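The probability threshold criterion can be sketched as a single pass over CEL. The final pushing probability p is supplied here by a caller-provided function, because its formula appears only as an image in the text; `build_push_queue`, `push_probability` and `target_length` are hypothetical names.

```python
import random
from typing import Callable, Hashable, Iterable, List, Optional


def build_push_queue(
    candidates: Iterable[Hashable],
    push_probability: Callable[[Hashable], float],
    target_length: int,
    rng: Optional[random.Random] = None,
) -> List[Hashable]:
    """Traverse CEL in order and accept each exercise with probability p."""
    if rng is None:
        rng = random.Random()
    queue: List[Hashable] = []
    for exercise in candidates:
        if len(queue) >= target_length:
            break  # the queue length requirement is met
        # Accept when a uniform draw in [0, 1) falls below p.
        if rng.random() < push_probability(exercise):
            queue.append(exercise)
    return queue
```

Because the draw lies in [0, 1), an exercise with p = 1 is always accepted and one with p = 0 never is; passing a seeded `random.Random` makes a run reproducible.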
The following is a specific example:
The first step: learner portrait generation
The learner portrait requires the language family as explicit data and the user's exercise records as implicit data; the process of generating the learner portrait for the student numbered #0 is as follows:
(1) Language family data:
The user's native language is mapped to a language family, using the nine major language families commonly recognized in linguistics plus the isolated languages as the mapping result set, to obtain the language family data LF. The correspondence between languages and language family data adopted in this embodiment is shown in Table 1:
Table 1: correspondence between common languages and the nine major language families
Figure GDA0003650128180000101
Figure GDA0003650128180000111
Figure GDA0003650128180000121
Figure GDA0003650128180000131
Figure GDA0003650128180000141
(2) User exercise trajectory data
A historical exercise trajectory data set H of the user is acquired and used to construct the user exercise trajectory vector X, wherein the length of X is the number of first-level knowledge points in the preparatory Chinese knowledge outline, and each value of X is the exercise accuracy under the corresponding first-level knowledge point.
(3) Construction of learner portrait data tuples
The learner's native language information L, the language family data LF and the user exercise trajectory data X are summarized to form the learner portrait (L, LF, X).
The specific steps for generating a learner portrait for the user numbered #0 are as follows:
First, the mapping of the native language information of user #0 to a language family is as follows:
L#0 = English
LF#0 = 0
Then, part of the user's exercise trajectory data is shown in Table 2, where "question knowledge point group" denotes the knowledge point information corresponding to the question, and the elements of a group are separated by ",". For example, {1.6.2,2.2.2,4.7} refers to the three knowledge points numbered 1.6.2, 2.2.2 and 4.7.
Table 2: partial exercise trajectory information of user #0
Figure GDA0003650128180000142
Figure GDA0003650128180000151
Finally, the learner portrait generated for user #0 is:
UP#0 = {English, 0, [0.9339195828342755,0.7251881382959092,0.5208145598192523,0.8616284113349806,0.6810263255880479,0.5914522872981305,0.3744961440723096,0.7861996558752264,0.9296202080857866,0.8481496339940807]}
The second step: calculating exercise similarity based on the knowledge point multi-way tree and the LCA mechanism
(1) A knowledge point multi-way tree T is constructed according to the pre-science Chinese knowledge outline. The root node R is a virtual node with depth 0. The nodes at depth 1 are the first-level knowledge points of the Chinese knowledge outline, and so on, with each lower layer of child nodes holding the next lower level of knowledge points. For any two knowledge points k_i, k_j (i ≠ j; i, j = 1, 2, ..., s), the edge weight d_ij between the two nodes is constructed. Edges at the same depth of the multi-way tree T have the same weight, and the weight depends on the depth; a schematic diagram of the multi-way tree is shown in FIG. 3.
(2) For any two exercises e_i, e_j (i ≠ j; i, j = 1, 2, ..., q, where q is the number of exercises in the exercise library EL), the corresponding knowledge point groups K_ei = {k_ei1, k_ei2, ..., k_ein} and K_ej = {k_ej1, k_ej2, ..., k_ejm} are determined from the exercise numbers. The LCA nodes of all knowledge point pairs generated from K_ei and K_ej are computed, and then the edge weight d_ij^LCA from the LCA nodes to the root node, which is the exercise similarity of e_i and e_j.
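As a worked form of the depth-dependent weights, the following assumes the weight function of claim 2 is read as d(h) = 0.5^h, that is, an edge entering depth h has weight 0.5^h. This reading is an assumption, but it agrees with the values Dist(1) = 0.5 and Dist(1.2) = 0.75 used in the example of this step. Here h_k is introduced to denote the depth of node k.

```latex
\mathrm{Dist}(k) \;=\; \sum_{h=1}^{h_k} d(h) \;=\; \sum_{h=1}^{h_k} 0.5^{h} \;=\; 1 - 0.5^{h_k},
\qquad
\mathrm{Dist}(R) = 0,\quad
\mathrm{Dist}\big|_{h_k=1} = 0.5,\quad
\mathrm{Dist}\big|_{h_k=2} = 0.75 .
```

Under this reading the similarity of two exercises is bounded below 1 and grows as their knowledge points share deeper ancestors.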
Calculating the similarity of the exercises with IDs 413 and 424:
First, the knowledge point group of exercise e#413 (ID 413) is K#413 = {6.1.4.1, 1.2}, and the knowledge point group of exercise e#424 (ID 424) is K#424 = {1.1.2, 1.2.4}. All the knowledge point pairs are (6.1.4.1, 1.1.2), (6.1.4.1, 1.2.4), (1.2, 1.1.2) and (1.2, 1.2.4).
Secondly, the distance from the LCA node of each knowledge point pair to the root node is calculated:
the LCA node of (6.1.4.1, 1.1.2) is 0, Dist(0) = 0;
the LCA node of (6.1.4.1, 1.2.4) is 0, Dist(0) = 0;
the LCA node of (1.2, 1.1.2) is 1, Dist(1) = 0.5;
the LCA node of (1.2, 1.2.4) is 1.2, Dist(1.2) = 0.75.
Finally, the average of the distances from the LCA node to the root node over all knowledge point pairs is calculated and taken as the similarity of exercises e_i, e_j:
Figure GDA0003650128180000161
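The worked example can be reproduced with a short sketch. It assumes that the dotted knowledge point number encodes the path from the root (so the LCA of two knowledge points is their longest common numeric prefix, and the empty prefix is the virtual root) and that an edge entering depth h has weight 0.5^h. The function names are hypothetical.

```python
from itertools import product
from typing import Sequence


def lca_depth(a: str, b: str) -> int:
    """Depth of the lowest common ancestor of two dotted knowledge points."""
    depth = 0
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        depth += 1
    return depth  # 0 means the LCA is the virtual root


def dist(depth: int) -> float:
    """Sum of edge weights from the root down to a node at the given depth."""
    return sum(0.5 ** h for h in range(1, depth + 1))


def simex(group_i: Sequence[str], group_j: Sequence[str]) -> float:
    """Average root-to-LCA distance over all knowledge point pairs."""
    pairs = list(product(group_i, group_j))
    return sum(dist(lca_depth(a, b)) for a, b in pairs) / len(pairs)


# Exercises 413 and 424 from the example: (0 + 0 + 0.5 + 0.75) / 4
print(simex(["6.1.4.1", "1.2"], ["1.1.2", "1.2.4"]))  # 0.3125
```

Two exercises whose knowledge points all fall under different first-level points get similarity 0 under this sketch.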
The third step: generating a candidate exercise queue by fusing learner portrait similarity and exercise similarity
The user's wrong-question set WE is extracted and sorted by timestamp from largest to smallest, generating the wrong-question queue WEL = [we_1, we_2, ...].
The total exercise library EL is traversed. If and only if an exercise e ∈ EL satisfies
Figure GDA0003650128180000162
and an exercise we can be found in the wrong-question set WE such that Simex(e, we) > 0, the exercise is added to the similar exercise set SEL, finally giving SEL = {(e_i, r_i), i = 1, 2, ..., q_sel}, where r_i is the value of Simex(e_i, we) when exercise e_i joins the queue. When several exercises in the wrong-question set WE satisfy Simex > 0 with e_i, the highest value is taken as the similarity r_i recorded on joining the queue. All users in the user library U are then traversed, and the top 1% of users most similar to user a form the similar user group SU.
Using the Simstu function, the similarity between every user in the user library U and the current user a is calculated; the 1% of users with the highest similarity are taken as the similar group SU, and the exercise set SUE of the similar group SU is obtained.
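A minimal sketch of the two selections described above follows. The similarity functions are passed in as callables rather than defined. The additional membership condition on e, which the source gives only as a formula image, is represented by a hypothetical `is_eligible` predicate that accepts everything by default. Rounding the 1% up with a minimum of one user and excluding the current user are also assumptions.

```python
import math
from typing import Callable, Dict, Hashable, Iterable, List, Sequence


def build_sel(
    exercise_library: Iterable[Hashable],
    wrong_set: Sequence[Hashable],
    simex: Callable[[Hashable, Hashable], float],
    is_eligible: Callable[[Hashable], bool] = lambda e: True,
) -> Dict[Hashable, float]:
    """Map each similar exercise e_i to r_i, its highest Simex against WE."""
    sel = {}
    for e in exercise_library:
        if not is_eligible(e):  # placeholder for the condition in the figure
            continue
        best = max((simex(e, we) for we in wrong_set), default=0.0)
        if best > 0:
            sel[e] = best
    return sel


def similar_group(
    current: Hashable,
    users: Iterable[Hashable],
    simstu: Callable[[Hashable, Hashable], float],
    fraction: float = 0.01,
) -> List[Hashable]:
    """Return the top fraction of users ranked by similarity to `current`."""
    others = [u for u in users if u != current]
    if not others:
        return []
    ranked = sorted(others, key=lambda u: simstu(current, u), reverse=True)
    size = max(1, math.ceil(len(others) * fraction))
    return ranked[:size]
```

The exercise set SUE would then be gathered from the exercise records of the users returned by `similar_group`.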
Double-condition sorting is performed on the similar exercise set SEL to generate the exercise candidate queue CEL; the detailed process of the double-condition sorting is shown in FIG. 4.
The specific steps for generating the candidate exercise queue CEL for user #0 are as follows:
First, the wrong-question set WE#0 of user #0 is acquired. Part of the user's wrong-question information is shown in the following table:
Table 3: partial wrong-question information of user #0
Topic ID Time stamp
412 1580039206
416 1590283268
417 1584495190
419 1583084544
424 1582961296
958 1586444357
762 1588219986
530 1582961296
785 1586444357
949 1588692389
After sorting by timestamp from largest to smallest, the following queue is generated:
WEL#0 = [412,416,762,958,785,417,419,424,530,1772,1986,949,1998,1975,1560,1205,1780,1273,1265,1572,1707,1492,1617,1595,1051,1910,1114,1764,1233,1845…]
The total exercise library EL is traversed, and all pairs (e_i, r_i) that satisfy the conditions are added to the similar exercise set SEL#0:
SEL#0 = {(5910,0.6922667),(4753,0.5354832),(2303,0.08529),(100,0.546306),(1551,0.633632),(4743,0.9098956),(1638,0.9329645),(4488,0.0512871),(2516,0.6642648),(7398,0.0963061),…}
The similar group SU#0 of user #0 is then obtained. Part of the similar group's user information is shown in the following table:
Table 4: partial similar group user information of user #0
Figure GDA0003650128180000181
Figure GDA0003650128180000191
The exercise set of the similar group SU#0 is obtained:
SUE#0={7203,15976,100,17615,24555,13904,14894,20497,233,24707,6001,27445,3917,23742,22673,26421…}
Finally, double-condition sorting is performed on the similar exercise set SEL#0. The ordering of two specific pairs of elements of SEL#0 proceeds as follows:
for (5910, 0.6922667) and (100, 0.546306):
the exercise with ID 5910 does not belong to SUE#0 and the exercise with ID 100 belongs to SUE#0, so the exercise with ID 100 is placed before the exercise with ID 5910;
for (5910, 0.6922667) and (2303, 0.08529):
neither the exercise with ID 5910 nor the exercise with ID 2303 belongs to SUE#0, so, because 0.6922667 is greater than 0.08529, the exercise with ID 5910 is placed before the exercise with ID 2303;
The exercise candidate queue finally generated is:
CEL#0=[5769,3921,9135,5804,5112,1121,4936,7477,4645,100,3939,8843,2179,2838,1582,4628,2048,9172,8493,7160…]
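The double-condition ordering can be expressed as a single sort with a two-part key, as in the sketch below. The function name is hypothetical, and the input is limited to the three pairs discussed in the example, so the output is not the full CEL#0 listed above.

```python
from typing import Iterable, List, Set, Tuple


def double_condition_sort(
    sel: Iterable[Tuple[int, float]], sue: Set[int]
) -> List[int]:
    """Order exercises: members of SUE first, then by similarity descending."""
    ordered = sorted(sel, key=lambda item: (item[0] not in sue, -item[1]))
    return [exercise_id for exercise_id, _ in ordered]


sel_sample = [(5910, 0.6922667), (2303, 0.08529), (100, 0.546306)]
sue_sample = {7203, 15976, 100}
print(double_condition_sort(sel_sample, sue_sample))  # [100, 5910, 2303]
```

The key puts `False` (belongs to SUE) before `True`, and the negated similarity orders each of the two groups from high to low.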
The fourth step: generating the "i + 1" push exercise queue in real time based on user interaction data
The user's historical exercise set is classified by first-level knowledge point, and the answer accuracy (r_ke, r_kn, r_kh) is then calculated for the three difficulty levels simple, medium and difficult (k ∈ {1, 2, ..., t}, where t is the number of first-level knowledge points of the Chinese knowledge outline).
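The accuracy triple per first-level knowledge point can be computed as in the sketch below. The record layout (first-level point, difficulty label, correctness) and the default accuracy of 0.0 for a difficulty level with no attempts are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

DIFFICULTIES = ("simple", "medium", "difficult")


def accuracy_by_difficulty(
    history: Iterable[Tuple[int, str, bool]]
) -> Dict[int, Tuple[float, ...]]:
    """Return {k: (r_ke, r_kn, r_kh)} from (k, difficulty, correct) records."""
    # counts[k][difficulty] = [number correct, number attempted]
    counts = defaultdict(lambda: {d: [0, 0] for d in DIFFICULTIES})
    for k, difficulty, is_correct in history:
        cell = counts[k][difficulty]
        cell[0] += int(is_correct)
        cell[1] += 1
    return {
        k: tuple(
            cells[d][0] / cells[d][1] if cells[d][1] else 0.0
            for d in DIFFICULTIES
        )
        for k, cells in counts.items()
    }


records = [
    (1, "simple", True), (1, "simple", True), (1, "simple", False),
    (1, "medium", True), (1, "medium", False),
]
print(accuracy_by_difficulty(records)[1])
```

For the sample records the simple level has two correct answers out of three and the medium level one out of two, with no difficult attempts.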
According to the "i + 1" teaching theory, push probabilities at the three levels simple, medium and difficult are generated for each first-level knowledge point, finally giving a t × 3 knowledge point push probability matrix M, where t is the number of first-level knowledge points of the Chinese knowledge outline.
The exercise candidate queue CEL generated in the third step is traversed in order. For any exercise e with corresponding knowledge point group {k_1, k_2, ..., k_m}, the push probability of each knowledge point at the difficulty of e is selected from the knowledge point push probability matrix M, and these are combined into the final push probability. When an exercise meets the push condition, it is added to the final push queue FEL; when the length of the final push queue FEL meets the requirement, the traversal ends and all exercises in the final push queue are returned.
First, the historical exercise set of user #0 is classified by first-level knowledge point. Under the first-level knowledge point "part of speech", the user's accuracy at the simple, medium and difficult levels is (0.66, 0.5, 0.3);
According to the "i + 1" teaching theory, the push probability of simple exercises under the first-level knowledge point "part of speech" is:
Figure GDA0003650128180000201
the push probability of medium exercises under the first-level knowledge point "part of speech" is:
Figure GDA0003650128180000202
the push probability of difficult exercises under the first-level knowledge point "part of speech" is:
Figure GDA0003650128180000203
The push probabilities of user #0 at the simple, medium and difficult levels under the first-level knowledge point "part of speech" are therefore:
(0.6,0.68,0.58)
By analogy, push probabilities at the three difficulty levels are generated for all nine first-level knowledge points, finally giving the knowledge point push probability matrix M#0 of user #0, of size 9 × 3.
Figure GDA0003650128180000211
For the exercise with ID 100 in the exercise candidate queue CEL#0, with knowledge point group K#100 = {7.1.4.1, 1.2, 3.9.7} and medium difficulty, the final push probability is:
Figure GDA0003650128180000212
Finally, a random number ran in the range 0 to 1 is generated. If ran ≤ 0.68, the exercise with ID 100 is added to the final push queue FEL#0; otherwise it is not added. The remaining exercises in the candidate queue CEL#0 are then selected in turn until the length of FEL#0 meets the length requirement of the current push.
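The traversal with a random draw per exercise can be sketched as follows. The aggregation of knowledge point probabilities into a final push probability is given in the source only as a formula image, so it is represented here by a hypothetical `final_probability` callable; the `rng` parameter is likewise an illustrative addition that makes the draw reproducible.

```python
import random
from typing import Callable, Iterable, List, Optional


def build_push_queue(
    cel: Iterable[int],
    final_probability: Callable[[int], float],
    length: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Walk CEL in order and keep an exercise when ran <= its probability."""
    rng = rng or random.Random()
    fel: List[int] = []
    for exercise_id in cel:
        if len(fel) >= length:
            break  # the push queue has reached the required length
        ran = rng.random()  # uniform draw in [0, 1)
        if ran <= final_probability(exercise_id):
            fel.append(exercise_id)
    return fel


# Usage with made-up probabilities; only 0.68 for ID 100 comes from the text.
probabilities = {5769: 0.9, 3921: 0.2, 100: 0.68}
queue = build_push_queue([5769, 3921, 100], probabilities.get, 2,
                         rng=random.Random(0))
print(queue)
```

If the candidate queue is exhausted before the required length is reached, this sketch simply returns the shorter queue; the source does not say how that case is handled.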
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims (4)

1. A pre-science Chinese exercise pushing method based on the similarity of learner portraits and exercises, characterized by comprising the following steps:
step 10, constructing a learner portrait based on the user's native-language family and the user's historical exercise trajectory;
step 20, constructing a knowledge point multi-way tree, and then calculating the similarity of the exercises by adopting an LCA mechanism;
step 30, generating a candidate exercise queue based on the user's learner portrait and the exercise similarity;
step 40, traversing the candidate exercise queue based on the user interaction data to generate a push exercise queue;
wherein the step 10 specifically comprises:
step 11, mapping the native language information L of the user to the nine major language families and the isolated language family, giving LF, wherein LF ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
step 12, acquiring the user's historical exercise trajectory data set H and constructing the user exercise trajectory vector X, wherein the length of X is the number of first-level knowledge points of the pre-science Chinese knowledge outline, and each value of X is the answer accuracy for the corresponding first-level knowledge point;
step 13: combining the user's native language information L, language family data LF and user exercise trajectory vector X to form the learner portrait UP = (L, LF, X);
step 14: for any two users a and b with learner portraits U_a = (L1, LF1, X1) and U_b = (L2, LF2, X2), the learner portrait similarity of users a and b is calculated by the following formula:
Figure FDA0003650128170000011
wherein n is the number of first-level knowledge points of the pre-science Chinese knowledge outline, X1_i is the answer accuracy of user a on the i-th first-level knowledge point of the pre-science Chinese knowledge outline, and X2_i is the answer accuracy of user b on the i-th first-level knowledge point of the pre-science Chinese knowledge outline.
2. The method of claim 1, wherein: the step 20 specifically includes:
step 21, constructing a knowledge point multi-way tree T according to the pre-science Chinese knowledge outline, wherein the root node R of the knowledge point multi-way tree T is a virtual node with depth h = 0, the nodes at depth h = 1 correspond to the first-level knowledge points of the Chinese knowledge outline, and lower-level child nodes correspond to lower-level knowledge points of the Chinese knowledge outline; the path weight function of the knowledge point multi-way tree T is defined as:
d(h) = 0.5^h
step 22, for any two exercises e_i, e_j (i ≠ j; i, j = 1, 2, ..., q, where q is the number of exercises in the exercise library EL), determining from the exercise numbers the corresponding knowledge point groups K_ei = {k_ei1, k_ei2, ..., k_ein} and K_ej = {k_ej1, k_ej2, ..., k_ejm}, and calculating the LCA nodes of all knowledge point pairs generated from K_ei and K_ej, wherein, for two nodes u and v of a rooted tree T, the lowest common ancestor (LCA) node is the node x such that x is an ancestor of both u and v and the depth of x is as large as possible; calculating the average of the edge weights from the LCA node to the root node over all knowledge point pairs to obtain the exercise similarity of e_i and e_j, by the following formula:
Figure FDA0003650128170000021
wherein k_LCA(k_eio, k_ejp) is the LCA node of the knowledge points k_eio and k_ejp, Dist(k) is the sum of path weights from node k to the root node, L_ei is the length of the knowledge point group K_ei corresponding to exercise e_i, and L_ej is the length of the knowledge point group K_ej corresponding to exercise e_j.
3. The method of claim 1, wherein: the step 30 specifically includes:
step 31, extracting the user's wrong-question set WE, then sorting it by timestamp to generate the wrong-question queue WEL = [we_1, we_2, ...];
step 32, traversing the total exercise library EL, and, when any exercise e in the total exercise library EL satisfies
Figure FDA0003650128170000022
and an exercise we can be found in the wrong-question set WE such that Simex(e, we) > 0, adding the exercise e to the similar exercise set SEL to obtain SEL = {(e_i, r_i), i = 1, 2, ..., q_sel}, wherein r_i is the value of Simex(e_i, we) when exercise e_i joins the queue; when several exercises in the wrong-question set WE satisfy Simex > 0 with e_i, the highest value is taken as the similarity r_i recorded on joining the queue;
step 33, calculating the similarity between all users in the user library U and the current user a, sorting according to the similarity value from large to small, taking the first 1% of users as a similar group SU, and then acquiring a problem set SUE of the similar group SU;
and step 34, performing double-condition sequencing on the similar problem set SEL according to the following sequencing rule:
for any two exercises (e1, e2):
when e1, e2 ∈ SUE or e1, e2 ∉ SUE, the two are sorted from high to low by the similarity recorded when they were added to the similar exercise set SEL;
when one of the exercises e1, e2 belongs to the SUE and the other does not, the exercise belonging to the SUE is placed before the exercise not belonging to the SUE;
the result of the double-condition sorting of the similar exercise set SEL is the exercise candidate queue CEL.
4. The method of claim 1, wherein: the step 40 specifically includes:
step 41, classifying the user's historical exercise set by first-level knowledge point, and then calculating the user's answer accuracy (r_ke, r_kn, r_kh), k ∈ {1, 2, ..., t}, for the three difficulty levels simple, medium and difficult;
generating, for each first-level knowledge point, push probabilities at the three levels simple, medium and difficult:
(p_ke, p_kn, p_kh)
wherein p_ke is the push probability of simple exercises, p_kn is the push probability of medium exercises, and p_kh is the push probability of difficult exercises;
then generating a t × 3 knowledge point push probability matrix M, wherein t is the number of first-level knowledge points of the Chinese knowledge outline;
step 42, traversing the generated exercise candidate queue CEL in order, wherein for any exercise e the corresponding knowledge point group is {k_1, k_2, ..., k_m}; selecting the corresponding knowledge point push probabilities according to the difficulty of exercise e, and aggregating the push probabilities of all its knowledge points to obtain the final push probability:
Figure FDA0003650128170000041
and determining whether the problem e is added into a pushing problem queue FEL or not based on a probability threshold criterion, finishing traversal when the length of the pushing problem queue FEL meets the requirement, and returning all problems in the pushing problem queue FEL to recommend to the user.
CN202011408926.6A 2020-12-04 2020-12-04 Pre-science Chinese exercise pushing method based on similarity of images and exercises of learners Active CN112347366B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011408926.6A CN112347366B (en) 2020-12-04 2020-12-04 Pre-science Chinese exercise pushing method based on similarity of images and exercises of learners

Publications (2)

Publication Number Publication Date
CN112347366A CN112347366A (en) 2021-02-09
CN112347366B true CN112347366B (en) 2022-07-08

Family

ID=74427416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011408926.6A Active CN112347366B (en) 2020-12-04 2020-12-04 Pre-science Chinese exercise pushing method based on similarity of images and exercises of learners

Country Status (1)

Country Link
CN (1) CN112347366B (en)

Also Published As

Publication number Publication date
CN112347366A (en) 2021-02-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant