CN108614865B - Personalized learning recommendation method based on deep reinforcement learning - Google Patents
- Publication number
- CN108614865B (application CN201810307140.1A)
- Authority
- CN
- China
- Prior art keywords
- user
- learning
- knowledge points
- nodes
- topic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
Abstract
The invention discloses a personalized learning recommendation method based on deep reinforcement learning, comprising the following steps: define difficulty attributes for knowledge points and topics, and construct a knowledge-point network graph from the relations between knowledge points; determine the relations between the topics under those knowledge points and construct a topic network graph; from the user behavior data, obtain the sub-graph of the topic network graph corresponding to a specified user's current state, which serves as the user's learning boundary; then model the problem with a deep reinforcement learning algorithm over the user's history, and train a policy for selecting a cut set from the sub-graph of the user's current state. The method can intelligently recommend the most suitable topics to the user, saving learning time, raising learning efficiency, and improving the learning experience.
Description
Technical Field
The invention relates to the field of personalized learning recommendation research, in particular to a personalized learning recommendation method based on deep reinforcement learning.
Background
With the release of more and more internet education platforms, online learning resources have been greatly enriched: users can study anytime and anywhere, and can take tests at any time, a convenience whose value is self-evident. However, learning outcomes are strongly affected by differences among students in ability, interest, and learning style, and undifferentiated teaching suffers from low learning efficiency and an inability to teach students according to their aptitude. The American scholar Noel Tichy proposed that the ideal state for learning is the "stretch zone", in which the material being studied is appropriately challenging. Mining a user's learning behavior to find the topics in this zone is therefore highly significant for making recommendations that serve the user's learning process. Moreover, with the popularization of internet learning platforms, the resources best matched to a user's cognitive level can be presented quickly, and personalized recommendation amounts to finding the most suitable topics for each student in a vast pool of questions. The spread of these platforms and the growth of their user bases have also accumulated ever more online learning behavior data. How to use this behavior data to recommend materials or topics suited to each user, and thereby improve the learning experience, has become a current research hotspot.
Existing research models a user's behavior data and recommends personalized topics accordingly. Such methods, however, tend to ignore the information contained in user behavior, make poor use of resources, and produce unstable, low-precision recommendations.
Disclosure of Invention
The invention aims to overcome the inability of the prior art to perform truly personalized recommendation, and provides a personalized learning recommendation method based on deep reinforcement learning.
The purpose of the invention is achieved by the following technical scheme. The personalized learning recommendation method based on deep reinforcement learning comprises the following steps:
(1) defining difficulty attributes for knowledge points and topics, and constructing a knowledge-point network graph from the relations between knowledge points;
(2) determining the relations between the topics under the knowledge points from the relations between the knowledge points, and constructing a topic network graph;
(3) obtaining, from the user behavior data, the sub-graph of the topic network graph corresponding to a specified user's current state;
(4) modeling with a deep reinforcement learning algorithm over the user history, and training a policy for selecting a cut set from the sub-graph of the user's current state, i.e., the user's "learning area".
Preferably, in step (1), the difficulty attribute value of a knowledge point is defined by experts or by modeling on user data, and the difficulty attribute of a topic is defined from the difficulty attribute value of the knowledge point it belongs to together with the topic's own difficulty, the latter likewise given by experts or by modeling on user data.
Preferably, in step (1), the knowledge-point network graph takes knowledge points as nodes and their difficulty attribute values as node difficulty values; edges are established according to the relations between knowledge points, with the degree of relation between them used as the edge weight, the relations being supplied by experts or derived from user data.
Preferably, in step (2), the topic network graph takes the topics under each knowledge point as nodes; a topic's difficulty attribute value becomes the node's topic-difficulty value, and the difficulty attribute value of its knowledge point becomes the node's knowledge-point-difficulty value. Edges are established between topics under connected knowledge points and between topics under the same knowledge point, with the degree of relation between topics used as the edge weight.
Preferably, in step (3), the sub-graph of the user's current state is constructed as follows: using the user behavior data, find the forward or backward nodes of the answered topic nodes in the topic network graph; the found nodes, their edges, and the edge weights together form the sub-graph of the user's current state.
Preferably, in step (4), a deep reinforcement learning model is constructed in which the user's historical answer records form the state, topic-selection strategies based on the difficulty attributes of the nodes in the user's current-state sub-graph form the action set, and the return value is determined by the number of correct answers. Deep reinforcement learning training is performed over a certain number of answering rounds, and a cut set is selected from the sub-graph of the user's current state; this cut set contains the "learning area" topics used for personalized recommendation.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The method models the user's learning behavior and applies a deep reinforcement learning algorithm to it, obtaining the user's learning area, so that the topics finally recommended match the user's level; the user can then answer more accurately and learn more efficiently.
2. Based on a complex network graph, the invention locates the topics associated with the user's historical behavior in the topic network graph, making full use of that history to mine the effective information in the user's behavior.
3. When constructing the deep reinforcement learning model, the user's behavior sequence is used for modeling: training proceeds over a certain number of answers, the user's latest answer record is taken as the state after each answer, and the state is updated after every answer, so that the chosen state effectively reflects the user's individuality.
4. The method can intelligently select the user's learning area: a deep reinforcement learning algorithm learns a policy for personalized topic recommendation, so that topics within the learning area are recommended intelligently and the user experience improves.
Drawings
Fig. 1 is a schematic diagram of the principle of the method of this embodiment: (a) the knowledge-point network graph structure; (b) the topic network structure under the same knowledge point; (c) the topic network structure under associated knowledge points; (d) the structure of selected user behavior data in the topic network graph; (e) the forward and backward nodes found for the answered topic nodes in the topic network graph; (f) the structure of the sub-graph of the user's current state; (g) the obtained "learning area" topics.
FIG. 2 is a process diagram of the deep reinforcement learning training of the present invention.
Fig. 3 shows the relationships between data, operations, and other elements in the implementation of the method of this embodiment.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Examples
This embodiment provides a personalized learning recommendation method based on deep reinforcement learning. Complex network graphs are used to represent the relations between knowledge points (the knowledge-point network graph) and between topics (the topic network graph). From the user behavior data, the sub-graph of the topic network graph corresponding to the user's current state is obtained, which converts the problem of finding the learning area into the problem of finding a cut set in that sub-graph. A deep reinforcement learning algorithm then models the user behavior data and trains a policy for selecting the cut set from the current-state sub-graph, realizing personalized learning recommendation for the user. The steps are described below with reference to the drawings.
First, define the difficulty attributes of knowledge points and topics, and construct the knowledge-point network graph from the relations between knowledge points.
In practice, the difficulty attributes of knowledge points and topics can be preset by an experienced teacher based on teaching experience, or generated from users' historical data; a topic's difficulty attribute is defined by combining the difficulty attribute value of its knowledge point with the topic's own difficulty, both relying on experts or on user-data modeling.
In the constructed knowledge-point network graph, knowledge points are the nodes and their difficulty attribute values are the node difficulty values; edges are established according to the relations between knowledge points, with the degree of relation as the edge weight. The structure of the constructed knowledge-point network graph is shown in fig. 1(a).
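As a concrete illustration, the knowledge-point network graph described above can be sketched as a weighted adjacency structure. This is a minimal sketch using plain Python dictionaries; the knowledge-point names, difficulty values, and relation degrees below are invented for illustration and are not taken from the patent.

```python
# Minimal sketch of the knowledge-point network graph: knowledge points are
# nodes carrying a difficulty attribute value, and the relation degree
# between two knowledge points becomes the weight of their edge.
# All concrete names and numbers below are hypothetical.

class KnowledgePointGraph:
    def __init__(self):
        self.difficulty = {}   # knowledge point -> difficulty attribute value
        self.edges = {}        # knowledge point -> {related point: relation degree}

    def add_point(self, name, difficulty):
        self.difficulty[name] = difficulty
        self.edges.setdefault(name, {})

    def relate(self, a, b, degree):
        # the relation degree is stored as the edge weight, in both directions
        self.edges[a][b] = degree
        self.edges[b][a] = degree

kpg = KnowledgePointGraph()
kpg.add_point("fractions", 0.3)
kpg.add_point("ratios", 0.5)
kpg.relate("fractions", "ratios", 0.8)
```

Whether the difficulty values and relation degrees come from experts or from user-data modeling, as the patent allows, only changes how `add_point` and `relate` are fed, not the graph structure.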
Second, determine the relations between the topics under the knowledge points from the relations between the knowledge points, and construct the topic network graph.
In this embodiment, the topic network graph takes the topics under each knowledge point as nodes; a topic's difficulty attribute value is the node's topic-difficulty value, and the difficulty attribute value of its knowledge point is the node's knowledge-point-difficulty value. Edges are established between topics under connected knowledge points and between topics under the same knowledge point, with the degree of relation between topics as the edge weight. The resulting structures are shown in fig. 1(b) and fig. 1(c), where fig. 1(b) shows the topic network structure under the same knowledge point and fig. 1(c) shows the topic network structure under associated knowledge points.
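Deriving the topic network graph from the knowledge-point graph can be sketched as follows. The knowledge points, topics, difficulty values, and the relation degree of 1.0 for topics under the same knowledge point are all assumptions invented for illustration; the patent leaves the actual relation degrees to experts or user data.

```python
# Hypothetical derivation of the topic network graph. Topics under the same
# knowledge point are connected directly; topics under connected knowledge
# points inherit the knowledge-point relation degree as their edge weight.

kp_difficulty = {"fractions": 0.3, "ratios": 0.5}
kp_edges = {("fractions", "ratios"): 0.8}                  # relation degrees
topic_kp = {"t1": "fractions", "t2": "fractions", "t3": "ratios"}
topic_difficulty = {"t1": 0.2, "t2": 0.4, "t3": 0.6}

# each node carries its own difficulty and its knowledge point's difficulty
nodes = {t: {"topic_difficulty": topic_difficulty[t],
             "kp_difficulty": kp_difficulty[kp]}
         for t, kp in topic_kp.items()}

edges = {}
for a in topic_kp:
    for b in topic_kp:
        if a >= b:                                         # visit each pair once
            continue
        if topic_kp[a] == topic_kp[b]:
            edges[(a, b)] = 1.0                            # same knowledge point
        elif (topic_kp[a], topic_kp[b]) in kp_edges:
            edges[(a, b)] = kp_edges[(topic_kp[a], topic_kp[b])]
```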
Third, obtain the sub-graph of the user's current state in the topic network graph from the user behavior data.
(1) First, obtain the user behavior data from the user behavior library and select the most recent answer records, i.e., the behavior data of the user's current state; see the structure within the topic network graph in fig. 1(d);
(2) Then, from the latest answer records, find the forward and backward nodes of each answered topic node in the topic network graph: if a historical topic was answered correctly, find the backward nodes of that topic node; if it was answered incorrectly, find the forward nodes. The structure is shown in fig. 1(e);
(3) The found nodes, their edges, and the edge weights then together form the sub-graph of the user's current state, whose structure is shown in fig. 1(f).
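The three steps above can be sketched in code. This is an illustrative reading that assumes a directed topic graph in which "backward" means successor (harder follow-up material) and "forward" means predecessor (prerequisite); the toy graph and the answer records are invented.

```python
# Sketch of sub-graph construction from the latest answer records:
# a correctly answered topic contributes its backward (successor) nodes,
# an incorrectly answered topic contributes its forward (predecessor) nodes.
# The directed topic graph below is hypothetical.

successors   = {"t1": {"t2": 0.9}, "t2": {"t3": 0.7}, "t3": {}}
predecessors = {"t1": {}, "t2": {"t1": 0.9}, "t3": {"t2": 0.7}}

def current_state_subgraph(answer_records):
    """answer_records: list of (topic, answered_correctly) pairs."""
    sub = {}
    for topic, correct in answer_records:
        frontier = successors[topic] if correct else predecessors[topic]
        for neighbor, weight in frontier.items():
            sub[(topic, neighbor)] = weight   # found node, edge, edge weight
    return sub

sub = current_state_subgraph([("t1", True), ("t3", False)])
# t1 was answered correctly, so its successor t2 joins the sub-graph;
# t3 was answered incorrectly, so its predecessor t2 joins as well
```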
Fourth, use a deep reinforcement learning algorithm combined with the user's history to train a policy for selecting a cut set from the sub-graph of the user's current state.
Referring to fig. 2, the learning process using the deep reinforcement learning algorithm is as follows:
(1) First, construct the initial deep reinforcement learning model and train it over a certain number of user answers. During training, the user's historical answer records serve as the model's state, topic-selection strategies over the difficulty attributes of the nodes in the user's current-state sub-graph serve as the action set, and the return value is determined by the number of correct answers;
(2) The model then feeds back the learning-area topics; after the user answers, the return value of the strategy, the new answer record, the user's new current-state sub-graph, and the original answer records are continuously fed into the deep reinforcement learning model for further training;
(3) Finally, training yields a policy for selecting a cut set from the sub-graph of the user's current state, providing personalized learning recommendation; the resulting learning-area topics are shown in fig. 1(g).
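To give a rough feel for the training loop in steps (1)-(3), here is a deliberately simplified stand-in: tabular Q-learning replaces the deep network, the state is collapsed to the outcome of the user's last answer, and the action set is reduced to three difficulty buckets. The user simulator, reward values, and hyperparameters are all invented for illustration and do not come from the patent.

```python
import random

# Simplified stand-in for the training loop of fig. 2: tabular Q-learning
# instead of a deep network. State = outcome of the last answer; actions =
# difficulty buckets for the next recommended topic; return value = answer
# correctness, as in the patent's reward definition.

random.seed(0)
ACTIONS = ["easier", "same", "harder"]
Q = {(s, a): 0.0 for s in ("right", "wrong") for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.9, 0.1       # invented hyperparameters

def simulate_answer(state, action):
    # toy user model: an easier topic after a wrong answer, or a harder
    # topic after a right answer, is "appropriately challenging"
    good = (state == "wrong" and action == "easier") or \
           (state == "right" and action == "harder")
    correct = random.random() < (0.8 if good else 0.4)
    reward = 1.0 if correct else -1.0   # return value from answer correctness
    return ("right" if correct else "wrong"), reward

state = "right"
for _ in range(20000):                  # "a certain number of answers"
    if random.random() < eps:           # epsilon-greedy exploration
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    nxt, r = simulate_answer(state, action)
    best_next = max(Q[(nxt, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
    state = nxt
```

After training, the greedy policy recommends harder topics after a correct answer and easier ones after a wrong answer, which is the "stretch zone" behavior the patent aims for; the real method would learn this over the cut-set actions of the current-state sub-graph with a neural network instead of a table.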
Referring to fig. 3, during the method's operation the user continuously produces new answer records, which are continuously fed into the deep reinforcement learning model for training; according to the training result, new learning-area topics are screened from the user's current-state sub-graph, the user keeps answering, and through this loop the optimal topic-selection policy is obtained, realizing personalized learning recommendation.
The method is based on a deep reinforcement learning neural network; with sufficient training it can adapt to the behavior of most users, modeling user behavior and using deep reinforcement learning to learn a topic-setting strategy from it, so that personalized learning recommendation is realized for each user and personalized topic assignment is achieved in practice.
The above embodiments are preferred embodiments of the present invention, but the invention is not limited to them; any change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the invention shall be regarded as an equivalent replacement and is included within the scope of protection of the invention.
Claims (3)
1. The personalized learning recommendation method based on the deep reinforcement learning is characterized by comprising the following steps:
(1) defining difficulty attributes for knowledge points and topics, and constructing a knowledge-point network graph from the relations between knowledge points;
(2) determining the relations between the topics under the knowledge points from the relations between the knowledge points, and constructing a topic network graph;
the topic network graph takes the topics under each knowledge point as nodes, a topic's difficulty attribute value as the node's topic-difficulty value, and the difficulty attribute value of the topic's knowledge point as the node's knowledge-point-difficulty value; edges are established between topics under connected knowledge points and between topics under the same knowledge point, with the degree of relation between topics as the edge weight;
(3) obtaining, from the user behavior data, the sub-graph of the topic network graph corresponding to a specified user's current state, the sub-graph comprising the nodes answered correctly or incorrectly within a specified period together with their neighbor nodes;
in step (3), the sub-graph of the user's current state is constructed as follows: according to the user behavior data, the forward or backward nodes of the answered topic nodes are found in the topic network graph, and the found nodes, their edges, and the edge weights form the sub-graph of the user's current state;
(4) modeling with a deep reinforcement learning algorithm over the user history, training a policy for selecting a cut set from the sub-graph of the user's current state, and thereby determining the topic-selection strategy and selecting topics;
in step (4), a deep reinforcement learning model is constructed in which the user's historical answer records form the state, topic-selection strategies based on the difficulty attributes of the nodes in the user's current-state sub-graph form the action set, and the return value is determined by the number of correct answers; deep reinforcement learning training is performed over a certain number of answers, and a cut set is selected from the sub-graph of the user's current state.
2. The personalized learning recommendation method based on deep reinforcement learning of claim 1, wherein in step (1) the difficulty attribute value of a knowledge point is defined by experts or by modeling on user data, and the difficulty attribute of a topic is defined from the difficulty attribute value of its knowledge point together with the topic's own difficulty, likewise by experts or by modeling on user data.
3. The personalized learning recommendation method based on deep reinforcement learning of claim 1, wherein in step (1) the knowledge-point network graph takes knowledge points as nodes and their difficulty attribute values as node difficulty values, establishes edges according to the relations between knowledge points, uses the degree of relation between knowledge points as the edge weight, and models the relations by relying on experts or user data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810307140.1A CN108614865B (en) | 2018-04-08 | 2018-04-08 | Personalized learning recommendation method based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108614865A CN108614865A (en) | 2018-10-02 |
CN108614865B true CN108614865B (en) | 2020-12-11 |
Family
ID=63659587
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810307140.1A Active CN108614865B (en) | 2018-04-08 | 2018-04-08 | Personalized learning recommendation method based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108614865B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109255994A (en) * | 2018-10-26 | 2019-01-22 | 北京智能优学科技有限公司 | A kind of foreign language teaching adaptive learning method and computer readable storage medium |
CN109543840B (en) * | 2018-11-09 | 2023-01-10 | 北京理工大学 | Dynamic recommendation system design method based on multidimensional classification reinforcement learning |
CN109859554A (en) * | 2019-03-29 | 2019-06-07 | 上海乂学教育科技有限公司 | Adaptive english vocabulary learning classification pushes away topic device and computer learning system |
CN110009956A (en) * | 2019-04-22 | 2019-07-12 | 上海乂学教育科技有限公司 | English Grammar adaptive learning method and learning device |
CN110223553B (en) * | 2019-05-20 | 2021-08-10 | 北京师范大学 | Method and system for predicting answer information |
CN110399541B (en) * | 2019-05-31 | 2021-03-23 | 平安国际智慧城市科技股份有限公司 | Topic recommendation method and device based on deep learning and storage medium |
CN110288878B (en) * | 2019-07-01 | 2021-10-08 | 科大讯飞股份有限公司 | Self-adaptive learning method and device |
CN110675295A (en) * | 2019-09-29 | 2020-01-10 | 联想(北京)有限公司 | Processing method and device and electronic equipment |
CN111061694A (en) * | 2019-11-26 | 2020-04-24 | 上海乂学教育科技有限公司 | Student test question sharing system |
CN111428020A (en) * | 2020-04-09 | 2020-07-17 | 圆梦共享教育科技(深圳)有限公司 | Personalized learning test question recommendation method based on artificial intelligence |
CN114595923B (en) * | 2022-01-11 | 2023-04-28 | 电子科技大学 | Group teaching recommendation system based on deep reinforcement learning |
- 2018-04-08: CN application CN201810307140.1A filed; granted as patent CN108614865B (status: active)
Non-Patent Citations (1)
Title |
---|
Research and Application of Bayesian Networks in Knowledge Maps (贝叶斯网络在知识地图中的研究与应用); Liu Jipeng; China Master's Theses Full-Text Database; 2017-02-15; cited document 1, pp. 10-81 * |
Also Published As
Publication number | Publication date |
---|---|
CN108614865A (en) | 2018-10-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108614865B (en) | Personalized learning recommendation method based on deep reinforcement learning | |
CN110955834A (en) | Knowledge graph driven personalized accurate recommendation method | |
CN108959331B (en) | Method, apparatus and computer program for using a device learning framework | |
CN105183848A (en) | Human-computer chatting method and device based on artificial intelligence | |
CN109947915B (en) | Knowledge management system-based artificial intelligence expert system and construction method thereof | |
CN110473438B (en) | Word auxiliary learning system and method based on quantitative analysis | |
CN107169043A (en) | A kind of knowledge point extraction method and system based on model answer | |
CN115114421A (en) | Question-answer model training method | |
CN111143539A (en) | Knowledge graph-based question-answering method in teaching field | |
CN114372155A (en) | Personalized learning platform based on self-expansion knowledge base and multi-mode portrait | |
CN110134871A (en) | A kind of dynamic course recommended method based on course and learner's network structure | |
CN114201684A (en) | Knowledge graph-based adaptive learning resource recommendation method and system | |
CN113239209A (en) | Knowledge graph personalized learning path recommendation method based on RankNet-transformer | |
CN109300069A (en) | Acquisition methods, device and the electronic equipment of user's learning path model | |
CN111311997B (en) | Interaction method based on network education resources | |
Forsman et al. | Sandbox university: Estimating influence of institutional action | |
CN116228361A (en) | Course recommendation method, device, equipment and storage medium based on feature matching | |
CN112906293B (en) | Machine teaching method and system based on review mechanism | |
CN114611696A (en) | Model distillation method, device, electronic equipment and readable storage medium | |
CN112734608A (en) | Method and system for expanding concept of admiration course | |
Newell et al. | Models for an intelligent context-aware blended m-learning system | |
CN111242518A (en) | Learning process configuration method suitable for intelligent adaptation system | |
Pu et al. | Teaching Path generation model based on machine learning | |
Eagle et al. | Interaction Network Estimation: Predicting Problem-Solving Diversity in Interactive Environments. | |
CN109447865A (en) | A kind of learning Content recommended method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||