CN114595923B - Group teaching recommendation system based on deep reinforcement learning - Google Patents
- Publication number
- CN114595923B CN202210028554.7A
- Authority
- CN
- China
- Prior art keywords
- student
- model
- teaching
- recommendation
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/08—Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a group teaching recommendation system based on deep reinforcement learning and belongs to the technical field of education and information. The invention collects student data in the classroom through interaction methods such as voting, question answering, homework and tests, and provides the teaching plan with the largest overall benefit for a given student group. The overall benefit can be expressed as a multi-objective optimization function whose objectives may include, but are not limited to, the pass rate, the excellent rate and the average score. The invention uses deep reinforcement learning to plan goal-oriented teaching paths for teachers and can process large-scale, complex data. Moreover, the training process, which takes most of the time, is placed before and after class, so that during class a teacher can immediately obtain recommended knowledge points from the students' classroom feedback.
Description
Technical Field
The invention relates to the technical field of education and information, in particular to a group teaching recommendation system based on deep reinforcement learning.
Background
In conventional classroom teaching, teachers often arrange learning content by experience, because students' learning details are neither visible nor controllable. A teacher does have various sources of information for assessing students' learning performance, including questions and answers, classroom assessments, and students' facial expressions, gestures and physical actions. But this information is often coarse and cannot cover every student or track each person's learning details, so teachers usually cannot design teaching paths at a fine granularity. The development of teaching auxiliary systems has eased this difficulty. A teaching auxiliary system provides various teacher-student interaction methods and records the interaction information, from which a teacher can understand students' conditions more accurately and deeply. In addition, the teaching auxiliary system can provide recommended teaching plans or learning plans for teachers or students, further relieving teachers' workload.
The patent application with publication number CN 112700688A discloses an intelligent classroom teaching auxiliary system. It collects student learning data through interaction methods such as in-class voting, models and tracks the students based on those data, and finally gives a recommended teaching plan according to the models of all students in the class. However, its recommendation algorithm simulates the learning process of the students under various teaching plans based on the current student model and selects the plan with the best simulated effect as the recommendation. To obtain a better recommendation, it must simulate as many of the possible teaching plans as it can, which entails a large amount of computation and time. With more students and more knowledge points, the resulting wait may become unacceptably long, so that the teacher cannot get timely feedback in class.
Disclosure of Invention
The invention provides a group teaching recommendation system based on deep reinforcement learning, which can be used for improving the processing efficiency of group teaching recommendation.
The technical scheme adopted by the invention is as follows:
a group teaching recommendation system based on deep reinforcement learning, the system comprising: a user terminal, a knowledge point management module, a student data management module, a student model module, a pre-training module and a group teaching recommendation module;
the user terminal is used for a teacher or a student to log in the system and is an interactive input and output terminal of the user and the system;
the knowledge point management module is used for a teacher user to input knowledge point data and sends the knowledge point data to the student model module, the pre-training module and the group teaching recommendation module;
the student data management module is used for a student user to input student basic data and sends the student basic data to the student model module; it is also used for collecting student classroom feedback in class and sending that feedback to the group teaching recommendation module;
the student model module creates a student model based on currently entered knowledge point data and student basic data according to a preset creation strategy and sends the student model to the pre-training module;
the pre-training module takes the student model created by the student model module as the learning subject, takes the data sent by the knowledge point management module and the student data management module as training data, and trains a preset initial group recommendation model to obtain a trained group recommendation model; the initial group recommendation model comprises a first neural network model and a second neural network model, each comprising an input layer, at least one hidden layer and an output layer, wherein the input of the input layer is a student classroom feedback data sequence, the hidden layer is a neural network capable of processing sequence input, the output layer of the first neural network model outputs the recommendation degree of each knowledge point of the current course, and the output layer of the second neural network model outputs the evaluation value of the current classroom teaching, namely the evaluation value of the teaching behavior currently executed;
the group teaching recommendation module calls the group recommendation model trained by the pre-training module and, during course teaching, combines the student classroom feedback of each class of the course to output teaching recommendation information and send it to the corresponding teacher user; saves the student classroom feedback collected by the student data management module; and, during course teaching, updates and trains the group recommendation model according to the configured model update period, based on the student classroom feedback stored in the current period;
the output teaching recommendation information comprises the recommended knowledge point of the next class and the evaluation value of the current student classroom feedback data sequence, wherein the recommended knowledge point of the next class is the knowledge point with the maximum recommendation degree.
Further, the knowledge point data includes: knowledge point ID, name of the course to which the knowledge point belongs, knowledge point introduction, knowledge point content, knowledge point difficulty coefficient, ID of the prerequisite knowledge point, matching classroom test questions, and knowledge-point-related data.
Further, the student basic data includes: student number, name, age, sex and student type; the student classroom feedback includes: test question name, ID of the knowledge point to which the question belongs, test question content, IDs of the students taking the test, student test results, and the like.
Further, the student model module uses the student model to stand in for real students in the group recommendation model training process of the pre-training module, and the student model is constructed as an Ebbinghaus memory model, a half-life memory model or a Bayesian knowledge tracing model;
and the student model describes:
the current mastery state of the virtual student for each knowledge point;
how a virtual student transitions from one state to another by learning;
the classroom feedback given after learning.
Further, the training of the initial group recommendation model by the pre-training module includes:
a student model created by the student model module is used as a virtual student to form a class to participate in training;
setting course requirement information and initializing network parameters of the initial group recommendation model;
taking the whole class of virtual students as the environment and the first neural network model and the second neural network model of the initial group recommendation model as the agent, training the agent with the proximal policy optimization algorithm, and, when a preset training end condition is met, saving the current network parameters to obtain the trained group recommendation model.
The course requirement information includes: the number of lessons, and the pass rate, excellent rate, average score and the like to be achieved by the end of the course.
Further, training the agent with the proximal policy optimization algorithm includes:
step S1: recording the initial state of the virtual student;
step S2: judging whether the first cycle count has reached a preset first maximum cycle count; if so, executing step S3; otherwise, performing the following process in a loop:
step S201: resetting the virtual student state to the initial state recorded in step S1;
step S202: executing steps S202-1 to S202-4 in a loop until the cycle count reaches a preset maximum sub-cycle count; in each cycle, recording the classroom feedback of the virtual students, the recommendation degrees of the knowledge points output by the first neural network model and the evaluation value output by the second neural network model, and calculating, from the classroom feedback of all virtual students and according to the course requirement information, the reward value obtained for the knowledge point learned last;
step S202-1: all virtual students participate in classroom learning, and the virtual students give student classroom feedback;
step S202-2: forming the student classroom feedback given in step S202-1 into a student classroom feedback data sequence, inputting it into the first neural network model, and obtaining the knowledge point to be taught next from the output of the first neural network model, namely taking the knowledge point with the maximum recommendation degree as the knowledge point to be taught next;
step S202-3: forming the student classroom feedback given in step S202-1 into a student classroom feedback data sequence, inputting it into the second neural network model, and obtaining the evaluation value of the sequence from the output of the second neural network model;
step S202-4: all virtual students learn the next knowledge point selected by the first neural network model;
step S3: judging whether the preset second maximum cycle number is reached, if so, ending; otherwise, the following process is circularly performed:
step S301: sampling the student classroom feedback data collected in the step S2;
step S302: calculating a first objective function (namely, the output loss of the first neural network model) based on the sampled data, and adjusting the network parameters of the first neural network model according to a preset stochastic gradient ascent algorithm;
step S303: calculating a second objective function (namely, the output loss of the second neural network model) based on the sampled data, and adjusting the network parameters of the second neural network model according to a preset stochastic gradient ascent algorithm;
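Step S302 does not spell out the first objective function at this point. In proximal policy optimization it is commonly the clipped surrogate objective, and the embodiment later describes a clip function with a boundary value of 0.1. The following is a minimal sketch under that assumption; the function names, the use of a plain mean, and the way ratios and advantages are passed in are illustrative choices, not details taken from the patent.

```python
from typing import Sequence


def clip(r: float, eps: float) -> float:
    """Clamp the probability ratio r to the interval [1 - eps, 1 + eps]."""
    return max(1.0 - eps, min(r, 1.0 + eps))


def clipped_surrogate(ratios: Sequence[float],
                      advantages: Sequence[float],
                      eps: float = 0.1) -> float:
    """Mean of min(r * A, clip(r) * A) over the sampled time steps.

    ratios[t] is assumed to be pi_theta(a_t | s_t) / pi_theta_k(a_t | s_t),
    and advantages[t] the advantage value of the action taken at time t.
    Gradient ascent on this value would update the recommendation network.
    """
    terms = [min(r * a, clip(r, eps) * a) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # A ratio far above 1 + eps gains nothing extra when the advantage is positive.
    print(clipped_surrogate([1.3, 1.05], [2.0, 2.0]))
```

Taking the minimum of the clipped and unclipped terms keeps each update from moving the policy far from the parameters that produced the sampled data, which is the point of the "proximal" constraint.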
the recommendation and training process of the group teaching recommendation module is as follows:
initializing the group recommendation model with the network parameters saved by the pre-training module after training;
after a teacher starts teaching, forming a student classroom feedback data sequence from the classroom feedback that the students of each class give through the user terminal, and inputting the sequence into the first and second neural network models of the group recommendation model respectively; obtaining, from the outputs of the two models, the recommended teaching knowledge point of the next class and the corresponding evaluation value, and storing the student classroom feedback data sequence, the recommended teaching knowledge point and the evaluation value; sending the recommended teaching knowledge point of the next class to the corresponding teacher;
after class, updating and training the group recommendation model based on historical data stored in the current updating period, wherein the historical data comprises a plurality of groups of data records, and each group of data comprises a student class feedback data sequence, recommended teaching knowledge points and evaluation values.
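The in-class and after-class steps above can be sketched as a small wrapper around the trained model. This is only an illustration: `OnlineRecommender`, `Record`, the three callables and the choice to measure the update period in stored lessons are all assumptions made for the example, not names or units given in the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Record:
    """One stored group of data: feedback sequence, recommendation, evaluation."""
    feedback_sequence: Sequence[int]
    recommended_kp: int
    evaluation: float


class OnlineRecommender:
    def __init__(self,
                 recommend: Callable[[Sequence[int]], int],
                 evaluate: Callable[[Sequence[int]], float],
                 update: Callable[[List[Record]], None],
                 update_period: int) -> None:
        self.recommend = recommend          # stands in for the first network
        self.evaluate = evaluate            # stands in for the second network
        self.update = update                # stands in for update training
        self.update_period = update_period  # assumed unit: stored lessons
        self.history: List[Record] = []

    def after_lesson(self, feedback_sequence: Sequence[int]) -> int:
        """In class: query both networks, store the record, return the recommendation."""
        kp = self.recommend(feedback_sequence)
        value = self.evaluate(feedback_sequence)
        self.history.append(Record(feedback_sequence, kp, value))
        return kp

    def after_class(self) -> bool:
        """After class: retrain once a full update period of records is stored."""
        if len(self.history) < self.update_period:
            return False
        self.update(self.history)
        self.history = []
        return True
```

Keeping the expensive `update` call out of `after_lesson` mirrors the stated design goal that training happens before and after class, while the in-class path is a single forward pass.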
The technical scheme provided by the embodiment of the invention at least has the following beneficial effects:
Compared with the prior art, the invention collects student data in the classroom through interaction methods such as voting, question answering, homework and tests, and provides the teaching plan with the largest overall benefit for a given student group (such as a whole class). The overall benefit can be expressed as a multi-objective optimization function whose objectives may include, but are not limited to, the pass rate, the excellent rate and the average score. The invention uses deep reinforcement learning to plan goal-oriented teaching paths for teachers and can process large-scale, complex data. Moreover, the training process, which takes most of the time, is placed before and after class, so that during class a teacher can immediately obtain recommended knowledge points from the students' classroom feedback.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block diagram of a group teaching recommendation system based on deep reinforcement learning according to an embodiment of the present invention;
fig. 2 is a teaching process data sequence diagram of a group teaching recommendation system based on deep reinforcement learning provided by the embodiment of the invention;
FIG. 3 is a flowchart of a pre-training module of a group teaching recommendation system based on deep reinforcement learning according to an embodiment of the present invention;
FIG. 4 is a flow chart of a group teaching recommendation module of a group teaching recommendation system based on deep reinforcement learning provided by an embodiment of the invention;
fig. 5 is a clip function diagram of a group teaching recommendation system based on deep reinforcement learning according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
The embodiment of the invention provides a group teaching recommendation system based on deep reinforcement learning. As shown in FIG. 1, the system comprises: a user terminal (through which a teacher or student logs in to the system), a knowledge point management module, a student data management module, a student model module, a pre-training module and a group teaching recommendation module. Group teaching recommendation is realized through data interaction among the modules as follows:
(1) The user terminal (teacher user) inputs knowledge point data through the knowledge point management module, and the knowledge point management module sends the knowledge point data to the student model module, the pre-training module and the group teaching recommendation module;
(2) The user terminal (student user) inputs student basic data through a student data management module, and the student data management module sends the student basic data to a student model module and a pre-training module; in the class, through interaction with the user terminal, student class feedback is collected and sent to the group teaching recommendation module;
(3) The student model module creates a student model based on the currently entered related information (knowledge point data and student basic data) according to a preset creation strategy and sends the student model to the pre-training module;
(4) The pre-training module takes the student model created by the student model module as the learning subject, takes the data sent by the knowledge point management module and the student data management module as training data, and trains a preset initial group recommendation model to obtain a trained group recommendation model;
(5) The group teaching recommendation module calls the group recommendation model trained by the pre-training module and, during course teaching, combines the student classroom feedback of each class of the course to output teaching recommendation information and send it to the corresponding teacher user; saves the student classroom feedback collected by the student data management module; and, during course teaching, updates and trains the group recommendation model according to the configured model update period, based on the student classroom feedback stored in the current period.
In this embodiment, the knowledge point management module is configured to receive and store knowledge point data (i.e., knowledge point information) entered by an expert, and to provide the received data as a data set to the other modules. The expert is a teacher or a group of teachers with advanced teaching experience who are familiar with the course's knowledge points. The knowledge point information comprises the knowledge point ID, the name of the course to which it belongs, the knowledge point introduction, the knowledge point content, the knowledge point difficulty coefficient, the ID of the prerequisite knowledge point, the classroom test questions matched with the knowledge point, and knowledge-point-related data.
In this embodiment, the student data management module is configured to receive and store the student basic information entered by students, to collect and store student classroom feedback data through classroom test interaction, and to provide the received data as data sets to the other modules. The student basic information comprises student number, name, age, sex and student type. The classroom feedback data comprises the test question name, the ID of the knowledge point to which the question belongs, the test question content, the IDs of the students taking the test, and the student test results. The data sets comprise a student basic information data set and a classroom feedback data set. The data sequence generated during teaching in this embodiment is shown in FIG. 2.
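The three kinds of records listed above can be pictured as plain data classes. The sketch below is illustrative only: every field name is made up for the example, the types are guesses, and encoding a lesson's feedback as a 0/1 vector (with absent students counted as 0) is one possible assumption about how a feedback sequence might be formed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KnowledgePoint:
    """One knowledge point as entered by the expert (field names are illustrative)."""
    kp_id: str
    course_name: str
    introduction: str
    content: str
    difficulty: float                                   # difficulty coefficient
    prerequisite_ids: List[str] = field(default_factory=list)
    test_question_ids: List[str] = field(default_factory=list)


@dataclass
class StudentInfo:
    """Basic information entered by a student."""
    student_id: str
    name: str
    age: int
    sex: str
    student_type: str


@dataclass
class ClassroomFeedback:
    """One student's result on one in-class test question."""
    question_name: str
    kp_id: str
    question_content: str
    student_id: str
    correct: bool


def feedback_vector(records: List[ClassroomFeedback],
                    student_ids: List[str]) -> List[int]:
    """Encode one lesson's feedback as 0/1 per student, in roster order.

    Assumption: a student with no record for this lesson is encoded as 0.
    """
    answered = {r.student_id: int(r.correct) for r in records}
    return [answered.get(sid, 0) for sid in student_ids]
```

For example, with a roster `["s1", "s2", "s3"]` and feedback in which only `s1` answered correctly, `feedback_vector` returns `[1, 0, 0]`.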
In this embodiment, the student model module is configured to create a student model based on the student basic information data set and to use that student model in place of real students during the training of the group recommendation model in the pre-training module. The student model is implemented with an Ebbinghaus memory model and describes the following characteristics:
(1) The current mastery state of the virtual student for each knowledge point; the formula in this embodiment is as follows:
where $P_i$ denotes the probability that a student has mastered the $i$-th knowledge point; the formula also uses the mastery probability of the prerequisite knowledge point of the $i$-th knowledge point; $\theta$ is a difficulty coefficient that depends on the specific conditions of the students and knowledge points; $D$ is the time elapsed since the knowledge point was last learned; and $S$ is the total number of times the knowledge point has been learned;
(2) How a virtual student transitions from one state to another by learning; in this embodiment, this is done by changing $D$ and $S$ in the above formula;
(3) The classroom feedback after learning; in this embodiment, a random number between 0 and 1 is sampled, and if it is less than $P_i$ in the above formula, the student is considered able to answer a question on that knowledge point correctly; otherwise the student is considered unable to.
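The three behaviours of the virtual student can be sketched for a single knowledge point as follows. The mastery formula itself is not available in this text, so `mastery_probability` below is a stand-in forgetting curve chosen only for illustration; it is an assumption and should not be read as the embodiment's formula. The class and method names are likewise hypothetical.

```python
import math
import random


def mastery_probability(prereq_mastery: float, theta: float,
                        d: float, s: int) -> float:
    """Placeholder forgetting curve (an assumption, NOT the patent's formula).

    Mastery decays with the elapsed time d, decays more slowly the more
    often the point has been learned (s), and is scaled by the mastery of
    the prerequisite knowledge point.
    """
    if s == 0:
        return 0.0  # never learned: assume no mastery
    return prereq_mastery * math.exp(-theta * d / s)


class VirtualStudent:
    """State for one knowledge point: D (elapsed time) and S (times learned)."""

    def __init__(self, theta: float, rng: random.Random) -> None:
        self.theta = theta
        self.rng = rng
        self.d = 0.0
        self.s = 0

    def learn(self) -> None:
        """State transition by learning: D is reset and S is incremented."""
        self.d = 0.0
        self.s += 1

    def wait(self, elapsed: float) -> None:
        """Time passing without learning this point increases D."""
        self.d += elapsed

    def answer(self, prereq_mastery: float = 1.0) -> bool:
        """Classroom feedback: correct if a uniform draw falls below P_i."""
        p = mastery_probability(prereq_mastery, self.theta, self.d, self.s)
        return self.rng.random() < p
```

Passing in a seeded `random.Random` keeps simulated classes reproducible, which helps when the same initial state has to be replayed across training cycles.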
In this embodiment, the pre-training module is configured to train, before class, a group recommendation model against the student model module using the proximal policy optimization algorithm, and to provide the trained model to the group teaching recommendation module; the flow is shown in FIG. 3. The group recommendation model consists of a recommendation neural network and a critic neural network. The recommendation neural network is a recurrent neural network: because student feedback is a time-ordered sequence, the network must be able to process sequence input, and this embodiment uses a long short-term memory (LSTM) recurrent neural network whose output is the knowledge point recommended for teaching. The critic neural network has a structure similar to that of the recommendation neural network: both consist of an input layer, a hidden layer and an output layer. The input layer receives the student classroom feedback sequence; there may be one or more hidden layers, and the two networks may use the same or different numbers of hidden layers. The output layer is the main difference between them. The output layer of the recommendation neural network performs classification using a softmax function, and its output represents the recommendation degree of each knowledge point of the current course for the next class (when the recommendation result is formed, the knowledge point with the maximum recommendation degree is taken). The output layer of the critic neural network uses a linear function, and its output represents the score of the behavior at each sampling moment, i.e., the evaluation (score) of the current classroom teaching. Training the group recommendation model with the proximal policy optimization (PPO) algorithm proceeds as follows:
(1) The student models created by the student model module are used as virtual students and form a class that participates in training, for example a class of 20;
(2) Setting course requirement information;
(3) Initializing the group recommendation model, namely initializing the network parameters of the recommendation neural network and the critic neural network;
(4) Taking the whole class of virtual students as the environment and the recommendation neural network and the critic neural network as the agent, and training the agent with the proximal policy optimization algorithm;
(5) After the training of the recommendation neural network and the critic neural network is completed, saving their current network parameters and providing them to the group teaching recommendation module.
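The difference between the two output layers described before this procedure can be sketched without any deep learning library. The example below assumes the recurrent hidden layers have already reduced the feedback sequence to a hidden vector `h`; the weight shapes, the helper names and the toy numbers are illustrative assumptions, not values from the patent.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over the knowledge-point logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights: List[List[float]], bias: List[float],
           h: List[float]) -> List[float]:
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, h)) + b
            for row, b in zip(weights, bias)]


def recommend(h: List[float], w: List[List[float]],
              b: List[float]) -> Tuple[int, List[float]]:
    """Recommendation head: softmax over knowledge points, take the maximum."""
    degrees = softmax(linear(w, b, h))
    best = max(range(len(degrees)), key=degrees.__getitem__)
    return best, degrees


def evaluate(h: List[float], w: List[List[float]], b: List[float]) -> float:
    """Critic head: a single linear unit giving the evaluation value."""
    return linear(w, b, h)[0]


if __name__ == "__main__":
    hidden = [1.0, 2.0]  # stand-in for the LSTM's final hidden state
    kp, degrees = recommend(hidden, [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
                            [0.0, 0.0, 0.0])
    print(kp, degrees, evaluate(hidden, [[0.5, 0.5]], [0.1]))
```

The recommendation head has one output per knowledge point and its outputs sum to 1, whereas the critic head has a single unbounded output; everything before the heads can share the same layer design.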
As a possible implementation, in the training process of this embodiment: in (2), the course requirement information comprises 80 lessons, a pass rate of 0.8 to be reached by the end of the course and an excellent rate of 0.2, together with as high an average score as possible; in (3), the initialized group recommendation model uses two-layer neural networks with 64 hidden-layer neurons; and in (4), the flow of the proximal policy optimization algorithm is as follows:
(1) Record the initial state of the virtual students;
(2) Repeat the following steps a specified number of times:
(2-a) Repeat the following steps a specified number of times:
I. Reset the virtual student state to the initial state saved in (1);
II. Repeat the following steps until the number of lessons reaches the set value. In each iteration, record the test results returned by the students, the knowledge point output by the recommendation neural network, the evaluation value output by the critic neural network, and the reward value obtained for the most recently learned knowledge point, which is calculated from the classroom feedback of all students according to the course requirement information. The reward value formula is as follows:
$$\mathrm{Reward} = \lambda_1 R_p + \lambda_2 R_e + \lambda_3 R_a$$
where $R_p$ denotes the pass rate, $R_e$ denotes the excellence rate, and $R_a$ denotes the evaluated mastery probability of all students for the knowledge points. $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the respective weights; each is greater than or equal to 0, and their specific values are empirical and are not specifically limited by the invention. In this embodiment they are taken as 5, 3 and 1, respectively.
1) All virtual students take the classroom test and return their test results;
2) The classroom feedback is fed into the recommendation neural network, which outputs the recommended knowledge point for the next lesson;
3) The classroom feedback is fed into the critic neural network, which outputs the evaluation value;
4) All virtual students learn the knowledge point output by the recommendation neural network;
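As a rough illustration of the reward formula in step II, the following Python sketch computes the weighted sum with the weights 5, 3 and 1 used in this embodiment. The pass mark, the excellence mark and the reading of $R_a$ as the mean of per-student mastery probabilities are assumptions made for this example; the text does not specify them.

```python
from typing import Sequence, Tuple


def reward(scores: Sequence[float],
           mastery: Sequence[float],
           pass_mark: float = 60.0,       # assumed threshold, not from the text
           excellent_mark: float = 85.0,  # assumed threshold, not from the text
           weights: Tuple[float, float, float] = (5.0, 3.0, 1.0)) -> float:
    """Reward = l1 * R_p + l2 * R_e + l3 * R_a for one lesson of the whole class."""
    n = len(scores)
    r_p = sum(1 for s in scores if s >= pass_mark) / n       # pass rate
    r_e = sum(1 for s in scores if s >= excellent_mark) / n  # excellence rate
    r_a = sum(mastery) / len(mastery)  # assumed: mean mastery probability
    l1, l2, l3 = weights
    return l1 * r_p + l2 * r_e + l3 * r_a


# Four virtual students: three pass, two are excellent, mean mastery is 0.5.
example = reward([50.0, 70.0, 90.0, 100.0], [0.5, 0.5, 0.5, 0.5])
```

With the larger weight on the pass rate, a lesson that moves a failing student over the pass mark is rewarded more than one that lifts an already passing student to excellence.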
(2-b) Repeat the following operations a specified number of times:
I. Sample from the data collected in (2-a);
II. Calculate the objective function using the sampled data and train the recommendation neural network with a stochastic gradient ascent algorithm, the formula being as follows:
[formula not reproduced in this text]
where $\theta_k$ denotes the parameters of the recommendation neural network at the $k$-th training iteration; $D_k$ denotes the sampled data set; $\tau$ denotes the sampled data under one teaching path, i.e., one complete teaching trajectory sample (for example, if the course has 40 lessons, the 40 lessons taught from start to finish constitute one group of data, and $D_k$ consists of multiple groups of $\tau$); $T$ is the duration of the course; and $\pi_\theta(a_t \mid s_t)$ denotes the probability that, with parameters $\theta$, the output is $a_t$ at time $t$ when the input classroom feedback is $s_t$. The function clip() is shown in fig. 4: its input parameters are $r_t(\theta)$ and the boundary value $\epsilon$; if $r_t(\theta) \le 1-\epsilon$, clip() $= 1-\epsilon$; if $r_t(\theta) \ge 1+\epsilon$, clip() $= 1+\epsilon$; and if $1-\epsilon < r_t(\theta) < 1+\epsilon$, clip() $= r_t(\theta)$. In this embodiment, the boundary value $\epsilon$ is 0.1. The advantage value of the behavior at time $t$ is calculated from the intermediate parameter $\xi_t$, whose formula is as follows:
$$\xi_t = r_t + \gamma V(s_{t+1}) - V(s_t)$$
where $\xi_t$ denotes the intermediate parameter at time $t$ (the intermediate parameters at different times are all calculated according to this formula), $\gamma$ denotes the discount factor, which is 0.99 in this embodiment, $T$ denotes the total time so far, $r_t$ denotes the reward value obtained at time $t$, and $V(s_t)$ denotes the evaluation value given by the critic neural network at time $t$;
III. Calculate the objective function using the sampled data and train the critic neural network with a stochastic gradient ascent algorithm, the formula being as follows:
[formula not reproduced in this text]
where the two symbols of the formula denote, respectively, the parameters of the critic neural network at the $k$-th training iteration and the output (evaluation value) of the critic neural network at time $t$ under the current network parameters.
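The quantities in step (2-b) can be sketched in Python as follows. The clip() function and the intermediate parameter $\xi_t$ follow the definitions given in the text. Two parts are assumptions, because the corresponding formulas are not reproduced here: the advantage is taken as the discounted sum of the later $\xi$ values, and the objective is taken as the standard clipped PPO surrogate, i.e. the average of $\min\big(r_t(\theta)\hat{A}_t,\ \mathrm{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon)\,\hat{A}_t\big)$ with $r_t(\theta) = \pi_\theta(a_t \mid s_t) / \pi_{\theta_k}(a_t \mid s_t)$. All function names are hypothetical.

```python
from typing import List, Sequence


def clip(ratio: float, eps: float = 0.1) -> float:
    """clip() as defined in the text: bound the ratio to [1 - eps, 1 + eps]."""
    return max(1.0 - eps, min(ratio, 1.0 + eps))


def td_residuals(rewards: Sequence[float],
                 values: Sequence[float],
                 gamma: float = 0.99) -> List[float]:
    """xi_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    `values` holds V(s_0) .. V(s_T), one more entry than `rewards`.
    """
    return [rewards[t] + gamma * values[t + 1] - values[t]
            for t in range(len(rewards))]


def advantages(xis: Sequence[float], gamma: float = 0.99) -> List[float]:
    """ASSUMPTION: advantage at t is the discounted sum of xi_t, xi_{t+1}, ..."""
    adv = [0.0] * len(xis)
    running = 0.0
    for t in reversed(range(len(xis))):
        running = xis[t] + gamma * running
        adv[t] = running
    return adv


def clipped_surrogate(ratios: Sequence[float],
                      advs: Sequence[float],
                      eps: float = 0.1) -> float:
    """ASSUMPTION: standard clipped PPO surrogate, averaged over time steps.

    `ratios` are the probability ratios r_t(theta), not the rewards.
    """
    terms = [min(r * a, clip(r, eps) * a) for r, a in zip(ratios, advs)]
    return sum(terms) / len(terms)
```

Note that the text uses $r_t$ for the reward and $r_t(\theta)$ for the clipped quantity; the sketch keeps them apart as `rewards` and `ratios`. The recommendation network would be updated by gradient ascent on the value returned by `clipped_surrogate`, which requires an automatic differentiation framework that this sketch does not include.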
In this embodiment, the group teaching recommendation module is configured to receive and store the classroom feedback data of students in class, to give the knowledge point recommended for teaching based on the classroom feedback, and to further train the group recommendation model after class using the classroom feedback data. The recommendation and training process is shown in fig. 5:
(1) Initialize the group recommendation model with the parameters saved after training by the pre-training module;
(2) The teacher starts teaching; the students give classroom feedback, which is input into the recommendation neural network and the critic neural network; the recommendation neural network outputs the recommended teaching knowledge point, the critic neural network outputs the evaluation value, and all data are saved;
(3) Execute (2) repeatedly, with the teacher teaching according to the recommended knowledge points. After a certain number of lessons, calculate the objective function after class using the data saved so far and train the group recommendation model again;
(4) Repeat (3) until the course ends.
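The recommend, save and retrain cycle above can be outlined with a small Python skeleton. It is only a sketch: the four callables stand in for the classroom feedback source, the two neural networks and the retraining routine, and their names and signatures are hypothetical. The update period is a parameter because the text only says retraining happens after "a certain number" of lessons.

```python
from typing import Callable, List, Tuple

# One saved record: (classroom feedback, recommended knowledge point, evaluation value)
Record = Tuple[List[float], int, float]


def run_course(num_lessons: int,
               update_period: int,
               get_feedback: Callable[[int], List[float]],
               recommend: Callable[[List[float]], int],
               evaluate: Callable[[List[float]], float],
               retrain: Callable[[List[Record]], None]) -> List[Record]:
    """Run steps (2) to (4): recommend each lesson, retrain every `update_period` lessons."""
    history: List[Record] = []
    for lesson in range(num_lessons):
        feedback = get_feedback(lesson)   # students give classroom feedback
        point = recommend(feedback)       # recommendation network output
        value = evaluate(feedback)        # critic network output
        history.append((feedback, point, value))  # all data are saved
        if (lesson + 1) % update_period == 0:
            # After class: retrain with the data saved so far.
            retrain(list(history))
    return history
```

Passing a copy of `history` to `retrain` follows the description ("the data saved so far"); an implementation that retrains only on the records of the current update period would pass a slice instead.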
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
What has been described above is merely some embodiments of the present invention. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit of the invention.
Claims (10)
1. A group teaching recommendation system based on deep reinforcement learning, comprising: the system comprises a user terminal, a knowledge point management module, a student data management module, a student model module, a pre-training module and a group teaching recommendation module;
the user terminal is used for a teacher or a student to log in the system and is an interactive input and output terminal of the user and the system;
the knowledge point management module is used for a teacher user to input knowledge point data and sends the knowledge point data to the student model module, the pre-training module and the group teaching recommendation module;
the student data management module is used for a student user to input student basic data and sends the student basic data to the student model module; it is also used for collecting student classroom feedback in class and sending it to the group teaching recommendation module;
the student model module creates a student model based on currently entered knowledge point data and student basic data according to a preset creation strategy and sends the student model to the pre-training module;
the pre-training module takes the student model created by the student model module as a learning main body, takes data sent by the knowledge point management module and the student data management module as training data, trains a preset initial group recommendation model, and obtains a trained group recommendation model; the initial group recommendation model comprises a first neural network model and a second neural network model, wherein the first neural network model and the second neural network model comprise an input layer, at least one hidden layer and an output layer, the input layer is a student class feedback data sequence, the hidden layer is a neural network capable of processing sequence input, and the output layer of the first neural network model is used for outputting recommendation degree of each knowledge point of a current course; the output layer of the second neural network model is used for outputting the evaluation value of the current classroom teaching;
the group teaching recommendation module calls the group recommendation model trained by the pre-training module and, during course teaching, outputs teaching recommendation information based on the student classroom feedback of each class of the course and sends it to the corresponding teacher user; saves the student classroom feedback collected by the student data management module; and, during course teaching, updates and trains the group recommendation model according to a configured model update period based on the student classroom feedback saved in the current period;
the output teaching recommendation information comprises recommended knowledge points of the next class and evaluation values of the current student class feedback data sequence, wherein the recommended knowledge points of the next class are the knowledge points with the maximum recommendation degree.
2. The group teaching recommendation system of claim 1, wherein the knowledge point data comprises: knowledge point ID, name of the course to which the knowledge point belongs, knowledge point introduction, knowledge point content, knowledge point difficulty coefficient, IDs of the prerequisite knowledge points of the knowledge point, classroom test questions matched to the knowledge point, and knowledge point related data.
3. The group teaching recommendation system of claim 1, wherein the student basic data comprises: student number, name, age, sex, age and student type; and the student classroom feedback comprises: the test question name, the ID of the knowledge point to which the question belongs, the test question content, the ID of the student taking the test, and the student test result.
4. The group teaching recommendation system according to claim 1, wherein the student model module uses a student model to simulate a real student participating in the group recommendation model training process in the pre-training module, and the model used to construct the student model is an Ebbinghaus memory model, a half-life memory model or a Bayesian knowledge tracing model;
and the description of the model includes:
describing the current mastering state of the virtual students for each knowledge point;
a process describing how a virtual student transitions from one state to another by learning;
classroom feedback after learning is described.
5. The group teaching recommendation system of claim 1, wherein the training of the initial group recommendation model by the pre-training module comprises:
a student model created by the student model module is used as a virtual student to form a class to participate in training;
setting course requirement information and initializing network parameters of the initial group recommendation model;
taking the whole class of virtual students as the environment and the first neural network model and the second neural network model of the initial group recommendation model as the agent, training the agent using a proximal policy optimization algorithm, and saving the current network parameters when a preset training end condition is met, to obtain the trained group recommendation model.
6. The group teaching recommendation system of claim 5, wherein the course requirement information comprises: the number of lessons, and the pass rate, the excellence rate and the average score to be achieved at the end of the course.
7. The group teaching recommendation system of claim 1, wherein training the agent using the proximal policy optimization algorithm comprises:
step S1: recording the initial state of the virtual student;
step S2: judging whether the first cycle times reach a preset first maximum cycle times, if so, executing the step S3; otherwise, the following process is circularly performed:
step S201: resetting the virtual student status to the initial status recorded in step S1;
step S202: executing steps S202-1 to S202-4 repeatedly until the number of cycles reaches a preset maximum sub-cycle count; in each cycle, recording the classroom feedback of the virtual students, the recommendation degree of the knowledge points output by the first neural network model, the evaluation value output by the second neural network model, and the reward value for the most recently learned knowledge point, which is obtained from the classroom feedback of all the virtual students according to the course requirement information;
step S202-1: all virtual students participate in classroom learning, and the virtual students give student classroom feedback;
step S202-2: the student classroom feedback given in the step S202-1 is formed into a student classroom feedback data sequence and is input into a first neural network model, and knowledge points of the next teaching are obtained based on the output of the first neural network model, namely, the knowledge point with the maximum recommendation degree is used as the knowledge point of the next teaching;
step S202-3: the student classroom feedback given in the step S202-1 is formed into a student classroom feedback data sequence and is input into the second neural network model, and an evaluation value of the student classroom feedback data sequence is obtained based on the output of the second neural network model;
step S202-4: all virtual students learn knowledge points of the next teaching obtained based on the first neural network model;
step S3: judging whether the preset second maximum cycle number is reached, if so, ending; otherwise, the following process is circularly performed:
step S301: sampling the student classroom feedback data collected in the step S2;
step S302: calculating a first objective function based on the sampled data, and adjusting network parameters of the first neural network model according to a preset stochastic gradient ascent algorithm, wherein the first objective function is used for representing the output loss of the first neural network model;
step S303: calculating a second objective function based on the sampled data, and adjusting network parameters of the second neural network model according to a preset stochastic gradient ascent algorithm, wherein the second objective function is used for representing the output loss of the second neural network model.
8. The group teaching recommendation system of claim 7, wherein the first objective function is:
[formula not reproduced in this text]
wherein,
$\theta_{k+1}$ represents the network parameters of the first neural network model during the $(k+1)$-th training;
$D_k$ represents the sampled data set;
$T$ represents the duration of the course;
$\tau$ represents the sampled data under one teaching path;
$\pi_\theta(a_t \mid s_t)$ represents the probability that, with network parameters $\theta$, the output is $a_t$ at time $t$ when the input student classroom feedback is $s_t$;
the input parameters of the function clip() include $r_t(\theta)$ and the boundary value $\epsilon$: if $r_t(\theta) \le 1-\epsilon$, clip() $= 1-\epsilon$; if $r_t(\theta) \ge 1+\epsilon$, clip() $= 1+\epsilon$; if $1-\epsilon < r_t(\theta) < 1+\epsilon$, clip() $= r_t(\theta)$; wherein,
$\xi_t = r_t + \gamma V(s_{t+1}) - V(s_t)$;
wherein $\xi_t$ represents the intermediate parameter at time $t$, $\gamma$ represents a preset discount factor, $r_t$ represents the reward value obtained at time $t$, and $V(s_t)$ represents the evaluation value output by the second neural network model at time $t$;
the second objective function is:
[formula not reproduced in this text]
9. The group teaching recommendation system of claim 1, wherein the recommendation and training process of the group teaching recommendation module is:
initializing the group recommendation model with the network parameters saved after training by the pre-training module;
after the teacher starts teaching, forming a student classroom feedback data sequence from the student classroom feedback given by the students in each class through the user terminal, and inputting it into the first and second neural network models of the group recommendation model respectively; obtaining the recommended teaching knowledge point of the next class and the corresponding evaluation value based on the outputs of the two models, and saving the student classroom feedback data sequence, the recommended teaching knowledge point and the evaluation value; and sending the recommended teaching knowledge point of the next class to the corresponding teacher;
after class, updating and training the group recommendation model based on historical data stored in the current updating period, wherein the historical data comprises a plurality of groups of data records, and each group of data comprises a student class feedback data sequence, recommended teaching knowledge points and evaluation values.
10. The group teaching recommendation system of claim 1, wherein the hidden layers of the first and second neural network models are long short-term memory (LSTM) recurrent neural networks.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210028554.7A CN114595923B (en) | 2022-01-11 | 2022-01-11 | Group teaching recommendation system based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114595923A CN114595923A (en) | 2022-06-07 |
CN114595923B true CN114595923B (en) | 2023-04-28 |
Family
ID=81803873
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210028554.7A Active CN114595923B (en) | 2022-01-11 | 2022-01-11 | Group teaching recommendation system based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114595923B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116521936B (en) * | 2023-06-30 | 2023-09-01 | 云南师范大学 | Course recommendation method and device based on user behavior analysis and storage medium |
CN117114937B (en) * | 2023-09-07 | 2024-06-14 | 深圳市真实智元科技有限公司 | Method and device for generating exercise song based on artificial intelligence |
CN117455389B (en) * | 2023-10-10 | 2024-05-28 | 北京华普亿方科技集团股份有限公司 | Vocational training management platform based on artificial intelligence |
CN117688248B (en) * | 2024-02-01 | 2024-04-26 | 安徽教育网络出版有限公司 | Online course recommendation method and system based on convolutional neural network |
CN117910481A (en) * | 2024-03-20 | 2024-04-19 | 北京语言大学 | Spoken language dialogue method and device for assisting language learning and dialogue robot |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108615423A (en) * | 2018-06-21 | 2018-10-02 | 中山大学新华学院 | Instructional management system (IMS) on a kind of line based on deep learning |
CN113509726A (en) * | 2021-04-16 | 2021-10-19 | 超参数科技(深圳)有限公司 | Interactive model training method and device, computer equipment and storage medium |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3543918A1 (en) * | 2018-03-20 | 2019-09-25 | Flink AI GmbH | Reinforcement learning method |
CN108614865B (en) * | 2018-04-08 | 2020-12-11 | 暨南大学 | Personalized learning recommendation method based on deep reinforcement learning |
CN109242207A (en) * | 2018-10-10 | 2019-01-18 | 中山大学 | A kind of Financial Time Series prediction technique based on deeply study |
CN112307214A (en) * | 2019-07-26 | 2021-02-02 | 株式会社理光 | Deep reinforcement learning-based recommendation method and recommendation device |
CN112700688B (en) * | 2020-12-25 | 2021-09-24 | 电子科技大学 | Intelligent classroom teaching auxiliary system |
CN112784154B (en) * | 2020-12-31 | 2022-03-15 | 电子科技大学 | Online teaching recommendation system with data enhancement |
CN113590929A (en) * | 2021-01-28 | 2021-11-02 | 腾讯科技(深圳)有限公司 | Information recommendation method and device based on artificial intelligence and electronic equipment |
CN113033537B (en) * | 2021-03-25 | 2022-07-01 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium and program product for training a model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||