CN109784578A - An online learning stagnation prediction system combining business rules - Google Patents

Publication number
CN109784578A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910082918.8A
Other languages
Chinese (zh)
Other versions
CN109784578B (en)
Inventor
刘杰
蔡承烨
李国斌
周新运
马志柔
李松领
孟欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Open Distance Education Center Co ltd
Institute of Software of CAS
Original Assignee
Beijing Open Distance Education Center Co ltd
Institute of Software of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Open Distance Education Center Co ltd, Institute of Software of CAS
Priority to CN201910082918.8A
Publication of CN109784578A
Application granted
Publication of CN109784578B
Legal status: Active

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to an online learning stagnation prediction system combining business rules, comprising a data loading module, a data labeling module, a feature engineering module, a model training module, and a result prediction module. Data loading module: reads student data from a database and reads the learning stagnation determination rule from a rule base. Data labeling module: labels the student data according to the learning stagnation determination rule; a student who satisfies the rule is labeled a stagnant student, and otherwise a non-stagnant student. Feature engineering module: performs feature engineering on the student data, including feature selection, feature processing, and feature normalization. Model training module: trains selected machine learning models on the feature-engineered student data and then, according to model evaluation metrics, selects the best-performing machine learning model as the learning stagnation prediction model. Result prediction module: inputs the data of students to be predicted into the learning stagnation prediction model and computes the prediction result.

Description

An online learning stagnation prediction system combining business rules
Technical field
The present invention relates to an online learning stagnation prediction system combining business rules, and belongs to the field of software technology.
Background
Through Internet technology, online learning breaks the time and space constraints of traditional teaching, letting students arrange their study time freely and sensibly so as to maximize learning efficiency. However, online learning is driven mainly by the student's own initiative and lacks the necessary supervision and urging of learning activities, so students easily slacken and may eventually abandon their studies altogether. Assessing students' learning behavior in time, detecting early signs that learning is tending toward stagnation, and issuing timely warnings are therefore of great significance for improving online learning efficiency and promoting the healthy development of online learning.
Research on predicting online learning stagnation falls broadly into two categories. The first studies prediction models. For example, Amnueypornsakul et al. used learners' clickstream data to predict whether students would stop learning, applying an SVM model to predict three types of learners (active, dropout, and inactive), and obtained a relatively high prediction accuracy (see: Amnueypornsakul B, Bhat S, Chinprutthiwong P. Predicting Attrition Along the Way: The UIUC Model [C] // EMNLP 2014 Workshop on Analysis of Large Scale Social Interaction in MOOCs. 2014: 55-59.). Lu Xiaohang et al. built a sliding window model and obtained good results with SVM and LSTM methods (see: Lu Xiaohang, Wang Shengqing, Huang Junjie, et al. A MOOCs dropout rate prediction method based on a sliding window model [J]. Data Analysis and Knowledge Discovery, 2017, 1(4): 67-75.). The second category studies predictive features. For example, Sharkey et al. described in detail an iterative process for predicting learning stagnation with machine learning techniques, and identified predictive features and their relative weights (see: Sharkey M, Sanders R. A Process for Predicting MOOC Attrition [C] // EMNLP 2014 Workshop on Analysis of Large Scale Social Interaction in MOOCs. 2014: 50-54.). Taylor et al. predicted learning stagnation with several machine learning methods and found that features related to collaboration and social interaction, such as wikis and forums, are particularly important for prediction (see: Taylor C, Veeramachaneni K, O'Reilly U M. Likely to stop? Predicting Stopout in Massive Open Online Courses [J]. Computer Science, 2014.).
Most of the above methods predict solely from the student's learning behavior: once the learning behavior is observed to decay past a certain threshold, the student is judged highly likely to stagnate. However, there is no generally accepted, effective way to decide how far learning behavior must decay before it counts as stagnation, which limits the applicability and extensibility of these methods to some extent.
Summary of the invention
The technical problem solved by the invention: to overcome the deficiencies of the prior art by providing an online learning stagnation prediction system combining business rules. The system uses business rules to define the learning stagnation population explicitly, so that a supervised learning algorithm can learn the features of the stagnant and non-stagnant populations more accurately. Moreover, defining the stagnation population by business rules, rather than by the decay of learning behavior used in traditional methods, effectively avoids "false stagnation", in which learning behavior pauses temporarily for some specific reason, and thus improves the prediction accuracy of the system.
Technical solution of the invention: an online learning stagnation prediction system combining business rules, as shown in Fig. 1, comprising:
Data loading module: reads student data from the database, including the students' online learning behavior data and basic attribute data, and reads the learning stagnation determination rule from the rule base.
Data labeling module: labels the student data according to the learning stagnation determination rule. A student who satisfies the rule is labeled a stagnant student; otherwise, a non-stagnant student.
Feature engineering module: performs feature engineering on the student data, including feature selection, feature processing, and feature normalization.
Model training module: trains selected machine learning models on the feature-engineered student data and then, according to model evaluation metrics, selects the best-performing machine learning model as the learning stagnation prediction model of the present system.
Result prediction module: inputs the data of students to be predicted into the learning stagnation prediction model and computes the prediction result.
The feature engineering module is implemented as follows:
(1) Feature selection. Representative data are selected from the students' online learning behavior data and basic attribute data as data features. Online learning behavior data are numeric and are called numeric features. Basic student attributes, such as gender and marital status, take discrete, enumerable values and are called categorical features.
(2) Feature processing. Numeric features and categorical features are processed separately.
In the present system, processing numeric features means performing arithmetic operations on the online learning behavior data. An online learning behavior data item is a statistic of some online learning behavior over a fixed period (called an online learning behavior statistical period); each data item corresponds to one statistical indicator of an online learning behavior. For example, "login count" and "login days" are two statistical indicators of the behavior "logging in to the online learning system" and correspond to two online learning behavior data items. If there are i statistical periods and j statistical indicators, then each statistical period contains j data items and each statistical indicator corresponds to i data items.
The numeric feature processing of the present system is as follows:
1) Concatenation. The i*j online learning behavior data items collected over the i statistical periods are concatenated, giving i*j feature dimensions.
2) Mean. For each statistical indicator, the i data items collected over the i statistical periods are averaged to form one new feature. The j statistical indicators give j feature dimensions.
3) Weighted mean. For each statistical indicator, the weighted mean of the i data items collected over the i statistical periods is computed as one new feature. The j statistical indicators give j feature dimensions.
4) Mean per login. For each statistical indicator other than login count, the average value per login is computed from the login count and used as one new feature. The j-1 statistical indicators give j-1 feature dimensions.
5) Variance. For each statistical indicator, the variance of the i data items collected over the i statistical periods is computed as one new feature. The j statistical indicators give j feature dimensions.
6) Min-max normalization. For each statistical indicator, the i data items collected over the i statistical periods are min-max normalized, giving i min-max normalized features. The j statistical indicators give i*j feature dimensions.
The min-max normalization formula is: (val-min)/(max-min)
where val is the k-th data item (0 < k ≤ i) of a given statistical indicator, and min and max are respectively the minimum and maximum of the i data items of that indicator.
7) Weighted variance. For each statistical indicator, the weighted variance of the i data items collected over the i statistical periods is computed as one new feature. The j statistical indicators give j feature dimensions.
8) Ranking. For each statistical indicator, the i data items collected over the i statistical periods are sorted, and the position of each data item in the sorted order is used as a separate feature, giving i feature dimensions. The j statistical indicators give i*j feature dimensions.
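The dimension counts stated for the eight steps can be tallied with a short sketch. The function name and dictionary keys below are illustrative and not part of the patent text; the j-1 count for step 4 assumes that login count is one of the j indicators.

```python
def numeric_feature_dims(i: int, j: int) -> dict:
    """Feature dimensions produced by each numeric processing step.

    i: number of statistical periods; j: number of statistical indicators.
    """
    dims = {
        "concatenation": i * j,
        "mean": j,
        "weighted_mean": j,
        "mean_per_login": j - 1,  # login count itself is excluded
        "variance": j,
        "min_max": i * j,
        "weighted_variance": j,
        "ranking": i * j,
    }
    dims["total"] = sum(dims.values())  # equals 3*i*j + 5*j - 1
    return dims


print(numeric_feature_dims(6, 6))
```

With i = 6 monthly periods and j = 6 indicators, the steps together yield 137 numeric feature dimensions.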
The processing operations for categorical features include:
1) One-hot encoding. A categorical feature is expanded into as many dimensions as it has distinct values.
2) Interval mapping combined with one-hot encoding. "Age" is a somewhat special attribute: its value range is large and it has many distinct values. The invention first maps the age to predefined age intervals, which reduces the number of distinct values of the "age" attribute, and then applies one-hot encoding.
After feature processing, each student record yields one n-dimensional feature vector, and the m records of m students yield an m*n feature matrix. In this matrix, each row holds the n feature values of one student, and each column holds the m values that the m students take for one feature.
(3) Feature normalization. The numeric features in the m*n feature matrix are normalized.
The invention uses zero-mean standardization, computed by Formula 1:
$$x' = \frac{x - \mu}{\sigma} \qquad (1)$$
where x is the value of the k-th feature (0 < k ≤ n), x' is its standardized value, and μ and σ are respectively the mean and standard deviation of the m values of that feature.
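As a minimal sketch of zero-mean standardization applied to one column of the feature matrix, the function below computes (x - μ)/σ for each of the m values. It assumes σ is the population standard deviation and that a constant column maps to zeros; the text specifies neither choice, and the function name is illustrative.

```python
import math


def standardize_column(values):
    """Zero-mean standardization of one feature column (m values)."""
    m = len(values)
    mu = sum(values) / m
    # Assumption: population standard deviation (divide by m, not m - 1).
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / m)
    if sigma == 0:
        return [0.0] * m  # assumption: a constant column carries no signal
    return [(v - mu) / sigma for v in values]


print(standardize_column([1.0, 2.0, 3.0]))
```

After this step each numeric column has mean 0 and unit spread, so indicators on very different scales (for example login duration and login count) become comparable.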
The model training module is implemented as follows:
(1) Five machine learning models, Logistic Regression, SVM, XGBoost, Random Forest, and LSTM, are selected and trained on the feature-engineered student data, yielding the optimal solution of each of the 5 models.
(2) The models are evaluated with accuracy, precision, recall, F1 score, and ROC-AUC as model evaluation metrics, and the best-performing model is selected as the learning stagnation prediction model of the present system.
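The five evaluation metrics named above have standard definitions, sketched here in plain Python for binary labels, where 1 is assumed to denote a stagnant student. The function names are illustrative, and the pairwise ROC-AUC computation is one common formulation rather than anything prescribed by the text.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = stagnant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

The text does not say how the five metrics are combined into a single ranking of models; that choice is left to the implementer.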
Compared with the prior art, the invention has the following advantages:
(1) The learning stagnation population is identified precisely by rule, so a supervised machine learning algorithm can learn the features of the stagnant and non-stagnant populations more accurately, providing a sounder basis for training the learning stagnation prediction model.
(2) A rich set of feature processing operations is applied to the online learning student data, expanding the feature dimensions of the data and increasing the information content of the model training data.
(3) The present system trains several machine learning models and evaluates them comprehensively with several model evaluation metrics, selecting the best-performing model as the learning stagnation prediction model, which ensures the validity and reliability of the prediction model.
Brief description of the drawings
Fig. 1 is the architecture diagram of the present system;
Fig. 2 is a conceptual diagram of online learning stagnation prediction in the present system;
Fig. 3 is the implementation flow of the feature engineering module in the present system;
Fig. 4 is the implementation flow of the model training module in the present system.
Detailed description of the embodiments
The invention is described in detail below with reference to a specific example.
The invention takes whether a student has paid tuition as the key condition for determining whether the student's learning has stagnated, and defines the following learning stagnation determination rule and related concepts:
Definition 1: payment period
During the course of study, a student should complete payment within a certain time period; this time period is the payment period. The payment period is the same for every student. Its length can be set according to actual business needs.
Definition 2: stagnant student
A student who has not paid for w consecutive payment periods is called a stagnant student. The value w can be set according to actual business needs.
Rule 1: learning stagnation determination rule
If a student has not paid for w consecutive payment periods, the student's learning is judged to have stagnated and the student becomes a stagnant student.
As shown in Fig. 2, when a student has gone l (0 < l < w) payment periods without paying, the present system predicts the probability that the student will also not pay in the w-th payment period.
This example uses Python as the language for data preprocessing and algorithm programming. The rule parameter is set to w = 2, that is, a student who has not paid for 2 consecutive payment periods is judged to be a stagnant student, and the payment period is set to 6 months. The statistical period of the online learning behavior data is set to one month.
As shown in Fig. 1, the present system comprises a data loading module, a data labeling module, a feature engineering module, a model training module, a result prediction module, and a database. The database stores student data and computation results. Student data include raw data entered into the system, such as students' online learning behavior data and basic attribute data. Computation results include the labels of the sample data, the feature values produced by feature engineering, the model parameters obtained by model training, and the system's final prediction results.
The data loading module reads from the database the students' basic attribute data and the online learning behavior data recorded in the first payment period. Basic attribute data include admission mode, university attended, major, grade, marital status, gender, student source type, study mode, province, age, and the current number of consecutive unpaid periods. Online learning behavior data include the monthly number of course Q&A posts and replies, monthly courseware clicks, monthly column clicks, monthly login duration, monthly login days, and monthly login count.
The data labeling module labels each student as a "stagnant student" or a "non-stagnant student" according to the learning stagnation determination rule. In this example, the "current number of consecutive unpaid periods" attribute of each student is read; if its value is greater than or equal to 2, the student's "label" attribute is set to "stagnant student", and otherwise to "non-stagnant student".
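The labeling rule can be sketched in a few lines of Python. The field names `unpaid_cycles` and `label` and the label strings are hypothetical stand-ins for the attributes described above; only the threshold w = 2 comes from the example.

```python
W = 2  # rule parameter w from the example


def label_student(unpaid_cycles: int, w: int = W) -> str:
    """Apply Rule 1: w or more consecutive unpaid periods means stagnation."""
    return "stagnant" if unpaid_cycles >= w else "non-stagnant"


# Hypothetical records; a real system would load these from the database.
students = [
    {"id": "s1", "unpaid_cycles": 0},
    {"id": "s2", "unpaid_cycles": 1},
    {"id": "s3", "unpaid_cycles": 3},
]
for student in students:
    student["label"] = label_student(student["unpaid_cycles"])

print([(s["id"], s["label"]) for s in students])
```

Because the label depends only on payment history, a temporary pause in learning activity does not by itself mark a student as stagnant.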
The feature engineering module performs feature selection, feature processing, and feature normalization on the data loaded by the data loading module. Its implementation flow is shown in Fig. 3:
(1) Feature selection. Representative data are selected from the students' online learning behavior data and basic attribute data as data features and divided into numeric features and categorical features according to their data type, as shown in Table 1.
Table 1. Data features
No.  Feature name     Description                              Feature type
1    KCWD_NUM         Monthly course Q&A posts and replies     Numeric
2    KJ_CLICK_NUM     Monthly courseware clicks                Numeric
3    LM_CLICK_NUM     Monthly column clicks                    Numeric
4    LOGIN_DURATION   Monthly login duration                   Numeric
5    LOG_DAY          Monthly login days                       Numeric
6    LOGIN_NUM        Monthly login count                      Numeric
7    ENTRANCETYPE     Admission mode                           Categorical
8    UNIVERSITY       University attended                      Categorical
9    MAJOR            Major                                    Categorical
10   GRADE            Grade                                    Categorical
11   MARRIAGE         Marital status                           Categorical
12   SEX              Gender                                   Categorical
13   STUDENTSOURCE    Student source type                      Categorical
14   STUDYMODE        Academic-year or credit system           Categorical
15   PROVINCE_NAME    Province                                 Categorical
16   AGE              Age                                      Categorical
(2) characteristic processing.To each numeric type feature, following operation is completed:
1) splice.By taking " monthly courseware number of clicks KJ_CLICK_NUM " as an example, the courseware number of clicks of every month is made For an independent feature, the feature of payment period (6 months) available 6 dimension.1-6 feature can obtain altogether in table one Obtain the feature of 36 dimensions.
2) average.6 months average value is calculated as a new feature.1-6 feature can get 6 dimensions altogether in table one Feature.
3) it is weighted and averaged.6 months weighted averages are calculated as a new feature.The weight of first trimester is set as 1, Fourth, fifth month weight is 2, the weight of the last one month is 6.1-6 feature can get the spy of 6 dimensions altogether in table one Sign.
4) it is averaged by login times.6 months total login times are found out first, and then 1-5 feature is logged according to total Number is averaged, and can get the feature of 5 dimensions.
5) variance is sought.6 months variances are calculated as a new feature.1-6 feature can get 6 dimensions in table one Feature.
6) minimax normalizes.By taking " monthly courseware number of clicks KJ_CLICK_NUM " as an example, if one user 6 months Courseware number of clicks be respectively 8,3,5,6,2,4, done minimax normalized ((val-min)/(max-min)) Afterwards, 1,1/6,1/2,2/3,0,1/3 can be obtained.This 6 are worth as independent feature.1-6 feature can get in table one The feature of 36 dimensions.
7) weighted variance.6 months weighted variances are calculated as a new feature.The weight of first trimester is set as 1, Four, five months weights are 2, the weight of the last one month is 6.1-6 feature can get the feature of 6 dimensions altogether in table one.
8) it sorts.By taking " monthly courseware number of clicks KJ_CLICK_NUM " as an example, if a user 6 months courseware is clicked Number is respectively 8,3,5,6,2,4, takes position of the current value where after sequence, can be obtained 6,2,4,5,1,3, by this 6 numbers Respectively as independent feature.1-6 feature can get the feature of 36 dimensions altogether in table one.
For each categorical feature, the following operations are performed:
1) One-hot encoding. Taking "marital status MARRIAGE" as an example, the feature takes the values (married, unmarried, other). One-hot encoding turns it into a 3-dimensional feature: if the user is married, the feature is (1,0,0); if unmarried, (0,1,0); if other, (0,0,1).
2) Interval mapping combined with one-hot encoding. The "age AGE" feature is first mapped, according to the age value, to one of the preset age intervals [under 30], [30 (inclusive) to 40 (exclusive)], [40 (inclusive) to 50 (exclusive)], [50 and above], and then one-hot encoded into a 4-dimensional feature. For example, if the user is 35 years old, the age feature is (0,1,0,0).
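A minimal sketch of both categorical operations follows, using the marital status values and the four age intervals from the example. The function names and the English category strings are illustrative.

```python
MARRIAGE_VALUES = ["married", "unmarried", "other"]


def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    return [1 if value == c else 0 for c in categories]


def age_bucket(age: int) -> int:
    """Map an age to the index of its interval: <30, [30,40), [40,50), >=50."""
    if age < 30:
        return 0
    if age < 40:
        return 1
    if age < 50:
        return 2
    return 3


def encode_age(age: int):
    """Interval mapping followed by one-hot encoding (4 dimensions)."""
    return one_hot(age_bucket(age), [0, 1, 2, 3])


print(one_hot("unmarried", MARRIAGE_VALUES))
print(encode_age(35))
```

Bucketing first keeps the age feature at 4 dimensions rather than one dimension per distinct age.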
(3) Feature normalization. All numeric features are normalized according to Formula 1.
The model training module is shown in Fig. 4. The module takes the feature data produced by the feature engineering module as input and uses five-fold cross-validation, dividing the input data into 5 parts. Five machine learning models, Logistic Regression, SVM, XGBoost, Random Forest, and LSTM, are selected as the models to train. For each model, 4 parts of the data are used as the training set to fit the model parameters, and the remaining part is used as the test set to compute the five evaluation metrics: accuracy, precision, recall, F1, and ROC-AUC. This is repeated 5 times, and the averages are taken as the final evaluation results. According to these results, the best-performing model is selected as the learning stagnation prediction model, and the model and its parameters are stored in the database.
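The five-fold procedure can be sketched in plain Python as below. The two "models" are toy stand-ins so that the example runs without external libraries; they are not the five models named above. The contiguous, unshuffled fold split and the use of accuracy alone as the selection criterion are also assumptions, since the text does not specify either.

```python
def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for f in range(k):
        size = n // k + (1 if f < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def cross_validate(fit, X, y, score, k=5):
    """Average `score` over k folds; `fit` returns a predict function."""
    results = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [t for t in range(len(X)) if t not in held_out]
        predict = fit([X[t] for t in train_idx], [y[t] for t in train_idx])
        y_pred = [predict(X[t]) for t in test_idx]
        results.append(score([y[t] for t in test_idx], y_pred))
    return sum(results) / k


# Toy stand-in models (hypothetical, for illustration only).
def fit_majority(X_train, y_train):
    label = 1 if 2 * sum(y_train) >= len(y_train) else 0
    return lambda x: label


def fit_threshold(X_train, y_train):
    # Predict stagnation when the first feature is below the training mean.
    cut = sum(x[0] for x in X_train) / len(X_train)
    return lambda x: 1 if x[0] < cut else 0


X = [[v] for v in [1, 9, 2, 8, 3, 7, 1, 9, 2, 8]]
y = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
candidates = {"majority": fit_majority, "threshold": fit_threshold}
scores = {name: cross_validate(fit, X, y, accuracy)
          for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In practice each candidate would be scored on all five metrics, and the selected model and its parameters would then be persisted.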
The result prediction module takes students who satisfy the condition "0 < current number of consecutive unpaid periods < 2" as the students to be predicted. Their basic attribute data and the online learning behavior data from the unpaid period are input into the learning stagnation prediction model to obtain the predicted stagnation result for each student, and the results are stored in the database.
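Selecting the students to be predicted amounts to a filter on the unpaid-period count, sketched below. The record fields and the `model` callable are hypothetical; in a real system the trained model would be loaded from the database.

```python
W = 2  # rule parameter w from the example


def needs_prediction(unpaid_cycles: int, w: int = W) -> bool:
    """True for students who have missed payment but are not yet stagnant."""
    return 0 < unpaid_cycles < w


def predict_stagnation(students, model, w: int = W):
    """Return {student id: model output} for the students to be predicted."""
    return {
        s["id"]: model(s["features"])
        for s in students
        if needs_prediction(s["unpaid_cycles"], w)
    }


# Hypothetical records and a placeholder model returning a fixed probability.
students = [
    {"id": "a", "unpaid_cycles": 0, "features": [0.1]},
    {"id": "b", "unpaid_cycles": 1, "features": [0.7]},
    {"id": "c", "unpaid_cycles": 2, "features": [0.9]},
]
print(predict_stagnation(students, model=lambda features: 0.5))
```

With w = 2 only students with exactly one unpaid period are scored; students already at two or more are labeled stagnant by the rule itself.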
Although specific embodiments of the invention are described above, those skilled in the art will understand that they are merely illustrative and that many changes or modifications can be made to these embodiments without departing from the principle and implementation of the invention. The scope of protection of the invention is therefore defined by the appended claims.

Claims (8)

1. An online learning stagnation prediction system combining business rules, characterized by comprising: a data loading module, a data labeling module, a feature engineering module, a model training module, and a result prediction module;
Data loading module: reads student data from a database, the student data including the students' online learning behavior data and basic attribute data; and reads the learning stagnation determination rule from a rule base;
Data labeling module: labels the student data according to the learning stagnation determination rule; student data that satisfy the rule are labeled as a stagnant student, and otherwise as a non-stagnant student;
Feature engineering module: performs feature engineering on the student data, the feature engineering including feature selection, feature processing, and feature normalization, to obtain feature-engineered student data;
Model training module: trains selected machine learning models on the feature-engineered student data and then, according to model evaluation metrics, selects the best-performing machine learning model as the learning stagnation prediction model;
Result prediction module: inputs the data of students to be predicted into the learning stagnation prediction model and computes the prediction result.
2. The online learning stagnation prediction system combining business rules according to claim 1, characterized in that the feature engineering module is implemented as follows:
(1) Feature selection: data that can characterize a student's online learning trend are selected from the student's online learning behavior data and basic attribute data as data features; online learning behavior data are numeric and are called numeric features; basic student attributes take discrete, enumerable values and are called categorical features;
(2) Feature processing: numeric features and categorical features are processed separately to obtain an m*n feature matrix, where m denotes the m student records of m students and n denotes the n feature dimensions of each student record after feature processing;
(3) Feature normalization: the numeric features in the m*n feature matrix are normalized.
3. The online learning stagnation prediction system combining business rules according to claim 2, characterized in that the processing of the numeric features consists of arithmetic operations on the online learning behavior data; an online learning behavior data item is a statistic of some online learning behavior over a fixed period, called an online learning behavior statistical period, and each data item corresponds to one statistical indicator of an online learning behavior; login count and login days are two statistical indicators of the behavior of logging in to the online learning system, and these two indicators correspond to two online learning behavior data items; if there are i statistical periods and j statistical indicators, then each statistical period contains j data items and each statistical indicator corresponds to i data items.
4. The online learning stagnation prediction system combining business rules according to claim 2, characterized in that the numeric features are processed as follows:
Step 1. Concatenation: the i*j online learning behavior data items collected over the i statistical periods are concatenated, giving i*j feature dimensions;
Step 2. Mean: for each statistical indicator, the i data items collected over the i statistical periods are averaged to form one new feature; the j statistical indicators give j feature dimensions;
Step 3. Weighted mean: for each statistical indicator, the weighted mean of the i data items collected over the i statistical periods is computed as one new feature; the j statistical indicators give j feature dimensions;
Step 4. Mean per login: for each statistical indicator, the average value per login is computed from the login count and used as one new feature; the j-1 statistical indicators give j-1 feature dimensions;
Step 5. Variance: for each statistical indicator, the variance of the i data items collected over the i statistical periods is computed as one new feature; the j statistical indicators give j feature dimensions;
Step 6. Min-max normalization: for each statistical indicator, the i data items collected over the i statistical periods are min-max normalized, giving i min-max normalized features; the j statistical indicators give i*j feature dimensions;
Step 7. Weighted variance: for each statistical indicator, the weighted variance of the i data items collected over the i statistical periods is computed as one new feature; the j statistical indicators give j feature dimensions;
Step 8. Ranking: for each statistical indicator, the i data items collected over the i statistical periods are sorted, and the position of each data item in the sorted order is used as a separate feature, giving i feature dimensions; the j statistical indicators give i*j feature dimensions.
5. The online learning stagnation prediction system combined with business rules according to claim 4, characterized in that: in step 6, the min-max normalization formula is (val-min)/(max-min), where val is the k-th online learning behavior data value of a given online learning behavior statistical indicator, 0 < k ≤ i, and min and max are respectively the minimum and maximum of the i online learning behavior data values of that statistical indicator.
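A minimal Python sketch of the min-max formula follows. The function name is hypothetical, and the handling of a constant series (max equal to min, which the claim does not address) is an assumption made here to avoid division by zero.

```python
def min_max_normalize(values):
    """Apply (val - min) / (max - min) to the i per-period values of one indicator."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series is mapped to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Each of the i normalized values is kept as its own feature, which is why j indicators produce i*j feature dimensions.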
6. The online learning stagnation prediction system combined with business rules according to claim 2, characterized in that the processing of categorical features comprises the following steps:
(1) One-hot encoding: each categorical feature is expanded into as many feature dimensions as it has distinct values;
(2) Interval mapping combined with one-hot encoding: "age" is a special attribute with a wide value range and many distinct values, so age data is first mapped to predefined age intervals to reduce the number of distinct values of the "age" attribute, and one-hot encoding is then applied.
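The two encoding steps can be sketched in Python as below. The age boundaries, category lists and function names are hypothetical and chosen only for illustration; the claim does not state which intervals are used.

```python
from bisect import bisect_right

# Hypothetical interval boundaries giving 5 age intervals:
# under 18, 18-24, 25-34, 35-44, 45 and over.
AGE_EDGES = [18, 25, 35, 45]


def one_hot(index, size):
    """Vector of length `size` with a single 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def encode_category(value, categories):
    """Step (1): one dimension per distinct value of the categorical feature."""
    return one_hot(categories.index(value), len(categories))


def encode_age(age, edges=AGE_EDGES):
    """Step (2): map the age to its interval first, then one-hot the interval."""
    return one_hot(bisect_right(edges, age), len(edges) + 1)
```

With these boundaries an age of 30 falls in the third interval, so `encode_age(30)` has its 1 in the third position; without the interval mapping, every distinct age would need its own dimension.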
7. The online learning stagnation prediction system combined with business rules according to claim 2, characterized in that the feature standardization is implemented by zero-mean standardization, with the following formula: (x-μ)/σ
where x is a value of the k-th feature dimension, 0 < k ≤ n, and μ and σ are respectively the mean and the standard deviation of the m values of the k-th feature dimension.
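Zero-mean standardization of one feature dimension can be sketched in Python as follows. The sketch assumes the usual z-score form, in which each value has the mean μ subtracted and is divided by σ taken as the population standard deviation of the m values; the function name and the treatment of a constant column are assumptions.

```python
from statistics import mean, pstdev


def standardize(column):
    """Zero-mean standardization of the m values of one feature dimension."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (assumed meaning of sigma)
    if sigma == 0:
        # Assumption: a constant column carries no information and maps to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]
```

After this step every feature dimension has mean 0 and unit spread, so features measured on different scales (for example minutes of video watched and number of posts) contribute comparably during model training.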
8. The online learning stagnation prediction system combined with business rules according to claim 1, characterized in that the model training module is implemented as follows:
(1) Five machine learning models, Logistic Regression, SVM, XGBoost, Random Forest and LSTM, are selected and trained on the student data processed by feature engineering, yielding the optimal solution of each of the 5 machine learning models;
(2) Using accuracy, precision, recall, F1 score and ROC-AUC as model evaluation metrics, the 5 machine learning models are evaluated, and the best-performing machine learning model is selected as the learning stagnation prediction model.
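The evaluation and selection in step (2) can be sketched in plain Python. This is an illustration under assumptions: the claim does not say how the metrics are combined into a single ranking, so the sketch ranks candidates by one chosen metric (F1 by default); ROC-AUC is left out because it needs predicted scores rather than hard labels; and the function names and the label convention (1 for stagnation, 0 otherwise) are hypothetical.

```python
def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = stagnation)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def select_best(y_true, predictions, metric="f1"):
    """Score every candidate model and return the name of the best one.

    `predictions` maps a model name to its predicted labels on the same
    held-out students as `y_true`.
    """
    scores = {name: evaluate(y_true, y_pred)
              for name, y_pred in predictions.items()}
    best = max(scores, key=lambda name: scores[name][metric])
    return best, scores
```

In use, `predictions` would hold the held-out predictions of the five trained models, and the returned name identifies the model kept as the learning stagnation prediction model.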
CN201910082918.8A 2019-01-24 2019-01-24 Online learning stagnation prediction system combined with business rules Active CN109784578B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910082918.8A CN109784578B (en) 2019-01-24 2019-01-24 Online learning stagnation prediction system combined with business rules

Publications (2)

Publication Number Publication Date
CN109784578A true CN109784578A (en) 2019-05-21
CN109784578B CN109784578B (en) 2021-02-02

Family

ID=66502811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910082918.8A Active CN109784578B (en) 2019-01-24 2019-01-24 Online learning stagnation prediction system combined with business rules

Country Status (1)

Country Link
CN (1) CN109784578B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150100512A1 (en) * 2013-10-04 2015-04-09 Apollo Group, Inc. Detecting and engaging participants in an online course that are otherwise not likely to continue to attend the online course
CN104813353A (en) * 2012-10-30 2015-07-29 Alcatel-Lucent System and method for generating subscriber churn predictions
CN105183743A (en) * 2015-06-29 2015-12-23 Linyi University Prediction method of microblog public sentiment propagation range
CN106373057A (en) * 2016-09-29 2017-02-01 Xi'an Jiaotong University Network education-oriented poor learner identification method
CN106682770A (en) * 2016-12-14 2017-05-17 Chongqing University of Posts and Telecommunications Friend circle-based dynamic microblog forwarding behavior prediction system and method
CN107180284A (en) * 2017-07-07 2017-09-19 Beihang University Method and device for predicting the weekly performance of SPOC students based on learning behavior features
CN108140220A (en) * 2015-07-03 2018-06-08 英庭私人有限公司 System and method for monitoring a learner's progress during an experimental learning period
CN108694502A (en) * 2018-05-10 2018-10-23 Tsinghua University Adaptive scheduling method for robot manufacturing units based on the XGBoost algorithm
CN108830409A (en) * 2018-05-31 2018-11-16 University of Science and Technology of China Method for predicting donation behavior and donor retention on crowdfunding platforms

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Lu Xiaohang et al.: "A method for predicting MOOC dropout rates based on a sliding window model", Data Analysis and Knowledge Discovery *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117828490A (en) * 2024-03-06 2024-04-05 Nanjing University of Information Science and Technology Typhoon disaster forecasting method and system based on ensemble learning
CN117828490B (en) * 2024-03-06 2024-05-17 Nanjing University of Information Science and Technology Typhoon disaster forecasting method and system based on ensemble learning

Also Published As

Publication number Publication date
CN109784578B (en) 2021-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant