CN115617991A - Method for evaluating viewpoint quality in online collaborative learning based on machine learning - Google Patents

Method for evaluating viewpoint quality in online collaborative learning based on machine learning

Info

Publication number
CN115617991A
Authority
CN
China
Prior art keywords
viewpoint
subject
learning
online
vocabularies
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211234394.8A
Other languages
Chinese (zh)
Inventor
蒋纪平
张顺利
周俊明
胡萍
许睿
马丽娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Institute of Science and Technology
Original Assignee
Henan Institute of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan Institute of Science and Technology
Priority to CN202211234394.8A
Publication of CN115617991A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Educational Technology (AREA)
  • Educational Administration (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a machine-learning-based method for evaluating viewpoint quality in online collaborative learning. The specific process is as follows: first, text data from learners' discussions is acquired through an online learning platform, the text is preprocessed and segmented into words, and its keywords are extracted; the textbook text is likewise segmented and its keywords extracted, subject word vectors are analyzed using Word2vec, and a subject semantic word library is established; second, key features of the text on the online learning platform are extracted from six aspects: course relevance, course professionalism, viewpoint cohesion, viewpoint depth, viewpoint exploration and viewpoint innovativeness; third, the best-fitting evaluation model is generated iteratively through machine learning model training, and evaluation by the standard error and R² shows that the model performs well; finally, the evaluation model is used to automatically evaluate the text information on the platform on a large scale and to output the viewpoint quality evaluation results.

Description

Method for evaluating viewpoint quality in online collaborative learning based on machine learning
Technical Field
The invention belongs to the technical field of online collaborative learning discussion text quality evaluation methods, and particularly relates to an evaluation method of viewpoint quality in online collaborative learning based on machine learning.
Background
Evaluating learning outcomes is an important part of teaching, and in online collaborative learning the evaluation strongly influences both the teaching process and its results. Traditional evaluation of online collaborative learning has several shortcomings. First, the data come mainly from results, with little process data. Analysis typically covers login counts, learning time, reading counts, examination scores, questionnaires, self-reports and the like; most of it can only describe learners' external behaviors and track preset knowledge, so the analysis of process data needs to be strengthened. Second, the methods are mainly statistical analyses of structured data, and unstructured data is analyzed insufficiently. Existing approaches usually rely on statistics such as percentages and pre-test/post-test comparisons, which can only compare results; well-structured data is convenient to analyze, but the analysis rests on a preset model. Other studies rely mainly on qualitative analysis, which is highly subjective, demands a high professional level of the analysts, and is difficult to popularize; the coding indexes are complicated and rigid, the sampling is mechanical, standardized analysis is complex, the capacity for large-scale analysis is weak, and the work is time-consuming and labor-intensive. Machine learning and deep learning methods are an emerging trend. Third, the analysis focuses on explicit behaviors and does not go deep into the implicit, knowledge-related data. Existing research on online collaborative learning mainly addresses social interaction relations within the community, knowledge-sharing behaviors among members, group interaction, cognitive interaction, emotional interaction and the like.
Analysis from these angles reflects, to some extent, the effect of these variables on knowledge evolution, but it says little about learners' mastery of knowledge, yields data of low value, does not sufficiently mine the process regularities of online collaborative learning, and does not sufficiently refine the implicit knowledge features and cognitive structures in the text.
Traditional text feature extraction mainly represents each feature by a single extracted keyword; sentence-based text feature extraction methods were proposed later. With the application of big data, artificial intelligence and machine learning in education, researchers have begun to apply natural language processing techniques to mine learners' opinions. Analyzing data with natural language processing is technically demanding and has been the subject of relatively few studies, so the automatic evaluation of learners' online learning quality has become a problem in urgent need of a solution. Natural language processing offers a promising approach to the semantic analysis of the knowledge learners express, or the posts they publish, in online collaborative learning.
This analysis shows that research on evaluating the effect of online collaborative learning is moving from analyzing learning results to analyzing the learning process, from preset models to data-driven models, and from analyzing explicit behaviors to semantic analysis of implicit knowledge, while research on computer-based semantic judgment and automatic analysis of large-scale data is lacking. The invention applies natural language processing and machine learning to platform text data to evaluate text quality in online collaborative learning; no related research currently exists in this area.
Disclosure of Invention
The invention provides a machine-learning-based method for evaluating viewpoint quality in online collaborative learning. Its aim is to enable large-scale automatic evaluation of text data on an online collaborative learning platform, using natural language processing and machine learning to evaluate online text data and online learning quality scientifically and quantitatively.
To solve the technical problem, the invention adopts the following technical scheme. The machine-learning-based method for evaluating viewpoint quality in online collaborative learning is characterized by the following specific process: first, text data from learners' discussions is acquired through an online learning platform, the text is preprocessed and segmented into words, and its keywords are extracted; the textbook text is likewise segmented and its keywords extracted, subject word vectors are analyzed using Word2vec, and a subject semantic word library is established; second, key features of the text on the online learning platform are extracted from six aspects: course relevance, course professionalism, viewpoint cohesion, viewpoint longitudinal depth, viewpoint exploration and viewpoint innovativeness; third, the best-fitting evaluation model is generated iteratively through machine learning model training, and evaluation by the standard error and R² shows that the model performs well; finally, the evaluation model is used to automatically evaluate the text information on the platform on a large scale and to output the viewpoint quality evaluation results.
Further, the machine-learning-based method for evaluating viewpoint quality in online collaborative learning is characterized by comprising the following specific steps:
step S1: collecting discussion data from the students' online learning platform, storing it in a computer database, and managing the large amount of data by category: name, viewpoint content, publication time, reading count, response count and reference count, wherein the viewpoint content is mainly unstructured text data;
step S2: preprocessing the text of the online collaborative learning platform, mainly comprising data cleaning, word segmentation, stop-word removal and keyword extraction; performing word segmentation on the interactive text data in online collaborative learning using Jieba and extracting keywords using TF-IDF; then removing stop words, merging synonyms, adding new words and the like, and on this basis establishing the keywords of the viewpoints discussed on the platform;
step S3: building a subject word library: extracting subject keywords on the basis of word segmentation, stop-word removal, synonym merging and new-word addition; training Word2vec on the subject textbook text and calculating the relevance among word vectors; quantifying the importance of the subject words, labeling them by importance into three levels, and thereby building a subject semantic word library;
step S4: extracting the features of online collaborative learning viewpoints, the features of viewpoints published by students on the online platform being divided into six aspects: course relevance, course professionalism, viewpoint cohesion, viewpoint depth, viewpoint exploration and viewpoint innovativeness;
step S5: training a machine learning model for viewpoint quality evaluation: normalizing the six extracted feature values of the online collaborative learning viewpoints, then dividing a training set and a test set according to the proportion of 4;
step S6: evaluating the model effect using the common indexes of regression models in machine learning;
mean square error (MSE): the expected value of the square of the difference between the predicted value and the actual value:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
where n is the number of samples, $y_i$ is the actual value of sample i, and $\hat{y}_i$ is the predicted value of sample i; from the formula, the smaller the MSE, the more accurately the prediction model describes the experimental data;
root mean square error (RMSE): the square root of the MSE, used as an estimate for the regression model:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
similarly, the smaller the RMSE, the more accurately the prediction model describes the experimental data;
mean absolute error (MAE):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
coefficient of determination ($R^2$):
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $\bar{y}$ is the average of the actual y values of all samples; $R^2$ generally lies between 0 and 1, and the larger the value, the better the model fits;
step S7: outputting the viewpoint quality evaluation result: outputting all indexes and results of online learning viewpoint quality using the best-fitting model.
Further, course relevance refers to the relevance between a viewpoint published online and the course content. It is expressed as the ratio of the number of subject words contained in the viewpoint to the total number of words in the viewpoint, that is, the proportion of course content in the viewpoint: relevance F1 = |PC| / |P|.
Further, course professionalism is related to the number (n, number) and the professional grade (L, level) of the subject words contained in a student's viewpoint: the more subject words the viewpoint contains, the higher its professionalism, and the higher the grade of a subject word, the stronger its professionalism. The subject words are manually labeled into three grades, 1 to 3, a larger number indicating a higher grade. The product of the number of subject words and their professional grade is used to calculate the professionalism of a viewpoint, and the result is normalized so that the values share a common scale. The formula is given as an image in the original publication (Figure BDA0003883059620000041); in it, $L_i$ is the professional grade of the i-th subject word in a viewpoint.
Further, viewpoint cohesion refers to the degree of association between the subject words in a viewpoint and is represented by the maximum relevance between subject words: the larger the correlation coefficient, the smaller the distance and the more closely the words are associated. That is, cohesion F3 = max($C_{ij}$), where $C_{ij}$ is the relevance between the i-th and j-th subject words in the viewpoint, calculated from the Word2vec word vectors.
Further, viewpoint depth refers to the longitudinal depth with which a viewpoint expounds problems, phenomena and principles and reveals the essence of a problem. It is represented by the minimum relevance between subject words: the lower the relevance between subject words, the greater the distance and the greater the depth. That is, longitudinal depth F4 = min($C_{ij}$), where $C_{ij}$ is the relevance between the i-th and j-th subject words in the viewpoint.
Further, viewpoint exploration refers to how a student's viewpoint presents its mode of inquiry, that is, the degree of analysis, summarization, conclusion and discovery. It is mainly reflected in the use of learning scaffolds on the platform; the learning scaffolds are labeled with grades 1 to 6 according to their role in organizing a viewpoint, as shown in Table 1;
the exploration degree of a student's viewpoint is related to the number of learning scaffolds used (s, sum) and the grade of each scaffold (G, grade): the number of scaffolds reflects how seriously the student organizes and argues the viewpoint, and the grade of the scaffolds reflects, to a certain extent, the depth of the viewpoint. The formula for the exploration degree is given as an image in the original publication (Figure BDA0003883059620000042); in it, s is the number of learning scaffolds used in a viewpoint and $G_i$ is the grade of the i-th learning scaffold in the viewpoint.
Further, viewpoint innovativeness is calculated as follows: the keywords of the viewpoints published by students on the platform are compared with the subject words in the teaching materials, the feature words that go beyond the teaching materials are screened out, and the number of innovative words is counted. Innovativeness is the number of innovative words divided by the sum of the number of subject words from the teaching materials and the number of innovative words, that is, F6 = CN / (n + CN), where CN is the number of innovative words beyond the teaching materials.
Compared with the prior art, the invention has the following advantages and beneficial effects: the method can realize quantitative evaluation of unstructured data such as discussion texts of an online learning platform, and the analysis result is beneficial to automatic diagnosis of online learning effect, promotion of online teaching intervention, promotion of personalized learning and resource recommendation of learners, acceleration of knowledge evolution rate and the like, and has practical application value.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2 is a flowchart of the GBDT regression model algorithm for opinion quality assessment in the present invention.
FIG. 3 is a diagram comparing a test set with a prediction set.
Detailed Description
The present invention is described in further detail below with reference to examples, but it should not be understood that the scope of the subject matter of the present invention is limited to the examples below, and any technique realized based on the above contents of the present invention falls within the scope of the present invention.
Examples
A big data analysis method, based on machine learning and natural language processing, for the viewpoints learners publish on a learning platform in online collaborative learning specifically comprises the following steps:
Step S1: Collecting discussion data from the students' online learning platform, storing it in a computer database, and managing the large amount of data by category: name, viewpoint content, publication time, reading count, response count, reference count and the like. The viewpoint content is mainly unstructured text data.
Step S2: Preprocessing the text of the online collaborative learning platform. This mainly comprises data cleaning, word segmentation, stop-word removal and keyword extraction. Jieba is used to segment the interactive text data in online collaborative learning, and TF-IDF is used to extract keywords. Stop words are then removed; the stop-word list is based on the Harbin Institute of Technology stop-word list, with some domain-specific stop words added manually. Synonyms are merged, for example "interest class", "extracurricular tutoring class" and "extracurricular remedial class", or "online teaching", "online learning" and "online lessons". On this basis, the keywords of the viewpoints discussed on the platform are formed.
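The keyword extraction in step S2 can be illustrated with a minimal sketch. It assumes the posts have already been segmented into tokens (for example by Jieba) and computes TF-IDF over the posts themselves; the function name `tfidf_keywords`, the toy posts and the stop-word set are hypothetical and not part of the patent, and a library implementation may weight terms differently.

```python
import math
from collections import Counter

def tfidf_keywords(docs, stop_words, top_k=3):
    """Return the top_k TF-IDF keywords for each tokenized document."""
    # Remove stop words before scoring, as described in step S2.
    cleaned = [[t for t in doc if t not in stop_words] for doc in docs]
    n_docs = len(cleaned)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in cleaned for term in set(doc))
    results = []
    for doc in cleaned:
        tf = Counter(doc)
        total = len(doc) or 1
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results

# Toy, already-segmented posts (assumed data for illustration only).
posts = [
    ["online", "learning", "needs", "the", "scaffold"],
    ["online", "teaching", "needs", "the", "feedback"],
    ["online", "learning", "the", "community"],
]
keywords = tfidf_keywords(posts, stop_words={"the"}, top_k=2)
```

A term that occurs in every post, such as "online" here, receives a score of zero and therefore never outranks a term specific to one post.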
Step S3: Building a subject word library. Subject keywords are extracted on the basis of word segmentation, stop-word removal, synonym merging, new-word addition and the like; Word2vec is then trained on the subject textbook text, and the relevance among word vectors is calculated. In addition, the importance of the subject words is quantified, the subject words are labeled by importance into three levels, and a subject semantic word library is thereby constructed.
Step S4: Extracting the features of online collaborative learning viewpoints. The features of the viewpoints published by students on the online platform fall into six aspects: course relevance, course professionalism, viewpoint cohesion, viewpoint depth, viewpoint exploration and viewpoint innovativeness.
(1) Course relevance refers to the relevance between a viewpoint published online and the course content. It is expressed as the ratio of the number of subject words contained in the viewpoint to the total number of words in the viewpoint, that is, the proportion of course content in the viewpoint: relevance F1 = |PC| / |P|.
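The ratio F1 = |PC| / |P| can be sketched as follows. The sketch assumes that |P| counts every token of the segmented viewpoint and |PC| counts the tokens found in the subject word library; whether repeated words are counted once or several times is not stated in the text, so counting every occurrence is an assumption, and all names and sample data are hypothetical.

```python
def course_relevance(viewpoint_tokens, subject_vocab):
    """F1 = |PC| / |P|: share of viewpoint words that are subject words."""
    if not viewpoint_tokens:
        return 0.0  # an empty viewpoint carries no course content
    subject_hits = [t for t in viewpoint_tokens if t in subject_vocab]
    return len(subject_hits) / len(viewpoint_tokens)

# Toy example: 2 of the 4 words belong to the subject word library.
tokens = ["scaffold", "helps", "collaborative", "learning"]
vocab = {"scaffold", "collaborative"}
f1 = course_relevance(tokens, vocab)
```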
(2) Course professionalism is related to the number (n, number) and the professional grade (L, level) of the subject words contained in a student's viewpoint: the more subject words the viewpoint contains, the higher its professionalism, and the higher the grade of a subject word, the stronger its professionalism. The subject words are manually labeled into three grades, 1 to 3, a larger number indicating a higher grade. The product of the number of subject words and their professional grade is used to calculate the professionalism of a viewpoint, and the result is normalized so that the values share a common scale. The formula is given as an image in the original publication (Figure BDA0003883059620000051); in it, $L_i$ is the professional grade of the i-th subject word in a viewpoint. For example, if the first viewpoint contains 3 subject words whose professional grades are 1, 2 and 3 respectively, its professionalism is 1 × 1 + 1 × 2 + 1 × 3 = 6; dividing by the maximum value, 20, gives the normalized result 6/20 = 0.30.
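The worked example above can be reproduced with a short sketch. Because the formula itself survives only as an image, the code follows the example rather than the formula: it sums the grades of the subject words found in a viewpoint and divides by a maximum value supplied by the caller. The function name, the grade dictionary and the reading of "maximum value" as a corpus-wide maximum are assumptions.

```python
def professionalism(viewpoint_tokens, grade_of, max_score):
    """Sum the grades (1 to 3) of subject words, then divide by max_score."""
    raw = sum(grade_of[t] for t in viewpoint_tokens if t in grade_of)
    return raw / max_score

# Three subject words with grades 1, 2 and 3, as in the worked example;
# "today" is not a subject word and contributes nothing.
grades = {"word_a": 1, "word_b": 2, "word_c": 3}
score = professionalism(["word_a", "word_b", "word_c", "today"], grades, max_score=20)
```

With these inputs the raw sum is 6 and the normalized score is 6/20, matching the value 0.30 given in the text.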
(3) Viewpoint cohesion refers to the degree of association between the subject words in a viewpoint and is represented by the maximum relevance between subject words. The larger the correlation coefficient, the smaller the distance and the more closely the words are associated. That is, cohesion F3 = max($C_{ij}$), where $C_{ij}$ is the relevance between the i-th and j-th subject words in the viewpoint; the relevance between subject words is calculated from the word vectors obtained with Word2vec.
(4) Viewpoint depth refers to the longitudinal depth with which a viewpoint describes problems, phenomena and principles and reveals the essence of a problem. It is represented by the minimum relevance between subject words: the lower the relevance between subject words, the greater the distance between them and the greater the depth. That is, longitudinal depth F4 = min($C_{ij}$), where $C_{ij}$ is the relevance between the i-th and j-th subject words in the viewpoint.
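Features F3 and F4 can be sketched together, since both are taken over the same set of pairwise relevance values. The text does not name the relevance measure; cosine similarity between word vectors is assumed here because it is the usual choice for Word2vec embeddings. The toy two-dimensional vectors stand in for trained embeddings, and all names are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def cohesion_and_depth(subject_words, vectors):
    """Return (F3, F4) = (max C_ij, min C_ij) over all pairs of subject words."""
    sims = [cosine(vectors[a], vectors[b])
            for a, b in combinations(subject_words, 2)]
    if not sims:
        return 0.0, 0.0  # fewer than two subject words: no pairs to compare
    return max(sims), min(sims)

# Toy vectors in place of trained Word2vec embeddings (assumed data).
toy_vectors = {"w1": [1.0, 0.0], "w2": [0.0, 1.0], "w3": [1.0, 1.0]}
f3, f4 = cohesion_and_depth(["w1", "w2", "w3"], toy_vectors)
```

Note that F4 as defined is the minimum relevance itself, so a smaller F4 corresponds to a greater depth.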
(5) Viewpoint exploration. The exploration degree refers to how a student's viewpoint presents its mode of inquiry, that is, the degree of analysis, summarization, conclusion, discovery and the like. It is mainly reflected in the use of learning scaffolds on the platform, such as [my viewpoint], [my evidence], [my conclusion] and [new discovery]; the learning scaffolds serve as feature words that identify the different types. The learning scaffolds are also labeled with grades 1 to 6 according to their role in organizing a viewpoint, as shown in Table 1. The exploration degree of a student's viewpoint is related to the number of learning scaffolds used (s, sum) and the grade of each scaffold (G, grade): the number of scaffolds reflects how seriously the student organizes and argues the viewpoint, and the grade of the scaffolds reflects, to a certain extent, the depth of the viewpoint; for example, "my conclusion" is a further summary built on "my idea" and "my example". The formula for the exploration degree is given as an image in the original publication (Figure BDA0003883059620000061); in it, s is the number of learning scaffolds used in a viewpoint and $G_i$ is the grade of the i-th learning scaffold in the viewpoint.
TABLE 1 learning support feature words and grades
(Table 1 is provided as images in the original publication: Figures BDA0003883059620000062 and BDA0003883059620000071.)
(6) Viewpoint innovativeness. Innovativeness, or novelty, means that students put forward different viewpoints or new opinions. It is mainly reflected in the cross-application of knowledge from multiple disciplines, such as combining "sand table games" with mental health or combining drama-based teaching with reading; such words go beyond the content of the teaching materials. The invention compares the keywords of the viewpoints published by students on the platform with the subject words in the teaching materials, screens out the feature words that go beyond the teaching materials, and counts the number of innovative words. Innovativeness is the number of innovative words divided by the sum of the number of subject words from the teaching materials and the number of innovative words, that is, F6 = CN / (n + CN), where CN is the number of innovative words beyond the teaching materials.
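The ratio F6 = CN / (n + CN) can be sketched as below. The sketch assumes that n is the number of distinct viewpoint keywords found in the textbook subject vocabulary and CN the number of distinct keywords outside it; treating every out-of-vocabulary keyword as innovative is a simplification of the screening step, and the names and sample keywords are hypothetical.

```python
def innovativeness(viewpoint_keywords, textbook_vocab):
    """F6 = CN / (n + CN) for one viewpoint."""
    keywords = set(viewpoint_keywords)
    n = len(keywords & textbook_vocab)   # subject words from the textbook
    cn = len(keywords - textbook_vocab)  # keywords beyond the textbook
    return cn / (n + cn) if (n + cn) else 0.0

# Toy example: one keyword out of four lies outside the textbook vocabulary.
textbook = {"scaffold", "cognition", "mental health"}
f6 = innovativeness(
    ["sand table game", "mental health", "scaffold", "cognition"], textbook
)
```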
Step S5: Training a machine learning model for viewpoint quality evaluation. The six extracted feature values of the online collaborative learning viewpoints are normalized, and the data are then divided into a training set and a test set according to the proportion of 4.
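The normalization in step S5 might look like the sketch below. The text does not say which normalization is applied; min-max scaling of each feature column to the range 0 to 1 is assumed here, and the function name and sample rows are hypothetical. The train/test split is left out because the ratio is not fully legible in the source.

```python
def min_max_normalize(rows):
    """Scale each feature column of a list of feature rows to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (x - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for x, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]

# Three viewpoints with two features each (assumed data).
features = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
normalized = min_max_normalize(features)
```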
Step S6: Evaluating the effect of the model. The common indexes of regression models in machine learning are used.
Mean square error (MSE): the expected value of the square of the difference between the predicted value and the actual value:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
where n is the number of samples, $y_i$ is the actual value of sample i, and $\hat{y}_i$ is the predicted value of sample i. From the formula, the smaller the MSE, the more accurately the prediction model describes the experimental data.
Root mean square error (RMSE): the square root of the MSE, used as an estimate for the regression model:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
Similarly, the smaller the RMSE, the more accurately the prediction model describes the experimental data.
Mean absolute error (MAE):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
Coefficient of determination ($R^2$):
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $\bar{y}$ is the average of the actual y values of all samples. $R^2$ generally lies between 0 and 1, and the larger the value, the better the prediction model fits. The fitting effect of the prediction model is shown in Table 2; $R^2$ is higher than 0.8, which indicates that the prediction model fits well.
TABLE 2 Evaluation of model fitting effect
Index                        Value
Mean square error (MSE)      0.638181684
Standard error (RMSE)        0.798862744
Mean absolute error (MAE)    0.594528382
R²                           0.849875
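The four indexes of step S6 can be computed directly from their standard definitions, as in the following sketch. The function name and the sample labels and predictions are hypothetical; the sample labels merely imitate the 0.5-step scores mentioned in the text and are not the patent's data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired actual and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    sq_error_sum = sum(e * e for e in errors)
    mse = sq_error_sum / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    total_sum_sq = sum((t - mean_y) ** 2 for t in y_true)
    if total_sum_sq == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "R2": 1 - sq_error_sum / total_sum_sq,
    }

# Toy labels on a 0.5 grid and continuous-looking predictions (assumed data).
metrics = regression_metrics([6.5, 7.0, 7.5, 8.0], [6.0, 7.0, 8.0, 8.0])
```

As a consistency check on Table 2, the reported RMSE of 0.798862744 equals the square root of the reported MSE of 0.638181684 to the printed precision.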
The predicted results are compared with the annotated results, as shown in FIG. 3. The overall trends are consistent. Because the test set is labeled with discrete values at intervals of 0.5 (for example 6.5, 7, 7.5, 8), whereas the regression produces continuous values, the predictions fluctuate above and below the test-set values.
Step S7: Outputting the viewpoint quality evaluation results. The best-fitting model is used to output all indexes and results of online learning viewpoint quality, as shown in Table 3.
TABLE 3 Indexes and results of students' viewpoints (partial)
(Table 3 is provided as an image in the original publication: Figure BDA0003883059620000081.)
In conclusion, the method can realize quantitative evaluation of unstructured data such as discussion texts of the online learning platform, and the analysis result is beneficial to automatic diagnosis of online learning effect, promotion of online teaching intervention, promotion of personalized learning and resource recommendation of learners, acceleration of knowledge evolution rate and the like, and has practical application value.
While the foregoing embodiments have described the general principles, features and advantages of the present invention, it will be understood by those skilled in the art that the present invention is not limited thereto, and that the foregoing embodiments and descriptions are only illustrative of the principles of the present invention, and various changes and modifications can be made without departing from the scope of the principles of the present invention, and these changes and modifications are within the scope of the present invention.

Claims (8)

1. A method for evaluating viewpoint quality in online collaborative learning based on machine learning, characterized by comprising the following specific process: first, text data from learners' discussions is acquired through an online learning platform, the text is preprocessed and segmented into words, and its keywords are extracted; the textbook text is likewise segmented and its keywords extracted, subject word vectors are analyzed using Word2vec, and a subject semantic word library is established; second, key features of the text on the online learning platform are extracted from six aspects: course relevance, course professionalism, viewpoint cohesion, viewpoint depth, viewpoint exploration and viewpoint innovativeness; third, the best-fitting evaluation model is generated iteratively through machine learning model training, and evaluation by the standard error and R² shows that the model performs well; finally, the evaluation model is used to automatically evaluate the text information on the platform on a large scale and to output the viewpoint quality evaluation results.
2. The method for evaluating viewpoint quality in online collaborative learning based on machine learning according to claim 1, characterized by comprising the following specific steps:
step S1: collecting the discussion data from the students' online learning platform, storing it in a computer database, and managing the data by category: name, viewpoint content, publication time, number of reads, number of replies and number of citations, wherein the viewpoint content is mainly unstructured text data;
step S2: preprocessing the text from the online collaborative learning platform, mainly comprising data cleaning, word segmentation, stop-word removal and keyword extraction: segmenting the interactive text data in online collaborative learning with Jieba and extracting keywords with TF-IDF; then removing stop words, merging synonyms and adding new words to establish the keywords of the viewpoints discussed on the platform;
step S3: building a subject word library: extracting subject keywords after word segmentation, stop-word removal, synonym merging and new-word addition; training Word2vec on the subject textbook text, calculating the relevance among word vectors, and quantifying the importance of the subject words; labeling each subject word with one of three levels according to its importance, and thereby building a subject semantic word library;
step S4: extracting the features of online collaborative learning viewpoints, the features of viewpoints published by students on the online platform being divided into six aspects: course relevance, course speciality, viewpoint cohesion, viewpoint depth, viewpoint exploration and viewpoint innovativeness;
step S5: training a machine learning model for viewpoint quality evaluation: normalizing the six feature values extracted from the viewpoints in online collaborative learning, and dividing a training set and a test set according to a ratio of 4;
step S6: evaluating the model with the common metrics for regression models in machine learning;
mean squared error (MSE): the expected value of the square of the difference between the predicted value and the actual value:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
where $n$ is the number of samples, $y_i$ is the actual value of sample $i$, and $\hat{y}_i$ is the predicted value of sample $i$; from the formula, the smaller the MSE, the more accurately the prediction model describes the experimental data;
root mean squared error (RMSE): the square root of the MSE, used as an estimate for the regression model:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
similarly, the smaller the RMSE, the more accurately the prediction model describes the experimental data;
mean absolute error (MAE):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
coefficient of determination ($R^2$):
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $\bar{y}$ is the mean of the actual values $y_i$ over all samples; $R^2$ generally lies between 0 and 1, and the larger the value, the better the model fit;
step S7: outputting the viewpoint quality evaluation result: using the best-fit model to output all indexes and results of online learning viewpoint quality.
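The four regression metrics named in step S6 can be computed directly from paired actual and predicted scores. The following is a minimal Python sketch using the standard definitions of these metrics; it is not taken from the patent, and the function name `regression_metrics` and the sample scores are illustrative assumptions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired actual/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n      # mean squared error
    rmse = math.sqrt(mse)                     # square root of the MSE
    mae = sum(abs(e) for e in errors) / n     # mean absolute error

    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)    # total sum of squares
    # R^2 is undefined when every actual value is identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical viewpoint-quality scores: actual vs. predicted.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Lower MSE, RMSE and MAE and a higher R² indicate a better fit, which is how step S6 compares candidate models.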
3. The method for evaluating viewpoint quality in online collaborative learning based on machine learning according to claim 1 or 2, characterized in that: the course relevance refers to the relevance between a viewpoint published online and the course content, and is expressed as the ratio of the number of subject words contained in the viewpoint to the total number of words in the viewpoint, i.e., the proportion of course content in the viewpoint: relevance F1 = |PC| / |P|.
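As an illustration of the ratio F1 = |PC| / |P|, the sketch below counts how many words of a segmented viewpoint appear in a subject lexicon. It is a hedged example only: the function name `course_relevance`, the sample words and the choice to count repeated words each time they occur are assumptions, since the claim does not say how duplicates are handled.

```python
def course_relevance(viewpoint_words, subject_lexicon):
    """F1 = |PC| / |P|: share of viewpoint words that are subject words.

    viewpoint_words: list of words from one segmented viewpoint (P).
    subject_lexicon: set of subject words from the textbook.
    Assumption: repeated words are counted every time they occur.
    """
    if not viewpoint_words:
        return 0.0
    subject_hits = [w for w in viewpoint_words if w in subject_lexicon]  # PC
    return len(subject_hits) / len(viewpoint_words)


# Hypothetical segmented viewpoint and subject lexicon.
words = ["scaffold", "helps", "learners", "build", "schema"]
lexicon = {"scaffold", "schema", "cognition"}
f1 = course_relevance(words, lexicon)
```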
4. The method for evaluating viewpoint quality in online collaborative learning based on machine learning according to claim 1 or 2, characterized in that: the course speciality depends on the number (n, number) and the professional level (L, level) of the subject words contained in a student's viewpoint; the more subject words a viewpoint contains, the higher its speciality, and the higher the level of a subject word, the stronger its speciality; the subject words are manually labeled with one of three levels, 1 to 3, a larger number denoting a higher level; the product of the number of subject words and their professional level is used to calculate the speciality of a viewpoint, and the result is further normalized so that the values share a common scale, i.e., the speciality is
Figure FDA0003883059610000027
where L_i is the professional level of the ith subject word in a viewpoint.
5. The method for evaluating viewpoint quality in online collaborative learning based on machine learning according to claim 1 or 2, characterized in that: the viewpoint cohesion refers to the degree of association between the subject words in a viewpoint and is represented by the maximum relevance between subject words; the larger the relevance coefficient, the smaller the distance and the more closely the words are associated, i.e., cohesion F3 = max(C_ij), where C_ij denotes the relevance between the ith and jth subject words in the viewpoint, calculated from Word2vec word vectors.
6. The method for evaluating viewpoint quality in online collaborative learning based on machine learning according to claim 1 or 2, characterized in that: the viewpoint depth refers to how deeply a viewpoint describes problems, phenomena and principles and reveals the essence of a problem, and is represented by the minimum relevance between subject words; the lower the relevance between subject words, the larger the distance between them and the greater the depth, i.e., depth F4 = min(C_ij), where C_ij denotes the relevance between the ith and jth subject words in the viewpoint.
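Cohesion (F3) and depth (F4) are the maximum and minimum of the same pairwise relevance values C_ij, so they can be computed in one pass. The sketch below is illustrative only: it assumes C_ij is the cosine similarity of two word vectors (the claims say only that relevance is calculated from Word2vec vectors), and the two-dimensional `toy_vectors` stand in for vectors from a trained Word2vec model. The names `cosine` and `cohesion_and_depth` are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed form of C_ij)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cohesion_and_depth(subject_words, vectors):
    """Return (F3, F4) = (max C_ij, min C_ij) over pairs of distinct subject words.

    Returns None when fewer than two subject words have a vector.
    """
    known = [w for w in dict.fromkeys(subject_words) if w in vectors]
    sims = [cosine(vectors[a], vectors[b]) for a, b in combinations(known, 2)]
    if not sims:
        return None
    return max(sims), min(sims)


# Toy vectors standing in for trained Word2vec embeddings (hypothetical values).
toy_vectors = {"force": [1.0, 0.0], "mass": [1.0, 1.0], "energy": [0.0, 1.0]}
f3, f4 = cohesion_and_depth(["force", "mass", "energy"], toy_vectors)
```

With these toy vectors the closest pair gives F3 (about 0.707) and the orthogonal pair gives F4 (0.0), matching the reading that a low minimum relevance signals a viewpoint spanning distant concepts.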
7. The method for evaluating viewpoint quality in online collaborative learning based on machine learning according to claim 1 or 2, characterized in that: the viewpoint exploration refers to the way a student's viewpoint presents inquiry, i.e., the degree of analysis, summarization, conclusion and discovery, and is mainly embodied in the use of the learning supports on the platform; the learning supports are labeled with levels 1 to 6 according to their effect on organizing a viewpoint, as shown in the following table:
Figure FDA0003883059610000031
the exploration of a student's viewpoint depends on the number of learning supports used (s, sum) and the level of each support (G, grade); the number of learning supports reflects how rigorously the student organizes and argues the viewpoint, and the level of the learning supports reflects the depth of the viewpoint to a certain degree; the exploration is
Figure FDA0003883059610000032
where s is the number of learning supports used in a viewpoint and G_i is the level of the ith learning support in a viewpoint.
8. The method for evaluating viewpoint quality in online collaborative learning based on machine learning according to claim 1 or 2, characterized in that: the viewpoint innovativeness means that students put forward different viewpoints or entirely new opinions, mainly reflected in the cross-application of knowledge from multiple subject fields; the keywords of the viewpoints published by students on the platform are compared with the subject words in the teaching material, the words that go beyond the teaching material are screened out, and the number of innovative words is counted; innovativeness is calculated as the number of innovative words / (the number of subject words in the teaching material + the number of innovative words), i.e., innovativeness F6 = CN / (n + CN), where CN is the number of innovative words that go beyond the teaching material.
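The innovativeness ratio F6 = CN / (n + CN) can be sketched with set operations. This is a hedged illustration, not the patented implementation: it assumes n is the number of the viewpoint's keywords that appear in the textbook subject lexicon (consistent with n in claim 4), that each distinct keyword is counted once, and that every keyword outside the lexicon counts as innovative. The function name `innovativeness` and the sample words are hypothetical.

```python
def innovativeness(viewpoint_keywords, textbook_lexicon):
    """F6 = CN / (n + CN) for one viewpoint.

    Assumptions (not stated in the claim):
      n  = distinct viewpoint keywords found in the textbook lexicon
      CN = distinct viewpoint keywords not found in the textbook lexicon
    """
    keywords = set(viewpoint_keywords)
    lexicon = set(textbook_lexicon)
    n = len(keywords & lexicon)     # subject words from the teaching material
    cn = len(keywords - lexicon)    # innovative words beyond the teaching material
    return cn / (n + cn) if (n + cn) else 0.0


# Hypothetical keywords from one viewpoint and a textbook subject lexicon.
f6 = innovativeness(
    ["entropy", "gradient", "blockchain", "entropy"],
    {"entropy", "gradient", "variance"},
)
```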
CN202211234394.8A 2022-10-10 2022-10-10 Method for evaluating viewpoint quality in online collaborative learning based on machine learning Pending CN115617991A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211234394.8A CN115617991A (en) 2022-10-10 2022-10-10 Method for evaluating viewpoint quality in online collaborative learning based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211234394.8A CN115617991A (en) 2022-10-10 2022-10-10 Method for evaluating viewpoint quality in online collaborative learning based on machine learning

Publications (1)

Publication Number Publication Date
CN115617991A true CN115617991A (en) 2023-01-17

Family

ID=84863508

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211234394.8A Pending CN115617991A (en) 2022-10-10 2022-10-10 Method for evaluating viewpoint quality in online collaborative learning based on machine learning

Country Status (1)

Country Link
CN (1) CN115617991A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664013A (en) * 2023-07-24 2023-08-29 西南林业大学 Effect evaluation method for collaborative learning mode, ubiquitous intelligent learning system and medium
CN116664013B (en) * 2023-07-24 2023-09-22 西南林业大学 Effect evaluation method for collaborative learning mode, ubiquitous intelligent learning system and medium
CN117153007A (en) * 2023-09-05 2023-12-01 深圳市弘扬德教科技有限公司 Classroom teaching interactive system based on artificial intelligence
CN117153007B (en) * 2023-09-05 2024-07-19 深圳市弘扬德教科技有限公司 Classroom teaching interactive system based on artificial intelligence
CN117273013A (en) * 2023-11-21 2023-12-22 中国人民公安大学 Electronic data processing method for stroke records
CN117273013B (en) * 2023-11-21 2024-01-26 中国人民公安大学 Electronic data processing method for stroke records

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination