CN108256699A - Graduation whereabouts prediction method and system based on college student stereo data
- Publication number
- CN108256699A (application CN201810316749.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- student
- value
- graduation
- whereabouts
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
Abstract
The invention discloses a graduation whereabouts prediction method and system based on college student stereo data, including: obtaining the stereo data of a student; cleaning and standardizing the stereo data, where data cleaning means performing missing data completion, duplicate data detection and data error correction on the stereo data in sequence; predicting the graduation whereabouts of the student to be predicted from the processed stereo data; displaying the graduation whereabouts prediction result; and, if the student's personal employment intention does not match the prediction result, receiving the student's feedback data and recalculating. The invention can provide a reference for students' career planning.
Description
Technical field
The present invention relates to a graduation whereabouts prediction method and system based on college student stereo data.
Background art
As higher education in China shifts from elite education to mass education, the employment situation for graduates is becoming increasingly severe. How to predict graduates' employment whereabouts effectively concerns not only college students but, increasingly, employers and career guidance departments as well.
First, in terms of data, traditional graduation whereabouts prediction mostly relies on external data such as students' grades, awards and certificates, and lacks inherent data closely related to graduation whereabouts, such as professional personality tests. Such flat data can hardly reflect a student's three-dimensional situation during university.
Second, in terms of analysis dimensions, traditional graduation whereabouts prediction is usually one-dimensional: it does not combine horizontal analysis (same grade, same major) with vertical analysis (current students versus previous graduates), so it is difficult to form a multi-dimensional, dynamic evaluation of a student.
In addition, because university data are closed, employment portals and platforms can hardly obtain information on students' performance at school, so traditional methods fail to consider and use students' historical school information. Therefore, how to make effective use of the historical data of previous graduates, how to clean and process that data, and how to predict graduation whereabouts more accurately so as to provide a reference for students' career planning is both significant and urgent.
Summary of the invention
The purpose of the present invention is to provide a reference for students' career planning by proposing a graduation whereabouts prediction method and system based on college student stereo data.
As the first aspect of the present invention:
A graduation whereabouts prediction method based on college student stereo data, including:
Step (1): obtain the stereo data of the student;
Step (2): perform data cleaning and standardization on the stereo data, where data cleaning means performing missing data completion, duplicate data detection and data error correction on the stereo data in sequence;
Step (3): predict the graduation whereabouts of the student to be predicted using the processed stereo data;
Step (4): display the graduation whereabouts prediction result for the student to be predicted;
Step (5): if the student's personal employment intention does not match the prediction result, receive the student's feedback data and return to step (3) to recalculate.
Further, the stereo data of the student include the student's inherent data and external data.
The inherent data include the professional personality test value, which is obtained through a professional personality test scale.
The external data include the student's academic results, moral education results, attendance results, creative course credits, number of library entries and exits, and number of library loans.
Further, data cleaning means performing missing data completion, duplicate data detection and data error correction on the stereo data in sequence.
Missing data completion fills in the missing values in the stereo data using the k-nearest-neighbor algorithm.
For example, in the professional personality test, the test values missing from a sample are treated as leading variables, and the test values present in that sample are treated as auxiliary variables. The distances between the auxiliary variables of the current sample and those of the complete samples are calculated; the K samples nearest to the current sample are found; and the average of the leading variable over those K nearest samples is used as the missing professional personality test value of the current sample.
Benefit of data completion: professional personality tests are special in that students want their privacy protected when filling them in, so items are often left blank or skipped. If the skipped items are simply ignored, the final graduation whereabouts result will not be accurate; accurate prediction is only possible once the data have been completed. Many completion methods exist; the present application uses the k-nearest-neighbor algorithm rather than other algorithms because it is better suited to completing missing professional personality test values.
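The completion procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not name a distance measure, so Euclidean distance is assumed, and the function name `knn_impute`, the use of `None` for a missing value, and the sample numbers are all hypothetical.

```python
import math

def knn_impute(target, complete_samples, k=3):
    """Fill the None entries of `target` with the mean of its k nearest complete samples.

    Entries present in `target` play the role of auxiliary variables;
    None entries are the leading variables to be completed.
    Euclidean distance is an assumption; the source does not specify one.
    """
    observed = [i for i, v in enumerate(target) if v is not None]
    missing = [i for i, v in enumerate(target) if v is None]
    if not missing:
        return list(target)
    if not observed or not complete_samples:
        raise ValueError("need at least one observed value and one complete sample")

    def distance(sample):
        # Distance is measured on the auxiliary (observed) positions only.
        return math.sqrt(sum((target[i] - sample[i]) ** 2 for i in observed))

    neighbours = sorted(complete_samples, key=distance)[:k]
    filled = list(target)
    for i in missing:
        filled[i] = sum(s[i] for s in neighbours) / len(neighbours)
    return filled

# Hypothetical questionnaire scores; the third item was skipped by the student.
complete = [[1.0, 2.0, 10.0], [1.0, 2.0, 20.0], [9.0, 9.0, 100.0]]
print(knn_impute([1.0, 2.0, None], complete, k=2))
```

Measuring distance only on the observed positions keeps the skipped items from influencing which neighbours are chosen.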
Duplicate data detection means detecting records that point to the same object; it is performed using a cosine similarity algorithm.
For example, if the common attributes of records r_a and r_b are a_1, a_2, …, a_k, the cosine similarity sim(r_a, r_b) of r_a and r_b is:
$$\mathrm{sim}(r_a, r_b) = \frac{\sum_{i=1}^{k} r_a.a_i \cdot r_b.a_i}{\sqrt{\sum_{i=1}^{k} (r_a.a_i)^2}\,\sqrt{\sum_{i=1}^{k} (r_b.a_i)^2}}$$
where r_a.a_i denotes the value of attribute a_i in record r_a, and r_b.a_i denotes the value of attribute a_i in record r_b.
When the cosine similarity sim(r_a, r_b) exceeds the set threshold, the two records are taken to point to the same object.
Significance of duplicate data detection: data that carry a unique identifier such as a student number or ID card number can be matched by that identifier. But when the student number or ID card number has the wrong number of digits, or is missing altogether, the records of a single student can easily be mistaken for the records of two students. To avoid this, the present application uses duplicate data detection, which ensures the uniqueness of the data and reduces the complexity of subsequent computation.
Data error correction applies to multiple records that duplicate detection has determined to point to the same student: if they are inconsistent on some attribute, that is, an attribute value conflict occurs, then an erroneous data value must exist. Based on the detected duplicates, the true value is selected by weighted voting and used as the true value of the conflicting attribute, so that the duplicate records are merged into one consistent and accurate record.
Suppose n duplicate records r_1, r_2, …, r_n point to the same student, their values for the common attribute a are ra_1, ra_2, …, ra_n, and their data sources are s_1, s_2, …, s_n respectively.
The score Score(ra_i) of each attribute value is calculated as:
Score(ra_i) = Trustworthy(s_i) × Vote(ra_i) / n;
where Trustworthy(s_i) is the confidence level of data source s_i, which is set manually, and Vote(ra_i) is the number of attribute values identical to ra_i.
Finally, the attribute value with the highest score is selected as the true value.
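The weighted voting rule can be illustrated with a short Python sketch. The scoring line follows the Score formula given in the text; the function name `select_true_value`, the source names and the confidence levels are hypothetical values chosen for illustration, and ties are broken by taking the first highest-scoring value, which the text does not specify.

```python
from collections import Counter

def select_true_value(values, sources, trust):
    """Pick the true value of one conflicting attribute by weighted voting.

    values[i]  : value of the attribute in duplicate record i
    sources[i] : data source of record i
    trust      : manually set confidence level per data source
    Score(ra_i) = Trustworthy(s_i) * Vote(ra_i) / n
    """
    n = len(values)
    votes = Counter(values)  # Vote(ra_i): how many records carry the same value
    scores = [trust[s] * votes[v] / n for v, s in zip(values, sources)]
    best = max(range(n), key=lambda i: scores[i])  # first maximum wins ties
    return values[best], scores

# Hypothetical conflict on one academic result reported by three sources.
trust = {"registrar": 0.9, "counsellor": 0.6, "library": 0.5}
value, scores = select_true_value(
    [85, 85, 58], ["counsellor", "library", "registrar"], trust
)
print(value, scores)
```

In this example two lower-confidence sources agreeing on 85 outscore a single higher-confidence source reporting 58, which shows how the vote count and the confidence level interact.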
Further, data standardization means applying the min-max standardization method to the cleaned data: a linear transformation maps the values into the interval [0, 1]. The transfer function is:
$$x' = \frac{x - \min}{\max - \min}$$
where x is the original value, x' is the standardized value, max is the maximum value of the sample data, and min is the minimum value of the sample data.
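A minimal Python sketch of min-max standardization follows. The function name `min_max_scale` is hypothetical, and the handling of a constant column (where max equals min and the formula would divide by zero) is an assumption, since the text does not cover that case.

```python
def min_max_scale(values):
    """Linearly map a list of numbers into [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical academic results of three students.
print(min_max_scale([50, 75, 100]))  # [0.0, 0.5, 1.0]
```

Each feature (academic results, attendance, library visits and so on) would be scaled separately, because their units and ranges differ.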
Further, the graduation whereabouts of the student to be predicted are predicted using the processed stereo data as follows.
Define the graduation whereabouts categories C = [c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8] and the student features X = [x_1, x_2, x_3, x_4, x_5, x_6, x_7].
c_1, c_2, …, c_8 denote, respectively, government departments and public institutions, state-owned enterprises, private companies, foreign-funded companies, public schools, other education sectors, further study, and going abroad;
x_1, x_2, …, x_7 denote, respectively, academic results, moral education results, attendance results, creative course credits, number of library entries and exits, number of book loans, and professional personality test value;
First, previous graduates are used as the training set: their student features are the input values of the classifier and their graduation whereabouts are its output values, and the classifier is trained on them to obtain a trained classifier;
Second, the features of the student to be predicted are input into the trained classifier, which outputs the probability of each graduation whereabouts for that student;
The graduation whereabouts are sorted by probability in descending order, and the student is recommended the top-ranked categories together with their probability values and the number of previous graduates in each category.
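The final ranking step can be sketched as follows. The function name `recommend_destinations`, the `top_n` parameter and the sample probabilities and graduate counts are hypothetical; the text only says that "several" top-ranked categories are recommended.

```python
def recommend_destinations(probabilities, graduate_counts, top_n=3):
    """Return the top_n categories as (category, probability, previous graduate count).

    probabilities   : dict category -> predicted probability
    graduate_counts : dict category -> number of previous graduates in that category
    """
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return [(c, p, graduate_counts.get(c, 0)) for c, p in ranked[:top_n]]

# Hypothetical classifier output and historical counts.
probs = {"state-owned enterprise": 0.15, "further study": 0.55, "going abroad": 0.30}
counts = {"state-owned enterprise": 40, "further study": 120, "going abroad": 25}
for row in recommend_destinations(probs, counts, top_n=2):
    print(row)
```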
Further, displaying the graduation whereabouts prediction result for the student to be predicted includes: the overall evaluation of the student, professional personality suggestions, the graduation whereabouts display, and recommended previous graduate information.
The overall evaluation of the student to be predicted is a horizontal comparison of the current student with other students of the same grade and major.
The professional personality suggestions are guiding suggestions on the student's future employment direction and on what still needs to be learned, based on an evaluation of the student's professional personality from the tests taken each term;
The graduation whereabouts display shows the predicted graduation whereabouts categories and probability values in visual charts.
The recommended previous graduate information comprises the stereo data and employment types of previous graduates. The longitudinal comparison between the student under evaluation and previous graduates helps explain the predicted graduation whereabouts category.
Further, students not yet in their graduating year, such as second- or third-year students, lack data for all terms. They can fill in an expected performance application form, from which estimates of their results in future terms are obtained, forming complete data for the student under evaluation. The student's graduation whereabouts are then predicted again from the complete data, yielding a prediction that better matches the expected performance.
As the second aspect of the present invention:
A graduation whereabouts prediction system based on college student stereo data, including: a memory, a processor, and computer instructions stored in the memory and run on the processor, the computer instructions completing the steps of any of the above methods when run by the processor.
As the third aspect of the present invention:
A computer-readable storage medium storing computer instructions which, when run by a processor, complete the steps of any of the above methods.
Compared with prior art, the beneficial effects of the invention are as follows:
By using inherent data and external data and carrying out both lateral and longitudinal comparison, the invention provides college students with career planning suggestions of high reference value. In the present invention, inherent data refers to professional personality scale questionnaire data; external data refers to student results data, moral education data, attendance data, creative course credit data, library entry and exit counts, and book borrowing history; lateral comparison refers to comparing the current user with other students of the same grade and major; longitudinal comparison refers to comparing the current user with similar students among past graduates.
Description of the drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the application. The illustrative embodiments of the application and their description serve to explain the application and do not unduly limit it.
Fig. 1 is a simplified flow chart of the model of the present invention.
Detailed description
It should be noted that the following detailed description is illustrative and intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the meanings commonly understood by a person of ordinary skill in the technical field of the application.
It should also be noted that the terminology used here is only for describing specific embodiments and is not intended to limit the illustrative embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components and/or combinations thereof.
As shown in Fig. 1, the graduation whereabouts prediction method based on college student stereo data includes the following steps:
(1) acquisition of student stereo data
(2) cleaning and standardization of the stereo data
(3) prediction of the student's graduation whereabouts
(4) display of the prediction result
(5) survey of the student's expected performance
In step (1), the student stereo data include the student's inherent data and external data. Inherent data here means professional personality data, obtained through the MBTI professional personality scale; under the present invention students take one MBTI test every term, and the changes in their test results are recorded. External data here means the student's school performance data, including academic performance, moral education performance, attendance data, creative course credit data, library entry and exit counts, and book borrowing history. These data are integrated into the database of the prediction platform through predefined import templates for each data type.
In step (2), the stereo data used by the present invention come from different data sources: results come from the university's educational administration system; library entry and exit counts and book borrowing history come from the library; moral education results come from counsellors' daily records (Excel spreadsheets); and professional personality data come from the MBTI database of this system. Because these sources are varied and their formats differ, the integrated student stereo data contain "dirty data" such as missing, duplicate and erroneous values. Such low-quality integrated data would seriously affect the graduation whereabouts prediction, so data cleaning is needed to obtain complete, consistent and accurate integrated data and thereby ensure the accuracy of the prediction result.
Missing data completion: data go missing mainly because of omissions by data entry staff, failures of data acquisition equipment (such as library card readers), lack of cooperation from students during surveys (such as the professional personality test), and mistakes made while collecting and organizing data. Given the large data volume, the present invention completes the data with the K-nearest-neighbor completion method, which does not require a prediction model for each missing item and can therefore handle missing data simply and efficiently. Specifically, in the professional personality test, for example, the test values missing from a sample are treated as leading variables, and the test values present in that sample are treated as auxiliary variables. The distances between the auxiliary variables of the current sample and those of the complete samples are calculated; the K samples nearest to the current sample are found; and the average of the leading variable over those K nearest samples is used as the missing professional personality test value of the current sample.
Duplicate data detection: integrating heterogeneous data from different data sources inevitably produces duplicates, so duplicate detection is needed to identify whether records point to the same student and thereby build a multi-faceted, panoramic view of each student. Data that carry a unique identifier such as a student number or ID card number can be matched by that identifier. When the student number has the wrong number of digits or is missing, the present invention measures the similarity of records using their other shared attributes.
For example, if the common attributes of records r_a and r_b are a_1, a_2, …, a_k, the cosine similarity sim(r_a, r_b) of r_a and r_b is:
$$\mathrm{sim}(r_a, r_b) = \frac{\sum_{i=1}^{k} r_a.a_i \cdot r_b.a_i}{\sqrt{\sum_{i=1}^{k} (r_a.a_i)^2}\,\sqrt{\sum_{i=1}^{k} (r_b.a_i)^2}}$$
where r_a.a_i denotes the value of attribute a_i in record r_a, and r_b.a_i denotes the value of attribute a_i in record r_b.
When the cosine similarity sim(r_a, r_b) exceeds the set threshold, the two records are taken to point to the same object.
After repeated testing, tuning and verification, the threshold can be set to 0.92 here.
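The duplicate check can be sketched in Python as below. The 0.92 threshold comes from the text; everything else is an assumption made for illustration: the function names, the representation of each record as a numeric vector over the shared attributes (the text does not say how non-numeric attributes are encoded), and the convention of returning 0 for an all-zero vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero record matches nothing
    return dot / (norm_a * norm_b)

def is_duplicate(record_a, record_b, threshold=0.92):
    """Treat two records as the same student when similarity exceeds the threshold."""
    return cosine_similarity(record_a, record_b) > threshold

# Hypothetical records described by three shared numeric attributes.
print(is_duplicate([82.0, 90.0, 15.0], [82.0, 91.0, 15.0]))
print(is_duplicate([1.0, 0.0], [0.0, 1.0]))
```

Because cosine similarity ignores vector length, two records whose attributes differ by a common scale factor are also treated as duplicates; whether that is desirable depends on how the attributes are encoded.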
Data error correction: in integrated heterogeneous data, abused abbreviations, data entry errors, spelling variations, different units of measurement and outdated codes can leave the data for one student inconsistent, or even wrong, across sources. Such conflicts also reduce the accuracy of prediction and analysis. The present invention resolves conflicts in data content by weighted voting. First, each data source is assigned a confidence level for each type of data: for academic results, for example, the Educational Affairs Office data source has a higher confidence level, while the counsellor data source is set somewhat lower. Then, for conflicting records, the true value is selected by the weighted voting rule according to the confidence levels of the respective data sources.
For example, suppose n duplicate records r_1, r_2, …, r_n point to the same student, their values for the common attribute a are ra_1, ra_2, …, ra_n, and their data sources are s_1, s_2, …, s_n respectively;
The score Score(ra_i) of each attribute value is calculated as:
Score(ra_i) = Trustworthy(s_i) × Vote(ra_i) / n;
where Trustworthy(s_i) is the confidence level of data source s_i, which is set manually, and Vote(ra_i) is the number of attribute values identical to ra_i. Finally, the attribute value with the highest score is selected as the true value.
In addition, because the various types of data are measured differently, the data must also be standardized before graduation whereabouts prediction in the next step. The present invention uses the common min-max standardization method, which applies a linear transformation to the data so that the results fall in the interval [0, 1]. The transfer function is:
$$x' = \frac{x - \min}{\max - \min}$$
where x is the original value, x' is the standardized value, max is the maximum value of the sample data, and min is the minimum value of the sample data.
In step (3), the present invention uses a Bayes classifier to classify graduation whereabouts from the stereo data of the student to be predicted, and thereby predicts the graduation whereabouts type that matches the student's performance data.
In the present invention, graduation whereabouts are divided into 8 categories: government departments and public institutions, state-owned enterprises, private companies, foreign-funded companies, public schools, other education sectors, further study, and going abroad. On this basis, define the graduation whereabouts categories C = {c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8} and the student features X = [x_1, x_2, x_3, x_4, x_5, x_6, x_7].
First, previous graduates are used as the training set to train the Bayes classifier. Then, for a given student, the trained classifier yields the probability that the student belongs to a given graduation whereabouts category c_j, i.e.:
$$P(C = c_j \mid X) = \frac{P(X \mid C = c_j)\,P(C = c_j)}{P(X)}$$
For each student, the classifier gives the probability of each employment direction; the directions are sorted by predicted probability, and the student is recommended the corresponding graduation whereabouts categories, their predicted probabilities, and the number of previous graduates in each category.
In particular, for a Bayes classifier with discrete attributes, the prior probability of a class can be estimated from the sample counts in the training set, for example:
$$P(C = c_j) = \frac{N_{c_j}}{N}$$
where N_{c_j} is the number of training samples in class c_j and N is the total number of training samples; and the class conditional probability P(X_i = x_i | C = c_j) can be estimated as the proportion of training examples in class c_j whose attribute X_i equals x_i.
In the present invention, the attributes of the student stereo data are numeric. For continuous attributes, the continuous variable is assumed to follow a Gaussian distribution, whose parameters, the mean μ and the variance σ², are estimated from the training set.
Then, for each class c_j, the class conditional probability of attribute x_i is:
$$P(X_i = x_i \mid C = c_j) = \frac{1}{\sqrt{2\pi}\,\sigma_{ij}} \exp\left(-\frac{(x_i - \mu_{ij})^2}{2\sigma_{ij}^2}\right)$$
where the parameter μ_ij can be estimated by the sample mean of x_i over all training records of class c_j, and σ_ij² can likewise be estimated by the sample variance of those training records.
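The Gaussian Bayes classifier described above can be sketched in pure Python. This is an illustration under assumptions rather than the patented implementation: attributes are treated as conditionally independent given the class (the usual naive Bayes assumption, consistent with the per-attribute class conditional probabilities in the text), a small constant is added to each variance to avoid division by zero, and the function names and training data are hypothetical.

```python
import math
from collections import defaultdict

def train_gaussian_nb(X, y):
    """Estimate the prior, per-attribute mean and sample variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Sample variance; the 1e-9 floor is an assumption for numerical safety.
        variances = [
            sum((v - m) ** 2 for v in col) / max(n - 1, 1) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_proba(model, x):
    """Return the posterior probability of every class for feature vector x."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        lp = math.log(prior)
        for v, m, var in zip(x, means, variances):
            lp += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_post[label] = lp
    top = max(log_post.values())  # subtract the maximum before exponentiating
    unnorm = {c: math.exp(lp - top) for c, lp in log_post.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}

# Hypothetical standardized features of four previous graduates.
X = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
y = ["state-owned enterprise", "state-owned enterprise", "further study", "further study"]
print(predict_proba(train_gaussian_nb(X, y), [0.85, 0.85]))
```

Working with log probabilities and normalizing at the end avoids underflow when many attributes are multiplied together, and the normalization plays the role of dividing by P(X).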
In step (4), the result display covers four aspects: the overall evaluation of the student to be predicted, professional personality suggestions, the graduation whereabouts display, and recommended previous graduate information.
The overall evaluation assesses the student's school information as a whole: it shows the student's performance data and rank in a radar chart to achieve horizontal analysis, and shows in a chart how the student's performance at school has changed from term to term.
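The rank shown in the overall evaluation could be computed along the following lines. This is a sketch under assumptions: the text does not define how rank is calculated, so a simple competition ranking within the cohort (same grade and major) is assumed, and the function names, indicator names and values are hypothetical.

```python
def cohort_rank(score, cohort_scores):
    """Return (rank, cohort size); rank 1 is best and ties share the better rank."""
    rank = 1 + sum(1 for s in cohort_scores if s > score)
    return rank, len(cohort_scores)

def rank_profile(student, cohort):
    """Rank one student on every indicator against a cohort that includes the student."""
    return {
        name: cohort_rank(value, [peer[name] for peer in cohort])
        for name, value in student.items()
    }

# Hypothetical standardized indicators for three students of one grade and major.
cohort = [
    {"academic": 0.9, "attendance": 0.5},
    {"academic": 0.7, "attendance": 0.8},
    {"academic": 0.8, "attendance": 0.6},
]
print(rank_profile(cohort[2], cohort))
```

The resulting per-indicator ranks are the kind of values a radar chart could plot on its axes.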
The professional personality suggestions draw on the professional personality tests taken each term to give targeted guidance on the student's possible employment directions, the strengths and weaknesses of the student's professional personality, and the areas that need strengthening.
The graduation whereabouts display shows the employment directions predicted for the student in a pie chart.
The previous graduate recommendation uses the predicted employment direction to recommend the stereo data of previous graduates of the same type as the student, forming a longitudinal comparison with the current student's information. This serves as an explanation of the prediction result and further illustrates its reasonableness.
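One way to pick which previous graduates to show is sketched below. The text does not say how "the same type" of graduate is determined, so this sketch assumes the nearest graduates by Euclidean distance among those whose actual whereabouts equal the predicted category; the function name, record layout and data are hypothetical.

```python
import math

def similar_graduates(student_features, graduates, destination, top_n=2):
    """Return the top_n previous graduates in `destination` closest to the student.

    graduates: list of dicts with keys "id", "features" and "destination".
    """
    candidates = [g for g in graduates if g["destination"] == destination]
    return sorted(
        candidates, key=lambda g: math.dist(student_features, g["features"])
    )[:top_n]

# Hypothetical previous graduates with standardized features.
graduates = [
    {"id": "g1", "features": [0.9, 0.8], "destination": "further study"},
    {"id": "g2", "features": [0.2, 0.3], "destination": "further study"},
    {"id": "g3", "features": [0.9, 0.8], "destination": "going abroad"},
]
for g in similar_graduates([0.85, 0.8], graduates, "further study", top_n=1):
    print(g["id"])
```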
For students in their graduating year, the system holds relatively complete stereo data for every term, and the present invention predicts their graduation whereabouts from these data. For students not yet in their graduating year, such as second- or third-year students, stereo data are not available for all terms, so the prediction is based on existing data and may not match the student's own expectations. Therefore, in step (5), when the prediction result does not match the student's expectations, the student can fill in an expected performance application form, from which estimates of the student's performance in future terms are obtained, forming complete data for the student under evaluation. The student's graduation whereabouts are then predicted again from these data, giving a prediction that better matches the expected performance. The expected performance survey also encourages students to work towards their ideal employment direction.
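Forming the complete data from the expected performance form can be sketched as follows. The per-term list layout, the use of `None` for terms not yet completed, and the function name `fill_future_terms` are assumptions made for illustration; the text does not describe how per-term values are stored or aggregated.

```python
def fill_future_terms(observed_terms, expected_terms):
    """Complete a per-term series with the student's own estimates.

    observed_terms : list of per-term values, None for terms not yet completed
    expected_terms : dict term index -> estimate from the expected performance form
    Observed values are never overwritten by estimates.
    """
    completed = []
    for index, value in enumerate(observed_terms):
        if value is not None:
            completed.append(value)
        elif index in expected_terms:
            completed.append(expected_terms[index])
        else:
            raise KeyError(f"no observed or expected value for term {index}")
    return completed

# Hypothetical second-year student: two terms observed, two terms estimated.
print(fill_future_terms([82.0, 85.0, None, None], {2: 88.0, 3: 90.0}))
```

The completed series would then be cleaned, standardized and passed back to the prediction in step (3).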
The above are only preferred embodiments of the application and are not intended to limit it; for those skilled in the art, the application may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the application shall fall within its scope of protection.
Claims (10)
1. A graduation whereabouts prediction method based on college student stereo data, characterized by including:
Step (1): obtain the stereo data of the student;
Step (2): perform data cleaning and standardization on the stereo data, where data cleaning means performing missing data completion, duplicate data detection and data error correction on the stereo data in sequence;
Step (3): predict the graduation whereabouts of the student to be predicted using the processed stereo data;
Step (4): display the graduation whereabouts prediction result for the student to be predicted;
Step (5): if the student's personal employment intention does not match the prediction result, receive the student's feedback data and return to step (3) to recalculate.
2. The graduation whereabouts prediction method based on college student stereo data as described in claim 1, characterized in that the stereo data of the student include the student's inherent data and external data;
The inherent data include the professional personality test value, which is obtained through a professional personality test scale;
The external data include the student's academic results, moral education results, attendance results, creative course credits, number of library entries and exits, and number of library loans.
3. The graduation whereabouts prediction method based on college student stereo data according to claim 1, characterized in that, in the missing-data completion, missing values in the stereo data are filled in using the k-nearest-neighbor algorithm;
in the professional personality test, the professional personality test value missing from a sample is taken as the leading variable, and the professional personality test values not missing from the current sample are taken as auxiliary variables; the distances between the auxiliary variables of the current sample and the auxiliary variables of a number of complete samples are calculated; the K nearest samples to the current sample are found; the average of the leading variable over the K nearest samples is calculated, and this average is used as the missing professional personality test value of the current sample.
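The completion procedure of claim 3 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function name `knn_impute`, the use of `None` to mark a missing value, and the choice of Euclidean distance are all assumptions made for illustration, since the claim does not fix a distance measure.

```python
import math

def knn_impute(sample, complete_samples, k=3):
    """Fill None entries of `sample` with the mean of the same entry
    over the k nearest complete samples (hypothetical helper)."""
    missing = [i for i, v in enumerate(sample) if v is None]       # leading variables
    observed = [i for i, v in enumerate(sample) if v is not None]  # auxiliary variables
    if not missing:
        return list(sample)

    def distance(other):
        # Euclidean distance over the auxiliary variables only (assumed metric).
        return math.sqrt(sum((sample[i] - other[i]) ** 2 for i in observed))

    # The k complete samples closest to the current sample.
    neighbours = sorted(complete_samples, key=distance)[:k]

    filled = list(sample)
    for i in missing:
        filled[i] = sum(n[i] for n in neighbours) / len(neighbours)
    return filled

complete = [[1.0, 2.0, 3.0], [1.1, 2.1, 5.0], [9.0, 9.0, 9.0]]
result = knn_impute([1.0, 2.0, None], complete, k=2)
```

Here the two nearest complete samples supply the values 3.0 and 5.0 for the missing entry, so the sketch fills it with their average.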
4. The graduation whereabouts prediction method based on college student stereo data according to claim 1, characterized in that
the duplicate-data detection refers to detecting records that point to the same object, and is carried out using a cosine similarity algorithm;
if the common attributes of records r_a and r_b are a_1, a_2, …, a_k, then the cosine similarity sim(r_a, r_b) of records r_a and r_b is:
$$\mathrm{sim}(r_a, r_b) = \frac{\sum_{i=1}^{k} r_a.a_i \cdot r_b.a_i}{\sqrt{\sum_{i=1}^{k} (r_a.a_i)^2}\,\sqrt{\sum_{i=1}^{k} (r_b.a_i)^2}}$$
where r_a.a_i denotes attribute a_i of record r_a, and r_b.a_i denotes attribute a_i of record r_b;
when the cosine similarity sim(r_a, r_b) of records r_a and r_b exceeds a set threshold, the two records are deemed to point to the same object.
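As a rough illustration of the duplicate check in claim 4, the following Python sketch computes the cosine similarity of two records over their common numeric attributes and compares it with a threshold. The function names and the threshold value 0.99 are assumptions; the claim only states that the threshold is set in advance.

```python
import math

def cosine_similarity(ra, rb):
    """Cosine similarity of two records given as equal-length lists
    of numeric values for their common attributes."""
    dot = sum(x * y for x, y in zip(ra, rb))
    norm_a = math.sqrt(sum(x * x for x in ra))
    norm_b = math.sqrt(sum(y * y for y in rb))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for an all-zero record
    return dot / (norm_a * norm_b)

def is_duplicate(ra, rb, threshold=0.99):
    """True when the two records are taken to point to the same object.
    The threshold value is illustrative only."""
    return cosine_similarity(ra, rb) > threshold
```

Note that cosine similarity ignores magnitude, so two records whose attribute vectors are proportional are reported as duplicates; applying the check to attributes on comparable scales is a design choice left open by the claim.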
5. The graduation whereabouts prediction method based on college student stereo data according to claim 1, characterized in that
the data-error correction means: for data that has passed duplicate detection, if the records are inconsistent on some attribute, that is, an attribute-value conflict occurs, then an erroneous data value must exist; based on the detected duplicate data, the true value is selected by weighted voting, and the selected value is used as the true value of the conflicting attribute, so that multiple duplicate records are merged into one consistent and accurate record;
suppose there are n duplicate records r_1, r_2, …, r_n pointing to the same student; the values of a common attribute a in these duplicate records are ra_1, ra_2, …, ra_n respectively, and the data sources of the duplicate records r_1, r_2, …, r_n are s_1, s_2, …, s_n respectively;
the score Score(ra_i) of each attribute value is calculated as:
Score(ra_i) = Trustworthy(s_i) × Vote(ra_i) / n;
where Trustworthy(s_i) is the confidence level of data source s_i, which is set manually, and Vote(ra_i) is the number of attribute values identical to ra_i;
finally, the attribute value with the highest score is selected as the true value.
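The weighted vote of claim 5 can be sketched in a few lines of Python. The function name `resolve_conflict` is hypothetical, and two details are assumptions not settled by the claim: Vote(ra_i) is taken to count ra_i itself, and ties are broken in favour of the earliest record.

```python
from collections import Counter

def resolve_conflict(values, trust):
    """Pick the true value of a conflicting attribute.

    values[i] is the attribute value in duplicate record r_i;
    trust[i] is the manually set confidence of its data source s_i.
    Returns the chosen value and the per-record scores."""
    n = len(values)
    votes = Counter(values)  # Vote(ra_i), counting ra_i itself (assumption)
    scores = [trust[i] * votes[v] / n for i, v in enumerate(values)]
    # max() returns the first maximal index, so ties go to the earliest record.
    best = max(range(n), key=lambda i: scores[i])
    return values[best], scores

value, scores = resolve_conflict(["A", "B", "B"], [0.9, 0.5, 0.6])
```

In this example the single highly trusted source reporting "A" scores 0.9 × 1 / 3, while the better of the two sources reporting "B" scores 0.6 × 2 / 3, so agreement between sources outweighs the trust of one source.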
6. The graduation whereabouts prediction method based on college student stereo data according to claim 1, characterized in that the standardization refers to applying a linear transformation to the cleaned data using the min-max standardization method, so that the transformed results all fall within the interval [0, 1]; the transfer function is as follows:
$$x^{*} = \frac{x - \min}{\max - \min}$$
where x is an original sample value, x* is its standardized value, max is the maximum value of the sample data, and min is the minimum value of the sample data.
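A minimal Python sketch of the min-max standardization in claim 6 follows. The function name is hypothetical, and mapping a constant column to 0.0 is an assumption, since the claim does not say what happens when max equals min.

```python
def min_max_scale(values):
    """Linearly rescale a list of numbers into the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case not covered by the claim; 0.0 is an assumed convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 30])
```

Each feature (for example attendance score or number of library visits) would be scaled separately, so that features measured on different scales contribute comparably to the later distance and classification steps.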
7. The graduation whereabouts prediction method based on college student stereo data according to claim 1, characterized in that predicting the graduation whereabouts of the student to be predicted using the processed stereo data comprises:
defining the graduation whereabouts categories C = [c1, c2, c3, c4, c5, c6, c7, c8] and the student features X = [x1, x2, x3, x4, x5, x6, x7];
c1, c2, c3, c4, c5, c6, c7, c8 respectively denote government agencies and public institutions, state-owned enterprises, private enterprises, foreign-invested enterprises, public schools, other education sectors, further study, and going abroad;
x1, x2, x3, x4, x5, x6, x7 respectively denote academic record, moral education score, attendance score, creative course credits, number of library visits, number of book borrowings, and professional personality test value;
first, past graduates are used as the training set: the features of the past graduates are used as the input values of a classifier and their graduation whereabouts as its output values, and the classifier is trained to obtain a trained classifier;
second, the features of the student to be predicted are used as input values and fed into the trained classifier, which outputs the graduation whereabouts probabilities of the student to be predicted;
the results are sorted by graduation whereabouts probability from largest to smallest, and the student is presented with several top-ranked graduation whereabouts categories, their probability values, and the number of past graduates in each of those categories.
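The train, predict, and rank flow of claim 7 can be sketched as follows. The claim does not name a classifier, so a simple k-nearest-neighbor vote is used here purely as a stand-in; the function name, the parameters `k` and `top_n`, and the toy data are all assumptions. Features are assumed to be already standardized to [0, 1].

```python
import math
from collections import Counter

def predict_whereabouts(train_X, train_y, x, k=5, top_n=3):
    """Return up to top_n tuples (category, probability, past_graduate_count).

    train_X / train_y: features and graduation whereabouts of past graduates.
    x: feature vector of the student to be predicted.
    Probability is the share of the k nearest past graduates in a category
    (a stand-in estimate, not the patent's classifier)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    neighbours = [train_y[i] for i in order[:k]]
    counts = Counter(neighbours)
    totals = Counter(train_y)  # past graduates per category, shown to the student
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(label, c / len(neighbours), totals[label]) for label, c in ranked[:top_n]]

# Toy data with two features instead of the seven listed in the claim.
train_X = [[0.9, 0.8], [0.85, 0.9], [0.2, 0.1], [0.1, 0.2], [0.15, 0.15]]
train_y = ["further study", "further study",
           "private enterprise", "private enterprise", "private enterprise"]
ranking = predict_whereabouts(train_X, train_y, [0.8, 0.85], k=3)
```

Any classifier that outputs per-category probabilities could replace the neighbor vote; the ranking and the per-category count of past graduates would be produced in the same way.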
8. The graduation whereabouts prediction method based on college student stereo data according to claim 1, characterized in that displaying the graduation whereabouts prediction result for the student to be predicted comprises:
an overall evaluation of the student to be predicted, professional personality suggestions, a graduation whereabouts display, and recommended past-graduate information;
the overall evaluation of the student to be predicted refers to a horizontal comparison between the current student and other students of the same grade and major;
the professional personality suggestions refer to evaluating the student's professional personality through the professional personality test taken each term, and giving guiding suggestions on the student's future employment orientation and on what the student still needs to learn;
the graduation whereabouts display refers to showing the predicted graduation whereabouts categories and probability values in visual charts;
the recommended past-graduate information refers to the stereo data and employment type of past graduates; a longitudinal comparison between the student being evaluated and past graduates serves to explain the graduation whereabouts prediction result.
9. A graduation whereabouts prediction system based on college student stereo data, characterized by comprising: a memory, a processor, and computer instructions stored in the memory and run on the processor, wherein, when the computer instructions are run by the processor, the steps of the method of any one of claims 1-8 are carried out.
10. A computer-readable storage medium on which computer instructions are stored, wherein, when the computer instructions are run by a processor, the steps of the method of any one of claims 1-8 are carried out.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810316749.5A CN108256699A (en) | 2018-04-10 | 2018-04-10 | Graduation whereabouts Forecasting Methodology and system based on college student stereo data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810316749.5A CN108256699A (en) | 2018-04-10 | 2018-04-10 | Graduation whereabouts Forecasting Methodology and system based on college student stereo data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108256699A true CN108256699A (en) | 2018-07-06 |
Family
ID=62748092
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810316749.5A Pending CN108256699A (en) | 2018-04-10 | 2018-04-10 | Graduation whereabouts Forecasting Methodology and system based on college student stereo data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108256699A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109213833A (en) * | 2018-09-10 | 2019-01-15 | 成都四方伟业软件股份有限公司 | Two disaggregated model training methods, data classification method and corresponding intrument |
CN109492676A (en) * | 2018-10-23 | 2019-03-19 | 东华大学 | Postgraduate employment prediction technique based on particle swarm algorithm Support Vector Machines Optimized |
CN109711482A (en) * | 2019-01-07 | 2019-05-03 | 东华大学 | A kind of placement of graduates information management and recommender system |
CN110059883A (en) * | 2019-04-22 | 2019-07-26 | 青岛科技大学 | A kind of method, apparatus, system and the storage medium of on-line prediction college students'employment |
CN111444189A (en) * | 2020-04-17 | 2020-07-24 | 贝壳技术有限公司 | Data processing method, device, medium and electronic equipment |
CN113222315A (en) * | 2020-12-10 | 2021-08-06 | 成都寻道科技有限公司 | University student in school data management system |
CN113642804A (en) * | 2021-08-27 | 2021-11-12 | 西安交通大学 | Multi-component enhanced family graduate-going prediction and recommendation multitasking method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108256699A (en) | Graduation whereabouts Forecasting Methodology and system based on college student stereo data | |
Schreiber | Issues and recommendations for exploratory factor analysis and principal component analysis | |
Lloyd | Spatial data analysis: an introduction for GIS users | |
CN107766418A (en) | A kind of credit estimation method based on Fusion Model, electronic equipment and storage medium | |
CN104516897B (en) | A kind of method and apparatus being ranked up for application | |
Liu et al. | Detecting outliers in species distribution data | |
CN107239967A (en) | House property information processing method, device, computer equipment and storage medium | |
CN107230108A (en) | The processing method and processing device of business datum | |
Akritas | Probability and Statistics with R | |
CN106447075B (en) | Industrial electricity demand prediction method and system | |
CN109409757A (en) | A kind of city degree Stress appraisal method based on NB Algorithm and curve modeling | |
Ruiz-Lendínez et al. | Automatic positional accuracy assessment of geospatial databases using line-based methods | |
Finch et al. | Comparison of NOHARM and DETECT in item cluster recovery: Counting dimensions and allocating items | |
CN110716998B (en) | Fine scale population data spatialization method | |
CN104361600B (en) | motion recognition method and system | |
Herlambang et al. | Intelligent computing system to predict vocational high school student learning achievement using Naïve Bayes algorithm | |
de Mast et al. | Modeling and evaluating repeatability and reproducibility of ordinal classifications | |
Traun et al. | Autocorrelation-Based Regioclassification–a self-calibrating classification approach for choropleth maps explicitly considering spatial autocorrelation | |
Praserttitipong et al. | Elective course recommendation model for higher education program. | |
Feng | Predicting students' academic performance with Decision Tree and Neural Network | |
Hermans | Implementation of geographically weighted regression in automated valuation models in The Netherlands | |
Yearsley et al. | Contextuality in human decision making in the presence of direct influences: A comment on Basieva et al.(2019) | |
JP2014206382A (en) | Target type identification device | |
Wijaya et al. | Implementation of KNN Algorithm for Occupancy Classification of Rehabilitation Houses | |
CN116228484B (en) | Course combination method and device based on quantum clustering algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180706 | |