CN109255480A - Indirect commission rate prediction method, device, computer equipment and storage medium - Google Patents
- Publication number
- CN109255480A (application number CN201811001657.4A)
- Authority
- CN
- China
- Prior art keywords
- commission
- data
- field
- random forest
- rate
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
Abstract
This application discloses an indirect commission rate prediction method, device, computer equipment and storage medium. The method comprises: obtaining the historical performance data of multiple employees and performing data cleansing on it to obtain target data; taking the data in the target data whose indirect commission rate field is not missing as a training set and the data whose indirect commission rate field is missing as a test set, and inputting the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction; and inputting the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set. Because the cleaned historical performance data of multiple employees is used as the training set to train the random forest model, the predicted values are highly accurate and over-fitting is avoided.
Description
Technical field
This application relates to the technical field of data processing, and in particular to an indirect commission rate prediction method, device, computer equipment and storage medium.
Background
At present, when the performance of enterprise employees in the insurance industry is calculated, the common parameters include direct commission, indirect commission and the like. To set a reasonable indirect commission ratio (abbreviated as the indirect commission rate) and analyze its influence on the enterprise's operating costs, the analysis and prediction are generally made with reference to historical monthly data, which gives low accuracy. Moreover, many conditions influence the indirect commission rate, and it is difficult to combine all of them manually into an accurate judgment.
Summary of the invention
This application provides an indirect commission rate prediction method, device, computer equipment and storage medium, aiming to solve the problem in the prior art that the indirect commission ratio is analyzed and predicted with reference to historical monthly data, with low accuracy.
This application provides an indirect commission rate prediction method, comprising:
Obtaining the historical performance data of multiple employees and performing data cleansing on the historical performance data to obtain target data; wherein the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields included in the target data are completed values;
Selecting the data in the target data whose indirect commission rate field is not missing as a training set, taking the data in the target data whose indirect commission rate field is missing as a test set, and inputting the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction;
Inputting the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
This application provides an indirect commission rate prediction device, comprising:
A data cleansing unit, configured to obtain the historical performance data of multiple employees and perform data cleansing on the historical performance data to obtain target data; wherein the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields included in the target data are completed values;
A model acquiring unit, configured to select the data in the target data whose indirect commission rate field is not missing as a training set, take the data in the target data whose indirect commission rate field is missing as a test set, and input the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction;
A predicted value acquiring unit, configured to input the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
This application further provides computer equipment comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements any one of the indirect commission rate prediction methods provided by this application.
This application also provides a storage medium storing a computer program, the computer program comprising program instructions which, when executed by a processor, cause the processor to execute any one of the indirect commission rate prediction methods provided by this application.
This application provides an indirect commission rate prediction method, device, computer equipment and storage medium. The method obtains the historical performance data of multiple employees and performs data cleansing on the historical performance data to obtain target data; wherein the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields included in the target data are completed values. The data in the target data whose indirect commission rate field is not missing is selected as a training set, the data whose indirect commission rate field is missing is taken as a test set, and the training set is input into a random forest model function to obtain a random forest model for indirect commission rate prediction. The test set is input into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set. Because the method uses the cleaned historical performance data of multiple employees as the training set and inputs it into the random forest model function to obtain the random forest model for indirect commission rate prediction, the predicted values are highly accurate and over-fitting is avoided.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Evidently, the drawings described below show some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow diagram of the indirect commission rate prediction method provided by an embodiment of this application;
Fig. 2 is a schematic sub-flow diagram of the indirect commission rate prediction method provided by an embodiment of this application;
Fig. 3 is another schematic flow diagram of the indirect commission rate prediction method provided by an embodiment of this application;
Fig. 4 is another schematic sub-flow diagram of the indirect commission rate prediction method provided by an embodiment of this application;
Fig. 5 is another schematic sub-flow diagram of the indirect commission rate prediction method provided by an embodiment of this application;
Fig. 6 is a schematic block diagram of the indirect commission rate prediction device provided by an embodiment of this application;
Fig. 7 is a schematic block diagram of subunits of the indirect commission rate prediction device provided by an embodiment of this application;
Fig. 8 is another schematic block diagram of the indirect commission rate prediction device provided by an embodiment of this application;
Fig. 9 is another schematic block diagram of subunits of the indirect commission rate prediction device provided by an embodiment of this application;
Fig. 10 is another schematic block diagram of subunits of the indirect commission rate prediction device provided by an embodiment of this application;
Fig. 11 is a schematic block diagram of computer equipment provided by an embodiment of this application.
Detailed description of the embodiments
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments of this application. Evidently, the described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
It should be understood that, when used in this specification and the appended claims, the terms "includes" and "comprises" indicate the presence of the described features, wholes, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components and/or sets thereof.
It should also be understood that the terms used in this specification are for the purpose of describing specific embodiments only and are not intended to limit this application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
Referring to Fig. 1, Fig. 1 is a schematic flow diagram of the indirect commission rate prediction method provided by an embodiment of this application. The method is applied in terminals such as desktop computers, laptop computers and tablet computers, and can also be applied in a server. As shown in Fig. 1, the method includes steps S101 to S103.
S101: obtaining the historical performance data of multiple employees and performing data cleansing on the historical performance data to obtain target data; wherein the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields included in the target data are completed values.
In this embodiment, the server that stores the employees' performance data first imports the historical performance data of multiple employees into a specified data table, and the historical performance data of the multiple employees is then obtained from that data table. Each row of the historical performance data is a training sample (that is, an employee), and each column is a feature of the sample, so each column can be understood as a feature field. For example, the training sample in each row has the following fields:
Employee ID;
Name;
Gender: male = man, female = woman;
Age;
Total number of lineal relatives in the enterprise;
Total number of collateral relatives in the enterprise;
Total number of alumni in the enterprise;
Salary;
Job title rank;
Direct commission total;
Direct commission rate;
Indirect commission rate;
Here, the indirect commission rate corresponds to the indirect commission rate field, while the employee ID, name, gender, age, total number of lineal relatives in the enterprise, total number of collateral relatives in the enterprise, total number of alumni in the enterprise, salary, job title rank, direct commission total and direct commission rate are the associated fields related to the indirect commission rate.
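As an illustration only, the record layout above could be represented as follows. This is a minimal Python sketch; the class name `EmployeeRecord`, the attribute names and the choice of `None` for an unfilled value are assumptions made for the example and do not come from the application.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EmployeeRecord:
    """One row of historical performance data (hypothetical field names)."""
    employee_id: str
    name: str
    gender: Optional[str] = None
    age: Optional[int] = None
    lineal_relatives: Optional[int] = None
    collateral_relatives: Optional[int] = None
    alumni: Optional[int] = None
    salary: Optional[float] = None
    title_rank: Optional[int] = None
    direct_commission_total: Optional[float] = None
    direct_commission_rate: Optional[float] = None
    indirect_commission_rate: Optional[float] = None  # the field to predict


TARGET_FIELD = "indirect_commission_rate"
# Every other column is treated as an associated field.
ASSOCIATED_FIELDS = [f.name for f in fields(EmployeeRecord) if f.name != TARGET_FIELD]

# A record whose target value is still missing (None).
record = EmployeeRecord(employee_id="E001", name="Example", direct_commission_rate=0.12)
```

Keeping the target field separate from the associated fields mirrors the distinction the text draws between the value to be predicted and the values that are completed during cleansing.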
When the historical performance data of the multiple employees is obtained and cleansed to produce the target data, the missing values of the indirect commission rate field do not need to be completed, because they are the values to be predicted. The associated fields related to the indirect commission rate, however, must be completed during data cleansing to meet the data requirements of the prediction process. In other words, the historical performance data of the multiple employees can be regarded as unprocessed raw data that includes the indirect commission rate field and the associated fields related to the indirect commission rate. Because the associated fields in this historical performance data may be unassigned, their values must be completed through data cleansing.
In one embodiment, as shown in Fig. 2, step S101 includes:
S1011: performing an integrity check on the historical performance data of each employee in the historical performance data of the multiple employees, and, if an associated field in an employee's historical performance data has a missing value, completing the missing value with the average value of the corresponding field, to obtain complete data;
S1012: obtaining the correlation coefficient between each associated field and the indirect commission rate field in the complete data, and retaining the associated fields whose correlation coefficient ranks ahead of a preset rank value, to obtain first-cleaned data;
S1013: obtaining the skewness distribution of the first-cleaned data, and applying a logarithm operation to each field in the first-cleaned data whose skewness value exceeds a preset skewness coefficient, to obtain the target data.
In this embodiment, the integrity check is performed on the historical performance data of each employee because missing values are not allowed in the prediction process; missing values therefore need to be completed by average-value filling to obtain complete data.
Suppose there is data for 100 employees, of which 10 records lack the total number of lineal relatives in the enterprise, 20 lack the total number of alumni in the enterprise, and 7 lack the job title rank. In this case, the user can be prompted to supply the missing values, or the average value can be filled in automatically. That is, each missing value in the data above can be supplemented with the average value of its field, so that the completed data does not affect subsequent analysis and computation.
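The average-value filling described above can be sketched as follows. This is a minimal illustration in Python, assuming rows are dictionaries and a missing value is stored as `None`; the function name `fill_missing_with_mean` and the sample field names are hypothetical.

```python
from statistics import mean


def fill_missing_with_mean(rows, field_names):
    """Return copies of the rows with None replaced by the field's mean."""
    filled = [dict(row) for row in rows]  # leave the input untouched
    for field in field_names:
        present = [row[field] for row in rows if row[field] is not None]
        if not present:
            continue  # no observed value to average; leave the gaps as they are
        field_mean = mean(present)
        for row in filled:
            if row[field] is None:
                row[field] = field_mean
    return filled


employees = [
    {"relatives": 2, "alumni": 10},
    {"relatives": None, "alumni": 30},
    {"relatives": 4, "alumni": None},
]
completed = fill_missing_with_mean(employees, ["relatives", "alumni"])
```

Only the associated fields would be passed in `field_names`; the indirect commission rate field is deliberately left out because its gaps are the values to be predicted.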
The correlation coefficient between each associated field and the indirect commission rate field in the complete data is then obtained. For example, suppose the correlation coefficient between the direct commission rate field and the indirect commission rate field is 0.8 and the correlation coefficient between the total number of alumni in the enterprise field and the indirect commission rate field is 0.7, so that these two fields rank in the top two by correlation coefficient with the indirect commission rate field. If the preset rank value is 3, only the associated fields ranked ahead of third place are retained: all fields in the complete data other than the direct commission rate field, the total number of alumni in the enterprise field and the indirect commission rate field can be deleted, giving the first-cleaned data.
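The ranking step can be sketched as follows. The application does not say which correlation coefficient is used, so this Python example assumes the Pearson coefficient and ranks by its absolute value; the function names and the sample columns are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / (spread_x * spread_y)


def select_fields(columns, target, preset_rank):
    """Keep the fields ranked strictly ahead of preset_rank by |correlation|."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:preset_rank - 1]


indirect_rate = [0.1, 0.2, 0.3, 0.4]
columns = {
    "age": [30, 30, 31, 30],
    "direct_commission_rate": [1, 2, 3, 4],
    "alumni": [2, 1, 4, 3],
}
kept = select_fields(columns, indirect_rate, preset_rank=3)
```

With a preset rank value of 3, the two highest-ranked associated fields are kept, which matches the example in the text where the direct commission rate and the alumni count survive the first cleaning.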
That is, if the skewness value of a field in the first-cleaned data exceeds the preset skewness coefficient, the logarithm of each value of that field is taken to reduce the field's skewness. For example, if a value of the field is x, the adjusted value after the logarithm operation is ln x, that is, the logarithm to base e. After this adjustment, the data can be used to build the subsequent random forest model.
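A possible form of the skewness check and logarithm adjustment is sketched below in Python. The application does not define how skewness is computed, so the moment-based coefficient used here is an assumption, as are the function names and the threshold of 1.0. The natural logarithm requires strictly positive values.

```python
from math import log


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean_value = sum(values) / n
    m2 = sum((v - mean_value) ** 2 for v in values) / n
    m3 = sum((v - mean_value) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 else 0.0


def log_if_skewed(values, max_skew):
    """Replace each value x by ln(x) when the field is too skewed."""
    if abs(skewness(values)) > max_skew:
        return [log(v) for v in values]  # assumes every value is > 0
    return list(values)


salaries = [1, 2, 4, 8, 16, 32, 64]  # strongly right-skewed toy data
adjusted = log_if_skewed(salaries, max_skew=1.0)
unchanged = log_if_skewed([1, 2, 3], max_skew=1.0)
```

In this toy example the doubling values become evenly spaced after the logarithm, so the skewness of the adjusted field drops to approximately zero.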
S102: taking the data in the target data whose indirect commission rate field is not missing as a training set, taking the data in the target data whose indirect commission rate field is missing as a test set, and inputting the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction.
In this embodiment, if the missing values in the indirect commission rate field were filled in using the average value or random values, the accuracy would be low and over-fitting could occur, so the resulting indirect commission rate data would have little practical value when applied to the analysis of enterprise operating costs. Instead, the data in the target data whose indirect commission rate field is not missing is selected as the training set, and the training set is input into the random forest model function to obtain the random forest model for indirect commission rate prediction.
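The split into training set and test set follows directly from whether the target field is present. A minimal Python sketch, assuming rows are dictionaries with `None` marking a missing indirect commission rate (the function and key names are hypothetical):

```python
def split_by_target(rows, target_field):
    """Rows with a known target form the training set; the rest form the test set."""
    training_set = [row for row in rows if row[target_field] is not None]
    test_set = [row for row in rows if row[target_field] is None]
    return training_set, test_set


target_data = [
    {"direct_rate": 0.10, "indirect_rate": 0.02},
    {"direct_rate": 0.12, "indirect_rate": None},
    {"direct_rate": 0.15, "indirect_rate": 0.03},
]
training_set, test_set = split_by_target(target_data, "indirect_rate")
```

Note that the "test set" here is the set of rows to be predicted, not a held-out set with known answers.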
For example, the data whose indirect commission rate field is not missing is input into the cforest() function, which uses a random forest model, namely:
Model <- cforest(indirect commission rate ~ direct commission rate + total number of alumni in the enterprise)
Through the above training process, the random forest model for indirect commission rate prediction is obtained.
In one embodiment, as shown in Fig. 3, the following steps are further included after step S102:
S102a: randomly selecting a corresponding amount of data from the training set as a validation set according to a preset extraction ratio;
S102b: inputting the validation set into the random forest model for model validation, and saving the random forest model if its validation accuracy exceeds a preset accuracy threshold.
In this embodiment, to verify the accuracy of the random forest model, a corresponding amount of data can be randomly selected from the training set as a validation set. If the validation result shows that the validation accuracy of the random forest model exceeds the preset accuracy threshold (for example, a preset accuracy threshold of 80%), the random forest model is saved as the prediction model for subsequent use.
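The validation step could look like the sketch below. The application does not define how "accuracy" is measured for a numeric prediction, so this Python example assumes a prediction counts as correct when it lies within a tolerance of the known value; the function names, the tolerance and the stand-in `predict` function are all hypothetical.

```python
import random


def draw_validation_set(training_set, ratio, seed=0):
    """Randomly pick a share of the training rows given by the extraction ratio."""
    rng = random.Random(seed)
    count = max(1, round(len(training_set) * ratio))
    return rng.sample(training_set, count)


def validation_accuracy(predict, validation_set, target_field, tolerance):
    """Share of rows whose prediction is within `tolerance` of the known value."""
    hits = sum(
        abs(predict(row) - row[target_field]) <= tolerance
        for row in validation_set
    )
    return hits / len(validation_set)


def should_save(accuracy, threshold=0.8):
    """Keep the model only if accuracy exceeds the preset threshold."""
    return accuracy > threshold


training_set = [{"x": i, "rate": 0.1 * i} for i in range(10)]
validation_set = draw_validation_set(training_set, ratio=0.2)
predict = lambda row: 0.1 * row["x"]  # stand-in for the trained model
accuracy = validation_accuracy(predict, validation_set, "rate", tolerance=0.01)
```

Because the validation rows are drawn from the training set itself, this check measures fit rather than generalization; a separately held-out set would be a stricter test.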
In one embodiment, as shown in Fig. 4, step S102 includes:
S1021: randomly drawing a first number of sample sets from the training set with replacement, and constructing the first number of classification and regression trees from the sample sets;
S1022: training each classification and regression tree according to the bagging method to obtain multiple decision trees, and combining the decision trees to obtain the random forest model for indirect commission rate prediction.
In this embodiment, the bagging method is an important part of ensemble methods: it provides the data used to train each base estimator. As the name suggests, bagging puts all the training data into a black bag (which can be pictured as a black box or a black wrapper); black means that the details of the data inside cannot be seen, only that the bag contains a data set. A portion of the data is then drawn at random from the bag to train one base estimator. After the drawn data has been used, there are two options: put it back or do not put it back. Bagging can effectively reduce variance, that is, reduce the degree of over-fitting.
A random forest is obtained by combining bagging with decision trees. Decision trees are used as the base estimators, many small decision trees are trained with bagging, and these small decision trees are finally combined, which yields a forest (the random forest).
More specifically, the process of training a random forest model from the original sample data is as follows:
1) From the original training data set, the bootstrap method (a resampling technique from statistics) is used to randomly draw k new bootstrap sample sets with replacement, from which k classification and regression trees are constructed; the samples not drawn each time make up k sets of out-of-bag (OOB) data;
2) Given n features, mtry features are randomly selected at each node of every tree; by computing the amount of information each of these features contains, the feature with the greatest classification ability is chosen to split the node;
3) Each tree is grown to its maximum extent, without any pruning;
4) The generated trees form the random forest, which is used to classify new data; the classification result is determined by the number of votes cast by the tree classifiers.
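Steps 1) and 4) above can be sketched as follows. This is a minimal Python illustration of bootstrap sampling with out-of-bag bookkeeping and of majority voting; it does not build the trees themselves, and the function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_with_oob(n_samples, k, seed=0):
    """Draw k bootstrap index sets (with replacement) and their out-of-bag indices."""
    rng = random.Random(seed)
    draws = []
    for _ in range(k):
        in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
        out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
        draws.append((in_bag, out_of_bag))
    return draws


def majority_vote(votes):
    """Return the class predicted by the largest number of trees."""
    return Counter(votes).most_common(1)[0][0]


draws = bootstrap_with_oob(n_samples=20, k=5)
winner = majority_vote(["high", "low", "high"])
```

Each tree would be trained on the rows indexed by its `in_bag` list, and its `out_of_bag` rows can serve as an internal check because that tree never saw them. For a numeric target such as the indirect commission rate, the trees' outputs would typically be averaged rather than voted on; voting applies to the classification case described in step 4).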
As the name suggests, a random forest is a forest built in a random manner. The forest consists of many decision trees, and the decision trees of a random forest are not correlated with one another. Once the forest has been obtained, when a new input sample arrives, each decision tree in the forest judges separately which class the sample should belong to (for a classification algorithm); the class selected most often is then predicted as the class of the sample.
Two points need attention when building each decision tree: sampling and complete splitting. First come two random sampling processes: the random forest samples both the rows and the columns of the input data. Row sampling is done with replacement, so the sampled set may contain duplicate samples. If there are N input samples, N samples are drawn. As a result, the input to each tree during training is not the full set of samples, which makes over-fitting relatively unlikely. Column sampling is then performed, selecting m of the M features (m << M). After that, a decision tree is built from the sampled data by complete splitting, so that each leaf node of the tree either cannot be split further or contains only samples of the same class. Many decision tree algorithms include an important pruning step, but pruning is not performed here: because the two random sampling processes above guarantee randomness, over-fitting does not occur even without pruning.
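The column sampling described above can be sketched as follows in Python. The text only requires that m be much smaller than M; taking m as the integer square root of M is a common heuristic and is an assumption here, as are the function name and the feature names.

```python
import random
from math import isqrt


def sample_features(feature_names, m, rng):
    """Draw m distinct features out of the M available (column sampling)."""
    if not 0 < m <= len(feature_names):
        raise ValueError("m must be between 1 and the number of features")
    return rng.sample(feature_names, m)


features = [
    "age", "salary", "title_rank", "lineal_relatives", "collateral_relatives",
    "alumni", "direct_commission_total", "direct_commission_rate", "gender",
]
rng = random.Random(42)
m = isqrt(len(features))  # assumed heuristic: m = floor(sqrt(M))
subset = sample_features(features, m, rng)
```

Drawing a fresh subset for each tree (or each node) is what keeps the trees from all splitting on the same dominant feature.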
S103: inputting the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
In this embodiment, the test set is input into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set, and each obtained indirect commission rate value is filled into its corresponding missing position, which completes the prediction of the indirect commission rate.
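Filling the predicted values back into the missing positions can be sketched as follows. This Python example assumes rows are dictionaries with `None` marking a missing value, and uses a stand-in `predict` function in place of the trained random forest; all names are hypothetical.

```python
def fill_predictions(rows, target_field, predict):
    """Return copies of the rows with each missing target replaced by its prediction."""
    completed = []
    for row in rows:
        new_row = dict(row)
        if new_row[target_field] is None:
            new_row[target_field] = predict(new_row)
        completed.append(new_row)
    return completed


test_set = [
    {"direct_rate": 0.5, "indirect_rate": None},
    {"direct_rate": 1.0, "indirect_rate": None},
]
predict = lambda row: row["direct_rate"] * 2  # stand-in for the trained model
predicted = fill_predictions(test_set, "indirect_rate", predict)
```

Rows that already carry a value are passed through unchanged, so the same function can be applied to the whole target data set if that is more convenient.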
In one embodiment, as shown in Fig. 5, step S103 includes:
S1031: obtaining, according to the random forest model, the operation function between the associated fields and the indirect commission rate field;
S1032: inputting the associated field values of each employee in the test set into the operation function to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
In this embodiment, after the test set is input into the random forest model, the operation function between the associated fields and the indirect commission rate field can be obtained, for example a linear function (indirect commission rate = 1.1 × direct commission rate + 10 × total number of alumni in the enterprise / total number of people in the enterprise, and so on). The associated field values of each employee in the test set are then input into the operation function to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set. This completes an accurate prediction of the indirect commission rate and avoids over-fitting of the predicted data.
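The example linear function above can be written out directly. This Python sketch only evaluates the illustrative formula quoted in the text; the coefficients 1.1 and 10 come from that example, while the function and parameter names are hypothetical.

```python
def indirect_commission_rate(direct_rate, alumni_total, enterprise_total):
    """Evaluate the example linear operation function from the text."""
    if enterprise_total <= 0:
        raise ValueError("enterprise_total must be positive")
    return 1.1 * direct_rate + 10 * alumni_total / enterprise_total


# An employee with a 10% direct commission rate and 5 alumni among 1000 staff.
rate = indirect_commission_rate(direct_rate=0.1, alumni_total=5, enterprise_total=1000)
```

For this input the first term contributes 0.11 and the second 0.05, giving a value of about 0.16.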
It can be seen that this method uses the cleaned historical performance data of multiple employees as the training set and inputs it into the random forest model function to obtain the random forest model for indirect commission rate prediction; the predicted values are highly accurate and over-fitting is avoided.
An embodiment of this application also provides an indirect commission rate prediction device, which is used to execute any embodiment of the foregoing indirect commission rate prediction method. Specifically, referring to Fig. 6, Fig. 6 is a schematic block diagram of the indirect commission rate prediction device provided by an embodiment of this application. The indirect commission rate prediction device 100 can be configured in terminals such as desktop computers, tablet computers and laptop computers, and can also be configured in a server.
As shown in Fig. 6, the indirect commission rate prediction device 100 includes a data cleansing unit 101, a model acquiring unit 102 and a predicted value acquiring unit 103.
The data cleansing unit 101 is configured to obtain the historical performance data of multiple employees and perform data cleansing on the historical performance data to obtain target data; wherein the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields included in the target data are completed values.
In this embodiment, the server that stores the employees' performance data first imports the historical performance data of multiple employees into a specified data table, and the historical performance data of the multiple employees is then obtained from that data table. Each row of the historical performance data is a training sample (that is, an employee), and each column is a feature of the sample, so each column can be understood as a feature field. For example, the training sample in each row has the following fields:
Employee ID;
Name;
Gender: male = man, female = woman;
Age;
Total number of lineal relatives in the enterprise;
Total number of collateral relatives in the enterprise;
Total number of alumni in the enterprise;
Salary;
Job title rank;
Direct commission total;
Direct commission rate;
Indirect commission rate;
Here, the indirect commission rate corresponds to the indirect commission rate field, while the employee ID, name, gender, age, total number of lineal relatives in the enterprise, total number of collateral relatives in the enterprise, total number of alumni in the enterprise, salary, job title rank, direct commission total and direct commission rate are the associated fields related to the indirect commission rate.
When the historical performance data of the multiple employees is obtained and cleansed to produce the target data, the missing values of the indirect commission rate field do not need to be completed, because they are the values to be predicted. The associated fields related to the indirect commission rate, however, must be completed during data cleansing to meet the data requirements of the prediction process. In other words, the historical performance data of the multiple employees can be regarded as unprocessed raw data that includes the indirect commission rate field and the associated fields related to the indirect commission rate. Because the associated fields in this historical performance data may be unassigned, their values must be completed through data cleansing.
In one embodiment, as shown in Fig. 7, the data cleansing unit 101 includes:
A missing value supplementing unit 1011, configured to perform an integrity check on the historical performance data of each employee in the historical performance data of the multiple employees and, if an associated field in an employee's historical performance data has a missing value, complete the missing value with the average value of the corresponding field, to obtain complete data;
A correlation judging unit 1012, configured to obtain the correlation coefficient between each associated field and the indirect commission rate field in the complete data and retain the associated fields whose correlation coefficient ranks ahead of a preset rank value, to obtain first-cleaned data;
A skewness computing unit 1013, configured to obtain the skewness distribution of the first-cleaned data and apply a logarithm operation to each field in the first-cleaned data whose skewness value exceeds a preset skewness coefficient, to obtain the target data.
In this embodiment, the integrity check is performed on each employee's historical performance data because missing values are not allowed in the prediction process; missing values therefore need to be filled in with the field average to obtain complete data. Suppose there are data for 100 employees, of which 10 lack the number of direct relatives in the enterprise, 20 lack the number of alumni in the enterprise, and 7 lack the job grade. In this case, the user can be prompted to supply the values, or the values can be filled in automatically with the average. That is, each missing value above can be replaced by the average of its field, so that the completed data does not affect subsequent analysis and computation.
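The mean-filling step described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name fill_missing_with_mean, the field names, and the use of None to mark a missing value are all assumptions made for the example.

```python
from statistics import mean

def fill_missing_with_mean(rows, fields):
    """Return copies of rows with None in each listed field replaced by that field's mean."""
    filled = [dict(row) for row in rows]
    for field in fields:
        present = [row[field] for row in rows if row[field] is not None]
        if not present:
            continue  # no observed value to average; leave the field untouched
        field_mean = mean(present)
        for row in filled:
            if row[field] is None:
                row[field] = field_mean
    return filled

# Hypothetical records: None marks a missing associated field.
employees = [
    {"alumni_count": 4, "wage": 5000},
    {"alumni_count": None, "wage": 7000},
    {"alumni_count": 8, "wage": None},
]
completed = fill_missing_with_mean(employees, ["alumni_count", "wage"])
```

Only associated fields would be passed in the fields argument; the indirect commission rate field is left out because its missing values are the ones to be predicted.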
The correlation coefficient between each associated field and the indirect commission rate field in the complete data is then obtained. For example, suppose the correlation coefficient between the direct commission rate field and the indirect commission rate field is 0.8, the correlation coefficient between the number-of-alumni-in-the-enterprise field and the indirect commission rate field is 0.7, and these two fields rank first and second by correlation coefficient with the indirect commission rate field. If the preset rank value is 3, all fields in the complete data other than the direct commission rate field, the number-of-alumni-in-the-enterprise field, and the indirect commission rate field can be deleted to obtain the initially cleaned data.
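The correlation-based field selection can be sketched as below. The text does not say which correlation coefficient is used or how negative correlations are ranked, so the Pearson coefficient and ranking by absolute value are assumptions, as are the function names and the sample numbers. The coefficients would be computed over the rows whose indirect commission rate is known.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)

def select_top_fields(columns, target, rank_value):
    """Keep the associated fields whose |correlation| with the target ranks before rank_value."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[: rank_value - 1]

# Hypothetical columns of associated fields and the known indirect commission rates.
columns = {
    "direct_commission_rate": [0.10, 0.20, 0.30, 0.40],
    "alumni_count": [1, 3, 2, 5],
    "age": [30, 25, 41, 33],
}
indirect_rate = [0.05, 0.11, 0.14, 0.21]
kept = select_top_fields(columns, indirect_rate, rank_value=3)
```

With a preset rank value of 3, the two highest-ranked fields are kept, which mirrors the example in the text.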
That is, if the skewness value of a field in the initially cleaned data exceeds the preset skewness coefficient, a logarithm is taken of each value in that field to reduce the field's skewness. For example, if a value in the field is x, the adjusted value after the logarithmic operation is ln x, i.e., the logarithm to base e. After this adjustment, the resulting data can be used to build the subsequent random forest model.
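A minimal sketch of the skewness check and logarithmic adjustment follows. The text does not define the skewness statistic, so the population skewness (third central moment over the cubed standard deviation) is assumed here; the function names, the threshold of 1.0, and the sample values are hypothetical. The transform is applied only when every value is positive, since ln x is undefined otherwise.

```python
from math import log

def skewness(values):
    """Population skewness: third central moment divided by the cubed standard deviation."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5

def reduce_skew(values, max_skew):
    """Replace each x with ln(x) when the field's skewness exceeds max_skew and all x > 0."""
    if skewness(values) > max_skew and all(v > 0 for v in values):
        return [log(v) for v in values]
    return list(values)

# Hypothetical total direct commission values with a long right tail.
commissions = [100, 1000, 10000, 100000, 1000000]
adjusted = reduce_skew(commissions, max_skew=1.0)
```

In this example the values are evenly spaced on a logarithmic scale, so the adjusted field is symmetric and its skewness drops to approximately zero.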
Model acquiring unit 102, configured to take the data in the target data whose indirect commission rate field is not missing as a training set and the data in the target data whose indirect commission rate field is missing as a test set, and to input the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction.
In this embodiment, if missing values in the indirect commission rate field were filled in with the average value or with randomly written values, the accuracy would be low and overfitting could occur, so the resulting indirect commission rate data would have little practical value when applied to analyzing enterprise operating costs. Instead, the data in the target data whose indirect commission rate field is not missing is selected as the training set and input into the random forest model function to obtain the random forest model for indirect commission rate prediction.
For example, the data whose indirect commission rate field is not missing is input into the cforest() function, which uses a random forest model, namely:
model <- cforest(indirect commission rate ~ direct commission rate + number of alumni in the enterprise).
Through this training process, the random forest model for indirect commission rate prediction is obtained.
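The division of the target data into a training set and a test set can be sketched as follows. The function name split_by_target, the field names, and the use of None for a missing indirect commission rate are assumptions made for illustration.

```python
def split_by_target(rows, target):
    """Rows with a known target form the training set; rows with a missing target form the test set."""
    training = [row for row in rows if row[target] is not None]
    test = [row for row in rows if row[target] is None]
    return training, test

# Hypothetical target data after cleaning; only the target field may still be missing.
target_data = [
    {"employee_id": "E001", "direct_rate": 0.10, "indirect_rate": 0.04},
    {"employee_id": "E002", "direct_rate": 0.20, "indirect_rate": None},
    {"employee_id": "E003", "direct_rate": 0.15, "indirect_rate": 0.06},
]
training_set, test_set = split_by_target(target_data, "indirect_rate")
```

The training set would then be passed to the random forest model function, and the test set is held back until prediction.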
In one embodiment, as shown in FIG. 8, the indirect commission rate prediction device 100 further includes:
a verification set selection unit 102a, configured to randomly select a corresponding amount of data from the training set as a verification set according to a preset extraction ratio;
a model verification unit 102b, configured to input the verification set into the random forest model for model verification and, if the verification accuracy of the random forest model exceeds a preset accuracy threshold, save the random forest model.
In this embodiment, to verify the accuracy of the random forest model, a corresponding amount of data can again be randomly selected from the training set as the verification set. If the verification result shows that the verification accuracy of the random forest model exceeds the preset accuracy threshold (the preset accuracy threshold is 80%), the random forest model is saved as the prediction model for subsequent use.
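The verification step can be sketched as below. The text does not say how accuracy is measured for a predicted rate, so the example assumes a caller-supplied is_correct callback (here, a prediction within a fixed tolerance of the known value); that callback, the function names, the stand-in model, and the 25% extraction ratio are all hypothetical.

```python
import random

def draw_validation_set(training_set, ratio, seed=0):
    """Randomly select ratio * len(training_set) rows (at least one) as the verification set."""
    count = max(1, round(len(training_set) * ratio))
    return random.Random(seed).sample(training_set, count)

def should_save(model, validation_set, is_correct, threshold=0.8):
    """Return True when the share of correct predictions exceeds the accuracy threshold."""
    hits = sum(1 for row in validation_set if is_correct(model(row), row))
    return hits / len(validation_set) > threshold

# Hypothetical training rows and a stand-in for the trained model.
training_set = [{"direct_rate": d / 100, "indirect_rate": d / 200} for d in range(1, 21)]
model = lambda row: row["direct_rate"] * 0.5
within_tolerance = lambda predicted, row: abs(predicted - row["indirect_rate"]) <= 0.005

validation_set = draw_validation_set(training_set, ratio=0.25)
save_model = should_save(model, validation_set, within_tolerance)
```

The default threshold of 0.8 corresponds to the 80% accuracy threshold given in the text.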
In one embodiment, as shown in FIG. 9, the model acquiring unit 102 includes:
a classification and regression tree acquiring unit 1021, configured to randomly draw, with replacement, a first quantity of sample sets from the training set and construct the first quantity of classification and regression trees from the sample sets;
a decision tree combination unit 1022, configured to train each classification and regression tree according to the bagging method to obtain multiple decision trees, and combine the decision trees into the random forest model for indirect commission rate prediction.
In this embodiment, bagging is an important step in ensemble methods: it obtains the data used to train each base estimator. As the name suggests, bagging puts all the training data into a black bag (which can be pictured as a black box); black means that the details of the data inside cannot be seen, and only the fact that the bag holds a data set is known. A portion of the data is then drawn at random from the bag to train one base estimator. After the drawn data has been used, there are two options: put it back or do not put it back. Bagging can effectively reduce variance, that is, reduce the degree of overfitting.
Combining bagging with decision trees yields a random forest. Decision trees are used as the base estimators, bagging is used to train many small decision trees, and finally these small decision trees are combined, which yields a forest (a random forest).
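The bagging procedure described above can be sketched with a deliberately small base estimator. In this illustration each base estimator is a one-split regression tree on a single feature, and the ensemble averages the individual predictions; the stump, the averaging rule, the function names, and the sample data are assumptions chosen to keep the example short, not the trees used in the embodiment.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single feature; returns a predict function."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = sum((y - left_mean) ** 2 for y in left) + sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # every x in the sample is identical: predict the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, threshold, left_value, right_value = best
    return lambda x: left_value if x <= threshold else right_value

def bagging(xs, ys, n_estimators, seed=0):
    """Train each stump on a bootstrap sample and average the stumps' predictions."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        picks = [rng.randrange(n) for _ in range(n)]  # draw n rows with replacement
        stumps.append(fit_stump([xs[i] for i in picks], [ys[i] for i in picks]))
    return lambda x: mean(stump(x) for stump in stumps)

# Hypothetical data: direct commission rate (x) against indirect commission rate (y).
direct = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
indirect = [0.05, 0.05, 0.06, 0.05, 0.20, 0.21, 0.19, 0.20]
predict = bagging(direct, indirect, n_estimators=25)
```

Because every stump sees a different bootstrap sample, the averaged prediction varies less than that of any single stump, which is the variance reduction the text attributes to bagging.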
As the name suggests, a random forest is a forest built in a random manner. The forest consists of many decision trees, and the decision trees in a random forest are not correlated with one another. Once the forest has been obtained, when a new input sample arrives, each decision tree in the forest judges it separately to decide which class the sample should belong to (for a classification algorithm); the class chosen most often is then predicted as the class of the sample.
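The voting rule for classification can be sketched in a few lines. The three trees below are hypothetical hand-written rules standing in for trained decision trees, and the class labels and field names are invented for the example.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Let every tree classify the sample and return the class with the most votes."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for trained decision trees.
trees = [
    lambda s: "high" if s["direct_rate"] > 0.2 else "low",
    lambda s: "high" if s["alumni_count"] > 3 else "low",
    lambda s: "high" if s["wage"] > 8000 else "low",
]
sample = {"direct_rate": 0.3, "alumni_count": 1, "wage": 9000}
predicted_class = forest_predict(trees, sample)
```

Here two of the three trees vote for the same class, so that class is returned.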
In building each decision tree, two points need attention: sampling and full splitting. First come two random sampling processes: the random forest samples both the rows and the columns of the input data. Rows are sampled with replacement, so the resulting sample set may contain duplicate samples. If there are N input samples, N samples are drawn. As a result, the input to each tree during training is not the full set of samples, which makes overfitting relatively unlikely. Column sampling follows: m features are selected from the M features (m << M). A decision tree is then built from the sampled data by full splitting, so that each leaf node of the tree either cannot be split further or contains only samples of the same class. Many decision tree algorithms include an important pruning step, but pruning is not performed here: because the two random sampling processes above ensure randomness, overfitting does not occur even without pruning.
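The two random sampling processes can be sketched as follows, with one column subset drawn per tree as the paragraph describes. The function name, the field names, and the choice of m = 2 are assumptions for illustration.

```python
import random

def sample_rows_and_columns(rows, feature_names, m, seed=0):
    """Draw N rows with replacement and m of the M features without replacement."""
    rng = random.Random(seed)
    n = len(rows)
    chosen_features = rng.sample(feature_names, m)             # column sampling
    picked_rows = [rows[rng.randrange(n)] for _ in range(n)]   # row sampling, duplicates possible
    projected = [{name: row[name] for name in chosen_features} for row in picked_rows]
    return projected, chosen_features

# Hypothetical input data with M = 4 features.
features = ["age", "wage", "alumni_count", "direct_rate"]
rows = [
    {"age": 30, "wage": 5000, "alumni_count": 2, "direct_rate": 0.10},
    {"age": 25, "wage": 6500, "alumni_count": 0, "direct_rate": 0.20},
    {"age": 41, "wage": 9000, "alumni_count": 5, "direct_rate": 0.15},
]
tree_input, tree_features = sample_rows_and_columns(rows, features, m=2)
```

Calling the function with a different seed for each tree gives every tree its own rows and its own feature subset.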
Predicted value acquiring unit 103, configured to input the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
In this embodiment, inputting the test set into the random forest model yields the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set, and each obtained indirect commission rate is filled into its corresponding missing position to complete the indirect commission rate prediction.
In one embodiment, as shown in FIG. 10, the predicted value acquiring unit 103 includes:
an operation function acquiring unit 1031, configured to obtain, according to the random forest model, the operation function between the associated fields and the indirect commission rate field;
a predicted value calculation unit 1032, configured to input the associated field values of each employee in the test set into the operation function to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
In this embodiment, once the random forest model has been trained and the test set is input into it, the operation function between the associated fields and the indirect commission rate field can be obtained, for example a linear function (indirect commission rate = 1.1 × direct commission rate + 10 × number of alumni in the enterprise / total number of employees in the enterprise, etc.). The associated field values of each employee in the test set are then input into the operation function to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set, which completes an accurate indirect commission rate prediction and avoids overfitting of the predicted data.
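The example linear function and the filling of predicted values into the missing positions can be sketched as below. The coefficients 1.1 and 10 are the ones given in the example above; the function names, field names, and the figures for one employee are hypothetical.

```python
def operation_function(direct_rate, alumni_count, total_employees):
    """Linear example from the text: 1.1 * direct rate + 10 * alumni / total employees."""
    return 1.1 * direct_rate + 10 * alumni_count / total_employees

def fill_predictions(test_set, total_employees):
    """Return copies of the test rows with the missing indirect commission rate filled in."""
    filled = []
    for row in test_set:
        predicted = operation_function(row["direct_rate"], row["alumni_count"], total_employees)
        filled.append({**row, "indirect_rate": predicted})
    return filled

# Hypothetical test row: the indirect commission rate is the value to be predicted.
test_set = [{"employee_id": "E002", "direct_rate": 0.10, "alumni_count": 5, "indirect_rate": None}]
predicted_rows = fill_predictions(test_set, total_employees=1000)
```

For this row the function evaluates to 1.1 × 0.10 + 10 × 5 / 1000, that is, approximately 0.16.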
As can be seen, the device uses the cleaned historical performance data of multiple employees as the training set, inputs it into the random forest model function, and obtains a random forest model for indirect commission rate prediction; the predicted values are highly accurate and overfitting does not occur.
The indirect commission rate prediction device described above can be implemented in the form of a computer program, which can run on computer equipment as shown in FIG. 11.
Referring to FIG. 11, FIG. 11 is a schematic block diagram of computer equipment provided by an embodiment of the present application. The computer equipment 500 may be a terminal or a server. The terminal may be an electronic device such as a tablet computer, a laptop, a desktop computer, or a personal digital assistant.
Referring to FIG. 11, the computer equipment 500 includes a processor 502, a memory, and a network interface 505 connected via a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform an indirect commission rate prediction method.
The processor 502 provides computing and control capabilities and supports the operation of the entire computer equipment 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, it causes the processor 502 to perform an indirect commission rate prediction method.
The network interface 505 is used for network communication, such as sending assigned tasks. Those skilled in the art will understand that the structure shown in FIG. 11 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer equipment 500 to which the solution is applied; a specific computer equipment 500 may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the following functions: obtaining historical performance data of multiple employees and performing data cleaning on the historical performance data to obtain target data, where the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields in the target data are completed values; taking the data in the target data whose indirect commission rate field is not missing as a training set and the data whose indirect commission rate field is missing as a test set, and inputting the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction; and inputting the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
In one embodiment, the processor 502 further performs the following operations: performing an integrity check on the historical performance data of each employee among the multiple employees and, if an associated field in an employee's historical performance data has a missing value, filling in the missing value with the average value of the corresponding field to obtain complete data; obtaining the correlation coefficient between each associated field and the indirect commission rate field in the complete data, and retaining the associated fields whose correlation coefficients rank before a preset rank value, to obtain initially cleaned data; and obtaining the skewed distribution of the initially cleaned data and applying a logarithmic operation to each field whose skewness value exceeds a preset skewness coefficient, to obtain the target data.
In one embodiment, the processor 502 further performs the following operations: randomly selecting a corresponding amount of data from the training set as a verification set according to a preset extraction ratio; and inputting the verification set into the random forest model for model verification and, if the verification accuracy of the random forest model exceeds a preset accuracy threshold, saving the random forest model.
In one embodiment, the processor 502 further performs the following operations: randomly drawing, with replacement, a first quantity of sample sets from the training set and constructing the first quantity of classification and regression trees from the sample sets; and training each classification and regression tree according to the bagging method to obtain multiple decision trees, and combining the decision trees into the random forest model for indirect commission rate prediction.
In one embodiment, the processor 502 further performs the following operations: obtaining, according to the random forest model, the operation function between the associated fields and the indirect commission rate field; and inputting the associated field values of each employee in the test set into the operation function to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
Those skilled in the art will understand that the embodiment of the computer equipment shown in FIG. 11 does not limit the specific composition of the computer equipment; in other embodiments, the computer equipment may include more or fewer components than shown, combine certain components, or have a different arrangement of components. For example, in some embodiments, the computer equipment may include only a memory and a processor; in such embodiments, the structure and function of the memory and processor are consistent with the embodiment shown in FIG. 11 and are not described again here.
It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor or any conventional processor.
Another embodiment of the present application provides a storage medium. The storage medium may be a computer-readable storage medium. The storage medium stores a computer program, and the computer program includes program instructions. When executed by a processor, the program instructions implement: obtaining historical performance data of multiple employees and performing data cleaning on the historical performance data to obtain target data, where the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields in the target data are completed values; taking the data in the target data whose indirect commission rate field is not missing as a training set and the data whose indirect commission rate field is missing as a test set, and inputting the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction; and inputting the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
In one embodiment, the program instructions, when executed by the processor, implement: performing an integrity check on the historical performance data of each employee among the multiple employees and, if an associated field in an employee's historical performance data has a missing value, filling in the missing value with the average value of the corresponding field to obtain complete data; obtaining the correlation coefficient between each associated field and the indirect commission rate field in the complete data, and retaining the associated fields whose correlation coefficients rank before a preset rank value, to obtain initially cleaned data; and obtaining the skewed distribution of the initially cleaned data and applying a logarithmic operation to each field whose skewness value exceeds a preset skewness coefficient, to obtain the target data.
In one embodiment, the program instructions, when executed by the processor, implement: randomly selecting a corresponding amount of data from the training set as a verification set according to a preset extraction ratio; and inputting the verification set into the random forest model for model verification and, if the verification accuracy of the random forest model exceeds a preset accuracy threshold, saving the random forest model.
In one embodiment, the program instructions, when executed by the processor, implement: randomly drawing, with replacement, a first quantity of sample sets from the training set and constructing the first quantity of classification and regression trees from the sample sets; and training each classification and regression tree according to the bagging method to obtain multiple decision trees, and combining the decision trees into the random forest model for indirect commission rate prediction.
In one embodiment, the program instructions, when executed by the processor, implement: obtaining, according to the random forest model, the operation function between the associated fields and the indirect commission rate field; and inputting the associated field values of each employee in the test set into the operation function to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
In one embodiment, the program instructions, when executed by the processor, implement: if the data sending end has stopped transmitting communication data for longer than a preset time threshold, releasing the shared memory.
The storage medium may be an internal storage unit of the aforementioned equipment, such as a hard disk or memory of the equipment. The storage medium may also be an external storage device of the equipment, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the equipment. Further, the storage medium may include both an internal storage unit and an external storage device of the equipment.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the equipment, devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not described again here. Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the several embodiments provided in the present application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; units with the same function may be grouped into one unit, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may also be an electrical, mechanical, or other form of connection.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc.
The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed by the present invention, and such modifications or replacements shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
Claims (10)
1. An indirect commission rate prediction method, characterized by comprising:
obtaining historical performance data of multiple employees and performing data cleaning on the historical performance data to obtain target data; wherein the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields included in the target data are completed values;
taking the data in the target data whose indirect commission rate field is not missing as a training set and the data in the target data whose indirect commission rate field is missing as a test set, and inputting the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction;
inputting the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
2. The indirect commission rate prediction method according to claim 1, characterized in that performing data cleaning on the historical performance data to obtain target data comprises:
performing an integrity check on the historical performance data of each employee among the multiple employees and, if an associated field in an employee's historical performance data has a missing value, filling in the missing value with the average value of the corresponding field to obtain complete data;
obtaining the correlation coefficient between each associated field and the indirect commission rate field in the complete data, and retaining the associated fields whose correlation coefficients rank before a preset rank value, to obtain initially cleaned data;
obtaining the skewed distribution of the initially cleaned data and applying a logarithmic operation to each field whose skewness value exceeds a preset skewness coefficient, to obtain the target data.
3. The indirect commission rate prediction method according to claim 1, characterized in that, after inputting the training set into the random forest model function to obtain the random forest model for indirect commission rate prediction, the method further comprises:
randomly selecting a corresponding amount of data from the training set as a verification set according to a preset extraction ratio;
inputting the verification set into the random forest model for model verification and, if the verification accuracy of the random forest model exceeds a preset accuracy threshold, saving the random forest model.
4. The indirect commission rate prediction method according to claim 1, characterized in that inputting the training set into the random forest model function to obtain the random forest model for indirect commission rate prediction comprises:
randomly drawing, with replacement, a first quantity of sample sets from the training set and constructing the first quantity of classification and regression trees from the sample sets;
training each classification and regression tree according to the bagging method to obtain multiple decision trees, and combining the decision trees into the random forest model for indirect commission rate prediction.
5. The indirect commission rate prediction method according to claim 1, characterized in that inputting the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set comprises:
obtaining, according to the random forest model, the operation function between the associated fields and the indirect commission rate field;
inputting the associated field values of each employee in the test set into the operation function to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
6. An indirect commission rate prediction device, characterized by comprising:
a data cleaning unit, configured to obtain historical performance data of multiple employees and perform data cleaning on the historical performance data to obtain target data; wherein the historical performance data of each employee includes an indirect commission rate field and at least one associated field related to the indirect commission rate, and the values of the associated fields included in the target data are completed values;
a model acquiring unit, configured to select the data in the target data whose indirect commission rate field is not missing as a training set and the data in the target data whose indirect commission rate field is missing as a test set, and input the training set into a random forest model function to obtain a random forest model for indirect commission rate prediction;
a predicted value acquiring unit, configured to input the test set into the random forest model to obtain the indirect commission rate value corresponding to the indirect commission rate field of each employee in the test set.
7. The indirect commission rate prediction device according to claim 6, characterized in that the data cleaning unit comprises:
a missing value supplementing unit, configured to perform an integrity check on the historical performance data of each employee among the multiple employees and, if an associated field related to the indirect commission rate in an employee's historical performance data has a missing value, fill in the missing value with the average value of the corresponding field to obtain complete data;
a correlation judging unit, configured to obtain the correlation coefficient between each associated field and the indirect commission rate field in the complete data, and retain the associated fields related to the indirect commission rate whose correlation coefficients rank before a preset rank value, to obtain initially cleaned data;
a skewness computing unit, configured to obtain the skewed distribution of the initially cleaned data and apply a logarithmic operation to each field whose skewness value exceeds a preset skewness coefficient, to obtain the target data.
8. The indirect commission rate prediction device according to claim 6, characterized in that the model acquiring unit further comprises:
a validation set selection unit, configured to randomly select a corresponding amount of data from the training set, according to a preset extraction ratio, as a validation set;
a model verification unit, configured to input the validation set into the random forest model for model verification and, if the verification accuracy of the random forest model exceeds a preset accuracy threshold, to save the random forest model.
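The validation step of this claim can be sketched as follows. The section does not say how "accuracy" is measured for a predicted rate, so this sketch assumes a prediction counts as correct when it falls within a tolerance of the actual value; the function names, the tolerance, the ratio and the stand-in model are all hypothetical.

```python
import random

def select_validation_set(training_rows, ratio, seed=0):
    """Randomly draw round(len * ratio) rows (at least one) from the training set."""
    rng = random.Random(seed)
    k = max(1, round(len(training_rows) * ratio))
    return rng.sample(training_rows, k)

def verify_model(predict, validation_rows, tolerance, accuracy_threshold):
    """Return (keep_model, accuracy).

    Assumption: a prediction is a hit when it lies within `tolerance`
    of the actual value; accuracy is the fraction of hits.
    """
    hits = sum(abs(predict(features) - actual) <= tolerance
               for features, actual in validation_rows)
    accuracy = hits / len(validation_rows)
    return accuracy > accuracy_threshold, accuracy

# Hypothetical training rows: (features, actual rate).
training_rows = [((x,), 0.1 * x) for x in range(10)]

# Stand-in for the trained model's prediction function.
def model_predict(features):
    return 0.1 * features[0]

validation_rows = select_validation_set(training_rows, ratio=0.3)
keep_model, accuracy = verify_model(model_predict, validation_rows,
                                    tolerance=0.01, accuracy_threshold=0.9)
```

Note that, as the claim is worded, the validation rows are drawn from the training set itself, so a passing score here says less about generalisation than a held-out set would.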
9. A computer device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the indirect commission rate prediction method according to any one of claims 1-5.
10. A storage medium, characterized in that the storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform the indirect commission rate prediction method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811001657.4A CN109255480A (en) | 2018-08-30 | 2018-08-30 | Between servant lead prediction technique, device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811001657.4A CN109255480A (en) | 2018-08-30 | 2018-08-30 | Between servant lead prediction technique, device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109255480A true CN109255480A (en) | 2019-01-22 |
Family
ID=65049424
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811001657.4A Pending CN109255480A (en) | 2018-08-30 | 2018-08-30 | Between servant lead prediction technique, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109255480A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110796200A (en) * | 2019-10-30 | 2020-02-14 | 深圳前海微众银行股份有限公司 | Data classification method, terminal, device and storage medium |
CN111563077A (en) * | 2020-05-12 | 2020-08-21 | 国网山东省电力公司泰安供电公司 | Power grid voltage data missing filling method, system, terminal and storage medium |
CN111667107A (en) * | 2020-05-29 | 2020-09-15 | 中国工商银行股份有限公司 | Research and development management and control problem prediction method and device based on gradient random forest |
CN111861004A (en) * | 2020-07-22 | 2020-10-30 | 携程计算机技术(上海)有限公司 | Method, system, apparatus and storage medium for automatic commission prediction of daily income production |
CN113268476A (en) * | 2021-06-07 | 2021-08-17 | 一汽解放汽车有限公司 | Data cleaning method and device applied to Internet of vehicles and computer equipment |
CN117113929A (en) * | 2023-09-08 | 2023-11-24 | 中电金信数字科技集团有限公司 | Method and device for splitting field data, electronic equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160140463A1 (en) * | 2014-11-18 | 2016-05-19 | International Business Machines Corporation | Decision support for compensation planning |
CN107180362A (en) * | 2017-05-03 | 2017-09-19 | 浙江工商大学 | Retail commodity sales forecasting method based on deep learning |
CN108154311A (en) * | 2018-01-11 | 2018-06-12 | 国网山东省电力公司 | Top-tier customer recognition methods and device based on random forest and decision tree |
Non-Patent Citations (1)
Title |
---|
最实务人力资源编委会 (Most Practical Human Resources Editorial Board), 中国铁道出版社 (China Railway Publishing House) * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110796200A (en) * | 2019-10-30 | 2020-02-14 | 深圳前海微众银行股份有限公司 | Data classification method, terminal, device and storage medium |
CN111563077A (en) * | 2020-05-12 | 2020-08-21 | 国网山东省电力公司泰安供电公司 | Power grid voltage data missing filling method, system, terminal and storage medium |
CN111563077B (en) * | 2020-05-12 | 2023-04-25 | 国网山东省电力公司泰安供电公司 | Power grid voltage data missing filling method, system, terminal and storage medium |
CN111667107A (en) * | 2020-05-29 | 2020-09-15 | 中国工商银行股份有限公司 | Research and development management and control problem prediction method and device based on gradient random forest |
CN111667107B (en) * | 2020-05-29 | 2024-05-14 | 中国工商银行股份有限公司 | Research and development management and control problem prediction method and device based on gradient random forest |
CN111861004A (en) * | 2020-07-22 | 2020-10-30 | 携程计算机技术(上海)有限公司 | Method, system, apparatus and storage medium for automatic commission prediction of daily income production |
CN111861004B (en) * | 2020-07-22 | 2024-05-28 | 携程计算机技术(上海)有限公司 | Automatic commission prediction method, system, device and storage medium for daily income output |
CN113268476A (en) * | 2021-06-07 | 2021-08-17 | 一汽解放汽车有限公司 | Data cleaning method and device applied to Internet of vehicles and computer equipment |
CN117113929A (en) * | 2023-09-08 | 2023-11-24 | 中电金信数字科技集团有限公司 | Method and device for splitting field data, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109255480A (en) | Between servant lead prediction technique, device, computer equipment and storage medium | |
US8671097B2 (en) | Method and system for log file analysis based on distributed computing network | |
US20190172564A1 (en) | Early cost prediction and risk identification | |
CN111462845A (en) | Dynamic form generation method and device, computer equipment and storage medium | |
CN108053838A (en) | With reference to audio analysis and fraud recognition methods, device and the storage medium of video analysis | |
CN106875110A (en) | Operational indicator layered calculation method and device, distributed computing method and system | |
CN109062780A (en) | The development approach and terminal device of automatic test cases | |
CN105404763A (en) | Method for recommending doctors to patient in mobile medical system | |
CN107800591A (en) | A kind of analysis method of unified daily record data | |
CN108038413A (en) | Cheat probability analysis method, apparatus and storage medium | |
CN107909330A (en) | Work stream data processing method, device, storage medium and computer equipment | |
US11573675B2 (en) | Generating visual experience-journey timelines using experience and touchpoint data | |
CN109359019A (en) | Application program capacity monitoring method, device, electronic equipment and storage medium | |
CN102117470A (en) | Internet simulation browser-based method for acquiring data in credit investigation system | |
CN107734081A (en) | Determination method, medium, device and the computing device of contact person's label | |
CN102918522A (en) | Systems, methods, and logic for generating statistical research information | |
CN110275903A (en) | Improve the method and system of the feature formation efficiency of machine learning sample | |
CN110147941A (en) | Content of examination acquisition methods, Stakeholder Evaluation method and device | |
CN109785966A (en) | Medical record checking method, device, equipment and storage medium based on machine learning | |
CN109460942A (en) | Method and Related product based on data assay hospital | |
CN109597948A (en) | Access method, system and the storage medium of URL link | |
CN114238062B (en) | Board card burning device performance analysis method, device, equipment and readable storage medium | |
CN110215703A (en) | The selection method of game application, apparatus and system | |
CN109785155A (en) | Method and Related product based on medical insurance reimbursement model adjustment medical insurance strategy | |
Scrivner et al. | XD Metrics on demand value analytics: visualizing the impact of internal information technology investments on external funding, publications, and collaboration networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||