CN110363359A - Occupation prediction method and system - Google Patents
Occupation prediction method and system
- Publication number: CN110363359A
- Application number: CN201910667159.1A
- Authority: CN (China)
- Prior art keywords: occupation, user, measured, attribute information, users
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/105—Human resources
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Physics & Mathematics (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Game Theory and Decision Science (AREA)
- Development Economics (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the present invention relate to an occupation prediction method and system. The method comprises: separately obtaining attribute information of sample users whose occupations are known and attribute information of a user to be tested; calculating mixing coefficients, mean vectors and the number of occupations according to the attribute information of the sample users; determining the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model; and determining the occupation of the user to be tested according to the maximum probability and a preset threshold. Calculating on the attribute information to determine the three parameters (namely the mixing coefficients, the mean vectors and the number of occupations) allows the occupation of the user to be tested to be predicted from multiple dimensions, and combining the Gaussian mixture model with the three parameters exploits the semi-supervised learning characteristics of the Gaussian mixture model, making the prediction result more accurate.
Description
Technical field
The embodiments of the present invention relate to the field of computer technology, and in particular to an occupation prediction method and system.
Background art
In the prior art, the occupation of a user is usually predicted in one of two ways. In the first, back-office staff rely on their working experience to predict the occupation from the basic information provided by the user to be tested. In the second, a model is built from a single indicator of the users (including sample users and the user to be tested), such as interest or ability, and the occupation of the user to be tested is predicted with that model.
In the course of implementing the present invention, the inventors found at least the following problems: with the first way, the prediction result is overly subjective and the prediction accuracy is low; with the second way, a single indicator such as interest or ability cannot accurately assess the depth and breadth of a person's fit to an occupation or the development trend.
Summary of the invention
According to one aspect of the embodiments of the present invention, an occupation prediction method is provided. The method includes:
separately obtaining attribute information of sample users whose occupations are known and attribute information of a user to be tested;
calculating mixing coefficients, mean vectors and the number of occupations according to the attribute information of the sample users;
determining the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model;
determining the occupation of the user to be tested according to the maximum probability and a preset threshold.
In some embodiments, the threshold includes a first threshold, and determining the occupation of the user to be tested according to the maximum probability and the preset threshold includes:
in response to the maximum probability being greater than or equal to the first threshold, determining the occupation corresponding to the maximum probability as the occupation of the user to be tested.
In some embodiments, the threshold includes a first threshold and a second threshold, and determining the occupation of the user to be tested according to the maximum probability and the preset threshold includes:
in response to the maximum probability being less than the first threshold and greater than the second threshold, performing clustering on all users according to the attribute information of the sample users and the attribute information of the user to be tested, where all users include the sample users and the user to be tested;
calculating, for each class, the percentage of the number of sample users accounted for by the users with known occupations in that class;
selecting the occupation of the users with known occupations that has the largest percentage;
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determining the selected occupation as the occupation of the user to be tested.
In some embodiments, the method further includes:
in response to the maximum probability being less than or equal to the second threshold, calculating the similarity between the attribute information of the user to be tested and the attribute information of the sample users;
selecting the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be tested.
In some embodiments, after the occupation of the user to be tested has been determined, the method further includes:
updating the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested.
According to another aspect of the embodiments of the present disclosure, an occupation prediction system is also provided. The system includes:
an obtaining module, configured to separately obtain attribute information of sample users whose occupations are known and attribute information of a user to be tested;
a calculating module, configured to calculate mixing coefficients, mean vectors and the number of occupations according to the attribute information of the sample users;
a maximum probability determining module, configured to determine the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model;
an occupation determining module, configured to determine the occupation of the user to be tested according to the maximum probability and a preset threshold.
In some embodiments, the threshold includes a first threshold and a second threshold, and the occupation determining module is configured to:
in response to the maximum probability being less than the first threshold and greater than the second threshold, perform clustering on all users according to the attribute information of the sample users and the attribute information of the user to be tested, where all users include the sample users and the user to be tested;
calculate, for each class, the percentage of the number of sample users accounted for by the users with known occupations in that class;
select the occupation of the users with known occupations that has the largest percentage;
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determine the selected occupation as the occupation of the user to be tested.
In some embodiments, the occupation determining module is configured to:
in response to the maximum probability being less than or equal to the second threshold, calculate the similarity between the attribute information of the user to be tested and the attribute information of the sample users;
select the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be tested.
In some embodiments, the system further includes:
an update module, configured to update the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested.
The beneficial effect of the embodiments of the present invention is as follows. The technical solution separately obtains the attribute information of sample users whose occupations are known and the attribute information of the user to be tested; calculates the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users; determines the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model; and determines the occupation of the user to be tested according to the maximum probability and a preset threshold. Calculating on the attribute information to determine the three parameters (namely the mixing coefficients, the mean vectors and the number of occupations) allows the occupation of the user to be tested to be predicted from multiple dimensions, and combining the Gaussian mixture model with the three parameters exploits the semi-supervised learning characteristics of the Gaussian mixture model, making the prediction result more accurate.
Brief description of the drawings
Fig. 1 is a flow diagram of the occupation prediction method of an embodiment of the present disclosure;
Fig. 2 is a flow diagram of the method of determining the occupation of the user to be tested according to the maximum probability and the preset threshold in an embodiment of the present disclosure;
Fig. 3 is a flow diagram of the method of determining the occupation of the user to be tested according to the maximum probability and the preset threshold in another embodiment of the present disclosure;
Fig. 4 is a flow diagram of the occupation prediction method of another embodiment of the present disclosure;
Fig. 5 is a schematic diagram of the occupation prediction system of an embodiment of the present disclosure;
Fig. 6 is a schematic diagram of the occupation prediction system of another embodiment of the present disclosure;
Reference numerals: 1, obtaining module; 2, calculating module; 3, maximum probability determining module; 4, occupation determining module; 5, update module.
Detailed description of the embodiments
In the following description, specific details such as particular system structures, interfaces and techniques are set out for the purpose of illustration rather than limitation, so that the present invention can be thoroughly understood. However, it will be clear to those skilled in the art that the present invention may also be practiced in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, circuits and methods are omitted so that unnecessary detail does not obscure the description of the present invention.
The embodiments of the present invention provide an occupation prediction method and system.
According to one aspect of the embodiments of the present invention, an occupation prediction method is provided.
Referring to Fig. 1, Fig. 1 is a flow diagram of the occupation prediction method of an embodiment of the present disclosure.
As shown in Fig. 1, the method includes:
S1: Separately obtain the attribute information of sample users whose occupations are known and the attribute information of the user to be tested.
In this step, users fall into two classes: users whose occupations are known and users whose occupations are unknown. Users with known occupations are the sample users, and users with unknown occupations are the users to be tested.
The attribute information is information related to occupation. Specifically, the attribute information includes the occupation, together with one or more of: the location information of the base station, the number of times application programs are used within a preset duration, educational background, network entry time, age, the average monthly billed amount within the preset duration, and the number of months within the preset duration in which the billed amount is 0.
Specifically, the location information of the base station includes the longitude and the latitude of the base station.
Preferably, some application programs have a high frequency of use and are used by nearly all users, so these applications cannot distinguish the characteristics of different users; the information about these application programs is therefore deleted from the attribute information. The frequency of use is obtained by dividing the number of times an application program is used within the preset duration by the total number of times all application programs are used, and is compared with a preset threshold: if the frequency of use is greater than the threshold, the frequency of use is high; otherwise it is low.
The application programs include but are not limited to: wealth management applications, social applications, image processing applications, camera applications, map applications, transport applications, travel applications, reading applications, video processing applications and business management applications.
For example, social applications include but are not limited to Alipay, WeChat and Weibo. Because Alipay, WeChat and Weibo have a high frequency of use and most users use them, they are deleted from the attribute information.
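The frequency filter described above can be sketched as follows. This is a minimal illustration only: the function name `drop_common_apps`, the 0.2 cutoff and the app names other than WeChat and Alipay are assumptions, and the sketch checks only the frequency of use, not how many users use each application.

```python
def drop_common_apps(usage_counts, threshold=0.2):
    """Remove applications whose frequency of use exceeds the threshold.

    usage_counts: dict mapping app name -> number of uses in the preset duration.
    threshold: hypothetical cutoff; the source only says it is preset.
    """
    total = sum(usage_counts.values())
    if total == 0:
        return dict(usage_counts)
    # frequency of use = uses of this app / uses of all apps
    return {app: n for app, n in usage_counts.items() if n / total <= threshold}


counts = {"WeChat": 60, "Alipay": 25, "MapApp": 10, "ReaderApp": 5}
kept = drop_common_apps(counts, threshold=0.2)
print(kept)
```

In this example WeChat (0.60) and Alipay (0.25) exceed the assumed cutoff and are dropped, while the two less common applications remain as distinguishing features.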
In some embodiments, after the attribute information is obtained it is preprocessed, and the subsequent steps are carried out on the preprocessed attribute information. Taking user A among the sample users with known occupations as an example, the preprocessing of user A's attribute information is described in detail as follows:
Educational background can be divided into junior high school and below, senior high school, university, postgraduate, and doctorate and above. The different levels are assigned values, for example 1 to 5 in that order, and the corresponding field is marked accordingly.
The network entry time is calculated by the formula (current time - network entry time) / age, which gives the value of the corresponding field.
The location information of the base station, the age, the average monthly billed amount within the preset duration, the number of months within the preset duration in which the billed amount is 0, and the number of times application programs are used within the preset duration are standardized (i.e. normalized). The standardization can be done in a manner known from the prior art, for example: standard value = (X - Xmin) / (Xmax - Xmin), where X is the value of each field, and Xmin and Xmax are the minimum and maximum values of that field over all users.
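The three preprocessing rules above can be sketched in a few lines. The names `EDUCATION_CODES`, `network_tenure_ratio` and `min_max` are hypothetical, the time unit (years) is an assumption, and returning 0.0 for a constant field is a choice made here because the source does not cover that case.

```python
# Ordinal codes 1 to 5 for educational background, in the order given in the text.
EDUCATION_CODES = {
    "junior high or below": 1,
    "senior high": 2,
    "university": 3,
    "postgraduate": 4,
    "doctorate or above": 5,
}


def network_tenure_ratio(current_year, joined_year, age):
    """(current time - network entry time) / age, with years as the assumed unit."""
    return (current_year - joined_year) / age


def min_max(values):
    """Standard value = (X - Xmin) / (Xmax - Xmin) over all users for one field."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant field: no spread to scale by (assumption, not in the source).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, 30, 40]
scaled_ages = min_max(ages)
ratio = network_tenure_ratio(2019, 2009, 40)
print(EDUCATION_CODES["university"], scaled_ages, ratio)
```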
S2: Calculate the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users.
This can be implemented with a calculation method from the prior art, for example a neural network model.
Preferably, the attribute information of the sample users is estimated by the maximum likelihood method to obtain the mixing coefficients, the mean vectors and the number of occupations.
Specifically, the calculation uses Formula 1:
$$LL = \ln\left(\sum_{i=1}^{N} a_i\, p(x_j \mid u_i, \Sigma_i)\, p(y_j \mid \theta = i, x_j)\right) + \ln\left(\sum_{i=1}^{N} a_i\, p(x_j \mid u_i, \Sigma_i)\right)$$
where θ ∈ {1, ..., N}; x_j denotes the attribute information of the j-th user; y_j denotes the occupation of the j-th user; θ = i denotes belonging to the i-th distribution; and p(y_j | θ = i, x_j) as a whole is the probability that, for user j with attribute information x_j and occupation y_j, this set of values belongs to the i-th Gaussian distribution. a_i is the mixing coefficient, u_i is the mean vector, and N is the number of occupations.
In some embodiments, a_i, u_i and N are substituted into Formula 1 and iterated until convergence; the a_i obtained at convergence is taken as the mixing coefficient, u_i as the mean vector, and N as the number of occupations.
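As a rough illustration of what the three parameters look like, the sketch below computes them in closed form from labelled sample users only: when every sample has a known occupation, the maximum-likelihood mixing coefficient of an occupation is its share of the samples and its mean vector is the per-attribute average. This is a simplification rather than the iterative procedure of Formula 1; the function name `estimate_parameters` and the sample data are hypothetical.

```python
from collections import defaultdict


def estimate_parameters(samples):
    """samples: list of (attribute_vector, occupation) pairs for sample users.

    Returns (mixing, means, n_occupations), with one Gaussian component assumed
    per occupation.
    """
    groups = defaultdict(list)
    for x, occupation in samples:
        groups[occupation].append(x)
    total = len(samples)
    # Mixing coefficient a_i: fraction of sample users in occupation i.
    mixing = {occ: len(xs) / total for occ, xs in groups.items()}
    # Mean vector u_i: per-attribute average over the users of occupation i.
    means = {occ: [sum(col) / len(xs) for col in zip(*xs)] for occ, xs in groups.items()}
    return mixing, means, len(groups)


samples = [
    ([0.0, 0.0], "teacher"),
    ([0.2, 0.4], "teacher"),
    ([1.0, 1.0], "driver"),
    ([0.8, 0.6], "driver"),
]
mixing, means, n_occupations = estimate_parameters(samples)
print(mixing, means, n_occupations)
```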
S3: Determine the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and the Gaussian mixture model.
In this step, the mixing coefficients, the mean vectors, the number of occupations and the attribute information of the user to be tested are the inputs of the Gaussian mixture model, and the output is the maximum probability p(x) corresponding to the user to be tested.
In some embodiments, the Gaussian mixture model can specifically follow Formula 2:
$$p(x) = \sum_{i=1}^{N} a_i\, p(x \mid u_i, \Sigma_i)\, p(y_j \mid \theta = i, x_j)$$
where x is the attribute information of the user to be tested.
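One common way to read "maximum probability" is the largest posterior probability over the Gaussian components. The sketch below takes that reading, assuming one component per occupation and a single shared spherical variance in place of the covariance Σ_i; both are simplifications, and `gaussian_density`, `max_probability` and the numbers are hypothetical.

```python
import math


def gaussian_density(x, mean, variance=1.0):
    """Density of a spherical Gaussian with the given mean and shared variance."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    norm = (2 * math.pi * variance) ** (d / 2)
    return math.exp(-sq_dist / (2 * variance)) / norm


def max_probability(x, mixing, means, variance=1.0):
    """Return (occupation, posterior) for the most probable component."""
    weighted = {occ: mixing[occ] * gaussian_density(x, means[occ], variance) for occ in mixing}
    total = sum(weighted.values())
    if total == 0.0:
        # All densities underflowed; no component can be preferred.
        return None, 0.0
    posteriors = {occ: w / total for occ, w in weighted.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors[best]


mixing = {"teacher": 0.5, "driver": 0.5}
means = {"teacher": [0.1, 0.2], "driver": [0.9, 0.8]}
best_occupation, best_prob = max_probability([0.1, 0.2], mixing, means, variance=0.05)
print(best_occupation, round(best_prob, 4))
```

Normalizing by the sum over components keeps the result in [0, 1], which is what makes it comparable with fixed thresholds.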
S4: Determine the occupation of the user to be tested according to the maximum probability and a preset threshold.
The embodiments of the present disclosure provide a new occupation prediction method. The method includes: separately obtaining the attribute information of sample users whose occupations are known and the attribute information of the user to be tested; calculating the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users; determining the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and the Gaussian mixture model; and determining the occupation of the user to be tested according to the maximum probability and a preset threshold. In the embodiments of the present disclosure, calculating on the attribute information to determine the three parameters (namely the mixing coefficients, the mean vectors and the number of occupations) allows the occupation of the user to be tested to be predicted from multiple dimensions, and combining the Gaussian mixture model with the three parameters exploits the semi-supervised learning characteristics of the Gaussian mixture model, making the prediction result more accurate.
In some embodiments, the threshold includes a first threshold, and S4 includes: in response to the maximum probability being greater than or equal to the first threshold, determining the occupation corresponding to the maximum probability as the occupation of the user to be tested.
This step specifically includes: comparing the maximum probability with the first threshold, and if the maximum probability is greater than or equal to the first threshold, determining the occupation corresponding to the maximum probability as the occupation of the user to be tested.
More specifically, the Gaussian component corresponding to the maximum probability (i.e. a Gaussian component of the Gaussian mixture model) is determined, and the occupation corresponding to that Gaussian component is determined as the occupation of the user to be tested.
In some embodiments, the first threshold is set to 0.6.
With reference to Fig. 2, in some embodiments the threshold includes a first threshold and a second threshold, and S4 includes:
S41: In response to the maximum probability being less than the first threshold and greater than the second threshold, perform clustering on all users according to the attribute information of the sample users and the attribute information of the user to be tested, where all users include the sample users and the user to be tested.
This step specifically includes: comparing the maximum probability with the first threshold; if the maximum probability is less than the first threshold, comparing the maximum probability with the second threshold; and if the maximum probability is greater than the second threshold, clustering all users.
In some embodiments, the second threshold is set to 0.3.
In some embodiments, all users are clustered with a density clustering method.
S42: Calculate, for each class, the percentage of the number of sample users accounted for by the users with known occupations in that class.
The clustering in S41 divides all users into different classes. A class may contain only users with unknown occupations (i.e. users to be tested), only users with known occupations (i.e. some or all of the sample users), or both users with unknown occupations and users with known occupations.
In this step, the percentage of the number of sample users accounted for by the users with known occupations in each class is calculated. Specifically, the users with known occupations in each class are counted, and the ratio of each class's count to the number of sample users is calculated as a percentage.
S43: Select the occupation of the users with known occupations that has the largest percentage.
S44: In response to the selected occupation being the same as the occupation corresponding to the maximum probability, determine the selected occupation as the occupation of the user to be tested.
S44 specifically includes: judging whether the selected occupation is the same as the occupation corresponding to the maximum probability, and if so, determining the selected occupation as the occupation of the user to be tested.
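Steps S42 and S43 can be sketched as below, given cluster labels that some clustering method has already produced. The text does not say how an occupation is chosen when a class mixes several known occupations, so this sketch assumes one reading: pick the class holding the largest share of the sample users, then take the most frequent known occupation inside it. The function name and data are hypothetical.

```python
from collections import Counter


def occupation_from_clusters(cluster_ids, occupations):
    """cluster_ids[k] is the class of user k; occupations[k] is None for a user to be tested.

    Returns (selected_occupation, share_of_sample_users).
    """
    n_samples = sum(1 for occ in occupations if occ is not None)
    known_by_cluster = {}
    for cid, occ in zip(cluster_ids, occupations):
        if occ is not None:
            known_by_cluster.setdefault(cid, []).append(occ)
    if not known_by_cluster:
        return None, 0.0
    # S42: share of sample users per class; S43: class with the largest share.
    best_cluster = max(known_by_cluster, key=lambda c: len(known_by_cluster[c]))
    share = len(known_by_cluster[best_cluster]) / n_samples
    # Assumed tie-break inside the class: most frequent known occupation.
    occupation = Counter(known_by_cluster[best_cluster]).most_common(1)[0][0]
    return occupation, share


cluster_ids = [0, 0, 0, 1, 1, 0]
occupations = ["teacher", "teacher", "driver", "driver", None, None]
selected, share = occupation_from_clusters(cluster_ids, occupations)
print(selected, share)
```

S44 then amounts to comparing `selected` with the occupation that the Gaussian mixture model assigned the maximum probability.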
In some embodiments, if the selected occupation differs from the occupation corresponding to the maximum probability, a different clustering method is used in S41 to cluster all users, and the subsequent steps (S42 to S44) are executed in turn. For example, if the density clustering method was used in S41, the K-means algorithm is used in this step. If the occupation of the user to be tested still cannot be determined with the K-means algorithm (i.e. the occupation selected after K-means still differs from the occupation corresponding to the maximum probability), the occupation of the user to be tested is predicted by calculating similarity; the specific implementation is described later.
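The overall decision flow of S4, including the fallback order just described, can be summarized in a short sketch. The threshold values 0.6 and 0.3 are the ones given in the text; the function name `decide_occupation`, its arguments and the returned route labels are hypothetical.

```python
FIRST_THRESHOLD = 0.6   # value given in the text for some embodiments
SECOND_THRESHOLD = 0.3  # value given in the text for some embodiments


def decide_occupation(gmm_occupation, max_prob, cluster_votes, similarity_fallback):
    """Return (occupation, route) for one user to be tested.

    gmm_occupation: occupation of the component with the maximum probability.
    cluster_votes: occupations selected by successive clustering methods,
        e.g. density clustering first, then K-means.
    similarity_fallback: zero-argument callable returning the occupation of
        the most similar sample user.
    """
    if max_prob >= FIRST_THRESHOLD:
        return gmm_occupation, "gmm"
    if max_prob > SECOND_THRESHOLD:
        for vote in cluster_votes:
            # S44: accept only when clustering agrees with the mixture model.
            if vote == gmm_occupation:
                return vote, "clustering"
    # Maximum probability <= second threshold, or no clustering method agreed.
    return similarity_fallback(), "similarity"


print(decide_occupation("teacher", 0.7, [], lambda: "nurse"))
print(decide_occupation("teacher", 0.5, ["driver", "teacher"], lambda: "nurse"))
print(decide_occupation("teacher", 0.5, ["driver", "driver"], lambda: "nurse"))
```

Passing the clustering results as a sequence keeps the retry with a second clustering method explicit without tying the sketch to any particular clustering library.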
With reference to Fig. 3, in some embodiments S4 further includes:
S411: In response to the maximum probability being less than or equal to the second threshold, calculate the similarity between the attribute information of the user to be tested and the attribute information of the sample users.
This step specifically includes: comparing the maximum probability with the second threshold, and if the maximum probability is less than or equal to the second threshold, calculating the similarity between the attribute information of the user to be tested and the attribute information of the sample users. The similarity can be calculated with a method from the prior art, or with the following method:
The attribute information described in the example above includes 17 attributes in total (occupation, location information of the base station, and so on). The sample users are sorted by each attribute, giving 17 orderings in total. From each ordering, the users whose attribute values lie at the 1/4 position, the 3/4 position and the median are taken out, so at most 17*3 = 51 users with known occupations are taken out (in rare cases the same known-occupation user is taken out under several attribute orderings). The cosine similarity is then used to calculate the degree of similarity between each user to be tested and these 51 users with known occupations.
S412: Select the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be tested.
Continuing the example above, the occupation with the highest similarity is selected from the 51 similarities and determined as the occupation of the user to be tested.
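The selection of representative sample users can be sketched as follows. The positional convention for the quartiles (`int(q * (n - 1))` in the sorted order) is an assumption, as is the function name `pick_representatives`; the sketch also assumes a non-empty list of numeric attribute vectors.

```python
def pick_representatives(samples):
    """samples: non-empty list of attribute vectors of the sample users.

    For each attribute, sort the users by that attribute and keep the users at
    the 1/4, median and 3/4 positions. Returns the sorted, de-duplicated
    indices, so at most 3 * number_of_attributes users are kept.
    """
    n = len(samples)
    n_attrs = len(samples[0])
    chosen = set()
    for a in range(n_attrs):
        order = sorted(range(n), key=lambda k: samples[k][a])
        for q in (0.25, 0.5, 0.75):
            chosen.add(order[int(q * (n - 1))])
    return sorted(chosen)


samples = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6], [0.5, 0.5]]
representatives = pick_representatives(samples)
print(representatives)
```

Using a set reflects the remark in the text that the same user may be picked under several attribute orderings, which is why the count is "at most" 51 for 17 attributes.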
In some embodiments, a mark can be added to the occupation of the user to be tested to label it as a "pseudo-occupation": the user to be tested is not necessarily engaged in that occupation, but the user's occupation is highly similar to it.
In some embodiments, the similarity can be calculated with the cosine similarity algorithm, specifically by Formula 3:
$$\cos(a, b) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
where n is equal to 51, x_i is the attribute information of the user to be tested, y_i is the attribute information of one of the 51 users, a is the vector corresponding to the attribute information of the user to be tested, and b is the vector corresponding to the attribute information of that one of the 51 users.
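A minimal sketch of the cosine similarity and of step S412 follows. The helper names and the two-attribute vectors are hypothetical, and returning 0.0 for a zero vector is a convention chosen here, since the cosine is undefined in that case.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_a * norm_b)


def most_similar_occupation(test_vector, representatives):
    """representatives: list of (attribute_vector, occupation) for known users."""
    best = max(representatives, key=lambda r: cosine_similarity(test_vector, r[0]))
    return best[1]


known = [([0.9, 0.2], "driver"), ([0.1, 1.0], "teacher")]
pseudo_occupation = most_similar_occupation([1.0, 0.1], known)
print(pseudo_occupation)
```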
With reference to Fig. 4, in some embodiments, after the occupation of the user to be tested has been determined, the method further includes:
S5: Update the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested.
Specifically, the expectation-maximization (EM) algorithm is applied to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested (at this point the field for the occupation attribute in the attribute information of the user to be tested is no longer empty) to update the mixing coefficients, the mean vectors and the number of occupations.
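The effect of the update can be illustrated with a hard-assignment version: once the user to be tested carries a determined occupation, that user is folded into the running counts and means. This is a simplified stand-in for the EM update named above, not the EM algorithm itself, and the function name `update_parameters` and the data are hypothetical.

```python
def update_parameters(counts, means, x, occupation):
    """Fold one newly labelled user into the per-occupation statistics.

    counts: occupation -> number of labelled users so far.
    means: occupation -> mean attribute vector so far.
    Returns (counts, means, mixing, n_occupations) without mutating the inputs.
    """
    counts = dict(counts)
    means = {occ: list(m) for occ, m in means.items()}
    n = counts.get(occupation, 0)
    old_mean = means.get(occupation, [0.0] * len(x))
    # Running mean: (old_mean * n + x) / (n + 1), attribute by attribute.
    means[occupation] = [(m * n + xi) / (n + 1) for m, xi in zip(old_mean, x)]
    counts[occupation] = n + 1
    total = sum(counts.values())
    mixing = {occ: c / total for occ, c in counts.items()}
    return counts, means, mixing, len(counts)


counts = {"teacher": 2, "driver": 2}
means = {"teacher": [0.1, 0.2], "driver": [0.9, 0.8]}
new_counts, new_means, new_mixing, n_occ = update_parameters(counts, means, [0.4, 0.5], "teacher")
print(new_counts, new_means, new_mixing, n_occ)
```

A newly seen occupation would add a key to `counts`, which is how the number of occupations can change during an update.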
According to another aspect of the embodiments of the present invention, an occupation prediction system corresponding to the above method is provided.
Referring to Fig. 5, Fig. 5 is a schematic diagram of the occupation prediction system of an embodiment of the present disclosure.
As shown in Fig. 5, the system includes:
an obtaining module 1, configured to separately obtain attribute information of sample users whose occupations are known and attribute information of a user to be tested;
a calculating module 2, configured to calculate mixing coefficients, mean vectors and the number of occupations according to the attribute information of the sample users;
a maximum probability determining module 3, configured to determine the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model;
an occupation determining module 4, configured to determine the occupation of the user to be tested according to the maximum probability and a preset threshold.
In some embodiments, the threshold includes a first threshold, and the occupation determining module 4 is configured to:
in response to the maximum probability being greater than or equal to the first threshold, determine the occupation corresponding to the maximum probability as the occupation of the user to be tested.
In some embodiments, the threshold includes a first threshold and a second threshold, and the occupation determining module 4 is configured to:
in response to the maximum probability being less than the first threshold and greater than the second threshold, perform clustering on all users according to the attribute information of the sample users and the attribute information of the user to be tested, where all users include the sample users and the user to be tested;
calculate, for each class, the percentage of the number of sample users accounted for by the users with known occupations in that class;
select the occupation of the users with known occupations that has the largest percentage;
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determine the selected occupation as the occupation of the user to be tested.
In some embodiments, the occupation determining module 4 is configured to:
in response to the maximum probability being less than or equal to the second threshold, calculate the similarity between the attribute information of the user to be tested and the attribute information of the sample users;
select the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be tested.
With reference to Fig. 6, in some embodiments the system further includes:
an update module 5, configured to update the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested.
The reader should understand that, in the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example" or "some examples" means that a specific feature, structure or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the devices and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiments of the present invention.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
It should also be understood that, in the embodiments of the present invention, the magnitude of the sequence numbers of the above processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and these modifications or substitutions shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. An occupation prediction method, characterized in that the method comprises:
obtaining attribute information of sample users of known occupation and attribute information of a user to be measured, respectively;
calculating a mixing coefficient, a mean vector and a number of occupations according to the attribute information of the sample users;
determining a maximum probability corresponding to the user to be measured according to the mixing coefficient, the mean vector, the number of occupations, the attribute information of the user to be measured and a Gaussian mixture model;
determining the occupation of the user to be measured according to the maximum probability and a preset threshold.
2. The method according to claim 1, characterized in that the threshold includes a first threshold, and determining the occupation of the user to be measured according to the maximum probability and the preset threshold comprises:
in response to the maximum probability being greater than or equal to the first threshold, determining the occupation corresponding to the maximum probability as the occupation of the user to be measured.
3. The method according to claim 1, characterized in that the threshold includes a first threshold and a second threshold, and determining the occupation of the user to be measured according to the maximum probability and the preset threshold comprises:
in response to the maximum probability being less than the first threshold and greater than the second threshold, performing clustering on all users according to the attribute information of the sample users and the attribute information of the user to be measured, wherein all users include the sample users and the user to be measured;
calculating, for each cluster, the number of users of known occupation as a percentage of the number of sample users;
selecting the occupation of the users of known occupation with the largest percentage;
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determining the selected occupation as the occupation of the user to be measured.
4. The method according to claim 3, characterized in that the method further comprises:
in response to the maximum probability being less than or equal to the second threshold, calculating the similarity between the attribute information of the user to be measured and the attribute information of each sample user;
selecting the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be measured.
5. The method according to any one of claims 1 to 3, characterized in that, after the occupation of the user to be measured is determined, the method further comprises:
updating the mixing coefficient, the mean vector and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be measured.
6. An occupation prediction system, characterized in that the system comprises:
an obtaining module, configured to obtain attribute information of sample users of known occupation and attribute information of a user to be measured, respectively;
a computing module, configured to calculate a mixing coefficient, a mean vector and a number of occupations according to the attribute information of the sample users;
a maximum probability determining module, configured to determine a maximum probability corresponding to the user to be measured according to the mixing coefficient, the mean vector, the number of occupations, the attribute information of the user to be measured and a Gaussian mixture model;
an occupation determining module, configured to determine the occupation of the user to be measured according to the maximum probability and a preset threshold.
7. The system according to claim 6, characterized in that the threshold includes a first threshold, and the occupation determining module is configured to:
in response to the maximum probability being greater than or equal to the first threshold, determine the occupation corresponding to the maximum probability as the occupation of the user to be measured.
8. The system according to claim 7, characterized in that the threshold includes a first threshold and a second threshold, and the occupation determining module is configured to:
in response to the maximum probability being less than the first threshold and greater than the second threshold, perform clustering on all users according to the attribute information of the sample users and the attribute information of the user to be measured, wherein all users include the sample users and the user to be measured;
calculate, for each cluster, the number of users of known occupation as a percentage of the number of sample users;
select the occupation of the users of known occupation with the largest percentage;
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determine the selected occupation as the occupation of the user to be measured.
9. The system according to claim 8, characterized in that the occupation determining module is configured to:
in response to the maximum probability being less than or equal to the second threshold, calculate the similarity between the attribute information of the user to be measured and the attribute information of each sample user;
select the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be measured.
10. The system according to any one of claims 6 to 9, characterized in that the system further comprises:
an update module, configured to update the mixing coefficient, the mean vector and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be measured.
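As a non-limiting illustration of the claimed flow, the maximum probability and the threshold-based routing may be sketched as follows. The claims list only a mixing coefficient, a mean vector and a number of occupations as model parameters, so the sketch assumes an identity covariance for every Gaussian component; the function names and the threshold values in the usage note are hypothetical.

```python
import math


def gaussian_density(x, mean, variance=1.0):
    """Density of an isotropic Gaussian (identity covariance is an assumption)."""
    dim = len(x)
    squared_distance = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    normaliser = (2 * math.pi * variance) ** (dim / 2)
    return math.exp(-squared_distance / (2 * variance)) / normaliser


def max_posterior(x, mixing, means):
    """Return (occupation, posterior probability) of the most probable component."""
    weighted = {
        occ: mixing[occ] * gaussian_density(x, means[occ]) for occ in mixing
    }
    total = sum(weighted.values())
    if total == 0:
        return None, 0.0  # densities underflowed; no component can be preferred
    best = max(weighted, key=weighted.get)
    return best, weighted[best] / total


def route(max_probability, first_threshold, second_threshold):
    """Select which branch determines the occupation of the user to be measured."""
    if max_probability >= first_threshold:
        return "accept_gmm"           # occupation of the maximum probability is used
    if max_probability > second_threshold:
        return "cluster_check"        # clustering result must agree with the GMM result
    return "similarity_fallback"      # occupation of the most similar sample user is used
```

With hypothetical thresholds of 0.8 and 0.4, a maximum probability of 0.95 is accepted directly, 0.6 is routed to the clustering check, and 0.4 falls through to the similarity comparison.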
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910667159.1A CN110363359A (en) | 2019-07-23 | 2019-07-23 | A kind of occupation prediction technique and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910667159.1A CN110363359A (en) | 2019-07-23 | 2019-07-23 | A kind of occupation prediction technique and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110363359A (en) | 2019-10-22 |
Family
ID=68219786
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910667159.1A Pending CN110363359A (en) | 2019-07-23 | 2019-07-23 | A kind of occupation prediction technique and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110363359A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111179055A (en) * | 2019-12-20 | 2020-05-19 | 北京淇瑀信息科技有限公司 | Credit limit adjusting method and device and electronic equipment |
CN112785163A (en) * | 2021-01-26 | 2021-05-11 | 维沃移动通信有限公司 | Occupation recognition method, device, equipment and medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102033965A (en) * | 2011-01-17 | 2011-04-27 | 安徽海汇金融投资集团有限公司 | Method and system for classifying data based on classification model |
CN107563410A (en) * | 2017-08-04 | 2018-01-09 | 中国科学院自动化研究所 | The sorting technique and equipment with multi-task learning are unanimously clustered based on topic categories |
CN109492093A (en) * | 2018-09-30 | 2019-03-19 | 平安科技(深圳)有限公司 | File classification method and electronic device based on gauss hybrid models and EM algorithm |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111179055A (en) * | 2019-12-20 | 2020-05-19 | 北京淇瑀信息科技有限公司 | Credit limit adjusting method and device and electronic equipment |
CN111179055B (en) * | 2019-12-20 | 2024-04-02 | 北京淇瑀信息科技有限公司 | Credit line adjusting method and device and electronic equipment |
CN112785163A (en) * | 2021-01-26 | 2021-05-11 | 维沃移动通信有限公司 | Occupation recognition method, device, equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108280477B (en) | Method and apparatus for clustering images | |
CN109344736B (en) | Static image crowd counting method based on joint learning | |
US10810870B2 (en) | Method of processing passage record and device | |
CN107133277B (en) | A kind of tourist attractions recommended method based on Dynamic Theme model and matrix decomposition | |
CN106817251B (en) | Link prediction method and device based on node similarity | |
CN104199818B (en) | Method is recommended in a kind of socialization based on classification | |
WO2015165372A1 (en) | Method and apparatus for classifying object based on social networking service, and storage medium | |
CN108876470B (en) | Tag user expansion method, computer device, and storage medium | |
CN112221159B (en) | Virtual item recommendation method and device and computer readable storage medium | |
Tekin et al. | Adaptive ensemble learning with confidence bounds | |
Yucel et al. | Identification of social relation within pedestrian dyads | |
Hong et al. | Topic models to infer socio-economic maps | |
CN109388674A (en) | Data processing method, device, equipment and readable storage medium storing program for executing | |
CN107633257B (en) | Data quality evaluation method and device, computer readable storage medium and terminal | |
CN113051930B (en) | Intent recognition method and device based on Bert model and related equipment | |
CN111831894B (en) | Information matching method and device | |
CN112365007B (en) | Model parameter determining method, device, equipment and storage medium | |
CN109300041A (en) | Typical karst ecosystem recommended method, electronic device and readable storage medium storing program for executing | |
CN110377829A (en) | Function recommended method and device applied to electronic equipment | |
CN108198172A (en) | Image significance detection method and device | |
CN106778851A (en) | Social networks forecasting system and its method based on Mobile Phone Forensics data | |
CN110363359A (en) | A kind of occupation prediction technique and system | |
CN112733035A (en) | Knowledge point recommendation method and device based on knowledge graph, storage medium and electronic device | |
KR102449694B1 (en) | Method and apparatus for providing meeting matching service based on artificial intelligence | |
CN113609337A (en) | Pre-training method, device, equipment and medium of graph neural network |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191022 |