CN110363359A - Occupation prediction method and system - Google Patents

Publication number
CN110363359A
CN110363359A CN201910667159.1A CN201910667159A CN110363359A CN 110363359 A CN110363359 A CN 110363359A CN 201910667159 A CN201910667159 A CN 201910667159A CN 110363359 A CN110363359 A CN 110363359A
Authority
CN
China
Prior art keywords
occupation
user
measured
attribute information
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910667159.1A
Other languages
Chinese (zh)
Inventor
刘颖慧
许丹丹
王笑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd
Priority to CN201910667159.1A
Publication of CN110363359A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/105 Human resources

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention relate to an occupation prediction method and system. The method comprises: separately obtaining attribute information of sample users whose occupations are known and attribute information of a user to be tested; calculating mixing coefficients, mean vectors and the number of occupations from the attribute information of the sample users; determining the maximum probability corresponding to the user to be tested from the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model; and determining the occupation of the user to be tested from the maximum probability and preset thresholds. Calculating three parameters (the mixing coefficients, the mean vectors and the number of occupations) from the attribute information allows the occupation of the user to be tested to be predicted from multiple dimensions, and combining the Gaussian mixture model with these three parameters makes use of the semi-supervised learning characteristic of the Gaussian mixture model to give a more accurate prediction result.

Description

Occupation prediction method and system
Technical field
Embodiments of the present invention relate to the field of computer technology, and in particular to an occupation prediction method and system.
Background
In the prior art, a user's occupation is usually predicted in one of two ways. In the first, back-office staff rely on their working experience to make a prediction from the basic information provided by the user to be tested. In the second, a model is built from a single index, such as the interests or the abilities of the users (including sample users and the user to be tested), and the occupation of the user to be tested is predicted with that model.
In the course of implementing the present invention, the inventors found at least the following drawbacks. With the first way, the prediction result is overly subjective and prediction accuracy is low. With the second way, a single index such as interest or ability cannot accurately assess the depth and breadth of a person's fit to an occupation, or how that fit will develop.
Summary of the invention
According to one aspect of the embodiments of the present invention, an occupation prediction method is provided, the method comprising:
separately obtaining attribute information of sample users whose occupations are known and attribute information of a user to be tested;
calculating mixing coefficients, mean vectors and the number of occupations according to the attribute information of the sample users;
determining the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model; and
determining the occupation of the user to be tested according to the maximum probability and preset thresholds.
In some embodiments, the thresholds include a first threshold, and determining the occupation of the user to be tested according to the maximum probability and the preset thresholds comprises:
in response to the maximum probability being greater than or equal to the first threshold, determining the occupation corresponding to the maximum probability as the occupation of the user to be tested.
In some embodiments, the thresholds include a first threshold and a second threshold, and determining the occupation of the user to be tested according to the maximum probability and the preset thresholds comprises:
in response to the maximum probability being less than the first threshold and greater than the second threshold, clustering all users according to the attribute information of the sample users and the attribute information of the user to be tested, where all users include the sample users and the user to be tested;
calculating, for each cluster, the number of users of known occupation in that cluster as a percentage of the number of sample users;
selecting the occupation of the users of known occupation with the largest percentage; and
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determining the selected occupation as the occupation of the user to be tested.
In some embodiments, the method further comprises:
in response to the maximum probability being less than or equal to the second threshold, calculating the similarity between the attribute information of the user to be tested and the attribute information of the sample users; and
selecting the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be tested.
In some embodiments, after the occupation of the user to be tested has been determined, the method further comprises:
updating the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested.
According to another aspect of the embodiments of the present disclosure, an occupation prediction system is also provided, the system comprising:
an obtaining module, configured to separately obtain attribute information of sample users whose occupations are known and attribute information of a user to be tested;
a computing module, configured to calculate mixing coefficients, mean vectors and the number of occupations according to the attribute information of the sample users;
a maximum probability determining module, configured to determine the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model; and
an occupation determining module, configured to determine the occupation of the user to be tested according to the maximum probability and preset thresholds.
In some embodiments, the thresholds include a first threshold and a second threshold, and the occupation determining module is configured to:
in response to the maximum probability being less than the first threshold and greater than the second threshold, cluster all users according to the attribute information of the sample users and the attribute information of the user to be tested, where all users include the sample users and the user to be tested;
calculate, for each cluster, the number of users of known occupation in that cluster as a percentage of the number of sample users;
select the occupation of the users of known occupation with the largest percentage; and
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determine the selected occupation as the occupation of the user to be tested.
In some embodiments, the occupation determining module is configured to:
in response to the maximum probability being less than or equal to the second threshold, calculate the similarity between the attribute information of the user to be tested and the attribute information of the sample users; and
select the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be tested.
In some embodiments, the system further comprises:
an update module, configured to update the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested.
The beneficial effect of the embodiments of the present invention is as follows. The technical solution obtains attribute information of sample users of known occupation and attribute information of a user to be tested; calculates mixing coefficients, mean vectors and the number of occupations from the attribute information of the sample users; determines the maximum probability corresponding to the user to be tested from the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model; and determines the occupation of the user to be tested from the maximum probability and preset thresholds. Calculating three parameters (the mixing coefficients, the mean vectors and the number of occupations) from the attribute information allows the occupation of the user to be tested to be predicted from multiple dimensions, and combining the Gaussian mixture model with these three parameters makes use of the semi-supervised learning characteristic of the Gaussian mixture model to give a more accurate prediction result.
Brief description of the drawings
Fig. 1 is a flow diagram of the occupation prediction method of an embodiment of the present disclosure;
Fig. 2 is a flow diagram of the method of determining the occupation of the user to be tested according to the maximum probability and preset thresholds in an embodiment of the present disclosure;
Fig. 3 is a flow diagram of the method of determining the occupation of the user to be tested according to the maximum probability and preset thresholds in another embodiment of the present disclosure;
Fig. 4 is a flow diagram of the occupation prediction method of another embodiment of the present disclosure;
Fig. 5 is a schematic diagram of the occupation prediction system of an embodiment of the present disclosure;
Fig. 6 is a schematic diagram of the occupation prediction system of another embodiment of the present disclosure;
Reference numerals: 1, obtaining module; 2, computing module; 3, maximum probability determining module; 4, occupation determining module; 5, update module.
Detailed description
In the following description, specific details such as particular system structures, interfaces and techniques are set forth for the purpose of illustration rather than limitation, to provide a thorough understanding of the present invention. However, it will be clear to those skilled in the art that the present invention may also be practiced in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, circuits and methods are omitted so that unnecessary detail does not obscure the description of the present invention.
The embodiments of the present invention provide an occupation prediction method and system.
According to one aspect of the embodiments of the present invention, an occupation prediction method is provided.
Referring to Fig. 1, Fig. 1 is a flow diagram of the occupation prediction method of an embodiment of the present disclosure.
As shown in Fig. 1, the method comprises:
S1: separately obtaining attribute information of sample users whose occupations are known and attribute information of a user to be tested.
In this step, users are divided into two classes: users of known occupation and users of unknown occupation. The users of known occupation are the sample users, and the user of unknown occupation is the user to be tested.
The attribute information is information related to occupation. Specifically, the attribute information includes the occupation and one or more of: the location information of the base station, the number of times application programs are used within a preset duration, education level, network-entry time, age, the average monthly billed amount within the preset duration, and the number of months within the preset duration in which the billed amount is 0.
Specifically, the location information of the base station includes the longitude and the latitude of the base station.
Preferably, some application programs have a high usage frequency and are used by almost all users, so they cannot distinguish the characteristics of different users; such application programs are therefore deleted from the attribute information. The usage frequency is obtained by dividing the number of times an application program is used within the preset duration by the total number of times all application programs are used, and is then compared with a preset threshold: if the usage frequency is greater than the threshold, it is considered high; otherwise, it is considered low.
The application programs include but are not limited to: financial management applications, social applications, image processing applications, camera applications, map applications, transport applications, travel applications, reading applications, video processing applications and business administration applications.
By way of example, social applications include but are not limited to Alipay, WeChat and Weibo. Because Alipay, WeChat and Weibo have a high usage frequency and most users use them, they are deleted from the attribute information.
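The usage-frequency filter described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `drop_ubiquitous_apps`, the application names and the threshold value 0.15 are all assumptions, and only the frequency test is shown (the "used by most users" condition is not modelled).

```python
def drop_ubiquitous_apps(app_counts, freq_threshold):
    """Return app_counts without apps whose usage share exceeds freq_threshold.

    app_counts maps an application name to its number of uses within the
    preset duration; the share is that count divided by the total count of
    all applications (the frequency definition given in the text).
    """
    total = sum(app_counts.values())
    if total == 0:
        return dict(app_counts)
    return {
        app: count
        for app, count in app_counts.items()
        if count / total <= freq_threshold
    }


# Hypothetical usage counts; the threshold 0.15 is an assumed value.
counts = {"wechat": 700, "alipay": 200, "stock_app": 60, "map_app": 40}
kept = drop_ubiquitous_apps(counts, freq_threshold=0.15)
```

With these made-up counts, the two dominant applications are removed and the two low-share applications remain as distinguishing features.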
In some embodiments, after the attribute information is obtained, it is preprocessed, and the subsequent steps are carried out on the preprocessed attribute information. Taking user A among the sample users of known occupation as an example, the preprocessing of user A's attribute information is described in detail as follows:
Education level can be divided into junior middle school and below, senior middle school, university, postgraduate, and doctorate and above. Each level is assigned a value, for example 1 to 5 in that order, and the corresponding field is marked with that value.
The network-entry time is converted with the formula (current time - network-entry time) / age to obtain the value of the corresponding field.
The location information of the base station, the age, the average monthly billed amount within the preset duration, the number of months within the preset duration in which the billed amount is 0, and the number of times application programs are used within the preset duration are standardized (that is, normalized). The standardization can be done in a manner known in the prior art, for example standard value = (X - Xmin) / (Xmax - Xmin), where X is the value of a given field, and Xmin and Xmax are the minimum and maximum values of that field over all users.
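The three preprocessing rules above can be sketched as follows. The dictionary keys, function names and the use of years as the time unit are assumptions made for illustration; only the 1 to 5 coding, the (current time - network-entry time) / age formula and the min-max formula come from the text.

```python
# Education levels coded 1 to 5 in the order given in the text.
# The key names are hypothetical.
EDUCATION_CODES = {
    "junior_high_or_below": 1,
    "senior_high": 2,
    "university": 3,
    "postgraduate": 4,
    "doctorate_or_above": 5,
}


def tenure_ratio(current_year, join_year, age):
    """(current time - network-entry time) / age, with time assumed in years."""
    return (current_year - join_year) / age


def min_max_normalize(values):
    """Scale values to [0, 1] with (X - Xmin) / (Xmax - Xmin)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All users share one value, so the formula is undefined; use 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, 30, 40]
normalized_ages = min_max_normalize(ages)   # [0.0, 0.5, 1.0]
ratio = tenure_ratio(2019, 2009, 40)        # 0.25
education = EDUCATION_CODES["university"]   # 3
```

The zero-range guard in `min_max_normalize` is a design choice of this sketch; the text does not say how a constant field should be handled.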
S2: calculating mixing coefficients, mean vectors and the number of occupations according to the attribute information of the sample users.
This can be done with a calculation method known in the prior art, for example by means of a neural network model.
Preferably, maximum likelihood estimation is applied to the attribute information of the sample users to obtain the mixing coefficients, the mean vectors and the number of occupations.
Specifically, the calculation uses formula 1:
$$LL = \ln\Big(\sum_{i=1}^{N} a_i \, p(x_j \mid u_i, \Sigma_i)\, p(y_j \mid \theta = i, x_j)\Big) + \ln\Big(\sum_{i=1}^{N} a_i \, p(x_j \mid u_i, \Sigma_i)\Big)$$
where θ ∈ {1, ..., N}; x_j denotes the attribute information of the j-th user; y_j denotes the occupation of the j-th user; θ = i means belonging to the i-th distribution; and p(y_j | θ = i, x_j) is the probability that user j, whose attribute information is x_j and whose occupation is y_j, belongs to the i-th Gaussian distribution. a_i is the mixing coefficient, u_i is the mean vector, and N is the number of occupations.
In some embodiments, a_i, u_i and N are substituted into formula 1 and iterated until convergence; the a_i obtained at convergence is taken as the mixing coefficient, u_i as the mean vector and N as the number of occupations.
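As a rough illustration of where the three parameters come from, the sketch below computes closed-form estimates from fully labeled sample users: the share of each occupation as its mixing coefficient, the per-attribute mean as its mean vector, and the count of distinct occupations as N. This is a simplification under stated assumptions, not the iterative procedure over formula 1 described above; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def estimate_parameters(samples):
    """samples: list of (feature_vector, occupation) pairs.

    Returns (mixing, means, n_occupations), where mixing[occ] is the share of
    samples with that occupation and means[occ] is their per-feature mean.
    """
    groups = defaultdict(list)
    for features, occupation in samples:
        groups[occupation].append(features)

    total = len(samples)
    mixing = {occ: len(rows) / total for occ, rows in groups.items()}
    means = {
        occ: [sum(col) / len(rows) for col in zip(*rows)]
        for occ, rows in groups.items()
    }
    return mixing, means, len(groups)


# Hypothetical, already normalized attribute vectors.
samples = [
    ([0.0, 1.0], "teacher"),
    ([1.0, 0.0], "driver"),
    ([0.2, 0.8], "teacher"),
    ([0.4, 0.6], "teacher"),
]
mixing, means, n_occupations = estimate_parameters(samples)
```

These values could serve as a starting point for an iterative fit; covariance estimation is left out to keep the sketch short.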
S3: determining the maximum probability corresponding to the user to be tested according to the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and the Gaussian mixture model.
In this step, the mixing coefficients, the mean vectors, the number of occupations and the attribute information of the user to be tested are the inputs of the Gaussian mixture model, and the output is the maximum probability p(x) corresponding to the user to be tested.
In some embodiments, the Gaussian mixture model can be expressed by formula 2:
$$p(x) = \sum_{i=1}^{N} a_i \, p(x \mid u_i, \Sigma_i)\, p(y_j \mid \theta = i, x_j)$$
where x is the attribute information of the user to be tested.
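One way to read step S3 is as computing, for each Gaussian component, the posterior probability that the user to be tested belongs to it, and keeping the largest. The sketch below does this under several assumptions that are not in the source: every component is isotropic with one shared variance (0.05 here), the label factor p(y_j | θ = i, x_j) of formula 2 is dropped because the user's occupation is unknown, and all names and numbers are hypothetical.

```python
import math


def gaussian_density(x, mean, variance):
    """Density of an isotropic Gaussian with the given mean and variance."""
    dim = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    norm = (2.0 * math.pi * variance) ** (dim / 2.0)
    return math.exp(-sq_dist / (2.0 * variance)) / norm


def max_component_probability(x, mixing, means, variance=0.05):
    """Return (occupation, posterior) for the most probable component."""
    weighted = {
        occ: mixing[occ] * gaussian_density(x, means[occ], variance)
        for occ in mixing
    }
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("x is too far from every component to score")
    posteriors = {occ: w / total for occ, w in weighted.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors[best]


# Hypothetical parameters, as if produced by step S2.
mixing = {"teacher": 0.5, "driver": 0.5}
means = {"teacher": [0.2, 0.8], "driver": [0.9, 0.1]}
occupation, probability = max_component_probability([0.25, 0.75], mixing, means)
```

Because the posteriors sum to 1, the returned probability is directly comparable with thresholds such as those used in step S4.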
S4: determining the occupation of the user to be tested according to the maximum probability and preset thresholds.
The embodiments of the present disclosure provide a new occupation prediction method, which comprises: separately obtaining attribute information of sample users of known occupation and attribute information of a user to be tested; calculating mixing coefficients, mean vectors and the number of occupations from the attribute information of the sample users; determining the maximum probability corresponding to the user to be tested from the mixing coefficients, the mean vectors, the number of occupations, the attribute information of the user to be tested and a Gaussian mixture model; and determining the occupation of the user to be tested from the maximum probability and preset thresholds. In the embodiments of the present disclosure, calculating three parameters (the mixing coefficients, the mean vectors and the number of occupations) from the attribute information allows the occupation of the user to be tested to be predicted from multiple dimensions, and combining the Gaussian mixture model with these three parameters makes use of the semi-supervised learning characteristic of the Gaussian mixture model to give a more accurate prediction result.
In some embodiments, the thresholds include a first threshold, and S4 comprises: in response to the maximum probability being greater than or equal to the first threshold, determining the occupation corresponding to the maximum probability as the occupation of the user to be tested.
This step specifically includes: comparing the maximum probability with the first threshold, and if the maximum probability is greater than or equal to the first threshold, determining the occupation corresponding to the maximum probability as the occupation of the user to be tested.
More specifically, the Gaussian component (that is, the component of the Gaussian mixture model) corresponding to the maximum probability is determined, and the occupation corresponding to that Gaussian component is determined as the occupation of the user to be tested.
In some embodiments, the first threshold is set to 0.6.
As shown in Fig. 2, the thresholds include a first threshold and a second threshold, and S4 comprises:
S41: in response to the maximum probability being less than the first threshold and greater than the second threshold, clustering all users according to the attribute information of the sample users and the attribute information of the user to be tested, where all users include the sample users and the user to be tested.
This step specifically includes: comparing the maximum probability with the first threshold; if the maximum probability is less than the first threshold, comparing the maximum probability with the second threshold; and if the maximum probability is greater than the second threshold, clustering all users.
In some embodiments, the second threshold is set to 0.3.
In some embodiments, all users are clustered with a density clustering method.
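The threshold comparisons in S4 amount to a three-way dispatch on the maximum probability. The sketch below uses the example values 0.6 and 0.3 given in the text; the function name and the returned strategy labels are hypothetical, and the third branch corresponds to the case where the maximum probability is less than or equal to the second threshold.

```python
FIRST_THRESHOLD = 0.6   # example value given in the text
SECOND_THRESHOLD = 0.3  # example value given in the text


def choose_strategy(max_probability,
                    first=FIRST_THRESHOLD, second=SECOND_THRESHOLD):
    """Map the maximum probability to the branch of S4 that handles it."""
    if max_probability >= first:
        # Accept the occupation of the most probable Gaussian component.
        return "accept_gmm_occupation"
    if max_probability > second:
        # Cross-check the Gaussian mixture result by clustering all users.
        return "verify_by_clustering"
    # At or below the second threshold: use attribute similarity instead.
    return "fall_back_to_similarity"
```

For example, `choose_strategy(0.45)` selects the clustering branch, while a value exactly equal to the first threshold is accepted directly.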
S42: calculating, for each cluster, the number of users of known occupation in that cluster as a percentage of the number of sample users.
After the clustering of S41, all users are divided into different clusters. A cluster may contain only users of unknown occupation (that is, users to be tested), only users of known occupation (that is, some or all of the sample users), or both users of unknown occupation and users of known occupation.
In this step, the percentage of sample users accounted for by the users of known occupation in each cluster is calculated. Specifically, the users of known occupation in each cluster are counted, and the ratio of each cluster's count to the number of sample users is calculated as a percentage.
S43: selecting the occupation of the users of known occupation with the largest percentage.
S44: in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determining the selected occupation as the occupation of the user to be tested.
S44 specifically includes: judging whether the selected occupation is the same as the occupation corresponding to the maximum probability, and if so, determining the selected occupation as the occupation of the user to be tested.
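Steps S42 to S44 can be sketched as follows, starting from cluster assignments that some clustering algorithm has already produced. The text leaves room for interpretation, so this sketch makes assumptions: the cluster holding the largest share of sample users is chosen, and the majority occupation among its labeled users is taken as the selected occupation. All names and data are hypothetical.

```python
from collections import Counter


def occupation_from_clusters(cluster_ids, occupations):
    """cluster_ids[k] is the cluster of user k; occupations[k] is the known
    occupation of user k, or None for a user to be tested."""
    n_samples = sum(1 for occ in occupations if occ is not None)
    labeled_by_cluster = {}
    for cluster, occ in zip(cluster_ids, occupations):
        if occ is not None:
            labeled_by_cluster.setdefault(cluster, []).append(occ)

    # Share of all sample users that falls into each cluster (S42).
    shares = {c: len(occs) / n_samples for c, occs in labeled_by_cluster.items()}
    # Cluster with the largest share, then its majority occupation (S43).
    best_cluster = max(shares, key=shares.get)
    majority, _ = Counter(labeled_by_cluster[best_cluster]).most_common(1)[0]
    return majority, shares[best_cluster]


def confirm(gmm_occupation, cluster_ids, occupations):
    """Return the selected occupation if it matches the GMM result (S44)."""
    selected, _ = occupation_from_clusters(cluster_ids, occupations)
    return selected if selected == gmm_occupation else None


cluster_ids = [0, 0, 0, 1, 1, 0]
occupations = ["teacher", "teacher", "driver", "driver", None, None]
selected, share = occupation_from_clusters(cluster_ids, occupations)
```

Here cluster 0 holds three of the four sample users, so its majority occupation is selected; `confirm` returns `None` when that occupation disagrees with the Gaussian mixture result, which is the case the text handles by re-clustering.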
In some embodiments, if the selected occupation differs from the occupation corresponding to the maximum probability, a different clustering method is used in S41 to cluster all users, and the subsequent steps (S42 to S44) are executed again. For example, if density clustering was used in S41, the K-means algorithm is used this time. If the occupation of the user to be tested still cannot be determined with K-means (that is, the selected occupation still differs from the occupation corresponding to the maximum probability), the occupation of the user to be tested is predicted by calculating similarity; the specific implementation is described later.
As shown in Fig. 3, in some embodiments, S4 further comprises:
S411: in response to the maximum probability being less than or equal to the second threshold, calculating the similarity between the attribute information of the user to be tested and the attribute information of the sample users.
This step specifically includes: comparing the maximum probability with the second threshold, and if the maximum probability is less than or equal to the second threshold, calculating the similarity between the attribute information of the user to be tested and the attribute information of the sample users. The similarity can be calculated with a method known in the prior art, or with the following method:
The attribute information in the above example contains 17 attributes in total (occupation, location information of the base station, and so on). The sample users are sorted by each attribute, giving 17 orderings in total. From each ordering, the users whose attribute value lies at the 1/4 position, the 3/4 position and the median are taken out, so at most 17 * 3 = 51 users of known occupation are taken out (in rare cases, the same user of known occupation is taken out under several attribute orderings). Cosine similarity is then used to calculate how similar each user to be tested is to these 51 users of known occupation.
S412: selecting the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user to be tested.
Continuing the above example, the highest of the 51 similarities is selected, and the occupation it refers to is determined as the occupation of the user to be tested.
In some embodiments, a mark may be added to the occupation of the user to be tested to label it a "pseudo-occupation": the user to be tested does not necessarily work in that occupation, but is highly similar to users in that occupation.
In some embodiments, the similarity can be calculated with the cosine similarity algorithm, specifically by formula 3:
$$\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
where n equals 51, x_i is the attribute information of the user to be tested, y_i is the attribute information of one of the 51 users, a is the vector corresponding to the attribute information of the user to be tested, and b is the vector corresponding to the attribute information of that sample user.
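The similarity fallback (S411 and S412) can be sketched with the standard cosine similarity. The function names and the candidate data are hypothetical, and returning 0.0 for an all-zero vector is a convention chosen for this sketch rather than something stated in the text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def most_similar_occupation(target, candidates):
    """candidates: list of (attribute_vector, occupation) pairs."""
    _, best_occupation = max(
        candidates, key=lambda item: cosine_similarity(target, item[0])
    )
    return best_occupation


# Hypothetical representative sample users and a user to be tested.
candidates = [([0.9, 0.2], "driver"), ([0.1, 0.9], "teacher")]
predicted = most_similar_occupation([1.0, 0.1], candidates)
```

The result would then be marked as a "pseudo-occupation", since it reflects similarity to a sample user rather than a confident model prediction.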
As shown in Fig. 4, in some embodiments, after the occupation of the user to be tested has been determined, the method further comprises:
S5: updating the mixing coefficients, the mean vectors and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested.
Specifically, the expectation-maximization (EM) algorithm is applied to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user to be tested (at this point, the occupation field in the attribute information of the user to be tested is no longer empty) in order to update the mixing coefficients, the mean vectors and the number of occupations.
Other side according to an embodiment of the present invention, the embodiment of the invention provides the one kind corresponded to the above method Professional forecasting system.
Referring to Fig. 5, Fig. 5 is the schematic diagram of the professional forecasting system of the embodiment of the present disclosure.
As shown in figure 5, the system includes:
Module 1 is obtained, for obtaining the attribute information of the sample of users of known occupation and the attribute letter of user to be measured respectively Breath;
Computing module 2 calculates mixed coefficint, mean vector and occupation for the attribute information according to the sample of users Number;
Maximum probability determining module 3, for according to the mixed coefficint, mean vector, the professional number, described The attribute information and gauss hybrid models of user to be measured determines the corresponding maximum probability of the user to be measured;
Professional determining module 4, for determining the occupation of the user to be measured according to the maximum probability and preset threshold value.
In some embodiments, the threshold value includes first threshold, and the occupation determining module 4 is used for:
It is greater than or equal to the first threshold in response to the maximum probability, then really by the corresponding occupation of the maximum probability It is set to the occupation of the user to be measured.
In some embodiments, the threshold value includes first threshold and second threshold, and the occupation determining module 4 is used for:
It is less than the first threshold in response to the maximum probability, and is greater than the second threshold, is used according to the sample The attribute information at family and the attribute information of the user to be measured carry out clustering processing to all users, wherein all users Including the sample of users and the user to be measured;
The number for calculating the user of known occupation in each class accounts for the percentage of the sample of users number;
Choose the occupation of the user of the known occupation of percent maximum;
It is same occupation in response to the occupation occupation corresponding with the maximum probability selected, really by the occupation selected It is set to the occupation of the user to be measured.
In some embodiments, the occupation determining module 4 is configured to:
In response to the maximum probability being less than or equal to the second threshold, calculate the similarity between the attribute information of the user under test and the attribute information of the sample users;
Select the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user under test.
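Taken together, the three embodiments of occupation determining module 4 form a cascade over the maximum probability. The sketch below shows that cascade along with a similarity fallback. Cosine similarity is an assumption, since the passage does not name a similarity measure, and the function names and the callable-based interface are hypothetical.

```python
import numpy as np

def most_similar_occupation(X_samples, y_samples, x_test):
    """Occupation of the sample user most similar to the user under test.

    Assumption: similarity is cosine similarity between attribute vectors.
    """
    num = X_samples @ x_test
    den = np.linalg.norm(X_samples, axis=1) * np.linalg.norm(x_test)
    sims = num / np.where(den == 0, 1.0, den)    # guard against zero vectors
    return y_samples[int(np.argmax(sims))]

def decide_occupation(max_prob, gmm_occupation, first_threshold,
                      second_threshold, cluster_check, similarity_fallback):
    """Three-way decision on the maximum probability.

    cluster_check and similarity_fallback are zero-argument callables so the
    more expensive branches only run when they are needed.
    """
    if max_prob >= first_threshold:
        return gmm_occupation                    # confident: accept the GMM result
    if max_prob > second_threshold:
        return cluster_check()                   # intermediate: confirm by clustering
    return similarity_fallback()                 # low: nearest sample user
```

For example, with hypothetical thresholds of 0.8 and 0.5, a maximum probability of 0.9 accepts the Gaussian mixture model's occupation directly, 0.6 triggers the clustering check, and 0.5 or below falls back to the most similar sample user.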
As can be seen from Fig. 6, in some embodiments the system further includes:
Update module 5, configured to update the mixing coefficient, the mean vector and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user under test.
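One way update module 5 could fold a newly labelled user back into the parameters is an incremental update of the per-occupation counts and means, with the mixing coefficients recomputed from the counts. This is a sketch under that assumption. The passage does not specify the update rule, and the function name and the dictionary layout are hypothetical.

```python
import numpy as np

def update_parameters(counts, means, occupation, x):
    """Add one newly labelled user and return the updated parameters.

    counts: {occupation: number of users}; means: {occupation: mean vector}.
    Returns new counts, means, mixing coefficients and number of occupations.
    """
    counts, means = dict(counts), dict(means)    # leave the inputs untouched
    n = counts.get(occupation, 0)
    prev = means.get(occupation, np.zeros_like(x, dtype=float))
    means[occupation] = (n * prev + x) / (n + 1) # running mean
    counts[occupation] = n + 1
    total = sum(counts.values())
    weights = {occ: c / total for occ, c in counts.items()}
    return counts, means, weights, len(counts)   # len(counts) = number of occupations
```

If the determined occupation was not previously among the sample users, the number of occupations grows by one.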
The reader should understand that, in the description of this specification, references to the terms "one embodiment", "some embodiments", "an example", "a specific example" or "some examples" mean that a specific feature, structure or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
It should also be understood that, in the embodiments of the present invention, the magnitude of the sequence numbers of the above processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and these modifications or substitutions shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. An occupation prediction method, characterized in that the method comprises:
obtaining attribute information of sample users of known occupation and attribute information of a user under test, respectively;
calculating a mixing coefficient, a mean vector and the number of occupations according to the attribute information of the sample users;
determining the maximum probability corresponding to the user under test according to the mixing coefficient, the mean vector, the number of occupations, the attribute information of the user under test and a Gaussian mixture model;
determining the occupation of the user under test according to the maximum probability and a preset threshold.
2. The method according to claim 1, characterized in that the threshold includes a first threshold, and determining the occupation of the user under test according to the maximum probability and the preset threshold comprises:
in response to the maximum probability being greater than or equal to the first threshold, determining the occupation corresponding to the maximum probability as the occupation of the user under test.
3. The method according to claim 1, characterized in that the threshold includes a first threshold and a second threshold, and determining the occupation of the user under test according to the maximum probability and the preset threshold comprises:
in response to the maximum probability being less than the first threshold and greater than the second threshold, clustering all users according to the attribute information of the sample users and the attribute information of the user under test, wherein all users include the sample users and the user under test;
calculating, for each cluster, the number of users of known occupation as a percentage of the number of sample users;
selecting the occupation of the known-occupation users with the largest percentage;
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determining the selected occupation as the occupation of the user under test.
4. The method according to claim 3, characterized in that the method further comprises:
in response to the maximum probability being less than or equal to the second threshold, calculating the similarity between the attribute information of the user under test and the attribute information of the sample users;
selecting the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user under test.
5. The method according to any one of claims 1 to 3, characterized in that, after the occupation of the user under test is determined, the method further comprises:
updating the mixing coefficient, the mean vector and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user under test.
6. An occupation prediction system, characterized in that the system comprises:
an obtaining module, configured to obtain attribute information of sample users of known occupation and attribute information of a user under test, respectively;
a computing module, configured to calculate a mixing coefficient, a mean vector and the number of occupations according to the attribute information of the sample users;
a maximum probability determining module, configured to determine the maximum probability corresponding to the user under test according to the mixing coefficient, the mean vector, the number of occupations, the attribute information of the user under test and a Gaussian mixture model;
an occupation determining module, configured to determine the occupation of the user under test according to the maximum probability and a preset threshold.
7. The system according to claim 6, characterized in that the threshold includes a first threshold, and the occupation determining module is configured to:
in response to the maximum probability being greater than or equal to the first threshold, determine the occupation corresponding to the maximum probability as the occupation of the user under test.
8. The system according to claim 7, characterized in that the threshold includes a first threshold and a second threshold, and the occupation determining module is configured to:
in response to the maximum probability being less than the first threshold and greater than the second threshold, cluster all users according to the attribute information of the sample users and the attribute information of the user under test, wherein all users include the sample users and the user under test;
calculate, for each cluster, the number of users of known occupation as a percentage of the number of sample users;
select the occupation of the known-occupation users with the largest percentage;
in response to the selected occupation being the same as the occupation corresponding to the maximum probability, determine the selected occupation as the occupation of the user under test.
9. The system according to claim 8, characterized in that the occupation determining module is configured to:
in response to the maximum probability being less than or equal to the second threshold, calculate the similarity between the attribute information of the user under test and the attribute information of the sample users;
select the occupation corresponding to the attribute information of the sample user with the highest similarity as the occupation of the user under test.
10. The system according to any one of claims 6 to 9, characterized in that the system further comprises:
an update module, configured to update the mixing coefficient, the mean vector and the number of occupations according to the attribute information of the sample users and the attribute information corresponding to the determined occupation of the user under test.
CN201910667159.1A 2019-07-23 2019-07-23 A kind of occupation prediction technique and system Pending CN110363359A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910667159.1A CN110363359A (en) 2019-07-23 2019-07-23 A kind of occupation prediction technique and system

Publications (1)

Publication Number Publication Date
CN110363359A true CN110363359A (en) 2019-10-22

Family

ID=68219786

Country Status (1)

Country Link
CN (1) CN110363359A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102033965A (en) * 2011-01-17 2011-04-27 安徽海汇金融投资集团有限公司 Method and system for classifying data based on classification model
CN107563410A (en) * 2017-08-04 2018-01-09 中国科学院自动化研究所 The sorting technique and equipment with multi-task learning are unanimously clustered based on topic categories
CN109492093A (en) * 2018-09-30 2019-03-19 平安科技(深圳)有限公司 File classification method and electronic device based on gauss hybrid models and EM algorithm

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111179055A (en) * 2019-12-20 2020-05-19 北京淇瑀信息科技有限公司 Credit limit adjusting method and device and electronic equipment
CN111179055B (en) * 2019-12-20 2024-04-02 北京淇瑀信息科技有限公司 Credit line adjusting method and device and electronic equipment
CN112785163A (en) * 2021-01-26 2021-05-11 维沃移动通信有限公司 Occupation recognition method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191022
