WO2020024444A1 - Method, device, storage medium, and computer device for identifying crowd performance level - Google Patents

Method, device, storage medium, and computer device for identifying crowd performance level

Info

Publication number
WO2020024444A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
leaf node
crowd
performance level
sample
Prior art date
Application number
PCT/CN2018/111118
Other languages
English (en)
French (fr)
Inventor
金戈
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020024444A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Definitions

  • the present application relates to the field of artificial intelligence technology, and in particular, to a method, a device, a storage medium, and a computer device for identifying a performance level of a crowd.
  • the crowd performance level is usually identified by a decision tree model alone; that is, the level that the decision tree model assigns to a user is taken as that user's crowd performance level.
  • however, the decision tree model may not achieve an ideal fit, and the smaller group is usually the target population.
  • for example, the data of high-performance groups is usually far less than the data of general-performance groups: the high-performance group usually accounts for about 20% of users and the general-performance group for about 80%, and the high-performance group is usually the target population. If the crowd performance level is identified by the decision tree model alone, the recognition errors of the decision tree model reduce the number of target-population users identified and lower the recall rate of the target population.
  • the present application provides a method, device, storage medium, and computer device for identifying the performance level of a crowd. Its main purpose is to reduce recognition errors at leaf nodes whose information entropy is greater than or equal to a preset threshold, so that the recall rate of the target population improves while recognition accuracy is maintained.
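  • a method for identifying a crowd performance level, including:
  • inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, to obtain the information entropy of the specific leaf node to which the user belongs;
  • if that information entropy is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, to obtain the probability value that the user belongs to the target crowd performance level;
  • determining the crowd performance level of the user according to the probability value that the user belongs to the target crowd performance level and the probability limit value of the specific leaf node.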
  • a crowd performance level identification device, including:
  • an identification unit configured to input the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, to obtain the information entropy of the specific leaf node to which the user belongs;
  • a calculation unit configured to, if the information entropy of the specific leaf node to which the user belongs is greater than or equal to a preset threshold, input the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, to obtain the probability value that the user belongs to the target crowd performance level;
  • a determining unit configured to determine the crowd performance level of the user according to the probability value that the user belongs to the target crowd performance level and the probability limit value of the specific leaf node.
  • a non-volatile readable storage medium on which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the following steps are implemented:
  • the crowd performance characteristics corresponding to a user to be identified are input into a preset decision tree model for identification, and the information entropy of the specific leaf node to which the user belongs is obtained;
  • if that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user are input into a preset logistic regression model for calculation, and the probability value that the user belongs to the target crowd performance level is obtained;
  • the crowd performance level of the user is determined according to the probability value that the user belongs to the target crowd performance level and the probability limit value of the specific leaf node.
  • a computer device including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the following steps are implemented:
  • the crowd performance characteristics corresponding to a user to be identified are input into a preset decision tree model for identification, and the information entropy of the specific leaf node to which the user belongs is obtained;
  • if that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user are input into a preset logistic regression model for calculation, and the probability value that the user belongs to the target crowd performance level is obtained;
  • the crowd performance level of the user is determined according to the probability value that the user belongs to the target crowd performance level and the probability limit value of the specific leaf node.
  • compared with identifying the crowd performance level by a decision tree model alone, the crowd performance level identification method, device, storage medium, and computer device provided by the present application can input the crowd performance characteristics corresponding to the user to be identified into a preset decision tree model for identification, to obtain the information entropy of the specific leaf node to which the user belongs.
  • if that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user can be input into a preset logistic regression model for calculation, to obtain the probability value that the user belongs to the target crowd performance level; the user's crowd performance level can then be determined according to that probability value and the probability limit value of the specific leaf node.
  • the preset decision tree model and the preset logistic regression model are thus combined to identify the crowd performance level: when the information entropy of the specific leaf node obtained by the preset decision tree model is greater than or equal to the preset threshold, the user is further identified based on the probability value calculated by the preset logistic regression model and the probability limit value of the specific leaf node, which can reduce recognition errors at leaf nodes whose information entropy is greater than or equal to the preset threshold.
  • FIG. 1 shows a flowchart of a method for identifying a crowd performance level according to an embodiment of the present application
  • FIG. 2 shows a flowchart of another method for identifying a performance level of a crowd according to an embodiment of the present application
  • FIG. 3 is a schematic structural diagram of a crowd performance level identification device according to an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of another device for identifying a performance level of a crowd according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • the crowd performance level is usually identified by a decision tree model alone; that is, the level that the decision tree model assigns to a user is taken as that user's crowd performance level.
  • however, the decision tree model may not achieve an ideal fit.
  • the smaller group is usually the target population. If the crowd performance level is identified by the decision tree model alone, the recognition errors of the decision tree model reduce the number of target-population users identified and lower the recall rate of the target population.
  • an embodiment of the present application provides a method for identifying a performance level of a crowd. As shown in FIG. 1, the method includes:
  • the crowd performance characteristics may include, but are not limited to, the average number of courses studied per month, the latitude and longitude of the work address, and the number of Internet trading products in a single month.
  • the crowd performance level may include a high performance level and a general performance level.
  • the preset decision tree model may be established according to a sample crowd performance characteristic and a sample crowd performance level, and the preset decision tree model may save a correspondence rule between a crowd performance characteristic and a crowd performance level.
  • the specific leaf node to which the user belongs may be the leaf node into which the preset decision tree model classifies the user according to the input crowd performance characteristics; that is, each crowd performance characteristic may serve as a node for distinguishing user categories. After the crowd performance characteristics are input into the preset decision tree model, the model classifies the user according to those characteristics and finally assigns the user to a specific leaf node.
  • for example, suppose the user's average number of monthly learning courses is 50, the work address latitude and longitude are (123.435, 41.819), and the number of Internet trading products in a single month is 10.
  • the preset decision tree model first classifies the user by the average number of monthly learning courses (50), then by the latitude and longitude of the work address (123.435, 41.819), and then by the number of Internet trading products in a single month (10). The user finally lands in leaf node 1, so leaf node 1 is the specific leaf node to which the user belongs.
  • information entropy can be used to represent the uncertainty of classification.
  • the information entropy can be calculated by the following formula: $H(D) = -\sum_{i=1}^{c} p_i \log_2 p_i$
  • where D represents the sample set formed by combining the sample crowd performance characteristics and the sample crowd performance levels, c represents the number of categories of the sample crowd performance level, and $p_i$ represents the probability or proportion of sample users classified into category i of the sample crowd performance level.
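  • a minimal Python sketch of the entropy calculation described above. It assumes a base-2 logarithm (which the example thresholds of 0.88 and 0.89 suggest for two performance levels); the function name `information_entropy` and the sample leaf are hypothetical and not taken from the application.

```python
import math
from collections import Counter


def information_entropy(levels):
    """Shannon entropy (base 2) of the performance levels of the samples in one leaf node."""
    total = len(levels)
    if total == 0:
        return 0.0
    counts = Counter(levels)
    # p_i is the proportion of samples in category i; sum -p_i * log2(p_i) over categories.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical leaf holding 40 high-performance and 60 general-performance samples.
leaf_levels = ["high"] * 40 + ["general"] * 60
print(round(information_entropy(leaf_levels), 3))  # 0.971
```

  • with two categories the entropy ranges from 0 (all samples in one category) to 1 (an even split), so a leaf at or above a threshold such as 0.88 is one whose samples are strongly mixed.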
  • the preset threshold may be set according to user requirements; for example, it may be 0.88 or 0.89. If the information entropy of the specific leaf node to which the user belongs is greater than or equal to the preset threshold, that leaf node of the preset decision tree model may have classified the user incorrectly; inputting the crowd performance characteristics corresponding to the user into the preset logistic regression model for calculation then allows the user to be classified and identified a second time.
  • the target crowd performance level may be the crowd performance level that has the lower proportion in the preset decision tree model. Specifically, in the embodiments of the present application, the target crowd performance level may be the high performance level, so the preset logistic regression model can output the probability value that the user belongs to the high performance level.
  • the probability limit value of the specific leaf node may be obtained by sorting the probability values that each sample user under that leaf node belongs to the target crowd performance level; those probability values may also be calculated by the preset logistic regression model.
  • the probability limit value of the specific leaf node can be used to evaluate the probability that the user belongs to the performance level of the target population.
  • if the probability value that the user belongs to the target crowd performance level is smaller than the probability limit value of the specific leaf node, the user is unlikely to belong to the target crowd performance level, so the classification result of the specific leaf node of the preset decision tree model can be adopted as the user's crowd performance level.
  • if the probability value that the user belongs to the target crowd performance level is greater than or equal to the probability limit value of the specific leaf node, the user is likely to belong to the target crowd performance level, so the target crowd performance level can be determined as the user's crowd performance level. This avoids the low recall rate of the target population caused by its relatively low proportion.
  • compared with the conventional approach of identifying the crowd performance level by a decision tree model alone, the method provided by this embodiment of the present application can input the crowd performance characteristics corresponding to the user to be identified into a preset decision tree model for identification, to obtain the information entropy of the specific leaf node to which the user belongs.
  • if that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user can be input into a preset logistic regression model for calculation, to obtain the probability value that the user belongs to the target crowd performance level; the user's crowd performance level can then be determined according to that probability value and the probability limit value of the specific leaf node.
  • the preset decision tree model and the preset logistic regression model are thus combined to identify the crowd performance level: when the information entropy of the specific leaf node obtained by the preset decision tree model is greater than or equal to the preset threshold, the user is further identified based on the probability value calculated by the preset logistic regression model and the probability limit value of the specific leaf node, which can reduce recognition errors at leaf nodes whose information entropy is greater than or equal to the preset threshold.
  • this embodiment of the present application provides another method for identifying the crowd performance level. As shown in FIG. 2, the method includes:
  • before step 201, the method may further include: obtaining the sample crowd performance characteristics and sample crowd performance levels corresponding to multiple sample users; and establishing the preset decision tree model according to the sample crowd performance characteristics and the sample crowd performance levels.
  • the sample population performance characteristics and the sample population performance level can be trained by a decision tree algorithm to obtain the preset decision tree model.
  • the preset decision tree model stores correspondence rules between crowd performance characteristics and crowd performance levels.
  • the decision tree algorithm is a method for approximating the value of discrete functions and a typical classification method. It first processes the data, uses an inductive algorithm to generate readable rules and a decision tree, and then uses the decision tree to analyze new data.
  • a decision tree is essentially a process of classifying data through a series of rules.
  • the method may further include: selecting, from the preset decision tree model, each leaf node whose sample-user information entropy is greater than or equal to a preset threshold, and selecting each sample user under each of those leaf nodes; and establishing the preset logistic regression model according to the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users.
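  • the selection step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the application's implementation: the mapping `samples_by_leaf`, the `(features, level)` tuple layout, the function names, and the toy data are all hypothetical, and the entropy uses a base-2 logarithm.

```python
import math
from collections import Counter


def leaf_entropy(levels):
    """Base-2 Shannon entropy of the performance levels under one leaf node."""
    total = len(levels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(levels).values())


def select_uncertain_samples(samples_by_leaf, threshold):
    """Return the leaf ids whose entropy >= threshold and every sample under those leaves."""
    selected_leaves = []
    selected_samples = []
    for leaf_id, samples in samples_by_leaf.items():
        levels = [level for _features, level in samples]
        if levels and leaf_entropy(levels) >= threshold:
            selected_leaves.append(leaf_id)
            selected_samples.extend(samples)
    return selected_leaves, selected_samples


# Hypothetical toy data: each sample is (features, performance level).
samples_by_leaf = {
    "leaf_1": [((50, 10), "high")] * 4 + [((30, 5), "general")] * 6,  # entropy about 0.971
    "leaf_2": [((60, 12), "high")] * 1 + [((20, 2), "general")] * 9,  # entropy about 0.469
}
leaves, training_samples = select_uncertain_samples(samples_by_leaf, threshold=0.88)
print(leaves, len(training_samples))  # ['leaf_1'] 10
```

  • only the samples returned in `training_samples` would then be passed to the logistic regression training step, so that model concentrates on the leaves where the tree is least certain.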
  • the logistic regression algorithm can be used to train the sample population performance characteristics and sample crowd performance levels corresponding to each sample user to obtain the preset logistic regression model.
  • the preset logistic regression model can output the probability value of the user belonging to the performance level of the target population.
  • the logistic regression algorithm, also called the logistic regression analysis algorithm, is one of the classification and prediction algorithms.
  • the probability of future outcomes can be predicted by the performance of historical data.
  • when the logistic regression algorithm is used to train on the sample crowd performance characteristics and sample crowd performance levels corresponding to each sample user, the sample crowd performance characteristics may be used as the independent variables and the sample crowd performance levels as the dependent variable, yielding the preset logistic regression model.
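  • as an illustration of how a trained logistic regression model turns the independent variables into a probability, the following Python sketch applies the standard logistic (sigmoid) function to a weighted sum of numeric features. Training is not shown; the function name `target_probability`, the two-feature layout, and the coefficient values are hypothetical and do not come from the application.

```python
import math


def target_probability(features, weights, bias):
    """Logistic probability that a user belongs to the target performance level."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Numerically stable sigmoid: avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients for (monthly learning courses, monthly Internet trading products).
weights = [0.04, 0.1]
bias = -3.0
print(target_probability([50, 10], weights, bias))
```

  • larger weighted sums push the output toward 1 and smaller ones toward 0, which is what lets the result be compared against a per-leaf probability limit value.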
  • the embodiment of the present application further supports determining the probability limit value of each leaf node according to the preset logistic regression model, including: obtaining the probability values, calculated by the preset logistic regression model, that each sample user belongs to the target crowd performance level; sorting those probability values; and determining the probability limit value of each leaf node according to the sorting result.
  • specifically, the probability values that the sample users belong to the target crowd performance level may be sorted by magnitude; the number of sample users whose sample crowd performance level is the target crowd performance level is determined; and the probability value whose ranked position equals that number is determined as the probability limit value of the leaf node.
  • the probability limit value of the leaf node can also be adjusted according to the actual needs of the user. For example, the probability value at any sorted position can be determined as the probability limit value of the leaf node.
  • for example, among 100 sample users under leaf node 1, the sample crowd performance level of 60 sample users is the general performance level and that of 40 sample users is the high performance level.
  • the preset logistic regression model can calculate the probability values that the 100 sample users belong to the high performance level. The 100 probability values are sorted, and the probability value at ranking position 40 is determined as the probability limit value of leaf node 1.
  • according to user needs, the probability value at ranking position 35 can instead be selected as the probability limit value of leaf node 1.
  • the probability limit values of other leaf nodes whose information entropy is higher than 0.88 may be determined in a similar manner, which is not repeated here.
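  • a short Python sketch of the ranking step in this example. It assumes the probability values are sorted in descending order (the text does not state the direction explicitly) and that ranking positions are 1-based; the function name `probability_limit` and the evenly spaced probabilities are hypothetical.

```python
def probability_limit(probabilities, levels, target_level="high", position=None):
    """Probability value at a ranked position; defaults to the count of target-level samples."""
    if position is None:
        position = sum(1 for level in levels if level == target_level)
    if not 1 <= position <= len(probabilities):
        raise ValueError("ranking position is outside the list of probability values")
    ranked = sorted(probabilities, reverse=True)  # assumption: highest probability first
    return ranked[position - 1]


# Hypothetical leaf node 1: 100 sample users, 40 high-performance and 60 general-performance.
probabilities = [i / 100 for i in range(1, 101)]
levels = ["high"] * 40 + ["general"] * 60
print(probability_limit(probabilities, levels))               # 0.61 (position 40)
print(probability_limit(probabilities, levels, position=35))  # 0.66 (position 35)
```

  • choosing a smaller position, such as 35 instead of 40, raises the limit value and therefore makes the second-stage classification stricter.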
  • the method may further include: obtaining the crowd performance data corresponding to the user to be identified; and extracting the crowd performance characteristics corresponding to the user from the crowd performance data.
  • the crowd performance data may be uploaded manually or collected from an enterprise performance management system.
  • the crowd performance characteristics corresponding to the user may be extracted from the crowd performance data by way of feature keyword matching. Specifically, the keywords of the crowd performance characteristics can be matched with the crowd performance data, so as to extract the crowd performance characteristics corresponding to the user from the crowd performance data.
  • for example, if the crowd performance characteristic keyword is "average learning courses per month", the average number of monthly learning courses corresponding to the user may be extracted from the crowd performance data, for example 80. If the keyword is "working address latitude and longitude", the latitude and longitude of the user's work address may be extracted, for example (123.436, 41.819). If the keyword is "Internet trading products in a single month", the number of Internet trading products corresponding to the user in a single month may be extracted, for example 20.
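  • the keyword matching described above might look like the following Python sketch. The application does not specify the layout of the crowd performance data, so the dictionary record, its extra fields, the case-insensitive substring match, and the function name `extract_features` are all assumptions made for illustration.

```python
FEATURE_KEYWORDS = (
    "average learning courses per month",
    "working address latitude and longitude",
    "Internet trading products in a single month",
)


def extract_features(performance_data, keywords=FEATURE_KEYWORDS):
    """Pick out the fields whose names contain a feature keyword (case-insensitive)."""
    features = {}
    for keyword in keywords:
        for field, value in performance_data.items():
            if keyword.lower() in field.lower():
                features[keyword] = value
                break  # keep the first matching field for this keyword
    return features


# Hypothetical crowd performance record for one user.
record = {
    "user id": "u-001",
    "Average learning courses per month": 80,
    "Working address latitude and longitude": (123.436, 41.819),
    "Internet trading products in a single month": 20,
    "department": "sales",
}
features = extract_features(record)
print(features["average learning courses per month"])  # 80
```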
  • Step 202: detect whether the information entropy of the specific leaf node to which the user belongs is greater than or equal to a preset threshold. If yes, go to step 203; if no, go to step 206.
  • the preset threshold may be set according to user requirements; for example, it may be 0.88 or 0.89. It should be noted that, because of imbalanced data distribution, the leaf nodes of the preset decision tree model may classify incorrectly. For example, suppose there are 1000 sample users with corresponding sample crowd performance characteristics and sample crowd performance levels, of which 80% are at the ordinary performance level and 20% are at the high performance level. The preset decision tree model may then classify only 80% of the sample users correctly. The embodiment of the present application therefore uses the information entropy of a leaf node to determine whether that leaf node may classify incorrectly.
  • if the information entropy of the specific leaf node to which the user belongs is greater than or equal to the preset threshold, step 203 may be performed to further identify the user, thereby avoiding the low recall rate caused by the relatively low proportion of the target crowd in the training set.
  • if the information entropy is less than the preset threshold, step 206 may be performed to determine the user's crowd performance level from the crowd performance level corresponding to the specific leaf node.
  • Step 204: detect whether the probability value that the user belongs to the target crowd performance level is greater than or equal to the probability limit value of the specific leaf node. If yes, go to step 205; if no, go to step 206.
  • for the manner of determining the probability limit value of the specific leaf node, refer to the corresponding description in step 201; details are not repeated here.
  • for example, if the preset decision tree model determines that the specific leaf node to which the user belongs is leaf node 1 and the information entropy of leaf node 1 is 0.90, the information entropy is greater than the preset threshold of 0.88, which indicates that the preset decision tree model may not have classified the user correctly.
  • in this case, the probability value that the user belongs to the high performance level may be calculated by the preset logistic regression model.
  • if that probability value is calculated to be 0.85 and the probability limit value of leaf node 1, determined through the preset logistic regression model, is 0.81, the probability value is greater than the probability limit value of leaf node 1, which indicates that the user is more likely to belong to the high-performance crowd.
  • the high performance level can then be determined as the user's crowd performance level; that is, the user is determined to belong to the high-performance crowd.
  • if the preset decision tree model determines that the specific leaf node to which the user belongs is leaf node 1 and the information entropy of leaf node 1 is 0.5, the information entropy is less than the preset threshold of 0.88, which indicates that leaf node 1 classifies the user correctly. The crowd performance level corresponding to leaf node 1 can then be determined as the user's crowd performance level: if leaf node 1 corresponds to the ordinary performance level, the user is determined to belong to the ordinary-performance crowd; if leaf node 1 corresponds to the high performance level, the user is determined to belong to the high-performance crowd.
  • if the preset decision tree model determines that the specific leaf node to which the user belongs is leaf node 1 and the information entropy of leaf node 1 is 0.92, the information entropy is greater than the preset threshold of 0.88, which indicates that the preset decision tree model may have classified the user incorrectly. In this case, the preset logistic regression model is used to calculate the probability value that the user belongs to the high performance level.
  • if that probability value is calculated to be 0.34 and the probability limit value of leaf node 1, determined through the preset logistic regression model, is 0.81, the probability value is less than the probability limit value of leaf node 1, so the user is less likely to belong to the high-performance crowd.
  • the crowd performance level corresponding to leaf node 1 may then be determined as the user's crowd performance level: if leaf node 1 corresponds to the ordinary performance level, the user is determined to belong to the ordinary-performance crowd; if leaf node 1 corresponds to the high performance level, the user is determined to belong to the high-performance crowd.
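  • the three worked examples above can be traced with the following Python sketch of the flow through steps 202 to 206. The step numbers follow the text; the function name `identify_level`, its parameters, and the use of a callable for the logistic regression output (so the second model is evaluated only when needed) are assumptions for illustration, and leaf node 1 is assumed to correspond to the ordinary performance level.

```python
from typing import Callable


def identify_level(
    leaf_entropy: float,
    leaf_level: str,
    probability_limit: float,
    target_probability: Callable[[], float],
    entropy_threshold: float = 0.88,
    target_level: str = "high",
) -> str:
    """Combine the decision tree result with the logistic regression probability."""
    # Step 202: a leaf below the entropy threshold is trusted as-is (step 206).
    if leaf_entropy < entropy_threshold:
        return leaf_level
    # Step 203: only now ask the logistic regression model for a probability.
    probability = target_probability()
    # Step 204: compare against the leaf's probability limit value.
    if probability >= probability_limit:
        return target_level  # step 205
    return leaf_level  # step 206


# The three examples from the text, with leaf node 1 assumed to be an "ordinary" leaf.
print(identify_level(0.90, "ordinary", 0.81, lambda: 0.85))  # high
print(identify_level(0.50, "ordinary", 0.81, lambda: 0.85))  # ordinary
print(identify_level(0.92, "ordinary", 0.81, lambda: 0.34))  # ordinary
```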
  • compared with identifying the crowd performance level by a decision tree model alone, the embodiment of the present application can input the crowd performance characteristics corresponding to the user to be identified into a preset decision tree model for identification, to obtain the information entropy of the specific leaf node to which the user belongs.
  • if that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user can be input into a preset logistic regression model for calculation, to obtain the probability value that the user belongs to the target crowd performance level; the user's crowd performance level can then be determined according to that probability value and the probability limit value of the specific leaf node.
  • the preset decision tree model and the preset logistic regression model are thus combined to identify the crowd performance level: when the information entropy of the specific leaf node obtained by the preset decision tree model is greater than or equal to the preset threshold, the user is further identified based on the probability value calculated by the preset logistic regression model and the probability limit value of the specific leaf node, which can reduce recognition errors at leaf nodes whose information entropy is greater than or equal to the preset threshold.
  • an embodiment of the present application provides a device for identifying a crowd performance level.
  • the device includes an identification unit 31, a calculation unit 32, and a determining unit 33.
  • the identification unit 31 may be configured to input a performance characteristic of a crowd corresponding to a user to be identified into a preset decision tree model for identification, and obtain information entropy of the user belonging to a specific leaf node.
  • the identification unit 31 is a main functional module in this device that inputs the performance characteristics of the crowd corresponding to the user to be identified into a preset decision tree model to obtain the information entropy that the user belongs to a specific leaf node.
  • the calculation unit 32 may be configured to: if the information entropy of the specific leaf node to which the user belongs is greater than or equal to a preset threshold, input the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, to obtain the probability value that the user belongs to the target crowd performance level.
  • the calculation unit 32 is the main functional module, and also a core module, of the device that performs this calculation when the information entropy of the specific leaf node to which the user belongs is greater than or equal to the preset threshold.
  • the determining unit 33 may be configured to determine a crowd performance level of the user according to a probability value of the user belonging to the target population performance level and a probability limit value of the specific leaf node.
  • the determining unit 33 is a main function module and a core module for determining the user's crowd performance level according to the probability value of the user's belonging to the target population performance level and the probability limit value of the specific leaf node in the device.
  • in order to establish the preset decision tree model and the preset logistic regression model, the device further includes an obtaining unit 34, an establishing unit 35, and a selecting unit 36, as shown in FIG. 4.
  • The obtaining unit 34 may be configured to obtain the sample crowd performance characteristics and sample crowd performance levels corresponding to a plurality of sample users.
  • The obtaining unit 34 is the main functional module of the device for obtaining the sample crowd performance characteristics and sample crowd performance levels corresponding to a plurality of sample users.
  • The establishing unit 35 may be configured to establish the preset decision tree model according to the sample crowd performance characteristics and the sample crowd performance levels.
  • The establishing unit 35 is the main functional module of the device for establishing the preset decision tree model according to the sample crowd performance characteristics and the sample crowd performance levels.
  • The selecting unit 36 may be configured to select, from the preset decision tree model, each leaf node for which the information entropy of the sample users belonging to it is greater than or equal to a preset threshold, and to select each sample user under each of those leaf nodes.
  • The selecting unit 36 is the main functional module of the device for selecting, from the preset decision tree model, each leaf node for which the information entropy of the sample users belonging to it is greater than or equal to a preset threshold, and for selecting each sample user under each of those leaf nodes.
  • The establishing unit 35 may also be configured to establish the preset logistic regression model according to the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users.
  • The establishing unit 35 is the main functional module of the device for establishing the preset logistic regression model according to the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users.
  • The determining unit 33 may be further configured to determine a probability limit value for each leaf node according to the preset logistic regression model.
  • The determining unit 33 may include an obtaining module, a sorting module, and a determining module.
  • The obtaining module may be configured to obtain the probability value, calculated by the preset logistic regression model, of each sample user belonging to the target crowd performance level.
  • The sorting module may be configured to sort the probability values of the sample users belonging to the target crowd performance level.
  • The determining module may be configured to determine the probability limit value of each leaf node according to the sorting result of the probability values.
  • The sorting module may be specifically configured to sort the probability values of the sample users belonging to the target crowd performance level in order of their magnitude. The determining module may be specifically configured to determine the number of sample users whose sample crowd performance level is the target crowd performance level, and to determine the probability value whose rank position equals that number as the probability limit value of each leaf node.
  • The determining unit 33 may be specifically configured to: if the probability value of the user belonging to the target crowd performance level is greater than or equal to the probability limit value of the specific leaf node, determine the target crowd performance level as the user's crowd performance level; if the probability value of the user belonging to the target crowd performance level is less than the probability limit value of the specific leaf node, determine the crowd performance level corresponding to the specific leaf node as the user's crowd performance level.
  • The determining unit 33 may also be specifically configured to determine the crowd performance level corresponding to the specific leaf node as the user's crowd performance level if the information entropy of the user belonging to the specific leaf node is less than the preset threshold.
  • In order to obtain the crowd performance characteristics corresponding to the user to be identified, the apparatus may further include an extraction unit 37.
  • The obtaining unit 34 may be further configured to obtain the crowd performance data corresponding to the user to be identified.
  • The obtaining unit 34 is also the main functional module of the device for obtaining the crowd performance data corresponding to the user to be identified.
  • The extraction unit 37 may be configured to extract the crowd performance characteristics corresponding to the user from the crowd performance data.
  • The extraction unit 37 is the main functional module of the device for extracting the crowd performance characteristics corresponding to the user from the crowd performance data.
  • An embodiment of the present application further provides a storage medium, which may specifically be a non-volatile computer-readable storage medium storing computer-readable instructions. When executed by a processor, the computer-readable instructions implement the following steps: inputting the crowd performance characteristics corresponding to the user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node; if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level; and determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
  • An embodiment of the present application further provides a physical structure diagram of a computer device.
  • The computer device includes a processor 41, a memory 42, and computer-readable instructions stored in the memory 42 and executable on the processor, wherein the memory 42 and the processor 41 are both disposed on a bus 43. When the processor 41 executes the computer-readable instructions, the following steps are implemented: inputting the crowd performance characteristics corresponding to the user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node; if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level; and determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
  • With the technical solution of the present application, the crowd performance characteristics corresponding to the user to be identified can be input into a preset decision tree model for identification, and the information entropy of the user belonging to a specific leaf node can be obtained. When that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user can be input into a preset logistic regression model for calculation to obtain the probability value of the user belonging to the target crowd performance level, and the user's crowd performance level can be determined according to that probability value and the probability limit value of the specific leaf node. The preset decision tree model and the preset logistic regression model are thereby combined to identify crowd performance levels: when the information entropy of the specific leaf node obtained from the preset decision tree model is greater than or equal to the preset threshold, the user is further identified using the probability value calculated by the preset logistic regression model and the probability limit value of the specific leaf node. This reduces identification errors at leaf nodes whose information entropy is greater than or equal to the preset threshold and, while maintaining identification precision, raises the recall rate of a target crowd that makes up a relatively small share.
  • The modules or steps of the present application may be implemented by a general-purpose computing device, and they may be concentrated on a single computing device or distributed across a network composed of multiple computing devices. Optionally, they can be implemented with computer-readable instruction code executable by the computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from the one given here, or the modules or steps may each be made into an individual integrated circuit module, or several of them may be made into a single integrated circuit module. As such, this application is not limited to any particular combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

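A method, apparatus, storage medium, and computer device for identifying crowd performance levels, relating to the field of artificial intelligence technology. The main purpose is to raise the recall rate of a target crowd that makes up a relatively small share while maintaining identification precision. The method includes: inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node (101); if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level (102); and determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node (103).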

Description

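Method, apparatus, storage medium, and computer device for identifying crowd performance levels
This application claims priority to Chinese patent application No. 2018108652995, filed with the China Patent Office on August 1, 2018 and titled "Method, apparatus, storage medium, and computer device for identifying crowd performance levels", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a method, apparatus, storage medium, and computer device for identifying crowd performance levels.
Background
In recent years, many industries have begun to pay attention to crowd performance, and in particular to identifying crowd performance levels. Identifying a target crowd and rewarding it can greatly improve the overall results of an enterprise.
At present, crowd performance levels are usually identified with a decision tree model alone; that is, the crowd performance level that the decision tree model identifies for a user is taken as the user's crowd performance level. However, because the volume of crowd data is large and the data are unevenly distributed across crowd performance levels, the decision tree model may not fit well, and the crowd with the smaller share is usually the target crowd. For example, far fewer data are usually available for the high performance level than for the ordinary performance level: the high-performance crowd typically accounts for about 20% and the ordinary-performance crowd for about 80%, and the high-performance crowd is usually the target crowd. If only a decision tree model is used, it will misidentify some users, so that few members of the low-share target crowd are identified and the recall rate of that target crowd is low.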
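Summary
This application provides a method, apparatus, storage medium, and computer device for identifying crowd performance levels, whose main purpose is to reduce identification errors at leaf nodes whose information entropy is greater than or equal to a preset threshold and, while maintaining identification precision, to raise the recall rate of a target crowd that makes up a relatively small share.
According to a first aspect of this application, a method for identifying crowd performance levels is provided, including:
inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node;
if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level;
determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
According to a second aspect of this application, an apparatus for identifying crowd performance levels is provided, including:
an identification unit, configured to input the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtain the information entropy of the user belonging to a specific leaf node;
a calculation unit, configured to, if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, input the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtain the probability value of the user belonging to the target crowd performance level;
a determining unit, configured to determine the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
According to a third aspect of this application, a non-volatile readable storage medium is provided, storing computer-readable instructions that, when executed by a processor, implement the following steps:
inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node;
if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level;
determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
According to a fourth aspect of this application, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer-readable instructions:
inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node;
if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level;
determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
Compared with the current practice of identifying crowd performance levels with a decision tree model alone, the method, apparatus, storage medium, and computer device provided by this application can input the crowd performance characteristics corresponding to the user to be identified into a preset decision tree model for identification and obtain the information entropy of the user belonging to a specific leaf node. When that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user can be input into a preset logistic regression model for calculation to obtain the probability value of the user belonging to the target crowd performance level, and the crowd performance level of the user can be determined according to that probability value and the probability limit value of the specific leaf node. The preset decision tree model and the preset logistic regression model are thereby combined to identify crowd performance levels: when the information entropy of the specific leaf node obtained from the preset decision tree model is greater than or equal to the preset threshold, the user is further identified using the probability value calculated by the preset logistic regression model and the probability limit value of the specific leaf node. This reduces identification errors at leaf nodes whose information entropy is greater than or equal to the preset threshold and, while maintaining identification precision, raises the recall rate of a target crowd that makes up a relatively small share.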
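Brief Description of the Drawings
The drawings described here provide a further understanding of this application and form part of it. The illustrative embodiments of this application and their descriptions are used to explain this application and do not unduly limit it. In the drawings:
FIG. 1 shows a flowchart of a method for identifying crowd performance levels provided by an embodiment of this application;
FIG. 2 shows a flowchart of another method for identifying crowd performance levels provided by an embodiment of this application;
FIG. 3 shows a schematic structural diagram of an apparatus for identifying crowd performance levels provided by an embodiment of this application;
FIG. 4 shows a schematic structural diagram of another apparatus for identifying crowd performance levels provided by an embodiment of this application;
FIG. 5 shows a schematic diagram of the physical structure of a computer device provided by an embodiment of this application.
Detailed Description
This application is described in detail below with reference to the drawings and in combination with the embodiments. Note that, provided they do not conflict, the embodiments in this application and the features in the embodiments may be combined with each other.
As described in the Background, crowd performance levels are currently identified with a decision tree model alone; that is, the crowd performance level that the decision tree model identifies for a user is taken as the user's crowd performance level. However, because the volume of crowd data is large and the data are unevenly distributed across crowd performance levels, the decision tree model may not fit well, and the crowd with the smaller share is usually the target crowd. If only a decision tree model is used, it will misidentify some users, so that few members of the low-share target crowd are identified and the recall rate of that target crowd is low.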
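To solve the above problem, an embodiment of this application provides a method for identifying crowd performance levels. As shown in FIG. 1, the method includes:
101. Input the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtain the information entropy of the user belonging to a specific leaf node.
The crowd performance characteristics may include, but are not limited to, the average number of courses studied per month, the longitude and latitude of the work address, and the number of products traded over the internet in a single month. The crowd performance levels may include a high performance level and an ordinary performance level. The preset decision tree model may be built from sample crowd performance characteristics and sample crowd performance levels, and may store the correspondence rules between crowd performance characteristics and crowd performance levels. The specific leaf node to which the user belongs may be the leaf node into which the preset decision tree model classifies the user, obtained by classifying on the input crowd performance characteristics; that is, each crowd performance characteristic can serve as a node for identifying the user's category. After the crowd performance characteristics are input, the preset decision tree model classifies the user according to them and finally arrives at the specific leaf node to which the user belongs.
For example, suppose the average number of courses studied per month is 50, the work address is at (123.435, 41.819), and the number of products traded over the internet in a single month is 10. The preset decision tree model first classifies the user by the monthly course count of 50, then by the work address coordinates (123.435, 41.819), and then by the monthly product count of 10, finally placing the user in leaf node 1. Leaf node 1 is therefore the specific leaf node to which the user belongs.
Note that information entropy can express the uncertainty of a classification: the larger the information entropy, the greater the uncertainty of the classification and the more likely a classification error. In the embodiments of this application, the information entropy can be calculated with the following formula: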
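$$H(D) = -\sum_{i=1}^{c} p_i \log_2 p_i$$
Here, D denotes the sample set formed by combining the sample crowd performance characteristics with the sample crowd performance levels, c denotes the number of classes of sample crowd performance level, and p_i denotes the probability or proportion of sample users classified into sample crowd performance level class i.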
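The entropy of a single leaf can be illustrated with a minimal sketch. It assumes that the labels of the samples falling into the leaf are available as a plain list, and that the logarithm is base 2 (consistent with thresholds such as 0.88 for a two-class problem); the function name `leaf_entropy` is hypothetical and not taken from the application.

```python
import math
from collections import Counter


def leaf_entropy(labels):
    """Shannon entropy (base 2) of the class labels of the samples in one leaf."""
    total = len(labels)
    if total == 0:
        raise ValueError("a leaf must contain at least one sample")
    counts = Counter(labels)
    # sum of p * log2(1 / p) is the same quantity as -sum of p * log2(p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A leaf holding 40 high-performance and 60 ordinary-performance samples
print(round(leaf_entropy(["high"] * 40 + ["ordinary"] * 60), 3))
```

A pure leaf yields an entropy of 0, and an evenly split two-class leaf yields 1, so a threshold near 0.88 singles out leaves whose class mix is close to balanced.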
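102. If the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, input the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtain the probability value of the user belonging to the target crowd performance level.
The preset threshold can be set according to user requirements, for example 0.88 or 0.89. If the information entropy of the user belonging to the specific leaf node is greater than or equal to the preset threshold, the specific leaf node of the preset decision tree model may have misclassified the user; inputting the user's crowd performance characteristics into the preset logistic regression model makes it possible to classify the user a second time. The target crowd performance level may be the crowd performance level with the smaller share in the preset decision tree model; specifically, in the embodiments of this application, the target crowd performance level may be the high performance level. The preset logistic regression model can therefore output the probability value of the user belonging to the high performance level.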
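The probability output of a logistic regression model can be sketched as follows. The application does not give the model's coefficients or feature encoding, so the `weights`, `bias`, and numeric feature vector here are placeholders chosen purely for illustration; only the logistic (sigmoid) form itself is standard.

```python
import math


def target_probability(features, weights, bias):
    """Logistic-regression probability that a user belongs to the target level."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Numerically stable sigmoid: avoid exp() overflow for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients for (monthly courses, products traded per month)
p = target_probability([50, 10], weights=[0.04, 0.1], bias=-2.5)
print(round(p, 3))
```

In practice the coefficients would come from fitting the model to the sample users under the high-entropy leaves, and features on very different scales would usually be standardized first.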
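103. Determine the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
The probability limit value of the specific leaf node may be obtained by sorting the probability values, of belonging to the target crowd performance level, of the sample users under that leaf node; those probability values may likewise be calculated by the preset logistic regression model. The probability limit value of the specific leaf node can be used to assess how likely the user is to belong to the target crowd performance level.
In the embodiments of this application, if the probability value of the user belonging to the target crowd performance level is less than the probability limit value of the specific leaf node, the user is unlikely to belong to the target crowd performance level, so the classification result of the specific leaf node of the preset decision tree model can be used to identify the user's crowd performance level. If the probability value is greater than or equal to the probability limit value of the specific leaf node, the user is likely to belong to the target crowd performance level, so the target crowd performance level can be determined as the user's crowd performance level. This avoids the low recall rate that would otherwise result from the small share of the target crowd performance level.
Compared with the current practice of identifying crowd performance levels with a decision tree model alone, the method provided by this embodiment can input the crowd performance characteristics corresponding to the user to be identified into a preset decision tree model for identification and obtain the information entropy of the user belonging to a specific leaf node. When that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user can be input into a preset logistic regression model for calculation to obtain the probability value of the user belonging to the target crowd performance level, and the crowd performance level of the user can be determined according to that probability value and the probability limit value of the specific leaf node. The preset decision tree model and the preset logistic regression model are thereby combined to identify crowd performance levels: when the information entropy of the specific leaf node obtained from the preset decision tree model is greater than or equal to the preset threshold, the user is further identified using the probability value calculated by the preset logistic regression model and the probability limit value of the specific leaf node. This reduces identification errors at leaf nodes whose information entropy is greater than or equal to the preset threshold and, while maintaining identification precision, raises the recall rate of a target crowd that makes up a relatively small share.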
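Further, to better explain the identification process described above, and as a refinement and extension of the above embodiment, an embodiment of this application provides another method for identifying crowd performance levels. As shown in FIG. 2, the method includes:
201. Input the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtain the information entropy of the user belonging to a specific leaf node.
In this embodiment, to build the preset decision tree model, the method may further include, before step 201: obtaining the sample crowd performance characteristics and sample crowd performance levels corresponding to a plurality of sample users; and building the preset decision tree model from the sample crowd performance characteristics and the sample crowd performance levels. Specifically, a decision tree algorithm can be trained on the sample crowd performance characteristics and sample crowd performance levels to obtain the preset decision tree model, which stores the correspondence rules between crowd performance characteristics and crowd performance levels. A decision tree algorithm can be regarded as a method for approximating a discrete-valued function. It is a typical classification method: it first processes the data, uses an inductive algorithm to generate readable rules and a decision tree, and then uses the resulting decisions to analyze new data. In essence, a decision tree classifies data through a series of rules.
In addition, to build the preset logistic regression model, the method may further include: selecting, from the preset decision tree model, each leaf node for which the information entropy of the sample users belonging to it is greater than or equal to the preset threshold, and selecting each sample user under each of those leaf nodes; and building the preset logistic regression model from the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users. Specifically, a logistic regression algorithm can be trained on the sample crowd performance characteristics and sample crowd performance levels of the selected sample users to obtain the preset logistic regression model, which can output the probability value of a user belonging to the target crowd performance level. Logistic regression, also called logistic regression analysis, is one of the classification and prediction algorithms; it predicts the probability of a future outcome from the behavior of historical data. In this embodiment, when the logistic regression algorithm is trained on the selected sample users, the sample crowd performance characteristics can be used as the independent variables and the sample crowd performance level as the dependent variable.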
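The selection of training samples for the logistic regression model can be sketched as below. The sketch assumes that the trained tree's leaves are already available as a mapping from a leaf identifier to the `(features, label)` pairs of the sample users that fell into that leaf; this data layout and the names `label_entropy` and `select_uncertain_samples` are illustrative assumptions, not part of the application.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Base-2 Shannon entropy of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def select_uncertain_samples(leaves, threshold=0.88):
    """Keep only the leaves whose label entropy reaches the threshold.

    `leaves` maps a leaf id to a list of (features, label) pairs.
    The returned samples are the ones used to fit the logistic regression.
    """
    selected = {}
    for leaf_id, samples in leaves.items():
        labels = [label for _, label in samples]
        if labels and label_entropy(labels) >= threshold:
            selected[leaf_id] = samples
    return selected


leaves = {
    1: [((50, 10), "high")] * 40 + [((12, 3), "ordinary")] * 60,   # mixed leaf
    2: [((5, 1), "ordinary")] * 95 + [((70, 15), "high")] * 5,     # nearly pure leaf
}
print(sorted(select_uncertain_samples(leaves)))
```

Only the mixed leaf passes the 0.88 threshold here, so only its 100 sample users would be handed to the logistic regression training step.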
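Note that this embodiment also supports determining the probability limit value of each leaf node according to the preset logistic regression model, which includes: obtaining the probability value, calculated by the preset logistic regression model, of each sample user belonging to the target crowd performance level; sorting those probability values; and determining the probability limit value of each leaf node according to the sorting result.
Specifically, the probability values of the sample users belonging to the target crowd performance level can be sorted in order of their magnitude, and the number of sample users whose sample crowd performance level is the target crowd performance level can be determined; the probability value whose rank position equals that number is then determined as the probability limit value of the leaf node. In addition, the probability limit value of a leaf node can be adjusted according to the user's actual needs; for example, the probability value at any rank position can be chosen as the probability limit value of the leaf node.
For example, suppose leaf node 1, whose information entropy is above 0.88, has 100 sample users under it, of which 60 have the ordinary performance level as their sample crowd performance level and 40 have the high performance level. The preset logistic regression model can calculate the probability value of each of these 100 sample users belonging to the high performance level. The 100 probability values are then sorted, and the probability value at rank position 40 is determined as the probability limit value of leaf node 1. Alternatively, according to user requirements, the probability value at rank position 35 can be chosen as the probability limit value of leaf node 1. The probability limit values of other leaf nodes whose information entropy is above 0.88 can be determined in a similar way, and the examples are not repeated here.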
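The ranking rule for a leaf's probability limit value can be sketched as follows. The sketch assumes the probabilities are sorted from high to low, which is one reading of "in order of magnitude", and that rank positions are counted from 1; the function name `probability_limit` and its `position` override are hypothetical.

```python
def probability_limit(probabilities, labels, target_level="high", position=None):
    """Return the probability found at the chosen rank of the descending ordering.

    By default the rank equals the number of samples in the leaf whose
    label is the target level; `position` overrides that rank.
    """
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    ranked = sorted(probabilities, reverse=True)
    if position is None:
        position = sum(1 for label in labels if label == target_level)
    if not 1 <= position <= len(ranked):
        raise ValueError("position must be between 1 and the number of samples")
    return ranked[position - 1]


probs = [0.9, 0.2, 0.7, 0.4, 0.6]
labels = ["high", "ordinary", "high", "ordinary", "ordinary"]
print(probability_limit(probs, labels))              # rank 2 of the ordering
print(probability_limit(probs, labels, position=3))  # manually chosen rank
```

With this rule, exactly as many sample users reach the limit as there are target-level samples in the leaf (ties aside), which is what ties the limit to the leaf's own class mix rather than to a fixed cutoff such as 0.5.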
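Further, to obtain the crowd performance characteristics corresponding to the user to be identified, the method may further include, before step 201: obtaining the crowd performance data corresponding to the user to be identified; and extracting the crowd performance characteristics corresponding to the user from the crowd performance data. The crowd performance data may be uploaded manually or collected from the enterprise's performance management and control system. In this embodiment, the crowd performance characteristics corresponding to the user can be extracted from the crowd performance data by matching feature keywords. Specifically, the crowd performance feature keywords are matched against the crowd performance data to extract the crowd performance characteristics corresponding to the user.
For example, if the feature keyword is "average monthly courses studied", the user's average number of courses studied per month can be extracted from the crowd performance data, for instance 80. If the feature keyword is "work address longitude and latitude", the user's work address coordinates can be extracted from the crowd performance data, for instance (123.436, 41.819). If the feature keyword is "internet products traded in a single month", the number of products the user traded over the internet in a single month can be extracted from the crowd performance data, for instance 20.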
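Keyword-based extraction might look like the following minimal sketch. The application does not specify the format of the crowd performance data, so this assumes one record per user stored as a dictionary of field name to value, with a keyword matching when it occurs inside a field name; the field names, the `FEATURE_KEYWORDS` table, and `extract_features` are all illustrative.

```python
# Hypothetical mapping from feature name to the keyword searched for
FEATURE_KEYWORDS = {
    "monthly_courses": "monthly courses studied",
    "work_location": "work address longitude and latitude",
    "monthly_products": "internet products traded",
}


def extract_features(record, keywords=FEATURE_KEYWORDS):
    """Pick out the fields of `record` whose names contain a feature keyword."""
    features = {}
    for feature_name, keyword in keywords.items():
        for field, value in record.items():
            if keyword in field:
                features[feature_name] = value
                break  # take the first matching field for this keyword
    return features


record = {
    "employee id": "E-001",
    "average monthly courses studied": 80,
    "work address longitude and latitude": (123.436, 41.819),
    "internet products traded in a single month": 20,
}
print(extract_features(record))
```

Fields that match no keyword, such as the employee id above, are simply ignored, and a keyword with no matching field leaves that feature absent from the result.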
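202. Check whether the information entropy of the user belonging to the specific leaf node is greater than or equal to the preset threshold. If so, perform step 203; if not, perform step 206.
The preset threshold can be set according to user requirements, for example 0.88 or 0.89. Note that, because of the unbalanced data distribution, the leaf-node classification of the preset decision tree model may contain errors. For example, given 1000 sample users with their sample crowd performance characteristics and sample crowd performance levels, of whom 80% are at the ordinary performance level and 20% are at the high performance level, the preset decision tree model may classify only 80% of the sample users correctly. This embodiment therefore uses the information entropy of a leaf node to determine whether that leaf node may contain classification errors.
Specifically, when a single user is classified, checking whether the information entropy of the user belonging to the specific leaf node is greater than or equal to the preset threshold reveals whether the specific leaf node may have misclassified the user. If the information entropy is greater than or equal to the preset threshold, the specific leaf node may have misclassified the user, so step 203 can be performed to identify the user further, avoiding the low recall rate caused by the small share of the target crowd in the training set. If the information entropy is less than the preset threshold, the classification result of the specific leaf node is taken to be correct, so step 206 can be performed and the user's crowd performance level is determined from the crowd performance level corresponding to the specific leaf node.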
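203. Input the crowd performance characteristics corresponding to the user into the preset logistic regression model for calculation, and obtain the probability value of the user belonging to the target crowd performance level.
204. Check whether the probability value of the user belonging to the target crowd performance level is greater than or equal to the probability limit value of the specific leaf node. If so, perform step 205; if not, perform step 206.
For how the probability limit value of the specific leaf node is determined, refer to the corresponding description and explanation under step 201; it is not repeated here.
205. Determine the target crowd performance level as the user's crowd performance level.
For example, suppose the preset decision tree model determines that the specific leaf node to which the user belongs is leaf node 1, and the information entropy of the user belonging to leaf node 1 is 0.90, which is greater than the preset threshold of 0.88. The classification of the user by the preset decision tree model may then be incorrect, so the preset logistic regression model can be used to calculate the probability value of the user belonging to the high performance level. If that probability value is 0.85 and the probability limit value of leaf node 1 determined through the preset logistic regression model is 0.81, the probability value exceeds the probability limit value of leaf node 1, so the user is likely to belong to the high-performance crowd. The high performance level can then be determined as the user's crowd performance level; that is, the user is determined to be in the high-performance crowd.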
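206. Determine the crowd performance level corresponding to the specific leaf node as the user's crowd performance level.
For example, suppose the preset decision tree model determines that the specific leaf node to which the user belongs is leaf node 1, and the information entropy of the user belonging to leaf node 1 is 0.5, which is less than the preset threshold of 0.88. The classification of the user by leaf node 1 is then taken to be correct. If leaf node 1 corresponds to the ordinary performance level, the ordinary performance level is determined as the user's crowd performance level, and the user is determined to be in the ordinary-performance crowd. Alternatively, if leaf node 1 corresponds to the high performance level, the high performance level is determined as the user's crowd performance level, and the user is determined to be in the high-performance crowd.
As another example, suppose the preset decision tree model determines that the specific leaf node to which the user belongs is leaf node 1, and the information entropy of the user belonging to leaf node 1 is 0.92, which is greater than the preset threshold of 0.88. The classification of the user by the preset decision tree model may then be incorrect, so the preset logistic regression model is used to calculate the probability value of the user belonging to the high performance level. If that probability value is 0.34 and the probability limit value of leaf node 1 determined through the preset logistic regression model is 0.81, the probability value is below the probability limit value of leaf node 1, so the user is unlikely to belong to the high-performance crowd. If leaf node 1 corresponds to the ordinary performance level, the ordinary performance level is then determined as the user's crowd performance level, and the user is determined to be in the ordinary-performance crowd. Alternatively, if leaf node 1 corresponds to the high performance level, the high performance level is determined as the user's crowd performance level, and the user is determined to be in the high-performance crowd.
As the examples for steps 205 and 206 show, combining the preset decision tree model with the preset logistic regression model to identify crowd performance levels reduces identification errors at leaf nodes whose information entropy is greater than or equal to the preset threshold and identifies more members of the low-share target crowd.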
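The routing in steps 202 to 206 can be condensed into a small sketch. It takes the leaf's entropy, the leaf's own level, and the leaf's probability limit value as already known, and receives the logistic regression as a callable so that it is evaluated only when step 203 is reached; the function name `identify_level` and its parameters are hypothetical, while the numbers in the usage lines come from the worked examples above.

```python
def identify_level(leaf_entropy, leaf_level, probability_of_target, limit,
                   entropy_threshold=0.88, target_level="high"):
    """Route one user through steps 202-206 and return the final level.

    `probability_of_target` is a zero-argument callable wrapping the
    logistic regression model; it is called only for high-entropy leaves.
    """
    # Step 202: a low-entropy leaf is trusted as it is (step 206)
    if leaf_entropy < entropy_threshold:
        return leaf_level
    # Step 203: second-stage probability from the logistic regression
    probability = probability_of_target()
    # Steps 204 and 205: at or above the leaf's limit means the target level
    if probability >= limit:
        return target_level
    # Step 206: otherwise fall back to the leaf's own level
    return leaf_level


print(identify_level(0.90, "ordinary", lambda: 0.85, 0.81))  # entropy 0.90, p = 0.85
print(identify_level(0.50, "ordinary", lambda: 0.85, 0.81))  # entropy 0.50
print(identify_level(0.92, "ordinary", lambda: 0.34, 0.81))  # entropy 0.92, p = 0.34
```

Passing the model as a callable mirrors the flow in FIG. 2, where the logistic regression is consulted only after the entropy check succeeds.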
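Compared with the current practice of identifying crowd performance levels with a decision tree model alone, this other method provided by an embodiment of this application can input the crowd performance characteristics corresponding to the user to be identified into a preset decision tree model for identification and obtain the information entropy of the user belonging to a specific leaf node. When that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user can be input into a preset logistic regression model for calculation to obtain the probability value of the user belonging to the target crowd performance level, and the crowd performance level of the user can be determined according to that probability value and the probability limit value of the specific leaf node. The preset decision tree model and the preset logistic regression model are thereby combined to identify crowd performance levels: when the information entropy of the specific leaf node obtained from the preset decision tree model is greater than or equal to the preset threshold, the user is further identified using the probability value calculated by the preset logistic regression model and the probability limit value of the specific leaf node. This reduces identification errors at leaf nodes whose information entropy is greater than or equal to the preset threshold and, while maintaining identification precision, raises the recall rate of a target crowd that makes up a relatively small share.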
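Further, as a concrete implementation of FIG. 1, an embodiment of this application provides an apparatus for identifying crowd performance levels. As shown in FIG. 3, the apparatus includes an identification unit 31, a calculation unit 32, and a determining unit 33.
The identification unit 31 may be configured to input the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtain the information entropy of the user belonging to a specific leaf node. The identification unit 31 is the main functional module of the apparatus for performing this identification.
The calculation unit 32 may be configured to, if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, input the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtain the probability value of the user belonging to the target crowd performance level. The calculation unit 32 is the main functional module, and also a core module, of the apparatus for performing this calculation when the information entropy of the user belonging to the specific leaf node is greater than or equal to the preset threshold.
The determining unit 33 may be configured to determine the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node. The determining unit 33 is the main functional module, and also a core module, of the apparatus for making this determination.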
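In this embodiment, to build the preset decision tree model and the preset logistic regression model, the apparatus further includes an obtaining unit 34, an establishing unit 35, and a selecting unit 36, as shown in FIG. 4.
The obtaining unit 34 may be configured to obtain the sample crowd performance characteristics and sample crowd performance levels corresponding to a plurality of sample users. The obtaining unit 34 is the main functional module of the apparatus for obtaining these sample data.
The establishing unit 35 may be configured to build the preset decision tree model from the sample crowd performance characteristics and the sample crowd performance levels. The establishing unit 35 is the main functional module of the apparatus for building the preset decision tree model.
The selecting unit 36 may be configured to select, from the preset decision tree model, each leaf node for which the information entropy of the sample users belonging to it is greater than or equal to the preset threshold, and to select each sample user under each of those leaf nodes. The selecting unit 36 is the main functional module of the apparatus for making this selection.
The establishing unit 35 may also be configured to build the preset logistic regression model from the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users. The establishing unit 35 is the main functional module of the apparatus for building the preset logistic regression model.
The determining unit 33 may further be configured to determine the probability limit value of each leaf node according to the preset logistic regression model. In this embodiment, the determining unit 33 may include an obtaining module, a sorting module, and a determining module.
The obtaining module may be configured to obtain the probability value, calculated by the preset logistic regression model, of each sample user belonging to the target crowd performance level. The sorting module may be configured to sort the probability values of the sample users belonging to the target crowd performance level. The determining module may be configured to determine the probability limit value of each leaf node according to the sorting result. Specifically, the sorting module may sort the probability values in order of their magnitude, and the determining module may determine the number of sample users whose sample crowd performance level is the target crowd performance level and determine the probability value whose rank position equals that number as the probability limit value of each leaf node.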
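In this embodiment, to finally determine the user's crowd performance level, the determining unit 33 may be specifically configured to: if the probability value of the user belonging to the target crowd performance level is greater than or equal to the probability limit value of the specific leaf node, determine the target crowd performance level as the user's crowd performance level; and if the probability value is less than the probability limit value of the specific leaf node, determine the crowd performance level corresponding to the specific leaf node as the user's crowd performance level.
In addition, the determining unit 33 may be specifically configured to determine the crowd performance level corresponding to the specific leaf node as the user's crowd performance level if the information entropy of the user belonging to the specific leaf node is less than the preset threshold. In this embodiment, to obtain the crowd performance characteristics corresponding to the user to be identified, the apparatus may further include an extraction unit 37.
The obtaining unit 34 may further be configured to obtain the crowd performance data corresponding to the user to be identified. The obtaining unit 34 is also the main functional module of the apparatus for obtaining that data.
The extraction unit 37 may be configured to extract the crowd performance characteristics corresponding to the user from the crowd performance data. The extraction unit 37 is the main functional module of the apparatus for performing this extraction.
Note that, for other descriptions of the functional modules of the apparatus provided by this embodiment, refer to the corresponding description of the method shown in FIG. 1; they are not repeated here.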
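Based on the method shown in FIG. 1, an embodiment of this application correspondingly provides a storage medium, which may specifically be a non-volatile computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the following steps: inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node; if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level; and determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
Based on the method shown in FIG. 1 and the apparatus embodiment shown in FIG. 3, an embodiment of this application further provides a physical structure diagram of a computer device. As shown in FIG. 5, the computer device includes a processor 41, a memory 42, and computer-readable instructions stored in the memory 42 and executable on the processor, where the memory 42 and the processor 41 are both disposed on a bus 43. When executing the computer-readable instructions, the processor 41 implements the following steps: inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node; if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level; and determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node. The bus 43 is configured to couple the processor 41 and the memory 42.
With the technical solution of this application, the crowd performance characteristics corresponding to the user to be identified can be input into a preset decision tree model for identification, and the information entropy of the user belonging to a specific leaf node can be obtained. When that information entropy is greater than or equal to a preset threshold, the crowd performance characteristics corresponding to the user can be input into a preset logistic regression model for calculation to obtain the probability value of the user belonging to the target crowd performance level, and the crowd performance level of the user can be determined according to that probability value and the probability limit value of the specific leaf node. The preset decision tree model and the preset logistic regression model are thereby combined to identify crowd performance levels: when the information entropy of the specific leaf node obtained from the preset decision tree model is greater than or equal to the preset threshold, the user is further identified using the probability value calculated by the preset logistic regression model and the probability limit value of the specific leaf node. This reduces identification errors at leaf nodes whose information entropy is greater than or equal to the preset threshold and, while maintaining identification precision, raises the recall rate of a target crowd that makes up a relatively small share.
Those skilled in the art will understand that the modules or steps of this application described above can be implemented with general-purpose computing devices, and can be concentrated on a single computing device or distributed across a network of multiple computing devices. Optionally, they can be implemented with computer-readable instruction code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be performed in an order different from the one given here, or the modules or steps can each be made into an individual integrated circuit module, or several of them can be made into a single integrated circuit module. This application is therefore not limited to any particular combination of hardware and software.
The above are only preferred embodiments of this application and are not intended to limit it; for those skilled in the art, this application may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection.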

Claims (20)

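  1. A method for identifying crowd performance levels, characterized by comprising:
    inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node;
    if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level;
    determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
  2. The method according to claim 1, characterized in that, before the inputting of the crowd performance characteristics corresponding to the user to be identified into the preset decision tree model for identification, the method further comprises:
    obtaining the sample crowd performance characteristics and sample crowd performance levels corresponding to a plurality of sample users;
    building the preset decision tree model according to the sample crowd performance characteristics and the sample crowd performance levels;
    selecting, from the preset decision tree model, each leaf node for which the information entropy of the sample users belonging to it is greater than or equal to the preset threshold, and selecting each sample user under each of those leaf nodes;
    building the preset logistic regression model according to the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users, and determining the probability limit value of each leaf node according to the preset logistic regression model.
  3. The method according to claim 2, characterized in that the determining of the probability limit value of each leaf node according to the preset logistic regression model comprises:
    obtaining the probability value, calculated by the preset logistic regression model, of each sample user belonging to the target crowd performance level;
    sorting the probability values of the sample users belonging to the target crowd performance level;
    determining the probability limit value of each leaf node according to the sorting result of the probability values.
  4. The method according to claim 3, characterized in that the sorting of the probability values of the sample users belonging to the target crowd performance level comprises:
    sorting the probability values of the sample users belonging to the target crowd performance level in order of their magnitude;
    and the determining of the probability limit value of each leaf node according to the sorting result of the probability values comprises:
    determining the number of sample users whose sample crowd performance level is the target crowd performance level;
    determining the probability value whose rank position equals that number as the probability limit value of each leaf node.
  5. The method according to claim 1, characterized in that the determining of the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node comprises:
    if the probability value of the user belonging to the target crowd performance level is greater than or equal to the probability limit value of the specific leaf node, determining the target crowd performance level as the crowd performance level of the user;
    if the probability value of the user belonging to the target crowd performance level is less than the probability limit value of the specific leaf node, determining the crowd performance level corresponding to the specific leaf node as the crowd performance level of the user.
  6. The method according to claim 1, characterized in that, after the inputting of the crowd performance characteristics corresponding to the user to be identified into the preset decision tree model for identification and the obtaining of the information entropy of the user belonging to the specific leaf node, the method further comprises:
    if the information entropy of the user belonging to the specific leaf node is less than the preset threshold, determining the crowd performance level corresponding to the specific leaf node as the crowd performance level of the user.
  7. The method according to claim 1, characterized in that, before the inputting of the crowd performance characteristics corresponding to the user to be identified into the preset decision tree model for identification and the obtaining of the information entropy of the user belonging to the specific leaf node, the method further comprises:
    obtaining the crowd performance data corresponding to the user to be identified;
    extracting the crowd performance characteristics corresponding to the user from the crowd performance data.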
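  8. An apparatus for identifying crowd performance levels, characterized by comprising:
    an identification unit, configured to input the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtain the information entropy of the user belonging to a specific leaf node;
    a calculation unit, configured to, if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, input the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtain the probability value of the user belonging to the target crowd performance level;
    a determining unit, configured to determine the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
  9. The apparatus according to claim 8, characterized in that the apparatus further comprises an obtaining unit, an establishing unit, and a selecting unit;
    the obtaining unit is configured to obtain the sample crowd performance characteristics and sample crowd performance levels corresponding to a plurality of sample users;
    the establishing unit is configured to build the preset decision tree model according to the sample crowd performance characteristics and the sample crowd performance levels;
    the selecting unit is configured to select, from the preset decision tree model, each leaf node for which the information entropy of the sample users belonging to it is greater than or equal to the preset threshold, and to select each sample user under each of those leaf nodes;
    the establishing unit is further configured to build the preset logistic regression model according to the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users;
    the determining unit is further configured to determine the probability limit value of each leaf node according to the preset logistic regression model.
  10. The apparatus according to claim 9, characterized in that the determining unit comprises:
    an obtaining module, configured to obtain the probability value, calculated by the preset logistic regression model, of each sample user belonging to the target crowd performance level;
    a sorting module, configured to sort the probability values of the sample users belonging to the target crowd performance level;
    a determining module, configured to determine the probability limit value of each leaf node according to the sorting result of the probability values.
  11. The apparatus according to claim 10, characterized in that the sorting module is specifically configured to sort the probability values of the sample users belonging to the target crowd performance level in order of their magnitude;
    the determining module is specifically configured to determine the number of sample users whose sample crowd performance level is the target crowd performance level, and to determine the probability value whose rank position equals that number as the probability limit value of each leaf node.
  12. The apparatus according to claim 8, characterized in that the determining unit is specifically configured to: if the probability value of the user belonging to the target crowd performance level is greater than or equal to the probability limit value of the specific leaf node, determine the target crowd performance level as the crowd performance level of the user; and if the probability value of the user belonging to the target crowd performance level is less than the probability limit value of the specific leaf node, determine the crowd performance level corresponding to the specific leaf node as the crowd performance level of the user.
  13. The apparatus according to claim 8, characterized in that the determining unit 33 is further specifically configured to determine the crowd performance level corresponding to the specific leaf node as the crowd performance level of the user if the information entropy of the user belonging to the specific leaf node is less than the preset threshold.
  14. The apparatus according to claim 8, characterized in that the apparatus further comprises an extraction unit;
    the obtaining unit is further configured to obtain the crowd performance data corresponding to the user to be identified;
    the extraction unit is configured to extract the crowd performance characteristics corresponding to the user from the crowd performance data.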
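  15. A non-volatile readable storage medium storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by a processor, implement a method for identifying crowd performance levels, comprising:
    inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node; if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level; determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
  16. The non-volatile readable storage medium according to claim 15, characterized in that, when the computer-readable instructions are executed by the processor, before the inputting of the crowd performance characteristics corresponding to the user to be identified into the preset decision tree model for identification, the method further comprises:
    obtaining the sample crowd performance characteristics and sample crowd performance levels corresponding to a plurality of sample users; building the preset decision tree model according to the sample crowd performance characteristics and the sample crowd performance levels; selecting, from the preset decision tree model, each leaf node for which the information entropy of the sample users belonging to it is greater than or equal to the preset threshold, and selecting each sample user under each of those leaf nodes; building the preset logistic regression model according to the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users, and determining the probability limit value of each leaf node according to the preset logistic regression model.
  17. The non-volatile readable storage medium according to claim 15, characterized in that, when the computer-readable instructions are executed by the processor, the determining of the probability limit value of each leaf node according to the preset logistic regression model comprises:
    obtaining the probability value, calculated by the preset logistic regression model, of each sample user belonging to the target crowd performance level; sorting the probability values of the sample users belonging to the target crowd performance level; determining the probability limit value of each leaf node according to the sorting result of the probability values.
  18. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the computer-readable instructions, when executed by the processor, implement a method for identifying crowd performance levels, comprising:
    inputting the crowd performance characteristics corresponding to a user to be identified into a preset decision tree model for identification, and obtaining the information entropy of the user belonging to a specific leaf node; if the information entropy of the user belonging to the specific leaf node is greater than or equal to a preset threshold, inputting the crowd performance characteristics corresponding to the user into a preset logistic regression model for calculation, and obtaining the probability value of the user belonging to the target crowd performance level; determining the crowd performance level of the user according to the probability value of the user belonging to the target crowd performance level and the probability limit value of the specific leaf node.
  19. The computer device according to claim 18, characterized in that, when the computer-readable instructions are executed by the processor, before the inputting of the crowd performance characteristics corresponding to the user to be identified into the preset decision tree model for identification, the method further comprises:
    obtaining the sample crowd performance characteristics and sample crowd performance levels corresponding to a plurality of sample users; building the preset decision tree model according to the sample crowd performance characteristics and the sample crowd performance levels; selecting, from the preset decision tree model, each leaf node for which the information entropy of the sample users belonging to it is greater than or equal to the preset threshold, and selecting each sample user under each of those leaf nodes; building the preset logistic regression model according to the sample crowd performance characteristics and sample crowd performance levels corresponding to the selected sample users, and determining the probability limit value of each leaf node according to the preset logistic regression model.
  20. The computer device according to claim 18, characterized in that, when the computer-readable instructions are executed by the processor, the determining of the probability limit value of each leaf node according to the preset logistic regression model comprises:
    obtaining the probability value, calculated by the preset logistic regression model, of each sample user belonging to the target crowd performance level; sorting the probability values of the sample users belonging to the target crowd performance level; determining the probability limit value of each leaf node according to the sorting result of the probability values.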
PCT/CN2018/111118 2018-08-01 2018-10-21 Method, apparatus, storage medium, and computer device for identifying crowd performance levels WO2020024444A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810865299.5 2018-08-01
CN201810865299.5A CN109308564A (zh) 2018-08-01 2018-08-01 Method, apparatus, storage medium, and computer device for identifying crowd performance levels

Publications (1)

Publication Number Publication Date
WO2020024444A1 true WO2020024444A1 (zh) 2020-02-06

Family

ID=65225978

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/111118 WO2020024444A1 (zh) 2018-08-01 2018-10-21 Method, apparatus, storage medium, and computer device for identifying crowd performance levels

Country Status (2)

Country Link
CN (1) CN109308564A (zh)
WO (1) WO2020024444A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210707A (zh) * 2019-04-30 2019-09-06 跨越速运集团有限公司 Entropy-weight-method-based method and system for evaluating the automatic dispatch efficiency of branch outlets

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
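CN110929752B (zh) * 2019-10-18 2023-06-20 平安科技(深圳)有限公司 Knowledge-driven and data-driven grouping method and related device
CN111275121B (zh) * 2020-01-23 2023-07-18 北京康夫子健康技术有限公司 Medical image processing method, apparatus, and electronic device
CN114239732A (zh) * 2021-12-21 2022-03-25 深圳前海微众银行股份有限公司 Ordered classification label determination method, apparatus, electronic device, and storage medium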

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130103619A1 (en) * 2011-10-21 2013-04-25 International Business Machines Corporation Composite production rules
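CN105631707A (zh) * 2015-12-23 2016-06-01 北京奇虎科技有限公司 Decision-tree-based advertisement click-through rate estimation method, application recommendation method, and apparatus
CN106485421A (zh) * 2016-10-19 2017-03-08 江苏电力信息技术有限公司 Work-order-data-based employee performance evaluation system and method
CN108133306A (zh) * 2017-11-17 2018-06-08 上海哔哩哔哩科技有限公司 Performance appraisal method, server, and performance appraisal system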

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
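CN104503973A (zh) * 2014-11-14 2015-04-08 浙江大学软件学院(宁波)管理中心(宁波软件教育中心) Recommendation method based on singular value decomposition and classifier fusion
CN105447514B (zh) * 2015-11-18 2018-11-23 北京航空航天大学 Information-entropy-based metal identification method
CN108073883A (zh) * 2016-11-11 2018-05-25 深圳云天励飞技术有限公司 Large-scale crowd attribute identification method and apparatus
CN107832581B (zh) * 2017-12-15 2022-02-18 百度在线网络技术(北京)有限公司 State prediction method and apparatus
CN108288130A (zh) * 2018-03-05 2018-07-17 湖北省第三人民医院 Clinical nursing performance evaluation system and method based on a data mining and analysis platform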

Also Published As

Publication number Publication date
CN109308564A (zh) 2019-02-05

Similar Documents

Publication Publication Date Title
Pamina et al. An effective classifier for predicting churn in telecommunication
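WO2020024444A1 (zh) Method, apparatus, storage medium, and computer device for identifying crowd performance levels
CN107766929B (zh) Model analysis method and apparatus
US10013636B2 (en) Image object category recognition method and device
WO2018014610A1 (zh) C4.5-decision-tree-algorithm-based specific user mining system and method
CN109492026B (zh) Telecom fraud classification and detection method based on improved active learning
CN103324745B (zh) Bayesian-model-based text spam identification method and system
CN107392241B (zh) Image object classification method based on weighted column-sampling XGBoost
CN113326377B (zh) Person name disambiguation method and system based on enterprise association relationships
CN110310114B (zh) Object classification method, apparatus, server, and storage medium
CN109492776B (zh) Active-learning-based microblog popularity prediction method
CN105306296B (zh) LTE-signaling-based data filtering and processing method
CN109918498B (zh) Method and apparatus for adding questions to a repository
JP2020053073A (ja) Learning method, learning system, and learning program
CN111159481B (zh) Edge prediction method, apparatus, and terminal device for graph data
CN114897085A (zh) Clustering method based on enclosed-subgraph link prediction, and computer device
CN114549897A (zh) Classification model training method, apparatus, and storage medium
CN104468276B (zh) Network traffic identification method based on random-sampling multiple classifiers
Gias et al. Samplehst: Efficient on-the-fly selection of distributed traces
WO2020024448A1 (zh) Method, apparatus, storage medium, and computer device for identifying crowd performance levels
CN112508363A (zh) Deep-learning-based state analysis method and apparatus for electric power information systems
CN116204647A (zh) Method for establishing a target contrastive learning model, and text clustering method and apparatus
CN112069392B (zh) Method, apparatus, computer device, and storage medium for preventing and controlling network-related crime
US11544332B2 (en) Bipartite graph construction
CN110570093B (zh) Method and apparatus for automatic management of business expansion channels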

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18928239

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18928239

Country of ref document: EP

Kind code of ref document: A1