CN110837847A - User classification method and device, storage medium and server - Google Patents

Publication number
CN110837847A
Authority
CN
China
Prior art keywords
user
feature vector
data
sample data
determining
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910967144.7A
Other languages
Chinese (zh)
Inventor
赵毅仁
胡宏辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Lake Information Technology Co Ltd
Original Assignee
Shanghai Lake Information Technology Co Ltd
Application filed by Shanghai Lake Information Technology Co Ltd
Priority to CN201910967144.7A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Technology Law (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A user classification method and device, a storage medium and a server are provided. The method comprises the following steps: determining a set of sample data, wherein each piece of sample data comprises a user feature vector and a self-conversion value associated with the user feature vector, and the self-conversion value indicates whether a user actively executes a preset operation; for each piece of sample data in the set, mapping the user feature vector of the sample data into an updated user feature vector, wherein the dimensionality of the updated user feature vector is greater than the dimensionality of the user feature vector; training a user conversion model by using the updated user feature vectors and their associated self-conversion values, wherein the user conversion model is used for calculating the probability that a user executes the preset operation; and determining, based on the user feature vector of data to be detected and the user conversion model, the category of the user to which the data to be detected belongs, wherein the category comprises an active conversion type and a passive conversion type. The technical scheme of the invention can improve the conversion rate of precipitation users in marketing scenarios.

Description

User classification method and device, storage medium and server
Technical Field
The invention relates to the technical field of big data processing, in particular to a user classification method and device, a storage medium and a server.
Background
Typically, after registering with a financial institution's platform, a user may go on to apply for a loan spontaneously. More users, however, may choose not to perform any subsequent operation and become dormant users, referred to herein as precipitation users. To improve the business conversion rate, many financial institutions employ specialists to carry out telemarketing, with the aim of improving the conversion rate of precipitation users.
In the prior art, a marketing model can be developed from historical data to improve efficiency. The purpose of the marketing model is to determine the probability that a user converts automatically. A common algorithm for constructing the marketing model is logistic regression: the automatic conversion probability of each user is calculated from the logistic regression model, and the marketing specialists then need to market only to users with a low automatic conversion probability, which greatly improves the conversion rate of precipitation users at this stage.
However, because the user features available at this stage are low-dimensional, complex algorithms such as neural networks perform poorly when tried during marketing model development, and existing marketing model algorithms leave considerable room for improvement.
Disclosure of Invention
The technical problem solved by the invention is how to improve the accuracy of the user classification model so as to classify the user more accurately.
To solve the foregoing technical problem, an embodiment of the present invention provides a user classification method, including: determining a set of sample data, wherein each piece of sample data comprises a user feature vector and a self-conversion value associated with the user feature vector, and the self-conversion value indicates whether a user actively executes a preset operation; for each piece of sample data in the set, mapping the user feature vector of the sample data into an updated user feature vector, wherein the dimensionality of the updated user feature vector is greater than the dimensionality of the user feature vector; training a user conversion model by using the updated user feature vectors and their associated self-conversion values, wherein the user conversion model is used for calculating the probability that a user executes the preset operation; and determining, based on the user feature vector of data to be detected and the user conversion model, the category of the user to which the data to be detected belongs, wherein the category comprises an active conversion type and a passive conversion type.
Optionally, the determining, based on the user feature vector of the data to be detected and the user conversion model, of the category of the user to which the data to be detected belongs includes: calculating, based on the user feature vector of the data to be detected and the user conversion model, the probability that the user to which the data to be detected belongs executes the preset operation; and when the probability is smaller than a preset probability, determining that the category of the user to which the data to be detected belongs is the active conversion type, and otherwise determining that the category is the passive conversion type.
Optionally, the mapping of the user feature vector of the sample data into an updated user feature vector includes: determining the number of decision trees in a random forest algorithm; mapping the user feature vector of the sample data into that number of mapped user feature vectors by using a decision tree algorithm; and forming the updated user feature vector from the mapped user feature vectors by using the random forest algorithm.
Optionally, the training of the user conversion model by using the updated user feature vector and its associated self-conversion value includes: training the user conversion model with a neural network algorithm based on the updated user feature vector and its associated self-conversion value.
Optionally, the user conversion model includes hidden layers, and the number of the hidden layers is less than or equal to 3.
Optionally, the neural network algorithm is a deep learning neural network algorithm.
To solve the foregoing technical problem, an embodiment of the present invention further provides a user classification device, including: a first determining module, configured to determine a set of sample data, wherein each piece of sample data comprises a user feature vector and a self-conversion value associated with the user feature vector, and the self-conversion value indicates whether a user actively executes a preset operation; a mapping module, configured to map, for each piece of sample data in the set, the user feature vector of the sample data into an updated user feature vector, wherein the dimensionality of the updated user feature vector is greater than the dimensionality of the user feature vector; a training module, configured to train a user conversion model by using the updated user feature vectors and their associated self-conversion values, wherein the user conversion model is used for calculating the probability that a user executes the preset operation; and a second determining module, configured to determine, based on the user feature vector of data to be detected and the user conversion model, the category of the user to which the data to be detected belongs, wherein the category comprises an active conversion type and a passive conversion type.
Optionally, the second determining module includes: a calculating submodule, configured to calculate, based on the user feature vector of the data to be detected and the user conversion model, the probability that the user to which the data to be detected belongs executes the preset operation; and a first determining submodule, configured to determine that the category of the user to which the data to be detected belongs is the active conversion type when the probability is smaller than a preset probability, and otherwise to determine that the category is the passive conversion type.
To solve the above technical problem, an embodiment of the present invention further provides a storage medium having computer instructions stored thereon, where the steps of the above method are performed when the computer instructions are executed.
In order to solve the above technical problem, an embodiment of the present invention further provides a server, including a memory and a processor, where the memory stores computer instructions executable on the processor, and the processor executes the computer instructions to perform the steps of the above method.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
The embodiment of the invention provides a user classification method, which comprises the following steps: determining a set of sample data, wherein each piece of sample data comprises a user feature vector and a self-conversion value associated with the user feature vector, and the self-conversion value indicates whether a user actively executes a preset operation; for each piece of sample data in the set, mapping the user feature vector of the sample data into an updated user feature vector, wherein the dimensionality of the updated user feature vector is greater than the dimensionality of the user feature vector; training a user conversion model by using the updated user feature vectors and their associated self-conversion values, wherein the user conversion model is used for calculating the probability that a user executes the preset operation; and determining, based on the user feature vector of data to be detected and the user conversion model, the category of the user to which the data to be detected belongs, wherein the category comprises an active conversion type and a passive conversion type. By mapping the user feature vector of the sample data into an updated user feature vector with higher dimensionality, the embodiment of the invention enriches the user feature vector, so that the trained user conversion model can better extract the complex interactions among the data and discriminate between users more effectively.
Further, the mapping of the user feature vector of the sample data into an updated user feature vector includes: determining the number of decision trees in a random forest algorithm; mapping the user feature vector of the sample data into that number of mapped user feature vectors by using a decision tree algorithm; and forming the updated user feature vector from the mapped user feature vectors by using the random forest algorithm. Because the embodiment of the invention uses the random forest algorithm for variable mapping, low-dimensional data can be mapped to a high-dimensional space, which makes it possible to obtain a more accurate user conversion model with a more complex model training method.
Further, the training of the user conversion model by using the updated user feature vector and its associated self-conversion value includes: training the user conversion model with a neural network algorithm based on the updated user feature vector and its associated self-conversion value. In the embodiment of the invention, the high-dimensional data obtained by the random forest algorithm can be used as the input data of the neural network algorithm, and the user conversion model is obtained by training the neural network.
Drawings
Fig. 1 is a flowchart illustrating a user classification method according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating an embodiment of step S102 shown in Fig. 1;
Fig. 3 is a simplified schematic diagram of a random forest algorithm according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a perceptron structure of a neural network algorithm according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a user classification device according to an embodiment of the present invention.
Detailed Description
As described in the Background, those skilled in the art will appreciate that existing schemes have deficiencies, and further research is needed to obtain a more accurate user conversion model.
The embodiment of the invention provides a user classification method, which comprises the following steps: determining a set of sample data, wherein each piece of sample data comprises a user feature vector and a self-conversion value associated with the user feature vector, and the self-conversion value indicates whether a user actively executes a preset operation; for each piece of sample data in the set, mapping the user feature vector of the sample data into an updated user feature vector, wherein the dimensionality of the updated user feature vector is greater than the dimensionality of the user feature vector; training a user conversion model by using the updated user feature vectors and their associated self-conversion values, wherein the user conversion model is used for calculating the probability that a user executes the preset operation; and determining, based on the user feature vector of data to be detected and the user conversion model, the category of the user to which the data to be detected belongs, wherein the category comprises an active conversion type and a passive conversion type.
By mapping the user feature vector into an updated user feature vector with higher dimensionality, the embodiment of the invention enriches the user feature vector of the sample data, so that the trained user conversion model can better extract the complex interactions among the data, discriminate between users more effectively, and determine the user type more accurately. Applied to a marketing scenario, the embodiment of the invention helps improve the conversion rate of precipitation users.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Fig. 1 is a flowchart illustrating a user classification method according to an embodiment of the present invention. The user classification method can be executed by a server and applied in a marketing scenario to obtain a more precise user conversion model, so that self-converting users and non-self-converting users (for example, precipitation users) can be distinguished more accurately.
In one embodiment, the server may be a server cluster consisting of a plurality of servers. The user classification method may include the steps of:
Step S101, determining a set of sample data, wherein each piece of sample data comprises a user feature vector and a self-conversion value associated with the user feature vector, and the self-conversion value indicates whether a user actively executes a preset operation;
Step S102, for each piece of sample data in the set, mapping the user feature vector of the sample data into an updated user feature vector, wherein the dimensionality of the updated user feature vector is greater than the dimensionality of the user feature vector;
Step S103, training a user conversion model by using the updated user feature vectors and their associated self-conversion values, wherein the user conversion model is used for calculating the probability that a user executes the preset operation;
Step S104, determining, based on the user feature vector of data to be detected and the user conversion model, the category of the user to which the data to be detected belongs, wherein the category comprises an active conversion type and a passive conversion type.
More specifically, in step S101, a set of sample data may be determined. The sample data may be collected data relating to precipitation users (e.g., users who do not actively apply for a loan) and data relating to non-precipitation users (e.g., users who actively apply for a loan).
In one embodiment, each piece of sample data may include a user feature vector and its associated self-conversion value, which indicates whether the user actively performs a preset operation, for example, whether the user actively applies for a loan.
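As a minimal sketch of the sample data described in step S101, one record could be represented as below. The class name, field names and feature values are illustrative assumptions and do not come from the patent; the only constraint taken from the text is that the self-conversion value is binary.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sample:
    """One piece of sample data (names are hypothetical)."""
    features: List[float]  # user feature vector X_i
    self_converted: int    # self-conversion value y_i: 1 if the user applied unprompted, else 0

    def __post_init__(self) -> None:
        # The text describes a yes/no indicator, so only 0 and 1 are accepted here.
        if self.self_converted not in (0, 1):
            raise ValueError("self-conversion value must be 0 or 1")


# Made-up records: one self-converting user and one precipitation user.
samples = [
    Sample(features=[35.0, 2.0, 0.4], self_converted=1),
    Sample(features=[52.0, 0.0, 0.9], self_converted=0),
]
```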
In step S102, for a certain set of sample data, the user feature vector of each sample data may be mapped to obtain an updated user feature vector with a higher dimensionality.
In one embodiment, a random forest algorithm may be used to obtain updated user feature vectors with higher dimensionality. Fig. 2 is a flowchart illustrating an embodiment of step S102 shown in fig. 1. Specifically, the step S102 may include the steps of:
Step S1021, determining the number of decision trees in the random forest algorithm;
Step S1022, mapping the user feature vector of the sample data into that number of mapped user feature vectors by using a decision tree algorithm;
Step S1023, forming the updated user feature vector from the mapped user feature vectors by using the random forest algorithm.
Specifically, in step S1021, parameter values of the random forest algorithm may be determined. For example, the number of decision trees in the random forest algorithm may be set to 3, 4, 5, and so on.
In step S1022, the user feature vector of each piece of sample data may be mapped by using a decision tree algorithm, yielding that number of mapped user feature vectors, one per decision tree.
Then, in step S1023, the updated user feature vector may be generated from the mapped user feature vectors by using the random forest algorithm.
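Steps S1021 to S1023 can be sketched as follows, under the assumption that each mapped user feature vector is a one-hot encoding of the leaf a tree assigns to the user and that the updated vector is the concatenation of those encodings. The tree representation, the function names and the three hand-written trees are hypothetical; a real forest would be trained on the sample data rather than written by hand.

```python
from typing import List, Union

# A node is either a leaf (its integer leaf index) or a split:
# (feature_index, threshold, left_subtree, right_subtree).
Node = Union[int, tuple]


def leaf_index(tree: Node, x: List[float]) -> int:
    """Walk one decision tree and return the index of the leaf that x reaches."""
    while not isinstance(tree, int):
        feature_idx, threshold, left, right = tree
        tree = left if x[feature_idx] <= threshold else right
    return tree


def count_leaves(tree: Node) -> int:
    """Number of leaves in a tree, i.e. the width of its one-hot block."""
    if isinstance(tree, int):
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])


def map_user_features(forest: List[Node], x: List[float]) -> List[int]:
    """S1022: one mapped vector per tree; S1023: concatenate them."""
    updated: List[int] = []
    for tree in forest:
        block = [0] * count_leaves(tree)
        block[leaf_index(tree, x)] = 1
        updated.extend(block)
    return updated


# S1021: the number of trees is fixed at 3 (hand-written, untrained trees).
forest = [
    (0, 30.0, 0, 1),                # 2 leaves
    (1, 0.5, 0, (0, 45.0, 1, 2)),   # 3 leaves
    (2, 0.2, (1, 1.5, 0, 1), 2),    # 3 leaves
]
x = [35.0, 2.0, 0.4]
updated = map_user_features(forest, x)
```

With these trees the 3-dimensional input becomes an 8-dimensional vector, one component per leaf in the forest.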
Further, in step S103, a user conversion model may be trained by using the updated user feature vectors and their associated self-conversion values. The user conversion model may be used to calculate the probability that the user performs the preset operation, where the preset operation is, for example, actively applying for a loan or not actively applying for a loan. That is, the user conversion model may be used to calculate either the probability that the user actively applies for a loan or the probability that the user does not actively (e.g., only passively) apply for a loan.
In one embodiment, the user conversion model may be trained with a neural network algorithm based on the updated user feature vectors and their associated self-conversion values. For example, the updated user feature vectors and their associated self-conversion values are used as the input data of the neural network algorithm, and the parameters of the neural network model are obtained through training, which yields the user conversion model.
In a specific implementation, the updated user feature vectors and their associated self-conversion values may be input into a deep learning neural network model, and the parameters of the deep learning neural network model are obtained through training, which yields the user conversion model.
In a specific implementation, the user conversion model (i.e., the neural network model) may include hidden layers. Statistical tests show that when the number of hidden layers is less than or equal to 3, the user conversion model classifies users with high accuracy at low complexity.
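To illustrate what a user conversion model with at most 3 hidden layers computes at prediction time, the following is a minimal forward-pass sketch. The ReLU and sigmoid activations, the layer sizes and all weight values are assumptions chosen for illustration; the patent does not specify them, and in practice the weights would be obtained by training.

```python
import math
from typing import Callable, List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def relu(z: float) -> float:
    return max(0.0, z)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def dense(x: List[float], layer: Layer, activation: Callable[[float], float]) -> List[float]:
    """Apply one fully connected layer followed by an activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def predict_probability(x: List[float], hidden_layers: List[Layer], output_layer: Layer) -> float:
    """Forward pass: hidden layers (at most 3, per the text), then a sigmoid output."""
    if len(hidden_layers) > 3:
        raise ValueError("this sketch allows at most 3 hidden layers")
    h = x
    for layer in hidden_layers:
        h = dense(h, layer, relu)
    return dense(h, output_layer, sigmoid)[0]


# Hypothetical, untrained weights for a 6-dimensional updated feature vector.
hidden = [([[0.5, -0.2, 0.1, 0.0, 0.3, 0.4],
            [-0.1, 0.6, 0.0, 0.2, -0.4, 0.1]], [0.0, 0.1])]
output = ([[1.0, -1.0]], [0.0])
p = predict_probability([0, 1, 0, 0, 1, 0], hidden, output)
```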
In step S104, the category of the user to which the data to be detected belongs may be determined based on the user feature vector of the data to be detected and the user conversion model, where the category may include an active conversion type and a passive conversion type.
In one embodiment, the probability that the user to which the data to be detected belongs actively performs the loan application operation may be calculated based on the user feature vector of the data to be detected and the user conversion model.
In another embodiment, the probability that the user to which the data to be detected belongs does not perform (or only passively performs) the loan application operation may be calculated based on the user feature vector of the data to be detected and the user conversion model.
Further, after the probability is calculated, it is compared with a preset probability. In one alternative, if the probability is smaller than the preset probability, the category of the user to which the data to be detected belongs is determined to be the active conversion type, and otherwise it is determined to be the passive conversion type. In the other alternative, if the probability is smaller than the preset probability, the category is determined to be the passive conversion type, and otherwise it is determined to be the active conversion type.
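The two threshold rules above can be sketched as a single function. Pairing each rule with the kind of probability the model outputs (probability of actively applying versus probability of not applying) is an interpretation rather than something the text states outright, and the function name, flag and label strings are hypothetical.

```python
ACTIVE = "active conversion type"
PASSIVE = "passive conversion type"


def classify_user(probability: float,
                  preset_probability: float,
                  measures_active: bool = True) -> str:
    """Compare the model output with the preset probability.

    measures_active=True  : probability of actively performing the operation,
                            so a value below the preset probability means passive.
    measures_active=False : probability of NOT actively performing it,
                            so a value below the preset probability means active.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    below = probability < preset_probability
    if measures_active:
        return PASSIVE if below else ACTIVE
    return ACTIVE if below else PASSIVE
```

A probability exactly equal to the preset probability counts as "not less than" it, matching the wording of the text.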
The following specifically explains the embodiment of the present invention by taking the number of decision trees equal to 2 as an example.
Fig. 3 is a simplified schematic diagram of a random forest algorithm according to an embodiment of the present invention. In a specific implementation, sample data is first determined from the historical data, each piece of sample data being $(X_i, y_i)$, where $X_i$ is the feature variable of the user and $y_i$ is the user's self-conversion value: $y_i$ is 1 when the user converts without prompting and 0 otherwise, and $i$ is a natural number.
Second, a random forest algorithm is used to obtain updated user feature vectors with higher dimensionality. Those skilled in the art understand that a random forest is composed of a plurality of decision trees. As shown in Fig. 3, assume that the number of decision trees is 2, i.e., a random forest containing 2 decision trees is trained, and that the 2 decision trees have 2 and 4 leaf nodes respectively. If the i-th data $X_i$ falls on the 2nd leaf node of the 1st decision tree and the 3rd leaf node of the 2nd decision tree, the corresponding new feature of the data is (0, 1, 0, 0, 1, 0): the first two components one-hot encode the leaf reached in decision tree 1, and the last four encode the leaf reached in decision tree 2.
The more decision trees are trained in the random forest algorithm, the higher the dimension of the new feature obtained by mapping, so that low-dimensional data can be mapped to high-dimensional data. Taking the data $X_i$ in Fig. 3 as an example, $X_i$ is used as the input of both decision tree 1 and decision tree 2, and each decision tree determines the leaf that $X_i$ reaches. Decision tree 1 has leaf 1 and leaf 2, and decision tree 2 has leaves 1 through 4, so the updated feature vector obtained for $X_i$ is (0, 1, 0, 0, 1, 0).
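The worked example with two decision trees can be reproduced with a few lines of code. This sketch assumes the leaf counts and the leaf reached in each tree are already known; the function name and its 1-based leaf numbering are illustrative choices.

```python
from typing import Sequence, Tuple


def one_hot_leaves(leaf_counts: Sequence[int], leaf_positions: Sequence[int]) -> Tuple[int, ...]:
    """Concatenate one one-hot block per decision tree.

    leaf_counts[k]    : number of leaf nodes of tree k
    leaf_positions[k] : 1-based number of the leaf that the sample reaches in tree k
    """
    vector = []
    for count, position in zip(leaf_counts, leaf_positions):
        if not 1 <= position <= count:
            raise ValueError("leaf position out of range")
        block = [0] * count
        block[position - 1] = 1
        vector.extend(block)
    return tuple(vector)


# Trees with 2 and 4 leaves; the sample reaches leaf 2 of tree 1 and leaf 3 of tree 2.
new_feature = one_hot_leaves([2, 4], [2, 3])
print(new_feature)  # (0, 1, 0, 0, 1, 0)
```

The length of the result is the total number of leaves in the forest, which is why adding trees raises the dimension of the new feature.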
Further, deep learning neural network training may be performed on the obtained high-dimensional feature vectors, and the training result is illustrated in Fig. 4, which is a schematic diagram of a perceptron structure of a neural network algorithm according to an embodiment of the present invention. As shown in Fig. 4, the training result may be $y = f\left(\sum_i X_i W_i - \theta\right)$, where $y$ represents the probability that the user performs the preset operation, $W_i$ represents the weight vector of user $i$, $X_i$ represents the feature vector of user $i$, and $\theta$ represents the error.
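The perceptron expression above can be evaluated directly, as in the sketch below. The source does not say which function $f$ is, so the sigmoid used here is an assumption, as are the weight values and the value of θ; the sum is taken over the components of one updated feature vector.

```python
import math
from typing import Callable, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def perceptron_output(x: Sequence[float],
                      w: Sequence[float],
                      theta: float,
                      f: Callable[[float], float] = sigmoid) -> float:
    """Compute y = f(sum_i x_i * w_i - theta) for a single perceptron."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return f(sum(xi * wi for xi, wi in zip(x, w)) - theta)


# Hypothetical weights applied to the updated feature vector from the example.
x = (0, 1, 0, 0, 1, 0)
w = (0.2, 0.8, -0.5, 0.1, 0.4, -0.3)
y = perceptron_output(x, w, theta=0.2)
```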
The parameters of the training result may be obtained by training on the specific input data, or determined through experimental tests and error calculation. For example, the number of hidden layers of the deep learning neural network may be chosen by experimenting and adjusting according to the test results, the error and the accuracy.
To verify the effect of the user conversion model provided by the embodiment of the invention, the results obtained with the user conversion model can be compared with the results obtained with the traditional method. In practical applications, performance indexes such as accuracy and recall can be used as the comparison criteria, so that the better-performing user conversion model can be selected.
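A comparison on accuracy and recall could be computed as in the following sketch, where label 1 marks a self-converting user. The function name and the example labels are made up for illustration and are not results from the patent.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall) for binary labels, treating 1 as the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = correct / len(y_true)
    # Recall is undefined without positives; 0.0 is returned by convention here.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return accuracy, recall


# Made-up labels and predictions for six users.
acc, rec = accuracy_and_recall([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```

Running the same function on the predictions of the baseline model and of the user conversion model gives directly comparable figures.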
In summary, the embodiments of the present invention provide a method that, in a marketing scenario, performs variable mapping with a random forest algorithm whose model is built with a decision tree algorithm, so as to map low-dimensional data to a high-dimensional space. Combined with a neural network algorithm, this allows the complex interactions among the data to be extracted better, improves the discrimination of the model, and ultimately improves the conversion rate of precipitation users in the marketing scenario.
Fig. 5 is a schematic structural diagram of a user classification device according to an embodiment of the present invention. The user classification device 5 may implement the method solutions shown in fig. 1 and fig. 2, and is executed by a server.
Specifically, the user classification device 5 may include: a first determining module 51, configured to determine a set of sample data, where each sample data includes a user feature vector and a self-transformation value associated with the user feature vector, and the self-transformation value is used to indicate whether a user actively performs a preset operation; a mapping module 52, configured to map, for each sample data in the set of sample data, a user feature vector of the sample data into an updated user feature vector, where a dimension of the updated user feature vector is greater than a dimension of the user feature vector; a training module 53, configured to train to obtain a user transformation model by using the updated user feature vector and its associated self-transformation value, where the user transformation model is used to calculate a probability that a user executes the preset operation; and a second determining module 54, configured to determine, based on the user feature vector of the data to be detected and the user transformation model, a category of a user to which the data to be detected belongs, where the category includes an active transformation type and a passive transformation type.
In a specific implementation, the second determining module 54 may include: the calculating submodule 541 calculates and obtains the probability of executing the preset operation by the user to which the data to be detected belongs based on the user feature vector of the data to be detected and the user conversion model; the first determining submodule 542 is configured to determine that the category of the user to which the data to be detected belongs is the active conversion type when the probability is smaller than a preset probability, and otherwise determine that the category of the user to which the data to be detected belongs is the passive conversion type.
In a specific implementation, the mapping module 52 may include: a second determining submodule 521, configured to determine the number of decision trees in the random forest algorithm; a mapping submodule 522, configured to map the user feature vector of the sample data into that number of mapped user feature vectors by using a decision tree algorithm, one mapped user feature vector per decision tree; and a generating submodule 523, configured to form the updated user feature vector by using the random forest algorithm and the mapped user feature vectors.
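One common way to realize such a tree-based mapping is to encode, for each tree, the leaf that a feature vector reaches as a one-hot vector and to concatenate the encodings of all trees. The original text does not state that this is the encoding used, so the sketch below is an assumption. The trees are hand-written single-split stumps standing in for trained decision trees, and the names `make_stump`, `one_hot`, and `map_features` are hypothetical.

```python
from typing import Callable, List, Sequence

# A hypothetical "tree": routes a feature vector to the index of the leaf it reaches.
Tree = Callable[[Sequence[float]], int]


def make_stump(feature_index: int, threshold: float) -> Tree:
    """Build a one-split tree with two leaves (stand-in for a trained tree)."""
    def route(x: Sequence[float]) -> int:
        return 0 if x[feature_index] <= threshold else 1
    return route


def one_hot(index: int, size: int) -> List[int]:
    """Return a vector of length `size` with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def map_features(x: Sequence[float], trees: Sequence[Tree],
                 leaves_per_tree: int = 2) -> List[int]:
    """Concatenate one mapped vector per tree into the updated feature vector."""
    updated: List[int] = []
    for tree in trees:
        updated.extend(one_hot(tree(x), leaves_per_tree))
    return updated


# Three trees with two leaves each map a 2-dimensional vector to 6 dimensions.
trees = [make_stump(0, 0.5), make_stump(1, 10.0), make_stump(0, 2.0)]
updated_vector = map_features([1.0, 3.0], trees)
```

With this encoding the updated dimension equals the number of trees multiplied by the leaves per tree, so the dimension grows whenever that product exceeds the original dimension.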
In a specific implementation, the training module 53 may include: a training submodule 531, configured to train the user conversion model by using a neural network algorithm based on the updated user feature vectors and their associated self-conversion values.
In a specific implementation, the user conversion model may include hidden layers, and the number of the hidden layers is less than or equal to 3.
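The hidden-layer constraint can be illustrated with a small feed-forward network that outputs a probability. This is only a sketch under stated assumptions: the ReLU hidden activations, the sigmoid output, the random initial weights, and the names `init_network` and `predict_probability` are not taken from the original text, and the training step itself is omitted.

```python
import math
import random
from typing import List, Sequence, Tuple

MAX_HIDDEN_LAYERS = 3  # the text limits the number of hidden layers to at most 3

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def init_network(input_dim: int, hidden_sizes: Sequence[int], seed: int = 0) -> List[Layer]:
    """Create randomly initialised layers, enforcing the hidden-layer limit."""
    if len(hidden_sizes) > MAX_HIDDEN_LAYERS:
        raise ValueError("at most 3 hidden layers are allowed")
    rng = random.Random(seed)
    sizes = [input_dim, *hidden_sizes, 1]  # single output unit: a probability
    layers: List[Layer] = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def predict_probability(layers: Sequence[Layer], x: Sequence[float]) -> float:
    """Forward pass: ReLU in hidden layers, sigmoid on the output unit."""
    activation = list(x)
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        if i == len(layers) - 1:
            activation = [1.0 / (1.0 + math.exp(-v)) for v in z]
        else:
            activation = [max(0.0, v) for v in z]
    return activation[0]
```

For example, `init_network(6, [8, 4])` builds a model with two hidden layers for a 6-dimensional updated user feature vector, while a request for four hidden layers is rejected.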
In a specific implementation, the neural network algorithm may be a deep learning neural network algorithm.
For more details of the operating principle and operating mode of the user classification device 5, reference may be made to the related description of Fig. 1 and Fig. 2; details are not repeated here.
Further, an embodiment of the present invention also discloses a storage medium on which computer instructions are stored; when the computer instructions are executed, the technical solution of the method in the embodiments shown in Fig. 1 and Fig. 2 is performed. Preferably, the storage medium may include a computer-readable storage medium such as a non-volatile memory or a non-transitory memory. The storage medium may include ROM, RAM, magnetic disks, optical disks, etc.
Further, an embodiment of the present invention also discloses a server, which includes a memory and a processor, where the memory stores computer instructions executable on the processor, and the processor, when executing the computer instructions, performs the technical solution of the method in the embodiments shown in Fig. 1 and Fig. 2.
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A method for classifying a user, comprising:
determining a group of sample data, wherein each sample data comprises a user feature vector and a self-conversion value associated with the user feature vector, and the self-conversion value is used for indicating whether a user actively executes a preset operation;
for each sample data in the set of sample data, mapping a user feature vector of the sample data into an updated user feature vector, wherein the dimensionality of the updated user feature vector is greater than the dimensionality of the user feature vector;
training to obtain a user conversion model by using the updated user feature vector and the associated self-conversion value thereof, wherein the user conversion model is used for calculating the probability of executing the preset operation by the user;
determining the category of the user to which the data to be detected belongs based on the user feature vector of the data to be detected and the user conversion model, wherein the category comprises an active conversion type and a passive conversion type.
2. The user classification method according to claim 1, wherein the determining the category of the user to which the data to be detected belongs based on the user feature vector of the data to be detected and the user conversion model comprises:
calculating to obtain the probability of executing the preset operation by the user to which the data to be detected belongs based on the user feature vector of the data to be detected and the user conversion model;
and when the probability is smaller than a preset probability, determining the category of the user to which the data to be detected belongs as the active conversion type, otherwise, determining the category of the user to which the data to be detected belongs as the passive conversion type.
3. The method according to claim 1, wherein the mapping the user feature vector of the sample data to an updated user feature vector comprises:
determining the number of decision trees in a random forest algorithm;
mapping the user feature vector of the sample data into that number of mapped user feature vectors by utilizing a decision tree algorithm;
and forming the updated user feature vector by using a random forest algorithm and the mapped user feature vectors.
4. The method according to claim 1, wherein the training to obtain the user conversion model using the updated user feature vector and its associated self-conversion value comprises:
and training to obtain the user conversion model by adopting a neural network algorithm based on the updated user feature vector and the associated self-conversion value thereof.
5. The user classification method according to claim 4, wherein the user conversion model comprises hidden layers, and the number of the hidden layers is less than or equal to 3.
6. The user classification method according to claim 4 or 5, characterized in that the neural network algorithm is a deep learning neural network algorithm.
7. A user classifying apparatus, comprising:
a first determining module, configured to determine a group of sample data, wherein each sample data comprises a user feature vector and a self-conversion value associated with the user feature vector, and the self-conversion value is used for indicating whether a user actively executes a preset operation;
a mapping module, configured to map, for each sample data in the group of sample data, a user feature vector of the sample data into an updated user feature vector, wherein a dimension of the updated user feature vector is greater than a dimension of the user feature vector;
a training module, configured to train to obtain a user conversion model by utilizing the updated user feature vector and the associated self-conversion value thereof, wherein the user conversion model is used for calculating the probability of executing the preset operation by the user;
and a second determining module, configured to determine the category of the user to which the data to be detected belongs based on the user feature vector of the data to be detected and the user conversion model, wherein the category comprises an active conversion type and a passive conversion type.
8. The apparatus of claim 7, wherein the second determining module comprises: a calculating submodule, configured to calculate and obtain the probability of executing the preset operation by the user to which the data to be detected belongs based on the user feature vector of the data to be detected and the user conversion model;
and the first determining submodule is used for determining the category of the user to which the data to be detected belongs as the active conversion type when the probability is smaller than a preset probability, and otherwise, determining the category of the user to which the data to be detected belongs as the passive conversion type.
9. A storage medium having stored thereon computer instructions, characterized in that the computer instructions, when executed, perform the steps of the method of any one of claims 1 to 6.
10. A server comprising a memory and a processor, the memory having stored thereon computer instructions executable on the processor, wherein the processor, when executing the computer instructions, performs the steps of the method of any one of claims 1 to 6.
CN201910967144.7A 2019-10-12 2019-10-12 User classification method and device, storage medium and server Pending CN110837847A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910967144.7A CN110837847A (en) 2019-10-12 2019-10-12 User classification method and device, storage medium and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910967144.7A CN110837847A (en) 2019-10-12 2019-10-12 User classification method and device, storage medium and server

Publications (1)

Publication Number Publication Date
CN110837847A true CN110837847A (en) 2020-02-25

Family

ID=69575302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910967144.7A Pending CN110837847A (en) 2019-10-12 2019-10-12 User classification method and device, storage medium and server

Country Status (1)

Country Link
CN (1) CN110837847A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112446541A (en) * 2020-11-26 2021-03-05 上海浦东发展银行股份有限公司 Fusion classification model establishing method, marketing conversion rate gain prediction method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108256052A (en) * 2018-01-15 2018-07-06 成都初联创智软件有限公司 Automobile industry potential customers' recognition methods based on tri-training
CN109350032A (en) * 2018-10-16 2019-02-19 武汉中旗生物医疗电子有限公司 A kind of classification method, system, electronic equipment and storage medium
CN109496322A (en) * 2017-09-28 2019-03-19 深圳乐信软件技术有限公司 Credit assessment method and device and the progressive decision tree parameter regulation means of gradient and device

Similar Documents

Publication Publication Date Title
US10713597B2 (en) Systems and methods for preparing data for use by machine learning algorithms
US20230325724A1 (en) Updating attribute data structures to indicate trends in attribute data provided to automated modelling systems
US10810463B2 (en) Updating attribute data structures to indicate joint relationships among attributes and predictive outputs for training automated modeling systems
CN108108854B (en) Urban road network link prediction method, system and storage medium
CN108475393A (en) The system and method that decision tree is predicted are promoted by composite character and gradient
CN110276679B (en) Network personal credit fraud behavior detection method for deep learning
AU2016243106A1 (en) Optimizing neural networks for risk assessment
WO2023065859A1 (en) Item recommendation method and apparatus, and storage medium
US10268749B1 (en) Clustering sparse high dimensional data using sketches
CN110135681A (en) Risk subscribers recognition methods, device, readable storage medium storing program for executing and terminal device
US11687804B2 (en) Latent feature dimensionality bounds for robust machine learning on high dimensional datasets
WO2023116111A1 (en) Disk fault prediction method and apparatus
CN113011895A (en) Associated account sample screening method, device and equipment and computer storage medium
CN109670927A (en) The method of adjustment and its device of credit line, equipment, storage medium
JP2022515941A (en) Generating hostile neuropil-based classification system and method
CN109871866B (en) Model training method, device, equipment and medium for hospital infection prediction
CN116910573B (en) Training method and device for abnormality diagnosis model, electronic equipment and storage medium
CN116522912B (en) Training method, device, medium and equipment for package design language model
CN111143533B (en) Customer service method and system based on user behavior data
CN110837847A (en) User classification method and device, storage medium and server
US20140324524A1 (en) Evolving a capped customer linkage model using genetic models
CN114529399A (en) User data processing method, device, computer equipment and storage medium
CN111898708A (en) Transfer learning method and electronic equipment
CN112085584A (en) Enterprise credit default probability calculation method and system
CN113515383B (en) System resource data distribution method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20200225)