CN116308715A

CN116308715A - Credit evaluation model acquisition method, user credit evaluation method and device

Info

Publication number: CN116308715A
Application number: CN202310341858.3A
Authority: CN
Inventors: 林得有; 朱海宽; 麦少练; 林益坤
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2023-03-31
Filing date: 2023-03-31
Publication date: 2023-06-23

Abstract

The application relates to the field of computer technology and provides a credit evaluation model acquisition method, a user credit evaluation method, and corresponding devices, which can be used in the field of financial technology or other related fields. The method comprises: acquiring user characteristics of reference users; removing, from the reference users, first abnormal users whose user characteristic types are missing and second abnormal users whose user characteristics are abnormal, to obtain effective reference users; screening the multiple types of user characteristics according to the degree of association between each type of user characteristic of the effective reference users and user risk, to obtain multiple types of target user characteristics; training a logistic regression model using the target user characteristics to obtain a trained default risk assessment model; determining the default probability of the effective reference users based on the trained default risk assessment model; and determining evaluation model parameters, and the credit evaluation model corresponding to the evaluation model parameters, based on the default probability of the reference users. The method enables efficient credit evaluation of users.

Description

Credit evaluation model acquisition method, user credit evaluation method and device

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method for obtaining a credit evaluation model, and a method and apparatus for evaluating credit of a user.

Background

With the development of computer technology, credit assessment may be performed on users in various ways, including a method of establishing a credit assessment model for performing credit assessment of users based on statistical user data. However, the conventional credit evaluation model has a problem in that evaluation efficiency is low when performing credit evaluation of a user.

Disclosure of Invention

Based on this, it is necessary to provide a method for obtaining a credit evaluation model, a method for evaluating the credit of a user, and corresponding devices that address the above technical problem.

The application provides a credit evaluation model acquisition method, which comprises the following steps:

acquiring user characteristics of a reference user; the user characteristics of each reference user include at least one type of user characteristics;

determining a first abnormal user with missing user characteristic types in the reference user, and determining a second abnormal user with abnormal user characteristics according to the user characteristics of the reference user;

removing the first abnormal user and the second abnormal user from the reference users to obtain effective reference users;

determining the degree of association between each type of user feature of the effective reference users and user risk, and screening the multiple types of user features according to the degree of association to obtain multiple types of target user features;

training the logistic regression model by utilizing the characteristics of the target user to obtain a trained default risk assessment model;

determining the default probability of the effective reference users based on the default risk assessment model, and determining, based on the default probability of the reference users, evaluation model parameters and a credit evaluation model corresponding to the evaluation model parameters.

In one embodiment, the step of determining the degree of association between each type of user feature of the effective reference users and user risk, and screening the multiple types of user features according to the degree of association to obtain multiple types of target user features, includes: for each type of user feature, binning the user feature to obtain multiple groups of user features; determining the default information of each group of user features within each type of user feature according to the default information of each reference user acquired in advance, and determining the evidence weight of each group of user features according to the default information of that group; and obtaining the degree of association between each type of user feature and user risk according to the default information and the evidence weight of each group of user features within that type, and determining the user features whose degree of association is greater than a preset threshold as the target user features.

In one embodiment, the default information of the reference users includes the number of defaulting users and the number of compliant (non-defaulting) users; determining the evidence weight of each group of user features according to the default information of each group of user features includes: dividing the ratio of the number of defaulting users to the number of compliant users corresponding to each group of user features by the ratio of the number of defaulting users to the number of compliant users among the reference users, to obtain the deviation degree of each group of user features; and obtaining the evidence weight of each group of user features according to the deviation degree of that group.

In one embodiment, the default information of the reference users includes the number of defaulting users and the number of compliant (non-defaulting) users; the step of obtaining the degree of association between the user feature and user risk according to the default information and the evidence weight of each group of user features includes: taking the difference between the ratio of the number of defaulting users to the number of compliant users corresponding to each group of user features and the ratio of the number of defaulting users to the number of compliant users among the reference users, to obtain the deviation value of each group of user features; multiplying the deviation value of each group of user features by the evidence weight of that group to obtain the degree of association of each group of user features; and summing the degrees of association of all groups of user features to obtain the degree of association of the user feature.

In one embodiment, training the logistic regression model using the target user characteristics to obtain a trained default risk assessment model includes: acquiring a default condition label of an effective reference user; inputting the target user characteristics of the effective reference user into a logistic regression model to be trained, and obtaining the default probability output by the logistic regression model; determining a difference value between the default probability and the default condition label, and determining a loss value according to the difference value and a target loss function acquired in advance; and adjusting model parameters of the logistic regression model according to the loss value until the training ending condition is met, and obtaining the default risk assessment model.

In one embodiment, prior to the step of determining the loss value according to the difference value and the pre-acquired target loss function, the method further comprises: acquiring a cross-entropy loss function, and determining a regularization term for the cross-entropy loss function and the parameters of the regularization term; and constructing the target loss function based on the cross-entropy loss function, the regularization term, and the parameters of the regularization term.

In one embodiment, the step of determining the default probability of the effective reference users based on the default risk assessment model, and determining, based on the default probability of the reference users, the evaluation model parameters and the credit evaluation model corresponding to the evaluation model parameters, comprises: determining the default probability of a first effective reference user and the default probability of a second effective reference user based on the default risk assessment model; acquiring the credit score of the first effective reference user and determining it as a first benchmark score, obtaining the credit score of the second effective reference user according to the first benchmark score, and determining the credit score of the second effective reference user as a second benchmark score; and determining the evaluation model parameters, and the credit evaluation model corresponding to the evaluation model parameters, according to the relation between the default probability of the first effective reference user and the first benchmark score and the relation between the default probability of the second effective reference user and the second benchmark score.

The application also provides a method for evaluating the credit of the user, which comprises the following steps:

obtaining the default probability of a user to be evaluated; the default probability of the user to be evaluated is determined by a pre-trained default risk evaluation model based on a plurality of target user characteristics of the user to be evaluated;

inputting the default probability of the user to be evaluated into a pre-trained credit evaluation model to obtain the credit score of the user to be evaluated; the credit evaluation model is obtained by training based on the credit evaluation model acquisition method.

The application provides a credit evaluation model acquisition device, which comprises:

the user characteristic acquisition module is used for acquiring the user characteristics of the reference user; the user characteristics of each reference user include at least one type of user characteristics;

the abnormal user determining module is used for determining a first abnormal user with missing user characteristic types in the reference user and determining a second abnormal user with abnormal user characteristics according to the user characteristics of the reference user;

the reference user screening module is used for eliminating the first abnormal user and the second abnormal user from the reference users to obtain effective reference users;

the user feature screening module is used for determining the association degree of each type of user features in the effective reference user and the user risk, and screening the plurality of types of user features according to the association degree to obtain a plurality of types of target user features;

The assessment model training module is used for training the logistic regression model by utilizing the characteristics of the target user to obtain a trained default risk assessment model;

and the evaluation model acquisition module is used for determining the default probability of the effective reference users based on the default risk assessment model, and determining, based on the default probability of the reference users, the evaluation model parameters and the credit evaluation model corresponding to the evaluation model parameters.

The application also provides a device for evaluating credit of a user, which comprises:

the default probability acquisition module is used for acquiring the default probability of the user to be evaluated; the default probability of the user to be evaluated is determined by a pre-trained default risk assessment model based on a plurality of target user characteristics of the user to be evaluated;

the credit score acquisition module is used for inputting the default probability of the user to be evaluated into a pre-trained credit evaluation model to obtain the credit score of the user to be evaluated; the credit evaluation model is obtained by training based on the credit evaluation model acquisition method.

The application provides a computer device comprising a memory storing a computer program and a processor executing the above method.

The present application provides a computer readable storage medium having stored thereon a computer program for execution by a processor of the above method.

The present application provides a computer program product having a computer program stored thereon, the computer program being executed by a processor to perform the above method.

With the above credit evaluation model acquisition method, user credit evaluation method, and devices, user characteristics of reference users are acquired, the user characteristics of each reference user including at least one type of user characteristic; first abnormal users whose user characteristic types are missing are determined among the reference users, and second abnormal users whose user characteristics are abnormal are determined according to the user characteristics of the reference users; and the first abnormal users and the second abnormal users are removed from the reference users to obtain effective reference users. Then, the degree of association between each type of user feature of the effective reference users and user risk is determined, and the multiple types of user features are screened according to the degree of association to obtain multiple types of target user features; a logistic regression model is trained using the target user features to obtain a trained default risk assessment model; and the default probability of the effective reference users is determined based on the default risk assessment model, with the evaluation model parameters and the corresponding credit evaluation model determined based on the default probability of the reference users. Before the default risk assessment model is trained, the method preprocesses the reference users to obtain the effective reference users and screens out the target user features according to the degree of association between each type of user feature and user risk, so that abnormal data and redundant features are excluded, the computing resources consumed in training the model are reduced, and the running efficiency of the credit evaluation model is improved. Further, because the credit evaluation model is built on a computationally efficient logistic regression model, the evaluation efficiency of the resulting credit evaluation model can be significantly improved, so that credit evaluation of users according to their user characteristics can be performed efficiently.

Drawings

FIG. 1 is a flow chart of a method for acquiring a credit evaluation model in one embodiment;

FIG. 2 is a flow chart of a method for screening target user features in one embodiment;

FIG. 3 is a schematic diagram of a training process of the default risk assessment model in one embodiment;

FIG. 4 is a schematic diagram of a process for acquiring a credit evaluation model in one embodiment;

FIG. 5 is a flow chart of a method for user credit evaluation in one embodiment;

FIG. 6 is a block diagram of a device for acquiring a credit evaluation model in one embodiment;

FIG. 7 is a block diagram of a device for user credit evaluation in one embodiment;

FIG. 8 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly understand that the embodiments described herein may be combined with other embodiments.

It should be noted that, the method for acquiring the credit evaluation model of the present application may be used in the relevant process of the credit evaluation of the user in the financial field, and may also be used in the relevant process of the credit evaluation of the user in any field other than the financial field, and the application fields of the method, the device, the computer equipment, the storage medium and the product for acquiring the credit evaluation model of the present application are not limited.

In one embodiment, as shown in FIG. 1, a method for obtaining a credit evaluation model is provided. The method is described here as applied to a terminal by way of illustration; it is understood that the method may also be applied to a server, or to a system including a terminal and a server and implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

Step S101, obtaining user characteristics of a reference user; the user characteristics of each reference user include at least one type of user characteristics.

Specifically, the number of reference users is plural, and the user characteristics of each reference user include at least one type of user characteristics.

Step S102, a first abnormal user with missing user characteristic types in the reference user is determined, and a second abnormal user with abnormal user characteristics is determined according to the user characteristics of the reference user.

Step S103, removing the first abnormal user and the second abnormal user from the reference users to obtain effective reference users.

Specifically, data preprocessing is performed on the reference users to determine the first abnormal users, whose user feature types are missing, and the second abnormal users, whose user features are abnormal. A user feature type is missing when a certain user feature of a reference user is blank. For example, if a normal reference user has n user features while a given reference user has only m user features with m < n, that reference user has missing user feature types and is a first abnormal user. The range of a user feature may be determined from the user features of the plurality of reference users, and an abnormal user feature is a user feature that does not fall within that range. Illustratively, when the user feature is age and the ages of the plurality of reference users all lie between 18 and 60, the age range is confirmed to be 18 to 60; if the age of one reference user is 16, that reference user is confirmed to be a second abnormal user. The first abnormal users and the second abnormal users are then removed from the reference users to obtain the effective reference users.
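The preprocessing in steps S102 and S103 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function name `filter_effective_users`, the dictionary representation of a user, and the `valid_ranges` argument (the accepted range per feature, here supplied by the caller instead of being derived from the reference users) are all assumptions introduced for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: one dict per reference user, feature name -> value.
User = Dict[str, Optional[float]]


def filter_effective_users(
    users: List[User],
    feature_types: List[str],
    valid_ranges: Dict[str, Tuple[float, float]],
) -> List[User]:
    """Remove first and second abnormal users and return the effective users.

    valid_ranges should only name features that are listed in feature_types.
    """
    effective = []
    for user in users:
        # First abnormal user: an expected feature type is absent or blank.
        if any(user.get(name) is None for name in feature_types):
            continue
        # Second abnormal user: a feature value lies outside its accepted range.
        if any(not (lo <= user[name] <= hi)
               for name, (lo, hi) in valid_ranges.items()):
            continue
        effective.append(user)
    return effective


users = [
    {"age": 35, "income": 8000.0},
    {"age": 16, "income": 3000.0},  # age outside the 18 to 60 range
    {"age": 42},                    # the income feature type is missing
]
effective = filter_effective_users(users, ["age", "income"], {"age": (18, 60)})
```

Only the first user survives both checks; the other two correspond to a second abnormal user and a first abnormal user, respectively.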

Step S104, determining the association degree of each type of user features in the effective reference user and the user risk, and screening the plurality of types of user features according to the association degree to obtain a plurality of types of target user features.

Specifically, feature screening is performed on the effective reference users. The degree of association of different types of user features with user risk is different, and some user features have a high degree of association with user risk, and some user features have a low degree of association with user risk. And screening the user characteristics of multiple types according to the association degree of the user characteristics of each type and the user risk, and removing the characteristics with low association degree with the user risk to obtain target user characteristics of multiple types.

Step S105, training a logistic regression model using the target user features to obtain a trained default risk assessment model.

Step S106, determining the default probability of the effective reference users based on the default risk assessment model, and determining, based on the default probability of the reference users, the evaluation model parameters and the credit evaluation model corresponding to the evaluation model parameters.

Specifically, the logistic regression model is trained with the target user features as input variables and the default probability of the effective reference users as the output variable, to obtain the trained default risk assessment model. A scorecard is an algorithmic model applied in the financial field; it expresses the probability of default risk in the form of a score and predicts the user's default risk over a future period. Generally, the higher the score, the lower the user's default probability. The default risk assessment model is therefore converted into a scorecard, namely the credit evaluation model. The default probability of the effective reference users is determined based on the default risk assessment model, the evaluation model parameters are determined based on the obtained default probability of the reference users, and the credit evaluation model corresponding to the evaluation model parameters is then determined. The resulting credit evaluation model can be used to perform credit evaluation on a user according to the user's features.
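One way to picture the conversion from default probability to a scorecard is sketched below. The text above only states that the evaluation model parameters are determined from the default probabilities of reference users; the linear log-odds form `score = a - b * ln(p / (1 - p))` used here is a common scorecard convention and an assumption, as are the function names and the two anchor points in the example.

```python
import math
from typing import Tuple


def log_odds(p: float) -> float:
    """Natural log of the default odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def fit_scorecard(p1: float, score1: float,
                  p2: float, score2: float) -> Tuple[float, float]:
    """Solve for (a, b) in score = a - b * log_odds(p) from two anchor points.

    Each anchor pairs a reference user's default probability with a benchmark
    score; the two probabilities must differ.
    """
    b = (score2 - score1) / (log_odds(p1) - log_odds(p2))
    a = score1 + b * log_odds(p1)
    return a, b


def credit_score(p: float, a: float, b: float) -> float:
    """Map a default probability to a credit score."""
    return a - b * log_odds(p)


# Hypothetical anchors: default odds of 1:50 score 600, and halving the odds
# to 1:100 adds 20 points.
a, b = fit_scorecard(1 / 51, 600.0, 1 / 101, 620.0)
score_low_risk = credit_score(0.005, a, b)
score_high_risk = credit_score(0.20, a, b)
```

With these anchors a lower default probability yields a higher score, which matches the scorecard behaviour described above.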

With the above credit evaluation model acquisition method, user characteristics of reference users are acquired, the user characteristics of each reference user including at least one type of user characteristic; first abnormal users whose user characteristic types are missing are determined among the reference users, and second abnormal users whose user characteristics are abnormal are determined according to the user characteristics of the reference users; and the first abnormal users and the second abnormal users are removed from the reference users to obtain effective reference users. Then, the degree of association between each type of user feature of the effective reference users and user risk is determined, and the multiple types of user features are screened according to the degree of association to obtain multiple types of target user features; a logistic regression model is trained using the target user features to obtain a trained default risk assessment model; and the default probability of the effective reference users is determined based on the default risk assessment model, with the evaluation model parameters and the corresponding credit evaluation model determined based on the default probability of the reference users. Before the default risk assessment model is trained, the method preprocesses the reference users to obtain the effective reference users and screens out the target user features according to the degree of association between each type of user feature and user risk, so that abnormal data and redundant features are excluded, the computing resources consumed in training the model are reduced, and the running efficiency of the credit evaluation model is improved. Further, because the credit evaluation model is built on a computationally efficient logistic regression model, the evaluation efficiency of the resulting credit evaluation model can be significantly improved, so that credit evaluation of users according to their user characteristics can be performed efficiently.

In one embodiment, as shown in fig. 2, the steps of determining the association degree between each type of user feature in the effective reference user and the user risk, and screening the multiple types of user features according to the association degree to obtain multiple types of target user features include:

Step S202, for each type of user feature, binning the user feature to obtain multiple groups of user features.

Specifically, each type of user feature is binned separately, so that multiple groups of user features are obtained under each type; that is, a user feature of one type is divided into multiple groups according to its specific values, yielding user features in different intervals. For example, if the user feature is age, it may be divided by age into groups such as 20-30, 30-40, and 40-50.

Feature binning is a method for discretizing continuous variables. There are generally three binning methods: equal-width (equidistant) binning, equal-frequency (equal-depth) binning, and optimal binning. In equal-width binning, the intervals of the segments have the same width. In equal-frequency binning, the number of segments is determined first, and the number of samples in each segment is then made approximately equal. Optimal binning, also called supervised discretization, uses recursive partitioning to divide a continuous variable into segments, based on an algorithm that searches for the best grouping using conditional inference.

For example, the present application may use the optimal binning method; when the distribution of a user feature does not meet the requirements of optimal binning, the equal-width binning method is used for that user feature instead.
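The two unsupervised binning methods named above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and optimal (supervised) binning is not shown because its conditional-inference search is not detailed here.

```python
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Assign each value a bin index using intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so a single bin is enough.
        return [0] * len(values)
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Assign bin indices so each bin holds roughly the same number of values.

    Ties are split by position, so equal values may land in adjacent bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


ages = [20, 25, 30, 35, 40, 45, 50]
width_bins = equal_width_bins(ages, 3)      # intervals 20-30, 30-40, 40-50
depth_bins = equal_frequency_bins(ages, 3)  # about the same count per bin
```

Equal-width binning keeps interval boundaries easy to read (as in the age example), while equal-frequency binning keeps the group sizes balanced when the values are skewed.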

Step S204, according to the pre-acquired default information of each reference user, determining default information of each group of user characteristics in each type of user characteristics, and according to the default information of each group of user characteristics, determining evidence weight of each group of user characteristics.

Specifically, the default information of each reference user, that is, whether each reference user is known to be a defaulting user or a compliant (non-defaulting) user, is acquired in advance. The default information of each group of user features under each type, that is, the number of defaulting users and the number of compliant users in each group of user features, is then determined. The evidence weight of each group of user features is then determined according to the default information of that group. Illustratively, the evidence weight of each group of user features may be calculated by Weight of Evidence (WOE) analysis.

Step S206, according to the default information of each group of user features in each type of user features and the evidence weight of each group of user features, the association degree of each type of user features and the user risk is obtained, and the user features with the association degree larger than the preset threshold value are determined to be target user features.

Specifically, according to the default information of each group of user features in each type of user features and the evidence weight of each group of user features, the association degree of each type of user features and the user risk is obtained, and the user features with the association degree larger than a preset threshold value are determined to be target user features.

Illustratively, the degree of association between each type of user feature and user risk is calculated by determining the information value (Information Value, IV) of the user feature. For example, for each type of user feature, a user feature with IV < 0.02 may be regarded as an invalid feature carrying almost no useful information; a user feature with 0.02 ≤ IV < 0.1 is a weak feature carrying little useful information; a user feature with 0.1 ≤ IV < 0.5 is a relevant feature carrying more useful information; and a user feature with IV ≥ 0.5 is a strong feature carrying a great deal of useful information. Illustratively, user features with an IV greater than 0.1, that is, user features whose degree of association is greater than the preset threshold, are selected as the target user features.

In this embodiment, the user features are screened by means of feature binning, evidence weights, and degrees of association to obtain the target user features. Redundant user features can thus be removed and the user features that are highly relevant to user risk selected for model training, which reduces the computing resources consumed during model training and further improves the running efficiency of the credit evaluation model.

In an alternative embodiment, the evidence weight of each group of user features may be determined according to the following formula:

$$WOE_i = \ln\left(\frac{B_i / G_i}{B_T / G_T}\right)$$

where $WOE_i$ is the evidence weight of the i-th group of user features; $B_i$ is the number of defaulting users corresponding to the i-th group of user features; $B_T$ is the number of defaulting users among the reference users; $G_i$ is the number of compliant (non-defaulting) users corresponding to the i-th group of user features; and $G_T$ is the number of compliant users among the reference users.

It will be appreciated that the degree of deviation of each group of user features represents the difference between the ratio of defaulting users to compliant users corresponding to that group and the ratio of defaulting users to compliant users among the reference users. If the number of compliant users corresponding to a group of user features is larger, the evidence weight of that group is smaller; conversely, if the number of defaulting users corresponding to the group is larger, the evidence weight of that group is larger.

In an alternative embodiment, the degree of association of the user features may be determined according to the following formula:

where $IV_i$ is the degree of association of the i-th group of user features; $WOE_i$ is the evidence weight of the i-th group of user features; and $IV$ is the degree of association of the user feature.
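The evidence weight and degree of association can be computed as in the sketch below. It assumes the standard information value convention, IV_i = (B_i/B_T - G_i/G_T) * WOE_i with IV as the sum of IV_i over all groups, where the deviation value is taken as the difference between a group's share of all defaulting users and its share of all compliant users; this is an assumption about the intended formula. The function names, the per-group counts, and the feature names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def woe_iv(groups: List[Tuple[int, int]]) -> Tuple[List[float], float]:
    """Compute per-group WOE values and the total IV of one user feature.

    groups[i] = (B_i, G_i): defaulting and compliant user counts in group i.
    Assumes every group has at least one defaulting and one compliant user;
    otherwise the logarithm is undefined and some smoothing would be needed.
    """
    b_total = sum(b for b, _ in groups)
    g_total = sum(g for _, g in groups)
    woes, iv = [], 0.0
    for b, g in groups:
        bad_share, good_share = b / b_total, g / g_total
        # Equals ln((B_i / G_i) / (B_T / G_T)).
        woe = math.log(bad_share / good_share)
        woes.append(woe)
        iv += (bad_share - good_share) * woe
    return woes, iv


def select_features(feature_groups: Dict[str, List[Tuple[int, int]]],
                    threshold: float = 0.1) -> List[str]:
    """Keep the features whose IV exceeds the preset threshold."""
    return [name for name, groups in feature_groups.items()
            if woe_iv(groups)[1] > threshold]


# Hypothetical counts per bin: (defaulting users, compliant users).
feature_groups = {
    "age": [(30, 70), (20, 180), (10, 190)],
    "region": [(20, 150), (20, 145), (20, 145)],
}
age_woes, age_iv = woe_iv(feature_groups["age"])
selected = select_features(feature_groups)
```

In this made-up data the default rate varies strongly across age bins but hardly at all across regions, so only `age` clears the 0.1 threshold.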

In one embodiment, as shown in FIG. 3, a schematic diagram of the training process of the default risk assessment model, the step of training the logistic regression model using the target user features to obtain a trained default risk assessment model includes:

Step S302, obtaining the default condition label of the effective reference user.

Specifically, the default condition labels include two categories: compliant (non-defaulting) users and defaulting users.

Step S304, inputting the target user features of the effective reference users into the logistic regression model to be trained, and obtaining the default probability output by the logistic regression model.

The logistic regression model is based on a linear regression equation whose result is passed through a nonlinear mapping (Sigmoid) function: the target user features are linearly summed, and the sum is then passed through the nonlinear mapping function to obtain a default probability value between 0 and 1.

In one example, the linear regression equation may be as follows:

$$h = \omega_0 + \omega_1 x_1 + \omega_2 x_2 + \dots + \omega_n x_n$$

The nonlinear mapping function may be as follows:

$$f(x) = \frac{1}{1 + e^{-x}}$$

where the nonlinear mapping function value is 0.5 when $x = 0$, approaches 1 as $x \to +\infty$, and approaches 0 as $x \to -\infty$.

Thus, the logistic regression model formula is as follows:

$$f(h) = \frac{1}{1 + e^{-(\omega_0 + \omega_1 x_1 + \dots + \omega_n x_n)}}$$

where $f(h)$ is the default probability output by the model; $\omega_i$ is the model parameter corresponding to the $i$-th target user feature; and $x_i$ is the $i$-th target user feature.

Further, the effective reference users may be divided into two categories by setting a threshold on the output of the logistic regression model. Illustratively, with the threshold set to 0.5, the effective reference user is classified as a guard user when $f(h) < 0.5$ and as a default user when $f(h) > 0.5$.
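The forward computation and the threshold rule described above can be sketched as follows. The function names and the sample weights are hypothetical, and treating a probability exactly equal to the threshold as a guard user is an assumption, since the text leaves that case open.

```python
import math

def sigmoid(x):
    # Numerically stable form of 1 / (1 + e^(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def default_probability(weights, bias, features):
    # Linear sum h = w0 + w1*x1 + ... + wn*xn, then the nonlinear mapping.
    h = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(h)

def classify(prob, threshold=0.5):
    # Above the threshold: default user; otherwise guard user (tie rule is an assumption).
    return "default" if prob > threshold else "guard"

# Hypothetical parameters and features.
p = default_probability([1.0, -2.0], 0.5, [2.0, 0.5])
print(p, classify(p))
```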

Step S306, determining a difference value between the default probability and the default condition label, and determining a loss value according to the difference value and a pre-acquired target loss function.

Step S308, adjusting the model parameters of the logistic regression model according to the loss value until the training end condition is met, and obtaining the default risk assessment model.

Specifically, the difference between the default probability and the default condition label is determined, and the loss value is computed from that difference using the pre-acquired target loss function. Illustratively, the default condition label is set to 0 for a guard user and to 1 for a default user. The model parameters of the logistic regression model are then adjusted according to the loss value until the training end condition is met, yielding the default risk assessment model. Illustratively, the loss function is derived by maximum likelihood estimation, and a gradient descent algorithm is used to find the model parameters that minimize it.
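A minimal sketch of steps S302 to S308 is given below, assuming batch gradient descent on the (unregularized) cross entropy loss and a fixed number of epochs as the training end condition. The learning rate, epoch count, function names and toy data are all hypothetical.

```python
import math

def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def train_logistic(X, y, lr=0.1, epochs=2000):
    """X: list of feature vectors; y: labels (0 = guard user, 1 = default user)."""
    m, n = len(X), len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):  # assumed end condition: fixed epoch budget
        grad_w, grad_b = [0.0] * n, 0.0
        for xi, yi in zip(X, y):
            prob = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            diff = prob - yi  # difference between default probability and label
            for j in range(n):
                grad_w[j] += diff * xi[j]
            grad_b += diff
        # Step against the averaged gradient of the cross entropy loss.
        w = [wj - lr * gj / m for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b

# Toy data: one feature; larger values correspond to default users.
X_train = [[-2.0], [-1.0], [1.0], [2.0]]
y_train = [0, 0, 1, 1]
w, b = train_logistic(X_train, y_train)
print(w, b)
```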

In this embodiment, the logistic regression model is trained by using the characteristics of the target user to obtain a trained default risk assessment model, and the characteristic of high calculation efficiency of the logistic regression model can be used to improve the evaluation efficiency of the obtained credit evaluation model, so that credit evaluation on the user according to the user characteristics can be efficiently realized.

In one embodiment, after the default risk assessment model is obtained, its assessment performance can be further checked on a test set; if the assessment accuracy of the trained model meets the requirement, it is adopted as the final default risk assessment model.
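The test-set check can be sketched as follows. The text only says the accuracy must meet the requirement, so the plain accuracy metric, the 0.5 threshold and the `required_accuracy` value here are assumptions, and the sample probabilities and labels are made up.

```python
def accuracy(probs, labels, threshold=0.5):
    # Predict 1 (default user) above the threshold, else 0 (guard user).
    predicted = [1 if p > threshold else 0 for p in probs]
    correct = sum(1 for a, b in zip(predicted, labels) if a == b)
    return correct / len(labels)

def meets_requirement(probs, labels, required_accuracy=0.7):
    # required_accuracy is a hypothetical acceptance level.
    return accuracy(probs, labels) >= required_accuracy

test_probs = [0.9, 0.2, 0.6, 0.4]
test_labels = [1, 0, 0, 0]
print(accuracy(test_probs, test_labels), meets_requirement(test_probs, test_labels))
```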

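Specifically, when there are too many target user features, a regularization term can be introduced into the logistic regression model to avoid overfitting.

The cross entropy loss function is as follows:

$$J(\omega) = -\frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)} \log h_\omega(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\omega(x^{(i)})\right) \right]$$

The target loss function is as follows:

$$J(\omega) = -\frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)} \log h_\omega(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\omega(x^{(i)})\right) \right] + L(\omega)$$

where $J(\omega)$ is the loss value; $m$ is the number of effective reference users; $y^{(i)}$ is the default condition label of the $i$-th effective reference user; $h_\omega(x^{(i)})$ is the default probability output by the model for that user; and $L(\omega)$ is the regularization term.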

Specifically, the regularization term may be implemented by means of L1 regularization or L2 regularization.

L1 regularization drives some parameters to exactly 0 and is therefore suitable when there are many features and the data dimension is high. It acts as a form of feature screening: the penalty applied to every parameter is the same, and a portion of the weights become 0. The L1 expression and the corresponding regularization term are:

$$L_1 = |\omega_1| + |\omega_2| + \dots + |\omega_n|$$

$$L(\omega) = \lambda L_1$$

where $\lambda$ is the parameter of the regularization term, also called the penalty coefficient. $\lambda$ is greater than 0 and determines the strength of the penalty: a value that is too large may cause underfitting, while a value that is too small may fail to resolve overfitting.

L2 regularization keeps the parameters as small as possible: the weight of every dimension is reduced by a fixed proportion, so the weights become smoother. The L2 expression and the corresponding regularization term are:

$$L_2 = \omega_1^2 + \omega_2^2 + \dots + \omega_n^2$$

$$L(\omega) = \lambda L_2$$

By way of example, the regularization term of the present application uses L2 regularization by default; if the model still overfits under L2 regularization, or its evaluation accuracy on unseen data is insufficient, L1 regularization is used instead.

In this embodiment, by adding regularization terms into the cross entropy loss function in the training process, the default risk assessment model obtained by training can be ensured to have a good fitting effect, and further the evaluation effect of the credit evaluation model can be ensured.
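The construction of the target loss function can be sketched as the cross entropy plus a penalty term. This is an illustration under assumptions: the L2 penalty is taken as the sum of squared weights, the bias is left unpenalized, probabilities are clipped by a hypothetical `eps` to keep the logarithm finite, and all names and values are made up.

```python
import math

def cross_entropy(probs, labels, eps=1e-12):
    # Mean negative log-likelihood over the effective reference users.
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)

def penalty(weights, lam, kind="l2"):
    # lam is the penalty coefficient (lambda > 0).
    if kind == "l1":
        return lam * sum(abs(w) for w in weights)
    if kind == "l2":
        return lam * sum(w * w for w in weights)
    raise ValueError("kind must be 'l1' or 'l2'")

def target_loss(probs, labels, weights, lam=0.01, kind="l2"):
    return cross_entropy(probs, labels) + penalty(weights, lam, kind)

print(target_loss([0.9, 0.1], [1, 0], [1.0, -2.0], lam=0.5))
```

Switching `kind` from `"l2"` to `"l1"` mirrors the fallback described in the text for a model that still overfits.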

In one embodiment, as shown in the schematic diagram of the process of obtaining the credit rating model of fig. 4, the steps of determining the default probability of the reference user based on the default risk assessment model, determining the rating model parameters based on the default probability of the reference user, and the credit rating model corresponding to the rating model parameters include:

Step S402, determining the default probability of a first effective reference user and the default probability of a second effective reference user based on the default risk assessment model.

The relative probability of an event (Odds) is the ratio of the probability that the event occurs to the probability that it does not occur. Assuming that the probability that a user defaults is $p$, the probability that the user does not default is $1 - p$, and the relative probability of default can be calculated by the following formula:

$$Odds = \frac{p}{1 - p}$$

The user's default probability $p$ may then be expressed as:

$$p = \frac{Odds}{1 + Odds}$$

Specifically, the default probability of the effective reference user is obtained from the following formula:

$$\log(Odds) = \omega_0 + \omega_1 x_1 + \dots + \omega_n x_n$$

where $\log(Odds)$ is the logarithm of the relative probability of default of the effective reference user; $\omega_n$ is the model parameter corresponding to the $n$-th target user feature; and $x_n$ is the evidence weight of the $n$-th target user feature.
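The relationship between default probability, relative probability and its logarithm can be checked with a short sketch. The natural logarithm is assumed, and the function names, weights and evidence weights are hypothetical.

```python
import math

def odds_from_probability(p):
    # Odds = p / (1 - p)
    return p / (1.0 - p)

def probability_from_odds(odds):
    # p = Odds / (1 + Odds)
    return odds / (1.0 + odds)

def log_odds(weights, bias, woe_values):
    # log(Odds) = w0 + w1*x1 + ... + wn*xn, with x_i the evidence weights.
    return bias + sum(w * x for w, x in zip(weights, woe_values))

lo = log_odds([0.8, 1.2], -1.0, [0.5, -0.25])
p_default = probability_from_odds(math.exp(lo))
print(lo, p_default)
```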

Step S404, acquiring the credit score of the first effective reference user and determining it as a first reference score; obtaining the credit score of the second effective reference user according to the first reference score, and determining it as a second reference score.

Step S406, determining an evaluation model parameter and a credit evaluation model corresponding to the evaluation model parameter according to the relation between the first effective reference user default probability and the first reference score and the relation between the second effective reference user default probability and the second reference score.

Specifically, the expression of the credit evaluation model of the present application is as follows:

$$Score = A - B \log(Odds)$$

where $Score$ is the credit score output by the model, and $A$ and $B$ are the evaluation model parameters. The greater the default probability of a user, the lower the user's credit score.

The credit score of the first effective reference user is set as the first reference score; the credit score of the second effective reference user is then derived from the first reference score and taken as the second reference score. It should be noted that the two credit scores are related. Illustratively, the credit score of the second effective reference user is twice the credit score of the first effective reference user.

Therefore, substituting the default probability and first reference score of the first effective reference user, and the default probability and second reference score of the second effective reference user, into the credit evaluation model yields two equations, which can be solved for the values of the evaluation model parameters; the credit evaluation model corresponding to those parameters is thereby determined.

In this embodiment, the evaluation model parameters and the corresponding credit evaluation model are determined from the default probabilities and credit scores of the effective reference users, which establishes a link between the credit score output by the credit evaluation model and the default probability of the user, so that a user's default probability can be read intuitively from the user's credit score.
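Solving for the evaluation model parameters A and B from two reference users amounts to solving two linear equations in two unknowns. The sketch below assumes the natural logarithm; the function names, the two default probabilities and the reference scores (300, and twice that, following the illustrative relation in the text) are hypothetical.

```python
import math

def fit_score_parameters(p1, score1, p2, score2):
    """Solve score = A - B * log(p / (1 - p)) through two reference points."""
    lo1 = math.log(p1 / (1.0 - p1))
    lo2 = math.log(p2 / (1.0 - p2))
    if lo1 == lo2:
        raise ValueError("the two reference users need different default probabilities")
    b = (score1 - score2) / (lo2 - lo1)
    a = score1 + b * lo1
    return a, b

def credit_score(p, a, b):
    return a - b * math.log(p / (1.0 - p))

# First reference user: p = 0.2, score 300; second: p = 0.05, score 600.
A, B = fit_score_parameters(0.2, 300.0, 0.05, 600.0)
print(A, B, credit_score(0.1, A, B))
```

With these sample values B comes out positive, so a higher default probability maps to a lower score, which matches the behavior the text describes.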

In order to better understand the above method, an application example of the method for acquiring the credit evaluation model of the present application is described in detail below, but it should be understood that the embodiment of the present application is not limited thereto.

Firstly, the user features of the reference users are obtained; the user features of each reference user include at least one type of user feature. A first abnormal user whose user feature types are missing is determined among the reference users, and a second abnormal user whose user features are abnormal is determined according to the user features of the reference users. The first abnormal user and the second abnormal user are removed from the reference users to obtain the effective reference users. Then, the user features of the effective reference users are screened as follows: for each type of user feature of the effective reference users, optimal binning is applied to the user feature to obtain a plurality of groups of user features; according to the pre-acquired information on whether each reference user is a default user or a guard user, the weight of evidence (WOE) analysis method is used to obtain the evidence weight of each group of user features; and, according to the same information and the evidence weight of each group of user features, the information value (IV) analysis method is used to obtain the degree of association between each type of user feature and the user risk, and the user features whose degree of association is greater than a preset threshold (IV > 0.1) are determined as target user features.

Secondly, the logistic regression model is trained using the target user features, with the target user features as input variables and the default probability of the effective reference user as the output variable. The cross entropy loss function used in training includes an L2 regularization term. The model parameters of the logistic regression model are adjusted according to the loss value until the training end condition is met, yielding the default risk assessment model. After the default risk assessment model is obtained, its assessment performance can be checked on a test set; if the assessment accuracy of the trained model meets the requirement, it is adopted as the final default risk assessment model.

Finally, the default probability of the effective reference users is determined based on the trained default risk assessment model, and the evaluation model parameters and the corresponding credit evaluation model are determined based on that default probability. The specific process is as follows: first, the default probability of a first effective reference user and the default probability of a second effective reference user are determined; second, the credit score of the first effective reference user is acquired and determined as a first reference score, and the credit score of the second effective reference user is obtained according to the first reference score and determined as a second reference score; and finally, the evaluation model parameters and the corresponding credit evaluation model are determined according to the relation between the default probability of the first effective reference user and the first reference score and the relation between the default probability of the second effective reference user and the second reference score.

In this embodiment, the user characteristics of the reference user may be obtained; the user characteristics of each reference user include at least one type of user characteristics; determining a first abnormal user with missing user characteristic types in the reference user, and determining a second abnormal user with abnormal user characteristics according to the user characteristics of the reference user; and eliminating the first abnormal user and the second abnormal user from the reference users to obtain effective reference users. Then, determining the association degree of each type of user features in the effective reference user and the user risk, and screening a plurality of types of user features according to the association degree to obtain a plurality of types of target user features; training the logistic regression model by utilizing the characteristics of the target user to obtain a trained default risk assessment model; the method comprises the steps of determining the default probability of an effective reference user based on a default risk assessment model, determining an assessment model parameter based on the default probability of the reference user, and a credit assessment model corresponding to the assessment model parameter. Before training the default risk assessment model, the method performs data preprocessing on the reference user to obtain the effective reference user, and screens out the target user features according to the relevance between the user features of each type and the user risk, so that abnormal data and redundant features are avoided, the calculation resources consumed during training the model are reduced, and the running efficiency of the credit assessment model is improved. 
Further, the credit evaluation model is built based on the logistic regression model with high calculation efficiency, and the evaluation efficiency of the obtained credit evaluation model can be remarkably improved, so that credit evaluation of the user according to the user characteristics can be efficiently realized.

In one embodiment, as shown in fig. 5, a method for evaluating the credit of a user is provided. This embodiment is described by taking the method as applied to a terminal as an example; it is understood that the method may also be applied to a server, or to a system including a terminal and a server, in which case it is implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

Step S502, obtaining the default probability of a user to be evaluated; the default probability of the user to be evaluated is determined by a pre-trained default risk assessment model based on a plurality of target user features of the user to be evaluated.

Specifically, a plurality of target user characteristics of a user to be evaluated are taken as input variables and input into a pre-trained default risk evaluation model, so that the default probability of the user to be evaluated is determined. The pre-trained default risk assessment model is obtained by training based on the credit assessment model acquisition method.

Step S504, inputting the default probability of the user to be evaluated into a pre-trained credit evaluation model to obtain the credit score of the user to be evaluated; the credit evaluation model is obtained by training based on the credit evaluation model acquisition method.

Specifically, the default probability of the user to be evaluated is input into a pre-trained credit evaluation model to obtain the credit score of the user to be evaluated, so that the credit evaluation of the user to be evaluated is realized. The credit evaluation model is obtained by training based on the credit evaluation model acquisition method.

In the embodiment, the credit score of the user to be evaluated is obtained by acquiring the default probability of the user to be evaluated and according to the default probability of the user to be evaluated and the credit evaluation model, so that the credit evaluation of the user to be evaluated is efficiently realized, and the efficiency of acquiring the credit score of the user is improved.
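Steps S502 and S504 can be sketched end to end as follows, assuming the default risk assessment model is a logistic regression over evidence weights and the credit evaluation model has the form Score = A - B log(Odds) with a natural logarithm. The weights, bias, A, B and the `eps` clipping constant are hypothetical values, not parameters given in the text.

```python
import math

def default_probability(woe_features, weights, bias):
    # Step S502: default risk assessment model (stable sigmoid of the linear sum).
    h = bias + sum(w * x for w, x in zip(weights, woe_features))
    if h >= 0:
        return 1.0 / (1.0 + math.exp(-h))
    z = math.exp(h)
    return z / (1.0 + z)

def credit_score(prob, a, b, eps=1e-12):
    # Step S504: credit evaluation model applied to the default probability.
    prob = min(max(prob, eps), 1.0 - eps)  # clip so the odds stay finite
    return a - b * math.log(prob / (1.0 - prob))

prob = default_probability([0.5, -0.25], [0.8, 1.2], -1.0)
score = credit_score(prob, a=500.0, b=50.0)
print(prob, score)
```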

It should be understood that, although the steps in the flowcharts related to the embodiments described above are sequentially shown as indicated by arrows, these steps are not necessarily sequentially performed in the order indicated by the arrows. Unless explicitly stated herein, the steps are not strictly limited to the order in which they are performed, and at least some of the steps in the flowcharts described in the embodiments may include steps or stages which are not necessarily performed at the same time but may be performed at different times, and the order in which the steps or stages are performed may not necessarily be sequential, but may be performed in alternate or alternating fashion with other steps or at least some of the steps or stages in other steps.

In one embodiment, as shown in fig. 6, there is provided an acquisition apparatus of a credit evaluation model, including:

a user feature acquisition module 601, configured to acquire a user feature of a reference user; the user characteristics of each reference user include at least one type of user characteristics;

an abnormal user determining module 602, configured to determine a first abnormal user having a missing user feature type in the reference user, and determine a second abnormal user having an abnormal user feature according to the user feature of the reference user;

the reference user filtering module 603 is configured to reject the first abnormal user and the second abnormal user from the reference users, so as to obtain effective reference users;

the user feature screening module 604 is configured to determine a degree of association between each type of user feature in the effective reference user and the user risk, and screen the plurality of types of user features according to the degree of association to obtain a plurality of types of target user features;

the evaluation model training module 605 is configured to train the logistic regression model by using the characteristics of the target user to obtain a trained default risk evaluation model;

the evaluation model acquisition module 606 is configured to determine the default probability of the effective reference users based on the default risk assessment model, and to determine the evaluation model parameters and the credit evaluation model corresponding to the evaluation model parameters based on that default probability.

In one embodiment, the user feature filtering module 604 is further configured to perform a binning process on the user features for each type of user feature to obtain multiple groups of user features; determining the default information of each group of user features in each type of user features according to the default information of each reference user acquired in advance, and determining the evidence weight of each group of user features according to the default information of each group of user features; according to the default information of each group of user features in each type of user features and the evidence weight of each group of user features, the association degree of each type of user features and the user risk is obtained, and the user features with the association degree larger than a preset threshold value are determined as target user features.

In one embodiment, the default information of the reference users includes the number of default users and the number of guard users; the user feature screening module 604 is further configured to divide the ratio of the number of default users to the number of guard users corresponding to each group of user features by the ratio of the number of default users to the number of guard users among the reference users, so as to obtain the deviation degree of each group of user features; and to obtain the evidence weight of each group of user features according to the deviation degree of that group.

In one embodiment, the default information of the reference users includes the number of default users and the number of guard users; the user feature screening module 604 is further configured to take the difference between the ratio of the number of default users to the number of guard users corresponding to each group of user features and the ratio of the number of default users to the number of guard users among the reference users, so as to obtain the deviation value of each group of user features; to multiply the deviation value of each group of user features by the evidence weight of that group to obtain the degree of association of each group of user features; and to sum the degrees of association of the groups of user features to obtain the degree of association of the user features.

In one embodiment, the assessment model training module 605 is further configured to obtain the default condition label of an effective reference user; input the target user features of the effective reference user into the logistic regression model to be trained, and obtain the default probability output by the logistic regression model; determine the difference between the default probability and the default condition label, and determine a loss value according to the difference and a pre-acquired target loss function; and adjust the model parameters of the logistic regression model according to the loss value until the training end condition is met, so as to obtain the default risk assessment model.

In one embodiment, the apparatus further comprises a target loss function construction module for obtaining a cross entropy loss function and determining regularization terms of the cross entropy loss function and parameters of the regularization terms; the objective loss function is constructed based on the cross entropy loss function, the regularization term, and parameters of the regularization term.

In one embodiment, the evaluation model acquisition module 606 is further configured to determine, based on the breach risk assessment model, a breach probability of the first active reference user and a breach probability of the second active reference user; acquiring and determining the credit score of a first effective reference user as a first reference score, acquiring the credit score of a second reference user according to the first reference score, and determining the credit score of the second reference user as a second reference score; and determining an evaluation model parameter and a credit evaluation model corresponding to the evaluation model parameter according to the relation between the first effective reference user default probability and the first benchmark score and the relation between the second effective reference user default probability and the second benchmark score.

For specific limitations on the means for obtaining the credit rating model, reference may be made to the above limitations on the method for obtaining the credit rating model, and details thereof will not be repeated here. The respective modules in the above-described credit evaluation model acquisition means may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.

In one embodiment, as shown in FIG. 7, an apparatus for user credit assessment is provided, comprising:

the default probability obtaining module 701 is configured to obtain the default probability of a user to be evaluated; the default probability of the user to be evaluated is determined by a pre-trained default risk assessment model based on a plurality of target user features of the user to be evaluated;

the credit score obtaining module 702 is configured to input the default probability of the user to be evaluated into a pre-trained credit evaluation model, so as to obtain the credit score of the user to be evaluated; the credit evaluation model is trained based on the method for acquiring a credit evaluation model according to any one of claims 1 to 7.

For specific limitations on the means of user credit assessment, reference may be made to the limitations of the method of user credit assessment hereinabove, and will not be repeated here. The various modules in the above-described means for user credit assessment may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a terminal, and the internal structure thereof may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer equipment also comprises an input/output interface, wherein the input/output interface is a connecting circuit for exchanging information between the processor and the external equipment, and the input/output interface is connected with the processor through a bus and is called as an I/O interface for short. The computer program is executed by a processor to implement a method of obtaining a credit rating model. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.

It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of a portion of the structure associated with the present application and does not constitute a limitation of the computer device to which the present application is applied, and in particular, the computer device may include more or less components than those shown in the drawings, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory storing a computer program and a processor that when executing the computer program performs the steps of:

Further, the step of determining the association degree between each type of user feature in the effective reference user and the user risk and screening the plurality of types of user features according to the association degree to obtain a plurality of types of target user features is realized when the processor executes the computer program, and specifically includes: aiming at each type of user characteristics, carrying out box division processing on the user characteristics to obtain a plurality of groups of user characteristics; determining the default information of each group of user features in each type of user features according to the default information of each reference user acquired in advance, and determining the evidence weight of each group of user features according to the default information of each group of user features; according to the default information of each group of user features in each type of user features and the evidence weight of each group of user features, the association degree of each type of user features and the user risk is obtained, and the user features with the association degree larger than a preset threshold value are determined as target user features.

Further, the default information of the reference user comprises the default user number and the guard user number; the step of determining evidence weights of the user features of each group according to the default information of the user features of each group is realized when the processor executes the computer program, and specifically comprises the following steps: dividing the ratio of the number of default users and the number of guard users corresponding to the characteristics of each group of users by the ratio of the number of default users and the number of guard users in the reference users respectively to obtain the deviation degree of the characteristics of each group of users; and obtaining the evidence weight of each group of user characteristics according to the deviation degree of each group of user characteristics.

Further, the default information of the reference users includes the number of default users and the number of guard users; the step, implemented when the processor executes the computer program, of obtaining the degree of association between the user features and the user risk according to the default information of each group of user features and the evidence weight of each group of user features specifically includes: taking the difference between the ratio of the number of default users to the number of guard users corresponding to each group of user features and the ratio of the number of default users to the number of guard users among the reference users, to obtain the deviation value of each group of user features; multiplying the deviation value of each group of user features by the evidence weight of that group to obtain the degree of association of each group of user features; and summing the degrees of association of the groups of user features to obtain the degree of association of the user features.

Further, the step of training the logistic regression model by using the characteristics of the target user to obtain a trained default risk assessment model is realized when the processor executes the computer program, and specifically includes: acquiring a default condition label of an effective reference user; inputting the target user characteristics of the effective reference user into a logistic regression model to be trained, and obtaining the default probability output by the logistic regression model; determining a difference value between the default probability and the default condition label, and determining a loss value according to the difference value and a target loss function acquired in advance; and adjusting model parameters of the logistic regression model according to the loss value until the training ending condition is met, and obtaining the default risk assessment model.

Further, the processor, when executing the computer program, also implements the following steps: acquiring a cross-entropy loss function, and determining a regularization term for the cross-entropy loss function and the parameters of the regularization term; and constructing the target loss function based on the cross-entropy loss function, the regularization term, and the parameters of the regularization term.
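Written out, one possible form of such a target loss function is shown below. The choice of an L2 penalty, the symbols, and the factor 1/2 are assumptions; the passage does not say which regularization term is used.

```latex
J(\mathbf{w}, b) = -\frac{1}{N}\sum_{i=1}^{N}\Big[\, y_i \ln p_i + (1 - y_i)\ln(1 - p_i) \Big]
  + \frac{\lambda}{2}\,\lVert \mathbf{w} \rVert_2^2,
\qquad
p_i = \frac{1}{1 + e^{-(\mathbf{w}^{\top}\mathbf{x}_i + b)}}
```

Here N is the number of effective reference users, x_i and y_i are the target user features and default condition label of user i, p_i is the default probability output by the model, and λ is the regularization parameter. An L1 penalty, λ‖w‖₁, would be an equally plausible choice.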

Further, when the processor executes the computer program, the step of determining the default probability of the effective reference user based on the default risk assessment model, and determining, based on that default probability, the evaluation model parameters and the credit evaluation model corresponding to the evaluation model parameters, specifically comprises: determining the default probability of a first effective reference user and the default probability of a second effective reference user based on the default risk assessment model; acquiring the credit score of the first effective reference user and determining it as a first benchmark score, and obtaining the credit score of the second effective reference user according to the first benchmark score and determining it as a second benchmark score; and determining the evaluation model parameters and the credit evaluation model corresponding to the evaluation model parameters according to the relation between the default probability of the first effective reference user and the first benchmark score and the relation between the default probability of the second effective reference user and the second benchmark score.
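The two-point determination of the evaluation model parameters can be sketched as follows. The sketch assumes the common scorecard form, score = A - B * ln(p / (1 - p)), where p is the default probability; this functional form is not stated in the passage and is an assumption, as are the function names and the sample probabilities and scores.

```python
import math


def log_odds(p):
    """Natural log of the default odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def fit_score_parameters(p1, score1, p2, score2):
    """Solve score = A - B * log_odds(p) from two (probability, score) pairs.

    (p1, score1): first effective reference user and first benchmark score.
    (p2, score2): second effective reference user and second benchmark score.
    The two probabilities must differ.
    """
    b = (score1 - score2) / (log_odds(p2) - log_odds(p1))
    a = score1 + b * log_odds(p1)
    return a, b


def credit_score(p, a, b):
    """Credit evaluation model: map a default probability to a credit score."""
    return a - b * log_odds(p)


# Hypothetical benchmarks: the second score is the first minus 20 points.
A, B = fit_score_parameters(0.05, 600.0, 0.10, 600.0 - 20.0)
print(credit_score(0.02, A, B), credit_score(0.20, A, B))
```

Once A and B are fixed, scoring a user to be evaluated only requires that user's default probability, so a higher default probability yields a lower score whenever B is positive.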

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, wherein the processor, when executing the computer program, implements the steps of the method embodiments described above.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the respective method embodiments described above.

In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.

It should be noted that the user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or fully authorized by all parties.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the computer program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination involves no contradiction, it should be considered to fall within the scope of this description.

The foregoing examples represent only a few embodiments of the present application. Although they are described in relative detail, they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be determined by the appended claims.

Claims

1. A method for obtaining a credit evaluation model, the method comprising:

determining the association degree of each type of user features in the effective reference users and the user risks, and screening a plurality of types of user features according to the association degree to obtain a plurality of types of target user features;

training a logistic regression model using the target user features to obtain a trained default risk assessment model;

and determining the default probability of the effective reference user based on the default risk assessment model, and determining, based on the default probability of the effective reference user, an evaluation model parameter and a credit evaluation model corresponding to the evaluation model parameter.

2. The method of claim 1, wherein determining the association degree between each type of user feature of the effective reference users and the user risk, and screening the plurality of types of user features according to the association degree to obtain a plurality of types of target user features, comprises:

for each type of user feature, binning the user features to obtain a plurality of groups of user features;

determining the default information of each group of user features in each type of user features according to the default information of each reference user acquired in advance, and determining the evidence weight of each group of user features according to the default information of each group of user features;

and acquiring the association degree of each type of user feature and the user risk according to the default information of each group of user features in each type of user features and the evidence weight of each group of user features, and determining the user features with the association degree larger than a preset threshold as target user features.

3. The method of claim 2, wherein the default information of the reference users includes the number of defaulting users and the number of compliant (non-defaulting) users; and determining the evidence weight of each group of user features according to the default information of each group of user features comprises:

dividing the ratio of the number of defaulting users to the number of compliant users corresponding to each group of user features by the ratio of the number of defaulting users to the number of compliant users among the reference users, to obtain the deviation degree of each group of user features;

and obtaining the evidence weight of each group of user features according to the deviation degree of that group.

4. The method of claim 2, wherein the default information of the reference users includes the number of defaulting users and the number of compliant (non-defaulting) users; and obtaining the association degree between the user features and the user risk according to the default information of each group of user features and the evidence weight of each group of user features comprises:

taking the difference between the ratio of the number of defaulting users to the number of compliant users corresponding to each group of user features and the ratio of the number of defaulting users to the number of compliant users among the reference users, to obtain the deviation value of each group of user features;

multiplying the deviation value of each group of user features by the evidence weight of that group to obtain the association degree of each group of user features;

and summing the association degrees of all groups of user features to obtain the association degree of the user features.

5. The method of claim 1, wherein training a logistic regression model using the target user features to obtain a trained default risk assessment model comprises:

acquiring the default condition label of the effective reference user;

inputting the target user characteristics of the effective reference user into a logistic regression model to be trained, and obtaining the default probability output by the logistic regression model;

determining a difference value between the default probability and the default condition label, and determining a loss value according to the difference value and a target loss function acquired in advance;

and adjusting model parameters of the logistic regression model according to the loss value until the training ending condition is met, and obtaining the default risk assessment model.

6. The method of claim 5, further comprising, prior to said determining a loss value based on said difference value and a pre-obtained target loss function:

acquiring a cross-entropy loss function, and determining a regularization term for the cross-entropy loss function and the parameters of the regularization term;

and constructing the target loss function based on the cross-entropy loss function, the regularization term, and the parameters of the regularization term.

7. The method according to any one of claims 1-6, wherein determining the default probability of the effective reference user based on the default risk assessment model, and determining, based on the default probability of the effective reference user, an evaluation model parameter and a credit evaluation model corresponding to the evaluation model parameter, comprises:

determining the default probability of the first effective reference user and the default probability of the second effective reference user based on the default risk assessment model;

acquiring the credit score of the first effective reference user and determining it as a first benchmark score, and obtaining the credit score of the second effective reference user according to the first benchmark score and determining it as a second benchmark score;

and determining the evaluation model parameters and a credit evaluation model corresponding to the evaluation model parameters according to the relation between the default probability of the first effective reference user and the first benchmark score and the relation between the default probability of the second effective reference user and the second benchmark score.

8. A method of user credit assessment, the method comprising:

inputting the default probability of the user to be evaluated into a pre-trained credit evaluation model to obtain a credit score of the user to be evaluated; the credit evaluation model is trained based on the method for acquiring a credit evaluation model according to any one of claims 1 to 7.

9. An apparatus for acquiring a credit evaluation model, the apparatus comprising:

a user feature screening module, configured to determine the association degree between each type of user feature of the effective reference users and the user risk, and to screen a plurality of types of user features according to the association degree to obtain a plurality of types of target user features;

an assessment model training module, configured to train a logistic regression model using the target user features to obtain a trained default risk assessment model;

and an evaluation model acquisition module, configured to determine the default probability of the effective reference user based on the default risk assessment model, and to determine, based on the default probability of the effective reference user, an evaluation model parameter and a credit evaluation model corresponding to the evaluation model parameter.

10. An apparatus for user credit assessment, the apparatus comprising:

a credit score acquisition module, configured to input the default probability of the user to be evaluated into a pre-trained credit evaluation model to obtain the credit score of the user to be evaluated; the credit evaluation model is trained based on the method for acquiring a credit evaluation model according to any one of claims 1 to 7.

11. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the method of any one of claims 1 to 8 or the method of claim 9 when executing the computer program.

12. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of any one of claims 1 to 8 or the method of claim 9.

13. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 8 or the method of claim 9.