CN112749921A - Mathematical modeling method, system, device and computer readable medium - Google Patents
- Publication number
- CN112749921A (application number CN202110133964.3A)
- Authority
- CN
- China
- Prior art keywords
- screening
- model
- training
- feature
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Physics & Mathematics (AREA)
- Development Economics (AREA)
- Theoretical Computer Science (AREA)
- Marketing (AREA)
- General Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Educational Administration (AREA)
- Tourism & Hospitality (AREA)
- Quality & Reliability (AREA)
- Operations Research (AREA)
- Game Theory and Decision Science (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Technology Law (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a mathematical modeling method, system, device and computer readable medium. The mathematical modeling method comprises: strict variable screening, in which specified customer information is mined as variables and the variables are screened according to specified characteristics of the variables; and model training. External credit features are modeled separately and the result is fused with the results of models built on the other features, so that when the external credit information fluctuates only the credit model is adjusted and the overall model is unaffected; products with similar distributions and scenarios are combined for modeling. The method, system, device and computer readable medium can improve the stability of the mathematical model and the efficiency and accuracy of identifying customer quality. The method reduces variable instability and model overfitting, and minimizes the influence of data instability on model training and on the final effect.
Description
Technical Field
The invention belongs to the technical field of mathematical model construction and relates to a modeling method, in particular to a mathematical modeling method, system, device and computer readable medium.
Background
Risk control is key to finance. As the field develops, the general trend in risk control is toward ever greater informatization, modeling and intelligence.
In risk control modeling, external credit information is easily destabilized by the external environment, which undermines the stability of the model. In addition, because there are many types of risk control products and a separate mathematical model has to be built for each of them, the workload is high and the model's efficiency in identifying customers is low.
In view of the above, there is an urgent need to design a new mathematical model building method to overcome at least some of the above-mentioned disadvantages of the existing mathematical model building methods.
Disclosure of Invention
The invention provides a mathematical modeling method, system, device and computer readable medium that can improve the stability of a mathematical model and the efficiency and accuracy of identifying customer quality.
In order to solve the technical problem, according to one aspect of the present invention, the following technical solution is adopted:
a mathematical modeling method which, by means of strict variable screening and model training, reduces variable instability and model overfitting and minimizes the influence of data instability on model training and the final effect;
the mathematical modeling method comprises the following steps:
strict variable screening;
model training;
wherein the strict variable screening comprises:
- a variable mining step: obtaining customer dimension information, including address book information, telecom operator information, app event-tracking (buried-point) information, risk event information and external credit information;
- a variable screening step: screening the variables according to specified characteristics of the variables, specifically comprising:
feature saturation screening: features with a saturation (share of non-missing values) below 20% are removed, because with so many missing values they can hardly help distinguish the training target;
distribution stability screening: the modeling sample is divided into a training set and a test set, and the distribution of each feature must remain reasonably stable between the two; stability is evaluated with the population stability index (PSI), computed per feature, and a feature with a PSI greater than 0.1 is considered to have an unstable distribution and is removed;
information value screening: the information value (IV) of each feature is calculated, and features with an IV below 0.02, which have no obvious discriminating effect on the training target, are removed;
risk differentiation stability screening: if a feature has an IV of 0.08 on the training set and an IV of 0.01 on the test set, its ability to distinguish the training target is considered to vary greatly across samples, and the feature is removed;
wherein the model training step comprises:
modeling the external credit features separately and fusing the result with the results of models built on the other features, so that when the external credit information fluctuates only the credit model is adjusted, without affecting the overall model;
in view of product diversification, combining products with similar distributions and scenarios for modeling; because products differ, their customer groups differ, so modeling with a group of models enables the model group to better distinguish good customers from bad ones.
According to another aspect of the invention, the following technical solution is adopted: a mathematical modeling method, the mathematical modeling method comprising:
strict variable screening: mining specified customer information as variables, and screening the variables according to specified characteristics of the variables;
model training.
As an embodiment of the present invention, the strict variable screening comprises:
variable mining: mining specified customer information as variables;
variable screening: screening the variables according to specified characteristics of the variables.
As an embodiment of the present invention, in the variable mining step, customer dimension information is obtained, including address book information, telecom operator information, app event-tracking (buried-point) information, risk event information and external credit information;
the variable screening step specifically comprises:
feature saturation screening: features with a saturation (share of non-missing values) below 20% are removed, because with so many missing values they can hardly help distinguish the training target;
distribution stability screening: the modeling sample is divided into a training set and a test set, and the distribution of each feature must remain reasonably stable between the two; stability is evaluated with the population stability index (PSI), computed per feature, and a feature with a PSI greater than 0.1 is considered to have an unstable distribution and is removed;
information value screening: the information value (IV) of each feature is calculated, and features with an IV below 0.02, which have no obvious discriminating effect on the training target, are removed;
risk differentiation stability screening: if a feature has an IV of 0.08 on the training set and an IV of 0.01 on the test set, its ability to distinguish the training target is considered to vary greatly across samples, and such a feature should be removed.
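The text names PSI and IV but does not spell out their formulas. For reference, the definitions commonly used in credit scoring can be written as follows. The binning of each feature into n intervals, the binary good/bad training target, and all symbols are assumptions of this sketch rather than part of the source.

```latex
% p_i^{train}, p_i^{test}: share of samples of the training / test set falling into bin i
\[
\mathrm{PSI} = \sum_{i=1}^{n} \left( p_i^{\mathrm{test}} - p_i^{\mathrm{train}} \right)
               \ln \frac{p_i^{\mathrm{test}}}{p_i^{\mathrm{train}}}
\]

% g_i, b_i: share of all good / all bad customers falling into bin i
\[
\mathrm{IV} = \sum_{i=1}^{n} \left( g_i - b_i \right) \ln \frac{g_i}{b_i}
\]
```

Both sums are non-negative and equal zero when the two compared distributions coincide, which is why small PSI values indicate a stable feature and small IV values indicate a feature with little discriminating power.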
In the model training step, the external credit features are modeled separately and the result is fused with the results of models built on the other features, so that when the external credit information fluctuates only the credit model is adjusted, without affecting the overall model;
products with similar distributions and scenarios are combined for modeling; because products differ, their customer groups differ, so modeling with a group of models enables the model group to better distinguish good customers from bad ones.
According to another aspect of the invention, the following technical scheme is adopted: a mathematical modeling system, the mathematical modeling system comprising:
a strict variable screening module, used for mining specified customer information as variables and screening the variables according to specified characteristics of the variables;
a model training module, used for performing model training.
As an embodiment of the present invention, the strict variable screening module comprises:
a variable mining unit, used for acquiring customer dimension information, including address book information, telecom operator information, app event-tracking (buried-point) information, risk event information and external credit information;
a variable screening unit, which screens the variables according to specified characteristics of the variables, specifically by:
feature saturation screening: features with a saturation (share of non-missing values) below 20% are removed, because with so many missing values they can hardly help distinguish the training target;
distribution stability screening: the modeling sample is divided into a training set and a test set, and the distribution of each feature must remain reasonably stable between the two; stability is evaluated with the population stability index (PSI), computed per feature, and a feature with a PSI greater than 0.1 is considered to have an unstable distribution and is removed;
information value screening: the information value (IV) of each feature is calculated, and features with an IV below 0.02, which have no obvious discriminating effect on the training target, are removed;
risk differentiation stability screening: if a feature has an IV of 0.08 on the training set and an IV of 0.01 on the test set, its ability to distinguish the training target is considered to vary greatly across samples, and such a feature should be removed.
As an embodiment of the invention, the model training module is used for modeling the external credit features separately and fusing the result with the results of models built on the other features, so that when the external credit information fluctuates only the credit model is adjusted, without affecting the overall model;
the model training module is also used for combining products with similar distributions and scenarios for modeling; because products differ, their customer groups differ, so modeling with a group of models enables the model group to better distinguish good customers from bad ones.
According to another aspect of the invention, the following technical solution is adopted: a device for the mathematical modeling method, the device comprising a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, cause the device to perform the method described above.
According to another aspect of the invention, the following technical scheme is adopted: a computer readable medium having stored thereon computer program instructions executable by a processor to implement the above-described method.
The beneficial effects of the invention are as follows: the mathematical modeling method, system, device and computer readable medium can improve the stability of the mathematical model and the efficiency and accuracy of identifying customer quality. The method reduces variable instability and model overfitting, and minimizes the influence of data instability on model training and on the final effect.
Drawings
FIG. 1 is a flow chart of a mathematical modeling method in an embodiment of the present invention.
FIG. 2 is a schematic diagram of the mathematical modeling system according to an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
For a further understanding of the invention, reference will now be made to the preferred embodiments of the invention by way of example, and it is to be understood that the description is intended to further illustrate features and advantages of the invention, and not to limit the scope of the claims.
The description in this section is for several exemplary embodiments only, and the present invention is not limited only to the scope of the embodiments described. It is within the scope of the present disclosure and protection that the same or similar prior art means and some features of the embodiments may be interchanged.
The steps in the embodiments in the specification are only expressed for convenience of description, and the implementation manner of the present application is not limited by the order of implementation of the steps. The term "connected" in the specification includes both direct connection and indirect connection.
The invention discloses a mathematical modeling method which, by means of strict variable screening and model training, reduces variable instability and model overfitting and minimizes the influence of data instability on model training and the final effect.
FIG. 1 is a flow chart of a mathematical modeling method in one embodiment of the present invention. Referring to FIG. 1, the mathematical modeling method includes:
Step S1: strict variable screening. Specified customer information is mined as variables, and the variables are screened according to specified characteristics of the variables.
In one embodiment of the present invention, the strict variable screening comprises:
variable mining: mining specified customer information as variables;
variable screening: screening the variables according to specified characteristics of the variables.
In an embodiment of the present invention, in the variable mining step, customer dimension information is obtained, including address book information, telecom operator information, app event-tracking (buried-point) information, risk event information and external credit information; acquiring this information improves the accuracy of judging user risk.
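As a minimal sketch of the variable mining step, the five kinds of customer information can be flattened into one set of named variables per customer. The source names, the example field names and the helper functions below are hypothetical; the text does not describe how the raw information is stored or named.

```python
# Hypothetical source names mirroring the five information types listed above.
SOURCES = ("address_book", "operator", "app_events", "risk_events", "external_credit")


def mine_variables(raw):
    """Flatten one customer's per-source records into a single variable dict.

    Each variable is prefixed with its source so that its origin stays visible.
    """
    variables = {}
    for source in SOURCES:
        for name, value in raw.get(source, {}).items():
            variables[f"{source}.{name}"] = value
    return variables


def split_credit(variables):
    """Separate external credit variables from all other variables."""
    credit = {k: v for k, v in variables.items() if k.startswith("external_credit.")}
    other = {k: v for k, v in variables.items() if k not in credit}
    return credit, other
```

Keeping the source prefix makes it straightforward to hand the external credit variables to their own model later, which is what the model training step calls for.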
In one embodiment, the variable screening step specifically includes:
feature saturation screening: features with a saturation (share of non-missing values) below 20% (or another value between 0 and 20%) are removed, because with so many missing values they can hardly help distinguish the training target;
distribution stability screening: the modeling sample is divided into a training set and a test set, and the distribution of each feature must remain reasonably stable between the two; stability is evaluated with the population stability index (PSI), computed per feature, and a feature with a PSI greater than 0.1 (or another value) is considered to have an unstable distribution and is removed;
information value screening: the information value (IV) of each feature is calculated, and features with an IV below 0.02 are removed (the threshold may instead be another value between 0.01 and 0.03; the IV calculation is conventional for those skilled in the art and is not described here);
risk differentiation stability screening: if a feature has an IV of 0.08 on the training set and an IV of 0.01 on the test set, its ability to distinguish the training target is considered to vary greatly across samples, and such a feature should be removed.
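The four screens above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: missing values are represented as `None`, features are binned by caller-supplied cut points (`edges`), the training target is binary (1 = bad, 0 = good), `EPS` is a smoothing constant, and `max_iv_drop` is a hypothetical rule for the fourth screen, for which the text gives only the 0.08 versus 0.01 example and no threshold.

```python
import math

EPS = 1e-6  # assumed smoothing constant to avoid log(0); not from the source


def saturation(values):
    """Share of non-missing (not None) values in a feature column."""
    return sum(v is not None for v in values) / len(values)


def bin_index(value, edges):
    """Index of the bin a value falls into, given sorted cut points."""
    return sum(value > e for e in edges)


def bin_shares(values, edges):
    """Share of the non-missing values that falls into each bin."""
    counts = [0] * (len(edges) + 1)
    present = [v for v in values if v is not None]
    for v in present:
        counts[bin_index(v, edges)] += 1
    return [c / max(len(present), 1) for c in counts]


def psi(train_values, test_values, edges):
    """Population stability index between the training and test distributions."""
    p, q = bin_shares(train_values, edges), bin_shares(test_values, edges)
    return sum((a - b) * math.log((a + EPS) / (b + EPS)) for a, b in zip(p, q))


def iv(values, labels, edges):
    """Information value of a feature against a binary label (1 = bad, 0 = good)."""
    good = [0] * (len(edges) + 1)
    bad = [0] * (len(edges) + 1)
    for v, y in zip(values, labels):
        if v is not None:
            (bad if y == 1 else good)[bin_index(v, edges)] += 1
    g_tot, b_tot = max(sum(good), 1), max(sum(bad), 1)
    total = 0.0
    for g, b in zip(good, bad):
        pg, pb = g / g_tot + EPS, b / b_tot + EPS
        total += (pg - pb) * math.log(pg / pb)
    return total


def screen_feature(train, y_train, test, y_test, edges,
                   min_saturation=0.20, max_psi=0.10, min_iv=0.02, max_iv_drop=0.5):
    """Return the reason a feature is rejected, or None if it passes all screens."""
    if saturation(train) < min_saturation:
        return "low saturation"
    if psi(train, test, edges) > max_psi:
        return "unstable distribution"
    iv_train = iv(train, y_train, edges)
    if iv_train < min_iv:
        return "low IV"
    # Hypothetical rule: reject if the test-set IV falls by more than max_iv_drop.
    if iv(test, y_test, edges) < iv_train * (1 - max_iv_drop):
        return "unstable IV"
    return None
```

Under this hypothetical rule, the example from the text (IV of 0.08 on the training set, 0.01 on the test set) is rejected, since 0.01 is less than half of 0.08. The screens run from cheapest to most expensive, so a feature with too many missing values is dropped before any binning is done.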
Step S2: model training.
In an embodiment of the invention, it is taken into account that external credit information is easily destabilized by the external environment, which undermines the stability of the model. The external credit features are therefore modeled separately and the result is fused with the results of models built on the other features, so that when the external credit information fluctuates only the credit model is adjusted, without affecting the overall model.
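The separation of the credit model from the rest can be sketched as follows. The text does not say how the model results are fused; a weighted average of two risk scores is assumed here purely for illustration, and `FusedModel`, `credit_weight` and `replace_credit_model` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A scorer maps one customer's variables to a risk score in [0, 1].
Scorer = Callable[[Dict[str, float]], float]


@dataclass
class FusedModel:
    credit_model: Scorer        # built only on external credit features
    other_model: Scorer         # built on all remaining features
    credit_weight: float = 0.3  # hypothetical fusion weight, not from the source

    def score(self, features: Dict[str, float]) -> float:
        """Fuse the two sub-model results with an assumed weighted average."""
        credit = self.credit_model(features)
        other = self.other_model(features)
        return self.credit_weight * credit + (1 - self.credit_weight) * other

    def replace_credit_model(self, new_model: Scorer) -> None:
        """Swap in a re-fitted credit sub-model; other_model is left untouched."""
        self.credit_model = new_model
```

Because the two sub-models are held separately, a change in the external credit data only requires refitting and swapping `credit_model`, while the model built on the other features stays exactly as it was.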
Products with similar distributions and scenarios are combined for modeling. Because products differ, their customer groups differ, so modeling with a group of models enables the model group to better distinguish good customers from bad ones.
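One way to decide which products are "similar" is sketched below. The text does not say how similarity is measured; this sketch assumes that each product has a scenario label and a binned distribution of some shared variable, and that two products belong together when their scenarios match and the PSI between their distributions is small. The threshold, the greedy grouping and the product names are all hypothetical.

```python
import math


def psi(p, q, eps=1e-6):
    """Population stability index between two binned distributions."""
    return sum((a - b) * math.log((a + eps) / (b + eps)) for a, b in zip(p, q))


def group_products(products, max_psi=0.1):
    """Greedily group products by scenario and distribution similarity.

    products maps a product name to (scenario, binned distribution).
    A product joins the first group whose first member has the same scenario
    and a PSI of at most max_psi; otherwise it starts a new group.
    """
    groups = []
    for name, (scenario, dist) in products.items():
        for group in groups:
            rep_scenario, rep_dist = products[group[0]]
            if scenario == rep_scenario and psi(dist, rep_dist) <= max_psi:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


products = {
    "A": ("cash_loan", [0.50, 0.30, 0.20]),
    "B": ("cash_loan", [0.48, 0.32, 0.20]),
    "C": ("installment", [0.50, 0.30, 0.20]),
    "D": ("cash_loan", [0.10, 0.20, 0.70]),
}
model_groups = group_products(products)
```

Each resulting group would then get one model, so the number of models tracks the number of distinct customer groups rather than the number of products.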
The invention also discloses a mathematical modeling system. FIG. 2 is a schematic diagram of the composition of the mathematical modeling system in an embodiment of the invention. Referring to FIG. 2, the mathematical modeling system includes a strict variable screening module 1 and a model training module 2. The strict variable screening module 1 is used for mining specified customer information as variables and screening the variables according to specified characteristics of the variables; the model training module 2 is used for performing model training.
In an embodiment of the present invention, the strict variable screening module 1 includes a variable mining unit and a variable screening unit.
The variable mining unit is used for acquiring customer dimension information, including address book information, telecom operator information, app event-tracking (buried-point) information, risk event information and external credit information. Acquiring as much customer information as possible allows the customer's risk to be judged accurately.
The variable screening unit screens the variables according to specified characteristics of the variables, specifically by:
feature saturation screening: features with a saturation (share of non-missing values) below 20% are removed, because with so many missing values they can hardly help distinguish the training target;
distribution stability screening: the modeling sample is divided into a training set and a test set, and the distribution of each feature must remain reasonably stable between the two; stability is evaluated with the population stability index (PSI), computed per feature, and a feature with a PSI greater than 0.1 is considered to have an unstable distribution and is removed;
information value screening: the information value (IV) of each feature is calculated, and features with an IV below 0.02, which have no obvious discriminating effect on the training target, are removed;
risk differentiation stability screening: if a feature has an IV of 0.08 on the training set and an IV of 0.01 on the test set, its ability to distinguish the training target is considered to vary greatly across samples, and such a feature should be removed.
In an embodiment of the invention, the model training module is used for modeling the external credit features separately and fusing the result with the results of models built on the other features, so that when the external credit information fluctuates only the credit model is adjusted, without affecting the overall model;
the model training module is also used for combining products with similar distributions and scenarios for modeling; because products differ, their customer groups differ, so modeling with a group of models enables the model group to better distinguish good customers from bad ones.
The invention also discloses a device for the mathematical modeling method, the device comprising a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, cause the device to perform the method described above.
The invention further discloses a computer readable medium having stored thereon computer program instructions executable by a processor to implement the above-described method.
In summary, the mathematical modeling method, system, device and computer readable medium provided by the invention can improve the stability of the mathematical model and the efficiency and accuracy of identifying customer quality. The method reduces variable instability and model overfitting, and minimizes the influence of data instability on model training and on the final effect.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware; for example, it may be implemented using Application Specific Integrated Circuits (ASICs), general purpose computers, or any other similar hardware devices. In some embodiments, the software programs of the present application may be executed by a processor to implement the above steps or functions. As such, the software programs (including associated data structures) of the present application can be stored in a computer-readable recording medium; such as RAM memory, magnetic or optical drives or diskettes, and the like. In addition, some steps or functions of the present application may be implemented using hardware; for example, as circuitry that cooperates with the processor to perform various steps or functions.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The description and applications of the invention herein are illustrative and are not intended to limit the scope of the invention to the embodiments described above. Effects or advantages referred to in the embodiments may not be reflected in the embodiments due to interference of various factors, and the description of the effects or advantages is not intended to limit the embodiments. Variations and modifications of the embodiments disclosed herein are possible, and alternative and equivalent various components of the embodiments will be apparent to those skilled in the art. It will be clear to those skilled in the art that the present invention may be embodied in other forms, structures, arrangements, proportions, and with other components, materials, and parts, without departing from the spirit or essential characteristics thereof. Other variations and modifications of the embodiments disclosed herein may be made without departing from the scope and spirit of the invention.
Claims (10)
1. A mathematical modeling method, characterized in that it relies on strict variable screening and model training to reduce variable instability and model overfitting and to minimize the influence of data instability on model training and the final effect;
the mathematical modeling method comprises the following steps:
strict variable screening;
model training;
wherein the strict variable screening comprises:
- a variable mining step: obtaining customer dimension information, including address book information, telecom operator information, app event-tracking (buried-point) information, risk event information and external credit information;
- a variable screening step: screening the variables according to specified characteristics of the variables, specifically comprising:
feature saturation screening: features with a saturation (share of non-missing values) below 20% are removed, because with so many missing values they can hardly help distinguish the training target;
distribution stability screening: the modeling sample is divided into a training set and a test set, and the distribution of each feature must remain reasonably stable between the two; stability is evaluated with the population stability index (PSI), computed per feature, and a feature with a PSI greater than 0.1 is considered to have an unstable distribution and is removed;
information value screening: the information value (IV) of each feature is calculated, and features with an IV below 0.02, which have no obvious discriminating effect on the training target, are removed;
risk differentiation stability screening: if a feature has an IV of 0.08 on the training set and an IV of 0.01 on the test set, its ability to distinguish the training target is considered to vary greatly across samples, and the feature is removed;
wherein the model training step comprises:
modeling the external credit features separately and fusing the result with the results of models built on the other features, so that when the external credit information fluctuates only the credit model is adjusted, without affecting the overall model;
in view of product diversification, combining products with similar distributions and scenarios for modeling; because products differ, their customer groups differ, so modeling with a group of models enables the model group to better distinguish good customers from bad ones.
2. A mathematical modeling method, comprising:
strict variable screening: mining specified customer information as variables, and screening the variables according to specified characteristics of the variables;
model training.
3. The mathematical modeling method of claim 2, wherein:
the strict variable screening comprises:
variable mining: mining specified customer information as variables;
variable screening: screening the variables according to specified characteristics of the variables.
4. The mathematical modeling method of claim 3, wherein:
in the variable mining step, customer-dimension information is obtained, including address book information, telecom operator information, APP event-tracking (buried point) information, risk event information and external credit information;
the variable screening step specifically comprises:
feature saturation screening: removing features whose saturation is below 20%, since a feature with too many missing values is considered unable to distinguish the training target;
distribution stability screening: the modeling sample is divided into a training set and a test set, and the distribution of each feature is required to remain reasonably stable between the two; stability is evaluated with the population stability index (PSI), which is computed for each feature, and features with a PSI greater than 0.1 are considered unstably distributed and are removed;
information value screening: calculating the information value (IV) of each feature, and removing features with an IV below 0.02, which have no significant ability to distinguish the training target;
risk differentiation stability screening: if a feature has, for example, an IV of 0.08 on the training set and an IV of 0.01 on the test set, its ability to distinguish the training target is considered to vary greatly across samples, and such a feature is removed.
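The four screens recited in claim 4 can be sketched in Python as follows. This is a minimal illustration rather than the applicant's implementation: the helper names, the decile binning, the handling of missing values (dropped before binning), and the rule that a feature fails the risk differentiation stability screen when its test-set IV is less than half its training-set IV (`min_iv_ratio`) are all assumptions, since the claim gives only the 20%, 0.1 and 0.02 thresholds and a single IV example.

```python
import numpy as np

def saturation(x):
    """Share of non-missing (non-NaN) values in a feature column."""
    x = np.asarray(x, dtype=float)
    return float(1.0 - np.isnan(x).mean())

def quantile_cuts(x, n_bins=10):
    """Interior quantile cut points, computed on the training data only."""
    x = np.asarray(x, dtype=float)
    qs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.unique(np.quantile(x[~np.isnan(x)], qs))

def bin_shares(x, cuts, eps=1e-6):
    """Fraction of non-missing values per bin, floored at eps to avoid log(0)."""
    x = np.asarray(x, dtype=float)
    idx = np.searchsorted(cuts, x[~np.isnan(x)], side="right")
    counts = np.bincount(idx, minlength=len(cuts) + 1)
    return np.clip(counts / max(counts.sum(), 1), eps, None)

def psi(train, test, cuts):
    """Population stability index between training and test distributions."""
    p, q = bin_shares(train, cuts), bin_shares(test, cuts)
    return float(np.sum((p - q) * np.log(p / q)))

def iv(x, y, cuts):
    """Information value of x for a binary target y (1 = bad, 0 = good)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    good = bin_shares(x[y == 0], cuts)
    bad = bin_shares(x[y == 1], cuts)
    return float(np.sum((good - bad) * np.log(good / bad)))

def keep_feature(x_tr, y_tr, x_te, y_te, n_bins=10, min_iv_ratio=0.5):
    """Apply the four screens in order; True means the feature survives."""
    if saturation(x_tr) < 0.20:          # feature saturation screening
        return False
    cuts = quantile_cuts(x_tr, n_bins)
    if psi(x_tr, x_te, cuts) > 0.1:      # distribution stability screening
        return False
    iv_tr = iv(x_tr, y_tr, cuts)
    if iv_tr < 0.02:                     # information value screening
        return False
    # Risk differentiation stability screening. The ratio rule is an
    # assumption; the claim gives only the example 0.08 (train) vs 0.01 (test).
    return iv(x_te, y_te, cuts) >= min_iv_ratio * iv_tr
```

Under the assumed ratio rule, the claim's own example (IV 0.08 on the training set, 0.01 on the test set) is rejected, because 0.01 is below half of 0.08.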
5. The mathematical modeling method of claim 2, wherein:
in the model training step, the external credit information features are modeled independently and the result is fused with the results of the models built on the other features, so that when the external credit information changes, only the credit information model is adjusted and the overall model is unaffected;
products with similar distributions and scenarios are combined for joint modeling, because different products serve different customer groups; this model-group approach enables the model group to better distinguish good customers from bad ones.
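The independent sub-model arrangement in claim 5 can be illustrated with a small Python sketch. The claim does not say how the model results are fused, so the weighted average of log-odds used here is an assumption, as are the `ModelGroup` class, its method names, and the sub-model names `external_credit` and `other_features`. The point of the sketch is only that one sub-model can be replaced without retraining or touching the others.

```python
import math
from typing import Callable, Dict, Mapping

# A sub-model maps a customer's features to a probability of default in (0, 1).
Scorer = Callable[[Mapping[str, float]], float]

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

class ModelGroup:
    """Hypothetical container that fuses independently trained sub-models."""

    def __init__(self, weights: Dict[str, float]):
        self.weights = dict(weights)          # fusion weight per sub-model
        self.models: Dict[str, Scorer] = {}

    def register(self, name: str, model: Scorer) -> None:
        """Add or replace one sub-model; the other sub-models are untouched."""
        if name not in self.weights:
            raise KeyError(f"no fusion weight configured for {name!r}")
        self.models[name] = model

    def score(self, customer: Mapping[str, float]) -> float:
        """Assumed fusion rule: weighted average of sub-model log-odds."""
        total = sum(self.weights.values())
        z = sum(w * logit(self.models[name](customer))
                for name, w in self.weights.items())
        return sigmoid(z / total)
```

For example, after `group.register("external_credit", new_model)` the fused score changes, while the model registered under `other_features` is the same object as before.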
6. A mathematical modeling system, the mathematical modeling system comprising:
a strict variable screening module, configured to mine specified customer information as variables and to screen the variables according to specified characteristics of the variables; and
a model training module, configured to perform model training.
7. The mathematical modeling system of claim 6, wherein:
the strict variable screening module comprises:
a variable mining unit, configured to obtain customer-dimension information, including address book information, telecom operator information, APP event-tracking (buried point) information, risk event information and external credit information;
a variable screening unit, configured to screen the variables according to specified characteristics of the variables, the screening specifically comprising:
feature saturation screening: removing features whose saturation is below 20%, since a feature with too many missing values is considered unable to distinguish the training target;
distribution stability screening: the modeling sample is divided into a training set and a test set, and the distribution of each feature is required to remain reasonably stable between the two; stability is evaluated with the population stability index (PSI), which is computed for each feature, and features with a PSI greater than 0.1 are considered unstably distributed and are removed;
information value screening: calculating the information value (IV) of each feature, and removing features with an IV below 0.02, which have no significant ability to distinguish the training target;
risk differentiation stability screening: if a feature has, for example, an IV of 0.08 on the training set and an IV of 0.01 on the test set, its ability to distinguish the training target is considered to vary greatly across samples, and such a feature is removed.
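The claims name PSI and IV without defining them. For reference, the conventional scorecard definitions are given below; the patent text does not state which binning or formula variant it uses, so the symbols are assumptions. Here $B$ is the number of bins, $p_i^{\mathrm{train}}$ and $p_i^{\mathrm{test}}$ are the shares of training-set and test-set samples falling in bin $i$, and $g_i$ and $b_i$ are the shares of all good and all bad customers falling in bin $i$.

```latex
\mathrm{PSI} = \sum_{i=1}^{B} \left( p_i^{\mathrm{train}} - p_i^{\mathrm{test}} \right)
               \ln \frac{p_i^{\mathrm{train}}}{p_i^{\mathrm{test}}},
\qquad
\mathrm{IV} = \sum_{i=1}^{B} \left( g_i - b_i \right) \ln \frac{g_i}{b_i}
```

Both sums are non-negative term by term. PSI is zero only when the two distributions coincide in every bin, and IV is zero only when good and bad customers are distributed identically over the bins.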
8. The mathematical modeling system of claim 6, wherein:
the model training module is configured to model the external credit information features independently and to fuse the result with the results of the models built on the other features, so that when the external credit information changes, only the credit information model is adjusted and the overall model is unaffected;
the model training module is further configured to combine products with similar distributions and scenarios for joint modeling, because different products serve different customer groups; this model-group approach enables the model group to better distinguish good customers from bad ones.
9. An apparatus for implementing a mathematical modeling method, the apparatus comprising a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, cause the apparatus to perform the method of any one of claims 1 to 5.
10. A computer-readable medium having computer program instructions stored thereon, the computer program instructions being executable by a processor to implement the method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110133964.3A CN112749921A (en) | 2021-02-01 | 2021-02-01 | Mathematical modeling method, system, device and computer readable medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110133964.3A CN112749921A (en) | 2021-02-01 | 2021-02-01 | Mathematical modeling method, system, device and computer readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112749921A (en) | 2021-05-04 |
Family
ID=75653447
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110133964.3A Pending CN112749921A (en) | Mathematical modeling method, system, device and computer readable medium | 2021-02-01 | 2021-02-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112749921A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113793212A (en) * | 2021-09-24 | 2021-12-14 | 重庆富民银行股份有限公司 | Credit assessment method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107016571A (en) * | 2017-03-31 | 2017-08-04 | 北京百分点信息科技有限公司 | Data predication method and its system |
CN110298389A (en) * | 2019-06-11 | 2019-10-01 | 上海冰鉴信息科技有限公司 | More wheels circulation feature selection approach and device when training pattern |
CN110930198A (en) * | 2019-12-05 | 2020-03-27 | 佰聆数据股份有限公司 | Electric energy substitution potential prediction method and system based on random forest, storage medium and computer equipment |
CN111311400A (en) * | 2020-03-30 | 2020-06-19 | 百维金科(上海)信息科技有限公司 | Modeling method and system of grading card model based on GBDT algorithm |
CN112070239A (en) * | 2020-11-11 | 2020-12-11 | 上海森亿医疗科技有限公司 | Analysis method, system, medium, and device based on user data modeling |
Non-Patent Citations (1)
Title |
---|
梅子行: "《智能风控 Python金融风险管理与评分卡建模》", 机械工业出版社, pages: 160 - 161 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI673669B (en) | Modeling method and device for evaluating model | |
TWI789345B (en) | Modeling method and device for machine learning model | |
JP6771751B2 (en) | Risk assessment method and system | |
US20110153536A1 (en) | Computer-Implemented Systems And Methods For Dynamic Model Switching Simulation Of Risk Factors | |
CN106651573A (en) | Business data processing method and apparatus | |
CA2935281C (en) | A multidimensional recursive learning process and system used to discover complex dyadic or multiple counterparty relationships | |
CN113177700B (en) | Risk assessment method, system, electronic equipment and storage medium | |
JP6251383B2 (en) | Calculating the probability of a defaulting company | |
CN112037007A (en) | Credit approval method for small and micro enterprises and electronic equipment | |
CN110930248A (en) | Credit risk prediction model construction method and system, storage medium and electronic equipment | |
CN115203496A (en) | Project intelligent prediction and evaluation method and system based on big data and readable storage medium | |
CN112749921A (en) | Mathematical modeling method, system, device and computer readable medium | |
TW202111592A (en) | Learning model application system, learning model application method, and program | |
CN115809837B (en) | Financial enterprise management method, equipment and medium based on digital simulation scene | |
CN106874286B (en) | Method and device for screening user characteristics | |
CN111651500A (en) | User identity recognition method, electronic device and storage medium | |
TWI835478B (en) | An operation behavior recognition method, device, computer equipment and computer-readable storage medium | |
CN112862594A (en) | Financial risk control method, system, device and computer readable medium | |
CN111737090B (en) | Log simulation method and device, computer equipment and storage medium | |
CN115018625A (en) | Credit fusion report generation method, device, equipment and storage medium | |
Duboc et al. | A case study in eliciting scalability requirements | |
CN114253518A (en) | Intelligent project management method and system | |
CN113011748A (en) | Recommendation effect evaluation method and device, electronic equipment and readable storage medium | |
CN111489134A (en) | Data model construction method, device, equipment and computer readable storage medium | |
CN111612023A (en) | Classification model construction method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||