CN115545878A

CN115545878A - Machine learning-based bond risk assessment method and system

Info

Publication number: CN115545878A
Application number: CN202211061131.1A
Authority: CN
Inventors: 王骏; 祝智魁; 王剑锋; 周功梓
Original assignee: Hangzhou Bangzhi Technology Co ltd
Current assignee: Hangzhou Bangzhi Technology Co ltd
Priority date: 2022-08-31
Filing date: 2022-08-31
Publication date: 2022-12-30

Abstract

The application discloses a machine learning-based bond risk assessment method and system, belonging to the field of artificial intelligence. The method comprises the following steps: acquiring basic information of a target company; performing data screening and identification based on big data to determine a training data set; identifying and sorting the data proportion weight average values in each group of data to obtain a sequential sorting result; constructing an initial weight proportion constraint interval and applying a weight constraint to the sequential sorting result to obtain an initial constraint result; constructing a bond rating model with the training data set as input data, the risk rating identification information as supervision data, and the initial constraint result as the weight constraint condition for the hidden layer calculation; and inputting the basic information into the model to obtain a rating output result. The method and system address the technical problems in the prior art that bond risk cannot be accurately assessed and that assessment efficiency is low, and achieve the technical effects of improving the accuracy of bond risk assessment and increasing the speed of risk assessment.

Description

Machine learning-based bond risk assessment method and system

Technical Field

The application relates to the field of artificial intelligence, in particular to a bond risk assessment method and system based on machine learning.

Background

The bond market is an important basic component of the financial market and its largest direct financing platform, and it is closely tied to the development of both the financial sector and the real economy. As the credit bond market has expanded and supported enterprise financing, the quality of bond issuers has become more varied and risk has increased.

At present, the domestic rating industry is highly fragmented, the standardization of industry risk assessment lags behind, and the level of bond risk assessment has not reached internationally recognized standards. Meanwhile, rating the bonds of an enterprise involves a large volume of data, so the workload of analyzing and evaluating the raw data is heavy; the existing manual scoring approach cannot process the data quickly, nor can it guarantee the accuracy of the analysis results. Even when the data are processed through machine learning, the evaluation indexes are not accurately grasped and the core rating factors are not accurately extracted to optimize the rating process, so the evaluation results are inaccurate, investment decisions cannot be reliably guided, and market risk cannot be avoided. The prior art therefore has the technical problems that bond risk cannot be accurately assessed and that assessment efficiency is low.

Disclosure of Invention

The application aims to provide a machine learning-based bond risk assessment method and system, and aims to solve the technical problems that bond risk cannot be accurately assessed and assessment efficiency is low in the prior art.

In view of the above problems, the present application provides a method and a system for evaluating risk of bonds based on machine learning.

In a first aspect, the present application provides a method for machine learning-based risk assessment of bonds, wherein the method includes: acquiring basic information of a target company, wherein the basic information comprises asset scale information, sales information, credit level information and surplus information; performing data screening identification based on big data, and determining a training data set based on screening identification results, wherein each group of data forming the training data set comprises asset scale information, sales information, credit rating information, surplus information and risk rating identification information; identifying and sorting the average value of the data proportion weights in each group of data of the training data set to obtain a sequential sorting result; constructing an initial weight ratio constraint interval, and performing weight constraint on the sequential ordering result based on the initial weight ratio constraint interval to obtain an initial constraint result; taking asset scale information, sales information, credit rating information and surplus information in the training data set as input data, taking risk rating identification information as supervision data, taking the initial constraint result as a hidden layer to calculate a weight constraint condition, and constructing a bond rating model; and inputting the basic information into the constructed bond rating model to obtain a rating output result.

In another aspect, the present application further provides a machine learning-based bond risk assessment system, where the system includes: a basic information acquisition module, used for acquiring basic information of a target company, wherein the basic information comprises asset scale information, sales information, credit rating information and surplus information; a training data determination module, used for performing data screening identification based on big data and determining a training data set based on screening identification results, wherein each group of data forming the training data set comprises asset scale information, sales information, credit rating information, surplus information and risk rating identification information; an identification sorting module, used for identifying and sorting the average value of the data proportion weights in each group of data of the training data set to obtain a sequential sorting result; a weight constraint module, used for constructing an initial weight proportion constraint interval and performing weight constraint on the sequential sorting result based on the initial weight proportion constraint interval to obtain an initial constraint result; a rating model construction module, used for taking asset scale information, sales information, credit rating information and surplus information in the training data set as input data, taking risk rating identification information as supervision data, taking the initial constraint result as the weight constraint condition for the hidden layer calculation, and constructing a bond rating model; and a rating result output module, used for inputting the basic information into the constructed bond rating model to obtain a rating output result.

One or more technical solutions provided in the present application have at least the following technical effects or advantages:

The method acquires information reflecting the basic conditions of a target company to obtain basic information, which provides the underlying analysis data for assessing the company's bond risk. A data set for model training is obtained from big data by screening samples relevant to bond risk assessment, and sample data are collected in terms of asset scale, sales condition, credit rating, surplus condition and risk rating identification. The data are identified and sorted according to the data proportion weight average value in each sample to obtain a sequential sorting result, and a weight constraint is applied to the sequential sorting result according to an initial weight proportion constraint interval to obtain an initial constraint result. The bond rating model is then constructed with the asset scale information, sales information, credit rating information and surplus information in the training data set as input data, the risk rating identification information as supervision data, and the initial constraint result as the weight constraint condition for the hidden layer calculation. Finally, the acquired basic information of the target company is input into the constructed bond rating model to obtain a rating output result. The technical effects of intelligently assessing bond risk and improving assessment efficiency and accuracy are thereby achieved.

Drawings

In order to more clearly illustrate the technical solutions in the present application or the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only exemplary, and for those skilled in the art, other drawings can be obtained according to the provided drawings without inventive effort.

Fig. 1 is a schematic flowchart of a bond risk assessment method based on machine learning according to an embodiment of the present disclosure;

Fig. 2 is a schematic flowchart of constructing a bond rating model in a bond risk assessment method based on machine learning according to an embodiment of the present application;

Fig. 3 is a schematic flowchart of constructing and optimizing a bond rating model in a bond risk assessment method based on machine learning according to an embodiment of the present disclosure;

Fig. 4 is a schematic structural diagram of a machine learning-based bond risk assessment system according to the present application.

Description of reference numerals: basic information acquisition module 11, training data determination module 12, identification sorting module 13, weight constraint module 14, rating model construction module 15, and rating result output module 16.

Detailed Description

The application provides a machine learning-based bond risk assessment method and system, and solves the technical problems that bond risk cannot be accurately assessed and assessment efficiency is low in the prior art. The technical effects of improving the accuracy of bond risk assessment and optimizing the risk assessment speed are achieved.

According to the technical scheme, the data acquisition, storage, use, processing and the like meet relevant regulations of national laws and regulations.

In the following, the technical solutions in the present application will be clearly and completely described with reference to the accompanying drawings, and it is to be understood that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments of the present application, and it is to be understood that the present application is not limited by the example embodiments described herein. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application. It should be further noted that, for the convenience of description, only some but not all of the elements relevant to the present application are shown in the drawings.

Example one

As shown in fig. 1, the present application provides a method for evaluating risk of a bond based on machine learning, wherein the method includes:

Step S100: acquiring basic information of a target company, wherein the basic information comprises asset scale information, sales information, credit rating information and surplus information;

Specifically, the target company is any company that is preparing to issue bonds and whose bond risk needs to be assessed. The basic information reflects the overall operating condition and operating capability of the company for the purpose of bond risk assessment, and includes asset scale information, sales information, credit rating information and surplus information. The asset scale information reflects the total assets, or total fixed assets, owned or controlled by the target company; the larger the company, the stronger its resistance to risk. The sales information is the sales capability of the target company for bond issuance. The credit rating information reflects the credit condition of the target company, and the company's credit rating reflects the reliability of its bonds in the credit bond market. The surplus information is the income growth of the company during operation and reflects its capacity for sustained profit. Obtaining the basic information of the company achieves the technical effect of providing basic data for the subsequent risk assessment of the bonds issued by the target company.
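As an illustration of step S100, the four kinds of basic information can be held in a small record type. This is a minimal sketch only: the class name `BasicInfo`, the field names, the units and the numeric encoding of the credit rating are all assumptions made for the example and are not specified by the application.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class BasicInfo:
    """Basic information of a target company (step S100).

    Field names, units and numeric encodings are illustrative assumptions.
    """
    asset_scale: float    # e.g. total assets in a common currency unit
    sales: float          # e.g. a figure describing sales capability
    credit_rating: float  # credit rating mapped to a numeric score
    surplus: float        # e.g. income growth over the period

    def as_features(self) -> list:
        """Return the four fields as a feature vector in a fixed order."""
        return list(astuple(self))


# Hypothetical company used only to show the shape of the data.
company = BasicInfo(asset_scale=5.2e9, sales=1.1e9, credit_rating=0.8, surplus=0.07)
features = company.as_features()
```

Fixing the field order in one place keeps the training data and the later model input consistent with each other.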

Step S200: performing data screening identification based on big data, and determining a training data set based on screening identification results, wherein each group of data forming the training data set comprises asset scale information, sales information, credit rating information, surplus information and risk rating identification information;

Specifically, in order to obtain real, reliable and accurate training data, the data obtained from big data first undergo data extraction, data cleaning and data loading, and the data related to bond risk assessment are then identified to obtain the screened training data set. The training data set comprises multiple groups of risk assessment data and risk assessment results for different companies. In each group, the asset scale information, sales information, credit rating information and surplus information reflect the basic situation of a company, and the risk rating identification information reflects the bond risk assessment result of the company to which that group of basic data belongs. Obtaining reliable and real training data from big data achieves the technical effects of improving the accuracy of the assessment model built on the training data and improving the model's ability to analyze the conditions of different companies.
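The cleaning part of step S200 might look roughly like the sketch below. It assumes raw records arrive as dictionaries and that screening means dropping incomplete, non-numeric and duplicate records; the field names and the function `screen_records` are hypothetical, and the application does not prescribe these particular rules.

```python
REQUIRED_FIELDS = ("asset_scale", "sales", "credit_rating", "surplus", "risk_rating")


def screen_records(raw_records):
    """Keep only complete, numeric, non-duplicate records.

    The four feature fields must be convertible to float; risk_rating is
    kept as a label string. These screening rules are assumptions.
    """
    seen = set()
    cleaned = []
    for record in raw_records:
        if any(record.get(field) is None for field in REQUIRED_FIELDS):
            continue  # incomplete record
        try:
            row = tuple(float(record[f]) for f in REQUIRED_FIELDS[:-1])
        except (TypeError, ValueError):
            continue  # a feature value is not numeric
        row += (str(record["risk_rating"]),)
        if row in seen:
            continue  # exact duplicate of an earlier record
        seen.add(row)
        cleaned.append(dict(zip(REQUIRED_FIELDS, row)))
    return cleaned


raw = [
    {"asset_scale": 5.0, "sales": 1.0, "credit_rating": 0.8, "surplus": 0.1, "risk_rating": "AA"},
    {"asset_scale": 5.0, "sales": 1.0, "credit_rating": 0.8, "surplus": 0.1, "risk_rating": "AA"},
    {"asset_scale": 2.0, "sales": None, "credit_rating": 0.5, "surplus": 0.0, "risk_rating": "BBB"},
    {"asset_scale": "n/a", "sales": 0.4, "credit_rating": 0.3, "surplus": -0.1, "risk_rating": "B"},
]
training_set = screen_records(raw)
```

In this toy input only the first record survives: the second is a duplicate, the third is missing a value, and the fourth has a non-numeric feature.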

Step S300: identifying and sorting the average value of the data proportion weights in each group of data of the training data set to obtain a sequential sorting result;

Specifically, each group of data in the training data set includes several data items used for assessing the bond risk of a company, and each item contributes differently to the risk assessment. The contribution of each item is obtained for every group of data in the training data set and then averaged, giving the overall average proportion weight of that item in the risk assessment, namely the data proportion weight average value. The items are then identified and sorted by this average value, from the largest weight to the smallest, to obtain the sequential sorting result, in which the highest-ranked items contribute most to the risk assessment. This analyzes and ranks the importance of the data in the risk assessment, and achieves the technical effects of dividing the data into levels, ordering them by importance, improving the reliability of the data, refining the analysis process and improving the accuracy of the assessment.
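The averaging and sorting in step S300 can be sketched as follows. The application does not say how the per-group contribution of each item is measured, so the sketch simply assumes those contribution weights are already available as one dictionary per group; the function name `rank_features` and the numbers are illustrative.

```python
from statistics import mean


def rank_features(per_group_weights):
    """Average each item's weight over all groups, then sort descending."""
    names = list(per_group_weights[0].keys())
    averages = {name: mean(w[name] for w in per_group_weights) for name in names}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical contribution weights for two groups of training data.
per_group_weights = [
    {"asset_scale": 0.30, "sales": 0.10, "credit_rating": 0.40, "surplus": 0.20},
    {"asset_scale": 0.20, "sales": 0.10, "credit_rating": 0.50, "surplus": 0.20},
]
ranking = rank_features(per_group_weights)
order = [name for name, _ in ranking]
```

With these numbers the credit rating ranks first and sales last, which is the sequential sorting result that the next step constrains.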

Step S400: constructing an initial weight ratio constraint interval, and performing weight constraint on the sequential ordering result based on the initial weight ratio constraint interval to obtain an initial constraint result;

Specifically, after the sequential ordering of the data parameters is determined, the weight distribution is constrained according to the sequential ordering result: the influence of a lower-ranked data parameter on the risk assessment may not exceed the influence of a higher-ranked data parameter. In other words, the weight assigned to each data parameter is constrained, which further ensures the reliability of the refined assessment of the data. The initial weight ratio constraint interval limits how the influence of each group of data in the training data set is distributed in the risk assessment, so that the assessment proceeds according to a set assessment scheme. The initial constraint result is the weight of each data parameter obtained after the assessment weight ratios of the data in the training data set have been constrained.
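One possible reading of the constraint in step S400 is sketched below: each weight is clamped into a single interval, and a lower-ranked item is never allowed a larger weight than the item ranked above it. The interval bounds, the single shared interval and the function name `constrain_weights` are assumptions; the application does not give a concrete rule.

```python
def constrain_weights(ordered_weights, lower, upper):
    """Clamp weights to [lower, upper] and keep them non-increasing.

    ordered_weights: list of (name, weight), most important item first.
    Assumes lower <= upper.
    """
    constrained = []
    ceiling = upper
    for name, weight in ordered_weights:
        value = min(max(weight, lower), ceiling)
        constrained.append((name, value))
        ceiling = value  # the next, lower-ranked item may not exceed this
    return constrained


# Hypothetical ranked weights that violate both the interval and the order.
ranked = [("credit_rating", 0.60), ("asset_scale", 0.25),
          ("surplus", 0.30), ("sales", 0.02)]
initial_constraint = constrain_weights(ranked, lower=0.05, upper=0.50)
```

Here the top weight is cut to the upper bound, `surplus` is cut down to the weight of `asset_scale` above it, and `sales` is raised to the lower bound.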

Step S500: taking asset scale information, sales information, credit rating information and surplus information in the training data set as input data, taking risk rating identification information as supervision data, taking the initial constraint result as a hidden layer to calculate a weight constraint condition, and constructing a bond rating model;

Specifically, the bond rating model is a functional model that analyzes the input basic information of the target company and assesses the risk of the bonds issued by the target company to obtain a bond risk level. The asset scale information, sales information, credit rating information and surplus information in the training data set are fed to the input layer of the model as input data, and the model is trained on them. In combination with the hidden layer, the weight proportion of each input in the analysis is obtained and a bond rating result is produced. The bond rating result is supervised by the risk rating identification information: whether the output is consistent with the risk rating identification information is judged, training continues if it is not, and training stops once the output of the model converges, yielding the bond rating model. The technical effects of intelligently assessing bond risk and improving assessment efficiency and accuracy are thereby achieved.
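The idea of supervised training under a weight constraint can be illustrated with a deliberately simplified stand-in. The sketch below is not the network described in the application: it swaps the hidden-layer model for a single logistic unit, reduces the rating to two classes, and enforces the constraint by clamping each weight into its interval after every gradient step (projected gradient descent). The data, bounds and learning rate are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that a sample belongs to the low-risk class (label 1)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_constrained(samples, labels, bounds, lr=0.5, max_epochs=5000, tol=1e-6):
    """Gradient descent on log loss; after each step every weight is
    clamped back into its (low, high) constraint interval."""
    n = len(samples)
    weights = [(lo + hi) / 2 for lo, hi in bounds]
    bias = 0.0
    previous_loss = float("inf")
    for _ in range(max_epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            p = predict_proba(weights, bias, x)
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
            grad_b += p - y
            for j, xi in enumerate(x):
                grad_w[j] += (p - y) * xi
        weights = [
            min(max(w - lr * g / n, lo), hi)
            for w, g, (lo, hi) in zip(weights, grad_w, bounds)
        ]
        bias -= lr * grad_b / n
        loss /= n
        if abs(previous_loss - loss) < tol:  # output has converged
            break
        previous_loss = loss
    return weights, bias


# Features: asset scale, sales, credit rating, surplus, each scaled to [0, 1].
samples = [[0.9, 0.8, 0.9, 0.7], [0.8, 0.9, 0.7, 0.8],
           [0.2, 0.1, 0.3, 0.2], [0.1, 0.3, 0.2, 0.1]]
labels = [1, 1, 0, 0]          # supervision data: 1 = low risk, 0 = high risk
bounds = [(0.05, 0.5)] * 4     # hypothetical constraint interval per feature
weights, bias = train_constrained(samples, labels, bounds)
```

Clamping after each update is one simple way to keep the learned weights inside a prescribed interval; a real implementation with hidden layers would need to decide which layer's weights the interval applies to.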

Step S600: and inputting the basic information into the constructed bond rating model to obtain a rating output result.

Specifically, the basic information of the target company is input into the bond rating model as input data, the model is used for carrying out multi-angle analysis on the risk of issuing bonds of the target company to obtain an accurate risk rating result, the efficiency and the accuracy of risk evaluation are improved, and the technical effect of intelligently assisting enterprises in rating bonds is achieved.

Further, as shown in fig. 2, step S500 of constructing a bond rating model in the embodiment of the present application further includes:

step S510: constructing a test data set based on the big data;

step S520: performing model test on the bond rating model through the test data set, and outputting a test result;

step S530: judging whether the accuracy of the output test result meets an expected threshold value;

step S540: and when the accuracy of the output test result cannot meet the expected threshold, continuing to construct and optimize the bond rating model until the accuracy of the output test result of the bond rating model meets the expected threshold, and completing the construction of the bond rating model.

Further, as shown in fig. 3, when the building of the bond rating model is optimized, step S540 of the embodiment of the present application further includes:

step S541: performing deviation test result integration based on the output test result to obtain an abnormal rating set;

step S542: obtaining test data information corresponding to the abnormal rating set, and performing common feature integration on the test data information based on the test data set to generate a common feature integration result;

step S543: obtaining sensitivity analysis results of the bond rating model based on the common feature integration results;

step S544: and optimizing the bond rating model according to the sensitivity analysis result.

Further, step S543 of obtaining a sensitivity analysis result of the bond rating model based on the common feature integration result further includes:

step S5431: judging whether the sensitivity analysis result meets a preset sensitivity threshold value or not;

step S5432: when the sensitivity analysis result does not meet the preset sensitivity threshold, optimizing the initial weight proportion constraint interval based on the common characteristic integration result to obtain an optimized weight proportion constraint interval;

step S5433: performing incremental optimization on the bond rating model based on the optimization weight ratio constraint interval to generate an incrementally optimized bond rating model;

step S5434: and performing data processing including the common feature integration result feature based on the incremental optimization bond rating model.

Specifically, after the bond rating model is obtained, the model needs to undergo a functional test to determine its reliability. Test data for testing the performance of the bond rating model are obtained from big data to form the test data set, which includes asset scale information, sales information, credit rating information, surplus information, and risk rating identification information. The asset scale information, sales information, credit rating information and surplus information in the test data set are input into the bond rating model, and a rating result is obtained after model analysis. The rating result is compared with the risk rating identification information, the accuracy of the model is obtained from the differences between the two, and this accuracy is taken as the test result. The expected threshold is a preset accuracy level that the model is required to reach, and reflects the precision required of the risk assessment. When the accuracy of the output test result cannot meet the expected threshold, the cases that fall short need further analysis, and the model is optimized according to the analysis result.
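The accuracy check of steps S520 and S530 can be sketched as below. The model is represented by any callable that maps a feature list to a rating label; `toy_model`, the labels and the threshold value are hypothetical stand-ins used only to make the example run.

```python
def evaluate(model, test_set, expected_threshold):
    """Return (accuracy, passed) for a model on labelled test data.

    model: callable mapping a feature list to a predicted rating label.
    test_set: list of (features, labelled_rating) pairs.
    """
    correct = sum(1 for features, label in test_set if model(features) == label)
    accuracy = correct / len(test_set)
    return accuracy, accuracy >= expected_threshold


def toy_model(features):
    # Hypothetical stand-in for the trained bond rating model.
    return "low_risk" if sum(features) / len(features) >= 0.5 else "high_risk"


test_set = [
    ([0.9, 0.8, 0.9, 0.7], "low_risk"),
    ([0.2, 0.1, 0.3, 0.2], "high_risk"),
    ([0.6, 0.5, 0.4, 0.6], "high_risk"),  # the toy model rates this one wrongly
    ([0.1, 0.3, 0.2, 0.1], "high_risk"),
]
accuracy, passed = evaluate(toy_model, test_set, expected_threshold=0.9)
```

Because one of the four ratings is wrong, the accuracy falls below the assumed threshold and the model would go on to the optimization steps.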

Specifically, when the accuracy of the model cannot meet the preset requirement, the sensitivity of the output result of the model to the change of the system parameters or the surrounding conditions can be determined by analyzing the sensitivity of the model. In order to quickly construct the bond rating model and improve the operation speed of the model, initial weight ratio constraint is carried out on data in the training data set, the number of iterations of operation is limited, and weight distribution is directly and uniformly carried out on the data to reduce the operation amount, so that the particularity of part of data is ignored, and the output accuracy of the model is reduced. Therefore, by analyzing the sensitivity of the model, the stability of the optimal solution of the model when the original data is inaccurate or changed can be analyzed and evaluated, and the influence of parameters on the model can be analyzed and obtained, so that the data with abnormal characteristics can be analyzed and processed more specifically, and the stability of the model can be optimized.

All output test results are compared with the risk rating identification information serving as supervision data to obtain the deviations, and all deviations are analyzed and integrated so that every abnormal rating result output by the model is collected into the abnormal rating set. Once the abnormal rating set is obtained, the test data corresponding to each abnormal rating can be looked up in reverse, the data in the test data set are analyzed, and the features they have in common are extracted. Optionally, the data in the test data set are ranked by importance through principal component analysis to obtain a group of data ordered from most to least important, and the features that meet the importance requirement are taken as the common features. The common feature integration result consists of the feature parameters that have a large influence on the output accuracy of the model.
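A rough sketch of steps S541 and S542 follows. Note that it does not use principal component analysis, which the text mentions as an option; instead it substitutes a simpler rule, flagging a feature as common to the abnormal ratings when its mean in the abnormal set lies at least one standard deviation away from its mean over the whole test set. The field names, the threshold `min_shift` and the data are all assumptions.

```python
from statistics import mean, pstdev

FEATURES = ("asset_scale", "sales", "credit_rating", "surplus")


def abnormal_set(test_rows, predictions):
    """Collect the test rows whose predicted rating differs from the label."""
    return [row for row, pred in zip(test_rows, predictions)
            if pred != row["risk_rating"]]


def common_features(all_rows, abnormal_rows, min_shift=1.0):
    """Features whose abnormal-set mean is shifted by >= min_shift std devs.

    Expects a non-empty abnormal_rows list.
    """
    result = {}
    for name in FEATURES:
        values = [row[name] for row in all_rows]
        spread = pstdev(values)
        if spread == 0:
            continue  # constant feature, no information
        shift = (mean(r[name] for r in abnormal_rows) - mean(values)) / spread
        if abs(shift) >= min_shift:
            result[name] = shift
    return result


def row(a, s, c, p, rating):
    return {"asset_scale": a, "sales": s, "credit_rating": c,
            "surplus": p, "risk_rating": rating}


test_rows = [row(0.8, 0.7, 0.9, 0.6, "A"), row(0.7, 0.6, 0.8, 0.5, "A"),
             row(0.3, 0.4, 0.2, 0.5, "C"), row(0.2, 0.3, 0.3, 0.6, "C"),
             row(0.6, 0.5, 0.6, 0.0, "C"), row(0.5, 0.6, 0.5, 0.1, "C")]
predictions = ["A", "A", "C", "C", "A", "A"]   # hypothetical model output

abnormal = abnormal_set(test_rows, predictions)
common = common_features(test_rows, abnormal)
```

In this toy data the two misrated companies share an unusually low surplus, so `surplus` is the only feature flagged.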

Specifically, a sensitivity analysis result of the model is obtained from the common feature integration result and compared with a preset sensitivity threshold. When the sensitivity analysis result does not meet the sensitivity threshold, the influence of the common features on the model exceeds the range the model can tolerate and strongly affects its stability, so incremental learning needs to be performed on the model to improve its stability. The influence of each feature on the model is obtained from the common feature integration result, the initial weight ratio constraint interval is optimized according to that influence, and the weight ratio of each feature in the assessment is adjusted, giving an optimized weight ratio constraint interval with a more reasonable weight distribution. More training data are then acquired from big data according to the weight ratios in the optimized weight ratio constraint interval, and the bond rating model is incrementally optimized according to the optimized weight ratio constraint interval to obtain an incrementally optimized bond rating model with higher stability. The incrementally optimized bond rating model is then used to assess the risk of data containing the features in the common feature integration result, producing a new test result whose accuracy is judged against the expected threshold. If the expected threshold is met, construction of the bond rating model is complete and model optimization stops; if not, the construction and optimization steps above are repeated until the accuracy of the test result meets the expected threshold. The goal of testing the stability of the model is thereby achieved, the stability of the model is improved, and the technical effect of ensuring the accuracy of risk assessment is achieved.
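The sensitivity check and interval adjustment of steps S5431 and S5432 could be sketched as follows. The application does not define the sensitivity measure or the direction of adjustment, so the sketch assumes a one-at-a-time perturbation measure (mean absolute change in score when one feature is shifted by `delta`) and assumes that an over-sensitive feature has the upper bound of its interval lowered by a fixed step. The linear `score` function is a hypothetical stand-in for the model.

```python
def sensitivity(score, samples, feature_index, delta=0.1):
    """Mean absolute change in score when one feature is shifted by delta."""
    total = 0.0
    for x in samples:
        shifted = list(x)
        shifted[feature_index] += delta
        total += abs(score(shifted) - score(x))
    return total / len(samples)


def tighten_interval(interval, sensitivity_value, threshold, step=0.05):
    """Lower the upper bound when sensitivity exceeds the threshold."""
    low, high = interval
    if sensitivity_value <= threshold:
        return interval
    return (low, max(low, high - step))


# Hypothetical linear scorer standing in for the bond rating model.
model_weights = [0.5, 0.25, 0.25, 0.05]


def score(x):
    return sum(w * xi for w, xi in zip(model_weights, x))


samples = [[0.6, 0.5, 0.6, 0.0], [0.5, 0.6, 0.5, 0.1]]
s = sensitivity(score, samples, feature_index=0)
new_interval = tighten_interval((0.05, 0.5), s, threshold=0.04)
```

For a linear scorer the measure is simply the feature's weight times `delta`, so feature 0 exceeds the assumed threshold and its interval is narrowed before the model is retrained.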

Further, step S500 in the embodiment of the present application further includes:

step S550: obtaining a model convergence rate evaluation value in the process of constructing the bond rating model;

step S560: judging whether the convergence speed evaluation value meets a convergence threshold value;

step S570: when the convergence speed evaluation value does not meet the convergence threshold value, selecting a compensation constraint characteristic;

step S580: performing data compensation acquisition on each group of training data in the training data set through the compensation constraint characteristics, and adding compensation acquisition results to the training data set;

step S590: and constructing the bond rating model through the compensated training data set.

Specifically, after the bond rating model is obtained, its operating performance needs further evaluation; the evaluation efficiency meets the requirement only when the convergence speed of the model meets the requirement. The model convergence rate evaluation value is obtained by evaluating how quickly the model's results approach their limit during model construction, and reflects the speed of model construction. The convergence threshold is a preset value for the speed at which the model should reach convergence. When the convergence speed evaluation value does not meet the convergence threshold, the initially selected training data do not provide complete bond risk assessment data, and analysis features of the bonds in more dimensions need to be acquired so that the model can converge quickly. The compensation constraint characteristic is a characteristic for assessing bond risk, other than the data characteristics already in the training data set, that is selected empirically after the convergence rate evaluation value has been analyzed. Corresponding compensation data are acquired from big data according to the compensation constraint characteristics, and the compensation acquisition results are added to the corresponding training data. After the training data have been supplemented, the bond rating model is constructed from the compensated training data set. The model is thus optimized from the perspective of convergence speed: its convergence speed is improved, and its accuracy is improved further.
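Steps S550 to S580 can be illustrated with the sketch below. The application does not define the convergence rate evaluation value, so the sketch assumes it is the number of epochs needed before successive losses differ by less than a tolerance, with a smaller number being better. The compensation feature `debt_ratio`, the `company_id` key and all numbers are hypothetical.

```python
def convergence_speed(loss_history, tol=1e-3):
    """Epochs needed until successive losses differ by less than tol.

    Returns len(loss_history) if that never happens.
    """
    for epoch in range(1, len(loss_history)):
        if abs(loss_history[epoch - 1] - loss_history[epoch]) < tol:
            return epoch
    return len(loss_history)


def add_compensation(training_set, compensation):
    """Return new records extended with compensation features per company."""
    return [{**row, **compensation.get(row["company_id"], {})}
            for row in training_set]


loss_history = [0.9, 0.7, 0.6, 0.55, 0.5495]   # hypothetical training losses
convergence_threshold = 3                      # assumed: at most 3 epochs
speed = convergence_speed(loss_history)
meets_threshold = speed <= convergence_threshold

training_set = [{"company_id": "c1", "asset_scale": 0.8},
                {"company_id": "c2", "asset_scale": 0.3}]
if not meets_threshold:
    # Hypothetical compensation feature gathered for each company.
    compensation = {"c1": {"debt_ratio": 0.4}, "c2": {"debt_ratio": 0.7}}
    training_set = add_compensation(training_set, compensation)
```

The model would then be rebuilt on the extended records, as in step S590.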

Further, step S590 in this embodiment of the present application further includes:

step S591: performing sample richness evaluation on sample data in the training data set to generate a sample richness evaluation result;

step S592: sample amount evaluation is carried out on the sample data in the training data set, and a sample total amount evaluation result is generated;

step S593: optimizing sample data in the training data set according to the sample richness evaluation result and the sample total evaluation result to obtain an optimized training data set;

step S594: and constructing the bond rating model through the optimized training data set.

Specifically, when training data are acquired, the choice of samples has a great influence on the reliability and accuracy of the data, so the quality of the sample data needs to be evaluated. The samples are analyzed and evaluated mainly from two angles, richness and sample size. Sample richness refers to the diversity of sample types; optionally, the sample types include the company's business scope, the company's size, the price of the issued bonds, and so on. The more sample types there are, the richer the samples and the more varied the bond risk assessment situations they can reflect. Sample size refers to the number of samples taken; the larger the total number of samples, the better accidental errors can be avoided and the quality of the data ensured. The sample data are evaluated according to the sample richness evaluation result and the sample total evaluation result, and if the evaluation fails, the data are optimized in a targeted way by increasing the number of samples and the sample types accordingly to obtain an optimized training data set. The bond rating model is then constructed from the optimized training data set.
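A minimal sketch of the two evaluations in steps S591 and S592 is given below. It assumes richness is measured as the number of distinct (business scope, size band) combinations and sample size as a plain count; the field names, the minimum values and the function `evaluate_samples` are illustrative assumptions.

```python
def evaluate_samples(samples, min_categories, min_total):
    """Report sample richness and total amount against assumed minimums."""
    categories = {(s["business_scope"], s["size_band"]) for s in samples}
    return {
        "richness": len(categories),
        "total": len(samples),
        "needs_more_categories": len(categories) < min_categories,
        "needs_more_samples": len(samples) < min_total,
    }


# Hypothetical sample descriptors.
samples = [
    {"business_scope": "manufacturing", "size_band": "large"},
    {"business_scope": "manufacturing", "size_band": "large"},
    {"business_scope": "real_estate", "size_band": "medium"},
]
report = evaluate_samples(samples, min_categories=3, min_total=3)
```

Here the count is sufficient but only two distinct sample types are present, so the targeted fix would be to add samples of new types rather than more of the same.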

step S595: judging whether the sample amount in the training data set is insufficient;

step S596: when the sample is insufficient, calling the encrypted sample data;

step S597: and completing the construction of the bond rating model through the encrypted sample data and the training data.

Specifically, bond risk assessment samples are obtained based on big data, and the training data set is obtained after data acquisition is carried out on the samples. The data in the training data set is then analyzed to judge whether the number of samples meets the training requirement; if it does not, additional encrypted data is acquired. The encrypted sample data is retrieved by obtaining a calling instruction, and is sample evaluation data that can be obtained only by decryption. Combining the encrypted sample data with the training data increases the number of samples, which increases the learning data of the model and thereby improves the accuracy of the model.
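As a non-limiting illustration, steps S595 to S597 might be sketched as follows. The application does not name a cipher or a storage format, so the XOR routine below is only a toy stand-in for real decryption, and the JSON encoding, the `min_total` threshold and the function names are assumptions of this example.

```python
import json


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for a real cipher; XOR with the same key is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def supplement_training_set(training, encrypted_blob, decrypt, min_total):
    """Append decrypted samples only when the training set is judged insufficient."""
    if len(training) >= min_total:
        return list(training)                      # enough samples, nothing is called
    extra = json.loads(decrypt(encrypted_blob).decode("utf-8"))
    return list(training) + extra                  # training data plus decrypted samples


key = b"demo-key"
extra_samples = [{"asset_scale": 5.0, "risk_rating": "AA"}]
blob = xor_bytes(json.dumps(extra_samples).encode("utf-8"), key)

training = [{"asset_scale": 3.2, "risk_rating": "A"}]
merged = supplement_training_set(training, blob, lambda b: xor_bytes(b, key), min_total=2)
unchanged = supplement_training_set(training, blob, lambda b: xor_bytes(b, key), min_total=1)
```

Passing the decryption routine as a parameter keeps the sufficiency check independent of whichever encryption scheme is actually used.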

In summary, the machine learning-based bond risk assessment method provided by the application has the following technical effects:

1. Data information related to bond risk assessment is screened from big data to form a training data set. The asset scale information, sales information, credit rating information and surplus information in the training data set are taken as input data and fed into an input layer, the risk rating identification information is taken as supervision data, and the initial constraint result is taken as the hidden layer calculation weight constraint condition. The input layer, the hidden layer and an output layer are combined to form an intelligent bond rating model, which evaluates the basic information of a target company to obtain the bond risk rating result of the target company. This achieves the technical effects of intelligently evaluating bond risk and improving evaluation efficiency and accuracy.

2. The sensitivity of the model is tested, and abnormal bond risk rating results, namely rating results for which the model output differs greatly from the risk rating identification information, are obtained through comprehensive analysis of the test results and the deviation test results. Common features are extracted from the test data corresponding to the abnormal rating results, and the initial weight proportion constraint interval is then optimized according to the extracted common features to obtain an optimized weight proportion constraint interval, whereby the rating model is further optimized. This achieves the technical effects of improving the stability of the rating model and further improving the adaptation range and accuracy of the model.

Example two

Based on the same inventive concept as the machine learning-based bond risk assessment method in the foregoing embodiment, as shown in fig. 4, the present application further provides a machine learning-based bond risk assessment system, where the system includes:

the basic information acquisition module 11 is used for acquiring basic information of a target company, wherein the basic information comprises asset scale information, sales information, credit rating information and surplus information;

the training data determination module 12 is configured to perform data screening identification based on big data, and determine a training data set based on a screening identification result, where each set of data forming the training data set includes asset scale information, sales information, credit rating information, surplus information, and risk rating identification information;

the identification sorting module 13 is configured to perform identification sorting on the average value of the data proportion weights in each group of data in the training data set to obtain a sequential sorting result;

the weight constraint module 14 is configured to construct an initial weight proportion constraint interval, and perform weight constraint on the sequential ordering result based on the initial weight proportion constraint interval to obtain an initial constraint result;

a rating model building module 15, where the rating model building module 15 is configured to use asset scale information, sales information, credit rating information, and surplus information in the training data set as input data, use risk rating identification information as supervision data, use the initial constraint result as a hidden layer calculation weight constraint condition, and build a bond rating model;

and the rating result output module 16 is configured to input the basic information into the constructed bond rating model, and obtain a rating output result.
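As a non-limiting illustration, the work of the identification sorting module 13 and the weight constraint module 14 might be sketched as follows. The application does not give the bounds of the initial weight proportion constraint interval or the exact constraint rule, so the interval `[0.1, 0.4]`, the clipping rule and the example weights are assumptions of this example only.

```python
def order_and_constrain(mean_weights, low=0.1, high=0.4):
    """Sort features by mean proportion weight (descending), then clip each to [low, high]."""
    ordered = sorted(mean_weights.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, min(max(weight, low), high)) for name, weight in ordered]


# Hypothetical mean proportion weights of the four input features.
mean_weights = {"surplus": 0.05, "sales": 0.30, "asset_scale": 0.45, "credit_rating": 0.20}
constrained = order_and_constrain(mean_weights)
```

Here the largest weight is pulled down to the upper bound and the smallest is raised to the lower bound, so no single input feature dominates or vanishes when the result is used as a constraint condition during model construction.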

Further, the system further comprises:

the test data construction unit is used for constructing a test data set based on big data;

the model testing unit is used for performing model testing on the bond rating model through the testing data set and outputting a testing result;

the accuracy judging unit is used for judging whether the accuracy of the output test result meets an expected threshold value or not;

and the model optimization unit is used for continuously constructing and optimizing the bond rating model when the accuracy of the output test result cannot meet the expected threshold until the accuracy of the output test result of the bond rating model meets the expected threshold, so that the construction of the bond rating model is completed.
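As a non-limiting illustration, the cooperation of the model testing unit, the accuracy judging unit and the model optimization unit might be sketched as a loop. The callables `train_step` and `evaluate`, the threshold value and the `max_rounds` cap are assumptions of this example; the application only states that construction continues until the expected threshold is met.

```python
def build_until_accurate(train_step, evaluate, threshold=0.8, max_rounds=50):
    """Keep constructing and optimizing the model until test accuracy meets the threshold."""
    model = None
    for round_no in range(1, max_rounds + 1):
        model = train_step(model)        # one further round of construction/optimization
        accuracy = evaluate(model)       # accuracy on the test data set
        if accuracy >= threshold:
            return model, accuracy, round_no
    raise RuntimeError("expected accuracy threshold not reached within max_rounds")


# Toy stand-ins: the "model" is a round counter and accuracy grows by 0.2 per round.
model, accuracy, rounds = build_until_accurate(
    train_step=lambda m: (m or 0) + 1,
    evaluate=lambda m: m / 5,
)
```

The cap on rounds is added so that the sketch terminates even if the threshold is never met.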

Further, the system further comprises:

the abnormal rating obtaining unit is used for integrating deviation test results based on the output test results to obtain an abnormal rating set;

the common characteristic integration unit is used for obtaining test data information corresponding to the abnormal rating set, performing common characteristic integration on the test data information based on the test data set and generating a common characteristic integration result;

a sensitivity analysis unit for obtaining a sensitivity analysis result of the bond rating model based on the common feature integration result;

and the optimization processing unit is used for optimizing the bond rating model according to the sensitivity analysis result.
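As a non-limiting illustration, obtaining the abnormal rating set and integrating common features might be sketched as follows. The rating scale, the two-notch deviation criterion `min_gap`, the share threshold `min_share` and the field names are assumptions of this example; the application does not define how a deviation or a common feature is measured.

```python
from collections import Counter

SCALE = ["AAA", "AA", "A", "BBB", "BB", "B"]  # hypothetical ordinal rating scale


def abnormal_ratings(records, min_gap=2):
    """Keep test records whose predicted rating deviates from the label by >= min_gap notches."""
    idx = {rating: i for i, rating in enumerate(SCALE)}
    return [r for r in records if abs(idx[r["predicted"]] - idx[r["label"]]) >= min_gap]


def common_features(abnormal, keys, min_share=0.6):
    """Return feature values shared by at least min_share of the abnormal records."""
    shared = {}
    if not abnormal:
        return shared
    for key in keys:
        value, count = Counter(r[key] for r in abnormal).most_common(1)[0]
        if count / len(abnormal) >= min_share:
            shared[key] = value
    return shared


records = [
    {"predicted": "AAA", "label": "A", "industry": "real_estate", "size": "small"},
    {"predicted": "AA", "label": "AA", "industry": "energy", "size": "large"},
    {"predicted": "A", "label": "BB", "industry": "real_estate", "size": "large"},
    {"predicted": "BBB", "label": "BB", "industry": "retail", "size": "small"},
]
abnormal = abnormal_ratings(records)
shared = common_features(abnormal, keys=["industry", "size"])
```

In this toy data both abnormal records belong to the same industry, which is the kind of common feature that could then guide optimization of the rating model.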

Further, the system further comprises:

the sensitivity judgment unit is used for judging whether the sensitivity analysis result meets a preset sensitivity threshold value or not;

a constraint interval optimization unit, configured to, when the sensitivity analysis result does not satisfy the preset sensitivity threshold, perform initial weight proportion constraint interval optimization based on the common feature integration result to obtain an optimized weight proportion constraint interval;

the incremental optimization unit is used for performing incremental optimization on the bond rating model based on the optimized weight proportion constraint interval to generate an incrementally optimized bond rating model;

and the data processing unit is used for processing, based on the incrementally optimized bond rating model, data that contains the features of the common feature integration result.

Further, the system further comprises:

an evaluation value obtaining unit for obtaining a model convergence speed evaluation value in the process of constructing the bond rating model;

a convergence speed determination unit for determining whether the convergence speed evaluation value satisfies a convergence threshold value;

a constraint feature selection unit, configured to select a compensation constraint feature when the convergence speed evaluation value does not satisfy the convergence threshold value;

the data compensation acquisition unit is used for performing data compensation acquisition on each group of training data in the training data set through the compensation constraint characteristics and adding a compensation acquisition result to the training data set;

and the rating model construction unit is used for constructing the bond rating model through the compensated training data set.
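As a non-limiting illustration, the convergence speed evaluation and the data compensation acquisition might be sketched as follows. The application does not define the convergence speed evaluation value, so the mean relative loss decrease used here, the threshold of 0.05, the compensation feature `debt_ratio` and the lookup table are assumptions of this example only.

```python
def convergence_speed(losses, window=3):
    """Mean relative loss decrease per epoch over the last `window` epochs."""
    recent = losses[-(window + 1):]
    drops = [(prev - cur) / prev for prev, cur in zip(recent, recent[1:])]
    return sum(drops) / len(drops)


def compensate(training, feature, lookup):
    """Add the compensation feature to every group of training data."""
    return [{**row, feature: lookup[row["company"]]} for row in training]


losses = [1.0, 0.5, 0.49, 0.485, 0.48]          # training loss per epoch (toy values)
speed = convergence_speed(losses)

training = [{"company": "A", "asset_scale": 3.2}, {"company": "B", "asset_scale": 7.9}]
debt_ratio = {"A": 0.61, "B": 0.35}             # hypothetical compensation data source

if speed < 0.05:                                 # convergence threshold not satisfied
    training = compensate(training, "debt_ratio", debt_ratio)
```

The original rows are left unmodified and new rows are returned, so the compensated training data set can be compared against the uncompensated one.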

Further, the system further comprises:

the richness evaluation unit is used for evaluating the sample richness of the sample data in the training data set to generate a sample richness evaluation result;

the sample size evaluation unit is used for evaluating the sample size of the sample data in the training data set to generate a sample total evaluation result;

the sample data optimization unit is used for optimizing the sample data in the training data set according to the sample richness evaluation result and the sample total evaluation result to obtain an optimized training data set;

and the optimization model construction unit is used for constructing the bond rating model through the optimized training data set.

Further, the system further comprises:

the sample size judging unit is used for judging whether the sample size in the training data set is insufficient;

the sample data calling unit is used for calling the encrypted sample data when the sample is insufficient;

and the bond model construction unit is used for completing construction of the bond rating model through the encrypted sample data and the training data.

In this specification, the embodiments are described in a progressive manner, each focusing on its differences from the others. The machine learning-based bond risk assessment method and the specific examples of the first embodiment in fig. 1 also apply to the machine learning-based bond risk assessment system of this embodiment. From the foregoing detailed description of the method, those skilled in the art can clearly understand the system of this embodiment, so it is not described in detail here for brevity. Since the device disclosed in this embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and reference may be made to the description of the method for the relevant points.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A machine learning-based bond risk assessment method, the method comprising:

acquiring basic information of a target company, wherein the basic information comprises asset scale information, sales information, credit rating information and surplus information;

performing data screening identification based on big data, and determining a training data set based on screening identification results, wherein each group of data forming the training data set comprises asset scale information, sales information, credit rating information, surplus information and risk rating identification information;

identifying and sorting the average value of the data proportion weights in each group of data of the training data set to obtain a sequential sorting result;

constructing an initial weight ratio constraint interval, and performing weight constraint on the sequential ordering result based on the initial weight ratio constraint interval to obtain an initial constraint result;

taking the asset scale information, the sales information, the credit rating information and the surplus information in the training data set as input data, taking the risk rating identification information as supervision data, taking the initial constraint result as a hidden layer to calculate a weight constraint condition, and constructing a bond rating model;

and inputting the basic information into the constructed bond rating model to obtain a rating output result.

2. The method of claim 1, wherein the method further comprises:

constructing a test data set based on the big data;

performing model test on the bond rating model through the test data set, and outputting a test result;

judging whether the accuracy of the output test result meets an expected threshold value;

and when the accuracy of the output test result cannot meet the expected threshold, continuing to construct and optimize the bond rating model until the accuracy of the output test result of the bond rating model meets the expected threshold, and completing the construction of the bond rating model.

3. The method of claim 2, wherein the method further comprises:

performing deviation test result integration based on the output test result to obtain an abnormal rating set;

obtaining test data information corresponding to the abnormal rating set, and performing common feature integration on the test data information based on the test data set to generate a common feature integration result;

obtaining sensitivity analysis results of the bond rating model based on the common feature integration results;

and optimizing the bond rating model according to the sensitivity analysis result.

4. The method of claim 3, wherein the method further comprises:

judging whether the sensitivity analysis result meets a preset sensitivity threshold value or not;

when the sensitivity analysis result does not meet the preset sensitivity threshold, optimizing the initial weight ratio constraint interval based on the common characteristic integration result to obtain an optimized weight ratio constraint interval;

performing incremental optimization on the bond rating model based on the optimized weight ratio constraint interval to generate an incrementally optimized bond rating model;

and processing, based on the incrementally optimized bond rating model, data that contains the features of the common feature integration result.

5. The method of claim 1, wherein the method further comprises:

obtaining a model convergence rate evaluation value in the process of constructing the bond rating model;

judging whether the convergence speed evaluation value meets a convergence threshold value;

when the convergence speed evaluation value does not meet the convergence threshold value, selecting a compensation constraint characteristic;

performing data compensation acquisition on each group of training data in the training data set through the compensation constraint characteristics, and adding compensation acquisition results to the training data set;

and constructing the bond rating model through the compensated training data set.

6. The method of claim 1, wherein the method further comprises:

performing sample richness evaluation on the sample data in the training data set to generate a sample richness evaluation result;

performing sample size evaluation on the sample data in the training data set to generate a sample total amount evaluation result;

optimizing sample data in the training data set according to the sample richness evaluation result and the sample total evaluation result to obtain an optimized training data set;

and constructing the bond rating model through the optimized training data set.

7. The method of claim 1, wherein the method further comprises:

judging whether the sample amount in the training data set is insufficient;

when the sample is insufficient, calling the encrypted sample data;

and completing the construction of the bond rating model through the encrypted sample data and the training data.

8. A machine learning-based bond risk assessment system, the system comprising:

the basic information acquisition module is used for acquiring basic information of a target company, wherein the basic information comprises asset scale information, sales information, credit rating information and surplus information;

the training data determining module is used for carrying out data screening identification based on big data and determining a training data set based on screening identification results, wherein each group of data forming the training data set comprises asset scale information, sales information, credit rating information, surplus information and risk rating identification information;

the identification sorting module is used for identifying and sorting the average value of the data proportion weights in each group of data of the training data set to obtain a sequential sorting result;

the weight constraint module is used for constructing an initial weight proportion constraint interval, and carrying out weight constraint on the sequential ordering result based on the initial weight proportion constraint interval to obtain an initial constraint result;

the rating model building module is used for taking the asset scale information, the sales information, the credit rating information and the surplus information in the training data set as input data, taking the risk rating identification information as supervision data, taking the initial constraint result as a hidden layer calculation weight constraint condition, and building a bond rating model;

and the rating result output module is used for inputting the basic information into the constructed bond rating model to obtain a rating output result.