CN117575757A - Background data monitoring method and system of scoring card model - Google Patents


Info

Publication number
CN117575757A
Authority
CN
China
Prior art keywords
data
sub
bad
card model
data source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311450742.XA
Other languages
Chinese (zh)
Inventor
王世今
龙泳先
孙冬琦
杨磊磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Smart Co Ltd Beijing Technology Co ltd
Original Assignee
Smart Co Ltd Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Smart Co Ltd Beijing Technology Co ltd
Priority to CN202311450742.XA
Publication of CN117575757A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0639: Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393: Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/01: Customer relationship services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Educational Administration (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a background data monitoring method and system for a scoring card model. The method comprises the following steps: collecting variable information from a third-party data platform, where the variable information covers the sub-data sources and each data source corresponding to the client institution, scoring product, model version and time point; mapping the collected special sub-scores of the sub-data sources and the sub-data source variables; calculating six-dimensional indexes, so that the scoring card model monitors the third-party data in real time along six dimensions; visualizing the six-dimensional indexes with Tableau software; and precisely locating data production faults. The system comprises a data source module, a data preprocessing module, a six-dimensional index calculation module, a visualization module and a precise positioning module. The invention monitors every dimension of the third-party data in real time, across all aspects, angles and time spans, so that changes in production calls to the data sources are detected promptly and each business line keeps running normally and stably.

Description

Background data monitoring method and system of scoring card model
Technical Field
The invention relates to the technical field of data information monitoring, in particular to a background data monitoring method and system of a scoring card model.
Background
The scoring card model is a risk control model. In the decision flow of a business system, a risk control model provides business decision makers with data support and a basis for decisions: it identifies, classifies and gives early warning of potential business risks, automatically evaluates and quantifies the risk details and risk grade of a business object along multiple dimensions, and supports statistics and analysis of risk trends, thereby minimizing the enterprise's risk cost. A big-data risk control model is generally built as follows: starting from historical feature data on all aspects of the business object, data processing and feature engineering are performed first, and a machine learning algorithm is then used to build a risk label classification model. Finally, the score of each feature item and the overall score grade of the business object are output in the form of a scoring card.
As industries deepen their understanding of big data, its strategic significance keeps growing, and how far a strategy can be carried out depends heavily on product stability. As products diversify, traditional database monitoring cannot promptly discover potential problems while the system is running, which easily leads to loss of database data and harms the enterprise. The work of operation and maintenance staff also becomes more complex and voluminous, resulting in incomplete monitoring and low efficiency.
Prior art 1, CN202011378824.4, discloses a model-based credit assessment method, device and storage medium for small and micro enterprises. Based on known logical relationships and the conditions of various industries, the method builds a quantitative model and a qualitative model on a preset decision configuration structure. It thereby performs credit assessment of small and micro enterprises, improves the efficiency of that assessment, and finally outputs, for an enterprise applying for small and micro credit, recommendations on whether to approve the credit, the interest rate and the credit limit.
Prior art 2, CN202010750308.3, discloses a credit risk control model generation method and device, a scoring card generation method, a machine-readable medium and equipment. The method comprises: performing feature engineering on the original attribute data of a credit business object using a pre-trained GBDT model whose base classifiers have a maximum depth of 1; and training an LR scoring card model on the feature-engineered data and using it as the credit risk control model. Because a gradient boosting tree whose maximum depth is limited to 1 can degenerate into a linear model, the LR scoring card model is trained automatically and end to end, with feature screening and continuous-variable binning whose predictive performance is markedly better than that of heuristic rules, so that a linear, interpretable, high-performance machine learning classification model is trained automatically and end to end.
Prior art 3, CN201810810972.5, discloses a taxpayer credit assessment method with distributed automatic feature combination, comprising the following steps: 1) training a random forest model on training samples using the MapReduce distributed computing framework to obtain a distributed random forest model; 2) inputting the training samples into the distributed random forest model to generate a plurality of combined features for each training sample; 3) merging the generated combined features with the feature information of the corresponding taxpayer; 4) training a scoring card model on the merged features; 5) for a taxpayer to be assessed, generating the taxpayer's combined features with the distributed random forest model, merging them with the taxpayer's feature information, inputting the result into the trained scoring card model, and predicting the taxpayer's credit score. The method enables accurate credit assessment of taxpayers.
In prior art 1, 2 and 3, background data monitoring cannot promptly discover potential problems while the system is running, and database data is easily lost. In addition, the data is complex and voluminous, which increases the workload of operation and maintenance staff and makes monitoring incomplete and inefficient. Therefore, to better secure product stability, the invention imports the relevant business data into a visual monitoring system and draws conclusions from different indexes and data, so that problems are found in advance and solved within a controllable time range.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a background data monitoring method and system for a scoring card model, the method comprising the following steps:
collecting variable information from a third-party data platform;
mapping the collected special sub-scores of the sub-data sources and the sub-data source variables;
calculating six-dimensional indexes, namely the Kolmogorov-Smirnov (KS) test value of the score, the receiver operating characteristic (ROC) curve, the weight of evidence (WOE) of the variables, the information value (IV), the proportion of each bin, and the population stability index (PSI) of the score, so that the scoring card model monitors the third-party data in real time along each dimension;
visualizing the six-dimensional indexes with Tableau software, so that problems are easy to find;
accurately monitoring changes in the sub-data sources of each third-party data platform used by different products for different customers in different time periods, so as to precisely locate data production faults.
Optionally, the variable information covers the sub-data sources and each data source corresponding to the client institution, scoring product, model version and time point.
Optionally, the special sub-scores are divided into a cumulative bad proportion curve and a receiver operating characteristic curve, and the cause of an abnormality in the scoring card model is analyzed from the bottom layer according to whether the values of the cumulative bad proportion curve and the receiver operating characteristic curve of the sub-data source are abnormal and how the variables fluctuate from month to month.
Optionally, the six-dimensional indexes are calculated as follows:
Kolmogorov-Smirnov test value KS: measures how well the scoring card model separates good customers from bad customers, with the formula:
$$KS = \max\left\{\left|\operatorname{cum}(bad_{rate}) - \operatorname{cum}(good_{rate})\right|\right\}$$
where $bad_{rate}$ denotes the bad customers of the scoring card model, $good_{rate}$ denotes the good customers of the scoring card model, $\operatorname{cum}(bad_{rate})$ is the ratio of accumulated bad customers to total bad customers, $\operatorname{cum}(good_{rate})$ is the ratio of accumulated good customers to total good customers, $\left|\operatorname{cum}(bad_{rate}) - \operatorname{cum}(good_{rate})\right|$ is the absolute value of the difference between these two ratios, and max takes the maximum of that absolute value;
receiver operating characteristic curve ROC: describes the proportion of accumulated bad customers at a given proportion of accumulated good customers;
information value IV: measures the information value of a variable, with the formula:
$$IV = \sum_{i}\left(\frac{Bad_i}{Bad_T} - \frac{Good_i}{Good_T}\right)\ln\left(\frac{Bad_i / Bad_T}{Good_i / Good_T}\right)$$
where $i$ indexes the groups of the variable, $Bad_i$ is the number of bad customers in the $i$-th group, $Bad_T$ is the total number of bad customers, $Good_i$ is the number of good customers in the $i$-th group, $Good_T$ is the total number of good customers, and ln is the natural logarithm.
Optionally, the weight of evidence WOE of a variable: calculates the good/bad proportion of each bin of the variable, where $WOE_i$ is calculated as:
$$WOE_i = \ln\left(\frac{Bad_i / Bad_T}{Good_i / Good_T}\right)$$
population stability index PSI of the score, calculated as:
$$PSI = \sum_{i}\left(validation_i - development_i\right)\ln\left(\frac{validation_i}{development_i}\right)$$
where development is the expected population score and validation is the actual population score, each expressed as the share of the population falling in score bin $i$;
proportion of each bin: the share of the total accounted for by each bin of the variable.
Optionally, visualizing the six-dimensional indexes specifically involves two parts:
the first part is a visual legend;
the second part is a visual report;
the visual legend shows the KS test values and ROC curves of the selected scoring card model at each client institution, using the RGB colors (242, 142, 43) and (78, 161, 167) from top to bottom; and a line chart of the PSI of the selected scoring card model at each client institution, using the RGB colors (242, 142, 43), (78, 161, 167), (225, 87, 89) and (120, 182, 178) from top to bottom;
the visual report shows a line chart of the IV and PSI of the selected sub-data source variable at each client institution, using the RGB colors (242, 142, 43), (78, 161, 167), (225, 87, 89) and (120, 182, 178) from top to bottom; the change of the bin proportions of the selected sub-data source variable by month, using the RGB colors (78, 121, 167), (242, 142, 43), (225, 88, 87), (118, 183, 178), (89, 161, 79), (237, 200, 66) and (176, 122, 161) from top to bottom; the score variation of the selected sub-data source variable, using the RGB color numbers 255, 0, 85 and 170, 0 from left to right; and the WOE trend of the selected sub-data source variable at each client institution, using the RGB colors (0, 85, 0) and (170, 0, 0) from top to bottom.
Optionally, the screener filters by time, third-party data platform, product service code, product version, client institution and call volume, so as to accurately monitor changes in the sub-data sources of each third-party data platform used by different products for different customers in different time periods, thereby precisely locating data production faults.
Optionally, precisely locating data production faults comprises adjusting the screener to accurately monitor changes in the sub-data sources and sub-score variables that each scoring card model uses for different products and different customers in different time periods.
Optionally, visualizing the six-dimensional indexes specifically comprises:
receiving a visualization request from the background monitoring of the scoring card model, and converting the six-dimensional indexes into a visual legend and a visual report;
displaying the visual legend and the visual report on a display device;
recognizing the manager's gestures to zoom, pan and rotate the visual legend and the visual report, so that the six-dimensional indexes are presented at the viewing angle that best suits the manager.
The invention also provides a background data monitoring system for a scoring card model, comprising:
the data source module, used for collecting variable information from the third-party data platform;
the data preprocessing module, used for mapping the collected special sub-scores of the sub-data sources and the sub-data source variables;
the six-dimensional index calculation module, used for calculating the six-dimensional indexes, namely the Kolmogorov-Smirnov (KS) test value of the score, the receiver operating characteristic (ROC) curve, the weight of evidence (WOE) of the variables, the information value (IV), the proportion of each bin, and the population stability index (PSI) of the score, so that the scoring card model monitors the third-party data in real time along each dimension;
the visualization module, used for visualizing the six-dimensional indexes with Tableau software, so that problems are easy to find;
and the precise positioning module, whose screener filters by time, third-party data platform, product service code, product version, client institution and call volume, so as to accurately monitor changes in the sub-data sources of each third-party data platform used by different products for different customers in different time periods, thereby precisely locating data production faults.
Because a financial institution's decisions depend on third-party data obtained externally, a production fault in that data affects the real-time decisions of the calling institution, so insight into differences and changes in the information returned by data sources is important for business development. The background data monitoring method of the invention monitors the third-party data in real time along six dimensions and focuses on the variables that the scoring model needs online. It can therefore distinguish the technical indexes of different sub-scores and data versions in third-party data calls and monitor them across all aspects, angles and time spans, so that changes in production calls to the data sources are detected promptly and each business line keeps running normally and stably. As a result, data production incidents can be found and handled within a controllable time range, the scoring card model product runs normally and stably at each client institution, and the efficiency of monitoring and management staff is greatly improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flowchart of a background data monitoring method of a scoring card model in an embodiment of the invention;
FIG. 2 is a schematic diagram of a visual legend according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a visual report 1 according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a visual report 2 according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the sub-data sources that a scoring card model uses for different products and different customers in different time periods in an embodiment of the present invention;
FIG. 6 is a schematic diagram of the sub-score variables that a scoring card model uses for different products and different customers in different time periods in an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a background data monitoring system of a scoring card model in an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims. The specific meaning of the terms in this application will be understood by those of ordinary skill in the art as the case may be.
Example 1
As shown in FIG. 1, an embodiment of the present invention provides a background data monitoring method for a scoring card model, comprising the following steps:
S100: collecting variable information from a third-party data platform, where the variable information covers the sub-data sources and each data source corresponding to the client institution, scoring product, model version and time point;
S200: mapping the collected special sub-scores of the sub-data sources and the sub-data source variables;
S300: calculating six-dimensional indexes, namely the Kolmogorov-Smirnov (KS) test value of the score, the receiver operating characteristic (ROC) curve, the weight of evidence (WOE) of the variables, the information value (IV), the proportion of each bin, and the population stability index (PSI) of the score, so that the scoring card model monitors the third-party data in real time along each dimension;
S400: visualizing the six-dimensional indexes with Tableau software, so that problems are easy to find;
S500: filtering with a screener by time, third-party data platform, product service code, product version, client institution and call volume, so as to accurately monitor changes in the sub-data sources of each third-party data platform used by different products for different customers in different time periods, thereby precisely locating data production faults.
The working principle and beneficial effects of the above technical scheme are as follows: because a financial institution's decisions depend on third-party data obtained externally, a production fault in that data affects the real-time decisions of the calling institution, so insight into differences and changes in the information returned by data sources is important for business development. The background data monitoring method of the invention monitors the third-party data in real time along six dimensions and focuses on the variables that the scoring model needs online. It can therefore distinguish the technical indexes of different sub-scores and data versions in third-party data calls and monitor them across all aspects, angles and time spans, so that changes in production calls to the data sources are detected promptly and each business line keeps running normally and stably. As a result, data production incidents can be found and handled within a controllable time range, the scoring card model product runs normally and stably at each client institution, and the efficiency of monitoring and management staff is greatly improved.
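The screening in step S500 can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the `CallRecord` fields, the sample platform and institution names, and the rule that a sub-data source with zero calls in a month deserves a closer look are not taken from the patent, which performs the filtering inside Tableau.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    """One aggregated row of production calls (hypothetical schema)."""
    month: str            # time unit, e.g. "2023-09"
    platform: str         # third-party data platform
    product_code: str     # product service code
    product_version: str  # product version
    institution: str      # client institution
    sub_source: str       # sub-data source that was called
    calls: int            # call volume


def screen(records, **criteria):
    """Keep the records whose fields equal every given criterion."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in criteria.items())]


def monthly_call_volume(records):
    """Sum the call volume per (sub-data source, month)."""
    totals = defaultdict(int)
    for r in records:
        totals[(r.sub_source, r.month)] += r.calls
    return dict(totals)


records = [
    CallRecord("2023-08", "platform_a", "P01", "v1", "bank_x", "src_1", 1200),
    CallRecord("2023-09", "platform_a", "P01", "v1", "bank_x", "src_1", 1150),
    CallRecord("2023-09", "platform_a", "P01", "v1", "bank_x", "src_2", 0),
    CallRecord("2023-09", "platform_b", "P02", "v2", "bank_y", "src_1", 300),
]

# Narrow the view to one platform and one client institution.
selected = screen(records, platform="platform_a", institution="bank_x")
volume = monthly_call_volume(selected)

# Illustrative rule: a sub-data source with no calls in a month is suspicious.
silent = sorted(key for key, total in volume.items() if total == 0)
print(silent)
```

Narrowing one filter at a time in this way mirrors how the screener is described: the same records are viewed by platform, product, version, institution and month until the sub-data source behind a fault stands out.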
Example 2
On the basis of Embodiment 1, in step S200 the special sub-scores are divided into a cumulative bad proportion curve (KS) and a receiver operating characteristic curve (ROC). The cause of an abnormality in the scoring card model is analyzed from the bottom layer according to whether the KS and ROC values of the sub-data source are abnormal and how the variables fluctuate from month to month.
The working principle and beneficial effects of the above technical scheme are as follows: the invention maps the collected special sub-scores of the sub-data sources and the sub-data source variables, where the special sub-scores are divided into a cumulative bad proportion curve (KS) and a receiver operating characteristic curve (ROC). Judging whether these are abnormal provides a reference for monitoring the third-party data and thus ensures the accuracy of the background data monitoring of the scoring card model.
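The month-by-month check described in this embodiment could look like the following Python sketch. The embodiment does not say how an abnormal value is detected, so the rule used here (flag a month whose KS moved by more than `max_change` relative to the previous month), the threshold of 0.05 and the sample values are all assumptions for illustration.

```python
def flag_fluctuations(monthly_values, max_change=0.05):
    """Return the months whose value moved more than max_change
    compared with the previous month (hypothetical rule)."""
    months = sorted(monthly_values)  # "YYYY-MM" strings sort chronologically
    flagged = []
    for previous, current in zip(months, months[1:]):
        if abs(monthly_values[current] - monthly_values[previous]) > max_change:
            flagged.append(current)
    return flagged


# Made-up monthly KS values of one sub-data source.
ks_by_month = {"2023-06": 0.41, "2023-07": 0.40, "2023-08": 0.28, "2023-09": 0.29}
flagged = flag_fluctuations(ks_by_month)
print(flagged)
```

The same function can be applied to monthly ROC summaries or to a variable's monthly statistics, since it only compares consecutive months.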
Example 3
On the basis of Embodiment 1, the six-dimensional indexes in step S300 are calculated at monthly granularity, and each index is calculated as follows:
Kolmogorov-Smirnov test value KS: in financial risk control, the KS value measures how well the scoring card model separates good customers from bad customers, with the formula:
$$KS = \max\left\{\left|\operatorname{cum}(bad_{rate}) - \operatorname{cum}(good_{rate})\right|\right\}$$
where $bad_{rate}$ denotes the bad customers of the scoring card model, $good_{rate}$ denotes the good customers of the scoring card model, $\operatorname{cum}(bad_{rate})$ is the ratio of accumulated bad customers to total bad customers, $\operatorname{cum}(good_{rate})$ is the ratio of accumulated good customers to total good customers, $\left|\operatorname{cum}(bad_{rate}) - \operatorname{cum}(good_{rate})\right|$ is the absolute value of the difference between these two ratios, and max takes the maximum of that absolute value;
receiver operating characteristic curve ROC: the ROC curve describes the proportion of accumulated bad customers at a given proportion of accumulated good customers;
information value IV: measures the information value of a variable, with the formula:
$$IV = \sum_{i}\left(\frac{Bad_i}{Bad_T} - \frac{Good_i}{Good_T}\right)\ln\left(\frac{Bad_i / Bad_T}{Good_i / Good_T}\right)$$
where $i$ indexes the groups of the variable, $Bad_i$ is the number of bad customers in the $i$-th group, $Bad_T$ is the total number of bad customers, $Good_i$ is the number of good customers in the $i$-th group, $Good_T$ is the total number of good customers, and ln is the natural logarithm;
weight of evidence WOE of a variable: calculates the good/bad proportion of each bin of the variable, where $WOE_i$ is calculated as:
$$WOE_i = \ln\left(\frac{Bad_i / Bad_T}{Good_i / Good_T}\right)$$
population stability index PSI of the score, calculated as:
$$PSI = \sum_{i}\left(validation_i - development_i\right)\ln\left(\frac{validation_i}{development_i}\right)$$
where development is the expected population score and validation is the actual population score, each expressed as the share of the population falling in score bin $i$;
proportion of each bin: the share of the total accounted for by each bin of the variable.
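The KS formula and the ROC definition above translate into a short Python sketch. It assumes that lower scores mean higher risk, so customers are accumulated from the lowest score upward; the function names, the 0/1 encoding of bad customers and the sample scores are illustrative and do not come from the patent.

```python
def ks_and_roc(scores, is_bad):
    """Return (KS, ROC points) for scores and 0/1 bad flags.

    Each ROC point is (cumulative good rate, cumulative bad rate).
    Assumes a lower score means a riskier customer; customers with
    the same score are accumulated together.
    """
    total_bad = sum(is_bad)
    total_good = len(is_bad) - total_bad
    if total_bad == 0 or total_good == 0:
        raise ValueError("need at least one good and one bad customer")
    pairs = sorted(zip(scores, is_bad))
    ks, cum_bad, cum_good = 0.0, 0, 0
    roc = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            cum_bad += pairs[j][1]
            cum_good += 1 - pairs[j][1]
            j += 1
        bad_rate = cum_bad / total_bad      # cum(bad_rate)
        good_rate = cum_good / total_good   # cum(good_rate)
        ks = max(ks, abs(bad_rate - good_rate))
        roc.append((good_rate, bad_rate))
        i = j
    return ks, roc


def auc(roc):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(roc, roc[1:]))


scores = [300, 350, 400, 450, 500, 550, 600, 650]
is_bad = [1, 1, 0, 1, 0, 0, 0, 0]
ks, roc = ks_and_roc(scores, is_bad)
print(round(ks, 4), round(auc(roc), 4))  # → 0.8 0.9333
```

In this sample the largest gap occurs once all three bad customers and one of the five good customers have been accumulated, which gives 1.0 - 0.2 = 0.8. The area under the curve is included only as one common way to summarize an ROC curve as a single monthly number; the patent itself does not name it.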
The working principle and beneficial effects of the above technical scheme are as follows: the KS test value measures how well the scoring card model separates good customers from bad customers by testing whether the distributions of the two groups differ; the ROC curve describes the proportion of accumulated bad customers at a given proportion of accumulated good customers, giving accurate good and bad customer proportions and enabling close monitoring of the customer population; the information value quantifies the predictive power of a variable; the weight of evidence, computed from the good/bad proportion of each bin of a variable, helps in exploring the data and screening variables; the population stability index compares the expected population score with the actual population score and thus indicates how stable the customer population is. Together, the six-dimensional indexes provide reliable, stable and accurate data for the background data monitoring of the scoring card model.
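The bin-level indexes (WOE, IV, bin proportion and PSI) can be computed from per-bin counts, as in the Python sketch below. It assumes the commonly used forms of these indexes, with WOE taken as the logarithm of the bad share over the good share; the patent text does not fix the sign convention, and IV is the same under either one. The function names and sample counts are illustrative, and every bin is assumed to contain at least one good and one bad customer so that the logarithms are defined.

```python
import math


def woe_iv(bad_counts, good_counts):
    """Return (list of WOE per bin, IV) from per-bin bad/good counts.

    Assumed convention: WOE_i = ln((Bad_i / Bad_T) / (Good_i / Good_T)).
    """
    bad_total, good_total = sum(bad_counts), sum(good_counts)
    woe, iv = [], 0.0
    for bad, good in zip(bad_counts, good_counts):
        bad_share, good_share = bad / bad_total, good / good_total
        w = math.log(bad_share / good_share)
        woe.append(w)
        iv += (bad_share - good_share) * w
    return woe, iv


def bin_proportions(counts):
    """Share of the total accounted for by each bin."""
    total = sum(counts)
    return [c / total for c in counts]


def psi(development, validation):
    """Population stability index from expected and actual bin shares."""
    return sum((v - d) * math.log(v / d)
               for d, v in zip(development, validation))


# Made-up counts for a variable with three bins.
woe, iv = woe_iv(bad_counts=[30, 20, 10], good_counts=[100, 200, 300])
expected = bin_proportions([500, 500])   # development sample
actual = bin_proportions([600, 400])     # current month
print([round(w, 4) for w in woe], round(iv, 4), round(psi(expected, actual), 4))
```

Tracking these numbers per month and per client institution gives the series that the visual report then plots.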
Example 4
Based on embodiment 1, the visualization processing of the six-dimensional indices in step S400 provided by the embodiment of the present invention specifically includes the following parts:
the first part is a visual legend as shown in fig. 2;
the second part is a visual report as shown in fig. 3;
the third part is a visual report as shown in fig. 4.
Reference 5 in fig. 2 shows the KS and ROC values at each client institution for the selected scoring card model; from top to bottom, the RGB color values are (242, 142, 43) and (78, 161, 167). Fig. 3 shows a line chart of PSI at each client institution for the selected scoring card model, with a legend for each customer group on the right; from top to bottom, the RGB color values are (242, 142, 43), (78, 161, 167), (225, 87, 89) and (120, 182, 178). Fig. 4 shows line charts of IV and PSI at each client institution for the selected sub-data source variable; from top to bottom, the RGB color values are (242, 142, 43), (78, 161, 167), (225, 87, 89) and (120, 182, 178). Fig. 3 shows how the share of each bin changes by month, with a legend for each bin on the right; the RGB color values are (78, 121, 167), (242, 142, 43), (225, 88, 87), (118, 183, 178), (89, 161, 79), (237, 200, 66) and (176, 122, 161). Fig. 4 shows the score change for the selected sub-data source variable, from left to right and top to bottom, with RGB color values given as 0, 255. Reference 2 in fig. 4 shows the WOE trend at each client institution for the selected sub-data source variable; from top to bottom, the RGB color values are (0, 85, 0) and (170, 0, 0).
The working principle and beneficial effects of the technical scheme are as follows: the invention visualizes the six-dimensional indices as a visual legend and visual reports, so that the indices are presented comprehensively in both. This gives background managers an intuitive data basis, greatly reduces their workload, and helps improve both the customer experience and the precision of the background data monitoring results.
Example 5
Based on embodiment 1, the accurate positioning in step S500 provided by the embodiment of the present invention is implemented, as shown in fig. 5, by adjusting the filters to accurately monitor, for each scoring card model, the changes in the sub-data sources and sub-score variables of different products used by different clients in different time periods. In fig. 5, reference 1 marks the filter for the product version to be monitored, reference 2 marks the filter for the scoring cost, reference 3 marks the filter for the sub-data sources associated with the scoring card model, and reference 4 marks the filter for the time dimension, which is precise to the month.
As shown in fig. 6, the filters can be adjusted to accurately monitor changes in the sub-data sources of each third-party data platform for different products used by different clients in different time periods. Fig. 6 shows the result of a specific example: the filter at reference 1 selects the product version, the filter at reference 2 selects the sub-data source related to the scoring card model, the filter at reference 3 selects the time, for example 202004, and the filters at references 4 and 5 select the variable and the variable description. The indices show that, for the selected variable, the IV line chart rose sharply in April 2021; later communication with the data source led to the conclusion that the rise was caused by a change in the customer group.
The working principle and beneficial effects of the technical scheme are as follows: by adjusting the filters, the invention accurately monitors the changes in the sub-data sources and sub-score variables of different products used by different clients in different time periods, which makes it possible to locate data production faults precisely and further improves the efficiency of background data monitoring.
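The filter-based positioning described in this embodiment can be illustrated with a small sketch. Everything in it is hypothetical: the `MetricRow` record, its field names, the `screen` and `iv_jumps` helpers, and the idea of flagging a month when IV rises by more than a chosen threshold are assumptions for the example, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRow:
    """One monitored index value (hypothetical record layout)."""
    month: str             # time dimension, precise to the month, e.g. "202104"
    product_version: str
    sub_data_source: str
    client: str
    variable: str
    iv: float


def screen(rows, **criteria):
    """Keep only rows whose fields equal every given filter criterion."""
    return [r for r in rows
            if all(getattr(r, name) == value for name, value in criteria.items())]


def iv_jumps(rows, threshold):
    """Months in which IV rose by more than `threshold` over the previous row."""
    ordered = sorted(rows, key=lambda r: r.month)
    return [cur.month for prev, cur in zip(ordered, ordered[1:])
            if cur.iv - prev.iv > threshold]
```

A call such as `iv_jumps(screen(rows, product_version="v2", variable="x1"), 0.1)` would narrow the data with the filters first and then report the months that deserve a closer look; the argument values here are placeholders.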
Example 6
Based on embodiment 4, the visualization processing of the six-dimensional indices provided by the embodiment of the invention specifically includes:
receiving a visualization request for background monitoring of the scoring card model, and converting the six-dimensional indices into visual legends and visual reports;
displaying the visual legends and visual reports on a display device;
recognizing the manager's gestures to enlarge, reduce, translate and rotate the visual legends and visual reports, so as to obtain a view of the six-dimensional indices suited to the manager's best viewing angle.
The working principle and beneficial effects of the technical scheme are as follows: the visualization request for background monitoring of the scoring card model is received and the six-dimensional indices are converted into visual legends and visual reports, which lays the technical foundation for visualization and lets managers view the indices intuitively and efficiently; the visual legends and visual reports are shown on the display device automatically as soon as a manager makes a request, which greatly raises the automation level of background data monitoring of the scoring card model; and recognizing the manager's gestures to enlarge, reduce, translate and rotate the visual legends and visual reports yields a view of the six-dimensional indices suited to the manager's best viewing angle, which makes operation convenient, observation intuitive and management easier.
Example 7
On the basis of embodiment 6, the embodiment of the invention recognizes the manager's gestures with an enhanced convolutional pose machine algorithm comprising a convolutional pose machine sub-network and a recognition sub-network. The convolutional pose machine sub-network rapidly detects gesture key points to form a gesture feature skeleton map, and the feature map is input into the recognition network so that the detected gesture feature skeleton map is accurately classified and the visual legends and visual reports are enlarged, reduced, translated and rotated accordingly. The convolutional pose machine sub-network comprises five stages; each stage outputs a heat map of the predicted positions of all hand joints, and the positions are refined stage by stage to finally obtain the joint feature heat map;
the output of each stage uses a loss function to minimize the error between the predicted hand joint positions and the ideal hand joint positions; the ideal position confidence map for each hand joint position is:
where $Y_p$ denotes the ideal position of hand joint $p$. The loss function $f_t$ that minimizes the error is defined as:
where $p$ traverses all joints, $A$ denotes the position of a given joint of the current hand, $Z$ denotes the set of positions of all hand joints, and $t$ denotes one of the five stages; the loss compares the function giving the joint position at stage $t$ with the function giving the ideal joint position of each hand. Accumulating the loss functions of the five stages gives the accumulated loss function $F$:
the working principle and beneficial effects of the technical scheme are as follows: the convolutional pose machine sub-network is trained end to end and needs none of the image preprocessing steps of traditional gesture recognition methods, such as gesture image segmentation and skin color detection. It is therefore simple and fast, can quickly drive the enlargement, reduction, translation and rotation of the visual legends and visual reports, and yields a view of the six-dimensional indices suited to the manager's best viewing angle. In addition, the sub-network rapidly detects gesture key points to form a gesture feature skeleton map, and the feature map is input into the recognition network so that the detected gesture feature skeleton map is accurately classified.
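The stage-wise loss described in this embodiment can be sketched as follows. The formula images are not available in this text, so the sketch assumes the form commonly used for convolutional pose machines: for each stage, the squared difference between the predicted and ideal heat maps is summed over all joints and positions, and the stage losses are then added up. The function names and the flattened-list representation of a heat map are hypothetical.

```python
def stage_loss(predicted, ideal):
    """Loss of one stage: squared differences between predicted and ideal
    heat maps, summed over every joint and every position.

    `predicted` and `ideal` are lists with one flattened heat map per joint.
    """
    return sum((p - q) ** 2
               for pred_map, ideal_map in zip(predicted, ideal)
               for p, q in zip(pred_map, ideal_map))


def accumulated_loss(stage_predictions, ideal):
    """Accumulated loss: the sum of the stage losses (five stages in the text)."""
    return sum(stage_loss(pred, ideal) for pred in stage_predictions)
```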
Example 8
On the basis of embodiment 7, the embodiment of the invention enlarges, reduces, translates and rotates the visual legends and visual reports as follows: translation is a transformation by a three-dimensional vector of moving distances along the three coordinate directions; scaling (enlarging and reducing) is achieved by a scaling factor; and a total rotation matrix is established from the direction of a given axis and the rotation angle. Specifically:
the transformation matrix for the set of visual legends and visual reports, denoted $T_{3d}$, is:
$$T_{3d} = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}$$
where $a_{11}$ represents x-axis scaling, $a_{12}$ represents y-axis shear, and $a_{13}$ and $a_{14}$ represent x-axis stretching; $a_{21}$ represents x-axis shear, $a_{22}$ represents y-axis scaling, and $a_{23}$ and $a_{24}$ represent y-axis stretching; $a_{31}$ represents x-axis rotation, $a_{32}$ represents y-axis rotation, and $a_{33}$ and $a_{34}$ represent z-axis rotation; $a_{41}$ represents x-axis translation, $a_{42}$ represents y-axis translation, $a_{43}$ represents z-axis translation, and $a_{44}$ represents the magnification;
the transformation matrix comprises 4 sub-matrices: the matrix $T_{3d1} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$ represents the scaling and rotation transformations of the visual legends and visual reports, the row matrix $T_{3d2} = (a_{41}\ a_{42}\ a_{43})$ represents the translation transformation, the column matrix $T_{3d3} = (a_{14}\ a_{24}\ a_{34})^{T}$ represents the projective transformation, and $a_{44}$ produces the overall scaling;
(1) Scaling (enlarging or reducing) of the visual legends and visual reports: the scaling point is $(x_i, y_i, z_i)$ and $s_i$ is the enlargement or reduction factor; the transformation matrix is:
(2) Translation of the visual legends and visual reports: the position coordinates of the visual legend or visual report are $(x, y, z)$, the translation amounts along the coordinate axes are $T_x$, $T_y$ and $T_z$, and the result after translation is $(x_1, y_1, z_1)$, so that:
$$x_1 = x + T_x,\quad y_1 = y + T_y,\quad z_1 = z + T_z$$
(3) Rotation of the visual legends and visual reports: in right-handed coordinates, the center point $P$ has coordinates $(x, y, z)$; rotating $P$ about the Z axis by an angle $\theta$ gives $P(x', y', z')$:
$$x' = x\cos\theta - y\sin\theta,\quad y' = x\sin\theta + y\cos\theta,\quad z' = z$$
rotating about the Y axis gives $P(x', y', z')$:
$$x' = x\cos\theta + z\sin\theta,\quad y' = y,\quad z' = -x\sin\theta + z\cos\theta$$
rotating about the X axis gives $P(x', y', z')$:
$$x' = x,\quad y' = y\cos\theta - z\sin\theta,\quad z' = y\sin\theta + z\cos\theta$$
the working principle and beneficial effects of the technical scheme are as follows: the invention enlarges, reduces, translates and rotates the visual legends and visual reports, where translation is a transformation by a three-dimensional vector of moving distances along the three coordinate directions, scaling (enlarging and reducing) is achieved by a scaling factor, and the total rotation matrix is established from the direction of a given axis and the rotation angle. This gives a three-dimensional display of the six-dimensional indices, improves the customer experience, reduces the workload of background managers, and is simple to operate.
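The translation, scaling and rotation transforms of this embodiment can be sketched with 4x4 homogeneous matrices. The sketch assumes the row-vector convention, in which a point is written as $(x, y, z, 1)$ and multiplied on the left of the matrix, because the text places the translation terms in the fourth row. The function names are hypothetical, and scaling about a point is implemented as "move the point to the origin, scale, move back", which is one common reading of a scaling point.

```python
import math


def apply(point, m):
    """Multiply the row vector (x, y, z, 1) by the 4x4 matrix m."""
    row = (point[0], point[1], point[2], 1.0)
    out = [sum(row[k] * m[k][j] for k in range(4)) for j in range(4)]
    return tuple(c / out[3] for c in out[:3])


def translation(tx, ty, tz):
    """Translation amounts sit in the fourth row (row-vector convention)."""
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [tx, ty, tz, 1]]


def scaling_about(s, cx, cy, cz):
    """Uniform scaling by factor s that keeps the point (cx, cy, cz) fixed."""
    k = 1 - s
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [k * cx, k * cy, k * cz, 1]]


def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def rotation_y(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]]


def rotation_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]
```

A recognized gesture could then be mapped to one of these matrices and applied to the corner points of a legend or report with `apply`.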
Example 9
As shown in fig. 7, on the basis of embodiment 1, the background data monitoring system of the scoring card model provided in the embodiment of the present invention specifically includes:
the data source module, which collects, from the third-party data platform, information on the different client institutions, scoring products, model versions and time points, together with the corresponding sub-data sources and the variable information of each data source;
the data preprocessing module, which maps the special sub-scores of the collected sub-data sources and the sub-data source variables;
the six-dimensional index calculation module, which calculates the six-dimensional indices, namely the Kolmogorov-Smirnov (KS) test, the receiver operating characteristic (ROC) curve, the weight of evidence (WOE) of variables, the information value (IV), the share of each bin and the score population stability index (PSI), so that the scoring card model monitors the third-party data information in real time along each dimension;
the visualization module, which visualizes the six-dimensional indices using static picture software so that problems are easier to find;
and the filter, which filters by a time unit, a third-party data platform unit, a product service coding unit, a product version unit, a client institution unit and a call volume unit, so as to accurately monitor the sub-data source changes of each third-party data platform for different products used by different clients in different time periods and thereby precisely locate data production faults.
The working principle and beneficial effects of the technical scheme are as follows: by monitoring the third-party data information in real time along each dimension, the invention focuses on the variables required to put the scoring model online and distinguishes the technical indices of third-party data calls by sub-score and data version. Monitoring is performed in all dimensions, from multiple angles and across time, so that changes in data source production calls are known promptly and each service line runs normally and stably. Data production accidents can thus be found, and corresponding measures taken, within a controllable time range, so that the scoring card model product runs normally and stably at each client institution and the efficiency of monitoring managers is greatly improved.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A background data monitoring method of a scoring card model, characterized by comprising the following steps:
collecting variable information of a third-party data platform;
mapping the special sub-scores of the collected sub-data sources and the sub-data source variables;
calculating the six-dimensional indices, namely the Kolmogorov-Smirnov (KS) test, the receiver operating characteristic (ROC) curve, the weight of evidence (WOE) of variables, the information value (IV), the share of each bin and the score population stability index (PSI), so that the scoring card model monitors the third-party data information in real time along six dimensions;
and visualizing the six-dimensional indices using static picture software, and accurately monitoring the sub-data source changes of each third-party data platform for different products used by different clients in different time periods.
2. The background data monitoring method of the scoring card model according to claim 1, wherein the variable information covers the sub-data sources and the individual data sources corresponding to the client institution, scoring product, model version and time point information.
3. The background data monitoring method of the scoring card model according to claim 1, wherein the special sub-scores comprise the Kolmogorov-Smirnov test and the receiver operating characteristic curve, and the cause of a scoring card model abnormality is analyzed from the bottom layer according to whether the Kolmogorov-Smirnov test and receiver operating characteristic values of the sub-data sources are abnormal and according to the month-to-month fluctuation of the variables.
4. The background data monitoring method of the scoring card model according to claim 1, wherein the Kolmogorov-Smirnov test value KS measures how well the scoring card model separates good clients from bad clients, with the formula:
$$KS = \max\{\,|\mathrm{cum}(bad_{rate}) - \mathrm{cum}(good_{rate})|\,\}$$
where $bad_{rate}$ denotes the bad clients of the scoring card model, $good_{rate}$ denotes the good clients of the scoring card model, $\mathrm{cum}(bad_{rate})$ is the ratio of cumulative bad clients to total bad clients, $\mathrm{cum}(good_{rate})$ is the ratio of cumulative good clients to total good clients, $|\mathrm{cum}(bad_{rate}) - \mathrm{cum}(good_{rate})|$ is the absolute value of the difference between these two ratios, and $\max$ takes the maximum of that absolute value;
receiver operating characteristic curve: describes the cumulative proportion of bad clients at a given cumulative proportion of good clients;
information value IV: the information value of the variable is calculated, and the calculation formula of IV is as follows:
where $i$ indexes the groups (bins) of the variable, $Bad_i$ is the number of bad clients in the $i$-th group, $Bad_T$ is the total number of bad clients, $Good_i$ is the number of good clients in the $i$-th group, $Good_T$ is the total number of good clients, and $\ln$ denotes the logarithm.
5. The background data monitoring method of the scoring card model according to claim 1, wherein:
weight of evidence (WOE) of a variable: the good/bad ratio $WOE_i$ of each bin of the variable is calculated as follows:
score population stability index (PSI): the calculation formula is as follows:
where development is the expected population score and validation is the actual population score;
share of each bin: the proportion of the total accounted for by each bin of the variable is calculated.
6. The background data monitoring method of the scoring card model according to claim 1, wherein the visualization processing of the six-dimensional indices comprises two parts:
the first part is a visual legend;
the second part is a visual report;
the visual legend shows the Kolmogorov-Smirnov test values and receiver operating characteristic curves at each client institution for the selected scoring card model, and a line chart of the score population stability index at each client institution for the selected scoring card model;
the visual report shows line charts of the information value and the score population stability index at each client institution for the selected sub-data source variable; shows how the share of each bin of the selected sub-data source variable changes by month; shows the score change for the selected sub-data source variable; and shows the WOE trend at each client institution for the selected sub-data source variable.
7. The background data monitoring method of the scoring card model according to claim 1, wherein a filter filters by a time unit, a third-party data platform unit, a product service coding unit, a product version unit, a client institution unit and a call volume unit, so as to accurately monitor the sub-data source changes of each third-party data platform for different products used by different clients in different time periods and thereby precisely locate data production faults.
8. The background data monitoring method of the scoring card model according to claim 7, wherein the precise locating of data production faults adjusts the filters to accurately monitor, for each scoring card model, the changes in the sub-data sources and sub-score variables of different products used by different clients in different time periods.
9. The background data monitoring method of the scoring card model according to claim 1, wherein the visualization processing of the six-dimensional indices specifically comprises:
receiving a visualization request for background monitoring of the scoring card model, and converting the six-dimensional indices into visual legends and visual reports;
displaying the visual legends and visual reports on a display device;
recognizing the manager's gestures to enlarge, reduce, translate and rotate the visual legends and visual reports, so as to obtain a view of the six-dimensional indices suited to the manager's best viewing angle.
10. A background data monitoring system of a scoring card model, comprising:
the data source module, which collects variable information of the third-party data platform;
the data preprocessing module, which maps the special sub-scores of the collected sub-data sources and the sub-data source variables;
the six-dimensional index calculation module, which calculates the six-dimensional indices, namely the Kolmogorov-Smirnov (KS) test, the receiver operating characteristic (ROC) curve, the weight of evidence (WOE) of variables, the information value (IV), the share of each bin and the score population stability index (PSI), so that the scoring card model monitors the third-party data information in real time along each dimension;
the visualization module, which visualizes the six-dimensional indices using static picture software so that problems are easier to find;
and the filter, which filters by a time unit, a third-party data platform unit, a product service coding unit, a product version unit, a client institution unit and a call volume unit, so as to accurately monitor the sub-data source changes of each third-party data platform for different products used by different clients in different time periods and thereby precisely locate data production faults.
CN202311450742.XA 2023-11-02 2023-11-02 Background data monitoring method and system of scoring card model Pending CN117575757A (en)

Publications (1)

Publication Number Publication Date
CN117575757A true CN117575757A (en) 2024-02-20

Family

ID=89892658




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination