CN111552723A - System and method for client portrait in data management - Google Patents

Publication number
CN111552723A
Authority
CN
China
Prior art keywords
data
public
debt
client
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010379332.0A
Other languages
Chinese (zh)
Inventor
李岩
宋兵
张洪江
朱启功
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Xiong'an Shungeng Data Technology Co ltd
Original Assignee
Hebei Xiong'an Shungeng Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei Xiong'an Shungeng Data Technology Co ltd
Priority to CN202010379332.0A
Publication of CN111552723A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a system and a method for client portrait in data management. The system comprises the following modules connected in sequence: a data acquisition module for acquiring the basic data and debt item data of a commercial bank's public (corporate) clients; a data extraction module for extracting and processing the common features of public clients, which further comprises a feature engineering module for processing the common features to identify individual features and labeling each feature; a model development module for using a data model to construct association rules between public clients and debt item information and to estimate the probability that a debt item becomes overdue or defaults; a data storage module; and a data display module. Through data processing, the invention addresses the asymmetry between internal and external data and helps ensure that the data are accurate and reasonable. It uses data mining to obtain the relationships among internal information and presents them visually from multiple angles in the form of a knowledge graph, providing multi-dimensional early warning of client overdue and debt item default and supporting decisions for lower-level applications and risk control.

Description

System and method for client portrait in data management
Technical Field
The invention relates to a system and a method for client portrait in data management.
Background
At present, every large commercial bank and financial institution has its own large-scale data storage, data management and data application systems. The structured, semi-structured and unstructured "big data" collected through multiple front-end channels have accumulated well in the financial industry. Extracting as much value as possible from these data is now a main point of collaboration between computer technologists and industry practitioners.
In the prior art, the systems that large commercial banks and financial institutions have built on top of their accumulated data suffer from data redundancy and are relatively independent of one another. Data understanding and business logic need to serve as the main thread, linking the contents of the data into a coherent data set. A relatively complete and accurate indicator system can position a client comprehensively and accurately, but the client portraits of large commercial banks and financial institutions lack a relatively complete label system for positioning and displaying a client's multi-dimensional behavior. In addition, the data accumulated in the back end cannot be linked along the time dimension with the front-end business data generated in real time, so the data are not timely.
Python is a programming language whose syntax is close to natural language. With its strong data analysis capabilities and rich set of data science packages, technical staff can carry out the whole data governance process: data acquisition, data preprocessing, data analysis, model development, data storage and data display. The most valuable information in the data can thus be extracted, and client portraits can be presented intuitively through data visualization, showing each client's behavior patterns in multiple dimensions to support chart display, risk control and management decisions.
Disclosure of Invention
The invention aims to provide a system and a method for client portrait in data management, so as to solve the above problems in the prior art.
In order to achieve the above object, the present invention provides a system for client portrait in data management, comprising the following modules connected in sequence:
a data acquisition module for acquiring the basic data and debt item data of the commercial bank's public (corporate) clients;
a data extraction module for extracting and processing the common features of public clients, further comprising a feature engineering module for processing the common features to identify individual features and labeling each feature;
a model development module for using a data model to construct association rules between public clients and debt item information and to estimate the overdue and default probability of debt items;
a data storage module for storing the association rules between public clients and debt item information together with the overdue and default probabilities of debt items;
and a data display module for displaying the portrait relating public clients to debt items visually by means of charts.
The invention also provides a method for client portrait in data management, which comprises the following steps:
step S1, constructing internal data from the basic information and debt item information of public (corporate) clients;
step S2, adding external data as micro-level data support;
step S3, extracting and processing the common features of the public clients;
step S4, processing the common features to identify individual features and labeling each feature;
step S5, using a data model to construct association rules between public clients and debt item information and to estimate the overdue and default probability of debt items;
and step S6, displaying the portrait relating public clients to debt items visually by means of charts.
Further, in step S1, debt item information is collected in the order of the business logic, so as to avoid data gaps and inconsistencies caused by system upgrades and misoperation, laying a foundation for subsequent data interpretability and data analysis.
Further, in step S2, external data are added as micro-level data support: for each public (corporate) client, external data on industry and commerce (business registration), judicial matters, public opinion and taxation are collected, so that the symmetry (consistency) of internal and external data can be checked later and the overall data architecture for each enterprise is supported.
Further, in step S3, data preprocessing and data analysis are performed on the collected internal and external data, a symmetry check is performed between the internal and external data, and feature engineering begins.
Further, in step S4, feature engineering is performed on the collected internal and external data, each feature is labeled on that basis, and the final variables to enter the model are screened for data prediction.
Further, in step S5, an association rule algorithm is used to find association rules between the public client's debt item information and the basic data and external data, and a prediction model is used to predict the overdue probability and default probability of debt items as indicator features for early warning.
Further, in step S6, relational graphs, dendrograms, Sankey diagrams and combined charts from data visualization are used to display, from multiple angles and in different dimensions, the client's basic information features, debt item information features, external information features and model-predicted features, so as to build a visualization application system for public clients that serves lower-level applications and upper-level decision-making.
Advantageous effects
Through data processing, the invention addresses the asymmetry between internal and external data and helps ensure that the data are accurate and reasonable. It uses data mining to obtain the relationships among internal information and presents them visually from multiple angles in the form of a knowledge graph, providing multi-dimensional early warning of client overdue and debt item default and supporting decisions for lower-level applications and risk control.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a block diagram of the system of the present invention;
FIG. 2 is a flow chart of the present invention;
FIG. 3 is a conceptual diagram of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
As shown in FIG. 1, the system for client portrait in data management of the present invention comprises the following modules connected in sequence:
a data acquisition module 101 for acquiring the basic data and debt item data of a commercial bank's public (corporate) clients;
a data extraction module 102 for extracting and processing the common features of public clients, further comprising a feature engineering module 103 for processing the common features to identify individual features and labeling each feature;
a model development module 104 for using a data model to construct association rules between public clients and debt item information and to estimate the overdue and default probability of debt items;
a data storage module 105 for storing the association rules between public clients and debt item information together with the overdue and default probabilities of debt items;
and a data display module 106 for displaying the portrait relating public clients to debt items visually by means of charts.
As shown in FIG. 2, the method for client portrait in data management provided by the present invention comprises:
step S1, constructing internal data from the basic information and debt item information of public (corporate) clients;
step S2, adding external data as micro-level data support;
step S3, extracting and processing the common features of the public clients;
step S4, processing the common features to identify individual features and labeling each feature;
step S5, using a data model to construct association rules between public clients and debt item information and to estimate the overdue and default probability of debt items;
and step S6, displaying the portrait relating public clients to debt items visually by means of charts.
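The sequential structure of the steps can be illustrated with a minimal Python sketch in which each step is a function that receives the records produced by the previous one. The step functions, field names (`client_no`, `loan_count`, `overdue_count`) and the 0.5 threshold are hypothetical stand-ins chosen for illustration; the patent does not specify them.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def run_pipeline(records: List[Record], steps: List[Step]) -> List[Record]:
    """Apply each step in order, feeding each step's output into the next."""
    for step in steps:
        records = step(records)
    return records


# Hypothetical stand-in for feature extraction (steps S3/S4).
def extract_features(records: List[Record]) -> List[Record]:
    return [
        {**r, "overdue_ratio": r["overdue_count"] / max(r["loan_count"], 1)}
        for r in records
    ]


# Hypothetical stand-in for labeling; the 0.5 threshold is an assumption.
def label_features(records: List[Record]) -> List[Record]:
    return [
        {**r, "label": "high_overdue" if r["overdue_ratio"] > 0.5 else "normal"}
        for r in records
    ]


clients = [
    {"client_no": "C001", "loan_count": 4, "overdue_count": 3},
    {"client_no": "C002", "loan_count": 5, "overdue_count": 0},
]
profiles = run_pipeline(clients, [extract_features, label_features])
```

Keeping every step behind the same signature means a step can be replaced or reordered without touching the others, which mirrors the "connected in sequence" arrangement of the modules.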
In step S1, when a commercial bank builds a full portrait of a public (corporate) client's debt item information, it first obtains the client's unique primary key, the client number, from the public client information stored in the bank's internal database, and uses the client number to collect the enterprise's basic information across all dimensions for the last 3 years. Second, so that the data in the internal database do not diverge from the business logic, a business-logic-first principle is adopted, which avoids the redundancy and inconsistency caused by data storage or database upgrades. The business logic runs as follows: registration of the client's basic information, signing of the credit contract, execution of the credit contract, creation of the debt relationship, and creation of the debt item. Accordingly, the client number of the public client is used to obtain the corresponding credit contract information, the credit contract number is used to obtain the basic information of the debt item, and the debt item number is used to obtain the debt item's overdue and default information.
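The key chain described in step S1 (client number, then credit contract number, then debt item number) can be sketched with Python's built-in `sqlite3` module. The table names, column names and sample rows below are assumptions made for illustration; the patent does not describe the bank's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: each table is reached through the key of the previous one.
conn.executescript("""
CREATE TABLE client   (client_no TEXT PRIMARY KEY, name TEXT);
CREATE TABLE contract (contract_no TEXT PRIMARY KEY, client_no TEXT);
CREATE TABLE debt     (debt_no TEXT PRIMARY KEY, contract_no TEXT, balance REAL);
CREATE TABLE overdue  (debt_no TEXT, days_overdue INTEGER);
INSERT INTO client   VALUES ('C001', 'Acme Trading'), ('C002', 'Beta Steel');
INSERT INTO contract VALUES ('K10', 'C001'), ('K20', 'C002');
INSERT INTO debt     VALUES ('D100', 'K10', 500000.0), ('D200', 'K20', 120000.0);
INSERT INTO overdue  VALUES ('D100', 45);
""")

# Follow the business logic order: client -> credit contract -> debt item -> overdue info.
rows = conn.execute("""
SELECT c.client_no, k.contract_no, d.debt_no, d.balance,
       COALESCE(o.days_overdue, 0) AS days_overdue
FROM client c
JOIN contract k     ON k.client_no = c.client_no
JOIN debt d         ON d.contract_no = k.contract_no
LEFT JOIN overdue o ON o.debt_no = d.debt_no
ORDER BY c.client_no
""").fetchall()
```

The `LEFT JOIN` on the overdue table keeps debt items that have never been overdue, so a client with a clean record still appears in the collected data.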
In step S2, the commercial bank fuzzy-matches the public (corporate) client's enterprise name, business license number, organization code, tax number and unified social credit code against external data sources to obtain external data on industry and commerce (business registration), judicial matters, public opinion and taxation. These data serve as the bank's external micro-level data support for the public client, so that the client portrait presents the client's external information more comprehensively. The acquired external data are also used for a symmetry (consistency) check against the bank's internal data; the management department can audit and evaluate the check results and, on that basis, issue guidance to business departments on data entry and verification. The commercial bank then associates the external micro-level data with the internal data of the public client, yielding all the data needed for the subsequent client portrait and modeling analysis.
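The fuzzy matching in step S2 is not specified further in the text. As one possible illustration, the sketch below matches enterprise names with `difflib.SequenceMatcher` from the Python standard library; the normalization rule, the 0.85 threshold and the function names are assumptions, not the patent's method.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def normalize(name: str) -> str:
    """Lower-case the name and drop punctuation and spaces before comparing."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def best_match(name: str, candidates: List[str], threshold: float = 0.85) -> Optional[str]:
    """Return the most similar external name, or None if nothing is close enough."""
    if not candidates:
        return None
    scored = [
        (SequenceMatcher(None, normalize(name), normalize(c)).ratio(), c)
        for c in candidates
    ]
    score, candidate = max(scored)
    return candidate if score >= threshold else None


external_names = ["ACME Trading Co Ltd", "Beta Steel Group"]
matched = best_match("Acme Trading Co., Ltd.", external_names)
unmatched = best_match("Gamma Foods", external_names)
```

In practice an exact match on a stable identifier such as the unified social credit code would normally be tried first, with name similarity used only as a fallback.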
In steps S3 and S4, once the commercial bank has collected the internal data and external micro-level data of the public (corporate) client, feature engineering on the internal and external data begins over time windows of the last 1 year and the last 3 years, covering:
(1) internal basic data features: enterprise basic information, enterprise change information, enterprise account management information and enterprise senior management information;
(2) internal debt item data features: credit balance information, repayment performance information and contract limit supporting information;
(3) external data features: enterprise information, enterprise change information, enterprise legal information, enterprise judicial litigation information, enterprise external tax information, enterprise public opinion information and enterprise import and export tax refund information.
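The 1-year and 3-year time windows can be illustrated by deriving the same counts over each window. This is a minimal sketch: the repayment record layout, the feature names and the use of 365-day years are assumptions for illustration only.

```python
from datetime import date, timedelta
from typing import Dict, List


def window_features(repayments: List[dict], as_of: date, years: int) -> Dict[str, int]:
    """Count repayments and overdue repayments in the last `years` years."""
    start = as_of - timedelta(days=365 * years)  # approximation: 365-day years
    window = [r for r in repayments if start < r["date"] <= as_of]
    suffix = f"{years}y"
    return {
        f"repay_count_{suffix}": len(window),
        f"overdue_count_{suffix}": sum(1 for r in window if r["overdue"]),
    }


repayments = [
    {"date": date(2017, 9, 1), "overdue": False},
    {"date": date(2018, 11, 15), "overdue": True},
    {"date": date(2019, 12, 20), "overdue": False},
    {"date": date(2020, 3, 10), "overdue": True},
]
as_of = date(2020, 5, 7)
features = {
    **window_features(repayments, as_of, 1),
    **window_features(repayments, as_of, 3),
}
```

Computing each feature over both windows lets a later model compare recent behavior against the longer-term record of the same client.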
After feature engineering is finished, cross indicators between internal and external data are constructed, including verification of the enterprise's basic information.
Within this range of information, the attribute dimensions of each of the commercial bank's public (corporate) clients can be depicted comprehensively, inside and out.
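The verification of enterprise basic information across internal and external sources can be sketched as a field-by-field comparison. The field names and sample values below are hypothetical; the patent does not list which fields are cross-checked.

```python
from typing import Dict, List, Tuple


def symmetry_check(internal: dict, external: dict, fields: List[str]) -> Dict[str, Tuple]:
    """Return the fields whose internal and external values disagree."""
    mismatches = {}
    for field in fields:
        inner, outer = internal.get(field), external.get(field)
        if inner != outer:
            mismatches[field] = (inner, outer)
    return mismatches


# Hypothetical records for one enterprise from the bank and from an external source.
internal = {"legal_rep": "Zhang San", "registered_capital": 5000000, "status": "active"}
external = {"legal_rep": "Zhang San", "registered_capital": 8000000, "status": "active"}

report = symmetry_check(internal, external, ["legal_rep", "registered_capital", "status"])
```

A non-empty report is the kind of result that could be passed to the management department for audit and used to guide data entry and verification by the business departments.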
In step S5, after all feature engineering is complete, association rule modeling and predictive modeling are performed.
(1) An association rule algorithm is applied to all features to find potential associations between internal data and external data, between internal data and debt item data, and between external data and debt item data. These associations are applied to cases where a change in one feature affects another, and the rules related to debt items are finally selected as debt item association rule labels, for example the association between favorable or unfavorable industry-and-commerce public opinion and debt item overdue and default.
(2) Overdue and default probabilities are predicted with a semi-supervised learning model. Clients with overdue or defaulted debt items in the last year form the target set, and the data of the two preceding years form the sample set. Public (corporate) clients with overdue or default events in the last 1 year are randomly drawn into a training set and a test set, and once the model reaches the required accuracy it is used to generate early-warning labels for public clients' debt item overdue and default.
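The sample construction in item (2) can be sketched as follows: the label comes from the most recent year, the features come from the two preceding years, and clients are randomly divided into training and test sets. The data layout, the single `overdue_count` feature, the 25% test fraction and the fixed seed are all assumptions for illustration, and the semi-supervised model itself is deliberately left out.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[str, List[int], int]


def build_samples(history: Dict[str, Dict[int, dict]], label_year: int) -> List[Sample]:
    """Features from the two years before `label_year`, label from `label_year`."""
    samples = []
    for client_no, by_year in history.items():
        feature_years = [label_year - 2, label_year - 1]
        features = [by_year.get(y, {}).get("overdue_count", 0) for y in feature_years]
        label = 1 if by_year.get(label_year, {}).get("overdue_count", 0) > 0 else 0
        samples.append((client_no, features, label))
    return samples


def split(samples: List[Sample], test_fraction: float, seed: int):
    """Randomly divide the samples into a training set and a test set."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical yearly overdue counts per client.
history = {
    "C001": {2018: {"overdue_count": 0}, 2019: {"overdue_count": 2}, 2020: {"overdue_count": 1}},
    "C002": {2018: {"overdue_count": 0}, 2019: {"overdue_count": 0}, 2020: {"overdue_count": 0}},
    "C003": {2019: {"overdue_count": 1}, 2020: {"overdue_count": 3}},
    "C004": {2018: {"overdue_count": 1}, 2019: {"overdue_count": 0}},
}
samples = build_samples(history, label_year=2020)
train, test = split(samples, test_fraction=0.25, seed=0)
```

Separating the feature years from the label year keeps information from the period being predicted out of the model inputs.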
Customer attribute labels, debt item association rule labels and debt item overdue and default early-warning labels are then constructed.
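The text does not name a specific association rule algorithm. As a minimal illustration of how a candidate debt item association rule could be scored, the sketch below computes the standard support and confidence measures for one rule over a set of per-client label sets; the label names and data are hypothetical.

```python
from typing import FrozenSet, List, Tuple


def rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent."""
    total = len(transactions)
    with_antecedent = sum(1 for t in transactions if antecedent <= t)
    with_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = with_both / total if total else 0.0
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Hypothetical label sets, one per client.
transactions = [
    frozenset({"negative_opinion", "overdue"}),
    frozenset({"negative_opinion", "overdue", "default"}),
    frozenset({"negative_opinion"}),
    frozenset({"positive_opinion"}),
    frozenset({"positive_opinion", "overdue"}),
]
support, confidence = rule_metrics(
    transactions, frozenset({"negative_opinion"}), frozenset({"overdue"})
)
```

A full miner such as Apriori or FP-Growth would enumerate candidate rules and keep those above chosen support and confidence thresholds; only the scoring step is shown here.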
In step S6, all completed customer attribute labels and debt item overdue and default early-warning labels are visualized:
(1) All completed client attribute labels are stored in a database, with the client number and timestamp as the primary key, providing a basis for subsequent incremental additions of data.
(2) With the client as the main node, relational graphs, dendrograms and Sankey diagrams are used to visualize the combined analysis results of three modules: the internal and external basic information module, the debt item association module, and the debt item overdue and default probability prediction module.
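The storage described in item (1), with the client number and timestamp together forming the primary key, can be sketched with `sqlite3`. The table name, column names and the comma-joined tag encoding are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite primary key: one row per client per snapshot time.
conn.execute("""
CREATE TABLE client_tag (
    client_no TEXT NOT NULL,
    ts        TEXT NOT NULL,
    tags      TEXT NOT NULL,
    PRIMARY KEY (client_no, ts)
)
""")


def add_snapshot(client_no: str, ts: str, tags: set) -> None:
    """Append a new tag snapshot; earlier snapshots are left untouched."""
    conn.execute(
        "INSERT INTO client_tag VALUES (?, ?, ?)",
        (client_no, ts, ",".join(sorted(tags))),
    )


add_snapshot("C001", "2020-04-30", {"normal"})
add_snapshot("C001", "2020-05-07", {"overdue_warning"})

latest = conn.execute(
    "SELECT tags FROM client_tag WHERE client_no = ? ORDER BY ts DESC LIMIT 1",
    ("C001",),
).fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM client_tag").fetchone()[0]
```

Because the timestamp is part of the key, each new batch is added as new rows rather than overwriting earlier ones, which is what makes incremental loading straightforward.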
FIG. 3 shows the conceptual diagram of the system and method for client portrait in data management of the present invention. It takes the commercial bank's public (corporate) client 201 as the main node, with three dimensional modules as components of the node: the internal and external basic information module 202, the debt item association module 204, and the debt item overdue and default probability prediction module 203. Once the public client 201 has been portrayed, the result can be used for display, management and decision-making.
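The node structure of FIG. 3, with the client as the main node and three modules attached to it, can be expressed as a nodes-and-links structure of the kind that relational graph and Sankey chart components commonly accept. The function name, module identifiers and label values are hypothetical.

```python
from typing import Dict, List


def build_profile_graph(client_no: str, modules: Dict[str, List[str]]) -> dict:
    """Build a nodes/links structure with the client as the main node."""
    nodes = [{"id": client_no, "kind": "client"}]
    links = []
    for module, items in modules.items():
        nodes.append({"id": module, "kind": "module"})
        links.append({"source": client_no, "target": module})
        for item in items:
            item_id = f"{module}/{item}"  # prefix keeps ids unique across modules
            nodes.append({"id": item_id, "kind": "label"})
            links.append({"source": module, "target": item_id})
    return {"nodes": nodes, "links": links}


graph = build_profile_graph(
    "C001",
    {
        "basic_info": ["manufacturing", "registered_2012"],
        "debt_association": ["negative_opinion->overdue"],
        "overdue_default_prediction": ["overdue_warning"],
    },
)
```

The resulting dictionary is independent of any particular charting library, so the same structure could feed a relational graph, a dendrogram or a Sankey diagram.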
Specific embodiments have been used herein to explain the principle and implementation of the invention; the description of the embodiments is intended only to help in understanding the method and core idea of the invention. A person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (8)

1. A system for client portrait in data management, characterized by comprising the following modules connected in sequence:
a data acquisition module for acquiring the basic data and debt item data of a commercial bank's public (corporate) clients;
a data extraction module for extracting and processing the common features of public clients, further comprising a feature engineering module for processing the common features to identify individual features and labeling each feature;
a model development module for using a data model to construct association rules between public clients and debt item information and to estimate the overdue and default probability of debt items;
a data storage module for storing the association rules between public clients and debt item information together with the overdue and default probabilities of debt items;
and a data display module for displaying the portrait relating public clients to debt items visually by means of charts.
2. A method for client portrait in data management, comprising:
step S1, constructing internal data from the basic information and debt item information of public (corporate) clients;
step S2, adding external data as micro-level data support;
step S3, extracting and processing the common features of the public clients;
step S4, processing the common features to identify individual features and labeling each feature;
step S5, using a data model to construct association rules between public clients and debt item information and to estimate the overdue and default probability of debt items;
and step S6, displaying the portrait relating public clients to debt items visually by means of charts.
3. The method for client portrait in data management according to claim 2, wherein in step S1, debt item information is collected in the order of the business logic, so as to avoid data gaps and inconsistencies caused by system upgrades and misoperation and to lay a foundation for subsequent data interpretability and data analysis.
4. The method for client portrait in data management according to claim 2, wherein in step S2, external data are added as micro-level data support, and for each public (corporate) client, external data on industry and commerce (business registration), judicial matters, public opinion and taxation are collected, for subsequent checking of the symmetry (consistency) of internal and external data and for supporting the overall data architecture of each enterprise.
5. The method of claim 2, wherein in step S3, data preprocessing and data analysis are performed on the collected internal and external data, a symmetry check is performed between the internal and external data, and feature engineering begins.
6. The method of claim 2, wherein in step S4, feature engineering is performed on the collected internal and external data, each feature is labeled on that basis, and the final variables to enter the model are screened for data prediction.
7. The method for client portrait in data management according to claim 2, wherein in step S5, an association rule algorithm is used to find association rules between the public client's debt item information and the basic data and external data, and a prediction model is used to predict the overdue probability and default probability of debt items as indicator features for early warning.
8. The method of claim 2, wherein in step S6, relational graphs, dendrograms, Sankey diagrams and combined charts from data visualization are used to display, from multiple angles and in different dimensions, the client's basic information features, debt item information features, external information features and model-predicted features, so as to build a visualization application system for public clients that serves lower-level applications and upper-level decision-making.
CN202010379332.0A 2020-05-07 2020-05-07 System and method for client portrait in data management Pending CN111552723A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010379332.0A CN111552723A (en) 2020-05-07 2020-05-07 System and method for client portrait in data management

Publications (1)

Publication Number Publication Date
CN111552723A (en) 2020-08-18

Family

ID=71999271

Country Status (1)

Country Link
CN (1) CN111552723A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596679A (en) * 2018-04-27 2018-09-28 中国联合网络通信集团有限公司 Construction method, device, terminal and the computer readable storage medium of user's portrait
CN108846520A (en) * 2018-06-22 2018-11-20 北京京东金融科技控股有限公司 Overdue loan prediction technique, device and computer readable storage medium
CN109766000A (en) * 2018-12-25 2019-05-17 重庆和贯科技有限公司 A kind of wisdom education system and method based on virtual reality
CN109919436A (en) * 2019-01-29 2019-06-21 华融融通(北京)科技有限公司 A kind of promise breaking user's probability forecasting method based on sparse features insertion
CN110766462A (en) * 2019-10-23 2020-02-07 中国工商银行股份有限公司 Intelligent panoramic customer portrait linkage method and system based on streaming platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200818