CN112950290A - Mining method and device for economic dependence clients, storage medium and electronic equipment - Google Patents

Mining method and device for economic dependence clients, storage medium and electronic equipment

Info

Publication number
CN112950290A
Authority
CN
China
Prior art keywords
graph
economic
determining
node
customer information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110351991.8A
Other languages
Chinese (zh)
Inventor
夏成扬
关健
袁进威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202110351991.8A
Publication of CN112950290A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/906: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the invention discloses a mining method and device for economic dependency clients, a storage medium and electronic equipment, and relates to the technical field of artificial intelligence. The method comprises the following steps: acquiring a customer information graph in response to a mining event for economic dependency clients being triggered, wherein the customer information graph comprises association relationships between at least two customers; performing machine learning on the customer information graph based on a graph convolutional neural network algorithm, and determining a target embedding vector of each node in the customer information graph; determining an economic dependency graph corresponding to the customer information graph based on the target embedding vectors; and performing cluster analysis on the economic dependency graph to determine an economic dependency client group whose members have an economic dependency relationship. With the technical scheme provided by the embodiment of the invention, customers with an economic dependency relationship can be mined accurately, economic dependency client groups can be identified, and the business management and risk management of financial enterprises are effectively supported.

Description

Mining method and device for economic dependence clients, storage medium and electronic equipment
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to an economic dependency client mining method, device, storage medium and electronic equipment.
Background
In the financial industry, and especially in banking, the ability to identify group customer relationships reflects a bank's capacity for business operation and risk management, and is an important part of its overall commercial competitiveness. At present, banks identify group customer relationships mainly by mining member relationships within groups, and their ability to manage the risk of economic dependency clients is very limited. Economic dependency clients are a group of corporate (legal-person) customers with an economic dependency relationship, that is, a situation in which one customer may be unable to repay its debts on time because another customer is in financial difficulty or has defaulted. As the association relationships between enterprises (such as equity, guarantee, upstream and downstream supply, and complaints) and between enterprises and individuals (such as legal persons and actual controllers) become increasingly complex, the risks hidden in this relationship network become harder to detect. Traditional methods for identifying bank group customer relationships suffer from major limitations such as information asymmetry, small identification scale and lack of experience, so they cannot meet the current requirements of bank business operation and risk management.
Disclosure of Invention
The embodiment of the invention provides an economic dependency client mining method, device, storage medium and electronic equipment, which can accurately mine clients with economic dependency relationship, realize identification of economic dependency client groups and effectively ensure business management and risk management of financial enterprises.
In a first aspect, an embodiment of the present invention provides a mining method for economic dependency clients, including:
acquiring a customer information graph in response to a mining event for economic dependency clients being triggered; wherein the customer information graph comprises association relationships between at least two customers;
performing machine learning on the customer information graph based on a graph convolution neural network algorithm, and determining a target embedded vector of each node in the customer information graph;
determining an economic dependency graph corresponding to the customer information graph based on the target embedded vector;
and performing cluster analysis on the economic dependency relationship graph to determine an economic dependency client group with economic dependency relationship.
In a second aspect, an embodiment of the present invention further provides a mining apparatus for economic dependency clients, including:
an information graph acquisition module, used for acquiring a customer information graph in response to a mining event for economic dependency clients being triggered; wherein the customer information graph comprises association relationships between at least two customers;
the embedded vector determining module is used for performing machine learning on the client information map based on a graph convolution neural network algorithm and determining a target embedded vector of each node in the client information map;
a dependency graph determining module for determining an economic dependency graph corresponding to the customer information graph based on the target embedded vector;
and the dependency client mining module is used for carrying out cluster analysis on the economic dependency relationship graph and determining the economic dependency client group with the economic dependency relationship.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the mining method for economic dependency clients according to the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the mining method for economic dependency clients according to the embodiment of the present invention.
According to the mining scheme for economic dependency clients provided by the embodiment of the invention, a customer information graph is acquired in response to a mining event for economic dependency clients being triggered, wherein the customer information graph comprises association relationships between at least two customers; machine learning is performed on the customer information graph based on a graph convolutional neural network algorithm, and a target embedding vector of each node in the customer information graph is determined; an economic dependency graph corresponding to the customer information graph is determined based on the target embedding vectors; and cluster analysis is performed on the economic dependency graph to determine an economic dependency client group whose members have an economic dependency relationship. With this technical scheme, customers with an economic dependency relationship can be mined accurately, economic dependency client groups can be identified, and the business management and risk management of financial enterprises are effectively supported.
Drawings
FIG. 1 is a flowchart of a mining method for economic dependency customers according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a customer information graph provided in accordance with an embodiment of the present invention;
FIG. 3 is a schematic diagram of an economic dependency graph provided by an embodiment of the present invention;
FIG. 4 is a flowchart of a mining method for economic dependency clients in another embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a mining apparatus for economic dependency clients in another embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device in another embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present invention. It should be understood that the drawings and the embodiments of the present invention are illustrative only and are not intended to limit the scope of the present invention.
It should be understood that the various steps recited in the method embodiments of the present invention may be performed in a different order and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the invention is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present invention are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It should be noted that the modifiers "a", "an" and "the" used in the present invention are illustrative rather than limiting, and those skilled in the art will understand that they mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present invention are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Fig. 1 is a flowchart of a mining method for economic dependency customers according to an embodiment of the present invention, which may be applied to the case of calculating risk limits, and the mining method may be executed by a mining apparatus for economic dependency customers, which may be composed of hardware and/or software and may be generally integrated in an electronic device. As shown in fig. 1, the method specifically includes the following steps:
Step 110, in response to a mining event for economic dependency clients being triggered, acquiring a customer information graph; wherein the customer information graph comprises association relationships between at least two customers.
In the embodiment of the invention, when the mining event for economic dependency clients is triggered, the customer information graph is acquired so that the mining can be carried out. Optionally, the mining event is deemed to be triggered when a mining instruction for economic dependency clients input by the user is detected. The customer information graph is a heterogeneous graph containing association relationships between at least two customers. For example, the customer information graph may include an association relationship between enterprises, or an association relationship between an enterprise and the family of an individual (e.g., a legal person or an actual controller). When the mining event is triggered, a previously constructed customer information graph imported by the user can be acquired directly, or customer information data can be acquired and the customer information graph constructed from that data. It should be noted that the embodiment of the present invention does not limit the manner in which the customer information graph is acquired.
Optionally, acquiring a customer information graph in response to a mining event for economic dependency clients being triggered includes: acquiring customer information data in response to the mining event being triggered; and screening the customer information data according to preset economic dependency rules, and constructing the customer information graph based on the screened customer information data. The advantage of this arrangement is that a customer information graph containing the possible economic dependency relationships can be constructed quickly, which helps improve the efficiency of mining economic dependency clients. Specifically, when the mining event is triggered, customer information data is acquired, where the customer information data may include customers (enterprises or individuals) with economic transactions and the relationships among those customers. For example, the relationships between customers may include any one or more of an enterprise external investment relationship, an enterprise shareholding relationship, a personal guarantee relationship, a personal shareholding relationship, an enterprise guarantee relationship, a branch office relationship, an enterprise upstream and downstream relationship, an account fund transaction relationship, a legal person relationship, a spouse relationship, a parent-child relationship, a sibling relationship, a common guarantee relationship, a common borrowing relationship, a legal person to enterprise customer investment relationship, an enterprise and individual guarantee relationship, and an enterprise bill upstream and downstream relationship. Since not every piece of customer information data reflects an economic dependency relationship, the customer information data can be screened based on the preset economic dependency rules.
For example, the economic dependency rules can be constructed from aspects such as fund occupation, fund compensation and source of repayment funds; data that satisfies the economic dependency rules is retained, and data that does not is filtered out. For example, the economic dependency rules may include the following: (1) enterprise B is held by enterprise A (if enterprise A holds enterprise B, enterprise A is considered able to occupy the funds of enterprise B); (2) enterprise A and enterprise B have a fund transaction relationship, and the transaction amount exceeds a preset proportion (such as 30%) of the total annual incoming or outgoing postings of enterprise A or enterprise B; (3) enterprise A and enterprise B have a bill upstream and downstream relationship, and the bill amount exceeds 30% of the total bill collection or total bill payment of enterprise A or enterprise B in the last year; (4) in the fund transaction relationship between enterprise A and enterprise B, the amount exceeds a preset proportion (such as 30%) of the registered capital of the enterprise; (5) in the fund transaction relationship between enterprise A and enterprise B within one year, the amount exceeds the accounts due to be paid or the debts paid by enterprise A or enterprise B within half a year; (6) for enterprise A, the average funds transferred in from enterprise B over two years are greater than the total income of enterprise A in the previous year minus the debt it paid off in the current year; (7) enterprise A and enterprise B have a bill upstream and downstream relationship in which enterprise A is the payee and enterprise B is the payer, and the bill amount exceeds 50% of the total income or total expenditure of enterprise A in the last year, or exceeds the total income of enterprise A in the last year minus the liabilities it paid in the current year; (8) enterprise A provides a guarantee to enterprise B; (9) the actual controller or legal person of enterprise A provides a guarantee to enterprise B.
Customer information data that satisfies the preset economic dependency rules can be considered to carry a high possibility of default risk conduction and to reflect a close economic transaction relationship, and can be retained as data related to economic dependency clients. The customer information graph is then constructed based on the screened customer information data; for example, customers with economic transactions can be used as nodes in the customer information graph, and the relationships between customers can be used as edges. Illustratively, fig. 2 is a schematic diagram of a customer information graph provided by an embodiment of the present invention.
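The screening and graph construction described above can be sketched as follows. This is a minimal illustration only: the record layout, the relation names and the helper names `satisfies_rule` and `build_customer_graph` are hypothetical, and only a 30% fund-transaction rule plus the guarantee and holding rules are modelled.

```python
# Hypothetical record layout (an assumption, not from the patent):
# (source, target, relation, amount, annual_total_of_source)
RECORDS = [
    ("A", "B", "fund_transfer", 400.0, 1000.0),  # 40% of A's annual total
    ("A", "C", "fund_transfer", 50.0, 1000.0),   # 5% of A's annual total
    ("A", "D", "guarantee", None, None),
    ("E", "B", "holding", None, None),
]

RATIO_THRESHOLD = 0.30  # the "preset proportion (such as 30%)" in the text


def satisfies_rule(record):
    """Return True if the record meets one of the simplified dependency rules."""
    _, _, relation, amount, annual_total = record
    if relation in ("guarantee", "holding"):
        return True  # guarantees and holdings are retained unconditionally
    if relation == "fund_transfer" and amount is not None and annual_total:
        return amount / annual_total > RATIO_THRESHOLD
    return False


def build_customer_graph(records):
    """Keep only records that satisfy a rule; customers become nodes, relations become edges."""
    graph = {}
    for src, dst, relation, *_ in records:
        if not satisfies_rule((src, dst, relation, *_)):
            continue
        graph.setdefault(src, []).append((dst, relation))
        graph.setdefault(dst, [])  # make sure the target also appears as a node
    return graph


graph = build_customer_graph(RECORDS)
```

In this toy data the A to C transfer is filtered out, so customer C never becomes a node of the graph.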
Step 120, performing machine learning on the customer information graph based on a graph convolutional neural network algorithm, and determining a target embedding vector of each node in the customer information graph.
A graph convolutional network (GCN) is a method for performing deep learning on graph data. In the embodiment of the invention, machine learning is performed on the customer information graph based on the graph convolutional neural network algorithm, and the target embedding vector of each node in the customer information graph is determined. For example, node sequences may be obtained from the customer information graph by meta-path-based random walks, and the node sequences may then be learned with a skip-gram algorithm to determine the target embedding vector of each node in the customer information graph.
Step 130, determining an economic dependency graph corresponding to the customer information graph based on the target embedding vectors.
In the embodiment of the invention, an economic dependency graph corresponding to the customer information graph is constructed according to the target embedding vector of each node in the customer information graph, where the economic dependency graph is an undirected graph that reflects the customer information graph. Optionally, determining the economic dependency graph corresponding to the customer information graph based on the target embedding vectors includes: for any two nodes in the customer information graph, calculating the cosine similarity of their target embedding vectors; and taking the cosine similarity as the weight value between the two corresponding nodes, and constructing the economic dependency graph corresponding to the customer information graph according to the weight values. Specifically, every pair of nodes in the customer information graph is traversed, the cosine similarity is calculated from the target embedding vectors of the two selected nodes, and the cosine similarity is used as the weight value of the connection between them. The economic dependency graph is then constructed according to the weight values. It can be understood that the economic dependency graph includes nodes, connection relationships between the nodes, and weight values between connected nodes, where a weight value reflects the closeness of the economic transactions between the entity customers corresponding to the two nodes. Illustratively, fig. 3 is a schematic diagram of an economic dependency graph provided by an embodiment of the present invention.
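As an illustration of the weighting step, the sketch below computes the cosine similarity for every pair of nodes and stores it as the weight of an undirected edge. The function names and the `min_weight` cut-off are assumptions made for the example; the text does not state whether low-similarity pairs are dropped.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors, chosen for this sketch
    return dot / (norm_u * norm_v)


def build_dependency_graph(embeddings, min_weight=0.0):
    """Return {(node_a, node_b): weight} for every node pair above min_weight.

    min_weight is a hypothetical cut-off, not something the source specifies.
    """
    edges = {}
    for a, b in combinations(sorted(embeddings), 2):
        weight = cosine_similarity(embeddings[a], embeddings[b])
        if weight > min_weight:
            edges[(a, b)] = weight
    return edges


# Toy target embedding vectors (made up for illustration)
embeddings = {"A": [1.0, 0.0], "B": [1.0, 0.0], "C": [0.0, 1.0]}
dependency_edges = build_dependency_graph(embeddings)
```

Here A and B point in the same direction and receive weight 1.0, while C is orthogonal to both and gets no edge.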
Step 140, performing cluster analysis on the economic dependency graph to determine an economic dependency client group whose members have an economic dependency relationship.
In the embodiment of the invention, cluster analysis can be performed on the economic dependency graph based on a preset clustering algorithm, and the economic dependency client group is determined according to the clustering result. Optionally, performing cluster analysis on the economic dependency graph to determine an economic dependency client group includes: performing cluster analysis on the economic dependency graph based on the Louvain algorithm to determine the economic dependency client group. The economic dependency client group may include a guarantee-circle client group, a supply-chain client group, or a group of affiliated enterprises with economic transactions, as well as other enterprises within the risk influence range of those affiliated enterprises.
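The Louvain algorithm searches for a partition of a weighted graph that maximises modularity. The sketch below does not implement Louvain itself; it only computes the weighted modularity score of a given partition, to show what the clustering step optimises. The function name, the edge format and the toy graph are assumptions for illustration.

```python
def modularity(edges, communities):
    """Weighted modularity Q of a partition of an undirected graph without self-loops.

    edges: {(u, v): weight}; communities: list of disjoint sets of nodes.
    Q = sum over communities of [internal_weight / m - (strength_sum / (2m))^2]
    """
    total = sum(edges.values())  # m, the total edge weight
    if total == 0:
        return 0.0
    strength = {}
    for (u, v), w in edges.items():
        strength[u] = strength.get(u, 0.0) + w
        strength[v] = strength.get(v, 0.0) + w
    q = 0.0
    for community in communities:
        internal = sum(w for (u, v), w in edges.items()
                       if u in community and v in community)
        strength_sum = sum(strength.get(n, 0.0) for n in community)
        q += internal / total - (strength_sum / (2.0 * total)) ** 2
    return q


# Two tightly connected triangles joined by one weak edge (toy data)
edges = {
    ("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.0,
    ("D", "E"): 1.0, ("E", "F"): 1.0, ("D", "F"): 1.0,
    ("C", "D"): 0.1,
}
split_q = modularity(edges, [{"A", "B", "C"}, {"D", "E", "F"}])
merged_q = modularity(edges, [{"A", "B", "C", "D", "E", "F"}])
```

Splitting the toy graph into its two triangles scores higher than keeping all six customers in one group, which is the kind of grouping a modularity-based method favours.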
The mining method for economic dependency clients provided by the embodiment of the invention acquires a customer information graph in response to a mining event for economic dependency clients being triggered, wherein the customer information graph comprises association relationships between at least two customers; performs machine learning on the customer information graph based on a graph convolutional neural network algorithm, and determines a target embedding vector of each node in the customer information graph; determines an economic dependency graph corresponding to the customer information graph based on the target embedding vectors; and performs cluster analysis on the economic dependency graph to determine an economic dependency client group whose members have an economic dependency relationship. With this technical scheme, customers with an economic dependency relationship can be mined accurately, economic dependency client groups can be identified, and the business management and risk management of financial enterprises are effectively supported.
In some embodiments, learning the customer information graph based on a graph convolution neural network algorithm to determine a target embedded vector for each node in the customer information graph comprises: determining at least one conductive link contained in the customer information graph; determining initial embedded vectors of all nodes in each conductive link based on a graph convolution neural network algorithm; and aiming at each node in the customer information map, determining a target embedded vector of the current node in the customer information map according to the initial embedded vector of the current node in the involved conductive link.
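One possible way to derive a target embedding vector from the initial embedding vectors of a node across the conductive links it is involved in is sketched below. The element-wise mean used here is purely an assumption for illustration, since this passage does not state how the vectors are combined; the function name and link names are also hypothetical.

```python
def combine_embeddings(per_link_embeddings):
    """Average each node's initial vectors over the conductive links it appears in.

    per_link_embeddings: {link_name: {node: vector}}
    Returns {node: mean vector}. Averaging is an assumed aggregation rule.
    """
    sums, counts = {}, {}
    for node_vectors in per_link_embeddings.values():
        for node, vec in node_vectors.items():
            if node not in sums:
                sums[node] = [0.0] * len(vec)
                counts[node] = 0
            sums[node] = [s + x for s, x in zip(sums[node], vec)]
            counts[node] += 1
    return {node: [s / counts[node] for s in sums[node]] for node in sums}


# Toy initial embeddings for two hypothetical conductive links
initial = {
    "holding_link": {"A": [1.0, 0.0], "B": [0.0, 1.0]},
    "guarantee_link": {"A": [0.0, 1.0]},
}
target = combine_embeddings(initial)
```

Node A appears in both links, so its target vector is the mean of its two initial vectors; node B appears in only one link and keeps its vector unchanged.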
Specifically, the entity customers related to the nodes in the customer information graph may include enterprise (controlled) -> enterprise (controller), enterprise (guarantor) -> enterprise (guarantor), legal person (guarantor) -> enterprise (guarantor), actual controller (guarantor) -> enterprise (guarantor), parent (guarantor) -> enterprise (guarantor), enterprise (supply chain downstream) -> enterprise (supply chain upstream), enterprise (fund transfer) -> enterprise (fund transfer), enterprise (bill) -> enterprise (bill collection), enterprise (controlled) -> actual controller, enterprise (controlled) -. The relationship between any two nodes in the customer information graph may include enterprise (controlled) -> enterprise (controller), enterprise (guarantor) -> enterprise (guarantor), legal person (guarantor) -> enterprise, enterprise (supply chain downstream) -> enterprise (supply chain upstream), enterprise (bill transfer), enterprise (controlled) -> legal person, and enterprise (controlled) -> parent. Thus, the relationships of the edges constructed between two nodes in the customer information graph may include natural person-(holding)-enterprise, enterprise-(holding)-natural person, parent-(holding)-enterprise, enterprise-(holding)-parent, enterprise-(holding)-enterprise, enterprise-(economic exchange, i.e. transfer)-enterprise, enterprise-(guarantee)-enterprise, enterprise-(upstream and downstream)-core enterprise, and core enterprise-(upstream and downstream)-enterprise.
By summarizing the relationships of risk conduction and economic exchange, the conductive links (metapaths) involved in the customer information graph may include: enterprise -> (holding) -> natural person -> (holding) -> enterprise (conduction through a natural person); enterprise -> (holding) -> parent -> (holding) -> enterprise (conduction through a parent); enterprise -> (holding) -> enterprise; enterprise -> (economic exchange relationship) -> enterprise; enterprise -> (guarantee relationship) -> enterprise (conduction through a guarantee relationship); and enterprise -> (upstream and downstream relationship) -> core enterprise -> (upstream and downstream relationship) -> enterprise (conduction through the supply chain).
In the embodiment of the invention, the customer information graph is analyzed to determine the specific conductive links it contains; different customer information graphs may contain different conductive links. The initial embedding vector of each node in each conductive link is then determined based on the graph convolutional neural network algorithm. For example, graph embedding learning may be performed for each conductive link in the customer information graph based on Metapath2vec to determine the initial embedding vector of each node. Specifically, for each conductive link, a heterogeneous neighborhood of each node in the current conductive link is constructed by random walk, the heterogeneous neighborhood is learned with a skip-gram algorithm, and the initial embedding vector of each node is calculated.
Optionally, the graph convolutional neural network algorithm includes a Metapath2vec algorithm, and the Metapath2vec algorithm includes a random walk algorithm and a skip-gram algorithm. Determining the initial embedding vector of each node in each conductive link based on the graph convolutional neural network algorithm includes: for each conductive link, determining a training sample set corresponding to the current conductive link based on the random walk algorithm; and learning the training sample set based on the skip-gram algorithm to determine the initial embedding vector of each node in the current conductive link. Optionally, for each conductive link, determining the training sample set corresponding to the current conductive link based on the random walk algorithm includes: determining random walk parameters for the conductive link, the random walk parameters comprising a random walk step length and a number of random walks; taking each node of the current conductive link in turn as a starting point, and determining the node sequences corresponding to the current conductive link by random walk over the current conductive link based on the random walk parameters; and determining the training sample set corresponding to the current conductive link according to the node sequences. The random walk parameters may be values input by a user or default parameters of the system. The random walk parameters may include a random walk step length L and a number of random walks n; for example, the random walk step length may be set to 5 (i.e., a walk contains at most 5 nodes), and the number of random walks may be set to 10 to 20.
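A metapath-guided random walk with a step length L and n walks per starting node might look like the sketch below. The graph layout, the node-type labels and the function name `metapath_walks` are hypothetical; as a simplification, walks start only from nodes of the first type in the metapath, and a walk stops early if no neighbor of the required type exists.

```python
import random


def metapath_walks(adjacency, node_type, metapath, walk_length, walks_per_node, seed=0):
    """Generate random walks whose node types follow a symmetric metapath.

    adjacency: {node: [neighbors]}, node_type: {node: type name},
    metapath: tuple of type names whose first and last entries match.
    """
    if metapath[0] != metapath[-1]:
        raise ValueError("this sketch assumes a symmetric metapath")
    rng = random.Random(seed)
    cycle = metapath[:-1]  # the type pattern repeats after the last step
    walks = []
    starts = [n for n in sorted(adjacency) if node_type[n] == metapath[0]]
    for start in starts:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                wanted = cycle[len(walk) % len(cycle)]
                candidates = [n for n in adjacency[walk[-1]] if node_type[n] == wanted]
                if not candidates:
                    break  # dead end: no neighbor of the required type
                walk.append(rng.choice(candidates))
            walks.append(walk)
    return walks


# Toy holding graph: three enterprises held by one natural person
adjacency = {"E1": ["P1"], "E2": ["P1"], "E3": ["P1"], "P1": ["E1", "E2", "E3"]}
node_type = {"E1": "enterprise", "E2": "enterprise", "E3": "enterprise", "P1": "person"}
walks = metapath_walks(adjacency, node_type,
                       ("enterprise", "person", "enterprise"),
                       walk_length=5, walks_per_node=10)
```

With 3 starting enterprises and 10 walks each, the call produces 30 node sequences of 5 nodes, alternating between enterprise and person nodes.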
In the embodiment of the invention, for each conductive link, each node of the link is taken in turn as a starting point, a random walk is carried out on the current conductive link based on the random walk parameters, and the corresponding node sequences are determined from the current conductive link during the walk. For example, if the random walk step length is L, the number of random walks is n, and the current conductive link includes m nodes in total, then the random walk produces m × n node sequences for the current conductive link, each of length L. A training sample set corresponding to the current conductive link is then determined from the node sequences, the training sample set is learned based on the skip-gram algorithm, and an initial embedded vector of each node in the current conductive link is determined. Specifically, when the training sample set is learned based on the skip-gram algorithm, cross-entropy loss is used as the training target, and after training is completed, the initial embedding vector (embedding) of each node for the current conductive link (metapath) is obtained. It should be noted that the random walk parameters corresponding to each conductive link may be the same or different, and this is not limited in the embodiment of the present invention.
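The walk procedure described above can be illustrated with a minimal sketch. It assumes the current conductive link is available as an adjacency dictionary and uses a plain uniform random walk; the function name `random_walks`, the node labels, and the early stop at a dead end are illustrative assumptions, not details given in the embodiment.

```python
import random

def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Start walks_per_node walks from every node; each walk holds at most walk_length nodes."""
    rng = random.Random(seed)
    sequences = []
    for start in adjacency:                      # each node is a starting point in turn
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:                # assumption: stop early at a dead end
                    break
                walk.append(rng.choice(neighbors))
            sequences.append(walk)
    return sequences

# Hypothetical conductive link with m = 4 nodes (enterprises E1, E2; natural persons P1, P2).
link = {
    "E1": ["P1", "P2"],
    "P1": ["E1", "E2"],
    "E2": ["P1", "P2"],
    "P2": ["E1", "E2"],
}
walks = random_walks(link, walk_length=5, walks_per_node=10)
print(len(walks))  # m * n = 4 * 10 = 40 node sequences
```

Because every node in this toy link has neighbors, each of the 40 sequences has exactly L = 5 nodes, matching the m × n count given in the text.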
Optionally, determining a training sample set corresponding to the current conductive link according to the node sequence includes: traversing each node in the node sequence, determining a target adjacent node corresponding to the current node according to the size of a preset window, and taking a node pair formed by the current node and the target adjacent node as a positive sample set; according to a preset positive and negative sample proportion, sampling neighbor node pairs which do not belong to the positive sample set from the node sequence to form a negative sample set; and using the positive sample set and the negative sample set as training sample sets corresponding to the current conductive link. Specifically, for each node sequence corresponding to each conductive link, each node in the current node sequence is traversed, a target adjacent node (adj) corresponding to the current node (center) is determined according to the preset window size w, that is, in the current node sequence, a node within the range of the left window w and the right window w of the current node (center) is selected as the target adjacent node (adj), and a node pair (center, adj) formed by the current node (center) and the target adjacent node (adj) is used as a positive sample set. For each node in the current node sequence, sampling neighbor node pairs which are not in the positive sample set from the node sequence according to a preset positive-negative sample ratio r (for example, r is 1: 5), and taking the neighbor node pairs as a negative sample set. Then, the positive sample set and the negative sample set are used as a training sample set corresponding to the current conductive link.
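As a rough illustration of the sampling step, the sketch below builds positive pairs from a window of size w around each node and then draws negative pairs from node pairs of the same sequence that are not positive. The function name `build_samples`, the exclusion of self-pairs, and the cap applied when too few candidate pairs exist are assumptions made for this example.

```python
import random

def build_samples(sequence, window, neg_per_pos, seed=0):
    """Return (positives, negatives) as sets of (center, adj) node pairs."""
    rng = random.Random(seed)
    positives = set()
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        for j in range(lo, hi):                  # nodes within w positions left and right
            if j != i and sequence[j] != center:
                positives.add((center, sequence[j]))
    nodes = sorted(set(sequence))
    # Candidate negatives: ordered pairs from the sequence that are not positive pairs.
    candidates = [(a, b) for a in nodes for b in nodes
                  if a != b and (a, b) not in positives]
    wanted = neg_per_pos * len(positives)        # preset positive-negative ratio 1 : neg_per_pos
    negatives = set(rng.sample(candidates, min(wanted, len(candidates))))
    return positives, negatives

sequence = ["A", "B", "C", "D", "E", "F"]
positives, negatives = build_samples(sequence, window=1, neg_per_pos=1)
print(len(positives), len(negatives))
```

The example uses a ratio of 1:1 only because the toy sequence is short; the text gives 1:5 as an example ratio, which would be passed as `neg_per_pos=5` on longer sequences.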
In the embodiment of the present invention, the initial embedded vector of each node in each conductive link, that is, the initial embedded vector of each node in each conductive link in the customer information graph, can be determined by the above method. A target embedding vector for each node in the customer information graph may then be calculated based on the initial embedding vector for each node in each conductive link in the customer information graph. Optionally, determining a target embedding vector of the current node in the customer information graph according to an initial embedding vector of the current node in the involved conductive links, includes: determining initial embedding vectors of the current node in each involved conductive link; and performing connection operation on the initial embedded vectors of the current node in the involved conducting links, and taking the vectors generated by connection as target embedded vectors of the current node in the customer information map. Specifically, for each node in the customer information graph, performing a connection operation (concatenate) on an initial embedded vector of the current node in each involved conductive link, and using the generated vector as a target embedded vector of the current node in the customer information graph. Optionally, the connecting operation includes a splicing operation and a summing operation. The splicing operation is to enlarge the dimension of the initial embedded vector, and if a certain node relates to two 10-dimensional initial embedded vectors, the generated target embedded vector is a 20-dimensional vector after the two 10-dimensional initial embedded vectors are spliced. The summing operation is to sum the dimensions corresponding to the initial embedding vectors.
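The difference between the two connection operations can be shown with a small sketch. The vectors are plain Python lists and the names `splice` and `elementwise_sum` are illustrative; the values are made up for the example.

```python
def splice(vectors):
    """Splicing: concatenate the vectors, so the dimensions add up."""
    return [x for v in vectors for x in v]

def elementwise_sum(vectors):
    """Summation: add corresponding dimensions, so the dimension is unchanged."""
    return [sum(components) for components in zip(*vectors)]

# Hypothetical initial embeddings of one node in two conductive links.
initial = {"link_1": [1.0, 2.0, 3.0], "link_2": [4.0, 5.0, 6.0]}
vectors = list(initial.values())

spliced = splice(vectors)          # 3 + 3 = 6 dimensions
summed = elementwise_sum(vectors)  # still 3 dimensions
print(spliced, summed)
```

Splicing keeps the per-link information separate at the cost of a larger vector, while summation requires all initial vectors to share one dimension.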
In some embodiments, after performing cluster analysis on the economic dependency graph and determining the economic dependency customer population, the method further includes: constructing a guarantee connection diagram based on a preset guarantee relationship in the economic dependency client group; and determining a connected component in the guarantee connection graph, and taking the clients related to the connected component as a guarantee chain client group. Specifically, data exploration showed that the guarantee relationships in the existing graph relationships cover only 26% of the guarantee relationships in the security system, so the guarantee relationships in the customer information graph can be supplemented with data extracted from the security system. A guarantee connection graph is then constructed within the identified economic dependency client group based on preset guarantee relationships, which may include: enterprise (guarantor)-enterprise (guaranteed party), natural person (guarantor)-enterprise (guaranteed party), and natural person (guarantor)-natural person (guaranteed party) guarantee relationships. The guarantee connection graph built from the preset guarantee relationships is treated as an undirected graph, all of its connected components are determined, and the clients related to each connected component are taken as a client group with mutual guarantee relationships. Alternatively, since a guarantee chain or guarantee circle generally forms around one or several center companies, the center company of each connected component may be further determined as the center company of the guarantee circle.
Guarantee modes include direct guarantee, joint guarantee, cyclic guarantee, mutual guarantee and the like, and guarantee structures generally take linear, star, chain, ring and similar forms. These complex relations can be organized with a guarantee connection diagram, and a complete guarantee enterprise group can be found with a graph algorithm.
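Finding guarantee chain client groups as connected components of an undirected graph can be sketched as follows. The edge list and function names are hypothetical, and picking the center company as the node with the most guarantee links is only one possible heuristic; the embodiment does not state how the center company is determined.

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Treat guarantee edges as undirected and return (components, adjacency)."""
    adjacency = defaultdict(set)
    for guarantor, guaranteed in edges:
        adjacency[guarantor].add(guaranteed)
        adjacency[guaranteed].add(guarantor)
    adjacency = dict(adjacency)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:                             # breadth-first search from start
            node = queue.popleft()
            component.add(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components, adjacency

def center_of(component, adjacency):
    """Assumed heuristic: the member with the most guarantee links."""
    return max(sorted(component), key=lambda n: len(adjacency[n]))

# Hypothetical guarantee relationships (guarantor, guaranteed party).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D"), ("E", "F")]
components, adjacency = connected_components(edges)
groups = sorted(sorted(c) for c in components)
center = center_of(set(groups[0]), adjacency)
print(groups, center)
```

Here the ring A-C-D and the direct guarantee A-B fall into one group, and E-F forms a second, separate group.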
In some embodiments, after performing cluster analysis on the economic dependency graph and determining the economic dependency customer population, the method further includes: constructing a supply chain connection diagram based on a preset supply chain relation in the economic dependency customer population. Specifically, in the identified economic dependency customer group, a supply chain connection diagram is constructed based on preset supply chain relations, which may include: enterprise (supplier)-enterprise (core enterprise), enterprise (graph supply chain upstream)-enterprise (graph supply chain downstream), and enterprise (graph bill upstream)-enterprise (graph bill downstream). The supply chain data is mainly the relationship between suppliers and core enterprises, centered on e-credit data for financing by bank acceptance draft. Only 2% of this data is covered by the supply chain upstream and downstream relationships and the bill upstream and downstream relationships in the graph. The bill upstream and downstream relationships in the graph data are mainly extracted from the applicant and holder of a draft and the drawer and payee of a check, so the supply chain relations of bank acceptance drafts can be extracted from the e-credit data to enrich the graph data. In addition, unlike guarantees, supply chains generally do not form circular relationships; they are primarily chain relationships such as supplier-core enterprise. Since one core enterprise has a plurality of suppliers, and one supplier may correspond to a plurality of core enterprises, the supply chain relationships are interlaced from the community perspective, and clear boundaries cannot be drawn.
Therefore, a supply chain is identified by running a graph query that starts from an enterprise, querying the related upstream and downstream enterprises of that enterprise, and constructing a supply chain connection diagram from the result.
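A graph query that starts from one enterprise and collects its upstream and downstream enterprises might look like the sketch below. The directed edge list (supplier, customer), the function names, and the choice to follow relations transitively are assumptions for illustration.

```python
from collections import defaultdict, deque

def reachable(start, neighbors):
    """Collect every node reachable from start by following neighbors."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def supply_chain_of(enterprise, edges):
    """Return (upstream, downstream) enterprises of the given enterprise."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for supplier, customer in edges:
        downstream[supplier].add(customer)
        upstream[customer].add(supplier)
    return reachable(enterprise, upstream), reachable(enterprise, downstream)

# Hypothetical supply chain relations (supplier, customer).
edges = [("S1", "Core1"), ("S2", "Core1"), ("S1", "Core2"), ("Core1", "Buyer")]
up, down = supply_chain_of("Core1", edges)
print(sorted(up), sorted(down))
```

Querying per enterprise sidesteps the boundary problem noted above: supplier S1 serves both Core1 and Core2, yet the query for Core1 still returns a well-defined result.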
FIG. 4 is a flowchart of a mining method for economic dependency clients in another embodiment of the present invention. As shown in FIG. 4, the method includes the following steps:
Step 410, acquiring client information data in response to the mining event of the economic dependency client being triggered.
Step 420, performing a screening operation on the client information data according to a preset economic dependence rule, and constructing a client information map based on the screened client information data.
Step 430, determining at least one conductive link included in the customer information graph.
Step 440, determining a random walk parameter for each conductive link; the random walk parameters include a random walk step length and a random walk number.
Step 450, sequentially taking each node of the current conductive link as a starting point, and determining a node sequence corresponding to the current conductive link in a random walk mode from the current conductive link based on the random walk parameters.
Step 460, traversing each node in the node sequence, determining a target neighboring node corresponding to the current node according to the size of the preset window, and taking a node pair formed by the current node and the target neighboring node as a positive sample set.
Step 470, sampling neighbor node pairs which do not belong to the positive sample set from the node sequence according to the preset positive and negative sample proportion to form a negative sample set.
Step 480, using the positive sample set and the negative sample set as training sample sets corresponding to the current conductive link.
Step 490, learning the training sample set based on the skip-gram algorithm, and determining the initial embedded vector of each node in the current conductive link.
Step 4100, for each node in the customer information graph, determining a target embedding vector of the current node in the customer information graph according to the initial embedding vector of the current node in the involved conductive link.
Step 4110, for any two nodes in the customer information graph, calculating cosine similarity according to the target embedded vectors of the current two nodes.
Step 4120, using the cosine similarity as a weight value between two corresponding nodes in the customer information graph, and constructing an economic dependency graph corresponding to the customer information graph according to the weight value.
Step 4130, performing cluster analysis on the economic dependency graph based on the Louvain algorithm, and determining the economic dependency client population with economic dependency.
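Steps 4110 and 4120 can be sketched as follows: cosine similarity is computed for every node pair and stored as an edge weight. The function names and toy vectors are hypothetical, and the `threshold` that drops low-weight edges is an assumption added to keep the example graph sparse; the embodiment does not mention such a cutoff.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def dependency_graph(embeddings, threshold=0.0):
    """Map each node pair to its cosine weight, keeping weights above threshold."""
    weights = {}
    for a, b in combinations(sorted(embeddings), 2):
        w = cosine(embeddings[a], embeddings[b])
        if w > threshold:
            weights[(a, b)] = w
    return weights

# Hypothetical target embedding vectors of three customers.
embeddings = {"A": [1.0, 0.0], "B": [2.0, 0.0], "C": [0.0, 1.0]}
graph = dependency_graph(embeddings)
print(graph)
```

The resulting weighted edges are the input to the clustering of step 4130; an existing community detection implementation, for example the `louvain_communities` function in NetworkX, could be applied to them.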
The method for mining the economic dependence client provided by the embodiment of the invention can accurately mine the client with the economic dependence relationship, realize the identification of the economic dependence client group and effectively ensure the business operation and risk management of financial enterprises.
Fig. 5 is a schematic structural diagram of a mining device for economic dependency clients according to another embodiment of the present invention. As shown in Fig. 5, the apparatus includes: an information graph obtaining module 510, an embedded vector determining module 520, a dependency graph determining module 530, and a dependency client mining module 540, wherein:
an information map obtaining module 510, configured to obtain a customer information map in response to a mining event of an economic dependency customer being triggered; wherein, the customer information map comprises the incidence relation between at least two customers;
an embedded vector determination module 520, configured to perform machine learning on the client information graph based on a graph convolution neural network algorithm, and determine a target embedded vector of each node in the client information graph;
a dependency graph determining module 530, configured to determine an economic dependency graph corresponding to the customer information graph based on the target embedded vector;
and the dependency client mining module 540 is configured to perform cluster analysis on the economic dependency relationship graph to determine an economic dependency client group with economic dependency relationship.
The mining device for economic dependency clients provided by the embodiment of the invention obtains the client information map in response to the mining event of the economic dependency client being triggered, wherein the customer information map comprises the incidence relation between at least two customers; performs machine learning on the customer information graph based on a graph convolution neural network algorithm, and determines a target embedded vector of each node in the customer information graph; determines an economic dependency graph corresponding to the customer information graph based on the target embedded vector; and performs cluster analysis on the economic dependency relationship graph to determine an economic dependency client group with economic dependency relationship. With the technical scheme provided by the embodiment of the invention, customers with economic dependence relationships can be accurately mined, economic dependence customer groups can be identified, and the business management and risk management of financial enterprises are effectively supported.
Optionally, the embedded vector determining module includes:
a conductive link determination submodule for determining at least one conductive link included in the customer information map;
the initial embedded vector determining submodule is used for determining an initial embedded vector of each node in each conducting link based on a graph convolution neural network algorithm;
and the target embedded vector determining sub-module is used for determining a target embedded vector of the current node in the customer information map according to the initial embedded vector of the current node in the involved conductive link aiming at each node in the customer information map.
Optionally, the graph convolutional neural network algorithm includes a Metapath2vec algorithm, and the Metapath2vec algorithm includes a random walk algorithm and a skip-gram algorithm;
the initial embedded vector determination sub-module includes:
a training sample set determining unit, configured to determine, for each conductive link, a training sample set corresponding to a current conductive link based on a random walk algorithm;
and the initial embedded vector determining unit is used for learning the training sample set based on the skip-gram algorithm and determining the initial embedded vector of each node in the current conduction link.
Optionally, the training sample set determining unit includes:
a walk parameter determination subunit configured to determine a random walk parameter for each conductive link; the random walk parameters comprise a random walk step length and a random walk frequency;
a node sequence determining subunit, configured to sequentially use each node of a current conductive link as a starting point, and determine, based on the random walk parameter, a node sequence corresponding to the current conductive link in a random walk manner from the current conductive link;
and the training sample set determining subunit is used for determining a training sample set corresponding to the current conductive link according to the node sequence.
Optionally, the training sample set determining subunit is configured to:
traversing each node in the node sequence, determining a target adjacent node corresponding to the current node according to the size of a preset window, and taking a node pair formed by the current node and the target adjacent node as a positive sample set;
according to a preset positive and negative sample proportion, sampling neighbor node pairs which do not belong to the positive sample set from the node sequence to form a negative sample set;
and using the positive sample set and the negative sample set as training sample sets corresponding to the current conductive link.
Optionally, the target embedded vector determining sub-module is configured to:
determining initial embedding vectors of the current node in each involved conductive link;
and performing connection operation on the initial embedded vectors of the current node in the involved conducting links, and taking the vectors generated by connection as target embedded vectors of the current node in the customer information map.
Optionally, the connecting operation includes a splicing operation and a summing operation.
Optionally, the dependency graph determining module is configured to:
aiming at any two nodes in the customer information map, calculating cosine similarity according to target embedded vectors of the current two nodes;
and taking the cosine similarity as a weight value between two corresponding nodes in the customer information map, and constructing an economic dependency relationship map corresponding to the customer information map according to the weight value.
Optionally, the information map obtaining module is configured to:
acquiring client information data in response to the mining event of the economic dependency client being triggered;
and screening the customer information data according to a preset economic dependence rule, and constructing a customer information map based on the screened customer information data.
Optionally, the dependent client mining module is configured to:
and performing cluster analysis on the economic dependency relationship graph based on a Louvain algorithm to determine an economic dependency client group with economic dependency relationship.
Optionally, the apparatus further comprises:
the guarantee connection graph building module is used for building a guarantee connection graph based on a preset guarantee relation in the economic dependency client group after performing cluster analysis on the economic dependency relation graph and determining the economic dependency client group;
and the guarantee chain client group determining module is used for determining the connected component in the guarantee connection graph and taking the client related to the connected component as a guarantee chain client group.
Optionally, the apparatus further comprises:
and the supply chain connection diagram construction module is used for constructing a supply chain connection diagram based on a preset supply chain relation in the economic dependency client group after the economic dependency relationship diagram is subjected to cluster analysis and the economic dependency client group is determined.
The device can execute the methods provided by all the embodiments of the invention, and has corresponding functional modules and beneficial effects for executing the methods. For technical details which are not described in detail in the embodiments of the present invention, reference may be made to the methods provided in all the aforementioned embodiments of the present invention.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a mining method for economic dependency clients, the method comprising:
acquiring a client information map in response to the mining event of the economic dependence client being triggered; wherein, the customer information map comprises the incidence relation between at least two customers;
performing machine learning on the customer information graph based on a graph convolution neural network algorithm, and determining a target embedded vector of each node in the customer information graph;
determining an economic dependency graph corresponding to the customer information graph based on the target embedded vector;
and performing cluster analysis on the economic dependency relationship graph to determine an economic dependency client group with economic dependency relationship.
Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory such as flash memory, magnetic media (e.g., hard disk), or optical storage; registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network (such as the internet). The second computer system may provide program instructions to the first computer system for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems that are connected by a network. The storage medium may store program instructions (e.g., embodied as a computer program) that are executable by one or more processors.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the mining operation of the economic dependency client as described above, and may also perform related operations in the mining method of the economic dependency client provided by any embodiment of the present invention.
The embodiment of the invention provides an electronic device, in which the mining device for economic dependency clients provided by the embodiment of the invention can be integrated. Fig. 6 is a block diagram of an electronic device according to an embodiment of the present invention. The electronic device 600 may include: a memory 601, a processor 602, and a computer program stored on the memory 601 and executable by the processor 602, wherein the processor 602 implements the mining method for economic dependency clients according to the embodiment of the invention when executing the computer program.
The electronic device provided by the embodiment of the invention obtains the client information map in response to the mining event of the economic dependency client being triggered, wherein the customer information map comprises the incidence relation between at least two customers; performs machine learning on the customer information graph based on a graph convolution neural network algorithm, and determines a target embedded vector of each node in the customer information graph; determines an economic dependency graph corresponding to the customer information graph based on the target embedded vector; and performs cluster analysis on the economic dependency relationship graph to determine an economic dependency client group with economic dependency relationship. With the technical scheme provided by the embodiment of the invention, customers with economic dependence relationships can be accurately mined, economic dependence customer groups can be identified, and the business management and risk management of financial enterprises are effectively supported.
The mining device, the storage medium and the electronic device for the economic dependency client provided by the above embodiments can execute the mining method for the economic dependency client provided by any embodiment of the present invention, and have corresponding functional modules and beneficial effects for executing the method. The technical details not described in detail in the above embodiments may be referred to the mining method for economic dependency clients provided by any embodiment of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (15)

1. An economic dependency client mining method, comprising:
acquiring a client information map in response to the mining event of the economic dependence client being triggered; wherein, the customer information map comprises the incidence relation between at least two customers;
performing machine learning on the customer information graph based on a graph convolution neural network algorithm, and determining a target embedded vector of each node in the customer information graph;
determining an economic dependency graph corresponding to the customer information graph based on the target embedded vector;
and performing cluster analysis on the economic dependency relationship graph to determine an economic dependency client group with economic dependency relationship.
2. The method of claim 1, wherein learning the customer information graph based on a graph convolutional neural network algorithm to determine a target embedding vector for each node in the customer information graph comprises:
determining at least one conductive link contained in the customer information graph;
determining initial embedded vectors of all nodes in each conductive link based on a graph convolution neural network algorithm;
and aiming at each node in the customer information map, determining a target embedded vector of the current node in the customer information map according to the initial embedded vector of the current node in the involved conductive link.
3. The method of claim 2, wherein the graph convolutional neural network algorithm comprises a Metapath2vec algorithm, the Metapath2vec algorithm comprising a random walk algorithm and a skip-gram algorithm;
determining initial embedded vectors of nodes in each conductive link based on a graph convolution neural network algorithm, wherein the initial embedded vectors comprise:
for each conductive link, determining a training sample set corresponding to the current conductive link based on a random walk algorithm;
and learning the training sample set based on the skip-gram algorithm, and determining the initial embedded vector of each node in the current conduction link.
4. The method of claim 3, wherein determining, for each conductive link, a set of training samples corresponding to a current conductive link based on a random walk algorithm comprises:
determining a random walk parameter for each conductive link; the random walk parameters comprise a random walk step length and a random walk frequency;
sequentially taking each node of a current conductive link as a starting point, and determining a node sequence corresponding to the current conductive link in a random walk mode from the current conductive link based on the random walk parameters;
and determining a training sample set corresponding to the current conductive link according to the node sequence.
5. The method of claim 4, wherein determining a training sample set corresponding to the current conductive link from the sequence of nodes comprises:
traversing each node in the node sequence, determining a target adjacent node corresponding to the current node according to the size of a preset window, and taking a node pair formed by the current node and the target adjacent node as a positive sample set;
according to a preset positive and negative sample proportion, sampling neighbor node pairs which do not belong to the positive sample set from the node sequence to form a negative sample set;
and using the positive sample set and the negative sample set as training sample sets corresponding to the current conductive link.
6. The method of claim 2, wherein determining a target embedding vector of a current node in the customer information graph based on an initial embedding vector of the current node in a involved conductive link comprises:
determining initial embedding vectors of the current node in each involved conductive link;
and performing connection operation on the initial embedded vectors of the current node in the involved conducting links, and taking the vectors generated by connection as target embedded vectors of the current node in the customer information map.
7. The method of claim 6, wherein the connection operation comprises a splicing operation and a summation operation.
8. The method of claim 1, wherein determining an economic dependency graph corresponding to the customer information graph based on the target embedded vector comprises:
aiming at any two nodes in the customer information map, calculating cosine similarity according to target embedded vectors of the current two nodes;
and taking the cosine similarity as a weight value between two corresponding nodes in the customer information map, and constructing an economic dependency relationship map corresponding to the customer information map according to the weight value.
9. The method of claim 1, wherein obtaining a customer information graph in response to an economic dependency customer mining event being triggered comprises:
acquiring client information data in response to the mining event of the economic dependency client being triggered;
and screening the customer information data according to a preset economic dependence rule, and constructing a customer information map based on the screened customer information data.
10. The method according to claim 1, wherein performing cluster analysis on the economic dependency graph to determine an economic dependency customer population comprises:
and performing cluster analysis on the economic dependency relationship graph based on a Louvain algorithm to determine an economic dependency client group with economic dependency relationship.
11. The method according to claim 1, wherein after performing cluster analysis on the economic dependency graph to determine economic dependency customer groups, further comprising:
constructing a guarantee connection diagram based on a preset guarantee relationship in the economic dependency client group;
and determining a connected component in the guarantee connection graph, and taking the client related to the connected component as a guarantee chain client group.
12. The method according to claim 1, wherein after performing cluster analysis on the economic dependency graph to determine economic dependency customer groups, further comprising:
constructing a supply chain connection graph based on preset supply chain relationships within the economic dependency customer groups.
13. An economic dependency client mining device, comprising:
an information graph acquisition module, configured to acquire a customer information graph in response to an economic dependency client mining event being triggered, wherein the customer information graph comprises association relationships between at least two customers;
an embedded vector determining module, configured to perform machine learning on the customer information graph based on a graph convolutional neural network algorithm and determine a target embedded vector for each node in the customer information graph;
a dependency graph determining module, configured to determine an economic dependency graph corresponding to the customer information graph based on the target embedded vectors;
and a dependency client mining module, configured to perform cluster analysis on the economic dependency graph and determine an economic dependency client group having an economic dependency relationship.
14. A computer-readable medium, on which a computer program is stored, which, when executed by a processing device, implements the economic dependency client mining method as claimed in any one of claims 1-12.
15. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements the economic dependency client mining method of any one of claims 1-12.
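By way of illustration only, the pairwise weighting of claim 8 and the connected-component step of claim 11 might be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the dictionary-based graph representation, and the `min_weight` cut-off are hypothetical, and the graph convolutional network of claim 1 and the Louvain clustering of claim 10 are not reproduced here.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_dependency_graph(embeddings, min_weight=0.0):
    """Claim 8 (sketch): weight each node pair by the cosine similarity of its embeddings.

    `embeddings` maps a node id to its target embedded vector.
    `min_weight` is a hypothetical cut-off that is not part of the claim;
    pairs whose weight does not exceed it are left out of the result.
    """
    weights = {}
    for a, b in combinations(sorted(embeddings), 2):
        w = cosine_similarity(embeddings[a], embeddings[b])
        if w > min_weight:
            weights[(a, b)] = w
    return weights


def connected_components(nodes, edges):
    """Claim 11 (sketch): group nodes linked by guarantee edges using iterative DFS."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Toy data, invented for illustration only.
toy_embeddings = {"A": [1.0, 0.0], "B": [2.0, 0.0], "C": [0.0, 1.0]}
toy_weights = build_dependency_graph(toy_embeddings)
toy_groups = connected_components(["A", "B", "C", "D"], [("A", "B"), ("C", "D")])
```

In this toy run only the pair A and B receives a weight, because their vectors are parallel while C is orthogonal to both. A practical system would presumably also decide whether single-node components count as guarantee chain groups, which the claims leave open.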
CN202110351991.8A 2021-03-31 2021-03-31 Mining method and device for economic dependence clients, storage medium and electronic equipment Pending CN112950290A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110351991.8A CN112950290A (en) 2021-03-31 2021-03-31 Mining method and device for economic dependence clients, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110351991.8A CN112950290A (en) 2021-03-31 2021-03-31 Mining method and device for economic dependence clients, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN112950290A true CN112950290A (en) 2021-06-11

Family

ID=76231781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110351991.8A Pending CN112950290A (en) 2021-03-31 2021-03-31 Mining method and device for economic dependence clients, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112950290A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114003648A (en) * 2021-10-20 2022-02-01 支付宝(杭州)信息技术有限公司 Risk transaction group partner identification method and device, electronic equipment and storage medium
CN114003648B (en) * 2021-10-20 2024-04-26 支付宝(杭州)信息技术有限公司 Identification method and device for risk transaction group partner, electronic equipment and storage medium
CN114169945A (en) * 2022-02-08 2022-03-11 北京金堤科技有限公司 Method and device for determining hot supply and demand products in field of object

Similar Documents

Publication Publication Date Title
Sadok et al. Artificial intelligence and bank credit analysis: A review
WO2017013627A1 (en) System and method for provisioning financial transaction between a lender and a borrower
CN110796539A (en) Credit investigation evaluation method and device
CN114187112A (en) Training method of account risk model and determination method of risk user group
CN112967130A (en) Method and device for identifying enterprise association relationship
CN112950290A (en) Mining method and device for economic dependence clients, storage medium and electronic equipment
Pagano et al. Implementation of blockchain technology in insurance contracts against natural hazards: a methodological multi-disciplinary approach
CN115545886A (en) Overdue risk identification method, overdue risk identification device, overdue risk identification equipment and storage medium
Tan Interfaces for enterprise valuation from a real options lens
CN113159930A (en) Customer group identification method and device based on economic dependency relationship
Garin et al. Machine learning in classifying bitcoin addresses
CN115345727B (en) Method and device for identifying fraudulent loan application
Gobet et al. Optimal ecological transition path of a credit portfolio distribution, based on multidate Monge–Kantorovich formulation
Dong et al. Accuracy Comparison between Five Machine Learning Algorithms for Financial Risk Evaluation
US20220398583A1 (en) Transaction reconciliation and deduplication
dei Belliera et al. Flood risk insurance: the Blockchain approach to a Bayesian adaptive design of the contract.
Anton Integration of blockchain technologies and machine learning with deep analysis
Macharia et al. Mobile banking influence on wealth creation for the unbanked
CN113706258A (en) Product recommendation method, device, equipment and storage medium based on combined model
Veres et al. The Concept of Using Artificial Intelligence Methods in Debt Financing of Business Entities.
Zand Towards intelligent risk-based customer segmentation in banking
CN112632197A (en) Service relation processing method and device based on knowledge graph
CN111932131A (en) Service data processing method and device
Bundi Risk Concentration in Networks of Banks Connected by Financial Contracts
Liu et al. Network centrality and credit risk: A comprehensive analysis of peer-to-peer lending dynamics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination