CN112328802A - Data processing method and device and server - Google Patents

Publication number
CN112328802A
Authority
CN
China
Prior art keywords
target
paths
target object
data
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011062803.1A
Other languages
Chinese (zh)
Inventor
刘丹丹
曾威龙
王膂
钱隽夫
王彦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AlipayCom Co ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202011062803.1A
Publication of CN112328802A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Molecular Biology (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The specification provides a data processing method, a data processing device and a server. In one embodiment, the method determines, according to a knowledge graph of a target enterprise constructed from business data related to the target enterprise, a plurality of paths between a first target object and a second target object for which the existence of a preset relationship is to be predicted. The plurality of paths are then processed using a preset processing model; whether the preset relationship exists between the first target object and the second target object is determined from the model output as a processing result, and a target path associated with the processing result is determined and used as interpretation data for the processing result. In this way, the processing result indicating whether the first target object and the second target object have the preset relationship can be determined and, at the same time, more accurate and reasonable interpretation data with higher reference value can be obtained for that processing result.

Description

Data processing method and device and server
Technical Field
The specification belongs to the technical field of internet, and particularly relates to a data processing method, a data processing device and a server.
Background
In some data processing scenarios, a model that behaves like a black box is typically used to process the relevant data in order to predict whether certain relationships exist between different objects (a task that may also be referred to as link prediction). However, such methods often cannot provide users with accurate and reasonable interpretation data for the prediction results.
Disclosure of Invention
The specification provides a data processing method, a data processing device and a server, so that interpretation data which are accurate, reasonable and high in reference value are generated while whether a first target object and a second target object have a preset relation or not is predicted.
The data processing method, the data processing device and the server provided by the specification are realized as follows:
A method of data processing, comprising: acquiring business data related to a target enterprise; constructing a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects; determining a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise; and processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
A method of data processing, comprising: acquiring target service data; constructing a target knowledge graph according to the target service data; wherein the target knowledge-graph comprises a plurality of data objects, including at least a first target object and a second target object, and a plurality of known relationships between the data objects; determining a plurality of paths from the first target object to the second target object according to the target knowledge graph; and processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
A method of data processing, comprising: acquiring sample data, and determining a type label of the sample data according to the sample type of the sample data; wherein the sample types include positive samples and negative samples; constructing a corresponding sample knowledge graph according to the sample data, and determining a plurality of sample paths from the sample knowledge graph; training an initial processing model according to the type label of the sample data and the plurality of sample paths to obtain a preset processing model meeting requirements; wherein the initial processing model comprises at least an initial LSTM layer and an initial Attention layer; the preset processing model is used for determining, according to a knowledge graph, the judgment probability that a first target object and a second target object in the knowledge graph have a preset relationship, and the weight values of a plurality of paths leading from the first target object to the second target object in the knowledge graph.
A data processing apparatus comprising: an acquisition module for acquiring business data related to a target enterprise; a construction module for constructing a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects; a first determining module for determining a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise; and a second determining module for processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and to determine a target path associated with the processing result as interpretation data for the processing result.
A server comprising a processor and a memory for storing processor-executable instructions that, when executed by the processor, implement: acquiring business data related to a target enterprise; constructing a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects; determining a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise; and processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
According to the data processing method, the data processing device and the server provided by the specification, a plurality of paths between a first target object and a second target object, for which the existence of a preset relationship is to be predicted, are first determined according to a knowledge graph of a target enterprise constructed from business data related to the target enterprise. The plurality of paths are then processed using a preset processing model; whether the preset relationship exists between the first target object and the second target object is determined from the model output as a processing result, and a target path associated with the processing result is likewise determined from the model output as interpretation data for the processing result. In this way, while the processing result indicating whether the first target object and the second target object have the preset relationship is determined, more accurate and reasonable interpretation data with higher reference value can be obtained to support the processing result.
Drawings
In order to more clearly illustrate the embodiments of the present specification, the drawings needed to be used in the embodiments will be briefly described below, and the drawings in the following description are only some of the embodiments described in the present specification, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without any creative effort.
FIG. 1 is a diagram illustrating an embodiment of a system architecture for implementing a data processing method according to an embodiment of the present disclosure;
FIG. 2 is a flow diagram of a data processing method provided by one embodiment of the present description;
FIG. 3 is a diagram illustrating an embodiment of a data processing method according to an embodiment of the present disclosure;
FIG. 4 is a diagram illustrating an embodiment of a data processing method according to an embodiment of the present disclosure;
FIG. 5 is a diagram illustrating an embodiment of a data processing method according to an embodiment of the present disclosure;
FIG. 6 is a diagram illustrating an embodiment of a data processing method according to an embodiment of the present disclosure;
FIG. 7 is a flow diagram of a data processing method provided by one embodiment of the present description;
FIG. 8 is a flow diagram of a data processing method provided by one embodiment of the present description;
FIG. 9 is a schematic structural component diagram of a server provided in an embodiment of the present description;
FIG. 10 is a schematic structural composition diagram of a data processing apparatus according to an embodiment of the present specification.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step should fall within the scope of protection of the present specification.
An embodiment of the specification provides a data processing method, which may be applied to a system comprising a server and a terminal device, as shown in FIG. 1. The terminal device may be connected to the server in a wired or wireless manner for data interaction.
Specifically, when a user (or an organization or a platform) wants to determine whether a first target object (e.g., a natural person or an associated company) related to a target enterprise is an enterprise beneficiary of the target enterprise (which may be referred to as a second target object), the user may first collect business data related to the target enterprise, such as an announcement posted by the target enterprise, a contract or agreement established by the target enterprise with other entity objects, registration information of the target enterprise queried at an administrative authority, and the like. Here, the business data collected by the user does not directly indicate that the first target object is an enterprise beneficiary of the target enterprise.
Further, the user may generate, through the terminal device, a corresponding data processing request according to the business data related to the target enterprise. The data processing request may carry the business data related to the target enterprise. Specifically, the data processing request may be request data asking the server to predict, using the business data related to the target enterprise, whether the first target object is an enterprise beneficiary of the target enterprise (which is equivalent to predicting whether a preset relationship, namely that the first target object is an enterprise beneficiary of the second target object, exists between the first target object and the second target object). The terminal device then sends the data processing request to the server.
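As an illustration only, the data processing request described above could be serialized as follows. The specification does not define a wire format, so the class name, the field names and the use of JSON are all assumptions made for this sketch.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class DataProcessingRequest:
    """Hypothetical request payload; every field name here is illustrative."""
    first_target_object: str      # e.g. identity of natural person X
    second_target_object: str     # e.g. identity of target enterprise Y
    preset_relationship: str      # relationship type to be predicted
    business_data: list = field(default_factory=list)  # collected documents


def encode_request(request: DataProcessingRequest) -> str:
    """Serialize the request to JSON before it is sent to the server."""
    return json.dumps(asdict(request), ensure_ascii=False)


request = DataProcessingRequest(
    first_target_object="natural_person_X",
    second_target_object="enterprise_Y",
    preset_relationship="enterprise_beneficiary",
    business_data=[{"type": "announcement", "text": "..."}],
)
payload = encode_request(request)
```

The request carries both the question (which relationship to predict between which two objects) and the business data from which the server builds the knowledge graph.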
After receiving the data processing request, the server may obtain the business data related to the target enterprise carried in the data processing request. In response to the data processing request, the server may use the business data related to the target enterprise to predict whether the first target object is an enterprise beneficiary of the target enterprise as a processing result, and provide interpretation data supporting the processing result.
Specifically, the server may first construct a knowledge graph of the target enterprise according to the business data related to the target enterprise. The knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise (e.g., natural persons, enterprises, cases, domain names, etc.) determined based on the business data, and known relationships between the entity objects determined based on the business data (e.g., enterprise A is a subsidiary of the target enterprise, the payment account of natural person B and the payment account of natural person C are friends, enterprise D and the target enterprise have the same registered address, etc.). The knowledge graph of the target enterprise includes the first target object and the second target object, and no known relationship indicating that the first target object is an enterprise beneficiary of the second target object exists between them.
Further, the server may determine a plurality of paths leading from the first target object to the second target object according to the relationship type of the preset relationship to be predicted and the knowledge graph of the target enterprise. The server may then process the plurality of paths by calling a pre-trained preset processing model comprising a feature layer, an LSTM layer and an Attention layer, so as to determine whether the first target object is an enterprise beneficiary of the target enterprise as a processing result. At the same time, the server may use the preset processing model to determine, from the plurality of paths, a target path that is strongly associated with the processing result, and use the target path as interpretation data supporting the processing result so as to explain the given processing result.
In a specific implementation, the server may first call the feature layer in the preset processing model to perform feature processing on each of the plurality of paths, so as to obtain a plurality of sets of path features respectively corresponding to the plurality of paths. For example, the identity of each entity object on a path, the entity type of the entity object, the relationship type of the known relationship between entity objects, and the like are determined. The server then calls the LSTM layer in the preset processing model to process each set of path features, so as to obtain a plurality of feature vectors respectively corresponding to the plurality of sets of path features. Next, the server calls the Attention layer in the preset processing model to calculate the weight values of the plurality of paths according to the plurality of feature vectors, and uses the Attention layer to calculate, according to the paths and their weight values, the judgment probability that the preset relationship exists between the first target object and the second target object. Finally, the preset processing model outputs the judgment probability and the weight values of the paths.
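The last two steps, turning per-path feature vectors into path weight values and a judgment probability, can be sketched with a simple attention pooling. This is a minimal sketch under stated assumptions: the specification does not give the scoring function, so the dot-product scoring, the softmax normalization, the sigmoid output and the parameters `query`, `out_weights` and `out_bias` are all hypothetical, and the input vectors are stand-ins for the LSTM layer's outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(path_vectors, query, out_weights, out_bias):
    """Return (judgment probability, per-path weight values).

    path_vectors: one feature vector per path (stand-ins for LSTM outputs).
    query, out_weights, out_bias: hypothetical learned parameters.
    """
    # One attention score per path, normalized into weight values.
    weights = softmax([dot(query, v) for v in path_vectors])
    dim = len(path_vectors[0])
    # Weighted sum of the path vectors.
    context = [
        sum(w * v[i] for w, v in zip(weights, path_vectors))
        for i in range(dim)
    ]
    # Map the pooled vector to a probability with a sigmoid.
    logit = dot(out_weights, context) + out_bias
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, weights


path_vectors = [[0.9, 0.1], [0.2, 0.4], [0.1, 0.1]]
probability, weights = attention_pool(
    path_vectors, query=[2.0, 0.0], out_weights=[3.0, -1.0], out_bias=-0.5
)
```

Because the weight values sum to one, they can be read as each path's relative contribution to the judgment probability, which is what makes them usable as interpretation data.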
The server may determine whether the first target object is an enterprise beneficiary of the target enterprise according to the judgment probability output by the preset processing model, and obtain a corresponding processing result. At the same time, according to the weight values of the paths output by the preset processing model, the server may select from the plurality of paths the one or more paths with the largest weight values as target paths that are more strongly associated with the processing result, and determine the target paths as the interpretation data supporting the processing result.
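A minimal sketch of this post-processing step follows. The decision threshold of 0.5 and the `top_k` parameter are assumptions for illustration; the specification only says that the result is determined from the judgment probability and that the paths with the largest weight values are selected.

```python
def interpret(probability, path_weights, paths, threshold=0.5, top_k=1):
    """Turn model outputs into a processing result plus interpretation data.

    threshold and top_k are illustrative assumptions, not values from the text.
    """
    has_relationship = probability >= threshold
    # Rank paths by weight value, largest first.
    ranked = sorted(zip(paths, path_weights), key=lambda pair: pair[1], reverse=True)
    target_paths = [path for path, _ in ranked[:top_k]]
    return has_relationship, target_paths


paths = ["path 1", "path 2", "path 3"]
result, target_paths = interpret(0.83, [0.15, 0.70, 0.15], paths, top_k=1)
```

Here `result` is the processing result and `target_paths` is the interpretation data returned alongside it.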
The server may feed back the processing result to the terminal device together with the interpretation data.
The terminal device may receive and present the above-described processing results to the user, as well as interpretation data for supporting the processing results. The user can judge whether the processing result is reliable or not by combining the interpretation data displayed on the terminal equipment, and further determine whether to adopt and use the processing result or not.
In this embodiment, the server may specifically include a background server that is applied to a network platform side and is capable of implementing functions such as data transmission and data processing. Specifically, the server may be, for example, an electronic device having data operation, storage function and network interaction function. Alternatively, the server may also be a software program that runs in the electronic device and provides support for data processing, storage, and network interaction. In this embodiment, the number of servers included in the server is not particularly limited. The server may specifically be one server, or may also be several servers, or a server cluster formed by several servers.
In this embodiment, the terminal device may specifically include a front-end device that is applied to a user side and can implement functions such as data acquisition and data transmission. Specifically, the terminal device may be, for example, a desktop computer, a tablet computer, a notebook computer, a smart phone, a digital assistant, a smart wearable device, and the like. Alternatively, the terminal device may be a software application capable of running in the electronic device. For example, it may be some APP running on a cell phone, etc.
Referring to fig. 2, an embodiment of the present disclosure provides a data processing method. The method can be applied to the server side in particular. In particular implementations, the method may include the following.
S201: business data related to the target enterprise is obtained.
In some embodiments, the business data related to the target enterprise may be specifically understood as data including an association relationship between the target enterprise and other entity objects. Specifically, the business data related to the target enterprise may include: an announcement posted externally by the target enterprise, a contract or agreement established by the target enterprise with other entity objects, information registered by the target enterprise at the management authority, and so forth. Of course, the above listed service data is only a schematic illustration. In specific implementation, according to a specific application scenario, other types of business data besides the listed business data may also be introduced as the business data related to the target enterprise. The present specification is not limited to these.
In some embodiments, the server may receive a data processing request concerning the target enterprise from the terminal device, and obtain from the data processing request the business data related to the target enterprise provided by the terminal device.
The server may also actively collect business data associated with the target enterprise based on the identification information of the target enterprise (e.g., the name or enterprise number of the target enterprise). For example, the server may access a website of the target enterprise and collect announcements of the target enterprise posted on the website as one type of business data associated with the target enterprise. The server may also query the target enterprise's registration information at the regulatory agency as one type of business data associated with the target enterprise. The server may also obtain, from other entity objects (e.g., enterprises or natural persons) that cooperate with the target enterprise, an agreement, contract, or cooperation plan established by those entity objects with the target enterprise, as one type of business data associated with the target enterprise.
S202: constructing a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects.
In some embodiments, the knowledge-graph may be specifically understood as a data structure including a set of nodes and a set of edges between the nodes.
Further, the aforementioned target enterprise knowledge graph may be specifically understood as a data structure that includes a set of entity objects related to the target enterprise and a set of known relationships between the entity objects, which are determined based on business data related to the target enterprise. The entity object may correspond to a node in the target enterprise's knowledge graph, and the known relationship between the entity objects may correspond to a connecting edge between two nodes in the target enterprise's knowledge graph.
In some embodiments, the entity object may be specifically understood as an object directly or indirectly associated with the target enterprise, which is determined based on business data related to the target enterprise. Specifically, the entity object may include: enterprises (e.g., subsidiaries of the target enterprise, parent enterprises of the target enterprise, etc.), natural persons (e.g., a director of the target enterprise, a legal representative of the target enterprise, etc.), cases (e.g., risk events involving the target enterprise), domain names (e.g., the ICP of a website registered by the target enterprise, etc.), and the like. The risk event may specifically include an illegal action in which the target enterprise directly or indirectly participated. The ICP (Internet Content Provider) may be, for example, a web content provider. Of course, the various entity types listed above are merely illustrative. In specific implementation, the entity object may further include other entity types according to a specific application scenario and specific characteristics of the target enterprise. The present specification is not limited to these.
In some embodiments, the known relationships described above may be specifically understood as relationships between entity objects that have been determined based on business data associated with the target enterprise. Specifically, the relationships may include: legal representative relationships of an enterprise (e.g., natural person F is the legal representative of enterprise Q, etc.), subsidiary or parent relationships between enterprises (e.g., enterprise A is a subsidiary of enterprise U, enterprise V is the parent of enterprise U, etc.), friend relationships between payment accounts (e.g., natural person B's payment account and enterprise C's payment account are friends), and the like. Of course, the various relationship types listed above are merely illustrative. In specific implementation, other relationship types may also be included according to specific application scenarios and specific characteristics of the target enterprise, for example, a registered address relationship between enterprises (e.g., enterprise A and enterprise U have the same registered address). The present specification is not limited to these.
In some embodiments, the plurality of entity objects in the knowledge graph of the target enterprise further include at least a first target object and a second target object for which the existence of a preset relationship is to be predicted. Whether the preset relationship exists between the first target object and the second target object is subsequently determined based on the knowledge graph of the target enterprise.
In some embodiments, the first target object may be a natural person X, the second target object may be a target enterprise Y, and the preset relationship may be that natural person X is an enterprise beneficiary of target enterprise Y. Accordingly, the data processing goal of the server is to predict whether natural person X is an enterprise beneficiary of target enterprise Y, i.e., to determine whether the preset relationship described above exists between natural person X (which may be denoted as first target object X) and target enterprise Y (which may be denoted as second target object Y).
Of course, the first target object, the second target object and the preset relationship listed above are only a schematic illustration. In specific implementation, the first target object, the second target object, and the preset relationship may also be other types of entity objects and other types of relationships according to different application scenarios and processing requirements. For example, the first target object may also be an enterprise M, the second target object may also be a target enterprise Y, and the preset relationship may also be that enterprise M is a parent company of target enterprise Y. Accordingly, the data processing goal of the server is to predict whether enterprise M is a parent company of target enterprise Y, and so on.
In some embodiments, referring to fig. 3, the above-mentioned building a knowledge graph of a target enterprise according to the business data related to the target enterprise may be implemented by the following steps: determining an entity object related to the target enterprise, an identity of the entity object, an entity type of the entity object, a known relationship between the entity objects and a relationship type of the known relationship according to the business data related to the target enterprise; establishing a node according to the identity of the entity object; and establishing connecting edges among the nodes according to the entity types of the entity objects, the known relations among the entity objects and the relation types of the known relations so as to obtain the corresponding knowledge graph of the target enterprise.
The identity of the entity object may be specifically understood as identification information corresponding to the entity object and capable of distinguishing other entity objects. Specifically, the identity of the entity object may be an identity ID of the entity object, or a name of the entity object.
In some embodiments, when the target enterprise knowledge graph is specifically constructed, a node marked with an identity of an entity object may be first established in the knowledge graph as a node corresponding to the entity object. Meanwhile, the entity type of the entity object can be marked on the node.
In this manner, after the nodes corresponding to the entity objects are established in the graph, the two nodes corresponding to two entity objects with a known relationship can be found in the graph according to the known relationships between the entity objects, and a connecting edge corresponding to the relationship type of the known relationship is selected to connect the two nodes.
Alternatively, the two nodes corresponding to two entity objects with a known relationship are found in the graph according to the known relationships between the entity objects, and the two nodes are connected by a connecting edge; the connecting edge is then labeled according to the relationship type of the known relationship, so as to mark the relationship type corresponding to the connecting edge.
In this manner, after the connecting edges corresponding to the known relationships are connected, the knowledge graph of the target enterprise is obtained. The knowledge graph of the target enterprise comprises nodes of a plurality of entity objects related to the target enterprise, and the nodes of the entity objects with known relationships are connected through connecting edges corresponding to the relationship types.
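The construction steps above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiments: the function name `build_knowledge_graph`, the input layout (identity/entity-type pairs and source/relation/target triples), and the sample values are all hypothetical.

```python
from collections import defaultdict


def build_knowledge_graph(entities, relations):
    """Build a small in-memory graph (hypothetical data layout).

    entities:  iterable of (identity, entity_type) pairs
    relations: iterable of (source_id, relation_type, target_id) triples
    """
    # One node per entity object, keyed by identity and marked with its entity type.
    nodes = {identity: {"entity_type": entity_type} for identity, entity_type in entities}
    # Connecting edges, each marked with the relationship type of the known relationship.
    edges = defaultdict(list)
    for source, relation_type, target in relations:
        if source not in nodes or target not in nodes:
            raise KeyError(f"unknown entity in relation: {source!r} -> {target!r}")
        edges[source].append((relation_type, target))
    return nodes, dict(edges)


# Illustrative data only.
entities = [("A", "natural person"), ("B", "enterprise"), ("D", "enterprise")]
relations = [("A", "P1", "B"), ("B", "P2", "D")]
nodes, edges = build_knowledge_graph(entities, relations)
```

Storing the relationship type on each edge corresponds to the second variant described above (connect first, then mark the edge); a separate edge collection per relationship type would correspond to the first variant.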
S203: and determining a plurality of paths leading from the first target object to the second target object according to the knowledge graph of the target enterprise.
In some embodiments, the path may be a path obtained by combining and connecting edges between nodes by using a node corresponding to the first target object as a start node and a node corresponding to the second target object as an end node. Some of the paths may not have other nodes between the start node and the end node, and some of the paths may have other nodes between the start node and the end node.
In some embodiments, referring to fig. 3, in a specific implementation, according to the relationship type of the preset relationship, the server may traverse, in the knowledge graph of the target enterprise, the paths that lead from the start node to the end node and are connected through connecting edges corresponding to known relationships, to obtain multiple paths from the first target object to the second target object. For example, path 1, path 2, path 3 … …, path N.
In some embodiments, if the knowledge graph of the target enterprise is complex and the number of paths involved is large, a length threshold for the paths to be traversed may be set according to the accuracy requirement. The length of a path may be specifically understood as the number of nodes (including the start node and the end node) on the path. The greater the number of nodes on a path, the longer the path. Conversely, the smaller the number of nodes on a path, the shorter the path.
The server may then, according to the length threshold, traverse only paths whose length is less than or equal to the length threshold to find the plurality of paths pointing from the first target object to the second target object. Therefore, the data processing amount of the server can be reduced, and the overall processing efficiency is improved.
In some embodiments, specifically, for example, referring to fig. 4, by traversing the knowledge graph of the target enterprise D with node A (corresponding to the first target object: natural person A) as the start node and node D (corresponding to the second target object: target enterprise D) as the end node, the following two paths from the first target object to the second target object may be determined: path 1 and path 2.
Path 1 may be specifically represented as: A (P1) B (P2) C (P3) D. Node B and node C are intermediate nodes located between node A and node D on path 1. P1 denotes the relationship type corresponding to the connecting edge between node A and node B. P2 denotes the relationship type corresponding to the connecting edge between node B and node C. P3 denotes the relationship type corresponding to the connecting edge between node C and node D. The path length of path 1 is 4.
Similarly, path 2 may be specifically represented as: A (P2) E (P3) M (P4) D. Node E and node M are intermediate nodes located between node A and node D on path 2. P2 denotes the relationship type corresponding to the connecting edge between node A and node E. P3 denotes the relationship type corresponding to the connecting edge between node E and node M. P4 denotes the relationship type corresponding to the connecting edge between node M and node D. The path length of path 2 is 4.
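The traversal in S203, including the optional length threshold, can be sketched as a bounded depth-first search. The sketch below is an illustration under assumptions rather than the traversal prescribed by the embodiments: the function name `find_paths`, the adjacency-dictionary layout, and the restriction to simple paths (no repeated nodes) are hypothetical, and filtering by the relationship type of the preset relationship is omitted. Path length is counted in nodes, start and end included, as defined above.

```python
def find_paths(edges, start, end, max_nodes):
    """Enumerate simple paths from start to end containing at most max_nodes nodes.

    edges: dict mapping node -> list of (relation_type, neighbour) pairs
    Each path is returned as [node, relation, node, relation, ..., node].
    """
    paths = []

    def walk(node, path, visited):
        if node == end:
            paths.append(list(path))
            return
        # Stop extending once the node budget is used up.
        if len(visited) >= max_nodes:
            return
        for relation_type, neighbour in edges.get(node, []):
            if neighbour in visited:
                continue  # assumption: keep paths simple, no repeated nodes
            path.extend([relation_type, neighbour])
            visited.add(neighbour)
            walk(neighbour, path, visited)
            visited.discard(neighbour)
            del path[-2:]

    walk(start, [start], {start})
    return paths


# Edges mirroring path 1 and path 2 of the example above.
edges = {
    "A": [("P1", "B"), ("P2", "E")],
    "B": [("P2", "C")],
    "C": [("P3", "D")],
    "E": [("P3", "M")],
    "M": [("P4", "D")],
}
paths = find_paths(edges, "A", "D", max_nodes=4)
too_short = find_paths(edges, "A", "D", max_nodes=3)
```

With a threshold of 4 nodes both example paths are found; with a threshold of 3 neither is, because each contains 4 nodes.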
S204: and processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
In some embodiments, the preset processing model may be specifically understood as a data processing model trained in advance, and capable of predicting whether a preset relationship exists between a first target object and a second target object that do not originally have a preset relationship in a knowledge graph according to a plurality of paths determined based on the knowledge graph of the target enterprise. Meanwhile, the preset processing model can also output the weight values of the paths which are calculated and used when whether the preset relation exists between the first target object and the second target object is determined according to the paths.
Specifically, as shown in fig. 5, the preset processing model at least includes the following three-layer network structure: a feature layer, an LSTM layer, and an Attention layer. Wherein, the characteristic layer is connected with the LSTM layer, and the LSTM layer is connected with the Attention layer.
The feature layer may be a network layer responsible for feature extraction. The path features extracted by the feature layer are transmitted to the LSTM layer.
The LSTM (Long Short-Term Memory) layer may be specifically understood as a recurrent network layer that can learn long-term dependencies and can convert the received path features into a feature vector, corresponding to the path, that can be interpreted by the model. The feature vectors obtained via the LSTM layer are transmitted to the Attention layer.
The Attention layer (also referred to as an attention mechanism layer) may be specifically understood as a network layer that, through training and learning, can comprehensively analyze a plurality of paths according to their feature vectors to determine weight values of the paths, and can determine, according to the paths and the weight values of the paths, the decision probability that the preset relationship exists between the first target object and the second target object. The Attention layer enables the model network, through a fully connected layer or a convolution layer, to automatically learn and evaluate the weight values of different paths in an unsupervised manner.
In some embodiments, referring to fig. 3, in implementation, the server may call the preset processing model and input the plurality of paths into the preset processing model as model inputs. The preset processing model is then run. The preset processing model can process the input paths through the plurality of network layers, and finally output, as model output, the decision probability that the preset relationship exists between the first target object and the second target object and the weight values of the plurality of paths.
In some embodiments, in specific implementation, when the server invokes a preset processing model to process the plurality of paths, the server may first invoke a feature layer in the preset processing model to perform feature processing on the plurality of paths, respectively, so as to obtain a plurality of sets of path features corresponding to the plurality of paths, respectively.
Wherein the path characteristics include at least one of: the identity of the entity objects on the path, the entity type of the entity objects, the relationship type of known relationships between the entity objects, and the like. Of course, the above listed path features are only illustrative. In particular implementations, other types of features than those listed above may also be introduced as path features. For example, the strength of a known relationship, an entity feature of an entity object, and the like may also be introduced as a path feature. The present specification is not limited to these.
In some embodiments, in the above manner, the feature processing may be performed on the multiple paths respectively, so as to obtain multiple sets of path features. Each set of path features may correspond to a path.
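One possible way to turn a path into a set of path features is sketched below. The function name `path_features`, the record layout, and the choice to attach to each node the relationship type of the edge leading into it are assumptions made for illustration; the embodiments do not prescribe a particular encoding.

```python
def path_features(path, entity_types):
    """Turn [node, relation, node, ...] into one feature record per node.

    entity_types: dict mapping identity -> entity type (hypothetical lookup)
    """
    features = []
    for i in range(0, len(path), 2):
        identity = path[i]
        # Relationship type of the edge leading into this node; None for the start node.
        incoming = path[i - 1] if i > 0 else None
        features.append({
            "identity": identity,
            "entity_type": entity_types[identity],
            "incoming_relation": incoming,
        })
    return features


# Illustrative values only.
entity_types = {"A": "natural person", "B": "enterprise", "C": "enterprise", "D": "enterprise"}
path_1 = ["A", "P1", "B", "P2", "C", "P3", "D"]
features_1 = path_features(path_1, entity_types)
```

Each path yields one list of records, so a plurality of paths yields a plurality of sets of path features, one set per path.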
In some embodiments, in specific implementation, when the server invokes a preset processing model to process the plurality of paths, the server may further invoke an LSTM layer in the preset processing model to process the plurality of sets of path features respectively, so as to obtain a plurality of feature vectors corresponding to the plurality of sets of path features respectively.
The feature vector may be specifically understood as an implicit expression for a path based on path features; and the above-mentioned feature vector is the data that the model (or the Attention layer) can interpret and process. The feature vector corresponds to a path.
In some embodiments, in specific implementation, when the server calls a preset processing model to process the plurality of paths, the server may finally call an Attention layer in the preset processing model to calculate weight values of the plurality of paths according to the plurality of feature vectors; calculating a judgment probability that a preset relation exists between the first target object and the second target object according to the paths and the weight values of the paths by using the Attention layer; and outputting the decision probability and the weight values of the paths.
The weight value of the path may be used to represent a degree of dependence of the Attention layer on the path corresponding to the weight value when the plurality of paths are integrated to determine the decision probability.
Specifically, for example, if the weight value of a path is larger, when the Attention layer integrates a plurality of paths to determine the decision probability, the higher the dependency degree on the path is, and accordingly, the stronger the association between the path and the finally obtained decision probability is, the larger the interpretation effect of the path on the decision probability is. Conversely, if the weight value of a path is smaller, the degree of dependence on the path is lower when the Attention layer integrates a plurality of paths to determine the decision probability, and accordingly, the correlation between the path and the finally obtained decision probability is weaker, and the interpretation effect of the path on the decision probability is smaller.
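The relation between path weight values and the decision probability can be illustrated with one common form of attention pooling: a score per path vector, a softmax over the scores, a weighted sum, and a logistic output. This is a sketch under assumptions and is not stated in the source to be the formula used by the preset processing model; the function name `attention_pool` and the parameters `query`, `out_weights` and `out_bias` (which would be learned in training) are hypothetical.

```python
import math


def attention_pool(vectors, query, out_weights, out_bias):
    """Return (decision probability, path weights) for a list of path feature vectors."""
    # Score each path vector against a query vector (a single-output linear layer).
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    # Softmax turns the scores into path weight values that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the path vectors: paths with larger weights contribute more.
    dim = len(vectors[0])
    pooled = [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(dim)]
    # Logistic output maps the pooled vector to a probability in (0, 1).
    logit = sum(a * p for a, p in zip(out_weights, pooled)) + out_bias
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, weights


# Made-up vectors and parameters, for illustration only.
probability, weights = attention_pool(
    vectors=[[1.0, 0.0], [0.0, 1.0]],
    query=[2.0, 0.0],
    out_weights=[1.0, 1.0],
    out_bias=0.0,
)
```

Because the pooled vector is a weighted sum, a path with a larger weight value has more influence on the resulting probability, which is why the weight value can serve as a measure of the degree of dependence on that path.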
In some embodiments, the server may determine, according to a model output of a preset processing model, whether the first target object and the second target object have a preset relationship as a processing result, and determine a target path associated with the processing result as interpretation data for the processing result.
In some embodiments, referring to fig. 3, the server may compare the determination probability that the first target object and the second target object in the model output have the preset relationship with a preset probability threshold to obtain a corresponding comparison result. According to the comparison result, in the case that it is determined that the determination probability is greater than or equal to the preset probability threshold, it may be determined that the first target object and the second target object have the preset relationship as the corresponding processing result. The specific numerical value of the preset probability threshold can be flexibly set according to historical data and specific precision requirements.
On the contrary, according to the comparison result, in the case where it is determined that the above determination probability is smaller than the preset probability threshold, it may be determined that the first target object and the second target object do not have the preset relationship as the corresponding processing result.
In some embodiments, referring to fig. 3, while determining the processing result according to the model output, the server may further select, according to the weight values of the plurality of paths in the model output, one or more paths with the largest weight value from the plurality of paths as the target path associated with the processing result. The target path may then be determined as interpretation data for the processing result.
Thus, the server can determine whether the first target object and the second target object have the preset relationship as the processing result (or called the prediction result); and determining a target path with stronger relevance as interpretation data supporting the processing result.
It should be noted that the target path determined and used as the interpretation data in the above manner is based on the path dimension to interpret the processing result, and is relatively more reasonable, has higher accuracy and reference value, and is more suitable for data processing scenarios like link prediction.
In some embodiments, for example, as shown in fig. 6, path 1 (i.e., A (P1) B (P2) C (P3) D) and path 2 (i.e., A (P2) E (P3) M (P4) D) are input into the preset processing model as model inputs, and the two paths are specifically processed by using the preset processing model.
During specific processing, feature extraction can be performed on the two paths through a feature layer in a preset processing model, so that two groups of path features are obtained and recorded as: a first set of path features (corresponding to path 1) and a second set of path features (corresponding to path 2). The two sets of path features are then input to the LSTM layer, which is connected to the feature layer.
Then, the two sets of path features are processed through an LSTM layer in a preset processing model, so as to generate corresponding feature vectors that can be interpreted by the model according to the two sets of path features, and the feature vectors are respectively recorded as: the feature vector for path 1 (corresponding to path 1) and the feature vector for path 2 (corresponding to path 2). The two feature vectors are then input to the Attention layer, which is connected to the LSTM layer.
Further, the two feature vectors are processed through the Attention layer in the preset processing model, so that the weight value of the path corresponding to each feature vector is determined according to the two feature vectors. For example, the weight value of path 1 is determined to be 0.8 according to the feature vector of path 1, and the weight value of path 2 is determined to be 0.2 according to the feature vector of path 2. Further, the two paths may be used together according to their weight values, and the decision probability that a connecting edge (corresponding to the preset relationship, that is, the relationship in which natural person A is an enterprise beneficiary of target enterprise D) exists between the start node A (corresponding to natural person A) and the end node D (corresponding to target enterprise D) of the two paths is predicted to be 0.75.
Finally, referring to fig. 6, after the above processing, the preset processing model may output, as model output, the above decision probability of 0.75 and the weight values of the two paths (i.e., the weight value of 0.8 for path 1 and the weight value of 0.2 for path 2) calculated by the Attention layer in the process of determining the decision probability. The server may obtain, from the model output, the decision probability 0.75 that natural person A and target enterprise D have the preset relationship, as well as the weight value 0.8 of path 1 and the weight value 0.2 of path 2 that were determined and used by the preset processing model when determining the decision probability.
Further, the server may first compare the decision probability with a preset probability threshold (e.g., 0.6). According to the comparison result, the decision probability is determined to be greater than the preset probability threshold, so it is judged that the preset relationship exists between natural person A and target enterprise D, and the processing result is obtained: natural person A is the enterprise beneficiary of target enterprise D.
In addition, the server may numerically compare the weight value of path 1 with the weight value of path 2. According to the comparison result, when the weight value of the path 1 is determined to be greater than that of the path 2, it can be judged that the dependency degree on the path 1 is higher in the process that the preset processing model processes the two paths to obtain the judgment probability 0.75 (corresponding processing result: the natural person A is the enterprise beneficiary of the target enterprise D). Further, it can be determined that the path 1 has a higher degree of correlation with the processing result than the path 2. Therefore, the above path 1 can be determined as the interpretation data for the processing result.
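The worked example above (decision probability 0.75, probability threshold 0.6, weight values 0.8 and 0.2) can be reproduced with a few lines. The function name `interpret_output` and the dictionary layout of the path weights are hypothetical and used only for illustration.

```python
def interpret_output(probability, path_weights, probability_threshold):
    """path_weights: dict mapping a path label -> weight value from the model output."""
    # The preset relationship is judged to exist when the probability reaches the threshold.
    has_relationship = probability >= probability_threshold
    # The path the model relied on most serves as the interpretation data.
    target_path = max(path_weights, key=path_weights.get)
    return has_relationship, target_path


result = interpret_output(0.75, {"path 1": 0.8, "path 2": 0.2}, 0.6)
print(result)  # (True, 'path 1')
```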
The server may feed back the processing result, together with the interpretation data for the processing result, to the user through the terminal device. After the user determines that the natural person A is the enterprise beneficiary of the target enterprise D according to the processing result, the processing result can be verified by utilizing the interpretation data.
In addition, the user can further analyze and process the relationship between the entity objects in the knowledge graph of the target enterprise more specifically according to the processing result and the knowledge graph of the target enterprise in combination with the interpretation data, and detect the enterprise risk of the target enterprise according to the analysis and processing result so as to determine whether the target enterprise has enterprise risks such as money laundering risk, illegal trade risk, transaction credit risk and the like.
In this embodiment, a plurality of paths between a first target object and a second target object, for which the existence of a preset relationship is to be predicted, are determined according to a knowledge graph of the target enterprise constructed based on business data related to the target enterprise; the plurality of paths are processed by using a preset processing model, whether the preset relationship exists between the first target object and the second target object is determined according to the model output as a processing result, and a target path associated with the processing result is determined and used as interpretation data for the processing result. Therefore, while the processing result of whether the first target object and the second target object have the preset relationship is determined, more accurate and reasonable interpretation data with higher reference value can be obtained for the processing result.
In some embodiments, the building a target enterprise knowledge graph according to the business data related to the target enterprise may be implemented by: determining entity objects related to the target enterprise, identity marks of the entity objects, known relations among the entity objects and relation types of the known relations according to the business data related to the target enterprise; establishing a node according to the identity of the entity object; and establishing connecting edges between the nodes according to the known relationship between the entity objects and the relationship type of the known relationship so as to obtain the corresponding knowledge graph of the target enterprise.
In some embodiments, in specific implementation, the entity type of the entity object may be determined according to the business data related to the target enterprise, and the entity type of the entity object is marked on a node corresponding to the entity object in the graph. For example, a data tag is set on the node to characterize the entity type.
In some embodiments, the entity type may specifically include: enterprises, natural people, cases, domain names, etc.
In some embodiments, the relationship type may specifically include at least one of: corporate relationship of the enterprise, parent company relationship of the enterprise, co-registered address relationship, subsidiary account friend relationship, and the like.
Of course, the entity types and relationship types listed above are merely illustrative. In specific implementation, according to specific situations and processing requirements, other entity types and relationship types besides the listed entity types and relationship types may be included. The present specification is not limited to these.
In some embodiments, the determining a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise may include the following steps: according to the relationship type of the preset relationship, traversing paths which are connected through the known relationship and are from the starting node to the ending node in the knowledge graph of the target enterprise by taking the first target object as the starting node and the second target object as the ending node in the knowledge graph of the target enterprise.
In some embodiments, in order to reduce the amount of computation and improve the processing efficiency, some paths with longer path length and weaker relevance may be filtered in advance when finding the path. Specifically, a path length threshold may be set according to the precision requirement; and according to the relationship type of the preset relationship, when traversing and searching a path from the starting node to the ending node in the target enterprise knowledge graph, only traversing and searching a path with the path length less than or equal to the length threshold value of the path to obtain a plurality of paths meeting the requirements. Thereby effectively reducing the number of paths to be traversed.
In some embodiments, the preset process model may include at least: a feature layer, an LSTM layer, an Attention layer and other network structures.
In some embodiments, the Attention layer in the preset processing model can be replaced by a Pooling layer based on the Attention mechanism. The Pooling layer may be a maximum pooling layer (Max Pooling), a local mean pooling layer (Local Mean Pooling), a global mean pooling layer (Global Mean Pooling), and the like, according to different calculation methods.
In some embodiments, processing the plurality of paths by invoking a preset processing model to determine whether the first target object and the second target object have a preset relationship as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result, may include the following. The determined plurality of paths are input into the preset processing model as model input. The preset processing model is run, and the network layers in the preset processing model are called to process the plurality of paths, so as to obtain, as model output, the decision probability that the first target object and the second target object have the preset relationship, together with the weight values of the plurality of paths determined and used by the preset processing model in the process of determining the decision probability. Further, whether the preset relationship exists between the first target object and the second target object can be determined as the processing result according to the decision probability in the model output; meanwhile, according to the weight value of each path, the one or more paths with the strongest correlation with the processing result (or the decision probability) are screened out from the plurality of paths as target paths, and the target paths are used as interpretation data for the processing result.
In some embodiments, the processing the plurality of paths by invoking a preset processing model may include, in specific implementation: and calling a feature layer in the preset processing model to perform feature processing on the plurality of paths respectively so as to obtain a plurality of groups of path features corresponding to the plurality of paths respectively. Wherein the path characteristics include at least one of: the identity of the entity objects on the path, the entity type of the entity objects, the relationship type of known relationships between the entity objects, and the like. Wherein each set of path features corresponds to a path.
The above listed path features are only illustrative. In particular, other types of path characteristics may be introduced, as the case may be. E.g., physical characteristics of the physical object, etc.
In some embodiments, the entity characteristics of the entity object on the path may be used to replace the identity of the entity object on the path as a path characteristic to be input to a subsequent LSTM layer for processing.
In some embodiments, in order to make the length of the data input to the LSTM layer through the feature layer consistent, before invoking the feature layer in the preset processing model to perform feature processing on the plurality of paths, the method may further include the following step: detecting whether the length of each of the plurality of paths is equal to a length threshold matching the LSTM layer. When the length of a path is determined to be greater than the length threshold, the path may first be truncated; when the length of a path is determined to be less than the length threshold, filling processing (e.g., padding) is performed on the path, so that the lengths of the plurality of paths match the subsequent LSTM layer.
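The truncation and padding step can be sketched as follows. The function name `fit_length`, the padding token `"<pad>"`, and the choice to truncate from the tail are assumptions; the source does not say which end of an over-long path is cut, and cutting the tail as done here would drop the end node.

```python
PAD = "<pad>"  # hypothetical padding token


def fit_length(nodes_on_path, target_len, pad=PAD):
    """Return a copy of the path with exactly target_len entries."""
    if len(nodes_on_path) > target_len:
        # Truncate: keep the first target_len entries (assumption: cut the tail).
        return list(nodes_on_path[:target_len])
    # Pad: append padding tokens until the target length is reached.
    return list(nodes_on_path) + [pad] * (target_len - len(nodes_on_path))


truncated = fit_length(["A", "B", "C", "D", "E"], 4)
padded = fit_length(["A", "B"], 4)
unchanged = fit_length(["A", "B", "C", "D"], 4)
```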
In some embodiments, the processing the plurality of paths by invoking a preset processing model may further include, in specific implementation: and calling an LSTM layer in the preset processing model to respectively process the multiple groups of path features so as to obtain multiple feature vectors respectively corresponding to the multiple groups of path features. The feature vectors correspond to a path, and the feature vectors can be specifically understood as data that can be interpreted and processed by the Attention layer.
In some embodiments, the processing the plurality of paths by invoking a preset processing model may further include, in specific implementation: calling an Attention layer in the preset processing model to calculate the weighted values of a plurality of paths according to the plurality of feature vectors; calculating a judgment probability that a preset relation exists between the first target object and the second target object according to the paths and the weight values of the paths by using the Attention layer; and outputting the judgment probability and the weighted values of the paths as a model output of a preset processing model.
In some embodiments, in implementation, after outputting the decision probability and the weight values of the multiple paths, the method may further include: comparing the judgment probability with a preset probability threshold value to obtain a comparison result; and determining whether a preset relation exists between the first target object and the second target object according to the comparison result to obtain a corresponding processing result.
If it is determined that the decision probability is greater than or equal to the preset probability threshold according to the comparison result, it may be determined that a preset relationship exists between the first target object and the second target object. If the judgment probability is determined to be smaller than the preset probability threshold according to the comparison result, it can be determined that the preset relationship does not exist between the first target object and the second target object.
In some embodiments, in implementation, after outputting the decision probability and the weight values of the multiple paths, the method may further include: according to the weight values of the paths, one or more paths with the maximum weight values are screened out from the paths to serve as target paths related to the processing results; determining the target path as interpretation data for the processing result.
Specifically, one or more paths with the largest weight values may be selected from the multiple paths according to the weight values of the multiple paths, and the selected paths are used as target paths. Alternatively, paths whose weight values are greater than or equal to a preset weight threshold may be screened out from the multiple paths according to their weight values and used as target paths.
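The two screening strategies just described, largest weight values versus a preset weight threshold, can be sketched side by side. The function names `top_k_paths` and `paths_above`, the parameter `k`, and the sample weight values are hypothetical.

```python
def top_k_paths(path_weights, k=1):
    """Select the k paths with the largest weight values."""
    ranked = sorted(path_weights.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


def paths_above(path_weights, weight_threshold):
    """Select every path whose weight value reaches the preset weight threshold."""
    return [label for label, weight in path_weights.items() if weight >= weight_threshold]


# Illustrative weight values only.
path_weights = {"path 1": 0.5, "path 2": 0.3, "path 3": 0.2}
by_rank = top_k_paths(path_weights, k=2)
by_threshold = paths_above(path_weights, 0.3)
```

The first strategy always returns a fixed number of target paths, while the second may return none or many depending on how the weight values are distributed.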
In some embodiments, in implementation, the target path with a higher correlation with the processing result may be determined as interpretation data for supporting the processing result. Of course, the interpretation data for interpreting the processing result may also be generated based on the target path in combination with the business data related to the target enterprise.
In some embodiments, after determining whether the first target object and the second target object have a preset relationship as a processing result and a target path associated with the processing result as interpretation data for the processing result, the method further includes: and generating a data processing report about the target enterprise according to the interpretation data and the processing result. The data processing report of the target enterprise can be specifically used for characterizing the operation condition of the target enterprise, and/or the risk condition of the target enterprise, and/or the internal relationship of the target enterprise, and the like. Furthermore, the user can more comprehensively and accurately analyze the relevant conditions of the target enterprise according to the data processing report of the target enterprise.
In some embodiments, after obtaining the interpretation data, the user may use the interpretation data to determine the basis on which the preset processing model arrived at the processing result. The user can thereby understand why the preset processing model gives the processing result and understand the judgment logic inside the preset processing model, which avoids using the preset processing model as a black box, so that the preset processing model can be better used for link prediction.
In some embodiments, after obtaining the interpretation data, the user may further use the interpretation data as a reference, and perform data processing in a related scenario more comprehensively and accurately according to the corresponding processing result. For example, in a manual auditing scenario, whether a target enterprise has a corresponding enterprise risk may be comprehensively determined according to the interpretation data and the processing result. For another example, in a customer appeal scenario, whether the target of the customer appeal needs to bear the corresponding responsibility may be comprehensively determined according to the interpretation data and the processing result.
In some embodiments, the preset relationship between the first target object and the second target object to be predicted to exist may specifically include: the first target object is an enterprise beneficiary of the second target object. Of course, the above-listed predetermined relationship is only an illustrative one. In specific implementation, the preset relationship may further include relationships of other relationship types according to specific application scenarios and processing requirements. For example, the preset relationship may be a parent company of which the first target object is the second target object, a creditor of which the first target object is the second target object, or the like.
In some embodiments, before implementation, model training may be performed according to sample data to establish a preset processing model.
In some embodiments, the preset processing model may be specifically established as follows.
S1: acquiring sample data, and determining a type label of the sample data according to the sample type of the sample data; wherein the sample types include positive samples and negative samples.
S2: constructing a corresponding sample knowledge graph according to the sample data, and determining a plurality of sample paths from the sample knowledge graph.
S3: training an initial processing model according to the type label of the sample data and the plurality of sample paths to obtain a preset processing model meeting requirements; wherein the initial processing model comprises at least an initial LSTM layer and an initial Attention layer.
In some embodiments, in specific implementation, corresponding positive samples and negative samples may be obtained as sample data according to a preset relationship for a preset processing model to be trained.
Specifically, for example, suppose the preset relationship targeted by the preset processing model to be trained is that the first target object is an enterprise beneficiary of the second target object.
In this case, business data related to a target enterprise B in which, for example, natural person A is not an enterprise beneficiary of target enterprise B may be collected in a targeted manner as negative sample data, and a type label indicating a negative sample may be set on that sample data. Business data related to a target enterprise D in which, for example, enterprise C is an enterprise beneficiary of target enterprise D may be collected as positive sample data, and a type label indicating a positive sample may be set on that sample data.
In some embodiments, in specific implementation, a corresponding knowledge graph of an enterprise may be constructed according to the sample data, and the constructed knowledge graph is used as a sample knowledge graph. And determining a plurality of sample paths according to the sample knowledge graph.
In some embodiments, when determining the sample paths, it is taken into account that a sample knowledge graph constructed from negative sample data may contain no directly discoverable path. For example, the sample knowledge graph may contain no path that leads from a start node C (corresponding to enterprise C) to an end node D (corresponding to target enterprise D) through a set of edges corresponding to known relationships. In this case, the start node and the end node may be fixed, and a path from the start node to the end node may then be found by hopping (e.g., N-degree hops) and used as a sample path.
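The hop-based search with a fixed start node and end node can be illustrated with a minimal sketch. The specification does not define how an N-degree hop is carried out, so the bounded depth-first search below is only one possible reading. The triple layout `(head, relation_type, tail)`, the function name `find_paths_within_hops`, the relation names, and the choice to traverse edges in both directions are all assumptions made for illustration.

```python
from collections import defaultdict


def find_paths_within_hops(edges, start, end, max_hops):
    """Enumerate simple paths from start to end that use at most max_hops edges.

    edges: iterable of (head, relation_type, tail) triples (hypothetical layout).
    Edges are walked in both directions, which is an assumption of this sketch.
    """
    adjacency = defaultdict(list)
    for head, relation, tail in edges:
        adjacency[head].append((relation, tail))
        adjacency[tail].append((relation + "^-1", head))  # reverse traversal

    paths = []

    def extend(node, visited, path):
        if node == end:
            paths.append(list(path))  # record a copy of the finished path
            return
        if len(path) >= max_hops:
            return  # hop budget exhausted
        for relation, neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                path.append((node, relation, neighbor))
                extend(neighbor, visited, path)
                path.pop()
                visited.remove(neighbor)

    extend(start, {start}, [])
    return paths


# Toy sample graph; all entities and relation names are made up.
sample_edges = [
    ("C", "invests_in", "E"),
    ("E", "invests_in", "D"),
    ("F", "legal_representative_of", "D"),
    ("C", "employs", "F"),
    ("C", "invests_in", "G"),
]
two_hop_paths = find_paths_within_hops(sample_edges, "C", "D", max_hops=2)
```

Raising `max_hops` widens the search at the cost of more candidate paths, so in practice the hop limit would likely be tuned to the size of the sample knowledge graph.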
In some embodiments, the initial processing model may specifically include initial network layers such as an initial feature layer, an initial LSTM layer, and an initial Attention layer.
In some embodiments, training the initial processing model according to the type label of the sample data and the plurality of sample paths may include the following: inputting the plurality of sample paths corresponding to one piece of sample data into the initial processing model to obtain a corresponding probability value and the weight values of the plurality of sample paths; determining a processing result according to the probability value; calculating a loss function according to the type label of the sample data, the processing result, and the weight values of the plurality of sample paths; and, according to the loss function, adjusting the network parameters of the initial processing model in a targeted manner (for example, the network parameters of the feature layer, the LSTM layer, and the Attention layer) until the value of the loss function is smaller than a preset error value.
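The stopping rule described above (keep adjusting parameters until the loss falls below a preset error value) can be sketched independently of the network itself. The example below is a deliberately simplified stand-in: it replaces the feature, LSTM, and Attention layers with a two-parameter logistic model over a single hypothetical path score, and it assumes a binary cross-entropy loss, which the specification does not name. It does not reproduce how the weight values enter the loss.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_bce(params, samples):
    """Mean binary cross-entropy of a logistic model over (score, label) pairs."""
    w, b = params
    total = 0.0
    for x, y in samples:
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(samples)


def train_until_below(samples, preset_error, learning_rate=0.5, max_steps=10000):
    """Gradient descent that stops once the loss is below preset_error."""
    w, b = 0.0, 0.0
    for _ in range(max_steps):
        if mean_bce((w, b), samples) < preset_error:
            break
        n = len(samples)
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in samples) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in samples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return (w, b), mean_bce((w, b), samples)


# Hypothetical scalar scores with type labels: 1 = positive sample, 0 = negative.
toy_samples = [(2.0, 1), (1.5, 1), (-1.0, 0), (-2.0, 0)]
trained_params, final_loss = train_until_below(toy_samples, preset_error=0.1)
```

The `max_steps` cap is an added safeguard so that the loop terminates even if the preset error value is never reached.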
According to the method, the initial processing model can be trained and adjusted for multiple times by utilizing a plurality of sample paths corresponding to a plurality of sample data, so that the preset processing model with the accuracy meeting the requirement is obtained.
As can be seen from the above, in the data processing method provided in the embodiments of the present specification, a plurality of paths between a first target object and a second target object, for which the existence of a preset relationship is to be predicted, are determined according to a knowledge graph of the target enterprise constructed from business data related to the target enterprise; the plurality of paths are then processed by a preset processing model, whether the preset relationship exists between the first target object and the second target object is determined from the model output as a processing result, and a target path associated with the processing result is determined and used as interpretation data for the processing result. Therefore, while the processing result of whether the first target object and the second target object have the preset relationship is determined, interpretation data about the processing result that is more accurate and reasonable and has higher reference value can also be obtained.
Referring to fig. 7, an embodiment of the present specification further provides a data processing method. When the method is implemented, the following contents may be included.
S701: acquiring target service data.
S702: constructing a target knowledge graph according to the target service data; wherein the target knowledge-graph comprises a plurality of data objects including at least a first target object and a second target object, and a plurality of known relationships between the data objects.
S703: determining a plurality of paths leading from the first target object to the second target object according to the target knowledge graph.
S704: processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
In some embodiments, the target service data may specifically be service data that covers at least a plurality of data objects (such as the first target object and the second target object), the features of those data objects, and the relationships among the data objects.
In some embodiments, the data object may specifically include at least one of: entity objects, account objects, merchandise objects, and so forth. Of course, the above listed data objects are only illustrative. In specific implementation, the method can also be applied to other types of data objects according to specific application scenarios and processing requirements. The present specification is not limited to these.
By the method, the processing result of whether the first target object and the second target object in the data object have the preset relationship can be determined efficiently, and meanwhile, more accurate and reliable interpretation data for supporting the processing result can be obtained.
Referring to fig. 8, an embodiment of the present disclosure further provides a data processing method. When the method is implemented, the following contents may be included.
S801: acquiring sample data, and determining a type label of the sample data according to the sample type of the sample data; wherein the sample types include positive samples and negative samples;
S802: constructing a corresponding sample knowledge graph according to the sample data, and determining a plurality of sample paths from the sample knowledge graph;
S803: training an initial processing model according to the type label of the sample data and the plurality of sample paths to obtain a preset processing model meeting requirements; wherein the initial processing model comprises at least an initial LSTM layer and an initial Attention layer; the preset processing model is used for determining, according to a knowledge graph, the decision probability that a first target object and a second target object in the knowledge graph have a preset relationship, and the weight values of a plurality of paths leading from the first target object to the second target object in the knowledge graph.
By the method, a processing model can be obtained through training that outputs both a probability value for determining whether a preset relationship exists between the first target object and the second target object and the weight values of the multiple paths used in determining that probability value. This helps the user determine, according to the probability value, the processing result of whether the preset relationship exists between the first target object and the second target object, and, according to the weight values of the multiple paths, screen out from the multiple paths a target path highly relevant to the processing result to serve as interpretation data supporting the processing result.
Embodiments of the present specification further provide a server, including a processor and a memory for storing processor-executable instructions, where, in specific implementation, the processor may perform the following steps according to the instructions: acquiring business data related to a target enterprise; constructing a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects; determining a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise; and processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
In order to execute the above instructions more accurately, referring to fig. 9, the embodiments of the present specification provide another specific server, where the server includes a network communication port 901, a processor 902, and a memory 903, and these structures are connected by internal cables so that they can exchange data with one another.
The network communication port 901 may be specifically configured to obtain business data related to a target enterprise.
The processor 902 may be specifically configured to construct a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects; determine a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise; and process the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determine a target path associated with the processing result as interpretation data for the processing result.
The memory 903 may be specifically configured to store a corresponding instruction program.
In this embodiment, the network communication port 901 may be a virtual port that is bound to different communication protocols, so that different data can be sent or received. For example, the network communication port may be a port responsible for web data communication, a port responsible for FTP data communication, or a port responsible for mail data communication. In addition, the network communication port may also be a physical communication interface or communication chip. For example, it may be a wireless mobile network communication chip, such as GSM, CDMA, etc.; it may also be a Wi-Fi chip; it may also be a Bluetooth chip.
In this embodiment, the processor 902 may be implemented in any suitable manner. For example, the processor may take the form of, for example, a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The description is not intended to be limiting.
In this embodiment, the term memory as used for the memory 903 may be understood at several levels: in a digital system, anything that can store binary data may be a memory; in an integrated circuit, a circuit that has a storage function but no physical form is also called a memory, such as a RAM or a FIFO; and in a system, a storage device in physical form is also called a memory, such as a memory module or a TF card.
The present specification further provides a computer storage medium based on the above data processing method, where the computer storage medium stores computer program instructions that, when executed, implement: acquiring business data related to a target enterprise; constructing a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects; determining a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise; and processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
In this embodiment, the storage medium includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk Drive (HDD), or a Memory Card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects specifically realized by the program instructions stored in the computer storage medium can be explained by comparing with other embodiments, and are not described herein again.
Referring to fig. 10, in a software level, the embodiment of the present specification further provides a data processing apparatus, which may specifically include the following structural modules.
The obtaining module 1001 may be specifically configured to obtain business data related to a target enterprise.
The constructing module 1002 may be specifically configured to construct a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects.
The first determining module 1003 may be specifically configured to determine, according to the knowledge graph of the target enterprise, a plurality of paths from the first target object to the second target object.
The second determining module 1004 may be specifically configured to process the multiple paths by invoking a preset processing model, so as to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determine a target path associated with the processing result as interpretation data for the processing result.
In some embodiments, the preset process model may include at least: a feature layer, an LSTM layer, an Attention layer, and the like.
In some embodiments, the second determining module 1004 may be specifically configured to invoke a feature layer in the preset processing model to perform feature processing on the multiple paths respectively, so as to obtain multiple sets of path features corresponding to the multiple paths respectively; wherein the path characteristics include at least one of: the identity of the entity objects on the path, the entity type of the entity objects, the relationship type of known relationships between the entity objects, and the like.
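As an illustration of the feature processing described above, the following minimal sketch turns one path into a sequence of index tuples covering the three listed path features (identity, entity type, relationship type). The specification does not say how the feature layer encodes these values, so the integer vocabularies, the alternating path layout, and the names `build_vocab` and `encode_path` are assumptions.

```python
def build_vocab(tokens):
    """Map each distinct token to a positive integer; 0 is reserved for unknowns."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab) + 1
    return vocab


def encode_path(path, entity_types, id_vocab, type_vocab, relation_vocab):
    """Encode one path as a list of per-step feature tuples.

    path: alternating [entity, relation, entity, ..., entity] (hypothetical layout).
    Each tuple is (identity index, entity type index, outgoing relation index);
    the final entity has no outgoing relation, so its relation index is 0.
    """
    entities = path[0::2]
    relations = path[1::2] + [None]
    features = []
    for entity, relation in zip(entities, relations):
        features.append((
            id_vocab.get(entity, 0),
            type_vocab.get(entity_types.get(entity), 0),
            relation_vocab.get(relation, 0),
        ))
    return features


# Toy data; all identifiers and relation names are made up for illustration.
entity_types = {"A": "person", "E": "enterprise", "B": "enterprise"}
id_vocab = build_vocab(["A", "E", "B"])
type_vocab = build_vocab(["person", "enterprise"])
relation_vocab = build_vocab(["shareholder_of", "controls"])

encoded = encode_path(
    ["A", "shareholder_of", "E", "controls", "B"],
    entity_types, id_vocab, type_vocab, relation_vocab,
)
```

In a real model, index sequences of this kind would typically be mapped to embeddings before being passed to a sequence layer such as an LSTM.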
In some embodiments, the second determining module 1004 may be further configured to call an LSTM layer in the preset processing model to respectively process the multiple sets of path features, so as to obtain multiple feature vectors respectively corresponding to the multiple sets of path features.
In some embodiments, the second determining module 1004 may be further configured to call an Attention layer in the preset processing model to calculate weight values of a plurality of paths according to the plurality of feature vectors; calculating a judgment probability that a preset relation exists between the first target object and the second target object according to the paths and the weight values of the paths by using the Attention layer; and outputting the decision probability and the weight values of the paths.
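A minimal sketch of how per-path weight values and a decision probability could be derived from the path feature vectors is given below. The specification does not give the formulas used by the Attention layer, so the dot-product scoring against a `query` vector, the softmax normalization, and the sigmoid output are assumptions, and the parameter values are arbitrary rather than trained.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(path_vectors, query, output_weights, bias=0.0):
    """Return (decision probability, per-path weight values).

    path_vectors: one feature vector per path (e.g. produced by a sequence layer).
    query, output_weights, bias: hypothetical learned parameters.
    """
    # Score each path against the query, then normalize the scores into weights.
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in path_vectors]
    weights = softmax(scores)
    # Weighted sum of the path vectors, dimension by dimension.
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, path_vectors))
        for i in range(len(query))
    ]
    logit = sum(o * p for o, p in zip(output_weights, pooled)) + bias
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, weights


# Three toy path vectors; the third aligns best with the query.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
probability, weights = attention_pool(
    toy_vectors, query=[1.0, 0.0], output_weights=[0.5, -0.5]
)
```

Because the weights sum to one, each weight can be read as the relative contribution of its path to the decision probability, which is what makes them usable for interpretation.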
In some embodiments, the second determining module 1004 may be further configured to, according to the weight values of the multiple paths, screen one or more paths with a largest weight value from the multiple paths as target paths associated with the processing result; and using the target path as the interpretation data for the processing result.
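The screening step can be sketched as a simple ranking by weight value. The function name `select_target_paths`, the `top_k` parameter, and the string representation of paths are illustrative assumptions.

```python
def select_target_paths(paths, weights, top_k=1):
    """Return the top_k paths with the largest weight values, largest first."""
    ranked = sorted(zip(weights, paths), key=lambda pair: pair[0], reverse=True)
    return [path for _, path in ranked[:top_k]]


# Hypothetical paths and weight values.
candidate_paths = ["A->E->B", "A->F->B", "A->G->H->B"]
candidate_weights = [0.2, 0.7, 0.1]
target_paths = select_target_paths(candidate_paths, candidate_weights, top_k=2)
```

Sorting on the weight alone keeps the function usable with any path representation, including ones that do not support comparison.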
In some embodiments, when implemented specifically, the building module 1002 may be configured to determine, according to the business data related to the target enterprise, an entity object related to the target enterprise, an identity of the entity object, a known relationship between the entity objects, and a relationship type of the known relationship; establishing a node according to the identity of the entity object; and establishing connecting edges between the nodes according to the known relationship between the entity objects and the relationship type of the known relationship so as to obtain the corresponding knowledge graph of the target enterprise.
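The node and edge construction described above can be sketched with plain dictionaries. The record layout `(head_id, relation_type, tail_id)`, the function name `build_knowledge_graph`, and the relation names are assumptions, and a production system would more likely use a graph database or graph library.

```python
from collections import defaultdict


def build_knowledge_graph(relationships):
    """Build nodes keyed by identity and typed edges from relationship records.

    relationships: iterable of (head_id, relation_type, tail_id) triples
    (hypothetical layout extracted from the business data).
    """
    nodes = set()
    edges = defaultdict(list)
    for head_id, relation_type, tail_id in relationships:
        nodes.add(head_id)  # one node per entity identity
        nodes.add(tail_id)
        edges[head_id].append((relation_type, tail_id))  # typed connecting edge
    return nodes, dict(edges)


# Toy relationship records; identifiers are made up for illustration.
records = [
    ("A", "shareholder_of", "E"),
    ("E", "controls", "B"),
    ("A", "director_of", "B"),
]
graph_nodes, graph_edges = build_knowledge_graph(records)
```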
In some embodiments, the apparatus may further include a training module, which may be configured to establish the preset processing model in the following manner: acquiring sample data, and determining a type label of the sample data according to the sample type of the sample data; wherein the sample types include positive samples and negative samples; constructing a corresponding sample knowledge graph according to the sample data, and determining a plurality of sample paths from the sample knowledge graph; training an initial processing model according to the type label of the sample data and the plurality of sample paths to obtain a preset processing model meeting requirements; wherein the initial processing model comprises at least an initial LSTM layer and an initial Attention layer.
It should be noted that, the units, devices, modules, etc. illustrated in the above embodiments may be implemented by a computer chip or an entity, or implemented by a product with certain functions. For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. It is to be understood that, in implementing the present specification, functions of each module may be implemented in one or more pieces of software and/or hardware, or a module that implements the same function may be implemented by a combination of a plurality of sub-modules or sub-units, or the like. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
As can be seen from the above, the data processing apparatus provided in the embodiment of the present specification can enable a user to obtain more accurate and reasonable interpretation data about a processing result with a higher reference value while obtaining the processing result indicating whether the first target object and the second target object have the preset relationship.
Although the present specification provides method steps as described in the examples or flowcharts, additional or fewer steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. When an apparatus or client product in practice executes, it may execute sequentially or in parallel (e.g., in a parallel processor or multithreaded processing environment, or even in a distributed data processing environment) according to the embodiments or methods shown in the figures. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises the recited elements is not excluded. The terms first, second, etc. are used to denote names, but not any particular order.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be considered as a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present specification can be implemented by software plus necessary general hardware platform. With this understanding, the technical solutions in the present specification may be essentially embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a mobile terminal, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments in the present specification.
The embodiments in the present specification are described in a progressive manner, and the same or similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. The description is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
While the specification has been described with examples, those skilled in the art will appreciate that there are numerous variations and permutations of the specification that do not depart from the spirit of the specification, and it is intended that the appended claims include such variations and modifications that do not depart from the spirit of the specification.

Claims (20)

1. A method of data processing, comprising:
acquiring business data related to a target enterprise;
constructing a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects;
determining a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise;
and processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
2. The method of claim 1, wherein constructing a target enterprise's knowledge graph from the business data associated with the target enterprise comprises:
determining entity objects related to the target enterprise, identity marks of the entity objects, known relations among the entity objects and relation types of the known relations according to the business data related to the target enterprise;
establishing a node according to the identity of the entity object;
and establishing connecting edges between the nodes according to the known relationship between the entity objects and the relationship type of the known relationship so as to obtain the corresponding knowledge graph of the target enterprise.
3. The method of claim 1, the preset processing model comprising at least: a feature layer, an LSTM layer, and an Attention layer.
4. The method of claim 3, the processing the plurality of paths by invoking a preset processing model, comprising: calling a feature layer in the preset processing model to perform feature processing on the plurality of paths respectively so as to obtain a plurality of groups of path features corresponding to the plurality of paths respectively; wherein the path characteristics include at least one of: the identity of the entity object on the path, the entity type of the entity object, and the relationship type of the known relationship between the entity objects.
5. The method of claim 4, the processing the plurality of paths by invoking a preset processing model, further comprising: and calling an LSTM layer in the preset processing model to respectively process the multiple groups of path features so as to obtain multiple feature vectors respectively corresponding to the multiple groups of path features.
6. The method of claim 5, the processing the plurality of paths by invoking a preset processing model, further comprising: calling an Attention layer in the preset processing model to calculate the weight values of the plurality of paths according to the plurality of feature vectors; calculating, by using the Attention layer, a decision probability that a preset relationship exists between the first target object and the second target object according to the plurality of paths and the weight values of the plurality of paths; and outputting the decision probability and the weight values of the plurality of paths.
7. The method of claim 6, after outputting the decision probability, and the weight values for the plurality of paths, the method further comprising:
according to the weight values of the paths, one or more paths with the maximum weight values are screened out from the paths to serve as target paths related to the processing results;
determining the target path as interpretation data for the processing result.
8. The method according to claim 1, after determining whether the first target object and the second target object have a preset relationship as a processing result, and a target path associated with the processing result as interpretation data for the processing result, the method further comprising:
and generating a data processing report about the target enterprise according to the interpretation data and the processing result.
9. The method of claim 1, the preset relationship comprising: the first target object is an enterprise beneficiary of the second target object.
10. The method of claim 1, wherein the pre-set processing model is established as follows:
acquiring sample data, and determining a type label of the sample data according to the sample type of the sample data; wherein the sample types include positive samples and negative samples;
constructing a corresponding sample knowledge graph according to the sample data, and determining a plurality of sample paths from the sample knowledge graph;
training an initial processing model according to the type label of the sample data and the plurality of sample paths to obtain a preset processing model meeting requirements; wherein the initial processing model includes at least an initial LSTM layer and an initial Attention layer.
11. A method of data processing, comprising:
acquiring target service data;
constructing a target knowledge graph according to the target service data; wherein the target knowledge-graph comprises a plurality of data objects, including at least a first target object and a second target object, and a plurality of known relationships between the data objects;
determining a plurality of paths from the first target object to the second target object according to the target knowledge graph;
and processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and determining a target path associated with the processing result as interpretation data for the processing result.
12. The method of claim 11, wherein each data object comprises at least one of: an entity object, an account object, a commodity object.
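The graph construction and path determination steps of claim 11 can be sketched as follows. This is only an illustration under stated assumptions: known relationships are treated as directed (head, relation, tail) triples, paths are restricted to simple paths, and a hop limit `max_hops` bounds the search. The names `build_graph`, `find_paths`, and `max_hops` are hypothetical; the claims do not prescribe a search algorithm.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # assumed form: (head object, relation, tail object)


def build_graph(triples: Iterable[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Build an adjacency list mapping each head to its (relation, tail) edges."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def find_paths(
    graph: Dict[str, List[Tuple[str, str]]],
    source: str,
    target: str,
    max_hops: int = 3,
) -> List[List[Triple]]:
    """Enumerate simple paths from source to target with at most max_hops edges."""
    paths: List[List[Triple]] = []

    def dfs(node: str, visited: set, trail: List[Triple]) -> None:
        if node == target and trail:
            paths.append(list(trail))  # copy the current trail as a finished path
            return
        if len(trail) >= max_hops:
            return
        for relation, nxt in graph.get(node, []):
            if nxt in visited:
                continue  # keep paths simple: never revisit an object
            visited.add(nxt)
            trail.append((node, relation, nxt))
            dfs(nxt, visited, trail)
            trail.pop()
            visited.remove(nxt)

    dfs(source, {source}, [])
    return paths
```

A hop limit is a practical choice here because the number of simple paths can grow very quickly with graph size.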
13. A method of data processing, comprising:
acquiring sample data, and determining a type label of the sample data according to the sample type of the sample data; wherein the sample types include positive samples and negative samples;
constructing a corresponding sample knowledge graph according to the sample data, and determining a plurality of sample paths from the sample knowledge graph;
training an initial processing model according to the type label of the sample data and the plurality of sample paths to obtain a preset processing model meeting requirements; wherein the initial processing model comprises at least an initial LSTM layer and an initial Attention layer; and the preset processing model is used for determining, according to a knowledge graph, the decision probability that a preset relationship exists between a first target object and a second target object in the knowledge graph, and the weight values of a plurality of paths from the first target object to the second target object in the knowledge graph.
14. A data processing apparatus comprising:
the acquisition module is used for acquiring business data related to the target enterprise;
the construction module is used for constructing a knowledge graph of the target enterprise according to the business data related to the target enterprise; wherein the knowledge graph of the target enterprise includes a plurality of entity objects related to the target enterprise, including at least a first target object and a second target object, and known relationships between the entity objects;
the first determining module is used for determining a plurality of paths from the first target object to the second target object according to the knowledge graph of the target enterprise;
the second determining module is used for processing the plurality of paths by calling a preset processing model to determine whether a preset relationship exists between the first target object and the second target object as a processing result, and to determine a target path associated with the processing result as interpretation data for the processing result.
15. The apparatus of claim 14, wherein the preset processing model comprises at least: a feature layer, an LSTM layer, and an Attention layer.
16. The apparatus according to claim 15, wherein the second determining module is specifically configured to invoke the feature layer in the preset processing model to perform feature processing on the plurality of paths respectively, so as to obtain a plurality of sets of path features corresponding to the plurality of paths respectively; wherein the path features include at least one of: the identity of the entity object on the path, the entity type of the entity object, and the relationship type of the known relationship between the entity objects.
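One way to picture the feature processing of claim 16 is to map the three listed path features to integer indices that a later embedding or LSTM layer could consume. The sketch below assumes each step on a path is a tuple of (entity identity, entity type, relation type to the next entity); the `Vocabulary` class and `featurize_path` function are hypothetical helpers, not part of the claimed apparatus.

```python
from typing import Dict, List, Tuple

# Assumed step layout: (entity identity, entity type, relation type to next entity).
Step = Tuple[str, str, str]


class Vocabulary:
    """Assigns a stable integer id to each distinct token; 0 is reserved."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {"<unk>": 0}

    def encode(self, token: str) -> int:
        # New tokens receive the next free id; known tokens keep theirs.
        return self._ids.setdefault(token, len(self._ids))


def featurize_path(
    path: List[Step],
    id_vocab: Vocabulary,
    type_vocab: Vocabulary,
    rel_vocab: Vocabulary,
) -> List[Tuple[int, int, int]]:
    """Convert one path into a sequence of (id, type, relation) index triples."""
    return [
        (id_vocab.encode(entity), type_vocab.encode(etype), rel_vocab.encode(rel))
        for entity, etype, rel in path
    ]
```

Keeping a separate vocabulary per feature kind means an entity type and a relation type that happen to share a name do not collide.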
17. The apparatus of claim 16, wherein the second determining module is further configured to call an LSTM layer in the preset processing model to respectively process the multiple sets of path features, so as to obtain multiple feature vectors respectively corresponding to the multiple sets of path features.
18. The apparatus according to claim 17, wherein the second determining module is further configured to invoke the Attention layer in the preset processing model to calculate weight values of the plurality of paths according to the plurality of feature vectors; to calculate, by using the Attention layer, a decision probability that a preset relationship exists between the first target object and the second target object according to the plurality of paths and the weight values of the plurality of paths; and to output the decision probability and the weight values of the plurality of paths.
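The Attention-layer computation of claim 18 can be sketched with a common dot-product attention form. The claims do not specify the scoring function, so the following is an assumption: a query vector scores each path feature vector, a softmax turns the scores into path weight values, and a sigmoid over the weighted sum gives the decision probability. The parameters `query`, `out_weights`, and `out_bias` stand in for learned values and are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_decision(
    path_vectors: Sequence[Sequence[float]],
    query: Sequence[float],
    out_weights: Sequence[float],
    out_bias: float = 0.0,
) -> Tuple[float, List[float]]:
    """Return (decision probability, per-path weight values)."""
    # One weight value per path, derived from its feature vector.
    weights = softmax([dot(query, vec) for vec in path_vectors])
    dim = len(path_vectors[0])
    # Weighted sum of the path feature vectors.
    context = [
        sum(w * vec[d] for w, vec in zip(weights, path_vectors)) for d in range(dim)
    ]
    # Sigmoid maps the pooled representation to a probability in (0, 1).
    probability = 1.0 / (1.0 + math.exp(-(dot(out_weights, context) + out_bias)))
    return probability, weights
```

Because the weight values sum to one, they can be read directly as the relative contribution of each path to the decision.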
19. The apparatus according to claim 18, wherein the second determining module is further configured to screen out, according to the weight values of the plurality of paths, one or more paths with the largest weight values from the plurality of paths as target paths associated with the processing result; and to use the target paths as the interpretation data for the processing result.
20. A server comprising a processor and a memory for storing processor-executable instructions which, when executed by the processor, implement the steps of the method of any one of claims 1 to 10.
CN202011062803.1A 2020-09-30 2020-09-30 Data processing method and device and server Pending CN112328802A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011062803.1A CN112328802A (en) 2020-09-30 2020-09-30 Data processing method and device and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011062803.1A CN112328802A (en) 2020-09-30 2020-09-30 Data processing method and device and server

Publications (1)

Publication Number Publication Date
CN112328802A true CN112328802A (en) 2021-02-05

Family

ID=74314413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011062803.1A Pending CN112328802A (en) 2020-09-30 2020-09-30 Data processing method and device and server

Country Status (1)

Country Link
CN (1) CN112328802A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966099A (en) * 2021-02-26 2021-06-15 北京金堤征信服务有限公司 Relation graph display method and device and computer readable storage medium
CN113127649A (en) * 2021-05-07 2021-07-16 支付宝(杭州)信息技术有限公司 Map construction method and device
CN113792800A (en) * 2021-09-16 2021-12-14 创新奇智(重庆)科技有限公司 Feature generation method and device, electronic device and storage medium
CN114925167A (en) * 2022-05-20 2022-08-19 武汉众智数字技术有限公司 Case processing method and system based on knowledge graph

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019093677A1 (en) * 2017-11-09 2019-05-16 정진욱 Three-dimensional based electronic navigational chart display system
CN109886492A (en) * 2019-02-26 2019-06-14 浙江鑫升新能源科技有限公司 Photovoltaic power generation power prediction model and its construction method based on Attention LSTM
CN109919194A (en) * 2019-01-31 2019-06-21 北京明略软件***有限公司 Piece identity's recognition methods, system, terminal and storage medium in a kind of event
CN110334221A (en) * 2019-07-18 2019-10-15 桂林电子科技大学 A kind of interpretation recommended method in knowledge based map path
CN110390465A (en) * 2019-06-18 2019-10-29 深圳壹账通智能科技有限公司 Air control analysis and processing method, device and the computer equipment of business datum
CN111309879A (en) * 2020-01-20 2020-06-19 北京文思海辉金信软件有限公司 Knowledge graph-based man-machine training scene construction method and device
CN111400504A (en) * 2020-03-12 2020-07-10 支付宝(杭州)信息技术有限公司 Method and device for identifying enterprise key people

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019093677A1 (en) * 2017-11-09 2019-05-16 정진욱 Three-dimensional based electronic navigational chart display system
CN109919194A (en) * 2019-01-31 2019-06-21 北京明略软件***有限公司 Piece identity's recognition methods, system, terminal and storage medium in a kind of event
CN109886492A (en) * 2019-02-26 2019-06-14 浙江鑫升新能源科技有限公司 Photovoltaic power generation power prediction model and its construction method based on Attention LSTM
CN110390465A (en) * 2019-06-18 2019-10-29 深圳壹账通智能科技有限公司 Air control analysis and processing method, device and the computer equipment of business datum
CN110334221A (en) * 2019-07-18 2019-10-15 桂林电子科技大学 A kind of interpretation recommended method in knowledge based map path
CN111309879A (en) * 2020-01-20 2020-06-19 北京文思海辉金信软件有限公司 Knowledge graph-based man-machine training scene construction method and device
CN111400504A (en) * 2020-03-12 2020-07-10 支付宝(杭州)信息技术有限公司 Method and device for identifying enterprise key people

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
周毅仪: "基于多源海道测绘数据库的海事空间信息服务研究", 《测绘通报》, no. 08, pages 21 - 23 *
彭文 等: "电力市场中基于Attention-LSTM的短期负荷预测模型", 《电网技术》, vol. 43, no. 05, 31 May 2019 (2019-05-31), pages 1745 - 1751 *
徐国庆 等: "基于Attenton-LSTM神经网络的船舶航行预测", 《舰船科学技术》, vol. 41, no. 23, 8 December 2019 (2019-12-08), pages 177 - 180 *
於张闲 等: "基于深度学习的虚假健康信息识别", 《软件导刊》, vol. 19, no. 03, 15 March 2020 (2020-03-15), pages 16 - 20 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966099A (en) * 2021-02-26 2021-06-15 北京金堤征信服务有限公司 Relation graph display method and device and computer readable storage medium
CN113127649A (en) * 2021-05-07 2021-07-16 支付宝(杭州)信息技术有限公司 Map construction method and device
CN113792800A (en) * 2021-09-16 2021-12-14 创新奇智(重庆)科技有限公司 Feature generation method and device, electronic device and storage medium
CN113792800B (en) * 2021-09-16 2023-12-19 创新奇智(重庆)科技有限公司 Feature generation method and device, electronic equipment and storage medium
CN114925167A (en) * 2022-05-20 2022-08-19 武汉众智数字技术有限公司 Case processing method and system based on knowledge graph

Similar Documents

Publication Publication Date Title
CN107818344B (en) Method and system for classifying and predicting user behaviors
CN109241461B (en) User portrait construction method and device
CN107515915B (en) User identification association method based on user behavior data
CN112328802A (en) Data processing method and device and server
CN109118316B (en) Method and device for identifying authenticity of online shop
CN110516173B (en) Illegal network station identification method, illegal network station identification device, illegal network station identification equipment and illegal network station identification medium
US11665247B2 (en) Resource discovery agent computing device, software application, and method
CN113095408A (en) Risk determination method and device and server
CN111026853A (en) Target problem determination method and device, server and customer service robot
CN111427915A (en) Information processing method and device, storage medium and electronic equipment
CN111126071A (en) Method and device for determining questioning text data and data processing method of customer service group
CN112347457A (en) Abnormal account detection method and device, computer equipment and storage medium
CN113850669A (en) User grouping method and device, computer equipment and computer readable storage medium
CN113408627A (en) Target object determination method and device and server
CN112035676A (en) User operation behavior knowledge graph construction method and device
CN114139052B (en) Ranking model training method for intelligent recommendation, intelligent recommendation method and device
CN115601042A (en) Information identification method and device, electronic equipment and storage medium
CN112001792B (en) Configuration information consistency detection method and device
CN114493850A (en) Artificial intelligence-based online notarization method, system and storage medium
CN113516398A (en) Risk equipment identification method and device based on hierarchical sampling and electronic equipment
CN110070371B (en) Data prediction model establishing method and equipment, storage medium and server thereof
CN112214387B (en) Knowledge graph-based user operation behavior prediction method and device
CN113052647A (en) Recommendation method and device for cold start and computer readable storage medium
CN113408263A (en) Criminal period prediction method and device, storage medium and electronic device
CN112507079B (en) Document case situation matching method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40045911

Country of ref document: HK

TA01 Transfer of patent application right

Effective date of registration: 20230106

Address after: 310099 B, Huanglong Times Square, 18 Wan Tang Road, Xihu District, Hangzhou, Zhejiang.

Applicant after: Alipay.com Co.,Ltd.

Address before: 310000 801-11 section B, 8th floor, 556 Xixi Road, Xihu District, Hangzhou City, Zhejiang Province

Applicant before: Alipay (Hangzhou) Information Technology Co.,Ltd.
