CN113886598A - Knowledge graph representation method based on federal learning - Google Patents

Info

Publication number
CN113886598A
CN113886598A (application CN202111134706.3A)
Authority
CN
China
Prior art keywords
entity
client
matrix
knowledge graph
embedded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111134706.3A
Other languages
Chinese (zh)
Inventor
Zhang Wen
Chen Mingyang
Yao Zhen
Chen Huajun
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202111134706.3A
Publication of CN113886598A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q 30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a knowledge graph representation learning method based on federated learning, which comprises the following steps: first, a central server and a plurality of clients are established; the central server aggregates entity embeddings from different clients and sends the aggregated embeddings back to each client; each client updates its entity and relation embeddings using its local triples and sends the updated entity embedding matrix back to the central server. Further, since the embeddings learned by the federated knowledge graph embedding framework are complementary to embeddings trained on a single knowledge graph without the federated setting, a knowledge graph fusion step is provided to fuse the embeddings learned with and without the federated setting. The method allows multiple knowledge graphs to complement one another while ensuring data privacy, and has good practical value for the knowledge graph completion task.

Description

Knowledge graph representation method based on federal learning
Technical Field
The invention belongs to the technical field of knowledge graph representation, and particularly relates to a knowledge graph representation learning method based on federal learning.
Background
A knowledge graph (KG) is a data set in which head entities and tail entities are connected by relations in the form of triples. A triple is written (head entity, relation, tail entity), abbreviated (h, r, t). Many large-scale knowledge graphs such as FreeBase, YAGO and WordNet have been built, and they provide an effective basis for many important AI tasks such as semantic search, recommendation and question answering. Knowledge graphs often contain a large amount of information, of which two of the more important kinds are structural information and textual information. Structural information refers to the relations that hold between an entity and other entities, and the structural information of an entity is often embodied by its neighboring triples; textual information refers to the semantic information in the textual descriptions of the entities and relations in the knowledge graph, and is usually embodied by the names of entities and relations, additional word descriptions of them, and so on. However, many knowledge graphs are still incomplete, so predicting missing triples from the existing triples, i.e., the knowledge graph completion (KGC) task, is important.
In practical knowledge graph applications, it is common for the same entity to appear in different knowledge graphs; in this case the knowledge graph becomes a multi-source knowledge graph, and multiple knowledge graphs can complement each other's knowledge and obtain better performance on link prediction and other problems. However, knowledge graphs often involve sensitive fields (such as finance or medicine) or are restricted by regulations. Therefore, how to exploit the complementary capabilities of different related knowledge graphs while protecting data privacy is an urgent problem in practical applications.
Patent application publication No. CN112200321A discloses an inference method based on knowledge federation and graph networks, comprising: each participant server constructs a knowledge graph from local entity data; generates a low-dimensional knowledge vector from the initial node features and structural information of the knowledge graph using a pre-trained graph neural network model; and sends each low-dimensional knowledge vector to a trusted third-party server; the trusted third-party server fuses the received low-dimensional knowledge vectors using a pre-trained fusion model to obtain a fused feature representation; and, for a knowledge inference request, inference is performed over the indexed fused feature representation to obtain an inference result.
Patent application publication No. CN111767411A discloses a knowledge graph representation learning optimization method, which determines a training sample set from a local knowledge graph data set; and carrying out federal learning on a local knowledge graph representation learning model based on the training sample set and the other data ends to obtain a target knowledge graph representation learning model, wherein the other data ends participate in federal learning based on the training sample set determined in the local knowledge graph data sets.
Both of the above technical schemes learn knowledge graphs by federated learning, but neither considers fusing the entity embedded representations obtained with and without participation in federated learning, so the resulting entity embedded representations are inaccurate.
Disclosure of Invention
In view of the above, the invention provides a knowledge graph representation learning method based on federated learning, which protects data privacy while obtaining a global entity embedded representation through federated learning; each client fuses the entity embedded representations obtained with and without participation in federated learning, thereby further improving the accuracy of the entity embedded representation.
The technical scheme provided by the invention is as follows:
a knowledge graph representation learning method based on federated learning comprises the following steps:
(1) the central server maintains an entity list of all knowledge graphs, defines a permutation matrix and a presence vector for each client, initializes an entity embedding matrix, screens the entity embedding matrix according to the permutation matrices to determine the entity embedding matrix corresponding to each client, and sends it to the clients;
(2) and circulating the federal learning process: the client side performs knowledge graph representation learning by adopting a local knowledge graph according to the received entity embedded matrix to update the entity embedded matrix, and uploads the updated entity embedded matrix to the central server; the central server aggregates all entity embedded matrixes uploaded by all the clients according to the permutation matrixes and the existence vectors, screens the aggregated entity embedded matrixes according to the permutation matrixes to determine entity embedded matrixes corresponding to all the clients and issues the entity embedded matrixes to the clients;
(3) and each client side fuses the entity embedded matrix determined by participating in the federal learning and the entity embedded matrix determined by not participating in the federal learning so as to determine a final entity embedded matrix.
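The inner loop of steps (1) and (2) can be sketched in a few lines. The following is a minimal NumPy sketch of one server/client round under simplifying assumptions (dense matrices, a client's local training modeled as a plain update function); all names are illustrative, not from the patent:

```python
import numpy as np

def federated_round(server_E, local_updates, P, v):
    """One round of steps (1) and (2), as a NumPy sketch.

    server_E      : (n, d) global entity embedding matrix on the server
    local_updates : list of functions; local_updates[c] stands in for
                    client c's local knowledge-graph training, mapping its
                    embedding matrix to an updated one
    P             : list of (n, n_c) 0/1 permutation matrices, one per client
    v             : list of (n,) 0/1 presence vectors, one per client
    """
    uploaded = []
    for c in range(len(local_updates)):
        E_c = P[c].T @ server_E                 # server screens and sends down
        uploaded.append(local_updates[c](E_c))  # client updates with local triples
    counts = np.maximum(sum(v), 1)              # clients holding each entity
    total = sum(P[c] @ uploaded[c] for c in range(len(uploaded)))
    return total / counts[:, None]              # per-entity weighted average
```

Entities held by several clients are averaged across them; entities held by a single client pass through unchanged.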
Compared with the prior art, the invention has the beneficial effects that at least:
the method comprises the steps that a local knowledge graph of each client is introduced into federal setting, a traditional knowledge graph completion task is expressed as a new task, namely the federal knowledge graph completion task, privacy of the knowledge graphs of the clients is guaranteed while a plurality of knowledge graphs are utilized, and moreover, the accuracy of an entity embedding matrix participating in federal learning and determining is improved in a federal learning mode.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a knowledge graph representation learning method based on federated learning according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples, while indicating the scope of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Inspired by the concepts of federated learning and knowledge graph embedding, and in order to fully utilize different related knowledge graphs while ensuring data privacy, this embodiment provides a knowledge graph representation learning method based on federated learning to learn embeddings of shared entities. The system that realizes this learning comprises a plurality of clients and a central server that communicates with each client, each client holding a local knowledge graph. Specifically, by learning embeddings of shared entities, the knowledge graph embeddings gain the ability to predict missing triples without collecting triples from the different client servers onto the central server. Privacy is protected by training embeddings on each client and aggregating embeddings on the server side, without gathering the clients' triples into the central server. For each client, its triples and relation set are never disclosed to other clients, and its entity set is read only by the central server, so data privacy is guaranteed. In addition, a model fusion process on each client is designed in order to fuse the embeddings learned on a single client with those learned in the federated environment.
Fig. 1 is a flowchart of a knowledge graph representation learning method based on federated learning according to an embodiment. As shown in fig. 1, the knowledge graph representation learning method based on federated learning provided in the embodiment includes the following steps:
and step 1, establishing a federal learning system.
In the embodiment, the established federated learning system comprises a plurality of clients and one central server. The central server exchanges data with each client individually, the clients do not exchange data with each other, and each client holds a local knowledge graph. Let E denote the entity set of a knowledge graph, R its relation set, and TP its triple set. The federated knowledge graph is

G = {G_1, G_2, ..., G_C}

where G_c = {E_c, R_c, TP_c} represents the knowledge graph stored at the c-th client and C denotes the total number of clients.
And 2, maintaining an entity list of all knowledge graphs by the central server, and defining a permutation matrix and a presence vector for each client.
In an embodiment, an entity list T is maintained in the central server, and all unique entities included in the knowledge graph of all clients participating in federal learning are recorded in the entity list T, that is, the entities maintained in the entity list T are not duplicated.
In the embodiment, the permutation matrix defined by the central server for each client is

P^c ∈ {0, 1}^(n × n_c), c = 1, 2, ..., C

where each element P^c_ij of the permutation matrix P^c represents the correspondence between entities in the central server's entity list and entities of the client: P^c_ij = 1 indicates that the i-th entity in the entity list is the j-th entity of the c-th client, and P^c_ij = 0 indicates that it is not. C denotes the total number of clients, n the number of entities in the entity list, and n_c the number of entities of the c-th client.

The presence vector defined by the central server for each client is

v^c ∈ {0, 1}^n, c = 1, 2, ..., C

where each element v^c_i of the presence vector v^c indicates whether the i-th entity exists at the c-th client: v^c_i = 1 indicates that the c-th client has the i-th entity, and v^c_i = 0 indicates that it does not.
The permutation matrix and the presence vector are used for subsequent screening and aggregation of entity embedding matrixes corresponding to the clients.
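As a concrete illustration of these definitions, the permutation matrix and presence vector for one client can be built from the server's entity list and the client's entity list. This is a hypothetical helper (the function name and list-based representation are assumptions, not from the patent):

```python
import numpy as np

def build_permutation_and_presence(global_entities, client_entities):
    """Construct P^c and v^c for one client.

    global_entities : server's deduplicated entity list (length n)
    client_entities : the c-th client's entity list (length n_c)
    """
    n, n_c = len(global_entities), len(client_entities)
    index = {e: j for j, e in enumerate(client_entities)}
    P = np.zeros((n, n_c), dtype=int)  # P[i, j] = 1 iff global entity i is client entity j
    v = np.zeros(n, dtype=int)         # v[i] = 1 iff the client holds global entity i
    for i, e in enumerate(global_entities):
        if e in index:
            P[i, index[e]] = 1
            v[i] = 1
    return P, v
```

Each column of P then holds exactly one 1, and v marks which rows of the server's matrix belong to the client.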
And 3, initializing the entity embedded matrix by the central server, screening the entity embedded matrix according to the permutation matrix to determine the entity embedded matrix corresponding to each client and issuing the entity embedded matrix to the client.
In the embodiment, the entity embedding matrix initialized by the central server is

E ∈ R^(n × d_e)

where d_e is the embedding dimension of the entities. Before being sent to each client, the entity embedding matrix is screened to determine the matrix corresponding to each client. Specifically, the entity embedding matrix is screened according to the permutation matrix using the following formula:

E^c_t = (P^c)^T · E_t

where E_t is the entity embedding matrix aggregated by the central server in the t-th round of federated learning, P^c is the permutation matrix, T denotes transposition, and E^c_t is the entity embedding matrix corresponding to the c-th client in the t-th round. When t = 0, E_0 is the entity embedding matrix initialized by the central server and E^c_0 is the initial entity embedding matrix for the c-th client.

In the t-th round, the central server sends the entity embedding matrix E^c_t to each client; only the client-specific entity embedding matrix is returned to the corresponding client, which ensures that entity and relation data cannot be leaked.
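A small numeric check of the screening step (toy values, not from the patent): the transposed 0/1 permutation matrix simply gathers, in client order, the rows of the global embedding matrix belonging to the client's entities.

```python
import numpy as np

# Screening E^c_t = (P^c)^T E_t with a 3-entity global list and a client
# that holds global entities 2 and 0 (in that local order).
E = np.array([[1., 2.], [3., 4.], [5., 6.]])  # global matrix, n = 3, d_e = 2
P_c = np.array([[0, 1], [0, 0], [1, 0]])       # client's permutation matrix
E_c = P_c.T @ E                                # shape (n_c, d_e)
assert np.allclose(E_c, [[5., 6.], [1., 2.]])  # rows 2 and 0, in client order
```

The client never sees rows for entities it does not hold, which is how the screening restricts what leaves the server.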
And step 4, each client and the central server train through federated learning based on the entity embedding matrices and the knowledge graphs so as to update the entity embedding matrices.
In the examples, the federated learning process is cycled: the client side performs knowledge graph representation learning by adopting a local knowledge graph according to the received entity embedded matrix to update the entity embedded matrix, and uploads the updated entity embedded matrix to the central server; and the central server aggregates all the entity embedded matrixes uploaded by all the clients according to the permutation matrixes and the existence vectors, screens the aggregated entity embedded matrixes according to the permutation matrixes to determine the entity embedded matrixes corresponding to all the clients and issues the entity embedded matrixes to the clients.
In the embodiment, after each client receives the entity embedding matrix E^c_t, it updates the entity embeddings using its local triples and its relation embedding matrix, and the client also updates its own relation embedding matrix. When a client performs knowledge graph representation learning on its local knowledge graph, for each triple (h, r, t) consisting of a head entity, a relation and a tail entity, a scoring function is used to construct a loss function; the embedding model is trained with this loss function, and the entity embedding matrix is updated at the same time;

wherein the loss function is:

L(h, r, t) = −log σ(γ + f_r(h, r, t)) − Σ_{i=1}^{m} p(h, r, t'_i) · log σ(−f_r(h, r, t'_i) − γ)

where L(h, r, t) is the loss of the triple (h, r, t), f_r(h, r, t) is the scoring function of the triple (h, r, t), f_r(h, r, t'_i) is the scoring function of the triple (h, r, t'_i) obtained by negatively sampling the tail entity t of (h, r, t) as the i-th entity t'_i, m is the number of negative samples, γ is a margin value ranging over the real numbers, σ(·) is the sigmoid function, and p(h, r, t'_i) is the negative sample weight of the triple (h, r, t'_i), defined as:

p(h, r, t'_i) = exp(α · f_r(h, r, t'_i)) / Σ_{j=1}^{m} exp(α · f_r(h, r, t'_j))

where α represents the negative sampling temperature.
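This is a self-adversarial negative-sampling loss in the style of RotatE. A minimal NumPy sketch for a single positive triple and its negatives; the γ and α values are arbitrary placeholders, and the sign convention assumes higher scores mean more plausible triples:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_score, neg_scores, gamma=9.0, alpha=1.0):
    """Loss for one triple (sketch; gamma and alpha are placeholder values).

    pos_score  : f_r(h, r, t), scalar score of the positive triple
    neg_scores : array of f_r(h, r, t'_i) for the m negative triples
    """
    # p(h, r, t'_i): softmax over negative scores with temperature alpha,
    # so harder (higher-scoring) negatives get more weight
    w = np.exp(alpha * neg_scores)
    p = w / w.sum()
    pos_term = -np.log(sigmoid(gamma + pos_score))
    neg_term = -(p * np.log(sigmoid(-neg_scores - gamma))).sum()
    return float(pos_term + neg_term)
```

In practice this would be computed with an autodiff framework so the embeddings can be updated by gradient descent; the NumPy version only shows the arithmetic.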
In the embodiment, the specific form of the scoring function may vary with the embedding model selected; the following table lists embedding models that may be used and the corresponding scoring functions.

TABLE 1

Model     | Scoring function f_r(h, r, t)
TransE    | −‖h + r − t‖
DistMult  | h^T diag(r) t
ComplEx   | Re(h^T diag(r) t̄)
RotatE    | −‖h ∘ r − t‖

It is understood that: the embedding model may adopt the TransE model with corresponding scoring function f_r(h, r, t) = −‖h + r − t‖; or the DistMult model with corresponding scoring function f_r(h, r, t) = h^T diag(r) t; or the ComplEx model with corresponding scoring function f_r(h, r, t) = Re(h^T diag(r) t̄); or the RotatE model with corresponding scoring function f_r(h, r, t) = −‖h ∘ r − t‖; where diag(·) denotes a diagonal matrix, t̄ denotes the complex conjugate of the entity t, Re(·) denotes the real part, and ∘ denotes element-wise multiplication.
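The scoring functions in Table 1 are one-liners over the embedding vectors. A NumPy sketch of three of them, taking ComplEx embeddings as complex arrays; the formulas are the standard ones, though the helper names are illustrative:

```python
import numpy as np

def score_transe(h, r, t):
    # TransE: f_r(h, r, t) = -||h + r - t||
    return -float(np.linalg.norm(h + r - t))

def score_distmult(h, r, t):
    # DistMult: f_r(h, r, t) = h^T diag(r) t, a tri-linear dot product
    return float(np.sum(h * r * t))

def score_complex(h, r, t):
    # ComplEx: f_r(h, r, t) = Re(h^T diag(r) conj(t)), complex embeddings
    return float(np.real(np.sum(h * r * np.conj(t))))
```

Any of these can be plugged into the client-side loss as f_r without changing the rest of the federated procedure.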
After the clients have carried out P local training rounds in round t − 1, each client uploads its trained entity embedding matrix E^c_t to the central server, and the central server aggregates all the entity embedding matrices uploaded by the clients according to the permutation matrices and presence vectors:

E_t ← (1 ⊘ Σ_{c=1}^{C} v^c) ⊗ Σ_{c=1}^{C} P^c · E^c_t

where P^c is the permutation matrix of the c-th client, v^c is the presence vector of the c-th client, E^c_t is the entity embedding matrix corresponding to the c-th client in the t-th round of federated learning, E_t is the entity embedding matrix aggregated by the central server in the t-th round, 1 denotes the all-ones vector (a vector whose elements are all 1), ⊘ denotes element-wise division, ⊗ denotes element-wise multiplication, and ← denotes assignment. The aggregation formula can be understood as follows: the entity embedding matrices from different clients are permuted into list order by the corresponding permutation matrices, and a weight vector is computed from the number of clients at which each entity exists, as given by the presence vectors.
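A numeric check of the aggregation formula with two clients and three global entities (toy values, not from the patent). Entity 1 is held by both clients, so its embedding is averaged; entities 0 and 2 are copied from their single owners:

```python
import numpy as np

# Client 0 holds global entities {0, 1}; client 1 holds {1, 2}.
P0 = np.array([[1, 0], [0, 1], [0, 0]]); v0 = np.array([1, 1, 0])
P1 = np.array([[0, 0], [1, 0], [0, 1]]); v1 = np.array([0, 1, 1])
E0 = np.array([[2.], [4.]])    # client 0's uploaded matrix (entities 0, 1)
E1 = np.array([[6.], [8.]])    # client 1's uploaded matrix (entities 1, 2)

total = P0 @ E0 + P1 @ E1      # sum in global list order: [[2], [10], [8]]
counts = v0 + v1               # clients per entity: [1, 2, 1]
E_t = total / counts[:, None]  # element-wise division by the weight vector
assert np.allclose(E_t, [[2.], [5.], [8.]])  # entity 1 averaged: (4 + 6) / 2
```

A production implementation would also guard against entities held by no client (a zero count), which this toy example does not exercise.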
And 5, fusing the entity embedded matrix determined by participating in the federal learning and the entity embedded matrix determined by not participating in the federal learning by each client so as to determine a final entity embedded matrix.
The embeddings learned by the federated knowledge graph embedding framework are complementary to embeddings trained on only one knowledge graph without the federated setting. The embodiment therefore also designs a fusion process applied on each client's knowledge graph to fuse the entity embedding matrices learned with and without the federated setting. The process by which each client fuses the entity embedding matrix determined with participation in federated learning and the one determined without is as follows:
after splicing the entity embedded matrix determined by participating in the federal learning and the entity embedded matrix determined by not participating in the federal learning, fusing the splicing result by using a trained linear classifier, which is specifically represented as:
Figure BDA0003281854820000091
wherein,
Figure BDA0003281854820000092
[;]indicating that the two score vectors are concatenated by column,
Figure BDA0003281854820000093
Figure BDA0003281854820000094
score vectors representing training of a single knowledge-graph not participating in federal learning,
Figure BDA0003281854820000095
a scoring vector representing federal knowledge graph training participating in federal learning.
In the embodiment, the linear classifier is trained by interval-loss ranking, so that positive triples are ranked higher than negative triples; the loss function adopted in training is:

L_f(h, r, t) = max(0, β − s(h, r, t) + s(h, r, t'))

where s(h, r, t) is the fusion score of the positive triple (h, r, t) under the linear classifier, s(h, r, t') is the fusion score of the negative triple (h, r, t') under the linear classifier, and β is an interval (margin) parameter ranging over the real numbers. The training goal of the linear classifier is to minimize this triple model-fusion loss.
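The fusion score and the interval loss can be sketched as follows; `w` stands in for the linear classifier's trained weights, and all names and values are illustrative, not from the patent:

```python
import numpy as np

def fused_score(w, s_local, s_fed):
    # s(h, r, t) = w^T [s_local ; s_fed]: the non-federated and federated
    # score vectors are concatenated and fused by a linear classifier
    return float(w @ np.concatenate([s_local, s_fed]))

def interval_loss(s_pos, s_neg, beta=1.0):
    # max(0, beta - s(h,r,t) + s(h,r,t')): push the positive triple's fused
    # score above the negative's by at least the margin beta (placeholder value)
    return max(0.0, beta - s_pos + s_neg)
```

When the positive triple already outranks the negative by more than β the loss is zero, so training focuses on pairs the fused model still orders incorrectly.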
The knowledge graph representation learning method based on federated learning provided by the embodiment can be applied to financial knowledge graphs and related applications that require privacy protection. For example, different financial institutions construct and maintain different relations between entities including but not limited to customers, companies and financial products; the relations may include purchasing, following, holding, and so on. For the task of classifying customers, each financial institution classifies customers on its own, but a better classification effect could be obtained if the data of different institutions could be utilized; under privacy-protection requirements, however, different financial institutions are unwilling to exchange data directly. The federated-learning-based knowledge graph representation learning method presented herein can effectively address this problem: the different financial institutions are regarded as clients, and a trusted institution, such as a government body, is regarded as the server. In the end, each client learns its own embedded representations, which fuse the information of the other clients on the basis of privacy protection, so that a better customer classification effect can be obtained.
In the knowledge graph representation learning method based on federated learning provided in the above embodiment, the local knowledge graph of each client is introduced into a federated setting and the conventional knowledge graph completion task is recast as a new task, the federated knowledge graph completion task, so that the privacy of each client's knowledge graph is guaranteed while multiple knowledge graphs are utilized; furthermore, the federated learning manner improves the accuracy of the entity embedding matrices determined through federated learning.
The above-mentioned embodiments are intended to illustrate the technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only the most preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, additions, equivalents, etc. made within the scope of the principles of the present invention should be included in the scope of the present invention.

Claims (8)

1. A knowledge graph representation learning method based on federal learning is characterized in that a system for realizing the method comprises a plurality of clients and a central server which is respectively communicated with each client, each client is provided with a local knowledge graph, and the method comprises the following steps:
(1) the central server maintains an entity list of all knowledge maps, defines a permutation matrix and a presence vector for each client, initializes an entity embedded matrix, screens the entity embedded matrix according to the permutation matrix to determine an entity embedded matrix corresponding to each client and sends the entity embedded matrix to the clients;
(2) and circulating the federal learning process: the client side performs knowledge graph representation learning by adopting a local knowledge graph according to the received entity embedded matrix to update the entity embedded matrix, and uploads the updated entity embedded matrix to the central server; the central server aggregates all entity embedded matrixes uploaded by all the clients according to the permutation matrixes and the existence vectors, screens the aggregated entity embedded matrixes according to the permutation matrixes to determine entity embedded matrixes corresponding to all the clients and issues the entity embedded matrixes to the clients;
(3) and each client side fuses the entity embedded matrix determined by participating in the federal learning and the entity embedded matrix determined by not participating in the federal learning so as to determine a final entity embedded matrix.
2. The knowledge graph representation learning method based on federated learning of claim 1, wherein the permutation matrix defined by the central server for each client is

P^c ∈ {0, 1}^(n × n_c), c = 1, 2, ..., C

where each element P^c_ij of the permutation matrix P^c represents the correspondence between entities in the central server's entity list and entities of the client: P^c_ij = 1 indicates that the i-th entity in the entity list is the j-th entity of the c-th client, and P^c_ij = 0 indicates that it is not; C denotes the total number of clients, n the number of entities in the entity list, and n_c the number of entities of the c-th client;

the presence vector defined by the central server for each client is

v^c ∈ {0, 1}^n, c = 1, 2, ..., C

where each element v^c_i of the presence vector v^c indicates whether the i-th entity exists at the c-th client: v^c_i = 1 indicates that the c-th client has the i-th entity, and v^c_i = 0 indicates that it does not.
3. The knowledge graph representation learning method based on federated learning of claim 1, wherein the entity embedding matrix is screened according to the permutation matrix to determine the entity embedding matrix corresponding to each client using the following formula:

E^c_t = (P^c)^T · E_t

where E_t is the entity embedding matrix aggregated by the central server in the t-th round of federated learning, P^c is the permutation matrix, T denotes transposition, and E^c_t is the entity embedding matrix corresponding to the c-th client in the t-th round; when t = 0, E_0 is the entity embedding matrix initialized by the central server and E^c_0 is the initial entity embedding matrix for the c-th client.
4. The knowledge graph representation learning method based on federated learning of claim 1, wherein, when the client performs knowledge graph representation learning on its local knowledge graph with the received entity embedding matrix, for each triplet (h, r, t) consisting of a head entity, a relation and a tail entity in the knowledge graph, a scoring function is used to construct a loss function, the embedding model is trained with the loss function, and the entity embedding matrix is updated at the same time;

wherein the loss function is:

L(h, r, t) = -log σ(γ + f_r(h, r, t)) - Σ_{i=1}^{m} p(h, r, t'_i) log σ(-f_r(h, r, t'_i) - γ)

wherein L(h, r, t) is the loss function of the triplet (h, r, t), f_r(h, r, t) represents the scoring function of the triplet (h, r, t), f_r(h, r, t'_i) denotes the scoring function of the triplet (h, r, t'_i) obtained by negatively sampling the tail entity t of the triplet (h, r, t) as the i-th entity t'_i, m is the number of negative samples, γ represents a margin value whose value range is the whole real number set, σ(·) represents the sigmoid function, and p(h, r, t'_i) denotes the negative-sample weight of the triplet (h, r, t'_i).
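A minimal sketch of the claim-4 loss, assuming the usual self-adversarial negative-sampling reading (as in RotatE/FedE) in which a higher score means a more plausible triple; the sampling temperature `alpha` is an assumption not spelled out in the claim:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_score, neg_scores, gamma=6.0, alpha=1.0):
    """Sketch: L = -log sig(gamma + f(pos))
                 - sum_i p_i * log sig(-f(neg_i) - gamma),
    with p_i the softmax of the negative scores (self-adversarial
    weights p(h, r, t'_i))."""
    p = np.exp(alpha * neg_scores)
    p = p / p.sum()                      # negative-sample weights
    pos_term = -np.log(sigmoid(gamma + pos_score))
    neg_term = -(p * np.log(sigmoid(-neg_scores - gamma))).sum()
    return pos_term + neg_term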
5. The knowledge graph representation learning method based on federated learning of claim 4, wherein the embedding model adopts the TransE model, with the corresponding scoring function f_r(h, r, t) = -‖h + r - t‖; or the embedding model adopts the DistMult model, with the corresponding scoring function f_r(h, r, t) = h^T diag(r) t; or the embedding model adopts the ComplEx model, with the corresponding scoring function f_r(h, r, t) = Re(h^T diag(r) t̄); or the embedding model adopts the RotatE model, with the corresponding scoring function f_r(h, r, t) = -‖h ∘ r - t‖;

wherein diag(·) denotes the diagonal matrix formed from a vector, t̄ represents the complex conjugate of the entity t, Re(·) represents the real part, and ∘ represents element-wise multiplication.
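The four scoring functions of claim 5 are each one line of linear algebra; the sketch below uses illustrative 2-dimensional vectors (ComplEx and RotatE operate on complex vectors, with RotatE relations constrained to unit modulus):

```python
import numpy as np

# Scoring-function sketches for claim 5 (higher score = more plausible).
def transe(h, r, t):                   # f = -||h + r - t||
    return -np.linalg.norm(h + r - t)

def distmult(h, r, t):                 # f = h^T diag(r) t
    return float(h @ np.diag(r) @ t)

def complex_score(h, r, t):            # f = Re(h^T diag(r) conj(t))
    return float(np.real(np.sum(h * r * np.conj(t))))

def rotate(h, r, t):                   # f = -||h o r - t||, |r_i| = 1
    return -np.linalg.norm(h * r - t)
```

For TransE the score is maximal (zero) exactly when t = h + r, and for RotatE exactly when t = h ∘ r, i.e. when the relation rotates the head onto the tail in the complex plane.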
6. The knowledge graph representation learning method based on federated learning of claim 1, wherein the central server aggregates the entity embedding matrices uploaded by all clients according to the permutation matrices and presence vectors using the following formula:

E^t ← (1 ⊘ Σ_{c=1}^{C} V^c) ⊗ Σ_{c=1}^{C} P_c E_c^t

wherein P_c represents the permutation matrix of the c-th client, V^c represents the presence vector of the c-th client, E_c^t represents the entity embedding matrix corresponding to the c-th client in the t-th round of federated learning, E^t represents the entity embedding matrix aggregated by the central server in the t-th round of federated learning, 1 represents an all-ones vector, ⊘ represents element-wise division, ⊗ represents element-wise multiplication, and ← represents assignment.
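A sketch of the claim-6 aggregation: each server entity's embedding is averaged over only the clients that hold it. The zero-division guard for entities held by no client is an assumption, since the claim does not state how that case is handled:

```python
import numpy as np

def aggregate(client_Es, Ps, Vs):
    """Claim-6 aggregation sketch:
    E <- (1 / sum_c V^c) elementwise* (sum_c P_c @ E_c),
    averaging each server entity row over the clients holding it."""
    n, d = Ps[0].shape[0], client_Es[0].shape[1]
    num = np.zeros((n, d))
    den = np.zeros(n)
    for E_c, P_c, V_c in zip(client_Es, Ps, Vs):
        num += P_c @ E_c        # scatter client rows back to server rows
        den += V_c              # count how many clients hold each entity
    den = np.maximum(den, 1.0)  # avoid division by zero (assumption)
    return num / den[:, None]
```

An entity shared by two clients ends up as the arithmetic mean of the two uploaded rows; an entity held by one client is passed through unchanged.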
7. The knowledge graph representation learning method based on federated learning of claim 1, wherein each client fuses the entity embedding matrix determined by participating in federated learning with the entity embedding matrix determined without participating in federated learning as follows:

the entity embedding matrix determined by participating in federated learning and the entity embedding matrix determined without participating in federated learning are concatenated, and the concatenated result is fused by a trained linear classifier.
8. The knowledge graph representation learning method based on federated learning of claim 7, wherein the linear classifier is trained with a margin ranking loss such that positive triplets are scored higher than negative triplets, the loss function employed in the training process being:

L_f(h, r, t) = max(0, β - s(h, r, t) + s(h, r, t'))

wherein s(h, r, t) represents the fused score of the positive triplet (h, r, t) under the linear classifier, s(h, r, t') represents the fused score of the negative triplet (h, r, t') under the linear classifier, and β represents the margin parameter, whose value range is the real number set.
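A sketch of the claim-7/claim-8 fusion step: the federated and non-federated scores of a triple are concatenated into a feature vector, a linear classifier w produces the fused score s = w · feat, and training minimizes the margin ranking loss. The two-element feature layout is an assumption for illustration:

```python
import numpy as np

def margin_rank_loss(w, feat_pos, feat_neg, beta=1.0):
    """Claim-8 sketch: fused scores s = w . feat for a positive and a
    negative triple, penalized unless the positive outranks the
    negative by at least the margin beta."""
    s_pos = float(w @ feat_pos)   # fused score of the positive triplet
    s_neg = float(w @ feat_neg)   # fused score of the negative triplet
    return max(0.0, beta - s_pos + s_neg)
```

The loss is zero once s(h, r, t) exceeds s(h, r, t') by β, so gradient steps on w only act on pairs that are still mis-ranked or inside the margin.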
CN202111134706.3A 2021-09-27 2021-09-27 Knowledge graph representation method based on federal learning Pending CN113886598A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111134706.3A CN113886598A (en) 2021-09-27 2021-09-27 Knowledge graph representation method based on federal learning

Publications (1)

Publication Number Publication Date
CN113886598A true CN113886598A (en) 2022-01-04

Family

ID=79006933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111134706.3A Pending CN113886598A (en) 2021-09-27 2021-09-27 Knowledge graph representation method based on federal learning

Country Status (1)

Country Link
CN (1) CN113886598A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180092194A (en) * 2017-02-08 2018-08-17 경북대학교 산학협력단 Method and system for embedding knowledge gragh reflecting logical property of relations, recording medium for performing the method
CN111767411A (en) * 2020-07-01 2020-10-13 深圳前海微众银行股份有限公司 Knowledge graph representation learning optimization method and device and readable storage medium
CN111858955A (en) * 2020-07-01 2020-10-30 石家庄铁路职业技术学院 Knowledge graph representation learning enhancement method and device based on encrypted federated learning
CN112200321A (en) * 2020-12-04 2021-01-08 同盾控股有限公司 Inference method, system, device and medium based on knowledge federation and graph network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MINGYANG CHEN et al.: "FedE: Embedding Knowledge Graphs in Federated Setting", ARXIV, 24 October 2020 (2020-10-24), pages 1 - 11 *
CHEN Xi; CHEN Huajun; ZHANG Wen: "Rule-Enhanced Knowledge Graph Representation Learning Method", Technology Intelligence Engineering, no. 01, 15 February 2017 (2017-02-15) *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115062159A (en) * 2022-06-13 2022-09-16 西南交通大学 Multi-granularity dynamic knowledge graph embedded model construction method based on federal learning
CN115062159B (en) * 2022-06-13 2024-05-24 西南交通大学 Multi-granularity event early warning dynamic knowledge graph embedding model construction method based on federal learning
CN116757275A (en) * 2023-06-07 2023-09-15 京信数据科技有限公司 Knowledge graph federal learning device and method
CN116757275B (en) * 2023-06-07 2024-06-11 京信数据科技有限公司 Knowledge graph federal learning device and method
CN116502709A (en) * 2023-06-26 2023-07-28 浙江大学滨江研究院 Heterogeneous federal learning method and device
CN116703553A (en) * 2023-08-07 2023-09-05 浙江鹏信信息科技股份有限公司 Financial anti-fraud risk monitoring method, system and readable storage medium
CN116703553B (en) * 2023-08-07 2023-12-05 浙江鹏信信息科技股份有限公司 Financial anti-fraud risk monitoring method, system and readable storage medium
CN116821375A (en) * 2023-08-29 2023-09-29 之江实验室 Cross-institution medical knowledge graph representation learning method and system
CN116821375B (en) * 2023-08-29 2023-12-22 之江实验室 Cross-institution medical knowledge graph representation learning method and system

Similar Documents

Publication Publication Date Title
CN113886598A (en) Knowledge graph representation method based on federal learning
US20230039182A1 (en) Method, apparatus, computer device, storage medium, and program product for processing data
Criado et al. Non-iid data and continual learning processes in federated learning: A long road ahead
CN112861967B (en) Social network abnormal user detection method and device based on heterogeneous graph neural network
Yang et al. Friend or frenemy? Predicting signed ties in social networks
CN112232925A (en) Method for carrying out personalized recommendation on commodities by fusing knowledge maps
CN113609398B (en) Social recommendation method based on heterogeneous graph neural network
CN112015749A (en) Method, device and system for updating business model based on privacy protection
CN114677200B (en) Business information recommendation method and device based on multiparty high-dimension data longitudinal federation learning
CN113535825A (en) Cloud computing intelligence-based data information wind control processing method and system
CN111291125B (en) Data processing method and related equipment
Xu et al. Trust propagation and trust network evaluation in social networks based on uncertainty theory
CN112257841A (en) Data processing method, device and equipment in graph neural network and storage medium
CN115686868B (en) Cross-node-oriented multi-mode retrieval method based on federated hash learning
CN111858928A (en) Social media rumor detection method and device based on graph structure counterstudy
CN112861006A (en) Recommendation method and system fusing meta-path semantics
CN115062732A (en) Resource sharing cooperation recommendation method and system based on big data user tag information
CN115098692A (en) Cross-domain recommendation method and device, electronic equipment and storage medium
Yin et al. An efficient recommendation algorithm based on heterogeneous information network
Yang et al. Federated continual learning via knowledge fusion: A survey
Sai et al. Machine un-learning: an overview of techniques, applications, and future directions
CN116541592A (en) Vector generation method, information recommendation method, device, equipment and medium
CN115600642A (en) Streaming media-oriented decentralized federal learning method based on neighbor trust aggregation
Madhavi et al. Gradient boosted decision tree (GBDT) AND Grey Wolf Optimization (GWO) based intrusion detection model
Sheu et al. On the potential of a graph attention network in money laundering detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination