CN113807544B - Training method and device of federal learning model and electronic equipment


Info

Publication number
CN113807544B
Authority
CN
China
Prior art keywords
training
node
feature
data instance
verification
Prior art date
Legal status
Active
Application number
CN202011621994.0A
Other languages
Chinese (zh)
Other versions
CN113807544A (en)
Inventor
王佩琪
张文夕
顾松庠
薄列峰
孙孟哲
Current Assignee
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd
Priority to CN202011621994.0A
Publication of CN113807544A
Priority to JP2023540566A
Priority to KR1020237022514A
Priority to US18/270,281
Priority to PCT/CN2021/143890
Application granted
Publication of CN113807544B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning

Abstract

The application discloses a training method and device for a federal learning model, and an electronic device, applied to a server side. The method comprises the following steps: if a training node meets a preset splitting condition, acquiring a target splitting mode corresponding to the training node, wherein the training node is a node on one of a plurality of lifting trees; notifying the client to perform node splitting based on the target splitting mode, and acquiring the updated training node; when the updated training node meets the training stopping condition, stopping training and generating a target federal learning model; and acquiring a verification set and verifying the target federal learning model in cooperation with a verification client, wherein the verification client is one of the clients participating in federal learning model training. By mixing the transverse and longitudinal splitting modes, the method automatically selects the learning mode that matches the data without needing to care about how the data is distributed, which improves the performance of the federal learning model and reduces the verification loss of the model.

Description

Training method and device of federal learning model and electronic equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a training method and apparatus for a federal learning model, and an electronic device.
Background
Federal learning is an emerging basic technology of artificial intelligence. Its design goal is to carry out efficient machine learning among multiple participants or computing nodes while guaranteeing information security during big data exchange, protecting terminal data and personal data privacy, and ensuring legal compliance. The machine learning algorithms usable in federal learning are not limited to neural networks and also include important algorithms such as random forests. Federal learning is expected to become the basis of the next generation of collaborative algorithms and collaborative networks in artificial intelligence.
According to the data characteristics, federal learning is mainly divided into horizontal federal learning and vertical federal learning. Horizontal federal learning requires the data to be isomorphic, while vertical federal learning requires the data to be heterogeneous. In practice, however, it is difficult to guarantee that the multi-party data used for federal learning is entirely heterogeneous or entirely isomorphic, so part of the isomorphic or heterogeneous data has to be discarded when training a model based on federal learning. When a large amount of data is discarded, the model obtained by federal learning training performs poorly.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the related art to some extent.
Therefore, a first object of the present application is to provide a training method for a federal learning model, which addresses the technical problems of existing federal learning training methods: not all data can be fully utilized for learning, the training effect is poor due to insufficient data utilization, and the verification loss of the model is large.
A second object of the application is to propose another method of training a federal learning model.
A third object of the present application is to provide a training device for a federal learning model.
A fourth object of the application is to propose another training device for federal learning models.
A fifth object of the present application is to propose an electronic device.
A sixth object of the present application is to propose a computer readable storage medium.
In order to achieve the above object, an embodiment of a first aspect of the present application provides a training method of a federal learning model, which is applied to a server side and includes the following steps: if a training node meets a preset splitting condition, acquiring a target splitting mode corresponding to the training node, wherein the training node is a node on one of a plurality of lifting trees; notifying a client to perform node splitting based on the target splitting mode, and acquiring the updated training node; when it is determined that the updated training node meets the training stopping condition, stopping training and generating a target federal learning model; and acquiring a verification set and verifying the target federal learning model in cooperation with a verification client, wherein the verification client is one of the clients participating in federal learning model training.
In addition, the training method of the federal learning model according to the above embodiment of the present application may further have the following additional technical features:
According to one embodiment of the present application, the verifying the target federal learning model in cooperation with the verification client based on the verification set includes: sending a data instance identifier in the verification set and the splitting information of a verification node to the verification client, wherein the verification node is a node on one of the plurality of lifting trees; receiving the node trend corresponding to the verification node sent by the verification client, wherein the node trend is determined by the verification client according to the data instance identifier and the splitting information; entering the next node according to the node trend, and taking the next node as the updated verification node; and if the updated verification node meets the preset node splitting condition, returning to sending the data instance identifier and the splitting information to the verification client, until all data instance identifiers in the verification set have been verified.
According to one embodiment of the present application, further comprising: and if the updated verification node does not meet the preset node splitting condition, determining the updated verification node as a leaf node, and acquiring a model predictive value of the data instance represented by the data instance identifier.
According to one embodiment of the present application, further comprising: if all the data instance identifiers in the verification set are verified, sending a model predictive value of the data instance to the verification client; receiving verification indication information sent by the verification client, wherein the verification indication information is indication information which is obtained according to the model predicted value and used for indicating whether a model is reserved or not; and determining whether to reserve and use the target federal learning model according to the verification indication information, and sending a determination result to the client.
According to an embodiment of the present application, the obtaining the target splitting manner corresponding to the training node includes: based on a first training set, performing horizontal federation learning in cooperation with the client to obtain a first split value corresponding to the training node; based on a second training set, performing longitudinal federation learning in cooperation with the client to obtain a second split value corresponding to the training node; and determining a target splitting mode corresponding to the training node according to the first splitting value and the second splitting value.
According to an embodiment of the present application, the determining, according to the first split value and the second split value, a target split manner corresponding to the training node includes: determining the larger value of the first split value and the second split value as a target split value corresponding to the training node; and determining a splitting mode corresponding to the training node according to the target splitting value.
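As a minimal sketch of this selection rule, assuming the two candidate gains have already been computed (the helper name is hypothetical; the embodiment only states that the larger split value wins):

```python
def choose_split_mode(first_split_value: float, second_split_value: float) -> str:
    """Pick the target splitting mode for a training node.

    first_split_value: best gain found by transverse (horizontal) federal learning.
    second_split_value: best gain found by longitudinal (vertical) federal learning.
    """
    # The larger of the two candidate split values becomes the target split
    # value, and its direction becomes the target splitting mode.
    return "transverse" if first_split_value >= second_split_value else "longitudinal"
```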
According to an embodiment of the present application, the performing, in cooperation with the client, lateral federal learning based on the first training set to obtain the first split value corresponding to the training node includes: generating a first feature subset available to the training node from the first training set and transmitting the first feature subset to the client; receiving the feature value of each feature in the first feature subset sent by the client; determining, according to the feature value of each feature in the first feature subset, the lateral split value of each feature when that feature serves as the split feature point; and determining the first split value of the training node according to the lateral split value corresponding to each feature.
According to an embodiment of the present application, the determining, according to the feature value of each feature in the first feature subset, the lateral split value of each feature when that feature serves as the split feature point includes: for any feature in the first feature subset, determining a splitting threshold of that feature according to its feature values; acquiring, according to the splitting threshold, a first data instance identifier set and a second data instance identifier set corresponding to that feature, wherein the first data instance identifier set comprises the data instance identifiers belonging to a first left subtree space and the second data instance identifier set comprises the data instance identifiers belonging to a first right subtree space; and determining the lateral split value corresponding to that feature according to the first data instance identifier set and the second data instance identifier set.
According to an embodiment of the present application, the obtaining, according to the splitting threshold, a first data instance identifier set and a second data instance identifier set corresponding to the any feature includes: sending the split threshold to the client; receiving an initial data instance identifier set corresponding to the training node sent by the client, wherein the initial data instance identifier set is generated when the client splits any feature according to the splitting threshold, and the initial data instance identifier set comprises data instance identifiers belonging to the first left subtree space; the first set of data instance identifications and the second set of data instance identifications are obtained based on the initial set of data instance identifications and all data instance identifications.
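A sketch of how the server side could derive the two identifier sets from the client's reply, assuming plain sets of identifiers (the helper name is an assumption):

```python
def derive_id_sets(initial_ids: set, all_ids: set) -> tuple:
    """Split the node's data instance identifiers into left/right subtree sets.

    initial_ids: identifiers the client placed in the first left subtree space.
    all_ids: all data instance identifiers present on the training node.
    """
    first_set = set(initial_ids)            # first left subtree space
    second_set = set(all_ids) - first_set   # first right subtree space (complement)
    return first_set, second_set
```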
According to an embodiment of the present application, the obtaining, based on the second training set, the second split value corresponding to the training node by cooperating with the client to perform longitudinal federal learning includes: notifying the client to perform longitudinal federal learning based on the second training set; receiving first gradient information of at least one third data instance identifier set of each feature sent by the client, wherein the third data instance identifier set comprises data instance identifiers belonging to a second left sub-tree space, and the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the feature, and different feature values correspond to different second left sub-tree spaces; according to the first gradient information of each feature and the total gradient information of the training node, respectively determining a longitudinal split value of each feature; and determining the second split value of the training node according to the longitudinal split value corresponding to each feature.
According to an embodiment of the present application, the determining the longitudinal split value of each feature according to the first gradient information of each feature and the total gradient information of the training node includes: for any feature, respectively acquiring second gradient information corresponding to each first gradient information according to the total gradient information and each first gradient information; for each piece of first gradient information, according to the first gradient information and second gradient information corresponding to the first gradient information, acquiring a candidate longitudinal split value of any feature; and selecting the maximum value of the candidate longitudinal split values as the longitudinal split value of any feature.
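The embodiment does not spell out the gain formula, so the following sketch assumes the standard XGBoost split gain (the application names XGBoost among its background concepts): the first gradient information gives the left-branch sums (G_L, H_L), the second gradient information is their complement with respect to the node totals, and lambda_ and gamma are the usual regularization terms, which are assumptions here.

```python
def split_gain(g_left, h_left, g_total, h_total, lambda_=1.0, gamma=0.0):
    # Second gradient information: the complement of the left-branch sums.
    g_right, h_right = g_total - g_left, h_total - h_left
    return 0.5 * (g_left ** 2 / (h_left + lambda_)
                  + g_right ** 2 / (h_right + lambda_)
                  - g_total ** 2 / (h_total + lambda_)) - gamma

def longitudinal_split_value(first_gradient_infos, g_total, h_total):
    """first_gradient_infos: one (G_L, H_L) pair per candidate left subtree
    space of the feature; returns the feature's longitudinal split value."""
    return max(split_gain(g, h, g_total, h_total)
               for g, h in first_gradient_infos)
```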
According to one embodiment of the application, the validation set is mutually exclusive from the first training set and the second training set, respectively.
The training method of the federal learning model according to the embodiment of the first aspect of the application lets the server side automatically select the suitable learning mode by mixing the transverse and longitudinal splitting modes, without needing to care about the data distribution mode. This solves the problems of existing federal learning model training, in which not all data can be fully utilized for learning and the training effect is poor due to insufficient data utilization; at the same time, it reduces the loss of the federal learning model, improves the performance of the federal learning model, and reduces the verification loss of the model.
To achieve the above object, an embodiment of a second aspect of the present application provides a training method of a federal learning model, applied to a verification client, the method including the steps of: receiving a target splitting mode sent by a server when a training node meets a preset splitting condition, wherein the training node is a node on one lifting tree in a plurality of lifting trees; node splitting is carried out on the training nodes based on the target splitting mode; and receiving a verification set sent by the server, and verifying the target federal learning model based on the verification set.
According to one embodiment of the present application, the verifying the target federal learning model based on the verification set includes: receiving a data instance identifier in the verification set and the splitting information of a verification node sent by the server side, wherein the verification node is a node on one of the plurality of lifting trees; determining the node trend of the verification node according to the data instance identifier and the splitting information; and sending the node trend to the server side, so that the server side enters the next node according to the node trend and takes the next node as the updated verification node.
According to one embodiment of the present application, the determining the node trend of the verification node according to the data instance identifier and the splitting information includes: according to the data instance identifier, determining a characteristic value of each characteristic corresponding to the data instance identifier; and determining the trend of the node according to the splitting information and the characteristic value of each characteristic.
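A minimal sketch of this determination on the verification client, assuming the splitting information carries the split feature and its threshold (the field names are assumptions):

```python
def node_trend(instance_features: dict, split_info: dict) -> str:
    """Decide the node trend (left or right) for one data instance."""
    value = instance_features[split_info["feature"]]
    # Instances whose feature value is below the split threshold go left,
    # mirroring the splitting rule used during training.
    return "left" if value < split_info["threshold"] else "right"
```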
According to one embodiment of the present application, further comprising: if all the data instance identifiers in the verification set are verified, receiving a model predictive value of the data instance represented by the data instance identifier sent by the server; obtaining a final verification result according to the model predictive value, and comparing the verification result with a previous verification result to generate verification indication information for indicating whether to retain and use the target federal learning model; and sending the verification indication information to the server.
According to an embodiment of the present application, before the node splitting is performed on the training node based on the target splitting manner, the method further includes: performing horizontal federation learning based on a first training set to obtain a first split value corresponding to the training node; performing longitudinal federation learning based on a second training set to obtain a second split value corresponding to the training node; and sending the first split value and the second split value to the server.
According to an embodiment of the present application, the performing lateral federation learning based on the first training set to obtain a first split value corresponding to the training node further includes: receiving a first feature subset which is generated by the server side from the first training set and is available to the training node; transmitting the characteristic value of each characteristic in the first characteristic subset to the server; receiving a splitting threshold value of each feature sent by the server; acquiring an initial data instance identification set corresponding to the training node based on the splitting threshold of each feature, and sending the initial data instance identification set to the server; the initial data instance identification set is used for indicating the server to generate a first data instance identification set and a second data instance identification set, wherein the first data instance identification set and the initial data instance identification set both comprise data instance identifications belonging to a first left subtree space, and the second data instance identification set comprises data instance identifications belonging to a first right subtree space.
According to an embodiment of the present application, the obtaining, based on the splitting threshold of each feature, the initial data instance identifier set corresponding to the training node includes: for any feature, comparing the splitting threshold of that feature with each of its feature values, acquiring the data instance identifiers whose feature values are smaller than the splitting threshold, and generating the initial data instance identifier set from them.
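A sketch of this comparison on the client side, assuming the node's local data for the feature is a mapping from data instance identifier to feature value:

```python
def initial_instance_id_set(feature_values: dict, split_threshold: float) -> set:
    """Collect the identifiers whose feature value falls below the split
    threshold; these form the initial data instance identifier set that is
    sent back to the server."""
    return {iid for iid, value in feature_values.items()
            if value < split_threshold}
```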
According to an embodiment of the present application, before the performing longitudinal federal learning based on the second training set to obtain the second split value corresponding to the training node, the method further includes: receiving a gradient information request sent by the server; generating a second feature subset from a second training set according to the gradient information request; acquiring first gradient information of at least one third data instance identifier set of each feature in the second feature subset, wherein the third data instance identifier set comprises data instance identifiers belonging to a second left sub-tree space, and the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the feature, and different feature values correspond to different second left sub-tree spaces; and sending the first gradient information of the third data instance identification set to the server.
According to an embodiment of the present application, the acquiring the first gradient information of the at least one third data instance identifier set of each feature in the second feature subset includes: for any feature, acquiring all feature values of that feature and dividing that feature into buckets based on the feature values; and acquiring the first gradient information of the third data instance identifier set of each bucket of that feature.
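A sketch of the bucketing step, assuming one bucket per distinct feature value; the application's own bucket mapping rule (see fig. 13) may group values differently. Gradients are assumed to be (g, h) pairs per instance.

```python
from collections import defaultdict

def bucket_first_gradient_info(feature_values: dict, gradients: dict) -> dict:
    """Bucket one feature by value and accumulate per-bucket gradient sums.

    Each bucket's instance identifiers form one third data instance identifier
    set, and its (g_sum, h_sum) is that set's first gradient information.
    """
    buckets = defaultdict(lambda: {"ids": set(), "g_sum": 0.0, "h_sum": 0.0})
    for iid, value in feature_values.items():
        g, h = gradients[iid]
        bucket = buckets[value]
        bucket["ids"].add(iid)
        bucket["g_sum"] += g
        bucket["h_sum"] += h
    return dict(buckets)
```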
According to the training method of the federal learning model in the embodiment of the second aspect of the application, a client can receive the target splitting mode sent by the server side when a training node meets the preset splitting condition, wherein the training node is a node on one of a plurality of lifting trees, and split the training node based on the target splitting mode. By mixing the transverse and longitudinal splitting modes, the suitable learning mode is selected automatically without needing to care about the data distribution mode. This solves the problems of existing federal learning model training, in which not all data can be fully utilized for learning and the training effect is poor due to insufficient data utilization; at the same time, it reduces the loss of the federal learning model, improves the performance of the federal learning model, and reduces the verification loss of the model.
In order to achieve the above object, an embodiment of a third aspect of the present application provides a training device for a federal learning model, which is applied to a server, and includes: the first acquisition module is used for acquiring a target splitting mode corresponding to a training node if the training node meets a preset splitting condition; wherein the training node is a node on one of a plurality of lifting trees; the second acquisition module is used for notifying the client to split the nodes based on the target splitting mode and acquiring the updated training nodes; the generation module is used for determining that the updated training nodes meet the training stopping conditions, stopping training and generating a target federal learning model; the verification module is used for acquiring a verification set, and verifying the target federal learning model by a collaborative verification client, wherein the verification client is one of clients participating in federal learning model training.
According to one embodiment of the application, the verification module comprises: a first sending sub-module, configured to send, to the verification client, a data instance identifier in the verification set and the splitting information of a verification node, wherein the verification node is a node on one of the plurality of lifting trees; a first receiving sub-module, configured to receive the node trend corresponding to the verification node sent by the verification client, wherein the node trend is determined by the verification client according to the data instance identifier and the splitting information; a node updating sub-module, configured to enter the next node according to the node trend and take the next node as the updated verification node; and a second sending sub-module, configured to return to sending the data instance identifier and the splitting information to the verification client if the updated verification node meets the preset node splitting condition, until all data instance identifiers in the verification set have been verified.
According to one embodiment of the application, the verification module further comprises: and the acquisition sub-module is used for determining that the updated verification node is a leaf node if the updated verification node does not meet the preset node splitting condition, and acquiring a model predicted value of the data instance represented by the data instance identifier.
According to one embodiment of the application, the verification module further comprises: a third sending sub-module, configured to send a model prediction value of the data instance to the verification client if all the data instance identifiers in the verification set are verified; the second receiving sub-module is used for receiving verification indication information sent by the verification client, wherein the verification indication information is indication information which is obtained according to the model predicted value and used for indicating whether a model is reserved or not; and the determining submodule is used for determining whether to reserve and use the target federal learning model according to the verification indication information and sending a determination result to the client.
According to one embodiment of the present application, the first acquisition module includes: the first learning sub-module is used for carrying out horizontal federal learning in cooperation with the client based on a first training set so as to obtain a first split value corresponding to the training node; the second learning sub-module is used for carrying out longitudinal federal learning in cooperation with the client based on a second training set so as to obtain a second split value corresponding to the training node; and the determining submodule is used for determining a target splitting mode corresponding to the training node according to the first splitting value and the second splitting value.
According to one embodiment of the application, the determining submodule includes: a first determining unit, configured to determine that a larger value of the first split value and the second split value is a target split value corresponding to the training node; and the second determining unit is used for determining a splitting mode corresponding to the training node according to the target splitting value.
According to one embodiment of the application, the first learning sub-module includes: a sending unit, configured to generate a first feature subset available to the training node from the first training set and send the first feature subset to the client; a first receiving unit, configured to receive the feature value of each feature in the first feature subset sent by the client; a third determining unit, configured to determine, according to the feature value of each feature in the first feature subset, the lateral split value of each feature when that feature serves as the split feature point; and a fourth determining unit, configured to determine the first split value of the training node according to the lateral split value corresponding to each feature.
According to an embodiment of the present application, the third determining unit includes: a first determining subunit, configured to determine, for any feature in the first feature subset, a splitting threshold of the any feature according to a feature value of the any feature; a first obtaining subunit, configured to obtain, according to the splitting threshold, a first data instance identifier set and a second data instance identifier set corresponding to the any feature, where the first data instance identifier set includes data instance identifiers that belong to a first left subtree space, and the second data instance identifier set includes data instance identifiers that belong to a first right subtree space; and the second determining subunit is used for determining the transverse split value corresponding to any feature according to the first data instance identification set and the second data instance identification set.
According to one embodiment of the application, the first acquisition subunit is further configured to: sending the split threshold to the client; receiving an initial data instance identifier set corresponding to the training node sent by the client, wherein the initial data instance identifier set is generated when the client splits any feature according to the splitting threshold, and the initial data instance identifier set comprises data instance identifiers belonging to the first left subtree space; the first set of data instance identifications and the second set of data instance identifications are obtained based on the initial set of data instance identifications and all data instance identifications.
According to one embodiment of the application, the second learning sub-module includes: a notification unit, configured to notify the client to perform vertical federal learning based on the second training set; a second receiving unit, configured to receive first gradient information of at least one third data instance identifier set of each feature sent by the client, where the third data instance identifier set includes data instance identifiers belonging to a second left sub-tree space, where the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the feature, and different feature values correspond to different second left sub-tree spaces; a fifth determining unit, configured to determine a longitudinal split value of each feature according to the first gradient information of each feature and total gradient information of the training node; and a sixth determining unit, configured to determine the second split value of the training node according to a longitudinal split value corresponding to each feature.
According to an embodiment of the present application, the fifth determining unit includes: the second acquisition subunit is used for respectively acquiring second gradient information corresponding to each first gradient information according to the total gradient information and each first gradient information aiming at any feature; a third obtaining subunit, configured to obtain, for each piece of first gradient information, a candidate longitudinal split value of the any feature according to the first gradient information and second gradient information corresponding to the first gradient information; and the selecting subunit is used for selecting the maximum value in the candidate longitudinal split values as the longitudinal split value of any feature.
According to one embodiment of the application, the validation set is mutually exclusive from the first training set and the second training set, respectively.
The training device for the federal learning model according to the embodiment of the third aspect of the application lets the server side automatically select the suitable learning mode by mixing the transverse and longitudinal splitting modes, without needing to care about the data distribution mode. This solves the problems of existing federal learning model training, in which not all data can be fully utilized for learning and the training effect is poor due to insufficient data utilization; at the same time, it reduces the loss of the federal learning model, improves the performance of the federal learning model, and reduces the verification loss of the model.
To achieve the above object, a fourth aspect of the present application provides a training apparatus of a federal learning model, which is applied to a verification client, including: the system comprises a receiving module, a target splitting module and a splitting module, wherein the receiving module is used for receiving a target splitting mode sent by a server when a training node meets a preset splitting condition, and the training node is a node on one lifting tree in a plurality of lifting trees; the splitting module is used for splitting the training node based on the target splitting mode; and the verification module is used for receiving the verification set sent by the server and verifying the target federal learning model based on the verification set.
According to one embodiment of the application, the verification module comprises: the first receiving sub-module is used for receiving the split information of the verification node and one data instance identifier in the verification set sent by the server side, wherein the verification node is a node on one of a plurality of lifting trees; the first determining submodule is used for determining the node trend of the verification node according to the data instance identifier and the splitting information; and the first sending sub-module is used for sending the node trend to the server so that the server enters the next node according to the node trend, and the next node is used as the updated verification node.
According to one embodiment of the application, the first determination submodule includes: the first determining unit is used for determining the characteristic value of each characteristic corresponding to the data instance identifier according to the data instance identifier; and the second determining unit is used for determining the trend of the node according to the splitting information and the characteristic value of each characteristic.
According to one embodiment of the application, the verification module further comprises: a second receiving sub-module, configured to receive the model predicted value of the data instance represented by the data instance identifier sent by the server if all the data instance identifiers in the verification set have been verified; a generation sub-module, configured to obtain a final verification result according to the model predicted value and compare this verification result with the previous verification result to generate verification indication information indicating whether to retain and use the target federal learning model; and a second sending sub-module, configured to send the verification indication information to the server side.
According to one embodiment of the application, the splitting module further comprises: the first learning sub-module is used for performing horizontal federal learning based on the first training set so as to obtain a first split value corresponding to the training node; the second learning sub-module is used for carrying out longitudinal federal learning based on a second training set so as to obtain a second split value corresponding to the training node; and the third sending submodule is used for sending the first split value and the second split value to the server.
According to one embodiment of the present application, the first learning sub-module further includes: a first receiving unit, configured to receive a first feature subset available to the training node generated by the server from the first training set; a first sending unit, configured to send, to the server, a feature value of each feature in the first feature subset; the second receiving unit is used for receiving the splitting threshold value of each feature sent by the server; the first acquisition unit is used for acquiring an initial data instance identifier set corresponding to the training node based on the splitting threshold value of each feature and sending the initial data instance identifier set to the server; the initial data instance identification set is used for indicating the server to generate a first data instance identification set and a second data instance identification set, wherein the first data instance identification set and the initial data instance identification set both comprise data instance identifications belonging to a first left subtree space, and the second data instance identification set comprises data instance identifications belonging to a first right subtree space.
According to an embodiment of the present application, the first obtaining unit is further configured to: for any feature, compare the splitting threshold of that feature with each of its feature values, acquire the data instance identifiers whose feature values are smaller than the splitting threshold, and generate the initial data instance identifier set from them.
According to one embodiment of the present application, the second learning sub-module further includes: the third receiving unit is used for receiving the gradient information request sent by the server; a generation unit, configured to generate a second feature subset from a second training set according to the gradient information request; a second obtaining unit, configured to obtain first gradient information of at least one third data instance identifier set of each feature in the second feature subset, where the third data instance identifier set includes data instance identifiers belonging to a second left sub-tree space, where the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the feature, and different feature values correspond to different second left sub-tree spaces; and the second sending unit is used for sending the first gradient information of the third data instance identification set to the server.
According to an embodiment of the present application, the second acquisition unit includes: a bucketing subunit, configured to, for any feature, acquire all feature values of that feature and divide that feature into buckets based on the feature values; and a second acquisition subunit, configured to acquire the first gradient information of the third data instance identifier set of each bucket of that feature.
The training device for the federal learning model according to the embodiment of the fourth aspect of the application lets the client receive the target splitting mode sent by the server side when a training node meets the preset splitting condition, wherein the training node is a node on one of a plurality of lifting trees, and split the training node based on the target splitting mode. By mixing the transverse and longitudinal splitting modes, the suitable learning mode is selected automatically without needing to care about the data distribution mode. This solves the problems of existing federal learning model training, in which not all data can be fully utilized for learning and the training effect is poor due to insufficient data utilization; at the same time, it reduces the loss of the federal learning model, improves the performance of the federal learning model, and reduces the verification loss of the model.
In order to achieve the above object, an embodiment of a fifth aspect of the present application provides an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein, when executing the program, the processor implements the training method of a federal learning model according to the embodiment of the first aspect of the application or according to any embodiment of the second aspect of the application.
To achieve the above object, an embodiment of a sixth aspect of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the training method of a federal learning model according to the embodiment of the first aspect of the application or according to any embodiment of the second aspect of the application.
Drawings
FIG. 1 is a schematic diagram of a federal learning application scenario provided by an embodiment of the present application;
FIG. 2 is a flow chart of a method of training a federal learning model according to an embodiment of the present application;
FIG. 3 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 4 is a schematic diagram of data distribution disclosed in one embodiment of the present application;
FIG. 5 is a schematic diagram of node splitting as disclosed in one embodiment of the present application;
FIG. 6 is a schematic diagram of data distribution disclosed in another embodiment of the present application;
FIG. 7 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 8 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 9 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 10 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 11 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 12 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 13 is a schematic diagram illustrating the partitioning of buckets according to the bucket mapping rule according to one embodiment of the present application;
FIG. 14 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 15 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 16 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 17 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 18 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 19 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 20 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 21 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 22 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 23 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 24 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 25 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 26 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 27 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 28 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 29 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 30 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 31 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 32 is a flow chart of a method of training a federal learning model according to another embodiment of the present application;
FIG. 33 is a schematic diagram of a training device for a federal learning model according to an embodiment of the present application;
FIG. 34 is a schematic structural view of a training device for a federal learning model according to another embodiment of the present application;
FIG. 35 is a schematic structural view of a training device for federal learning model according to another embodiment of the present application;
FIG. 36 is a schematic diagram of a training device for federal learning models according to another embodiment of the present application;
FIG. 37 is a schematic structural view of a training device for federal learning model according to another embodiment of the present application;
FIG. 38 is a schematic structural view of a training device for federal learning model according to another embodiment of the present application;
FIG. 39 is a schematic diagram of a training device for a federal learning model according to another embodiment of the present application;
FIG. 40 is a schematic structural view of a training device for a federal learning model according to another embodiment of the present application;
FIG. 41 is a schematic diagram of a training device for a federal learning model according to another embodiment of the present application;
FIG. 42 is a schematic structural view of a training device for a federal learning model according to another embodiment of the present application;
FIG. 43 is a schematic structural view of a training device for federal learning model according to another embodiment of the present application;
FIG. 44 is a schematic diagram of a training device for federal learning models according to another embodiment of the present application;
FIG. 45 is a schematic diagram of a training device for a federal learning model according to another embodiment of the present application;
FIG. 46 is a schematic diagram of a training device for federal learning model according to another embodiment of the present application;
FIG. 47 is a schematic diagram of a training device for a federal learning model according to another embodiment of the present application;
fig. 48 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order that the above-described aspects may be better understood, exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It should be understood that "and/or" related to the embodiments of the present application, describing the association relationship of the association object, indicates that three relationships may exist, for example, a and/or B may indicate: a alone, a and B together, and B alone, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship.
First, a part of the vocabulary according to the embodiment of the present application will be described.
Isomorphic data: data records owned by different data providers have the same characteristic attributes.
Heterogeneous data: data records owned by different data providers have different characteristic attributes, apart from the data instance identification (ID).
XGBoost (XGB): a scalable machine learning system based on boosting trees.
Before introducing the technical scheme of the application, the problems existing in the prior art and the technical conception process of the application are introduced by combining a specific application scene of the application.
In practical application, it is difficult to guarantee that the multi-party data used for federal learning is entirely heterogeneous or entirely isomorphic, so when performing federal learning training with a lifting tree (boosting tree), part of the isomorphic or heterogeneous data has to be discarded before applying transverse or longitudinal federal learning. Because the discarded data is often relatively large, the model obtained by federal learning training performs poorly. Moreover, whether horizontal or vertical federal learning is adopted, the labels of the data must be guaranteed to reside with one party and cannot be scattered randomly across multiple parties, which is practically impossible to ensure; this further limits the practical application of federal learning.
To solve these problems, the inventors found that a federal learning design that mixes horizontal and vertical federal learning removes the need to care about the data distribution mode, allows all data to be fully utilized for learning, and avoids the poor model performance caused by insufficient data utilization.
With this design, when heterogeneous data dominates, the scheme tends to adopt vertical federal learning (i.e., a vertical lifting tree), so that the trained model is lossless while isomorphic data is still utilized; when isomorphic data dominates, the scheme tends to adopt horizontal federal learning (i.e., a horizontal lifting tree) while still using heterogeneous data for model training, so that the trained model retains the lossless property of the vertical mode and its performance improves.
Fig. 1 is a schematic view of an application scenario of a model training method based on federal learning. As shown in fig. 1, the application scenario may include: at least one client (three clients are shown in fig. 1, client 111, client 112, client 113, respectively), network 12, and server 13. Wherein each client and server 13 may communicate over network 12.
It should be noted that fig. 1 is only a schematic diagram of an application scenario provided by the embodiment of the present application, and the embodiment of the present application does not limit the devices included in fig. 1 or limit the positional relationship between the devices in fig. 1, for example, in the application scenario shown in fig. 1, the application scenario may further include a data storage device, where the data storage device may be an external memory relative to the server 13 or an internal memory integrated in the server 13.
The application provides a model training method, device and storage medium based on federal learning, which are used for improving the performance of a model obtained through training by mixing the design of transverse federal learning and longitudinal federal learning. The technical scheme of the application is described in detail through specific embodiments. It should be noted that the following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
The following describes a training method and device for a federal learning model and electronic equipment according to an embodiment of the present application with reference to the accompanying drawings.
Fig. 2 is a schematic flow chart of a training method of a federal learning model according to an embodiment of the present application.
As shown in fig. 2, the server is taken as an execution subject, and the training method of the federal learning model provided by the embodiment of the application is explained, which specifically includes the following steps:
s201, if the training node meets a preset splitting condition, acquiring a target splitting mode corresponding to the training node; the training node is a node on one of the plurality of promote trees.
In the embodiment of the application, if the training node meets the preset splitting condition, the current training node needs to continue splitting; in this case, the target splitting mode corresponding to the training node can be obtained.
The preset splitting condition may be set according to the actual situation. For example, it may be that the level of the training node currently being processed has not reached the maximum tree depth, that the loss function does not yet satisfy the constraint condition, and the like.
The target splitting mode comprises the following steps: a transverse splitting mode and a longitudinal splitting mode.
A lifting tree (boosting tree) refers to a boosting method that adopts an additive model and a forward stagewise algorithm and takes a decision tree as the base function.
S202, notifying the client to split the nodes based on the target splitting mode.
In the embodiment of the application, after the target splitting mode corresponding to the training node is acquired, the server side can send the acquired target splitting mode to the client side and inform the client side to split the node based on the target splitting mode. Accordingly, the client can receive the target splitting mode of the server and split the training node based on the target splitting mode.
S203, the left subtree node generated after the training node is split is used as the training node again to carry out the next training until the updated training node does not meet the preset splitting condition.
In the embodiment of the application, the server can again take the left subtree node generated by splitting the training node as the training node for the next round of training, and then judge whether the updated training node meets the preset splitting condition. When the updated training node still needs to be split, that is, it meets the preset splitting condition, the server continues to acquire the target splitting mode corresponding to the updated training node and notifies the client to continue node splitting based on that mode, until the updated training node no longer meets the preset splitting condition. The preset splitting condition may include a tree depth threshold, a threshold on the number of samples to split, an error threshold of the federal learning model, and the like. A high-level sketch of this traversal follows.
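The sketch below condenses S201 to S203 into one loop; both callables are hypothetical stand-ins for the message exchanges with the clients, and the depth/sample thresholds are examples of the preset splitting condition:

```python
def train_along_left_subtrees(node, split_mode_fn, split_fn,
                              max_depth: int, min_samples: int):
    """Keep splitting down the left subtree until the splitting condition fails.

    split_mode_fn: callable implementing S201 (acquire the target splitting mode).
    split_fn: callable implementing S202 (notify the client and split the node).
    """
    while node.depth < max_depth and node.num_samples >= min_samples:
        mode = split_mode_fn(node)
        left, _right = split_fn(node, mode)
        node = left  # the left child becomes the new training node (S203)
    return node
```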
S204, taking other non-leaf nodes of a lifting tree as training nodes again to perform the next training round.
In the embodiment of the application, the server can backtrack to the other non-leaf nodes of the current lifting tree and take each of them in turn as the current training node for the next round of training.
S205, if node data sets of a plurality of lifting trees are all empty, stopping training and generating a target federal learning model.
In the embodiment of the application, if the node data sets of the plurality of lifting trees are all empty, training can be stopped and the target federal learning model generated. Further, the generated target federal learning model can be verified; once the number of training rounds reaches the preset number, the intermediate information is cleaned up and the model is retained.
Therefore, with the training method of the federal learning model provided by the embodiment of the application, the server can automatically select the suitable learning mode by mixing the transverse and longitudinal splitting modes, without needing to care about the data distribution mode. This solves the problems of existing federal learning model training, in which not all data can be fully utilized for learning and the training effect is poor due to insufficient data utilization; at the same time, it reduces the loss of the federal learning model and improves its performance.
In the application, when trying to acquire the target splitting mode corresponding to the training node, the federal learning can be performed through the cooperative client to acquire the corresponding splitting value, and then the target splitting mode corresponding to the training node is determined according to the splitting value.
As a possible implementation manner, as shown in fig. 3, on the basis of the foregoing embodiment, in the foregoing step, a process of obtaining a target splitting manner corresponding to a training node specifically includes the following steps:
S301, based on the first training set, the cooperative client performs horizontal federation learning to obtain a first split value corresponding to the training node.
It should be noted that, when a training node needs to continue splitting, the splitting manner, that is, transverse or longitudinal, must be determined. Most nodes undergo two candidate splits, one transverse and one longitudinal, and the split mode with the larger split gain of the two is selected as the node's final split mode.
To make a pre-judgment, nodes meeting the following conditions are split only transversely or only longitudinally, and the result is used directly as the final split:
(1) The deeper the tree, the smaller the total number of samples on the node that needs to be split. If the ratio of common samples among the samples on the node is judged to be less than a preset value, for example 10%, so that common samples are extremely few, the data distribution on the node can be treated as a transverse distribution, and only transverse splitting is performed.
(2) If the common-sample ratio among the samples on the node exceeds a preset value, for example 98%, only longitudinal splitting is performed.
It should be noted that these two pre-judgment conditions are set to save training time, and the two ratios can be configured in the training parameters, as illustrated in the sketch below.
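A minimal sketch of this pre-judgment, assuming the two ratios are exposed as training parameters (all names here are hypothetical, not taken from the patent):

```python
from typing import Optional

def prejudge_split_mode(n_common: int, n_total: int,
                        min_common_ratio: float = 0.10,
                        max_common_ratio: float = 0.98) -> Optional[str]:
    """Pre-judge the split mode when the common-sample ratio is extreme.

    Returns 'horizontal' or 'vertical' when only one mode needs to be tried,
    or None when both candidate splits must be evaluated and compared.
    """
    ratio = n_common / n_total
    if ratio < min_common_ratio:   # almost no common samples: treat as horizontal data
        return 'horizontal'
    if ratio > max_common_ratio:   # almost all samples are common: treat as vertical data
        return 'vertical'
    return None                    # evaluate both candidate splits and take the larger gain
```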
The data involved in the two splitting modes are explained below for transverse splitting and longitudinal splitting, respectively.
For example, as shown in fig. 4, training is performed with a total of 10000 samples, and 100 samples remain when splitting reaches a certain node. At this time, the data distribution is as follows: 70 of the 100 samples are common to the two platforms. Platform A holds 90 samples (20 private samples and 70 common samples) and platform B holds 80 samples (10 private samples and 70 common samples). Platform A has 18 features (10 private features and 8 common features), and platform B likewise has 18 features (10 private features and 8 common features).
For transverse splitting, all samples of the node participate in splitting, and a plurality of candidate features and feature thresholds are selected by a server.
For the common feature f and value v: platform A divides its local 90 samples into left and right subtrees, platform B divides its local 80 samples into left and right subtrees, and each informs the server of its sample division; the server calculates the split gain from the left and right branches of the 100 samples and takes it as the gain of feature f.
For platform A's private feature f and value v: platform A divides its local 90 samples into left and right subtrees; platform B divides its local 70 common samples into left and right subtrees, and the 10 samples without a value for feature f are all placed into the left subtree or all into the right subtree, giving two candidate divisions; each platform informs the server of its sample division; the server calculates the split gain for the two candidate divisions of the 100 samples and takes the larger one as the gain of feature f.
For the private feature f and the value v of platform B: similar to that described above.
The server marks the maximum gain value in all the characteristics as gain1.
For longitudinal splitting, only 70 samples common to the nodes participate in splitting. Similar to normal longitudinal federal learning, the maximum feature split gain2 was calculated over 70 samples.
Take the larger of the transverse maximum gain1 and the longitudinal maximum gain2, split the node in the corresponding mode, and enter the next node.
It should be noted that, whether the split is performed transversely or longitudinally, the node sample set is divided according to one of the following cases: the maximum gain comes from platform A's local feature f1; the maximum gain comes from the common feature f2; or the maximum gain comes from platform B's local feature f3.
For example, as shown in fig. 5 (a), if the node splits on f1, the 90 samples of platform A go to the left subtree if their feature value is less than or equal to the threshold and to the right subtree if it is greater; the 10 samples of platform B have no feature value, so under the mode corresponding to the maximum gain, the 10 samples go right if missing-value samples belong to the right subtree and left if they belong to the left subtree (see the sketch after the f3 case below).
As shown in fig. 5 (b), if the node splits at f2, all samples have eigenvalues, samples smaller than or equal to the threshold value are left, and samples larger than the threshold value are right.
As shown in fig. 5 (c), if the node splits according to f3, the specific process is similar to the process of splitting according to f1, and will not be described here.
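For illustration, a minimal sketch of this sample-routing rule (function and parameter names are hypothetical):

```python
from typing import Dict, Optional, Set, Tuple

def route_samples(feature_values: Dict[str, Optional[float]], threshold: float,
                  missing_goes_left: bool) -> Tuple[Set[str], Set[str]]:
    """Route the node's samples into left/right subtrees for one candidate split.

    Samples lacking the split feature (value None) follow the branch fixed by
    the maximum-gain configuration, mirroring the f1/f3 cases above.
    """
    left, right = set(), set()
    for sample_id, value in feature_values.items():
        if value is None:                          # no value for this feature
            (left if missing_goes_left else right).add(sample_id)
        elif value <= threshold:                   # <= threshold goes left
            left.add(sample_id)
        else:                                      # > threshold goes right
            right.add(sample_id)
    return left, right
```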
In the embodiment of the application, the client can perform horizontal federation learning based on the first training set to obtain the first split value corresponding to the training node, and send the first split value to the server. Accordingly, the server may receive the first split value corresponding to the training node, so as to obtain the first split value corresponding to the training node.
S302, based on the second training set, the cooperative client performs longitudinal federation learning to obtain a second split value corresponding to the training node.
In the embodiment of the application, the client can perform longitudinal federation learning based on the second training set to obtain the second split value corresponding to the training node, and send the second split value to the server. Correspondingly, the server can receive the second split value corresponding to the training node, thereby obtaining it.
S303, determining a target splitting mode corresponding to the training node according to the first splitting value and the second splitting value.
It should be noted that conventional federal learning mainly includes lateral (horizontal) federal learning and longitudinal (vertical) federal learning. Lateral federal learning uses multi-platform data whose features are exactly the same, i.e., lateral data, such as the data (1.2) + (3) + (5.1) shown in fig. 6; longitudinal federal learning uses multi-platform data whose sample IDs (Identity Document, identification number) are exactly the same, i.e., longitudinal data, such as the data (2) + (3) + (4) shown in fig. 6. It follows that the prior art can model only fully lateral or fully longitudinal data, that is, only part of the data in fig. 6 can be used.
In the embodiment of the present application, the first training set, that is, the data participating in the horizontal federal learning, is all data samples of multiple clients, for example, data (1) + (2) + (3) + (4) + (5) shown in fig. 6; the second training set, data that participates in vertical federal learning, is a common data sample for multiple clients, all features of which participate in training, such as (2) + (3) + (4) shown in fig. 6. From the above, the training method of the federal learning model proposed by the present application can be applied to the data intersecting in the horizontal-vertical direction, that is, all the data in fig. 6 can be used.
Therefore, in the embodiment of the application, the horizontal federation learning and the vertical federation learning can be performed by the cooperative client based on the first training set and the second training set so as to obtain the first split value and the second split value corresponding to the training node.
In the application, when the target splitting mode corresponding to the training node is determined according to the first splitting value and the second splitting value, the target splitting value can be determined by comparing the first splitting value and the second splitting value, and then the corresponding target splitting mode is determined according to the target splitting value.
As a possible implementation manner, as shown in fig. 7, in the above embodiment, in the step S303, a process of determining, according to the first split value and the second split value, a target split mode corresponding to the training node specifically includes the following steps:
S501, determining the larger of the first split value and the second split value as the target split value corresponding to the training node.
In the embodiment of the application, after obtaining the first split value and the second split value corresponding to the training node, the server side can compare the first split value with the second split value and use the larger value as the target split value corresponding to the training node.
For example, if the obtained first split value is $Gain_1$ and the second split value is $Gain_2$, with $Gain_1 > Gain_2$, then $Gain_1$ can be used as the target split value corresponding to the training node.
S502, determining a splitting mode corresponding to the training node according to the target splitting value.
In the embodiment of the application, after taking the larger value of the first split value and the second split value as the target split value corresponding to the training node, the server can determine the split mode corresponding to the training node according to the target split value.
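A minimal sketch of steps S501-S502 (the tie-breaking rule here is an assumption, not stated in the text):

```python
from typing import Tuple

def choose_target_split(gain_horizontal: float, gain_vertical: float) -> Tuple[float, str]:
    """Take the larger of the two split values as the target split value and
    map it back to the target splitting mode; ties are resolved in favour of
    the horizontal mode here."""
    if gain_horizontal >= gain_vertical:
        return gain_horizontal, 'horizontal'
    return gain_vertical, 'vertical'
```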
Therefore, the training method of the federation learning model can respectively obtain the first split value and the second split value by carrying out horizontal federation learning and vertical federation learning by the cooperative client, further takes the larger split value as the target split value corresponding to the training node, and further determines the split mode corresponding to the training node according to the target split value, so that the tendency of the matched learning mode can be automatically selected according to the target split value without concern about the data distribution mode.
In the application, when attempting to obtain the first split value corresponding to the training node by performing horizontal federal learning in cooperation with the client based on the first training set, the transverse split value corresponding to each feature can be obtained first, and the first split value of the training node then determined from the transverse split values.
As a possible implementation manner, as shown in fig. 8, in the above embodiment, in step S301, based on the first training set, the process of performing horizontal federal learning by the collaboration client to obtain the first split value corresponding to the training node specifically includes the following steps:
S601, generating, from the first training set, a first feature subset available to the training node and sending it to the client.
Alternatively, the server may randomly generate the first feature subset available to the current training node from the first training set, for example by randomly drawing half of all features of the current first training set to form a new feature set as the first feature subset, and send the generated first feature subset to each client. Accordingly, each client may receive the first feature subset, traverse the feature values of each feature in the set according to its local data, that is, the locally stored feature values, and then send the feature values to the server.
S602, receiving a characteristic value of each characteristic in the first characteristic subset sent by the client.
In the embodiment of the application, the client can send the feature value of each feature in the first feature subset to the server. Accordingly, the server may receive the feature value of each feature in the first feature subset sent by the client.
S603, determining, according to the feature values of each feature in the first feature subset, the transverse split value corresponding to each feature serving as a split feature point.
As a possible implementation manner, as shown in fig. 9, the process in the foregoing step S603 of determining, according to the feature values of each feature in the first feature subset, the transverse split value corresponding to each feature serving as a split feature point includes the following steps:
S701, for any feature in the first feature subset, determining the splitting threshold of that feature according to its feature values.
In the embodiment of the application, after the server receives the feature values of each feature in the first feature subset sent by the clients, a feature value list can be generated from those values. Further, for any feature in the first feature subset, a feature value may be randomly selected from the feature value list as the global optimal splitting threshold of the current feature, as sketched below.
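A minimal sketch of this random feature-subset and threshold selection (steps S601 and S701; the names and the 50% default are assumptions consistent with the example above):

```python
import random
from typing import Iterable, List, Set

def sample_feature_subset(all_features: Iterable[str], fraction: float = 0.5) -> Set[str]:
    """Randomly draw a portion (half by default) of the current feature set
    as the first feature subset sent to each client."""
    features = sorted(all_features)
    k = max(1, int(len(features) * fraction))
    return set(random.sample(features, k))

def pick_split_threshold(reported_values: List[float]) -> float:
    """Pool the feature values reported by the clients for one feature and
    randomly select one as that feature's global split threshold."""
    return random.choice(reported_values)
```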
S702, a first data instance identification set and a second data instance identification set corresponding to any feature are obtained according to a splitting threshold, wherein the first data instance identification set comprises data instance identifications belonging to a first left subtree space, and the second data instance identification set comprises data instance identifications belonging to a first right subtree space.
As a possible implementation manner, as shown in fig. 10, based on the foregoing embodiment, in step S702, a process of acquiring, according to a splitting threshold, a first data instance identifier set and a second data instance identifier set corresponding to any feature specifically includes the following steps:
S801, sending the splitting threshold to the client.
In the embodiment of the application, after the splitting threshold value of any feature is determined, the splitting threshold value can be broadcasted to the client. Correspondingly, the client can receive the splitting threshold, acquire an initial data instance identification set corresponding to the training node based on the splitting threshold of each feature, and send the initial data instance identification set to the server.
S802, receiving an initial data instance identification set corresponding to a training node sent by a client, wherein the initial data instance identification set is generated when the client splits any feature according to a splitting threshold, and comprises data instance identifications belonging to a first left subtree space.
In the embodiment of the present application, the server may receive the IL sent by the client, that is, the initial data instance identifier set IL including the data instance identifiers belonging to the first left subtree space.
S803, a first data instance identification set and a second data instance identification set are obtained based on the initial data instance identification set and all data instance identifications.
As a possible implementation manner, as shown in fig. 11, in the foregoing step S803, a process of obtaining, based on the initial data instance identifier set, a first data instance identifier set and a second data instance identifier set specifically includes the following steps:
S901, comparing each data instance identifier in the initial data instance identifier set with the data instance identifiers of the clients to obtain the abnormal data instance identifiers.
An abnormal data instance identifier may be a repeated data instance identifier, a contradictory data instance identifier, or the like.
S902, preprocessing abnormal data instance identifiers to obtain a first data instance identifier set.
In the embodiment of the application, after receiving the IL, the server can filter out repeated instance IDs from each IL set, and process the contradictory ID information to determine the final IL.
For example, if client A adds an instance ID to its IL but client B, which also holds that ID, does not add it, the ID may nevertheless be considered present in IL.
S903, acquiring a second data instance identifier set based on all the data instance identifiers and the first data instance identifier set.
In the embodiment of the present application, after the first data instance identifier set is obtained, the first data instance identifier set IL may be removed from all the data instance identifiers to obtain the second data instance identifier set IR; a sketch of this merge follows.
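For illustration only, a sketch of steps S901-S903 under the rule above (the names and the set-based layout are assumptions):

```python
from typing import Dict, Set, Tuple

def merge_instance_sets(client_ils: Dict[str, Set[str]],
                        all_ids: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Merge the per-client IL sets into the first and second data instance
    identifier sets.

    Duplicate IDs collapse via set union, and, following the contradiction
    rule in the example above, an ID added by at least one client is kept in
    IL even if another client did not add it. IR is the complement of IL.
    """
    il: Set[str] = set()
    for client_il in client_ils.values():
        il |= client_il        # S901/S902: deduplicate and resolve contradictions
    ir = all_ids - il          # S903: second set = all identifiers minus IL
    return il, ir
```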
In the present application, when the second split value corresponding to the training node is obtained by trying to perform longitudinal federal learning based on the second training set and cooperating with the client, the second split value of the training node may be determined according to the longitudinal split value corresponding to each feature.
As a possible implementation manner, as shown in fig. 12, on the basis of the foregoing embodiment, the process in the foregoing step of obtaining the second split value corresponding to the training node specifically includes the following steps:
S1001, notifying the client to perform vertical federation learning based on the second training set.
In the embodiment of the application, after notifying the client to perform longitudinal federal learning based on the second training set, the server can send a gradient information request to the client to acquire the $G_{kv}$ and $H_{kv}$ information. Correspondingly, the client can take the common-ID data that the current node has not yet processed, randomly obtain a feature set, map each sample into buckets according to each feature k in the set and all values v of the corresponding feature, calculate the $G_{kv}$ and $H_{kv}$ of the left subtree space as the first gradient information, and send the first gradient information to the server after homomorphic encryption:
$$G_{kv}=\sum_{\{i \,\mid\, s_{k,v}\,\ge\, x_{i,k} \,>\, s_{k,v-1}\}} g_i,\qquad H_{kv}=\sum_{\{i \,\mid\, s_{k,v}\,\ge\, x_{i,k} \,>\, s_{k,v-1}\}} h_i$$

where $x_{i,k}$ represents the value of feature k of data instance $x_i$.
For example, a population whose ages originally range from 1 to 100 is mapped into three buckets: under 20, between 20 and 50, and over 50. The samples in one bucket are split either all to the left or all to the right.

In theory, for the G and H sent to the server there should be three cumulative G values, as in the example above: the sum of g over ages 1-20, over ages 1-50, and over ages 1-100 (each corresponding to the G of a left subtree). However, since operations on homomorphically encrypted ciphertext are slow and the ciphertext grows longer, increasing traffic, in practice the client sends three separate G values: the sum of g over ages 1-20, over ages 20-50, and over ages above 50. After the platform holding the labels receives the G of these three buckets, it first decrypts them into plaintext and then computes the cumulative G over ages 1-20 / 1-50 / 1-100.

The two formulas above express this process: the sum of g is computed for each bucket, where $s_{k,v}$ is the maximum value of the current bucket (50 years) and $s_{k,v-1}$ is the largest feature value of the previous bucket (20 years), thus filtering out the x between 20 and 50 years old. This feature bucketing reduces the amount of computation.
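For illustration, a minimal plaintext sketch of these per-bucket sums (the names and the dict-based layout are assumptions, and the homomorphic encryption step is elided):

```python
import bisect
from typing import Dict, List, Tuple

def bucket_gradient_sums(values: Dict[str, float], g: Dict[str, float],
                         h: Dict[str, float],
                         thresholds: List[float]) -> Tuple[List[float], List[float]]:
    """Per-bucket gradient sums G_kv and H_kv for one feature k.

    `values` maps instance ID to the feature value x_{i,k}; `thresholds` are
    the bucket upper bounds s_{k,1} < ... < s_{k,V} (20, 50, 100 in the age
    example). Bucket v collects the samples with s_{k,v-1} < x <= s_{k,v}.
    """
    V = len(thresholds)
    G, H = [0.0] * V, [0.0] * V
    for i, x in values.items():
        v = min(bisect.bisect_left(thresholds, x), V - 1)  # bucket index of x
        G[v] += g[i]
        H[v] += h[i]
    # in practice each per-bucket sum is homomorphically encrypted before sending
    return G, H
```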
S1002, receiving first gradient information of at least one third data instance identifier set of each feature sent by a client, wherein the third data instance identifier set comprises data instance identifiers belonging to a second left sub-tree space, the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the feature, and different feature values correspond to different second left sub-tree spaces.
In the embodiment of the application, the client can acquire all the characteristic values of any characteristic aiming at any characteristic, and can segment any characteristic based on the characteristic values and acquire the first gradient information of the third data instance identification set of each segment of any characteristic. Accordingly, the server may receive the first gradient information of the at least one third data instance identification set for each feature sent by the client.
S1003, determining the longitudinal split value of each feature according to the first gradient information of each feature and the total gradient information of the training node.
S1004, determining a second split value of the training node according to the longitudinal split value corresponding to each feature.
Gradient information is explained below in exemplary form.
For example, as shown in fig. 13, taking the feature numbered k on client A as an example, all samples on the current node may be sorted from small to large according to their values on feature k. The samples may then be segmented in order into multiple data buckets (corresponding to multiple feature thresholds from small to large) according to the bucket mapping rule. Further, the sum G of the first-order gradients g and the sum H of the second-order gradients h of the samples contained in the v-th bucket, that is, the $G_{kv}$ and $H_{kv}$ corresponding to the v-th feature threshold v, may be calculated.
Here $G_{kv}$ is obtained by sorting the node samples according to the value of the feature numbered k, segmenting them in order into several data buckets, and summing the first-order gradients g of all samples in the v-th bucket after sorting; $H_{kv}$ is the sum of the second-order gradients h of those samples.
It should be noted that there are various bucket mapping rules, and the present application does not limit the specific manner; it is only required that samples with the same feature value, for example the two samples with a value of 1 in fig. 13, be placed into the same data bucket.
For example, the samples sharing one value may form one bucket; if a feature takes m distinct values, the samples are split into m buckets, and the corresponding feature thresholds are those m values.
For another example, the number of buckets may be limited, e.g., to at most m buckets; in this case, if feature k takes fewer than m distinct values, the division proceeds as above, while if it takes more than m, the values are divided into m buckets in an approximately bisecting way, as sketched below.
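A sketch of both bucket-mapping rules under the constraints above (names assumed):

```python
from typing import List, Optional

def make_bucket_thresholds(values: List[float],
                           max_buckets: Optional[int] = None) -> List[float]:
    """Build bucket thresholds for one feature.

    Without a limit, every distinct feature value becomes its own bucket; with
    a limit m, the distinct values are grouped into at most m roughly equal
    parts. Samples sharing a value always fall into the same bucket.
    """
    distinct = sorted(set(values))
    if max_buckets is None or len(distinct) <= max_buckets:
        return distinct                          # one bucket per distinct value
    m = max_buckets
    step = len(distinct) / m                     # approximately bisecting
    return [distinct[min(int((j + 1) * step) - 1, len(distinct) - 1)]
            for j in range(m)]
```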
In an attempt to determine a longitudinal split value of each feature, a maximum value of candidate longitudinal split values may be selected as a longitudinal split value of any feature.
As a possible implementation manner, as shown in fig. 14, in the above embodiment, in step S1003, a process of determining a longitudinal split value of each feature according to the first gradient information of each feature and the total gradient information of the training node, includes the following steps:
S1201, for any feature, respectively acquiring the second gradient information corresponding to each piece of first gradient information according to the total gradient information and that first gradient information;
S1202, for each piece of first gradient information, acquiring a candidate longitudinal split value of the feature according to the first gradient information and its corresponding second gradient information;
S1203, selecting the maximum of the candidate longitudinal split values as the longitudinal split value of the feature.
The first gradient information comprises the sum of first-order gradients of the features corresponding to the data instances belonging to the second left subtree space and the sum of second-order gradients of the features corresponding to the data instances belonging to the second left subtree space; the second gradient information includes a sum of first-order gradients of features corresponding to the data instances belonging to the second right sub-tree space and a sum of second-order gradients of features corresponding to the data instances belonging to the second right sub-tree space.
As a possible implementation manner, in the embodiment of the present application, the server requests the $G_{kv}$ and $H_{kv}$ information from each client. Correspondingly, the client can take the common-ID data that the current node has not yet processed, randomly obtain a feature set, map each sample into buckets according to each feature k in the set and every value v of the corresponding feature, calculate the $G_{kv}$ and $H_{kv}$ of the left subtree space, homomorphically encrypt them, and send them to the server. In addition, the client can calculate some intermediate results of the loss function based on the common data identifiers and the local data, such as the first derivative $g_i$ and second derivative $h_i$ of the loss function, and send them to the server.
Further, the server may decrypt the $G_{kv}$ and $H_{kv}$ sent by the client; from the data corresponding to the common IDs of the current node and all the obtained $g_i$ and $h_i$, it can calculate $G_L$, the sum of all $g_i$ in the left subtree space of the current node; $G_R$, the corresponding sum in the right subtree space; $H_L$, the sum of all $h_i$ in the left subtree space; and $H_R$, the corresponding sum in the right subtree space.
Taking XGB as an example, the objective function is as follows:

$$L^{(t)}=\sum_{i=1}^{n} l\left(y_i,\; \hat{y}_i^{(t-1)}+f_t(x_i)\right)+\Omega(f_t)$$

XGB proposes to approximate the above formula with a second-order Taylor expansion. The second-order form of the Taylor expansion is:

$$L^{(t)}\approx\sum_{i=1}^{n}\left[l\left(y_i,\hat{y}_i^{(t-1)}\right)+g_i f_t(x_i)+\frac{1}{2}h_i f_t^2(x_i)\right]+\Omega(f_t)$$

where $g_i$ and $h_i$ are calculated as follows:

$$g_i=\partial_{\hat{y}^{(t-1)}}\, l\left(y_i,\hat{y}^{(t-1)}\right),\qquad h_i=\partial^2_{\hat{y}^{(t-1)}}\, l\left(y_i,\hat{y}^{(t-1)}\right)$$

As seen from the Taylor expansion above, $g_i$ is the first derivative and $h_i$ is the second derivative. $G_L$ and $H_L$ can be calculated by the following formulas, respectively:

$$G_L=\sum_{i=1}^{n} g_i,\qquad H_L=\sum_{i=1}^{n} h_i$$

where n represents the number of instances in the left subtree space; that is, in this case the left subtree space contains a total of n instances.
Further, the server can calculate the optimal division point of each feature from these results and then determine the global optimal division point (k, v, gain) from the division point information, as sketched below. If several clients hold the same feature, the server randomly takes one of the received $G_{kv}$ values as that of the current feature, and likewise for $H_{kv}$.
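As an illustrative sketch of how a candidate division point may be scored, assuming the standard XGB gain formula (`lam` and `gamma` are assumed regularisation parameters, not values given in the patent):

```python
def split_gain(G_L: float, H_L: float, G: float, H: float,
               lam: float = 1.0, gamma: float = 0.0) -> float:
    """Score of one candidate division point in the standard XGB form.

    G and H are the total gradient sums over the node, so the right-subtree
    sums follow as G_R = G - G_L and H_R = H - H_L (the second gradient
    information derived from the first).
    """
    G_R, H_R = G - G_L, H - H_L
    return 0.5 * (G_L ** 2 / (H_L + lam)
                  + G_R ** 2 / (H_R + lam)
                  - G ** 2 / (H + lam)) - gamma
```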
Further, the server may request IL information from the corresponding client according to (k, v, gain). Correspondingly, the client receives the partition point information (k, v, gain), searches for and obtains a partition point threshold value, and records the partition point (k, value) information. The local data set is segmented according to the segmentation points to obtain an IL set, and the IL set is sent to a server. Wherein record represents the index of the record at the client.
Further, the server receives (record, IL, value) information sent by the client, segments all common ID instances of the node space, and associates the current node with the client through (client ID, record). Information of the (client id, record_id, IL, feature_name, feature_value) is recorded as information of the vertical split, that is, a vertical split value of any feature.
In this case, the samples of the current node can be divided into the left-subtree or right-subtree nodes according to the value of the feature.
S604, determining a first split value of the training node according to the transverse split value corresponding to each feature.
It should be noted that, in the embodiment of the present application, optionally, transverse splitting may be performed first, and then longitudinal splitting may be performed; alternatively, the longitudinal splitting may be performed first, followed by the transverse splitting.
Further, since the transverse splitting mode uses all the data while the longitudinal splitting mode uses only the portion of data with common IDs, the transverse mode uses more data and is more likely to obtain a better effect, while the longitudinal mode involves less client-server data interaction and is faster. Therefore, in order to obtain a deeper intermediate training result in case training is interrupted, transverse splitting can be performed first, followed by longitudinal splitting.
In the embodiment of the application, if the training node meets the preset splitting condition, the current training node needs to continue splitting, and in this case the target splitting mode corresponding to the training node can be obtained; if the training node does not meet the preset splitting condition, the current training node does not need to continue splitting, in which case it can be determined to be a leaf node, and the weight value of the leaf node is sent to the client.
As a possible implementation manner, as shown in fig. 15, on the basis of the foregoing embodiment, the method specifically includes the following steps:
S1301, if the training node does not meet the preset splitting condition, determining the training node as a leaf node and acquiring the weight value of the leaf node.
In the embodiment of the application, if the training node does not meet the preset splitting condition, the server can treat the node as a leaf node, calculate the weight value $w_j$ of the leaf node, and store $w_j$ as the leaf node weight value for the longitudinal mode.
The weight value $w_j$ of leaf node j is used for calculating the sample prediction score and can be computed by the following formula:

$$w_j=-\frac{G_j}{H_j+\lambda}$$

where $G_j$ denotes the sum of the $g_i$ corresponding to all instances of node j, $H_j$ denotes the sum of the $h_i$ corresponding to all instances of node j, and $\lambda$ is the regularization parameter of the XGB objective.
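A one-line sketch of this computation, again under the standard XGB form with `lam` as an assumed regularisation parameter:

```python
def leaf_weight(G_j: float, H_j: float, lam: float = 1.0) -> float:
    """Leaf weight w_j = -G_j / (H_j + lam), used as the sample prediction score."""
    return -G_j / (H_j + lam)
```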
S1302, sending the weight value of the leaf node to the client.
In the embodiment of the application, after the server acquires the weight value of the leaf node, it can send the weight value to the client to inform each client that leaf node splitting in the longitudinal mode is no longer performed, that is, the node splitting operation is completed.
It should be noted that, in the embodiment of the present application, before attempting to notify the client to perform node splitting based on the target splitting manner, splitting information may be sent to the client, where the splitting information includes the target splitting manner, the target splitting feature selected as the feature splitting point, and the target splitting value.
As shown in fig. 16, on the basis of the above embodiment, if the target splitting mode is a longitudinal splitting mode, the method may notify the client before node splitting based on the target splitting mode, and specifically includes the following steps:
S1401, sending the splitting information to the tagged client.
In the embodiment of the application, the server can send the splitting information to the client with the tag. Accordingly, the tagged client may receive the split information and perform node splitting for the training node based on the split information.
For the longitudinal splitting manner, optionally, the server may notify each client to perform a real node splitting operation according to the recorded longitudinal splitting information, including (client_id, record_id, IL, feature_name, feature_value). The client corresponding to the client_id knows all information, namely (client_id, record_id, IL, feature_name, feature_value), and other clients only need to know IL information. Further, the server takes the left subtree node after the current splitting as the current processing node.
Correspondingly, the client receives IL or all information sent by the server, namely (client_id, record_id, IL, feature_name, feature_value), and performs node splitting operation in a longitudinal mode; if there is (client_id, record_id, IL, feature_name, feature_value) information, the client also needs to record and store this information at the time of splitting. Further, after the splitting is completed, the client may use the split left subtree node as the current processing node.
For the transverse splitting mode, optionally, the server may perform the node splitting in the transverse mode, that is, perform splitting operation on the current node according to the (k, value) information obtained in the transverse mode, obtain IL information, and broadcast the IL to each client.
Correspondingly, the client can receive the (k, value) information from the server and perform node splitting on the common-ID data: for the feature k of each piece of common-ID data, if its value is smaller than value, the ID of that piece of data is put into the IL set; otherwise it is put into the IR set. If the data does not have feature k, it is put into the right subtree space, as sketched below.
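A minimal sketch of this client-side rule (the dict-based data layout is an assumption):

```python
from typing import Dict, Set, Tuple

def split_common_ids(common_data: Dict[str, Dict[str, float]],
                     k: str, value: float) -> Tuple[Set[str], Set[str]]:
    """Client-side node split of the common-ID data for feature k.

    IDs whose value of feature k is smaller than `value` go to IL, the rest
    to IR; per the rule above, IDs lacking feature k fall into the right
    subtree space. `common_data` maps ID -> {feature: value}.
    """
    il, ir = set(), set()
    for sample_id, features in common_data.items():
        if k not in features:          # missing feature k: right subtree space
            ir.add(sample_id)
        elif features[k] < value:
            il.add(sample_id)
        else:
            ir.add(sample_id)
    return il, ir
```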
S1402, a left subtree space set sent by the tagged client is received.
In the embodiment of the application, after node splitting is performed on the training node, the labeled client can send the left subtree space generated by splitting to the server. Accordingly, the server may receive the left subtree space set sent by the tagged client.
S1403, splitting the second training set according to the left subtree space set.
S1404, associating the training node with the identity of the tagged client.
It should be noted that, in the present application, initialization may be performed before responding to the current training node satisfying the preset splitting condition.
As a possible implementation manner, as shown in fig. 17, on the basis of the foregoing embodiment, the method specifically includes the following steps:
S1501, receiving the data instance identifiers sent by the clients.
In the embodiment of the application, each client can send the unique identification ID of each piece of its data to the server. Accordingly, the server may receive the unique identification ID of each piece of data, that is, the data instance identifier.
S1502, according to the data instance identifications, determining common data instance identifications among the clients, wherein the common data instance identifications are used for instructing the clients to determine a first training set and a second training set.
In the embodiment of the application, the server can collect all instance IDs of the clients, obtain the common IDs among the clients, and notify the clients. Further, the server may select one client as the verification client and select a portion of the tagged data on that client as the verification data set, where this portion does not appear in the common-ID data set, and modify the training data set list corresponding to that client to initialize the verification data set information. The clients are then notified of the verification ID list and the common ID list. Accordingly, each client may receive the common ID list and the verification ID list (if any) sent by the server and initialize its global local data information.
Further, the server may perform information initialization of each training round for the current XGB forest list and training round, and perform information initialization of each tree for the current tree node and the current XGB forest list, so as to notify the client to perform information initialization of each training round or initialization of each tree training.
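A minimal sketch of the common-ID computation in this initialization (names assumed):

```python
from typing import Dict, Set

def common_instance_ids(client_ids: Dict[str, Set[str]]) -> Set[str]:
    """Intersect every client's instance IDs to obtain the common-ID set
    that defines the second (longitudinal) training set; the remaining IDs
    only ever participate in transverse splits."""
    sets = iter(client_ids.values())
    common = set(next(sets))
    for s in sets:
        common &= s
    return common
```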
Further, after the target federal learning model is obtained, the generated target federal learning model may be validated.
Optionally, the server may perform verification on the target federal learning model by using a cooperative verification client based on a verification set, where the verification client is one of clients participating in training of the federal learning model, and the verification set is mutually exclusive to the first training set and the second training set respectively.
As a possible implementation manner, the server may notify the client to perform the verification initialization operation. Accordingly, the client performs authentication initialization.
Further, the server may select an ID to start verification, initialize the XGB tree, and notify the client to start verification. Accordingly, the client initially verifies the authentication information.
Further, the server may send the split node information and the verified data ID to the verification client according to the current tree. Correspondingly, the client can acquire corresponding data according to the data ID, then judge whether the client should walk to the left subtree or the right subtree according to the split node information sent by the server, and return the judging result to the server.
Further, the server may enter the next node according to the direction returned by the client, and then judge whether a leaf node has been reached. If a leaf node has not been reached, the server continues by sending the split information of the new current node to the verification client. If a leaf node is reached, the weight of the leaf node can be recorded and the predicted value calculated and stored. If the currently predicted ID is not the last one, another ID can be selected to start verification again, the XGB tree initialized, and the client notified to start verification again; if the currently predicted ID is the last of all predicted IDs, all prediction results can be sent to the verification client. Accordingly, the client can receive all the prediction results, compute the final verification result, compare it with the previous verification result, judge whether the current model needs to be retained and used, and inform the server of the judgment result.
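A server-side sketch of this per-ID traversal, assuming a simple in-memory tree layout (`tree.root`, `node.is_leaf`, and `ask_client` are all illustrative names for the client round trip, not terms from the patent):

```python
def verify_one_id(tree, data_id, ask_client):
    """Walk one verification ID down one XGB tree.

    `ask_client(split_info, data_id)` stands for the round trip in which the
    verification client fetches the data for `data_id`, applies the node's
    split information, and answers 'left' or 'right'.
    """
    node = tree.root
    while not node.is_leaf:                  # descend until a leaf is reached
        direction = ask_client(node.split_info, data_id)
        node = node.left if direction == 'left' else node.right
    return node.weight                       # leaf weight yields the predicted value
```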
Further, the server may determine whether to reserve and use the current model according to the verification result returned by the client, and broadcast the determination result to all clients. Accordingly, each client receives the broadcast information of the server and processes the broadcast information.
Further, the server can judge whether the final prediction round has been reached. If not, the information initialization of each training round can be performed again for the current XGB forest list and the next training round; if the final prediction round has been reached, all training ends, the information is cleaned up, and the model is retained. Accordingly, the client ends all training, cleans up its information, and retains the model.
Therefore, the training method of the federal learning model provided by the application can automatically select the tendency of the matched learning mode by mixing the transverse splitting mode and the longitudinal splitting mode without concern about the data distribution mode, solves the problems that all data cannot be fully utilized for learning and the training effect is poor due to insufficient data utilization in the training process of the existing federal learning model, reduces the loss of the federal learning model and improves the performance of the federal learning model.
FIG. 18 is a flow chart of a method of training a federal learning model according to an embodiment of the present application.
As shown in fig. 18, the training method of the federal learning model according to the embodiment of the present application is explained by using a client as an execution body, and specifically includes the following steps:
S1601, receiving the target splitting mode sent by the server when it determines that the training node meets the preset splitting condition, wherein the training node is a node on one of a plurality of lifting trees.
In the embodiment of the application, if the training node meets the preset splitting condition, the training node in which the training node is currently located is required to continue splitting, and in this case, the server side can acquire the target splitting mode corresponding to the training node and inform the client side to split the node based on the target splitting mode. Accordingly, the client can receive a target splitting mode sent by the server when the training node is determined to meet the preset splitting condition.
S1602, performing node splitting on the training nodes based on the target splitting mode.
In the embodiment of the application, the server determines the target splitting mode corresponding to the training node according to the first split value and the second split value. Correspondingly, the client can receive the IL or the (client_id, record_id, IL, feature_name, feature_value) information sent by the server and perform node splitting on the training node according to the target splitting mode. If the (client_id, record_id, IL, feature_name, feature_value) information is present, the client also needs to record and store it when splitting the training node.
Further, after the splitting is completed, the client may use the split left subtree node as the current processing node.
Therefore, according to the training method of the federal learning model, the client can determine the target splitting mode sent by the server when the training node meets the preset splitting condition, wherein the training node is a node on one of a plurality of lifting trees, and the training node is split based on the target splitting mode, so that the tendency of the matched learning mode can be automatically selected by mixing the transverse splitting mode and the longitudinal splitting mode without concern about the data distribution mode, the problems that all data cannot be fully utilized for learning and the training effect is poor due to insufficient data utilization in the training process of the existing federal learning model are solved, meanwhile, the loss of the federal learning model is reduced, and the performance of the federal learning model is improved.
In the application, before the training node is split based on the target splitting mode, the client can cooperate with the server to perform federal learning and acquire the corresponding splitting value.
As a possible implementation manner, as shown in fig. 19, on the basis of the foregoing embodiment, the method specifically includes the following steps:
S1701, performing horizontal federation learning based on the first training set to obtain a first split value corresponding to the training node.
In the application, when the first split value corresponding to the training node is obtained by attempting to perform the horizontal federal learning based on the first training set, the initial data instance identifier set corresponding to the training node can be obtained, and the initial data instance identifier set is sent to the server.
As a possible implementation manner, as shown in fig. 20, on the basis of the foregoing embodiment, the method specifically includes the following steps:
S1801, receiving the first feature subset, generated by the server from the first training set, that is available to the training node.
In the embodiment of the application, the server may randomly generate the first feature subset available to the current training node from the first training set, for example, may randomly generate half of all features of the current first training set, form a new feature set as the first feature subset, and send the generated first feature subset to each client. Accordingly, each client may receive the first feature subset.
S1802, sending the feature value of each feature in the first feature subset to the server.
In the embodiment of the application, the client can traverse to obtain the characteristic value of each characteristic in the set according to the obtained first characteristic subset, and then randomly select one of all the values of the characteristic according to the local data, namely the characteristic value of the locally stored characteristic, and send the selected value to the server. Correspondingly, the server collects the feature value information sent by each client to form a value list, randomly selects a global optimal splitting threshold value serving as the current feature from the list, and broadcasts the splitting threshold value to each client.
S1803, receiving a splitting threshold value of each feature sent by the server.
In the embodiment of the application, the server side can determine the splitting threshold value of each feature according to the received feature value of each feature in the first feature subset and send the splitting threshold value to the client side. Accordingly, the client may receive the split threshold for each feature sent by the server.
S1804, based on the splitting threshold of each feature, acquiring an initial data instance identification set corresponding to the training node, and sending the initial data instance identification set to the server; the initial data instance identification set is used for indicating the server to generate a first data instance identification set and a second data instance identification set, wherein the first data instance identification set and the initial data instance identification set both comprise data instance identifications belonging to a first left subtree space, and the second data instance identification set comprises data instance identifications belonging to a first right subtree space.
In the embodiment of the application, when the initial data instance identifier set corresponding to the training node is obtained based on the splitting threshold value of each feature, the splitting threshold value of any feature can be respectively compared with the feature value of any feature for any feature, and the data instance identifier with the feature value smaller than the splitting threshold value is obtained to generate the initial data instance identifier set. The split threshold may be set before training is started according to actual situations.
As a possible implementation manner, the client may perform node splitting on the current feature according to the received feature splitting threshold information to obtain IL, and notify the server; if the client does not have the corresponding feature, then an empty IL is returned.
IL is the set of instance IDs in the left subtree space. It is calculated as follows: on receiving the threshold value of feature k sent by the server, if the value of feature k corresponding to instance ID1 in the local data is smaller than that value, ID1 is added to the IL set. The formula is as follows:

$$S_{IL}=\{ID \mid ID_k < value\}$$

where $ID_k$ denotes the value of feature k for instance ID, and $S_{IL}$ denotes the set IL.
S1702, performing longitudinal federation learning based on a second training set to obtain a second split value corresponding to the training node.
In the present application, before attempting to perform longitudinal federal learning based on the second training set to obtain the second split value corresponding to the training node, the first gradient information of at least one third data instance identifier set may be obtained, and the first gradient information of the third data instance identifier set may be sent to the server.
As a possible implementation manner, as shown in fig. 21, on the basis of the foregoing embodiment, the method specifically includes the following steps:
S1901, receiving the gradient information request sent by the server.
In the embodiment of the application, the server may send a gradient information request to the client to request for obtaining Gkv and Hkv information. Accordingly, the client can receive the gradient information request sent by the server.
S1902, generating a second feature subset from the second training set according to the gradient information request.
S1903, obtaining first gradient information of at least one third data instance identifier set of each feature in the second feature subset, wherein the third data instance identifier set comprises data instance identifiers belonging to a second left sub-tree space, the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the features, and different feature values correspond to different second left sub-tree spaces.
As a possible implementation manner, as shown in fig. 22, on the basis of the foregoing embodiment, the method specifically includes the following steps:
S2001, for any feature, acquiring all feature values of that feature and bucketing it based on the feature values.
In the embodiment of the application, each sample can be mapped into buckets according to each feature k in the set and every value v of the corresponding feature.
It should be noted that, the above-mentioned bucket mapping rule is various, the specific manner of the bucket mapping rule is not limited in the present application, and only the samples with the same eigenvalue, for example, two samples with a value of 1 in fig. 13, are required to be separated into the same data bucket.
For example, the samples sharing one value may form one bucket; if a feature takes m distinct values, the samples are split into m buckets, and the corresponding feature thresholds are those m values.
For another example, the number of buckets may be limited, e.g., to at most m buckets; in this case, if feature k takes fewer than m distinct values, the division proceeds as above, while if it takes more than m, the values are divided into m buckets in an approximately bisecting way.
S2002, acquiring first gradient information of a third data instance identification set of each sub-bucket of any feature.
In the embodiment of the application, the client can acquire the first gradient information of the third data instance identification set of each sub-bucket of any feature. Accordingly, the server may receive the first gradient information of the at least one third data instance identification set for each feature sent by the client.
S1904, the first gradient information of the third data instance identification set is sent to the server.
In the embodiment of the application, the client can obtain the data which is not processed by the current node according to the part of data of the common ID, randomly obtain the feature set, map each sample barrel according to each feature k in the set and all values of the corresponding feature v, calculate Gkv and Hkv of the left subtree space, homomorphic encrypt and send to the server.
S1703, the first split value and the second split value are sent to the server.
In the embodiment of the application, after the client performs horizontal federation learning and vertical federation learning based on the first training set and the second training set to obtain the first split value and the second split value corresponding to the training node, it can send the first split value and the second split value to the server. Accordingly, the server may receive the first split value and the second split value.
In the application, when the training node is subjected to node splitting based on the target splitting mode, the node splitting can be performed according to the splitting information sent by the server.
As a possible implementation manner, as shown in fig. 23, on the basis of the foregoing embodiment, the method specifically includes the following steps:
S2101, receiving the splitting information sent by the server, wherein the splitting information comprises the target splitting mode, the target splitting feature selected as the feature splitting point, and the target splitting value.
In the embodiment of the application, before attempting to inform the client of node splitting based on the target splitting mode, the server can send splitting information to the client, wherein the splitting information comprises the target splitting mode, the target splitting feature selected as the feature splitting point and the target splitting value. Accordingly, the client may receive the split information sent by the server.
As a possible implementation manner, the server may perform the node splitting in a lateral manner, that is, perform a splitting operation on the current node according to the (k, value) information obtained in the lateral manner, so as to obtain IL information, and broadcast the IL to each client.
And S2102, performing node splitting on the training nodes based on the splitting information.
As a possible implementation manner, the client may perform node splitting on the common-ID data according to the received (k, value) information from the server: if the value of feature k for a piece of common-ID data is smaller than value, the ID of that piece of data is put into the IL set; otherwise it is put into the IR set. If the data does not have feature k, it is put into the right subtree space.
Further, after node splitting is performed on the training node, the client may send the left subtree space generated by the splitting to the server. Accordingly, the server may receive the split-generated left subtree space.
It should be noted that, in the embodiment of the present application, if the training node meets the preset splitting condition, the current training node needs to continue splitting; if it does not meet the preset splitting condition, the current training node does not need to continue splitting, and in this case the client can take the residual as the residual input of the next lifting tree and at the same time backtrack to another node.
As a possible implementation manner, as shown in fig. 24, on the basis of the foregoing embodiment, the method specifically includes the following steps:
S2201, if the training node is a leaf node, receiving a weight value of the leaf node sent by the server.
In the embodiment of the application, if the training node does not meet the preset splitting condition, the server can treat the node as a leaf node, calculate the weight value $w_j$ of the leaf node, and store $w_j$ as the leaf node weight value for the longitudinal mode. Accordingly, the client may receive the weight $w_j$ of the leaf node sent by the server.
S2202, determining residual errors of all data contained in the leaf nodes according to the weight values of the leaf nodes.
S2203, taking the residual as the residual input of the next lifting tree.
In the embodiment of the application, the client can calculate a new $y'^{(t-1)}(i)$ according to $[I_j(m), w_j]$ and trace back to other non-leaf nodes of the current tree as the current node, where $y'^{(t-1)}(i)$ represents the label residual corresponding to the i-th instance, t denotes the current t-th tree, and t-1 the previous tree.
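A minimal sketch of this residual update, assuming a squared-error-style residual and an assumed shrinkage parameter `eta`:

```python
from typing import Dict, Iterable

def update_residuals(leaf_instances: Iterable[str], w_j: float,
                     residual: Dict[str, float], eta: float = 0.1) -> Dict[str, float]:
    """Feed a leaf's weight back as the next lifting tree's residual input.

    For each instance i on leaf j, the running prediction advances by the
    leaf weight, so the label residual y'(i) shrinks accordingly.
    """
    for i in leaf_instances:        # I_j(m): the instances on leaf j of tree m
        residual[i] -= eta * w_j    # new y'^{(t-1)}(i) for the next tree
    return residual
```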
Therefore, according to the training method of the federal learning model, the client can determine the target splitting mode sent by the server when the training node meets the preset splitting condition, wherein the training node is a node on one of a plurality of lifting trees, and the training node is split based on the target splitting mode, so that the tendency of the matched learning mode can be automatically selected by mixing the transverse splitting mode and the longitudinal splitting mode without concern about the data distribution mode, the problems that all data cannot be fully utilized for learning and the training effect is poor due to insufficient data utilization in the training process of the existing federal learning model are solved, meanwhile, the loss of the federal learning model is reduced, and the performance of the federal learning model is improved.
It should be noted that, in the embodiment of the present application, the training process of the federal learning model mainly includes several stages of node splitting, generating a model, and verifying the model. The following explains the training method of the federal learning model according to the embodiment of the present application, by taking node splitting, model generation and model verification stages for training the federal learning model as examples, with respect to the server as an execution subject and the verification client as an execution subject, respectively.
As shown in fig. 25, the training method of the federal learning model according to the embodiment of the present application specifically includes the following steps:
s2301, if the training node meets a preset splitting condition, acquiring a target splitting mode corresponding to the training node; the training node is a node on one of a plurality of lifting trees.
S2302, notifying the client to split the nodes based on the target splitting mode, and acquiring updated training nodes.
S2303, determining that the updated training nodes meet the training stopping conditions, stopping training and generating a target federal learning model.
It should be noted that, the relevant contents of steps S2301 to S2303 can be referred to the above embodiments, and are not repeated here.
S2304, acquiring a verification set, and verifying the target federal learning model by a collaborative verification client, wherein the verification client is one of clients participating in federal learning model training.
Wherein the verification set is typically a part of the samples in the training set. Alternatively, samples may be randomly sampled from the training set at a preset ratio to serve as the verification set. In the embodiment of the application, the verification set comprises data instance identifiers, and the verification set is mutually exclusive with the first training set and with the second training set.
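As a simple illustration of such sampling, the sketch below randomly draws a verification set from the training identifiers at a preset ratio; the function name split_validation_set and the fixed seed are assumptions made for illustration:

import random

def split_validation_set(all_ids, ratio=0.2, seed=42):
    """Randomly sample a verification set of data instance identifiers.

    all_ids: data instance identifiers available for training.
    ratio: preset sampling ratio for the verification set.
    Returns (train_ids, validation_ids); the two lists are mutually exclusive.
    """
    rng = random.Random(seed)
    ids = list(all_ids)
    rng.shuffle(ids)
    cut = int(len(ids) * ratio)
    return ids[cut:], ids[:cut]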
Therefore, after the federal learning model is generated, the server can acquire the verification set and cooperate with the verification client to verify the target federal learning model. In this way, when user data is largely homogeneous, combining training with verification reduces the verification loss of the federal learning model, improves its inference effect, and further improves the effectiveness and reliability of the federal learning model during training.
It should be noted that, in the embodiment of the present application, the updated training node includes: the left subtree node generated after the training node splits and the other non-leaf nodes of the lifting tree. The updated training node satisfies the stop-training condition if: the updated training node no longer meets the preset splitting condition; or, the updated training node is the last node of the plurality of lifting trees.
In the embodiment of the application, when attempting to acquire the verification set and cooperate with the verification client to verify the target federal learning model, the data instance identifiers in the verification set can be verified one by one until all the data instance identifiers in the verification set have been verified.
As a possible implementation manner, as shown in fig. 26, based on the foregoing embodiment, the process of obtaining the verification set in the foregoing step S2304 and verifying the target federal learning model by the cooperative verification client specifically includes the following steps:
s2401, sending a data instance identifier in the verification set and split information of a verification node to the verification client, wherein the verification node is a node on one of a plurality of lifting trees.
In the embodiment of the application, the server can send any data instance identifier to the verification client together with the splitting information of the verification node. Correspondingly, the verification client can receive the data instance identifier and the splitting information of the verification node, acquire the corresponding data according to the data instance identifier, and judge the node trend corresponding to the verification node according to the splitting information, that is, judge whether to walk to the left subtree or to the right subtree.
Wherein the split information includes features for splitting and a split threshold.
S2402, receiving a node trend corresponding to the verification node sent by the verification client, wherein the node trend is determined by the verification client according to the data instance identifier and the splitting information.
In the embodiment of the application, after the verification client determines the node trend corresponding to the verification node, the verification client can send the node trend to the server. Accordingly, the server may receive the node trend corresponding to the verification node sent by the verification client.
S2403, entering the next node according to the trend of the node, and taking the next node as an updated verification node.
In the embodiment of the application, the server can enter the next node according to the trend of the node returned by the verification client, and the next node is used as the updated verification node. Further, the server may determine whether the updated verification node meets a preset node splitting condition, and if the updated verification node meets the preset node splitting condition, it is indicated that the leaf node is not reached, step S2404 may be executed; if the updated verification node does not meet the preset node splitting condition, indicating that the leaf node has been reached, step S2405 may be performed.
S2404, if the updated verification node meets the preset node splitting condition, returning to execute sending the data instance identifier and the splitting information to the verification client until the data instance identifiers in the verification set are verified.
S2405, if the updated verification node does not meet the preset node splitting condition, determining that the updated verification node is a leaf node, and obtaining a model predictive value of the data instance represented by the data instance identifier.
In the embodiment of the application, after the server determines that the updated verification node is the leaf node, the server can record the weight value of the leaf node, calculate and store the model predicted value of the data instance represented by the data instance identifier.
The model predicted value of the data instance represented by a data instance identifier refers to the predicted value of each sample. When a sample walks to a leaf node on each tree during verification, the Leaf Score of that leaf is the sample's score on that tree, and the sum of the sample's scores over all trees is its predicted value.
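A minimal sketch of this accumulation is shown below. It assumes each trained tree exposes a routing method that walks a sample to a leaf and returns that leaf's score; predict_sample and route_to_leaf_score are illustrative names, not the scheme's API:

def predict_sample(trees, sample):
    """Predicted value of one sample: the sum of its leaf scores over all trees."""
    # route_to_leaf_score(sample) is an assumed per-tree interface that
    # returns the Leaf Score of the leaf the sample walks to.
    return sum(tree.route_to_leaf_score(sample) for tree in trees)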
Further, after the completion of step S2404 described above, it may be determined whether to retain and use the target federal learning model.
As a possible implementation manner, as shown in fig. 27, after the step S2404, the following steps are specifically included:
S2501, if all the data instance identifiers in the verification set are verified, sending a model predictive value of the data instance to the verification client.
In the embodiment of the present application, if all the data instance identifiers in the verification set are verified, that is, the currently predicted data instance identifier is the last one of all the predicted data instance identifiers, the model predicted value of the data instance may be sent to the verification client. Correspondingly, the verification client can receive all the prediction results, calculate the final verification result, compare the final verification result with the last verification result to judge whether the current target federal learning model needs to be reserved and used, and generate verification indication information according to the judgment result.
It should be noted that, when attempting to generate the verification indication information, the verification client may calculate the predicted values for all samples in the verification set. Since the verification client holds the true Label values of these samples, it can calculate difference indexes between the predicted values and the Label values, such as Accuracy or RMSE (Root Mean Squared Error), and determine the performance of the model in the current Epoch according to these indexes.
The current Epoch, also called the current generation of training, refers to the process in which all training samples complete one forward pass and one backward pass in the network; that is, one Epoch is one complete pass of all training samples through training. Therefore, if the obtained difference index is better than the index of the last Epoch, the currently obtained model can be retained; if the obtained difference index is worse than the index of the last Epoch, the currently obtained model can be discarded.
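The keep-or-discard decision can be illustrated with RMSE, where a smaller value is better; evaluate_epoch and the externally stored last_rmse are illustrative assumptions:

import math

def evaluate_epoch(predictions, labels, last_rmse):
    """Compare the current Epoch's RMSE with the previous Epoch's.

    predictions, labels: aligned lists of predicted values and true Label values.
    last_rmse: RMSE of the last Epoch, or None for the first Epoch.
    Returns (keep_model, current_rmse).
    """
    n = len(labels)
    rmse = math.sqrt(sum((p - y) ** 2 for p, y in zip(predictions, labels)) / n)
    keep_model = last_rmse is None or rmse < last_rmse
    return keep_model, rmse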
S2502, receiving verification indication information sent by a verification client, wherein the verification indication information is indication information which is obtained according to a model predicted value and used for indicating whether a model is reserved or not.
In the embodiment of the application, the verification client can send the verification indication information to the server. Accordingly, the server may receive the verification indication information sent by the verification client.
S2503, determining whether to reserve and use the target federal learning model according to the verification indication information, and sending the determination result to the client.
In the embodiment of the application, the server can determine whether to reserve and use the target federal learning model according to the verification indication information, and send the determination result to all clients.
As shown in fig. 28, the training method of the federal learning model according to the embodiment of the present application specifically includes the following steps:
s2601, receiving a target splitting mode sent by a server when determining that a training node meets a preset splitting condition, wherein the training node is a node on one of a plurality of lifting trees.
S2602, performing node splitting on the training nodes based on the target splitting mode.
It should be noted that, the relevant content of the steps S2601 to S2602 can be referred to the above embodiments, and will not be repeated here.
S2603, receiving a verification set sent by the server, and verifying the target federal learning model based on the verification set.
In the embodiment of the application, the server can acquire the verification set and send the verification set to the verification client. Accordingly, the verification client may receive the verification set sent by the server and verify the target federal learning model based on the verification set.
Therefore, after the training node is split based on the target splitting mode, the verification client can receive the verification set sent by the server and verify the target federal learning model based on the verification set. In this way, when user data is largely homogeneous, combining training with verification reduces the verification loss of the federal learning model, improves its inference effect, and further improves the effectiveness and reliability of the federal learning model during training.
In the embodiment of the application, when the verification client attempts to verify the target federal learning model based on the verification set, the data instance identifiers in the verification set can be verified one by one until all the data instance identifiers in the verification set have been verified.
As a possible implementation manner, as shown in fig. 29, based on the foregoing embodiment, the process of verifying the target federal learning model in step S2603 based on the verification set specifically includes the following steps:
s2701, receiving a data instance identifier in the verification set and split information of the verification node sent by the server, wherein the verification node is a node on one of a plurality of lifting trees.
In the embodiment of the application, the server can send any data instance identifier to the verification client and send the splitting information of the verification node at the same time. Accordingly, the verification client may receive the data instance identification and split information of the verification node.
Wherein the split information includes features for splitting and a split threshold.
S2702, determining the node trend of the verification node according to the data instance identification and the splitting information.
In the embodiment of the application, the verification client can acquire the corresponding data according to the data instance identifier, and judge the node trend corresponding to the verification node according to the split information, that is, judge whether to walk to the left subtree or to the right subtree.
As a possible implementation manner, as shown in fig. 30, the process of determining the node trend of the verification node according to the data instance identifier and the splitting information in the step S2702 specifically includes the following steps:
s2801, according to the data instance identifier, determining a feature value of each feature corresponding to the data instance identifier.
It should be noted that, the relevant content of step S2801 can be referred to the above embodiments, and will not be described herein.
And S2802, determining the trend of the node according to the split information and the characteristic value of each characteristic.
In the embodiment of the application, the verification client can determine the feature for splitting according to the splitting information, and further determine the trend of the node based on the feature value and the splitting threshold value of the feature.
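A minimal sketch of this decision follows; determine_node_trend and the 'left'/'right' return convention are illustrative, and sending data without the split feature to the right subtree matches the handling of missing features elsewhere in this document:

def determine_node_trend(feature_values, split_feature, split_threshold):
    """Decide whether a data instance walks to the left or the right subtree.

    feature_values: dict mapping feature name -> value for the data instance
        represented by the data instance identifier.
    split_feature, split_threshold: the split information sent by the server.
    """
    value = feature_values.get(split_feature)
    if value is None:
        return "right"                  # missing feature: right subtree
    return "left" if value < split_threshold else "right"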
S2703, sending the node trend to the server, so that the server enters the next node according to the node trend and takes the next node as the updated verification node.
In the embodiment of the application, after the verification client determines the node trend corresponding to the verification node, the verification client can send the node trend to the server. Accordingly, the server may receive the node trend corresponding to the verification node sent by the verification client.
Further, after the completion of the above step S2703, it may be determined whether to retain and use the target federal learning model.
As a possible implementation manner, as shown in fig. 31, after the above step S2703, the following steps are specifically included:
s2901, if all the data instance identifiers in the verification set are verified, receiving a model predicted value of the data instance represented by the data instance identifier sent by the server.
In the embodiment of the present application, if all the data instance identifiers in the verification set are verified, that is, the currently predicted data instance identifier is the last one of all the predicted data instance identifiers, the model predicted value of the data instance may be sent to the verification client. Correspondingly, the verification client can receive all the prediction results, calculate the final verification result, compare the final verification result with the last verification result to judge whether the current target federal learning model needs to be reserved and used, and generate verification indication information according to the judgment result.
It should be noted that, when attempting to generate the verification indication information, the verification client may calculate the predicted values for all samples in the verification set. Since the verification client holds the true Label values of these samples, it can calculate difference indexes between the predicted values and the Label values, such as Accuracy or RMSE (Root Mean Squared Error), and determine the performance of the model in the current Epoch according to these indexes.
The current Epoch, also called the current generation of training, refers to the process in which all training samples complete one forward pass and one backward pass in the network; that is, one Epoch is one complete pass of all training samples through training. Therefore, if the obtained difference index is better than the index of the last Epoch, the currently obtained model can be retained; if the obtained difference index is worse than the index of the last Epoch, the currently obtained model can be discarded.
S2902, obtaining a final verification result according to the model predicted values, and comparing the verification result with the previous verification result to generate verification indication information for indicating whether to retain and use the target federal learning model.
In the embodiment of the application, the verification client can send the verification indication information to the server. Accordingly, the server may receive the verification indication information sent by the verification client.
S2903, verification instruction information is sent to the server side.
In the embodiment of the application, the server side can determine whether to reserve and use the target federal learning model according to the verification indication information sent by the verification client side, and send the determination result to all the client sides.
FIG. 32 is a flow chart of a training method of a federal learning model according to an embodiment of the present application.
As shown in fig. 32, taking XGB as an example, the training method of the federal learning model provided by the embodiment of the present application is explained through the whole process of training the federal learning model by the server and the clients (including the verification client), which specifically includes the following steps:
s3001, the server and the client respectively perform initialization processing.
Optionally, the client sends the data identifiers to the server. Specifically, the client sends the data identifier of each piece of data to the server, where the data identifier uniquely distinguishes each piece of data. Correspondingly, the server receives the data identifiers sent by the clients.
Further, the server determines a common data identifier between the clients according to the received data identifier. The common data identifier is the same data identifier in different clients determined by the server according to the data identifiers reported by the clients.
Further, the server sends the common data identifier to the client.
Further, the client obtains the derivatives of the loss formula according to the common data identifiers and the local data, and performs homomorphic encryption processing. In particular, the client calculates some intermediate results of the loss function, such as the first derivative g_i and the second derivative h_i of the loss function, from the common data identifiers and the local data. The calculation formulas of g_i and h_i are:

g_i = ∂l(y_i, ŷ_i^(t-1)) / ∂ŷ_i^(t-1),  h_i = ∂²l(y_i, ŷ_i^(t-1)) / ∂(ŷ_i^(t-1))²

where ŷ_i^(t-1) is the prediction result of sample i from the previous tree; please refer to the related art for the meaning of each symbol.
The calculation formula of the loss function is as follows:

L^(t) = Σ_i l(y_i, ŷ_i^(t-1) + f_t(x_i)) + Ω(f_t)

It is proposed in XGB to represent the above formula approximately with a second-order Taylor expansion. The second-order form of the Taylor expansion is:

L^(t) ≈ Σ_i [ l(y_i, ŷ_i^(t-1)) + g_i·f_t(x_i) + (1/2)·h_i·f_t(x_i)² ] + Ω(f_t)
further, the client sends the encrypted derivative to the server. Accordingly, the server receives the encrypted derivative sent by the client.
Further, the server decrypts the received encrypted derivatives and averages the decrypted derivatives to obtain mean values. For the same common data identifier, a mean value is calculated based on the accumulation of the derivatives corresponding to all clients. For example, the first derivatives g_i and the second derivatives h_i corresponding to the common data identifiers are respectively summed and then averaged, as described above. Specifically, n represents the number of common data IDs, g_i(j) represents the first derivative g_i of data j, and h_i(j) represents the second derivative h_i of data j.
Further, the server sends the mean values to the clients, illustratively in the form of a list. In one implementation, the first derivative g_i and the second derivative h_i may co-exist in the same list; in another implementation, the first derivative g_i and the second derivative h_i exist in different lists, e.g. the first derivative g_i in list A and the second derivative h_i in list B. Correspondingly, the client receives the mean values sent by the server.
Further, the client updates the locally stored mean.
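To make the initialization concrete, the sketch below computes g_i and h_i for a squared-error loss (where g_i = ŷ_i - y_i and h_i = 1) and averages the per-ID first derivatives across clients. The choice of squared loss, the omission of the homomorphic encryption step, and all names are illustrative assumptions:

def squared_loss_derivatives(y_true, y_pred):
    """Per-sample first and second derivatives of the squared-error loss.

    For l(y, yhat) = (y - yhat)^2 / 2: g_i = yhat_i - y_i and h_i = 1.
    """
    g = {i: y_pred[i] - y_true[i] for i in y_true}
    h = {i: 1.0 for i in y_true}
    return g, h

def average_derivatives(per_client_g):
    """Average the first derivatives reported by all clients per common data ID.

    per_client_g: list of dicts, one per client, mapping common data ID -> g_i.
    (In the scheme these values arrive homomorphically encrypted and are
    decrypted by the server first; encryption is omitted here for clarity.)
    """
    mean_g = {}
    for data_id in per_client_g[0]:
        values = [client_g[data_id] for client_g in per_client_g]
        mean_g[data_id] = sum(values) / len(values)
    return mean_g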
S3002, the server and the client start to perform transverse XGB processing.
Optionally, the server determines whether the current tree node needs to continue splitting. For example, the server determines whether the tree node needs to continue splitting according to whether the current hierarchy of the tree node reaches the maximum tree depth. If the tree node does not need to continue splitting, the server takes the node as a leaf node, calculates the weight value w_j of the leaf node, and stores w_j as a leaf node weight value of the transverse XGB. If the tree node needs to continue splitting, the server randomly generates a feature set available to the current node from the set of all features, and sends the feature set to each client.
Further, the server randomly generates a feature set available to the current node from the set of all features, and sends the feature set to each client.
Further, the client traverses each feature in the feature set according to the obtained feature set, randomly selects one of all values of the feature according to the local data, and sends the value to the server.
Further, the server collects feature value information sent by each client to form a value list, randomly selects a global optimal splitting threshold value serving as a current feature from the value list, and broadcasts the global optimal splitting threshold value to each client.
Further, the client performs node splitting on the current feature according to the received global optimal splitting threshold to obtain IL, and notifies the server; if the client does not have the corresponding feature, then an empty IL is returned.
Wherein IL is the instance ID set of the left subtree space, calculated as follows: the client receives the global optimal splitting threshold of the feature k sent by the server, and if the value of the feature k corresponding to an instance ID1 in the local data is smaller than the global optimal splitting threshold, ID1 is added into the IL set. The formula is as follows:

S_IL = {ID | ID_k < value}

where ID_k represents the value of feature k for instance ID, and S_IL represents the set IL.
Further, the server receives the IL sent by each client and filters repeated instance IDs from the ILs, so as to handle ID information that is contradictory for the current feature: for some clients an instance ID is added to the IL, while other clients have the same ID but did not add it; in this case the ID is considered to belong to the IL. The final IL and IR are thereby determined. If the current feature does not exist in the data of a certain client, the data instance IDs of that client are put into the IR. Then, GL, GR, HL and HR are calculated, and the split value Gain of the current feature is then calculated:

Gain = (1/2)·[ GL²/(HL+λ) + GR²/(HR+λ) - (GL+GR)²/(HL+HR+λ) ] - γ
where GL is the sum of all first derivatives g_i in the left subtree space; GR is the sum of all first derivatives g_i in the right subtree space; HL is the sum of all second derivatives h_i in the left subtree space; HR is the sum of all second derivatives h_i in the right subtree space. The calculation method is as follows:

GL = Σ_{i=1..n_1} g_i,  HL = Σ_{i=1..n_1} h_i,  GR = Σ_{i=1..n_2} g_i,  HR = Σ_{i=1..n_2} h_i

where n_1 represents the number of instances in the left subtree space and n_2 represents the number of instances in the right subtree space.
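A minimal sketch of this Gain computation follows, using the standard XGB form; the regularization parameters lam (λ) and gamma (γ) come from standard XGB, and their exact use here is an assumption rather than an explicit statement of the scheme:

def split_gain(gl, gr, hl, hr, lam=1.0, gamma=0.0):
    """Split value Gain of a candidate split from left/right gradient sums.

    gl, hl: sums of g_i and h_i over the left subtree space (GL, HL).
    gr, hr: sums of g_i and h_i over the right subtree space (GR, HR).
    """
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(gl, hl) + score(gr, hr) - score(gl + gr, hl + hr)) - gamma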
The server cooperates with the clients to traverse each feature in the randomly selected feature set, calculates the split value obtained when each feature is used as the split node, and takes the feature with the largest split value as the best-performing feature of the current node. Meanwhile, the server also knows the threshold information corresponding to the split node, and takes the split threshold, the split value Gain, the selected feature and other information as the optimal split information of the transverse XGB for the current node.
Further, the server side performs a cleaning operation after the node splitting in the transverse mode, and the server side informs each client side that the node splitting in the transverse mode is not performed any more, namely the node splitting in the transverse mode is completed.
Further, the server takes the node as a leaf node, calculates the weight value w_j of the leaf node, and stores w_j as a leaf node weight value of the transverse XGB. Wherein:

w_j = -G_m / (H_m + λ)

where G_m represents the sum of g_i corresponding to all instances of node m, and H_m represents the sum of h_i corresponding to all instances of node m.
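The corresponding computation is a one-liner; as above, the regularization parameter lam (λ) follows the standard XGB weight formula and is an assumption here:

def leaf_weight(g_sum, h_sum, lam=1.0):
    """Leaf node weight w_j = -G_m / (H_m + lambda), the standard XGB form.

    g_sum: G_m, the sum of g_i over all instances of node m.
    h_sum: H_m, the sum of h_i over all instances of node m.
    """
    return -g_sum / (h_sum + lam)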
Further, the server side informs each client side that the node splitting operation in the transverse mode is no longer performed, namely the node splitting operation in the transverse mode is completed.
Further, the client performs processing after the transverse node splitting is completed.
S3003, the server and the client start to perform longitudinal XGB processing.
Optionally, the server notifies the client to perform the processing of the longitudinal XGB.
Further, the server requests the G_kv and H_kv information from each client.
Further, the client obtains the data of the current node that has not yet been processed according to the common-data-identifier part of the data, randomly obtains a feature set, maps each sample into buckets according to each feature k in the set and all values v of the corresponding feature, calculates G_kv and H_kv of the left subtree space, and sends them to the server after homomorphic encryption. Optionally, after sorting the values of the feature k in the dataset, the bucketing operation may be performed, with the buckets divided as: {s_{k,1}, s_{k,2}, s_{k,3}, …, s_{k,v-1}}. The calculation formulas of G_kv and H_kv are then:

G_kv = Σ_{i: x_{i,k} < s_{k,v}} g_i,  H_kv = Σ_{i: x_{i,k} < s_{k,v}} h_i

where x_{i,k} represents the value of the feature k of data x_i.
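A minimal sketch of this per-feature accumulation is shown below; it takes the sorted distinct feature values as the candidate thresholds, which is one simple bucketing choice rather than the scheme's mandated one, and all names are illustrative:

def left_space_gradient_sums(samples, g, h, k):
    """Compute G_kv and H_kv of the left subtree space for each candidate value v.

    samples: dict mapping instance ID -> dict of {feature: value}.
    g, h: dicts mapping instance ID -> g_i / h_i.
    k: the feature under consideration.
    Returns a dict {v: (G_kv, H_kv)}, where the left space holds instances
    with x_{i,k} < v.
    """
    values = sorted({feats[k] for feats in samples.values() if k in feats})
    sums = {}
    for v in values:
        left = [i for i, feats in samples.items() if k in feats and feats[k] < v]
        sums[v] = (sum(g[i] for i in left), sum(h[i] for i in left))
    return sums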
Further, the server decrypts the [[G_kv]] and [[H_kv]] transmitted by each client, and can calculate G and H for the current node according to the common-data-identifier part of the data of the current node and all the g_i and h_i obtained before. Optionally, if several clients have the same feature, the server randomly takes one of the received G_kv values as the G_kv of the current feature, and likewise for H_kv. Further, according to G, H and G_kv, H_kv, the optimal division point of each feature can be calculated, and the global optimal division point (k, v, Gain) is determined based on the aforementioned division point information.
In the embodiment of the present application, the obtained Gain may be compared with a preset threshold: if Gain is less than or equal to the threshold, node splitting in the longitudinal mode is not performed, and step S3027 may be executed; if Gain is greater than the threshold, step S3023 may be performed.
Further, the server requests the IL from the corresponding client according to (k, v, gain).
Further, the client C receives the division point information (k, v, gain), looks up the division point threshold, and records the division point (k, value) information. The local data set is segmented according to the division point, and the resulting IL is sent to the server. Here, record represents the index of the corresponding record at the client, and IL is calculated as mentioned previously.
Further, the server receives the (record, IL, value) information sent by the client C, segments the instances of all common IDs in the current node space, and associates the current node with the client C through (client_id, record). In the embodiment of the present application, the server may record (client_id, record_id, IL, feature_name, feature_value) as the information of the longitudinal split, and execute step S3027.
Further, the server takes the node as a leaf node, calculates the weight value w_j of the leaf node, and stores w_j as a vertical leaf node weight value.
Further, the server side informs each client side that the leaf node splitting in the longitudinal mode is not performed any more; or, the node splitting operation is completed.
Further, each client performs processing after the longitudinal node splitting is completed.
S3004, the server and the client start to perform mixing processing of the transverse XGB and the longitudinal XGB.
Optionally, the server determines whether the current node needs to split. In the embodiment of the application, if the current node does not need to be split, the server takes the node as a leaf node, calculates the weight value w_j of the leaf node, and transmits the information [Ij(m), w_j] to all clients; if the current node needs to be split, the server determines a target Gain according to the Gain obtained by the transverse XGB and the Gain obtained by the longitudinal XGB, thereby determining the node splitting mode to perform node splitting.
Further, the server determines a target Gain according to the Gain obtained by the transverse XGB and the Gain obtained by the longitudinal XGB, so as to determine the node splitting mode for node splitting. In the embodiment of the application, if the server splits the node in the transverse XGB mode, that is, splits the current node according to the (k, value) information obtained in the transverse mode, it can obtain the IL information and broadcast the IL to each client; if the longitudinal XGB mode is used, the server informs each client to perform the real node splitting operation according to the longitudinally split (client_id, record_id, IL, feature_name, feature_value) recorded by the server.
Further, the server informs each client to perform the real node splitting operation according to the longitudinally split (client_id, record_id, IL, feature_name, feature_value) recorded by the server. The client corresponding to client_id must know all the information (client_id, record_id, IL, feature_name, feature_value), while the other clients only need to know the IL information.
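The target-Gain selection between the two modes reduces to a comparison; choose_split_mode and its return convention are illustrative:

def choose_split_mode(horizontal_gain, vertical_gain):
    """Pick the node splitting mode with the larger split value Gain.

    Returns ('horizontal' or 'vertical', target_gain); the server then
    performs node splitting in the chosen mode.
    """
    if horizontal_gain >= vertical_gain:
        return "horizontal", horizontal_gain
    return "vertical", vertical_gain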
The server takes the left subtree node after the current splitting as the current processing node.
Further, the client receives an IL or (client_id, record_id, IL, feature_name) sent by the server, and performs a node splitting operation in a longitudinal mode; if there is (client_id, record_id, IL, feature_name, feature_value) information, the client also needs to record and store this information at the time of splitting. In the embodiment of the application, after the splitting is finished, the split left subtree node is used as the current processing node.
Further, the server returns to step S3002 to continue the subsequent processing, and the client returns to step S3002 to wait for the server message. It should be noted that, since only the splitting of the current node is completed at this time, the splitting of the nodes of the left subtree and the right subtree of the next layer is also required. Therefore, the process returns to step S3002, and the node splitting of the next node is performed.
Further, the server performs the node splitting in a transverse manner, that is, performs splitting operation on the current node according to the (k, value) information obtained in the transverse manner, and may obtain IL information, and broadcasts the IL to each client. Wherein IL can be expressed by the following formula:
S_IL = {ID | ID_k < value}

where ID_k represents the value of feature k for instance ID, and S_IL represents the set IL.
Further, the client receives the IL broadcast by the server, determines the IL and IR of the current node by combining the received IL with the local non-common-ID data, and then performs the node splitting operation. IL and IR are determined as follows: any local non-common-ID data whose ID is not included in the IL set sent by the server is put into the IR set.
Further, when the server performs node splitting in a transverse manner, the server broadcasts (k, value) to the client according to the selected feature k and the threshold value.
Further, each client receives the (k, value) information from the server and performs node splitting on the data of the common IDs according to the feature k: if the value of the feature k of a piece of data is less than value, the ID of the piece of data is put into the IL set; otherwise, it is put into the IR set. If the data does not have the feature k, the data is put into the right subtree space.
Further, the server returns to step S3002 to continue splitting operation of the next node, and the client returns to step S3002 to wait for the server message.
Further, the server takes the node as a leaf node, calculates the weight value w_j of the leaf node, and transmits the information [Ij(m), w_j] to all clients. Where Ij(m) is the instance ID set of the current node space, and w_j is the weight of the current node.
Further, the client calculates a new y'(t-1)(i) according to [Ij(m), w_j] and backtracks to other non-leaf nodes of the current tree as the current node. Where y'(t-1)(i) represents the Label residual corresponding to the i-th instance, t represents the current t-th tree, and t-1 represents the previous tree.
Further, the server backtracks to other non-leaf nodes of the current tree as the current node.
Further, if the current node after backtracking exists and is not null, the process returns to step S3002 for the next process, and the client returns to step S3002 to wait for the server message.
Further, if the current node is empty, a verification operation of the model is performed.
S3005, the server and the verification client perform verification of the target federal learning model.
Optionally, the server notifies the verification client to perform the verification initialization operation.
Further, the authentication client performs authentication initialization.
Further, the server selects an ID to start verification, initializes the XGB tree, and notifies the verification client to start verification.
Further, the verification client initializes the verification information.
Further, the server sends the split node information and the verified data ID to the verification client according to the current XGB tree.
Further, the verification client acquires the corresponding data according to the data ID, judges whether the sample should walk to the left subtree or the right subtree according to the split node information sent by the server, and returns the result to the server.
Further, the server enters the next node according to the trend returned by the verification client, and then judges whether a leaf node has been reached. If so, the server records the weight of the leaf node, calculates the predicted value, and stores it; otherwise, the server sends the split node information of the new current node and the data ID being verified to the verification client and continues the traversal.
Further, the server records the weight of the leaf node, calculates the predicted value, and stores it. If the currently predicted ID is the last of all IDs to be predicted, the server sends all the prediction results to the verification client; otherwise, the server selects the next ID, reinitializes the XGB tree, and notifies the verification client to continue verification.
Further, the server sends all the prediction results to the client.
Further, the verification client receives all the prediction results, computes the final verification result, compares it with the last verification result, judges whether the current model needs to be retained and used, and notifies the server.
Further, the server judges whether to reserve and use the current model according to the verification result returned by the verification client, and notifies all clients.
Further, each client receives the broadcast information of the server and processes the broadcast information.
Further, the server determines whether the final prediction round has been reached, and if so, performs step S3006.
S3006, the server and the client end respectively end training and keep the target federal learning model.
Optionally, the server ends all training, cleans up information, and retains the model.
Further, the client ends all training, cleans up information, and retains the model.
In summary, according to the training method of the federal learning model in the embodiment of the application, the transverse splitting mode and the longitudinal splitting mode are mixed, so that the tendency of the matched learning mode is automatically selected without concern about the data distribution mode. This solves the problems that the existing federal learning model cannot fully utilize all data for learning during training and that insufficient data utilization leads to a poor training effect, while also reducing the loss of the federal learning model and improving its performance.
Based on the same application conception, the embodiment of the application also provides a device corresponding to the training method of the federal learning model.
Fig. 33 is a schematic structural diagram of a training device for a federal learning model according to an embodiment of the present application.
As shown in fig. 33, the training device 1000 of the federal learning model is applied to a server, and includes: the system comprises an acquisition module 110, a notification module 120, a generation module 130 and a verification module 140.
The acquiring module 110 is configured to acquire a target splitting manner corresponding to a training node if the training node meets a preset splitting condition; wherein the training node is a node on one of a plurality of lifting trees;
the notification module 120 is configured to notify a client to perform node splitting based on the target splitting manner, and obtain the updated training node;
a generating module 130, configured to determine that the updated training node meets a training stopping condition, stop training, and generate a target federal learning model;
and the verification module 140 is used for acquiring a verification set, and verifying the target federal learning model by a cooperative verification client, wherein the verification client is one of clients participating in federal learning model training.
According to one embodiment of the present application, as shown in fig. 34, the verification module 140 in fig. 33 includes:
a first sending sub-module 141, configured to send, to the verification client, a data instance identifier in the verification set and split information of a verification node, where the verification node is a node on one of a plurality of lifting trees;
a first receiving sub-module 142, configured to receive a node trend corresponding to the verification node sent by the verification client, where the node trend is determined by the verification client according to the data instance identifier and the splitting information;
a node update sub-module 143, configured to enter a next node according to the node trend, and use the next node as the updated verification node;
and the second sending sub-module 144 is configured to, if the updated verification node meets the preset node splitting condition, return to perform sending the data instance identifier and the splitting information to the verification client until the data instance identifiers in the verification set are verified.
According to one embodiment of the present application, as shown in fig. 34, the verification module 140 in fig. 33 further includes:
An obtaining sub-module 145, configured to determine that the updated verification node is a leaf node if the updated verification node does not meet the preset node splitting condition, and obtain a model prediction value of the data instance represented by the data instance identifier.
According to one embodiment of the present application, as shown in fig. 34, the verification module 140 in fig. 33 further includes:
a third sending sub-module 146, configured to send the model prediction value of the data instance to the verification client if all the data instance identifiers in the verification set are verified;
a second receiving sub-module 147, configured to receive verification indication information sent by the verification client, where the verification indication information is indication information obtained according to the model prediction value and used to indicate whether a model is reserved;
a determining submodule 148, configured to determine whether to reserve and use the target federal learning model according to the verification instruction information, and send a determination result to the client.
According to one embodiment of the present application, as shown in fig. 35, the acquisition module 110 in fig. 33 includes:
a first learning sub-module 111, configured to perform horizontal federal learning in cooperation with the client based on a first training set, so as to obtain a first split value corresponding to the training node;
A second learning sub-module 112, configured to perform vertical federal learning in coordination with the client based on a second training set, so as to obtain a second split value corresponding to the training node;
and the determining submodule 113 is configured to determine a target splitting manner corresponding to the training node according to the first splitting value and the second splitting value.
According to one embodiment of the present application, as shown in fig. 36, the determining sub-module 113 in fig. 35 includes:
a first determining unit 1131, configured to determine that a larger value of the first split value and the second split value is a target split value corresponding to the training node;
and a second determining unit 1132, configured to determine, according to the target split value, a split manner corresponding to the training node.
According to one embodiment of the present application, as shown in fig. 37, the first learning sub-module 111 in fig. 35 includes:
a transmitting unit 1111, configured to generate a first feature subset available to the training node from the first training set, and transmit the first feature subset to the client;
a first receiving unit 1112, configured to receive a feature value of each feature in the first feature subset sent by the client;
a third determining unit 1113, configured to determine, according to the feature value of each feature in the first feature subset, each feature as a lateral split value corresponding to a split feature point;
A fourth determining unit 1114, configured to determine the first split value of the training node according to the lateral split value corresponding to each feature.
According to an embodiment of the present application, as shown in fig. 38, the third determination unit 1113 in fig. 37 includes:
a first determining subunit 11131, configured to determine, for any feature in the first feature subset, a splitting threshold of any feature according to a feature value of the any feature;
a first obtaining subunit 11132, configured to obtain, according to the splitting threshold, a first data instance identifier set and a second data instance identifier set corresponding to the any feature, where the first data instance identifier set includes data instance identifiers that belong to a first left subtree space, and the second data instance identifier set includes data instance identifiers that belong to a first right subtree space;
a second determining subunit 11133, configured to determine the transverse split value corresponding to the any feature according to the first data instance identifier set and the second data instance identifier set.
According to one embodiment of the application, the first acquisition subunit 11132 in fig. 38 is further configured to: sending the split threshold to the client; receiving an initial data instance identifier set corresponding to the training node sent by the client, wherein the initial data instance identifier set is generated when the client splits any feature according to the splitting threshold, and the initial data instance identifier set comprises data instance identifiers belonging to the first left subtree space; the first set of data instance identifications and the second set of data instance identifications are obtained based on the initial set of data instance identifications and all data instance identifications.
According to one embodiment of the present application, as shown in fig. 39, the second learning sub-module 112 in fig. 35 includes:
a notification unit 1121, configured to notify the client to perform vertical federal learning based on the second training set;
a second receiving unit 1122, configured to receive first gradient information of at least one third data instance identifier set of each feature sent by the client, where the third data instance identifier set includes data instance identifiers belonging to a second left sub-tree space, where the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the feature, and different feature values correspond to different second left sub-tree spaces;
a fifth determining unit 1123, configured to determine a longitudinal split value of each feature according to the first gradient information of each feature and total gradient information of the training node, respectively;
a sixth determining unit 1124 is configured to determine the second split value of the training node according to the longitudinal split value corresponding to each feature.
According to an embodiment of the present application, as shown in fig. 40, the fifth determining unit 1123 in fig. 39 includes:
a second obtaining subunit 11231, configured to obtain, for any feature, second gradient information corresponding to each first gradient information according to the total gradient information and each first gradient information;
A third obtaining subunit 11232, configured to obtain, for each piece of first gradient information, a candidate longitudinal split value of the any feature according to the first gradient information and second gradient information corresponding to the first gradient information;
a selecting subunit 11233, configured to select a maximum value of the candidate longitudinal split values as a longitudinal split value of the any feature.
According to one embodiment of the application, the validation set is mutually exclusive from the first training set and the second training set, respectively.
Therefore, the server can automatically select the tendency of the matched learning mode by mixing the transverse splitting mode and the longitudinal splitting mode without concern about the data distribution mode. This solves the problems that the existing federal learning model cannot fully utilize all data for learning during training and that insufficient data utilization leads to a poor training effect, while also reducing the loss of the federal learning model, improving its performance, and reducing the verification loss of the model.
Based on the same application conception, the embodiment of the application also provides a device corresponding to the model evaluation method of the federal learning model.
Fig. 41 is a schematic structural diagram of a training device for a federal learning model according to an embodiment of the present application. As shown in fig. 41, the model evaluation device 2000 of the federal learning model is applied to a verification client, and includes: a receiving module 210, a splitting module 220, and a verification module 230.
The receiving module 210 is configured to receive a target splitting mode sent by the server when it determines that a training node meets a preset splitting condition, wherein the training node is a node on one of a plurality of lifting trees;
a splitting module 220, configured to perform node splitting on the training node based on the target splitting manner;
and the verification module 230 is configured to receive a verification set sent by the server, and verify the target federal learning model based on the verification set.
According to one embodiment of the present application, as shown in fig. 42, the verification module 230 in fig. 41 includes:
a first receiving sub-module 231, configured to receive a data instance identifier in the verification set and the split information of the verification node sent by the server, where the verification node is a node on one of a plurality of lifting trees;
A first determining submodule 232, configured to determine a node trend of the verification node according to the data instance identifier and the splitting information;
and a first sending sub-module 233, configured to send the node trend to the server, so that the server enters a next node according to the node trend, and uses the next node as the updated verification node.
According to one embodiment of the application, as shown in FIG. 43, the first determination submodule 232 in FIG. 42 includes:
a first determining unit 2321, configured to determine, according to the data instance identifier, a feature value of each feature corresponding to the data instance identifier;
a second determining unit 2322, configured to determine the node trend according to the splitting information and the feature value of each feature.
According to one embodiment of the present application, as shown in fig. 42, the verification module 230 in fig. 41 further includes:
a second receiving sub-module 234, configured to receive, if all the data instance identifiers in the verification set are verified, a model prediction value of the data instance represented by the data instance identifier sent by the server;
a generating sub-module 235, configured to obtain a final verification result according to the model predicted value, and compare the verification result with a previous verification result to generate verification indication information for indicating whether to retain and use the target federal learning model;
And a second sending sub-module 236, configured to send the verification indication information to the server.
According to one embodiment of the present application, as shown in fig. 44, the splitting module 220 in fig. 41 includes:
a first learning sub-module 221, configured to perform horizontal federal learning based on a first training set, so as to obtain a first split value corresponding to the training node;
a second learning sub-module 222, configured to perform vertical federal learning based on a second training set, so as to obtain a second split value corresponding to the training node;
and a third sending submodule 223, configured to send the first split value and the second split value to the server.
According to one embodiment of the present application, as shown in fig. 45, the first learning sub-module 221 in fig. 44 includes:
a first receiving unit 2211, configured to receive a first feature subset available to the training node generated by the server from the first training set;
a first sending unit 2212, configured to send, to the server, a feature value of each feature in the first feature subset;
a second receiving unit 2213, configured to receive a splitting threshold of each feature sent by the server;
a first obtaining unit 2214, configured to obtain an initial data instance identifier set corresponding to the training node based on the splitting threshold of each feature, and send the initial data instance identifier set to the server;
The initial data instance identification set is used for indicating the server to generate a first data instance identification set and a second data instance identification set, wherein the first data instance identification set and the initial data instance identification set both comprise data instance identifications belonging to a first left subtree space, and the second data instance identification set comprises data instance identifications belonging to a first right subtree space.
According to an embodiment of the present application, the first acquisition unit 2214 in fig. 45 is further configured to:
and comparing the splitting threshold value of any feature with the feature value of any feature respectively for any feature, acquiring a data instance identifier of which the feature value is smaller than the splitting threshold value, and generating the initial data instance identifier set.
According to one embodiment of the present application, as shown in fig. 46, the second learning sub-module 222 in fig. 44 includes:
a third receiving unit 2221, configured to receive a gradient information request sent by the server;
a generating unit 2222 configured to generate a second feature subset from the second training set according to the gradient information request;
a second obtaining unit 2223, configured to obtain first gradient information of at least one third data instance identifier set of each feature in the second feature subset, where the third data instance identifier set includes data instance identifiers belonging to a second left sub-tree space, the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the feature, and different feature values correspond to different second left sub-tree spaces;
A second sending unit 2224, configured to send the first gradient information of the third data instance identifier set to the server.
According to one embodiment of the present application, as shown in fig. 47, the second acquisition unit 2223 in fig. 46 includes:
a sub-bucket unit 22231, configured to obtain, for any feature, all feature values of the any feature, and perform bucket classification on the any feature based on the feature values;
a second obtaining subunit 22232, configured to obtain the first gradient information of the third data instance identifier set of each sub-bucket of the any feature.
Therefore, according to the training device for the federal learning model, the client can receive the target splitting mode sent by the server when the training node meets the preset splitting condition, wherein the training node is a node on one of a plurality of lifting trees, and then split the training node based on the target splitting mode. By mixing the transverse splitting mode and the longitudinal splitting mode, the tendency of the matched learning mode can be automatically selected without concern about the data distribution mode. This solves the problems that the existing federal learning model cannot fully utilize all data for learning during training and that insufficient data utilization leads to a poor training effect, while also reducing the loss of the federal learning model, improving its performance, and reducing the verification loss of the model.
Based on the same application conception, the embodiment of the application also provides electronic equipment.
Fig. 48 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 48, the electronic device 3000 includes a memory 310, a processor 320, and a computer program stored in the memory 310 and executable on the processor 320, and when the processor executes the program, the foregoing training method of the federal learning model is implemented.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (40)

1. A method for training a federal learning model, applied to a server, the method comprising:
if a training node meets a preset splitting condition, acquiring a target splitting mode corresponding to the training node, wherein the training node is a node on one of a plurality of boosting trees;
notifying a client to perform node splitting based on the target splitting mode, and acquiring an updated training node;
determining that the updated training node meets a training stopping condition, stopping training, and generating a target federal learning model; and
acquiring a verification set, and verifying the target federal learning model in cooperation with a verification client, wherein the verification client is one of the clients participating in the federal learning model training;
wherein the acquiring the target splitting mode corresponding to the training node comprises:
performing horizontal federal learning in cooperation with the client based on a first training set to obtain a first split value corresponding to the training node;
performing vertical federal learning in cooperation with the client based on a second training set to obtain a second split value corresponding to the training node; and
determining the target splitting mode corresponding to the training node according to the first split value and the second split value.
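(Illustrative note, not part of the claims.) As a rough sketch of the per-node choice in claim 1, whose selection rule is made explicit in claim 5 below (the larger split value wins), the server-side decision could look like the following Python fragment; the names are hypothetical and this is not the patented implementation:

    def choose_target_split(horizontal, vertical):
        # horizontal / vertical: (split_value, split_info) pairs obtained from
        # horizontal and vertical federal learning respectively.
        h_value, h_info = horizontal
        v_value, v_info = vertical
        if h_value >= v_value:
            return "horizontal", h_info
        return "vertical", v_info

Because the comparison is repeated at every training node, a single boosting tree can mix horizontally and vertically split nodes, which is what lets the model adapt to the data distribution automatically.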
2. The method for training a federal learning model according to claim 1, wherein the verifying the target federal learning model in cooperation with the verification client based on the verification set comprises:
sending a data instance identifier in the verification set and splitting information of a verification node to the verification client, wherein the verification node is a node on one of the plurality of boosting trees;
receiving a node trend corresponding to the verification node sent by the verification client, wherein the node trend is determined by the verification client according to the data instance identifier and the splitting information;
entering a next node according to the node trend, and taking the next node as the updated verification node; and
if the updated verification node meets a preset node splitting condition, returning to the sending of the data instance identifier and the splitting information to the verification client, until all data instance identifiers in the verification set are verified.
3. The method for training a federal learning model according to claim 2, further comprising:
if the updated verification node does not meet the preset node splitting condition, determining the updated verification node as a leaf node, and acquiring a model prediction value of the data instance represented by the data instance identifier.
4. The method for training a federal learning model according to claim 3, further comprising:
if all the data instance identifiers in the verification set are verified, sending the model prediction value of each data instance to the verification client;
receiving verification indication information sent by the verification client, wherein the verification indication information is obtained according to the model prediction values and indicates whether to retain the model; and
determining whether to retain and use the target federal learning model according to the verification indication information, and sending a determination result to the client.
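(Illustrative note, not part of the claims.) Claims 2-4 describe a per-instance walk down each boosting tree: the server holds the tree structure, the verification client resolves each routing decision from its local feature values, and a model prediction value is read off once a leaf is reached. A minimal single-process Python sketch of that interaction follows; Node, route, and verify_instance are assumed names, and in the actual protocol the two sides would exchange messages over the network rather than call each other directly:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        feature: Optional[str] = None      # split feature; None marks a leaf
        threshold: float = 0.0             # splitting threshold
        left: Optional["Node"] = None
        right: Optional["Node"] = None
        prediction: float = 0.0            # model prediction value at a leaf

    def route(instance_features, split_info):
        # Client side: derive the node trend from local feature values.
        feature, threshold = split_info
        return "left" if instance_features[feature] < threshold else "right"

    def verify_instance(root, instance_features):
        # Server side: walk the tree, asking the client for each direction.
        node = root
        while node.feature is not None:    # node still splits further
            trend = route(instance_features, (node.feature, node.threshold))
            node = node.left if trend == "left" else node.right
        return node.prediction             # leaf reached, as in claim 3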
5. The method for training a federal learning model according to claim 1, wherein the determining the target splitting mode corresponding to the training node according to the first split value and the second split value comprises:
determining the larger of the first split value and the second split value as a target split value corresponding to the training node; and
determining the splitting mode corresponding to the training node according to the target split value.
6. The method for training a federal learning model according to claim 5, wherein the performing horizontal federal learning in cooperation with the client based on the first training set to obtain the first split value corresponding to the training node comprises:
generating, from the first training set, a first feature subset available to the training node, and sending the first feature subset to the client;
receiving a feature value of each feature in the first feature subset sent by the client;
determining, according to the feature values of each feature in the first feature subset, a horizontal split value corresponding to each feature serving as a split feature point; and
determining the first split value of the training node according to the horizontal split value corresponding to each feature.
7. The method for training a federal learning model according to claim 6, wherein the determining, according to the feature values of each feature in the first feature subset, the horizontal split value corresponding to each feature serving as a split feature point comprises:
for any feature in the first feature subset, determining a splitting threshold of the feature according to the feature values of the feature;
acquiring a first data instance identifier set and a second data instance identifier set corresponding to the feature according to the splitting threshold, wherein the first data instance identifier set comprises data instance identifiers belonging to a first left subtree space, and the second data instance identifier set comprises data instance identifiers belonging to a first right subtree space; and
determining the horizontal split value corresponding to the feature according to the first data instance identifier set and the second data instance identifier set.
8. The method for training a federal learning model according to claim 7, wherein the acquiring the first data instance identifier set and the second data instance identifier set corresponding to the feature according to the splitting threshold comprises:
sending the splitting threshold to the client;
receiving an initial data instance identifier set corresponding to the training node sent by the client, wherein the initial data instance identifier set is generated when the client splits the feature according to the splitting threshold, and comprises data instance identifiers belonging to the first left subtree space; and
obtaining the first data instance identifier set and the second data instance identifier set based on the initial data instance identifier set and all the data instance identifiers.
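(Illustrative note, not part of the claims.) Claims 6-8 do not fix a scoring formula for the horizontal split value; purely as a stand-in, the Python sketch below scores one candidate threshold with the familiar gradient-boosting gain G^2/(H + lambda), computed over the first and second data instance identifier sets of claim 7, assuming the server can look up per-instance gradient statistics. All names are hypothetical:

    def gain(g, h, lam=1.0):
        # Stand-in scoring term; the patent does not specify the formula.
        return g * g / (h + lam)

    def horizontal_split_value(ids_left, ids_right, grads, hessians, lam=1.0):
        # Score one candidate threshold from the left/right identifier sets.
        g_left = sum(grads[i] for i in ids_left)
        h_left = sum(hessians[i] for i in ids_left)
        g_right = sum(grads[i] for i in ids_right)
        h_right = sum(hessians[i] for i in ids_right)
        return (gain(g_left, h_left, lam) + gain(g_right, h_right, lam)
                - gain(g_left + g_right, h_left + h_right, lam))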
9. The method for training a federal learning model according to claim 1, wherein the performing vertical federal learning in cooperation with the client based on the second training set to obtain the second split value corresponding to the training node comprises:
notifying the client to perform vertical federal learning based on the second training set;
receiving first gradient information of at least one third data instance identifier set of each feature sent by the client, wherein the third data instance identifier set comprises data instance identifiers belonging to a second left subtree space, the second left subtree space is a left subtree space formed by splitting according to one of the feature values of the feature, and different feature values correspond to different second left subtree spaces;
determining a vertical split value of each feature according to the first gradient information of the feature and total gradient information of the training node; and
determining the second split value of the training node according to the vertical split value corresponding to each feature.
10. The method for training a federal learning model according to claim 9, wherein the determining the vertical split value of each feature according to the first gradient information of the feature and the total gradient information of the training node comprises:
for any feature, acquiring second gradient information corresponding to each piece of first gradient information according to the total gradient information and the first gradient information;
for each piece of first gradient information, acquiring a candidate vertical split value of the feature according to the first gradient information and the second gradient information corresponding to the first gradient information; and
selecting the maximum of the candidate vertical split values as the vertical split value of the feature.
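(Illustrative note, not part of the claims.) Claim 10's second gradient information is recoverable by subtraction: for each candidate second left subtree space, the right-side gradient sums equal the training node's totals minus the left-side sums. A hedged Python sketch with hypothetical names, reusing the same stand-in gain formula as above since the patent does not fix one:

    def gain(g, h, lam=1.0):
        return g * g / (h + lam)  # stand-in scoring term, as above

    def vertical_split_value(first_grad_infos, total_g, total_h, lam=1.0):
        # first_grad_infos: list of (g_left, h_left) sums, one per candidate
        # second left subtree space of the feature.
        best = float("-inf")
        for g_left, h_left in first_grad_infos:
            g_right = total_g - g_left      # the second gradient information
            h_right = total_h - h_left
            candidate = (gain(g_left, h_left, lam) + gain(g_right, h_right, lam)
                         - gain(total_g, total_h, lam))
            best = max(best, candidate)     # claim 10: keep the maximum
        return best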
11. The method for training a federal learning model according to claim 1, wherein the verification set is mutually exclusive with each of the first training set and the second training set.
12. A method for training a federal learning model, applied to a verification client, the method comprising:
receiving a target splitting mode sent by a server when a training node meets a preset splitting condition, wherein the training node is a node on one of a plurality of boosting trees;
performing node splitting on the training node based on the target splitting mode; and
receiving a verification set sent by the server, and verifying a target federal learning model based on the verification set;
wherein before the performing node splitting on the training node based on the target splitting mode, the method further comprises:
performing horizontal federal learning based on a first training set to obtain a first split value corresponding to the training node;
performing vertical federal learning based on a second training set to obtain a second split value corresponding to the training node; and
sending the first split value and the second split value to the server.
13. The method for training a federal learning model according to claim 12, wherein the verifying the target federal learning model based on the verification set comprises:
receiving a data instance identifier in the verification set and splitting information of a verification node sent by the server, wherein the verification node is a node on one of the plurality of boosting trees;
determining a node trend of the verification node according to the data instance identifier and the splitting information; and
sending the node trend to the server, so that the server enters a next node according to the node trend and takes the next node as the updated verification node.
14. The method for training a federal learning model according to claim 13, wherein the determining the node trend of the verification node according to the data instance identifier and the splitting information comprises:
determining, according to the data instance identifier, a feature value of each feature corresponding to the data instance identifier; and
determining the node trend according to the splitting information and the feature value of each feature.
15. The method for training a federal learning model according to claim 13, further comprising:
if all the data instance identifiers in the verification set are verified, receiving a model prediction value, sent by the server, of the data instance represented by each data instance identifier;
obtaining a final verification result according to the model prediction values, and comparing the verification result with a previous verification result to generate verification indication information indicating whether to retain and use the target federal learning model; and
sending the verification indication information to the server.
16. The method for training a federal learning model according to claim 12, wherein the performing horizontal federal learning based on the first training set to obtain the first split value corresponding to the training node comprises:
receiving a first feature subset, generated by the server from the first training set, that is available to the training node;
sending the feature value of each feature in the first feature subset to the server;
receiving a splitting threshold of each feature sent by the server; and
acquiring an initial data instance identifier set corresponding to the training node based on the splitting threshold of each feature, and sending the initial data instance identifier set to the server,
wherein the initial data instance identifier set is used for instructing the server to generate a first data instance identifier set and a second data instance identifier set, the first data instance identifier set and the initial data instance identifier set both comprise data instance identifiers belonging to a first left subtree space, and the second data instance identifier set comprises data instance identifiers belonging to a first right subtree space.
17. The method for training a federal learning model according to claim 16, wherein the acquiring the initial data instance identifier set corresponding to the training node based on the splitting threshold of each feature comprises:
for any feature, comparing the splitting threshold of the feature with each feature value of the feature, acquiring the data instance identifiers whose feature values are smaller than the splitting threshold, and generating the initial data instance identifier set.
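(Illustrative note, not part of the claims.) Claim 17 amounts to a simple threshold filter over the client's local feature values; a minimal Python sketch with a hypothetical name:

    def initial_id_set(feature_values, split_threshold):
        # Collect the data instance identifiers whose feature value is below
        # the splitting threshold, i.e. the first left subtree space.
        return {idx for idx, value in enumerate(feature_values)
                if value < split_threshold}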
18. The method for training a federal learning model according to claim 12, wherein before the performing vertical federal learning based on the second training set to obtain the second split value corresponding to the training node, the method further comprises:
receiving a gradient information request sent by the server;
generating a second feature subset from a second training set according to the gradient information request;
acquiring first gradient information of at least one third data instance identifier set of each feature in the second feature subset, wherein the third data instance identifier set comprises data instance identifiers belonging to a second left subtree space, the second left subtree space is a left subtree space formed by splitting according to one of the feature values of the feature, and different feature values correspond to different second left subtree spaces; and
sending the first gradient information of the third data instance identifier set to the server.
19. The method for training a federal learning model according to claim 18, wherein the acquiring the first gradient information of the at least one third data instance identifier set of each feature in the second feature subset comprises:
for any feature, acquiring all feature values of the feature, and partitioning the feature into buckets based on the feature values; and
acquiring the first gradient information of the third data instance identifier set of each bucket of the feature.
20. The utility model provides a training device of federal study model which characterized in that is applied to the server, includes:
the acquisition module is used for acquiring a target splitting mode corresponding to the training node if the training node meets a preset splitting condition; wherein the training node is a node on one of a plurality of lifting trees;
the notification module is used for notifying the client to split the nodes based on the target splitting mode and acquiring the updated training nodes;
the generation module is used for determining that the updated training nodes meet the training stopping conditions, stopping training and generating a target federal learning model;
The verification module is used for acquiring a verification set, and verifying the target federal learning model by a collaborative verification client, wherein the verification client is one of clients participating in federal learning model training;
the acquisition module comprises:
the first learning sub-module is used for carrying out horizontal federal learning in cooperation with the client based on a first training set so as to obtain a first split value corresponding to the training node;
the second learning sub-module is used for carrying out longitudinal federal learning in cooperation with the client based on a second training set so as to obtain a second split value corresponding to the training node;
and the determining submodule is used for determining a target splitting mode corresponding to the training node according to the first splitting value and the second splitting value.
21. The federal learning model training apparatus of claim 20, wherein the verification module comprises:
a first sending sub-module, configured to send, to the verification client, a data instance identifier in the verification set and splitting information of a verification node, wherein the verification node is a node on one of the plurality of boosting trees;
a first receiving sub-module, configured to receive a node trend corresponding to the verification node sent by the verification client, wherein the node trend is determined by the verification client according to the data instance identifier and the splitting information;
a node updating sub-module, configured to enter a next node according to the node trend, and take the next node as the updated verification node; and
a second sending sub-module, configured to, if the updated verification node meets a preset node splitting condition, return to sending the data instance identifier and the splitting information to the verification client, until all data instance identifiers in the verification set are verified.
22. The federal learning model training apparatus of claim 21, wherein the verification module further comprises:
an acquisition sub-module, configured to, if the updated verification node does not meet the preset node splitting condition, determine the updated verification node as a leaf node and acquire a model prediction value of the data instance represented by the data instance identifier.
23. The federal learning model training apparatus of claim 22, wherein the verification module further comprises:
a third sending sub-module, configured to send the model prediction value of each data instance to the verification client if all the data instance identifiers in the verification set are verified;
a second receiving sub-module, configured to receive verification indication information sent by the verification client, wherein the verification indication information is obtained according to the model prediction values and indicates whether to retain the model; and
a determining sub-module, configured to determine whether to retain and use the target federal learning model according to the verification indication information, and send a determination result to the client.
24. The federal learning model training apparatus of claim 20, wherein the determining submodule comprises:
a first determining unit, configured to determine the larger of the first split value and the second split value as a target split value corresponding to the training node; and
a second determining unit, configured to determine the splitting mode corresponding to the training node according to the target split value.
25. The federal learning model training apparatus of claim 24, wherein the first learning sub-module comprises:
a sending unit, configured to generate a first feature subset available to the training node from the first training set, and send the first feature subset to the client;
a first receiving unit, configured to receive a feature value of each feature in the first feature subset sent by the client;
a third determining unit, configured to determine, according to the feature values of each feature in the first feature subset, a horizontal split value corresponding to each feature serving as a split feature point; and
a fourth determining unit, configured to determine the first split value of the training node according to the horizontal split value corresponding to each feature.
26. The training apparatus of the federal learning model according to claim 25, wherein the third determining unit comprises:
a first determining subunit, configured to determine, for any feature in the first feature subset, a splitting threshold of the feature according to the feature values of the feature;
a first obtaining subunit, configured to acquire, according to the splitting threshold, a first data instance identifier set and a second data instance identifier set corresponding to the feature, wherein the first data instance identifier set comprises data instance identifiers belonging to a first left subtree space, and the second data instance identifier set comprises data instance identifiers belonging to a first right subtree space; and
a second determining subunit, configured to determine the horizontal split value corresponding to the feature according to the first data instance identifier set and the second data instance identifier set.
27. The federal learning model training apparatus of claim 26, wherein the first obtaining subunit is further configured to:
send the splitting threshold to the client;
receive an initial data instance identifier set corresponding to the training node sent by the client, wherein the initial data instance identifier set is generated when the client splits the feature according to the splitting threshold, and comprises data instance identifiers belonging to the first left subtree space; and
obtain the first data instance identifier set and the second data instance identifier set based on the initial data instance identifier set and all the data instance identifiers.
28. The federal learning model training apparatus of claim 20, wherein the second learning sub-module comprises:
a notification unit, configured to notify the client to perform vertical federal learning based on the second training set;
a second receiving unit, configured to receive first gradient information of at least one third data instance identifier set of each feature sent by the client, wherein the third data instance identifier set comprises data instance identifiers belonging to a second left subtree space, the second left subtree space is a left subtree space formed by splitting according to one of the feature values of the feature, and different feature values correspond to different second left subtree spaces;
a fifth determining unit, configured to determine a vertical split value of each feature according to the first gradient information of the feature and total gradient information of the training node; and
a sixth determining unit, configured to determine the second split value of the training node according to the vertical split value corresponding to each feature.
29. The training apparatus of the federal learning model according to claim 28, wherein the fifth determining unit comprises:
a second acquiring subunit, configured to, for any feature, acquire second gradient information corresponding to each piece of first gradient information according to the total gradient information and the first gradient information;
a third acquiring subunit, configured to acquire, for each piece of first gradient information, a candidate vertical split value of the feature according to the first gradient information and the second gradient information corresponding to the first gradient information; and
a selecting subunit, configured to select the maximum of the candidate vertical split values as the vertical split value of the feature.
30. The federal learning model training apparatus of claim 20, wherein the verification set is mutually exclusive with each of the first training set and the second training set.
31. A federal learning model training apparatus, applied to a verification client, comprising:
a receiving module, configured to receive a target splitting mode sent by a server when a training node meets a preset splitting condition, wherein the training node is a node on one of a plurality of boosting trees;
a splitting module, configured to perform node splitting on the training node based on the target splitting mode; and
a verification module, configured to receive a verification set sent by the server and verify a target federal learning model based on the verification set;
wherein the splitting module comprises:
a first learning sub-module, configured to perform horizontal federal learning based on a first training set to obtain a first split value corresponding to the training node;
a second learning sub-module, configured to perform vertical federal learning based on a second training set to obtain a second split value corresponding to the training node; and
a third sending sub-module, configured to send the first split value and the second split value to the server.
32. The federal learning model training apparatus of claim 31, wherein the verification module comprises:
a first receiving sub-module, configured to receive a data instance identifier in the verification set and splitting information of a verification node sent by the server, wherein the verification node is a node on one of the plurality of boosting trees;
a first determining sub-module, configured to determine a node trend of the verification node according to the data instance identifier and the splitting information; and
a first sending sub-module, configured to send the node trend to the server, so that the server enters a next node according to the node trend and takes the next node as the updated verification node.
33. The federal learning model training apparatus of claim 32, wherein the first determination submodule comprises:
a first determining unit, configured to determine, according to the data instance identifier, a feature value of each feature corresponding to the data instance identifier; and
a second determining unit, configured to determine the node trend according to the splitting information and the feature value of each feature.
34. The federal learning model training apparatus of claim 32, wherein the verification module further comprises:
a second receiving sub-module, configured to receive, if all the data instance identifiers in the verification set are verified, a model prediction value, sent by the server, of the data instance represented by each data instance identifier;
a generation sub-module, configured to obtain a final verification result according to the model prediction values, and compare the verification result with a previous verification result to generate verification indication information indicating whether to retain and use the target federal learning model; and
a second sending sub-module, configured to send the verification indication information to the server.
35. The federal learning model training apparatus of claim 31, wherein the first learning sub-module comprises:
a first receiving unit, configured to receive a first feature subset available to the training node generated by the server from the first training set;
a first sending unit, configured to send, to the server, a feature value of each feature in the first feature subset;
a second receiving unit, configured to receive a splitting threshold of each feature sent by the server; and
a first acquisition unit, configured to acquire an initial data instance identifier set corresponding to the training node based on the splitting threshold of each feature, and send the initial data instance identifier set to the server,
wherein the initial data instance identifier set is used for instructing the server to generate a first data instance identifier set and a second data instance identifier set, the first data instance identifier set and the initial data instance identifier set both comprise data instance identifiers belonging to a first left subtree space, and the second data instance identifier set comprises data instance identifiers belonging to a first right subtree space.
36. The federal learning model training apparatus according to claim 35, wherein the first acquisition unit is further configured to:
for any feature, compare the splitting threshold of the feature with each feature value of the feature, acquire the data instance identifiers whose feature values are smaller than the splitting threshold, and generate the initial data instance identifier set.
37. The federal learning model training apparatus of claim 31, wherein the second learning sub-module comprises:
a third receiving unit, configured to receive a gradient information request sent by the server;
a generation unit, configured to generate a second feature subset from a second training set according to the gradient information request;
A second obtaining unit, configured to obtain first gradient information of at least one third data instance identifier set of each feature in the second feature subset, where the third data instance identifier set includes data instance identifiers belonging to a second left sub-tree space, where the second left sub-tree space is a left sub-tree space formed by splitting according to one of feature values of the feature, and different feature values correspond to different second left sub-tree spaces;
a second sending unit, configured to send the first gradient information of the third data instance identifier set to the server.
38. The federal learning model training apparatus of claim 37, wherein the second obtaining unit comprises:
a bucketing subunit, configured to, for any feature, acquire all feature values of the feature and partition the feature into buckets based on the feature values; and
a second acquiring subunit, configured to acquire the first gradient information of the third data instance identifier set of each bucket of the feature.
39. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method for training a federal learning model according to any one of claims 1-11 or claims 12-19.
40. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for training a federal learning model according to any one of claims 1-11 or claims 12-19.
CN202011621994.0A 2020-12-31 2020-12-31 Training method and device of federal learning model and electronic equipment Active CN113807544B (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN202011621994.0A CN113807544B (en) 2020-12-31 2020-12-31 Training method and device of federal learning model and electronic equipment
JP2023540566A JP2024501568A (en) 2020-12-31 2021-12-31 Federated learning model training method, device and electronic equipment
KR1020237022514A KR20230113804A (en) 2020-12-31 2021-12-31 Training methods, devices and electronic devices of federated learning models
US18/270,281 US20240127123A1 (en) 2020-12-31 2021-12-31 Federated learning model training method and apparatus, and electronic device
PCT/CN2021/143890 WO2022144001A1 (en) 2020-12-31 2021-12-31 Federated learning model training method and apparatus, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011621994.0A CN113807544B (en) 2020-12-31 2020-12-31 Training method and device of federal learning model and electronic equipment

Publications (2)

Publication Number Publication Date
CN113807544A CN113807544A (en) 2021-12-17
CN113807544B true CN113807544B (en) 2023-09-26

Family

ID=78943613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011621994.0A Active CN113807544B (en) 2020-12-31 2020-12-31 Training method and device of federal learning model and electronic equipment

Country Status (1)

Country Link
CN (1) CN113807544B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807380B (en) * 2020-12-31 2023-09-01 京东科技信息技术有限公司 Training method and device of federal learning model and electronic equipment
US20240127123A1 (en) * 2020-12-31 2024-04-18 Jingdong Technology Holding Co., Ltd. Federated learning model training method and apparatus, and electronic device
CN114785810B (en) * 2022-03-31 2023-05-16 海南师范大学 Tree-like broadcast data synchronization method suitable for federal learning
CN115545356B (en) * 2022-11-30 2024-02-27 深圳市峰和数智科技有限公司 Determination method of prediction model, S-wave travel time curve prediction method and related equipment
CN117436078B (en) * 2023-12-18 2024-03-12 烟台大学 Bidirectional model poisoning detection method and system in federal learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7439125B2 (en) * 2019-03-26 2024-02-27 ザ リージェンツ オブ ザ ユニバーシティ オブ カリフォルニア Decentralized privacy-preserving computing for protected data
US11562228B2 (en) * 2019-06-12 2023-01-24 International Business Machines Corporation Efficient verification of machine learning applications

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165683A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Sample predictions method, apparatus and storage medium based on federation's training
CN110991552A (en) * 2019-12-12 2020-04-10 支付宝(杭州)信息技术有限公司 Isolated forest model construction and prediction method and device based on federal learning
CN111598186A (en) * 2020-06-05 2020-08-28 腾讯科技(深圳)有限公司 Decision model training method, prediction method and device based on longitudinal federal learning
CN111724174A (en) * 2020-06-19 2020-09-29 安徽迪科数金科技有限公司 Citizen credit point evaluation method applying Xgboost modeling
CN111723946A (en) * 2020-06-19 2020-09-29 深圳前海微众银行股份有限公司 Federal learning method and device applied to block chain
CN111814985A (en) * 2020-06-30 2020-10-23 平安科技(深圳)有限公司 Model training method under federated learning network and related equipment thereof
CN111797999A (en) * 2020-07-10 2020-10-20 深圳前海微众银行股份有限公司 Longitudinal federal modeling optimization method, device, equipment and readable storage medium
CN112001500A (en) * 2020-08-13 2020-11-27 星环信息科技(上海)有限公司 Model training method, device and storage medium based on longitudinal federated learning system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Federated Learning and Its Applications in the Telecommunications Industry; Li Jian, Shao Yunfeng, Lu Yi, Wu Jun; Information and Communications Technology and Policy (09); full text *

Also Published As

Publication number Publication date
CN113807544A (en) 2021-12-17

Similar Documents

Publication Publication Date Title
CN113807544B (en) Training method and device of federal learning model and electronic equipment
CN113822311B (en) Training method and device of federal learning model and electronic equipment
CN113807380B (en) Training method and device of federal learning model and electronic equipment
CN108985309B (en) Data processing method and device
EP3284017B1 (en) Systems and methods for reducing data density in large datasets
CN111695697A (en) Multi-party combined decision tree construction method and device and readable storage medium
CN112380531A (en) Black product group partner identification method, device, equipment and storage medium
CN110543584B (en) Method, device, processing server and storage medium for establishing face index
Ji et al. Cret: Cross-modal retrieval transformer for efficient text-video retrieval
CN109598289B (en) Cross-platform data processing method, device, equipment and readable storage medium
CN105354343B (en) User characteristics method for digging based on remote dialogue
CN110610434A (en) Community discovery method based on artificial intelligence
Kim et al. Federated semi-supervised learning with prototypical networks
CN113515606A (en) Big data processing method based on intelligent medical safety and intelligent medical AI system
CN110222187B (en) Common activity detection and data sharing method for protecting user privacy
da Silva et al. Inference in distributed data clustering
WO2022144001A1 (en) Federated learning model training method and apparatus, and electronic device
Nguyen et al. Blockchain-based secure client selection in federated learning
CN116185296A (en) Distributed safe storage system based on multimedia teleconference information
CN116029392A (en) Joint training method and system based on federal learning
CN115578765A (en) Target identification method, device, system and computer readable storage medium
CN114239049A (en) Parameter compression-based defense method facing federal learning privacy reasoning attack
CN115329378A (en) Unmanned aerial vehicle air patrol platform with encryption function
Yang et al. Zero-Shot Point Cloud Segmentation by Semantic-Visual Aware Synthesis
CN114629693B (en) Suspicious broadband account identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant