CN114638377A - Model training method and device based on federal learning and electronic equipment - Google Patents

Model training method and device based on federal learning and electronic equipment

Info

Publication number
CN114638377A
CN114638377A
Authority
CN
China
Prior art keywords
training
difference value
data
node
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210506266.8A
Other languages
Chinese (zh)
Inventor
陈立峰
李腾飞
卞阳
张翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fucun Technology Shanghai Co ltd
Original Assignee
Fucun Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fucun Technology Shanghai Co ltd filed Critical Fucun Technology Shanghai Co ltd
Priority to CN202210506266.8A
Publication of CN114638377A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a model training method and device based on federated learning, and an electronic device, wherein the method comprises the following steps: processing external intermediate data and local feature data by using a first key to obtain initial secret state data, wherein the first key is obtained from a central node and the external intermediate data is obtained from other training nodes; sending the initial secret state difference value to the central node, and receiving a filtered index sent by the central node, wherein the initial secret state data is used for determining the filtered index; determining an updated secret state difference value according to the filtered index and the initial secret state data; determining a secret state gradient value by using the updated secret state difference value and the training subset corresponding to it; sending the secret state gradient value to the central node, and receiving a target gradient value sent by the central node; and updating the current model by using the target gradient value.

Description

Model training method and device based on federal learning and electronic equipment
Technical Field
The application relates to the technical field of model training, in particular to a method and a device for model training based on federal learning and electronic equipment.
Background
Federated learning allows participants holding different data to perform joint computation or machine learning on the premise that no participant can snoop on the data of the others. Federated learning can thus better exploit the value of jointly modeling the data of all parties.
Joint data modeling under federated learning is valuable, but because the amount of data is large, the amount of computation required for training grows proportionally. The federated learning training approach in the prior art therefore suffers from time-consuming, energy-consuming, and inefficient computation.
Disclosure of Invention
In view of this, an object of the embodiments of the present application is to provide a model training method and apparatus based on federated learning, and an electronic device, which can solve the problem that the existing federated learning training approach is inefficient.
In a first aspect, an embodiment of the present application provides a model training method based on federal learning, which is applied to a first training node, and includes:
processing external intermediate data and local feature data by using a first key to obtain initial secret state data, wherein the first key is obtained from a central node, the external intermediate data is obtained from other training nodes, and the initial secret state data comprises: an initial secret state difference value;
sending the initial secret state difference value to a central node, and receiving a screened index sent by the central node, wherein the initial secret state data is used for determining the screened index;
determining an updated secret state difference value according to the screened index and the initial secret state data;
determining a secret state gradient value by using the updated secret state difference value and the training subset corresponding to the updated secret state difference value;
sending the secret state gradient value to the central node, and receiving a target gradient value sent by the central node;
and updating the current model by using the target gradient value.
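The steps above can be pictured end to end. Below is a minimal, non-cryptographic sketch in Python: encryption is stubbed out with an identity function purely to show the message flow, and all helper names (`encrypt`, `CentralStub`, `training_round`) are hypothetical illustrations, not part of the patent.

```python
# Toy sketch of one training round at the first training node.

def encrypt(key, values):
    # placeholder for real encryption with the first key
    return list(values)

class CentralStub:
    # stands in for the central node
    def screen(self, diff):
        # keep the half of the difference values with the largest magnitude
        k = max(1, len(diff) // 2)
        top = sorted(range(len(diff)), key=lambda i: -abs(diff[i]))[:k]
        return sorted(top)

    def decrypt(self, grad):
        # placeholder for decryption with the second key
        return grad

def training_round(first_key, external_intermediate, local_feature, central):
    # step 1: initial secret-state difference value
    initial_diff = encrypt(first_key,
                           [e + f for e, f in zip(external_intermediate, local_feature)])
    # step 2: central node returns the filtered index
    index = central.screen(initial_diff)
    # step 3: updated difference value = the screened subset
    updated_diff = [initial_diff[i] for i in index]
    # step 4: gradient over the matching training subset (1-D toy features)
    subset = [local_feature[i] for i in index]
    grad = sum(s * d for s, d in zip(subset, updated_diff))
    # steps 5-6: central decrypts; the result updates the model
    return central.decrypt(grad)

target = training_round(None, [0.1, -0.2, 0.3, 0.0],
                        [1.0, 2.0, 3.0, 4.0], CentralStub())
```

Because only the screened half of the difference values enters step 4, the gradient computation touches half as much data, which is the efficiency gain the claims describe.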
In an optional implementation manner, the determining, according to the filtered index and the initial secret state data, of an updated secret state difference value includes:
determining an updated secret state difference value from the initial secret state difference value according to the filtered index and the mapping relationship data.
In an optional implementation, the processing of the external intermediate data and the local feature data by using the first key to obtain initial secret state data includes:
calculating, by using the first key, the external intermediate data and the local feature data in the first training node to obtain an original secret state difference value;
and performing out-of-order (shuffling) processing on the original secret state difference value to obtain an initial secret state difference value and mapping relationship data.
In the above embodiment, the original secret state difference value is shuffled, which further protects the security of the local feature data beyond the encryption itself.
In an optional embodiment, the determining of the secret state gradient value by using the updated secret state difference value and the training subset corresponding to it includes:
calculating an initial secret state gradient value by using the updated secret state difference value and the training subset corresponding to it;
and performing noise addition processing on the initial secret state gradient value to obtain the secret state gradient value.
The updating of the current model by using the target gradient value then comprises:
performing noise removal processing, corresponding to the noise addition processing, on the target gradient value to obtain a model gradient value;
and updating the current model by using the model gradient value.
In the above embodiment, noise is added before the secret state gradient value is transmitted to the central node, which improves the security of the local data.
In an optional embodiment, the method further comprises:
and sending the updated secret state difference value to the other training nodes, where it is used by each of the other training nodes to determine its own secret state gradient value.
In the above embodiment, the other training nodes can proceed with the subsequent gradient calculation without repeatedly determining the updated secret state difference value, so the computing resources required for training the other training nodes can be reduced.
In a second aspect, an embodiment of the present application provides a method for training a model based on federal learning, which is applied to a central node, and includes:
sending a first key to each training node, so that each training node processes the characteristic data carried by each training node according to the first key to determine secret intermediate data;
receiving an initial secret state difference value sent by a first training node in each training node, wherein the initial secret state difference value is determined by the first training node according to secret state intermediate data of each training node;
determining the screened index according to the initial secret state difference value;
sending the screened index to the first training node, wherein the screened index is used for determining an updated secret state difference value;
receiving the secret state gradient value, determined according to the updated secret state difference value, sent by each training node;
decrypting the secret gradient value to obtain a target gradient value;
and sending the target gradient value to each training node, wherein the target gradient value is used for updating the current model in each training node.
In an optional implementation manner, the determining the filtered index according to the initial secret state difference value includes:
screening out a preset number of values from the initial secret state data, wherein the sum of the absolute values of the preset number of values is greater than or equal to a specified value;
and determining the filtered index according to the bit sequence (positions) of the preset number of values in the initial secret state data.
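The screening just described (pick a preset number of values whose absolute values sum to at least the specified value, and return their positions) can be illustrated with a small sketch. Selecting the values with the largest absolute values is one simple way to satisfy the criterion; the function name is a hypothetical illustration.

```python
def filtered_index(values, preset_count):
    """Return the positions (0-based) of the preset_count values with the
    largest absolute values. Their absolute-value sum is the largest any
    subset of that size can achieve, so it meets the agreed threshold
    whenever any subset of that size does."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    return sorted(order[:preset_count])

initial_diff = [0.1, -2.5, 0.3, 1.8, -0.05, 0.9]
idx = filtered_index(initial_diff, 3)   # positions of -2.5, 1.8 and 0.9
```

In the patent's setting the central node would run this over decrypted difference values and send only `idx` back to the training node.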
In a third aspect, an embodiment of the present application provides a user type identification method, including:
and acquiring user data of the user to be identified.
And inputting the user data into the models of the training nodes, and determining the user type of the user to be recognized based on the output result of the models of the training nodes.
In a fourth aspect, an embodiment of the present application provides a model training apparatus based on federal learning, which is applied to a first training node, and includes:
the first processing module is used for processing external intermediate data and local feature data by using a first secret key to obtain initial secret state data, wherein the first secret key is obtained from a central node, the external intermediate data is obtained from other training nodes, and the initial secret state data comprises an initial secret state difference value;
the first transmission module is used for sending the initial secret state difference value to a central node and receiving a screened index sent by the central node, wherein the initial secret state data is used for determining the screened index;
a difference value determining module, configured to determine an updated secret state difference value according to the filtered index and the initial secret state data;
a gradient value determining module, configured to determine the secret state gradient value by using the updated secret state difference value and the training subset corresponding to it;
a second transmission module, configured to send the secret state gradient value to a central node and receive a target gradient value sent by the central node;
and the updating module is used for updating the current model by using the target gradient value.
In a fifth aspect, an embodiment of the present application provides a model training apparatus based on federal learning, which is applied to a central node, and includes:
the first sending module is used for sending a first key to each training node so that each training node can process the characteristic data carried by each training node according to the first key to determine secret intermediate data;
the first receiving module is used for receiving an initial secret state difference value sent by a first training node in each training node, wherein the initial secret state difference value is determined by the first training node according to secret state intermediate data of each training node;
the identification determining module is used for determining the screened index according to the initial secret state difference value;
a second sending module, configured to send the filtered index to the first training node, where the filtered index is used to determine an updated secret state difference value;
the second receiving module is used for receiving the secret state gradient value which is sent by each training node and determined according to the updated secret state difference value;
the first decryption module is used for decrypting the secret gradient value to obtain a target gradient value;
and a third sending module, configured to send the target gradient value to each of the training nodes, where the target gradient value is used to update a current model in each of the training nodes.
In a sixth aspect, an embodiment of the present application provides a user type identification apparatus, including:
and the acquisition module is used for acquiring the user data of the user to be identified.
And the input module is used for inputting the user data into the models of the training nodes and determining the user type of the user to be recognized based on the output result of the models of the training nodes.
In a seventh aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory storing machine-readable instructions executable by the processor, the machine-readable instructions being executable by the processor to perform the steps of the method described above when the electronic device is run.
In an eighth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the above-mentioned method.
According to the model training method and apparatus based on federated learning and the electronic device, when the secret state difference value used for calculating the gradient value is determined, the central node processes the difference value so that a shorter updated secret state difference value is obtained. This reduces the amount of computation subsequently required to calculate gradient values from the updated difference value, thereby improving the training efficiency of federated learning.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present application and should therefore not be considered as limiting the scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
FIG. 1 is a schematic diagram illustrating interaction in an operating environment of a federated learning-based model training method provided in an embodiment of the present application;
fig. 2 is a block diagram of an electronic device according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a federated learning-based model training method provided in an embodiment of the present application;
FIG. 4 is a functional module diagram of a model training apparatus based on federated learning provided in an embodiment of the present application;
FIG. 5 is another flowchart of a federated learning-based model training method provided in an embodiment of the present application;
FIG. 6 is a schematic diagram of another functional module of a model training apparatus based on federated learning according to an embodiment of the present application;
fig. 7 is a flowchart of a user type identification method according to an embodiment of the present application;
fig. 8 is a block diagram illustrating a user type identification apparatus according to an embodiment of the present application.
Detailed Description
The technical solution in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Federated learning allows participants holding different data to perform joint computation or machine learning on the premise that no participant can snoop on the data of the others, so the value of jointly modeling all parties' data can be fully exploited. In longitudinal (vertical) federated learning, the data owned by the participating parties share similar identities (e.g., the same users) but different features (e.g., height, weight, consumption level, etc.).
The inventor has appreciated that while federal learning can make full use of data owned by various participants, as the amount of data increases, the computational resources required for modeling multiply and the efficiency of modeling decreases dramatically. Based on this, the embodiment of the application provides a model training method based on federal learning, which can fully utilize data resources of each participant, and has relatively low demand on computing resources. The above method is described below by some examples.
To facilitate understanding of the present embodiment, a description will first be given of an operating environment for performing a federal learning based model training method disclosed in the embodiments of the present application.
Fig. 1 is a schematic diagram illustrating interaction in an operating environment of a model training method based on federal learning according to an embodiment of the present application. The runtime environment includes a central node 110 and two or more training nodes 120 in communication with the central node 110. The training node 120 or the central node 110 may be a server, some local terminals, various application service platforms, and the like. Illustratively, the server may be a web server, a database server, or the like, and the local terminal may be a Personal Computer (PC), a tablet, a smart phone, a Personal Digital Assistant (PDA), or the like.
The selection of each central node 110 and training node 120 may be different in order to achieve different modeling. For example, if the purpose of modeling is to build a model for identifying a user's reputation, each training node 120 may be a banking platform, various lending platforms, etc. For another example, if the modeling is performed to build a model for identifying user interests, each training node 120 may be a security guard platform, a shopping platform, a browser backend server, or the like.
Fig. 2 is a schematic block diagram of an electronic device. The electronic device 200 may include a memory 211 and a processor 213. It will be understood by those skilled in the art that the structure shown in fig. 2 is merely illustrative and is not intended to limit the structure of the electronic device 200. For example, the electronic device 200 may include more or fewer components than shown in fig. 2, or have a different configuration than shown in fig. 2.
The aforementioned elements of the memory 211 and the processor 213 are electrically connected to each other directly or indirectly to achieve data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The processor 213 described above is used to execute the executable modules stored in the memory.
The memory 211 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 211 is configured to store a program, and the processor 213 executes the program after receiving an execution instruction. The method executed by the electronic device 200 as defined in any embodiment of the present application may be applied to, or implemented by, the processor 213.
The processor 213 may be an integrated circuit chip having signal processing capability. The processor 213 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, which can implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The training nodes and central nodes shown in fig. 1 may include various components provided in the electronic device 200 shown in fig. 2, and of course, when the training nodes and central nodes are different devices, the training nodes and central nodes may include more components than the electronic device 200 shown in fig. 2. For example, when the training node is a personal computer, the personal computer may further include a display unit, a positioning unit, and the like.
The electronic device 200 in this embodiment may be configured to perform each step in each method provided in this embodiment. The implementation of the federal learning based model training method is described in detail below by way of several embodiments.
Please refer to fig. 3, which is a flowchart of a model training method based on federal learning according to an embodiment of the present application. The method in this embodiment may be applied to one of the training nodes, e.g., the first training node. The specific flow shown in fig. 3 will be described in detail below.
Step 310, processing the external intermediate data and the local feature data by using the first key to obtain initial secret state data.
The first key in this embodiment may be obtained from the central node. Before model training, the central node may first send the first key to each training node that needs to participate in model training. For example, the first key may be a public key in asymmetric encryption, the central node may store a second key, and the second key may be a corresponding private key of the public key and may be stored in the central node.
The external intermediate data is obtained from other training nodes. Illustratively, the external intermediate data may be intermediate data calculated by another training node: the feature data held by the external training node is input into that node's current model for calculation, and the result is then encrypted with the first key received from the central node to obtain the encrypted intermediate data.
Optionally, the number of sets of external intermediate data is related to the number of training nodes participating in federated learning. Illustratively, if the number of training nodes is M, the number of sets of external intermediate data is M-1. For example, if the number of training nodes is 2, there is one set of external intermediate data.
The following description will be given taking the number of training nodes participating in federal learning as 2 as an example.
In one embodiment, the initial secret state data may include an initial secret state difference value, which may be expressed as:
g = a1·Z + a2·y + a3;
Z = uA + uB + b;
uA = XA·wA;
where g represents the initial secret state difference value; a1, a2, and a3 are a set of determined constants; y represents the real classification label values corresponding to the feature data currently used in the first training node; b represents a model parameter and is a scalar; XA is a two-dimensional matrix of the feature data owned by the first training node for training the model; wA represents the model parameter vector of the first training node, whose length equals the number of features of a single sample; and uB represents the external intermediate data.
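As a numeric illustration of the formulas above, the following sketch computes g on toy data with plain Python lists; all numbers and dimensions are made up for the example, and uB is a plaintext stand-in for what would really be encrypted data.

```python
# toy dimensions: 4 samples, 2 features; every number is made up
X_A = [[1.0, 0.5], [0.2, 1.0], [0.7, 0.3], [0.4, 0.9]]   # feature matrix XA
w_A = [0.6, -0.2]                                        # model parameter vector wA
b = 0.1                                                  # scalar model parameter
u_B = [0.05, -0.1, 0.2, 0.0]   # external intermediate data (plaintext stand-in)
y = [1.0, 0.0, 1.0, 0.0]       # real classification labels
a1, a2, a3 = 0.25, -1.0, 0.5   # the agreed constants

u_A = [sum(x * w for x, w in zip(row, w_A)) for row in X_A]   # uA = XA . wA
Z = [ua + ub + b for ua, ub in zip(u_A, u_B)]                 # Z = uA + uB + b
g = [a1 * z + a2 * yi + a3 for z, yi in zip(Z, y)]            # g = a1 Z + a2 y + a3
```

Note that g has one entry per training sample, which is why the later screening step can shorten it.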
Alternatively, the external intermediate data may be calculated in another training node as follows:
uB0 = XB·wB;
and encrypting uB0 with the first key yields the external intermediate data uB.
In another embodiment, to improve data security in the first training node, the calculation result of a1·Z + a2·y + a3 may be further processed, and the processed result taken as the initial secret state difference value.
Exemplarily, g = a1·Z + a2·y + a3 may be taken as the original secret state difference value, and out-of-order (shuffling) processing performed on it to obtain the initial secret state difference value and the mapping relationship data.
For example, the order of the values in g may be scrambled, the scrambled sequence g' used as the initial secret state difference value, and the mapping between each value's original position and its scrambled position retained.
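The shuffling with a retained mapping can be sketched as follows; the fixed seed `random.Random(42)` and the dictionary representation of the mapping are illustrative choices only.

```python
import random

rng = random.Random(42)          # fixed seed, illustrative only
g = [-0.34, 0.48, -0.33, 0.54]   # original secret-state difference value

positions = list(range(len(g)))
rng.shuffle(positions)
g_shuffled = [g[p] for p in positions]                      # the scrambled g'
mapping = {new: old for new, old in enumerate(positions)}   # slot -> original position

# sanity check: the retained mapping lets the node restore the original order
restored = [0.0] * len(g)
for new, old in mapping.items():
    restored[old] = g_shuffled[new]
```

Only `g_shuffled` leaves the node; `mapping` stays local, so the central node never learns the original ordering.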
Step 320, sending the initial secret state difference value to a central node, and receiving the filtered index sent by the central node.
The initial dense state difference is used to determine the filtered index.
In order to reduce the computing resources required by the training node, the initial secret state difference value may be sent to the central node, so that the central node screens out a subset of its values for subsequent computation.
For example, the filtered index may represent the bit sequence (positions) of the screened difference values. If the initial secret state difference value includes 100 values and a proportion of them, for example 30%, needs to be selected, the filtered index may be thirty position identifiers. If the selected values are the first thirty values, the filtered index may be the thirty position indices from 1 to 30.
Alternatively, the filtered index may represent the screened values themselves. If the initial secret state difference value includes 200 values and a proportion of them, for example 50%, is selected, the filtered index may be one hundred values.
Illustratively, the number of differences selected from the initial secret state difference value may be set as required; for example, the selected differences may account for 30% to 50% of the initial secret state difference value, and more or fewer values may of course be selected according to actual requirements.
Alternatively, the criterion for the central node to screen updated differences from the initial secret state difference value may be that the sum of the absolute values of the screened values is greater than a specified value. Illustratively, it is agreed in advance that a preset number of values needs to be screened out, and the sum of the absolute values of the preset number of values is greater than or equal to the specified value.
The specified value may be determined from the values carried in the initial secret state difference value. For example, the specified value may be the preset number times the average of the absolute values of the values carried in the initial secret state difference value; if the preset number is 40, the specified value may be 40 times that average.
For another example, the specified value may be greater than the preset number times the average of the absolute values of the values carried in the initial secret state difference value; if the preset number is 50, the specified value may be 55, 58, or 60 times that average.
For another example, the specified value may be equal to the sum of the preset number of values that rank highest when the values carried in the initial secret state difference value are sorted by absolute value.
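A quick numeric illustration of the first rule (specified value = preset number × average absolute value), together with a check that the top-ranked values meet it; all numbers are made up.

```python
initial_diff = [0.1, -2.5, 0.3, 1.8, -0.05, 0.9]
preset_number = 3

mean_abs = sum(abs(v) for v in initial_diff) / len(initial_diff)   # 5.65 / 6
specified_value = preset_number * mean_abs                          # 2.825

# the three largest absolute values (2.5, 1.8, 0.9) comfortably exceed it
top3_sum = sum(sorted((abs(v) for v in initial_diff), reverse=True)[:3])
```

By construction the preset number of largest-magnitude values always has the largest possible absolute sum, so this rule is always satisfiable.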
Illustratively, the screening criterion for the filtered index may be: sort by absolute value and select the values whose absolute values rank within the first preset number of positions, taking the indices of those values as the filtered index.
Illustratively, the screening criterion may also be: sort by absolute value and select the values whose absolute values rank within a first preset number of positions, then randomly screen a further second preset number of values from the initial secret state difference value; the indices of the first preset number of values and of the second preset number of values are together taken as the filtered index. The sum of the first preset number and the second preset number equals the preset number.
And 330, determining an updated secret state difference value according to the screened index and the initial secret state data.
For example, when the filtered index consists of the screened values themselves, the out-of-order values may be restored to their original order to obtain the updated secret state difference value.
For example, the filtered index may instead be the position sequence numbers of the screened values, in which case the updated secret state difference value may be determined from the initial secret state difference value according to the filtered index and the mapping relationship data.
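Recovering the updated difference value from the filtered index and the mapping relationship data might look like the following; the concrete numbers and the dictionary mapping format are illustrative assumptions.

```python
# shuffled initial difference value and the retained mapping
# (shuffled slot -> original position); numbers are illustrative
g_shuffled = [0.54, -0.34, 0.48, -0.33]
mapping = {0: 3, 1: 0, 2: 1, 3: 2}
filtered = [0, 2]        # slots kept by the central node

# keep only the screened slots, then restore their original order
kept = [(mapping[slot], g_shuffled[slot]) for slot in filtered]
updated_diff = [value for _, value in sorted(kept)]
```

The central node only ever sees shuffled slot numbers; the training node alone can translate them back to sample positions.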
Step 340, determining a dense state gradient value by using the updated dense state difference value and the training subset corresponding to the updated dense state difference value.
In one embodiment, the secret gradient value of the first training node may be determined as follows:
ΔwA = [X_topAk]^T · g_topAk
wherein [X_topAk]^T represents the transpose of the data subset corresponding to the updated dense state difference value of the first training node; g_topAk represents the updated dense state difference value of the first training node; and ΔwA represents the calculation result of the updated dense state difference value and the training subset corresponding to the updated dense state difference value;
the secret gradient value of the first training node may be equal to the updated secret difference value and the computation result of the training subset corresponding to the updated secret difference value.
In another embodiment, noise enhancement processing is performed on an initial dense state gradient value to obtain the dense state gradient value. That is, the dense state gradient value may equal the calculation result of the updated dense state difference value and its corresponding training subset, combined with a first noise value of the first training node.
For example, Δw'A = ΔwA + noiseA,
where noiseA represents the noise value used in the noise enhancement processing of the first training node.
Step 350, sending the dense gradient value to a central node, and receiving a target gradient value sent by the central node.
The central node decrypts the received dense state gradient value to obtain a target gradient value in the non-encrypted state.
Illustratively, the central node may decrypt the secret gradient value using a second key corresponding to the owned first key.
Step 360: updating the current model by using the target gradient value.
Optionally, step 360 may include step 361 and step 362.
Step 361: performing noise reduction processing corresponding to the noise enhancement processing on the target gradient value to obtain a model gradient value.
For example, the first noise value used in the noise enhancement processing may be subtracted from the target gradient value to obtain the model gradient value.
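The noise enhancement and the corresponding noise reduction form a simple additive-mask round trip: the node adds a locally kept noise value before sending, and subtracts the same value after the central node returns the decrypted result. A plaintext sketch (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
delta_w = np.array([0.3, -1.2, 0.8])          # plaintext stand-in for the gradient
noise_A = rng.normal(size=delta_w.shape)      # first noise value, kept only locally

masked = delta_w + noise_A                    # noise-enhanced gradient sent out
# ... the central node decrypts `masked` and returns it as the target gradient ...
recovered = masked - noise_A                  # noise reduction at the training node
```

Since the central node only ever sees the masked value, it learns nothing about the true gradient, while the training node recovers it exactly.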
Step 362, updating the current model using the model gradient value.
The weights of the current model of the first training node are updated with the model gradient value, thereby obtaining the updated model of the first training node.
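The weight update itself is a plain gradient-descent step. A minimal sketch, assuming a fixed learning rate (the patent does not specify the update rule or hyperparameters):

```python
import numpy as np

def update_weights(w, model_gradient, lr=0.1):
    # One gradient-descent step; the learning rate is an assumed hyperparameter.
    return w - lr * model_gradient

w = np.array([1.0, -0.5, 2.0])
grad = np.array([0.2, 0.4, -1.0])
w_new = update_weights(w, grad)
```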
In this embodiment, the other training nodes may also use the updated dense state difference value to calculate their corresponding dense state gradient values. Illustratively, the federal learning based model training method may further include: Step 370, sending the updated dense state difference value to the other training nodes, where the updated dense state difference value is used by each of the other training nodes to determine its respective dense state gradient value.
For example, other training nodes may also calculate the dense gradient values in a similar manner as the first training node.
Taking a model training method of federal learning involving two training nodes as an example, the other training node may be referred to as the second training node.
In one embodiment, the dense gradient value of the second training node may be calculated as follows:
ΔwB = [X_topBk]^T · g_topBk
wherein [X_topBk]^T represents the transpose of the data subset corresponding to the updated dense state difference value of the second training node; g_topBk represents the updated dense state difference value of the second training node; and ΔwB represents the calculation result of the updated dense state difference value and the training subset corresponding to the updated dense state difference value;
the secret gradient value of the second training node may be equal to the updated secret difference value and the calculation result of the training subset corresponding to the updated secret difference value.
In another embodiment, noise enhancement processing is performed on an initial dense state gradient value to obtain the dense state gradient value. That is, the dense state gradient value may equal the calculation result of the updated dense state difference value and its corresponding training subset, combined with a noise value of the second training node.
For example, Δw'B = ΔwB + noiseB,
where noiseB represents the noise value used in the noise enhancement processing of the second training node.
The gradient values of the second training node may then be calculated in the manner of steps 350 and 360.
In order to understand the model training situation, the loss value of the training model can also be calculated. The model training method based on federal learning may further include the following steps:
Step 380: determining the secret state loss value of the current model according to the external intermediate data, the related data of the external intermediate data, the local feature data, and the classification label values.
Illustratively, the correlation data of the external intermediate data may be a square of the external intermediate data.
Step 380 can be calculated by the following formula:
Loss = f(y, uA, uB, uB², b);
where f () represents a loss function.
Step 390, sending the secret state loss value to a central node, so that the central node decrypts the secret state loss value to obtain a target loss value.
Optionally, the target loss value may be output to a display device for displaying, so that the relevant user can know the model training condition.
The model training method based on federal learning in this embodiment may be a training mode of logistic regression; the training mode of this embodiment can accelerate the training of logistic regression and thus improve training efficiency.
In the model training method based on federal learning provided in the embodiments of the application, when the dense state difference value used for calculating the gradient value is determined, the central node can screen the dense state difference value to obtain a shorter updated dense state difference value. This reduces the amount of calculation subsequently required to compute the gradient value from the updated dense state difference value, thereby improving the training efficiency of federal learning.
In the embodiments of the application, the number of elements in the difference value g is reduced at the central node through screening, so that the amount of homomorphic encryption calculation is reduced when each training node computes its gradient value, reducing computation cost and improving operation efficiency. The method of this embodiment can significantly reduce the secret state calculation amount of the algorithm; as the ratio of the preset number k to the number of elements of g decreases, the time consumed by secret state calculation decreases proportionally. Therefore, when the amount of training data is large, the performance improvement of the algorithm is considerable, as shown in Table 1 below:
TABLE 1
top_k % time per epoch (s)
10% 29.019
20% 41.404
30% 55.302
40% 69.369
50% 80.714
60% 95.579
no select 158.673
Here, top_k% represents the ratio of the number of values corresponding to the filtered index to the number of values in the initial dense state difference value, and time per epoch represents the time consumed by each round of training.
When the initial dense state difference value is not screened, each round of training consumes 158.673 s (approximately 160 s).
Using the method of the embodiments of the application, the amount of calculation is reduced while convergence of the training model is still ensured; under certain conditions, the convergence speed and convergence precision can even exceed those obtained using all elements of the difference value g. See the following examples for specific data.
Comparing results of Accuracy:
when the ratio of the screening indexes in the initial dense state difference value is 30% -50%, the result evaluation index accuracy and f1 converge fastest, and specifically, the evaluation index accuracy can reach more than 0.95 after 40 rounds of epochs training. Compared with the prior art, the speed of convergence is slower instead of screening and selecting the initial dense state difference value, and specifically, the evaluation index accuracy can reach 0.95 after 100 rounds of epochs training.
The details are shown in tables 2 to 4 below:
TABLE 2
(Table 2 is provided as an image in the original publication; the accuracy comparison data are not reproducible here.)
Comparison of AUC results:
TABLE 3
(Table 3 is provided as an image in the original publication; the AUC comparison data are not reproducible here.)
f1 comparison of results:
TABLE 4
(Table 4 is provided as an image in the original publication; the f1 comparison data are not reproducible here.)
As the data in the tables show, the method of the embodiments of the application can ensure convergence of the training model while reducing the amount of calculation; under certain conditions, the convergence speed and accuracy are even improved.
Based on the same application concept, a federate learning-based model training device corresponding to the federate learning-based model training method is further provided in the embodiment of the present application, and as the principle of solving the problem of the device in the embodiment of the present application is similar to that of the federate learning-based model training method, the implementation of the device in the embodiment of the present application may refer to the description in the embodiment of the method, and repeated details are omitted.
Please refer to fig. 4, which is a schematic diagram of the functional modules of a model training apparatus based on federal learning according to an embodiment of the present application. The modules in the model training apparatus based on federal learning in this embodiment are used for executing the steps in the above method embodiments. The model training apparatus based on federal learning comprises: a first processing module 410, a first transmission module 420, a difference determination module 430, a gradient value determination module 440, a second transmission module 450, and an update module 460; wherein:
a first processing module 410, configured to process external intermediate data and local feature data using a first key to obtain initial secret state data, where the first key is obtained from a central node, the external intermediate data is obtained from other training nodes, and the initial secret state data includes an initial secret state difference value;
a first transmission module 420, configured to send the initial secret state difference value to a central node, and receive a filtered index sent by the central node, where the initial secret state data is used to determine the filtered index;
a difference determining module 430, configured to determine an updated secret state difference according to the filtered index and the initial secret state data;
a gradient value determining module 440, configured to determine a dense gradient value by using the updated dense difference value and the training subset corresponding to the updated dense difference value;
a second transmission module 450, configured to send the dense gradient value to a central node, and receive a target gradient value sent by the central node;
an updating module 460, configured to update the current model with the target gradient value.
In an optional implementation manner, the difference determining module 430 is configured to determine an updated dense state difference value from the initial dense state difference value according to the filtered index and the mapping relationship data.
In an optional implementation, the first processing module 410 is configured to:
calculating external intermediate data and local feature data in the first training node by using the first key to obtain an original secret state difference value;
and carrying out error sequence processing on the original secret state difference value to obtain an initial secret state difference value and mapping relation data.
In an alternative embodiment, the difference determining module 430 is configured to:
calculating to obtain an initial dense state gradient value by using the updated dense state difference value and a training subset corresponding to the updated dense state difference value;
carrying out noise enhancement processing on the initial dense-state gradient value to obtain a dense-state gradient value;
the updating the current model using the target gradient value comprises:
noise reduction processing corresponding to the noise enhancement processing is carried out on the target gradient value to obtain a model gradient value;
and updating the current model by using the model gradient value.
In an optional implementation manner, the federal learning-based model training apparatus in this embodiment may further include an updated secret state difference value sending module, configured to:
and sending the updated dense state difference value to other training nodes, wherein the updated dense state difference value is used for determining the respective dense state gradient value of each other training node.
Please refer to fig. 5, which is a flowchart of a model training method based on federal learning according to an embodiment of the present application. The federal learning based model training method provided by this embodiment is similar to that of the previous embodiment; the difference is that the method of this embodiment is described from the perspective of the central node. The specific flow shown in fig. 5 will be described in detail below.
Step 510, sending a first key to each training node, so that each training node processes the characteristic data carried by each training node according to the first key, and determining secret intermediate data.
Step 520, receiving the initial secret state difference value sent by the first training node in each training node.
The initial secret state difference value is determined by the first training node according to the secret state intermediate data of each training node.
Step 530: determining the filtered index according to the initial secret state difference value.
Optionally, step 530 may include step 531 and step 532.
Step 531, screening out a preset number of numerical values from the initial dense state difference value, wherein the sum of absolute values of the preset number of numerical values is greater than or equal to a specified numerical value.
Alternatively, the filtering criterion for the filtered index may be: sort the values by absolute value and select the values whose absolute values rank within the top preset number; the indexes of those values are taken as the filtered index.
Alternatively, the filtering criterion for the filtered index may be: sort the values by absolute value, select the values whose absolute values rank within a first preset number, and then randomly select a second preset number of values from the remaining values of the initial dense state difference value; the indexes of both groups of values are taken as the filtered index. The sum of the first preset number and the second preset number equals the preset number.
For example, the initial dense state difference value includes 200 elements, the predetermined number is 60, the first predetermined number is 30, and the second predetermined number is also 30. Then 30 elements with absolute value ordering first 30 can be selected from 200 elements and then 30 elements can be randomly selected from the remaining 170 elements.
For another example, the initial dense state difference value includes 200 elements, the preset number is 60, the first preset number is 30, and the second preset number is also 30. Then the 30 elements ranking first by absolute value can be selected from the 200 elements, and 30 more elements can be randomly selected from the 140 elements ranking last by absolute value.
Step 532: determining the filtered index according to the bit sequence of the preset number of values in the initial dense state data.
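Steps 531 and 532 can be sketched in plaintext (the central node actually screens ciphertexts under an order-revealing scheme; here a plaintext vector of absolute values stands in, and all names and the random-supplement rule are illustrative):

```python
import numpy as np

def filter_indices(g_abs, k1, k2, seed=0):
    # Select the indexes of the k1 largest absolute values, then k2 more indexes
    # drawn at random from the remaining positions (k1 + k2 = preset number).
    order = np.argsort(-g_abs)               # positions sorted by |value|, descending
    top = order[:k1]
    rng = np.random.default_rng(seed)
    rest = rng.choice(order[k1:], size=k2, replace=False)
    return np.concatenate([top, rest])

g = np.array([0.5, -2.0, 0.1, 3.0, -0.7, 0.05, 1.1, -0.02])
idx = filter_indices(np.abs(g), k1=2, k2=2)  # filtered index returned to the node
```

Mixing a random supplement into the pure top-k selection keeps some gradient signal from the smaller elements, which is one plausible reason the screened variants can still converge well.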
Step 540: sending the filtered index to the first training node, where the filtered index is used for determining an updated secret state difference value.
Step 550: receiving the dense state gradient values, determined according to the updated dense state difference values, sent by each training node.
Step 560: decrypting the dense state gradient values to obtain target gradient values.
Step 570, sending the target gradient value to each of the training nodes, where the target gradient value is used to update the current model in each of the training nodes.
For other details of this embodiment, reference may be made to the description in the federal learning based model training method applied to the first training node, and repeated details are not repeated.
Based on the same application concept, a federate learning-based model training device corresponding to the federate learning-based model training method is further provided in the embodiment of the present application, and as the principle of solving the problem of the device in the embodiment of the present application is similar to that of the federate learning-based model training method, the implementation of the device in the embodiment of the present application may refer to the description in the embodiment of the method, and repeated details are omitted.
Please refer to fig. 6, which is a schematic diagram of the functional modules of a model training apparatus based on federal learning according to an embodiment of the present application. The modules in the model training apparatus based on federal learning in this embodiment are used for executing the steps in the above method embodiments. The model training apparatus based on federal learning comprises: a first sending module 610, a first receiving module 620, an identification determining module 630, a second sending module 640, a second receiving module 650, a first decryption module 660, and a third sending module 670; wherein:
a first sending module 610, configured to send a first key to each training node, so that each training node processes feature data carried by the training node according to the first key to determine secret-state intermediate data;
a first receiving module 620, configured to receive an initial secret state difference value sent by a first training node in each training node, where the initial secret state difference value is determined by the first training node according to secret state intermediate data of each training node;
an identifier determining module 630, configured to determine the screened index according to the initial secret state difference;
a second sending module 640, configured to send the filtered index to the first training node, where the filtered index is used to determine an updated secret state difference value;
a second receiving module 650, configured to receive the dense state gradient value determined according to the updated dense state difference value and sent by each training node;
the first decryption module 660 is configured to decrypt the secret gradient value to obtain a target gradient value;
a third sending module 670, configured to send the target gradient value to each training node, where the target gradient value is used to update a current model in each training node.
In a possible implementation, the identification determining module 630 is configured to:
screening out a preset number of numerical values from the initial dense-state data, wherein the sum of the absolute values of the preset number of numerical values is greater than or equal to a specified numerical value;
and determining the screened index according to the bit sequence of the numerical values of the preset number in the initial secret state data.
Please refer to fig. 7, which is a flowchart illustrating a user type identification method according to an embodiment of the present application. The specific flow shown in fig. 7 will be described in detail below.
Step 710, obtaining user data of the user to be identified.
And 720, inputting the user data into the models of the training nodes, and determining the user type of the user to be recognized based on the output result of the models of the training nodes.
For example, the user types may differ depending on the target application scenario.
In one example, the user type identification method may be used in a loan scenario to identify whether a user applying for a loan is creditworthy. The user types may include trusted users and untrusted users.
In another example, the user type identification method may be used to identify whether a user is interested in a target web page. The user types may include users interested in the target web page and users not interested in the target web page.
Based on the same application concept, a user type identification device corresponding to the user type identification method is further provided in the embodiments of the present application, and since the principle of solving the problem of the device in the embodiments of the present application is similar to that in the embodiments of the user type identification method, the implementation of the device in the embodiments of the present application may refer to the description in the embodiments of the method, and repeated details are omitted.
Please refer to fig. 8, which is a schematic diagram of the functional modules of a user type identification apparatus according to an embodiment of the present application. Each module in the user type identification apparatus in this embodiment is configured to perform each step in the above method embodiment. The user type identification apparatus includes: an acquisition module 810 and an input module 820; wherein:
an obtaining module 810, configured to obtain user data of a user to be identified.
An input module 820, configured to input the user data into the model of each training node, and determine the user type of the user to be recognized based on an output result of the model of each training node.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the federal learning based model training method in the above method embodiment.
The computer program product of the model training method based on federal learning provided in the embodiments of the present application includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute steps of the model training method based on federal learning described in the above method embodiments, which may be specifically referred to in the above method embodiments, and details are not described here.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. A model training method based on federal learning is characterized in that the model training method is applied to a first training node and comprises the following steps:
processing external intermediate data and local feature data by using a first key to obtain initial secret state data, wherein the first key is obtained from a central node, the external intermediate data is obtained from other training nodes, and the initial secret state data comprises an initial secret state difference value;
sending the initial secret state difference value to a central node, and receiving a screened index sent by the central node, wherein the initial secret state data is used for determining the screened index;
determining an updated secret state difference value according to the screened index and the initial secret state data;
determining a dense state gradient value by using the updated dense state difference value and a training subset corresponding to the updated dense state difference value;
sending the dense gradient value to a central node, and receiving a target gradient value sent by the central node;
and updating the current model by using the target gradient value.
2. The method of claim 1, wherein the initial secret state data further comprises mapping relationship data, and wherein determining an updated secret state difference value from the filtered index and the initial secret state data comprises:
and determining an updated dense state difference value from the initial dense state difference value according to the screened index and the mapping relation data.
3. The method of claim 2, wherein processing the external intermediate data and the local feature data using the first key to obtain initial secret data comprises:
calculating external intermediate data and local feature data in the first training node by using the first key to obtain an original secret state difference value;
and carrying out error sequence processing on the original secret state difference value to obtain an initial secret state difference value and mapping relation data.
4. The method of claim 1, wherein determining the dense state gradient values using the updated dense state difference values and the training subsets corresponding to the updated dense state difference values comprises:
calculating to obtain an initial dense state gradient value by using the updated dense state difference value and a training subset corresponding to the updated dense state difference value;
carrying out noise enhancement processing on the initial dense-state gradient value to obtain a dense-state gradient value;
the updating the current model using the target gradient value comprises:
noise reduction processing corresponding to the noise enhancement processing is carried out on the target gradient value to obtain a model gradient value;
and updating the current model by using the model gradient value.
5. The method of claim 1, further comprising:
and sending the updated secret state difference value to other training nodes, wherein the updated secret state difference value is used for determining the respective secret state gradient value of each other training node.
6. A model training method based on federated learning, applied to a central node, the method comprising:
sending a first key to each training node, so that each training node processes its own feature data according to the first key to determine encrypted intermediate data;
receiving an initial encrypted difference value sent by a first training node among the training nodes, wherein the initial encrypted difference value is determined by the first training node according to the encrypted intermediate data of each training node;
determining filtered indices according to the initial encrypted difference value;
sending the filtered indices to the first training node, wherein the filtered indices are used for determining an updated encrypted difference value;
receiving, from each training node, an encrypted gradient value determined according to the updated encrypted difference value;
decrypting the encrypted gradient value to obtain a target gradient value; and
sending the target gradient value to each training node, wherein the target gradient value is used for updating the current model in each training node.
7. The method of claim 6, wherein the determining filtered indices according to the initial encrypted difference value comprises:
selecting a preset number of values from the initial encrypted data, wherein the sum of the absolute values of the preset number of values is greater than or equal to a specified value; and
determining the filtered indices according to the positions of the preset number of values in the initial encrypted data.
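One plausible reading of the selection rule in claim 7, shown on plaintext values for clarity (the patent applies it to encrypted data): keep the preset number of largest-magnitude entries and verify that their absolute sum reaches the specified value, then report their original positions. The sketch and its names are illustrative assumptions, not the patented implementation:

```python
def filtered_indices(values, preset_count, specified_value):
    """Select preset_count values with the largest absolute value and
    return their original positions ('filtered indices'), checking that
    the sum of their absolute values reaches specified_value."""
    # positions ordered by descending magnitude
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    picked = order[:preset_count]
    if sum(abs(values[i]) for i in picked) < specified_value:
        raise ValueError("preset number of values does not reach the specified sum")
    return sorted(picked)

# usage: the two largest-magnitude entries sit at positions 1 and 3
vals = [0.1, -2.0, 0.3, 1.5]
print(filtered_indices(vals, preset_count=2, specified_value=3.0))  # → [1, 3]
```

Selecting by magnitude in this way acts as a sparsification step: only the difference components that contribute most are carried forward to the gradient computation.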
8. A method for identifying a user type, comprising:
acquiring user data of a user to be identified; and
inputting the user data into models obtained by training each training node with the federated-learning-based model training method according to any one of claims 1 to 7, and determining the user type of the user to be identified based on the output results of the models of the training nodes.
9. A model training apparatus based on federated learning, applied to a first training node, the apparatus comprising:
a first processing module, configured to process external intermediate data and local feature data by using a first key to obtain initial encrypted data, wherein the first key is obtained from a central node, the external intermediate data is obtained from other training nodes, and the initial encrypted data comprises an initial encrypted difference value;
a first transmission module, configured to send the initial encrypted difference value to the central node and to receive filtered indices sent by the central node, wherein the initial encrypted data is used for determining the filtered indices;
a difference determining module, configured to determine an updated encrypted difference value according to the filtered indices and the initial encrypted data;
a gradient determining module, configured to determine an encrypted gradient value by using the updated encrypted difference value and a training subset corresponding to the updated encrypted difference value;
a second transmission module, configured to send the encrypted gradient value to the central node and to receive a target gradient value sent by the central node; and
an updating module, configured to update the current model by using the target gradient value.
10. A model training apparatus based on federated learning, applied to a central node, the apparatus comprising:
a first sending module, configured to send a first key to each training node, so that each training node processes its own feature data according to the first key to determine encrypted intermediate data;
a first receiving module, configured to receive an initial encrypted difference value sent by a first training node among the training nodes, wherein the initial encrypted difference value is determined by the first training node according to the encrypted intermediate data of each training node;
an index determining module, configured to determine filtered indices according to the initial encrypted difference value;
a second sending module, configured to send the filtered indices to the first training node, wherein the filtered indices are used for determining an updated encrypted difference value;
a second receiving module, configured to receive, from each training node, an encrypted gradient value determined according to the updated encrypted difference value;
a first decryption module, configured to decrypt the encrypted gradient value to obtain a target gradient value; and
a third sending module, configured to send the target gradient value to each training node, wherein the target gradient value is used for updating the current model in each training node.
11. A user type identification apparatus, comprising:
an acquisition module, configured to acquire user data of a user to be identified; and
an input module, configured to input the user data into models obtained by training each training node with the federated-learning-based model training method according to any one of claims 1 to 7, and to determine the user type of the user to be identified based on the output results of the models of the training nodes.
12. An electronic device, comprising: a processor, and a memory storing machine-readable instructions executable by the processor, wherein the machine-readable instructions, when executed by the processor while the electronic device is running, perform the steps of the method according to any one of claims 1 to 8.
13. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, performs the steps of the method according to any one of claims 1 to 8.
CN202210506266.8A 2022-05-11 2022-05-11 Model training method and device based on federated learning, and electronic device Pending CN114638377A (en)

Publications (1)

Publication Number Publication Date
CN114638377A (en) 2022-06-17

Family

ID=81953159



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220617)