CN111178524B

CN111178524B - Data processing method, device, equipment and medium based on federal learning

Info

Publication number: CN111178524B
Application number: CN201911346900.0A
Authority: CN
Inventors: 董厶溢
Original assignee: Ping An Life Insurance Company of China Ltd
Current assignee: Ping An Life Insurance Company of China Ltd
Priority date: 2019-12-24
Filing date: 2019-12-24
Publication date: 2024-06-14
Anticipated expiration: 2039-12-24
Also published as: CN111178524A

Abstract

The invention discloses a data processing method, a device, equipment and a medium based on federal learning, which comprise the following steps executed by a first terminal: determining user characteristic data common to the first terminal and the second terminal; performing feature coding processing on the user feature data to obtain feature data to be processed; obtaining a model predicted value obtained by processing based on the feature data to be processed; acquiring a loss value obtained by processing training tag data and a model predicted value by adopting a predefined loss function; if the loss value is the external loss value, the external loss value is sent to the second terminal in an encrypted mode; if the loss value is an internal loss value, determining a target gradient based on the internal loss value and current model parameters corresponding to the first model to be trained, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model. The data processing method based on federal learning can effectively improve model training efficiency and accuracy.

Description

Data processing method, device, equipment and medium based on federal learning

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a data processing method, device, equipment and medium based on federal learning.

Background

With the widespread use of artificial intelligence techniques, various machine learning methods have been developed. At present, a novel learning method, namely federal learning, is provided for realizing information interaction and model learning on the premise of solving the problem of data island and realizing that sensitive data is not provided externally. Federal learning (FEDERATED LEARNING) is an emerging artificial intelligence basic technology, and the design goal is to develop high-efficiency machine learning among multiple participants or multiple computing nodes on the premise of guaranteeing information security during big data exchange and legal compliance.

At present, when some financial institutions train models, because the user data of the financial institutions cannot meet the training requirement, the user data of other financial institutions are required to be subjected to deep learning, and in order to ensure the safety of the user data, the federal learning thought is adopted for model training so as to carry out model training by learning the user data provided by different data parties. Because the data adopted by the current model training is from multiparty institutions (namely from different data parties), and for different data parties, the correlation of the data owned by each data party is different, the data scalar is different and the data type is different, therefore, the learning rate of the model training of each data party is greatly different, so that great fluctuation occurs in the model learning process, the learning efficiency is lower, namely the model training efficiency is low, the cost is higher when each institution trains the model, and the model training accuracy is lower because the key information of multiparty data is different, the same set of optimization algorithm is adopted for optimization.

Disclosure of Invention

The embodiment of the invention provides a data processing method, device, equipment and medium based on federal learning, which are used for solving the problem of low training efficiency and accuracy when a data training model provided by multiple data parties is adopted at present.

A data processing method based on federal learning comprises the following steps executed by a first terminal:

determining user characteristic data common to the first terminal and the second terminal; the first terminal corresponds to a first model to be trained; the second terminal corresponds to a second model to be trained; the first terminal comprises training tag data corresponding to the user characteristic data;

performing feature coding processing on the user feature data to obtain feature data to be processed;

Obtaining a model predictive value obtained by processing based on the feature data to be processed;

Obtaining a loss value obtained by processing the training tag data and the model predicted value by adopting a predefined loss function;

If the loss value is an external loss value, the external loss value is sent to a second terminal in an encrypted mode, so that the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient;

And if the loss value is an internal loss value, determining a target gradient based on the internal loss value and current model parameters corresponding to the first model to be trained, and carrying out model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.

A federal learning-based data processing apparatus, comprising:

the user characteristic data determining module is used for determining user characteristic data shared by the first terminal and the second terminal; the first terminal corresponds to a first model to be trained; the second terminal corresponds to a second model to be trained; the first terminal comprises training tag data corresponding to the user characteristic data;

The feature coding module is used for carrying out feature coding processing on the user feature data to obtain feature data to be processed;

the model predicted value acquisition module is used for acquiring a model predicted value obtained by processing based on the feature data to be processed;

The loss value acquisition module is used for acquiring a loss value obtained by processing the training tag data and the model predicted value by adopting a predefined loss function;

The external loss value sending module is used for sending the external loss value to a second terminal in an encrypted mode if the loss value is the external loss value, so that the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient;

And the target prediction model optimization module is used for determining a target gradient based on the internal loss value and the current model parameter corresponding to the first model to be trained if the loss value is the internal loss value, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.

A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the federal learning-based data processing method described above when the computer program is executed.

A computer storage medium storing a computer program which, when executed by a processor, performs the steps of the federal learning-based data processing method described above.

In the data processing method, device, equipment and medium based on federal learning, the user characteristic data shared by the first terminal and the second terminal is determined so as to perform characteristic coding processing on the user characteristic data, so that the characteristic data to be processed is obtained; then, obtaining a model predicted value obtained by processing based on the feature data to be processed, so as to obtain a loss value obtained by processing the training tag data and the model predicted value by adopting a predefined loss function; if the loss value is the external loss value, the external loss value is sent to the second terminal in an encryption mode, so that the safety of data interaction in the training process is ensured, the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and model optimization is carried out on the second model to be trained according to the target gradient; by taking the first terminal as the cooperator for cooperator modeling, the method does not need to rely on the dependable third-party cooperator for cooperator modeling, and removes the limitation that the federal learning can only rely on the dependable third-party cooperator for modeling at present, thereby achieving the aim of synchronous training. If the loss value is an internal loss value, determining a target gradient based on the internal loss value and a current model parameter corresponding to the first model to be trained, directly correcting the magnitude of the gradient by combining the relative weight of the gradient (namely the current model parameter), determining the target gradient, and ensuring that the model-optimized gradient is determined by integrating the characteristics of data of the data party, so as to effectively solve the problems that the gradient magnitude of other data parties cannot be influenced by adopting the same set of optimization strategies in the traditional federal learning, so that larger fluctuation occurs in the model learning process, the learning efficiency is lower, and the stability of model training is ensured. And finally, carrying out model optimization on the first model to be trained according to the target gradient to obtain a target prediction model, so that the target prediction model keeps the self key information of the data of different data parties, and further, the accuracy of the target prediction model is ensured.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of an application environment of a federal learning-based data processing method according to an embodiment of the present invention;

FIG. 2 is a flow chart of a federal learning-based data processing method in accordance with an embodiment of the present invention;

FIG. 3 is a flowchart showing step S60 in FIG. 2;

FIG. 4 is a flowchart showing step S62 in FIG. 3;

FIG. 5 is a flowchart showing step S63 in FIG. 3;

FIG. 6 is a flow chart of a federal learning-based data processing apparatus in accordance with an embodiment of the present invention;

fig. 7 is a schematic diagram of a first terminal according to an embodiment of the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The data processing method based on federal learning provided by the embodiment of the invention can be applied to the terminal equipment configured by financial institutions such as banks, securities, insurance and the like or other institutions, is used for rapidly training the model of each institution by integrating the data provided by a plurality of data parties, can effectively ensure the stability of the training process, and can fully embody the quality and the characteristics of the data of each data party so as to further improve the accuracy of the model. The data processing method based on federal learning can be applied to an application environment as shown in fig. 1, wherein a first terminal communicates with a second terminal through a network. The first terminal or the second terminal may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.

In one embodiment, as shown in fig. 2, a data processing method based on federal learning is provided, and the method is applied to the first terminal in fig. 1 for illustration, and includes the following steps:

s10: determining user characteristic data common to the first terminal and the second terminal; the first terminal corresponds to a first model to be trained; the second terminal corresponds to a second model to be trained; the first terminal includes training tag data corresponding to user characteristic data.

Specifically, the first terminal refers to a data side terminal or a terminal corresponding to a trusted third party collaborator for assisting the multiparty data organization in synchronous modeling. The second terminal refers to a data-side terminal that needs to rely on the first terminal for modeling. According to the idea of federal learning, on the premise that the first terminal and the second terminal do not disclose respective data, the common users of the two parties are confirmed, and the users which are not overlapped with each other are not exposed, so that modeling is conducted by combining the user characteristic data of the common users, therefore, the user characteristic data of the common users of the first terminal and the second terminal are required to be confirmed, sample alignment is achieved, the user characteristic data of the users which are not overlapped with each other are not exposed, and the safety of the user data is guaranteed.

In this embodiment, the shared user-specific data may refer to the feature data of the first terminal and the second terminal that are shared by the user, that is, the feature data of the first terminal and the second terminal that are identical in user feature and not identical in user feature, or the feature data of the first terminal and the second terminal that are shared by the user feature, that is, the feature data of the first terminal and the second terminal that are identical in user feature and not identical in user feature.

It can be understood that if the data party providing the training sample data is a and B, and a does not hold the tag data required for training, and B holds the tag data required for training, B may be taken as the first terminal, and a may be taken as the second terminal; if both A and B hold the label data required by training, one of A or B can be selected as the first terminal, or a trusted third party collaborator can be adopted to assist in training. It should be noted that the number of the second terminals may be one or more, which is not limited herein.

In this embodiment, the first model to be trained or the second model to be trained includes, but is not limited to, deep learning by using a neural network algorithm (e.g., lstm) or machine learning (e.g., random forest). It should be noted that the deep learning algorithm adopted by the first model to be trained and the second model to be trained should be consistent, i.e. the model initialization parameters of the first model to be trained and the second model to be trained are the same.

The training label data includes, but is not limited to, label data required for modeling corresponding to a model training theme, for example, when the repayment capability of a user is predicted, the training label data may be loan amount, that is, when the corresponding user characteristic data is the user characteristic data that has been loaned and repayment, for example, when the user preference product is predicted, the training label data may be purchased product (such as glove), and the corresponding user characteristic data is the user characteristic data of the purchased product.

It will be appreciated that, taking data parties a and B, and data party B may provide training tag data required for training, data party a does not hold the tag data required for training, model prediction subject is described as predicting the repayment capability of the user, and the user characteristic data includes personal data of the user; while the training sample data required for training the model includes personal data (such as gender, age, and academic) of the user owned by the loan company (data party a), the training label data refers to repayment capability data (i.e., repayment amount data) of the user owned by the bank (data party B), it should be noted that the data party B may also include user feature data (such as working experience) not owned by the data party a required for training, and the loan company and the bank need to train the model to be trained synchronously, since the data party a needs to train according to the label data of the data party B, the data party a and the data party B cannot train the model directly through exchanging data in order to protect the privacy of the user, so that the idea of federal learning needs to be introduced to ensure that the data of each party is not revealed. In this scenario, since the loan company only has the personal data of the user and does not have the tag data required for training, the data party B can be used as a "collaborator" in federal learning to receive the data (such as the user feature data or the model predicted value) sent by the data party a, so that the data parties a and B can learn synchronously without revealing the private data, i.e. the data party a can train according to the tag data of the data party B or the data party B can be modeled according to the feature data of the data party a and the own data.

In this embodiment, the first terminal is taken as a collaborator for illustration, so that training is performed according to the initialization model parameters defined by the first terminal, so as to train the model to be trained in combination with the idea of federal learning. The federal learning (FEDERATED LEARNING) is an emerging artificial intelligence basic technology, and the design goal is to develop high-efficiency machine learning among multiple participants or multiple computing nodes on the premise of guaranteeing information security during big data exchange and legal compliance.

S20: and carrying out feature coding processing on the user feature data to obtain feature data to be processed.

User characteristic data refers to characteristic data, such as gender, age, working years and academia, which are required for modeling and are associated with a user, and the user characteristic data can be obtained from a big data platform. The feature data to be processed is the feature data which can be processed by the model to be trained. Specifically, the feature encoding process may encode the user feature data according to a preset encoding rule, for example, the user feature data includes gender, age, working period, and academic age, so that a male and a female in the gender may be encoded in a discrete encoding form, for example, a male- "0", a female- "1", a working period of 1 year is set to "1", a2 year is set to "2", and so on, where the academic age is a special department and may be set to "0", a family is set to "1", and so on, where the description is not given.

S30: and obtaining a model predicted value obtained by processing based on the feature data to be processed.

The model predictive value is obtained by processing the user characteristic data by adopting a first model to be trained or a second model to be trained. It may be understood that the model prediction value may be obtained by processing the user feature data by the first terminal using the first model to be trained, or the model prediction value may be obtained by processing the user feature data by the second receiving terminal using the second model to be trained. It may be understood that the model prediction value may refer to a model prediction value obtained by processing the feature data to be processed by using the first model to be trained; or receiving a model predicted value obtained by processing the feature data to be processed by adopting the second model to be trained, which is sent by the second terminal in an encrypted mode.

The second terminal is required to rely on the first terminal for modeling, so the second terminal is required to send a model predicted value obtained by processing the user characteristic data according to the second model to be trained to the first terminal so as to calculate a loss value, the second terminal performs model optimization according to the loss value fed back by the first terminal, and in the process, the data of all data parties are kept locally, so that the safety of personal data of all terminal users is protected.

Further, when the second terminal sends the model predicted value to the first terminal, the second terminal needs to send the model predicted value in an encrypted manner to ensure the privacy of the user data, specifically, the first terminal may acquire a key by adopting an encryption algorithm, send the public key to the second terminal, so that the second terminal encrypts the model predicted value by adopting the public key and sends the model predicted value to the first terminal, and the first terminal decrypts the encrypted model predicted value by adopting the private key, so as to calculate a corresponding loss value according to the decrypted model predicted value.

In this embodiment, the model predicted value obtained by processing the user feature data by the first terminal is obtained or the model predicted value obtained by processing the user feature data by the second to-be-trained model corresponding to the second terminal is received, so that the cooperative modeling of the trusted third party collaborators is not needed, the limitation that the federal learning can only rely on the trusted third party collaborators for modeling at present is relieved, and the aim of synchronous training is achieved.

S40: and obtaining a loss value obtained by processing the training label data and the model predicted value by adopting a predefined loss function.

In this embodiment, a first terminal is described as an example of a terminal corresponding to a data side that holds tag data required for training. Specifically, for optimization based on conventional back propagation, a loss function is generally defined to calculate a loss value for evaluating the model effect. A standard loss function is defined by: where f is a loss function for evaluating the true result (i.e., y) and the predicted result (i.e.,/> ) Is a difference in (a) between the two. In this embodiment, the loss function may be specifically defined by a user, which is not limited herein. The loss value is calculated from a model predictive value by a predefined loss function.

The penalty values resulting from processing the training tag data and the model predictive values using the predefined penalty function in this embodiment include either external penalty values or internal penalty values. The external loss value is obtained by processing training tag data and a received model predicted value sent by the second terminal by adopting a predefined loss function; the internal loss value is a loss value obtained by processing the training tag data and the feature data to be processed by the first terminal through a first model to be trained by adopting a predefined loss function.

S50: and if the loss value is the external loss value, the external loss value is sent to the second terminal in an encrypted mode, so that the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient.

And sending the external loss value to the second terminal in an encryption mode, so that the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient to obtain a target prediction model. The current model parameters refer to model parameters of the first model to be trained corresponding to each training. The target gradient is the optimized gradient determined by comprehensive analysis according to the loss value and the current model parameters. It should be noted that, in this embodiment, the loss value required for determining the target gradient may include an external loss value or an internal loss value. The current model parameters correspond to the types of the loss values, namely if the loss values are external loss values, the target gradient is determined according to the external loss values and the current model parameters corresponding to the second model to be trained; if the loss value is an internal loss value, determining a target gradient according to the internal loss value and current model parameters corresponding to the first model to be trained; and are not limited herein.

Specifically, in order to ensure the safety of data interaction in the training process, the loss value obtained by processing according to the model predicted value sent by the second terminal is required to be sent to the second terminal in an encrypted mode, so that the data privacy is prevented from being revealed. Specifically, the second terminal may acquire the key by using an encryption algorithm, and send the key to the first terminal by using the public key, so that the first terminal encrypts the loss value by using the public key and sends the encrypted loss value to the second terminal, and the second terminal decrypts the encrypted loss value by using the private key.

S60: if the loss value is an internal loss value, determining a target gradient based on the internal loss value and current model parameters corresponding to the first model to be trained, and performing model optimization on the first model to be trained according to the target gradient to obtain a target prediction model.

The current model parameters corresponding to the first model to be trained refer to model weights corresponding to the first model to be trained. Specifically, since the data involved in learning come from different data parties, and the respective gradients have obvious differences according to the advantages and disadvantages of the own data, the same set of optimization strategies cannot influence the gradient sizes of other data parties, so in this embodiment, the target gradient is determined by determining the target gradient based on the loss value and the current model parameter corresponding to the first model to be trained, so that the gradient is directly corrected by combining the relative weight of the target gradient (namely the current model parameter), the target gradient is determined, the optimized gradient of the model is determined by integrating the characteristics of the data parties, the quality and the characteristics of the data of each data party can be fully reflected, the model accuracy can be further improved, meanwhile, the problem that the gradient sizes of other data parties cannot be influenced by adopting the same set of optimization strategies in the traditional federal learning process is effectively solved, the problem that the model learning process has larger fluctuation and lower learning efficiency is solved, and the stability of model training is ensured.

It should be noted that the process of determining the target gradient by the first terminal and the second terminal remain the same.

In this embodiment, the model predicted value is obtained by determining user feature data shared by the first terminal and the second terminal so as to be based on the user feature data; then, obtaining a model predicted value obtained by processing based on the feature data to be processed, so as to obtain a loss value obtained by processing the training tag data and the model predicted value by adopting a predefined loss function; if the loss value is the external loss value, the external loss value is sent to the second terminal in an encryption mode, so that the safety of data interaction in the training process is ensured, the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and model optimization is carried out on the second model to be trained according to the target gradient; by taking the first terminal as the cooperator for cooperator modeling, the method does not need to rely on the dependable third-party cooperator for cooperator modeling, and removes the limitation that the federal learning can only rely on the dependable third-party cooperator for modeling at present, thereby achieving the aim of synchronous training. If the loss value is an internal loss value, determining a target gradient based on the internal loss value and a current model parameter corresponding to the first model to be trained, directly correcting the magnitude of the gradient by combining the relative weight of the gradient (namely the current model parameter), determining the target gradient, and ensuring that the model-optimized gradient is determined by integrating the characteristics of data of the data party, so as to effectively solve the problems that the gradient magnitude of other data parties cannot be influenced by adopting the same set of optimization strategies in the traditional federal learning, so that larger fluctuation occurs in the model learning process, the learning efficiency is lower, and the stability of model training is ensured. And finally, carrying out model optimization on the first model to be trained according to the target gradient to obtain a target prediction model, so that the target prediction model keeps the self key information of the data of different data parties, and further, the accuracy of the target prediction model is ensured.

In one embodiment, as shown in fig. 3, in step S60, a target gradient is determined based on the internal loss value and the current model parameter corresponding to the first model to be trained, and the method specifically includes the following steps:

s61: and calculating according to the internal loss value and the current model parameters corresponding to the first model to be trained to obtain an original gradient.

Specifically, a pre-defined gradient calculation formula may be used to calculate the internal loss value, so as to obtain the original gradient. The gradient calculation formula may be, for exampleWhere J (ω) represents the internal loss value, ω _j represents the current model parameters, i.e., the weight parameters, and g represents the original gradient.

S62: and acquiring boundary conditions based on the current model parameters corresponding to the first model to be trained.

Specifically, the first terminal obtains a boundary condition based on a current model parameter corresponding to the first model to be trained, so as to control the gradient size of model optimization according to the boundary condition.

In one embodiment, as shown in fig. 4, in step S62, a boundary condition is obtained based on the original gradient and the current model parameters corresponding to the first model to be trained, which specifically includes the following steps:

S621: and acquiring a first super parameter and a second super parameter.

Wherein the first super parameter is a predefined constant satisfying (0-1). The second super parameter is a predefined constant satisfying (0-1). It will be appreciated that the first superparameter is used to determine the upper boundary of the boundary condition and the second superparameter is used to determine the lower boundary of the boundary condition. In this embodiment, the first super-parameter is smaller than the second super-parameter.

S622: and calculating a first norm of the current model parameters corresponding to the first model to be trained.

The first norm refers to an L2 norm obtained by processing current model parameters. The L2 norm refers to the sum of squares of the elements in the vector (i.e., the current model parameters) and the square root. In this embodiment, the size of one gradient is measured by the Norm of Norm-2 (L2), so that the regularization term can be optimized to avoid the problem of overfitting, and therefore, a second Norm corresponding to the current model parameter needs to be calculated to provide a data source for the subsequent determination of the target gradient.

S623: the product of the first superparameter and the first norm is taken as an upper boundary and the product of the second superparameter and the first norm is taken as a lower boundary.

S624: based on the upper and lower boundaries, boundary conditions are obtained.

Specifically, assuming that the first super parameter is represented by bl, the second super parameter is represented by bh, the first norm is represented by ω, and then an upper boundary of the boundary condition is bl, and a lower boundary of the boundary condition is bh, then the boundary condition is [ bl, bh.

In this embodiment, the first superparameter and the second superparameter are introduced to combine the L2 norm of the current model parameter with the first superparameter and the second superparameter to obtain the upper boundary and the lower boundary, thereby obtaining the boundary condition, so as to correct the original gradient size according to the boundary condition, so that the gradient size meets the controllable boundary condition, ensure the stability of model training, fully embody the quality and the characteristics of data of each data party, and further improve the accuracy of the model

S63: and processing the original gradient based on the boundary condition to obtain the target gradient.

In federal learning, the updating scale of model weights is affected by a plurality of factors, such as learning rate, gradient magnitude, and the like. If the update gradient is too large, larger oscillation occurs in the update of the model; if the update gradient is too small, the model update is abnormally slow and stable, and the learning effect of the model is poor. In addition, the learning effect of the model is mainly related to the gradient direction and the gradient magnitude. The direction of the gradient update is changeable, so that the gradient of the model update needs to be controlled to improve the model learning effect.

In the conventional optimization algorithm, the gradient magnitude control mainly plays a role in controlling by optimizing the learning rate. The existing learning rate control method mainly comprises the step of obtaining the ratio of first-order gradient to second-order gradient through some attenuation functions or by introducing momentum to control the learning rate. Such methods can play a role in limiting the gradient magnitude in the case of single learning. However, in federal learning, since the data involved in learning come from different data parties, and the gradient of the model trained by each data party has obvious difference according to the advantages and disadvantages of own data, the same set of optimization strategies cannot know or influence the gradient of other data parties, in this embodiment, the gradient (i.e. the original gradient) is directly corrected by introducing the relative weight of the data party (i.e. the current model parameter), and the target gradient is obtained, so that the gradient satisfies the controllable boundary condition, and the stability of model training is ensured.

In one embodiment, as shown in fig. 5, in step S63, the original gradient is processed based on the boundary condition to obtain the target gradient, which specifically includes the following steps:

S631: and calculating a second norm corresponding to the original gradient, and taking the ratio of the first norm to the second norm as the ratio to be judged.

Wherein the first norm refers to an L2 norm obtained by processing the original gradient. Specifically, the ratio of the first norm and the second norm is taken as the ratio to be judged, namelyWherein, the first norm is represented by g; and ω represents the second norm.

S632: and if the ratio to be judged meets the boundary condition, taking the original gradient as the target gradient.

Specifically, if the ratio to be determined is within the upper and lower boundary ranges, the original gradient is taken as the target gradient, i.e. whenWhen the original gradient is used as the target gradient.

S633: if the ratio to be judged is smaller than the upper boundary of the boundary condition, processing the original gradient according to a first calculation formula to obtain a target gradient; the first calculation formula isWherein bl represents a first super-parameter, the first norm is represented by g; and ω represents a second norm; g represents the original gradient and ω represents the current model parameters.

In particular, whenWhen the target gradient is obtained, the original gradient is processed according to a second calculation formula to obtain the target gradient, namely/>

S634: if the ratio to be judged is greater than the lower boundary of the boundary condition, processing the original gradient according to a second calculation formula to obtain a target gradient; the second calculation formula comprisesWherein bh represents a second super-parameter; the first norm is represented by g; and ω represents a second norm; g represents the original gradient and ω represents the current model parameters.

In this embodiment, the update gradient size is controlled according to the boundary condition, so as to effectively solve the problem that the gradient size of other data parties cannot be affected by adopting the same set of optimization strategies in the traditional federal learning, and make the update gradient size of each data party more fit with the data characteristics of the data provided by the data party, so as to achieve the purposes of enhancing the stability of the model training process and accelerating the progress of model training.

In an embodiment, if the internal loss value is smaller than a preset threshold, a step of calculating according to the internal loss value and a current model parameter corresponding to the first model to be trained is performed to obtain an original gradient.

The preset threshold is a threshold for judging model convergence. Specifically, when judging whether the models are converged, the judgment can be performed in two ways, firstly, the judgment is performed according to whether the loss value corresponding to each model (namely, the first model to be trained or the second model to be trained) meets the standard (a preset threshold can be set); and secondly, judging whether the loss average value of the loss values of the models meets the standard or not so as to achieve the aim of synchronous convergence. If the internal loss value is larger than the preset threshold, the model is proved not to be converged at the moment, and iterative training needs to be continued, the step of calculating according to the internal loss value and the current model parameters corresponding to the first model to be trained to obtain an original gradient or is executed, or if the external loss value is larger than the preset threshold, the step of sending the external loss value to the second terminal in an encrypted mode is executed. Or if the average value of the internal loss value and the external loss value meets the standard (namely, is smaller than a preset threshold value), if the average value of the internal loss value and the external loss value is smaller than the preset threshold value, the model is proved to be converged, and training can be stopped.

It should be understood that the sequence number of each step in the foregoing embodiment does not mean that the execution sequence of each process should be determined by the function and the internal logic, and should not limit the implementation process of the embodiment of the present invention.

In an embodiment, a data processing device based on federal learning is provided, where the data processing device based on federal learning corresponds to the data processing method based on federal learning in the foregoing embodiment one by one. As shown in fig. 6, the data processing apparatus based on federal learning includes a user feature data determining module 10, a feature encoding module 20, a model prediction value obtaining module 30, a loss value obtaining module 40, an external loss value transmitting module 50, and a target prediction model optimizing module 60. The functional modules are described in detail as follows:

a user characteristic data determining module 10, configured to determine user characteristic data common to the first terminal and the second terminal; the first terminal corresponds to a first model to be trained; the second terminal corresponds to a second model to be trained; the first terminal comprises training tag data corresponding to the user characteristic data;

the feature encoding module 20 is configured to perform feature encoding processing on the user feature data to obtain feature data to be processed;

a model predicted value obtaining module 30, configured to obtain a model predicted value obtained by processing based on the feature data to be processed;

A loss value acquisition module 40, configured to acquire a loss value obtained by processing the training tag data and the model predicted value by using a predefined loss function;

The external loss value sending module 50 sends the external loss value to the second terminal in an encrypted mode if the loss value is the external loss value, so that the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient;

the target prediction model optimization module 60 is configured to determine a target gradient based on the internal loss value and a current model parameter corresponding to the first model to be trained if the loss value is the internal loss value, and perform model optimization on the first model to be trained according to the target gradient, so as to obtain a target prediction model.

Specifically, the model predicted value obtaining module specifically includes: and receiving a model predicted value which is transmitted by the second terminal in an encryption mode and is obtained by processing the feature data to be processed by adopting the second model to be trained.

Specifically, the model predicted value obtaining module specifically includes: and processing the feature data to be processed by adopting the first model to be trained to obtain a model predictive value.

Specifically, the target prediction model optimization module includes an original gradient acquisition unit, a boundary condition acquisition unit, and a target gradient acquisition unit.

The original gradient acquisition unit is used for calculating according to the internal loss value and the current model parameters corresponding to the first model to be trained to obtain an original gradient.

And the boundary condition acquisition unit is used for acquiring the boundary condition based on the current model parameters corresponding to the first model to be trained.

And the target gradient acquisition unit is used for processing the original gradient based on the boundary condition to acquire the target gradient.

Specifically, the boundary condition acquisition unit includes a super parameter acquisition subunit, a first norm calculation subunit, an upper and lower boundary determination subunit, and a boundary condition acquisition subunit.

And the super-parameter acquisition subunit is used for acquiring the first super-parameter and the second super-parameter.

And the first norm calculation subunit is used for calculating a first norm corresponding to the current model parameter corresponding to the first model to be trained.

And the upper and lower boundary determining subunit is used for taking the product of the first super parameter and the first norm as an upper boundary and taking the product of the second super parameter and the first norm as a lower boundary.

And the boundary condition acquisition subunit is used for acquiring the boundary condition based on the upper boundary and the lower boundary.

Specifically, the target gradient acquisition unit comprises a ratio acquisition subunit to be judged, a first processing subunit, a second processing subunit and a third processing subunit.

The to-be-judged ratio obtaining subunit is used for calculating a second norm corresponding to the original gradient, and taking the ratio of the first norm and the second norm as the to-be-judged ratio.

And the first processing subunit is used for taking the original gradient as the target gradient if the ratio to be judged meets the boundary condition.

The second processing subunit is used for processing the original gradient according to the first calculation formula to obtain a target gradient if the ratio to be judged is smaller than the upper boundary of the boundary condition; the first calculation formula isWherein bl represents a first super-parameter, the ω represents the first norm and the g represents the second norm; g represents the original gradient; ω represents the current model parameters.

The third processing subunit is used for processing the original gradient according to the second calculation formula to obtain a target gradient if the ratio to be judged is larger than the lower boundary of the boundary condition; the second calculation formula comprisesWherein bh represents a second super-parameter; the ω represents the first norm and the g represents the second norm; g represents the original gradient; ω represents the current model parameters.

For specific limitations on the federal learning-based data processing apparatus, reference may be made to the above limitations on the federal learning-based data processing method, and no further description is given here. The various modules in the federal learning-based data processing apparatus described above may be implemented in whole or in part in software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a computer storage medium, an internal memory. The computer storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the computer storage media. The database of the computer device is used for storing data, such as a target prediction model, generated or acquired during the process of executing the data processing method based on federal learning. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a data processing method based on federal learning.

In one embodiment, a computer device is provided that includes a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the federal learning-based data processing method of the above embodiments, such as steps S10-S60 shown in fig. 2, or the steps shown in fig. 3-5, when the computer program is executed by the processor. Or the processor may implement the functions of the modules/units in this embodiment of the federally learned data processing apparatus when executing the computer program, such as the functions of the modules/units shown in fig. 6, which are not described herein again for the sake of avoiding repetition.

In an embodiment, a computer storage medium is provided, and a computer program is stored on the computer storage medium, where the computer program when executed by a processor implements the steps of the data processing method based on federal learning in the foregoing embodiment, for example, steps S10-S60 shown in fig. 2, or steps shown in fig. 3-5, which are not repeated herein. Or the computer program, when executed by the processor, implements the functions of each module/unit in the embodiment of the data processing apparatus based on federal learning, for example, the functions of each module/unit shown in fig. 6, which are not repeated herein.

Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous link (SYNCHLINK) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.

The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims

1. The data processing method based on federal learning is characterized by comprising the following steps executed by a first terminal:

If the loss value is an external loss value, the external loss value is sent to a second terminal in an encrypted mode, so that the second terminal determines a target gradient based on the external loss value and current model parameters corresponding to the second model to be trained, and performs model optimization on the second model to be trained according to the target gradient; the external loss value is obtained by processing training tag data and a received model predicted value sent by the second terminal by adopting a predefined loss function;

if the loss value is an internal loss value, calculating according to the internal loss value and the current model parameter corresponding to the first model to be trained to obtain an original gradient;

acquiring boundary conditions based on current model parameters corresponding to the first model to be trained;

calculating a second norm corresponding to the original gradient, and taking the ratio of the first norm to the second norm as a ratio to be judged;

If the ratio to be judged meets the boundary condition, taking the original gradient as the target gradient;

If the ratio to be judged is smaller than the upper boundary of the boundary condition, processing the original gradient according to a first calculation formula to obtain the target gradient; the first calculation formula is ; Wherein/>Representing the first hyper-parameter,/>Representing the first norm,/>Representing a second norm; /(I)Representing the original gradient; /(I)Representing the current model parameters;

If the ratio to be judged is larger than the lower boundary of the boundary condition, processing the original gradient according to a second calculation formula to obtain the target gradient; the second calculation formula comprises ; Wherein/>Representing a second hyper-parameter; /(I)Representing the first norm,/>Representing a second norm; /(I)Representing the original gradient; /(I)Representing the current model parameters, and carrying out model optimization on the first model to be trained according to the target gradient to obtain a target prediction model; the internal loss value is obtained by processing the training tag data and the feature data to be processed by the first terminal through a first model to be trained by adopting a predefined loss function.

2. The federal learning-based data processing method according to claim 1, wherein the obtaining of the model predictive value obtained by processing based on the feature data to be processed includes:

And receiving a model predicted value which is sent by the second terminal in an encrypted mode and is obtained by processing the feature data to be processed by adopting the second model to be trained.

3. The federal learning-based data processing method according to claim 1, wherein the obtaining of the model predictive value obtained by processing based on the feature data to be processed includes:

And processing the feature data to be processed by adopting the first model to be trained to obtain the model predictive value.

4. The federal learning-based data processing method according to claim 1, wherein acquiring boundary conditions based on the original gradient and current model parameters corresponding to the first model to be trained, comprises:

Acquiring a first super parameter and a second super parameter;

calculating a first norm corresponding to the current model parameter corresponding to the first model to be trained;

Taking the product of the first super-parameter and the first norm as an upper boundary and taking the product of the second super-parameter and the first norm as a lower boundary;

and obtaining the boundary condition based on the upper boundary and the lower boundary.

5. The federal learning-based data processing method according to claim 1, wherein before the calculating according to the internal loss value and the current model parameter corresponding to the first model to be trained, the federal learning-based data processing method further comprises:

And if the internal loss value is larger than a preset threshold value, executing the step of calculating according to the internal loss value and the current model parameters corresponding to the first model to be trained to obtain an original gradient.

6. A federal learning-based data processing apparatus for implementing the federal learning-based data processing method according to any one of claims 1 to 5, the apparatus comprising:

The model predicted value acquisition module is used for acquiring a model predicted value based on the user characteristic data; the model predicted value is obtained by processing the user characteristic data by adopting the first model to be trained or the second model to be trained;

7. Computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the federal learning-based data processing method according to any of claims 1 to 5 when the computer program is executed.

8. A computer storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the federal learning-based data processing method according to any one of claims 1 to 5.