CN114492854A - Method and device for training model, electronic equipment and storage medium - Google Patents

Method and device for training model, electronic equipment and storage medium

Info

Publication number
CN114492854A
CN114492854A (application CN202210122134.5A)
Authority
CN
China
Prior art keywords
encrypted data
data
training
model
terminals
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210122134.5A
Other languages
Chinese (zh)
Inventor
敬清贺
董大祥
汤伟
徐龙腾
杨博
叶柏威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210122134.5A priority Critical patent/CN114492854A/en
Publication of CN114492854A publication Critical patent/CN114492854A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The disclosure provides a method and device for training a model, an electronic device, and a storage medium, and relates to the field of data processing, in particular to the technical fields of cloud computing and federated learning. The specific implementation scheme is as follows: receiving encrypted data from a plurality of terminals, wherein the encrypted data is obtained by each of the plurality of terminals encrypting an intermediate result obtained by training a local model; training and updating an initial cloud model based on the encrypted data to obtain a target cloud model; and returning updated network parameters corresponding to the target cloud model to the plurality of terminals, so that each of the plurality of terminals updates its local model.

Description

Method and device for training model, electronic equipment and storage medium
Technical Field
The disclosure relates to the field of data processing, and further to the technical fields of cloud computing and federated learning, and in particular to a method and device for training a model, an electronic device, and a storage medium.
Background
At present, joint model training typically aggregates the data of different data holders in the same cluster and trains the model using conventional distributed techniques. In recent years, however, more and more laws have been enacted to protect data privacy, which restricts such data aggregation and places higher requirements on the joint training of models.
Disclosure of Invention
The present disclosure provides a method, apparatus, device, and storage medium for training a model.
According to an aspect of the present disclosure, there is provided a method of training a model, comprising: receiving encrypted data from a plurality of terminals, wherein the encrypted data is obtained by encrypting an intermediate result obtained by training a local model by each terminal in the plurality of terminals; training and updating the initial cloud model based on the encrypted data to obtain a target cloud model; and returning the updated network parameters corresponding to the target cloud model to the plurality of terminals so that each terminal in the plurality of terminals updates the local model.
Optionally, training and updating the initial cloud model based on the encrypted data, and obtaining the target cloud model includes: performing aggregation processing on the encrypted data based on the data dimension of the encrypted data to obtain a processing result; and training and updating the initial cloud model by using the processing result to obtain a target cloud model.
According to another aspect of the present disclosure, there is provided an apparatus for training a model, including: the receiving module is used for receiving encrypted data from a plurality of terminals, wherein the encrypted data is obtained by encrypting an intermediate result obtained after each terminal in the plurality of terminals trains a local model; the processing module is used for training and updating the initial cloud model based on the encrypted data to obtain a target cloud model; and the feedback module is used for returning the updated network parameters corresponding to the target cloud model to the plurality of terminals so that each terminal in the plurality of terminals updates the local model.
Optionally, the processing module comprises: a first processing unit and a second processing unit. The first processing unit is used for carrying out aggregation processing on the encrypted data based on the data dimension of the encrypted data to obtain a processing result; and the second processing unit is used for training and updating the initial cloud model by using the processing result to obtain a target cloud model.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of training a model of an embodiment of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method of training a model of an embodiment of the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method of training a model of an embodiment of the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a method of training a model according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a method for joint training of models in accordance with an embodiment of the present disclosure;
FIG. 3 is a schematic illustration of longitudinal federated learning in accordance with an embodiment of the present disclosure;
fig. 4 is a schematic diagram of a SplitNN-based cloud model training scheme and reward mechanism according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an apparatus for training a model according to an embodiment of the present disclosure;
fig. 6 is a schematic block diagram of an electronic device in accordance with an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The method of training the model of the embodiments of the present disclosure is further described below.
Fig. 1 is a flow chart of a method of training a model according to an embodiment of the present disclosure, which may include the following steps, as shown in fig. 1:
step S102, receiving encrypted data from a plurality of terminals, wherein the encrypted data is obtained by encrypting an intermediate result obtained by training a local model by each terminal in the plurality of terminals.
In the technical solution provided in the above step S102 of the present disclosure, a terminal may be a mobile device held by a user, such as a mobile phone or a tablet, and the local model may be a model trained by an Internet data center (for example, pan-bao) on terminal data. During joint training, the cloud server receives, from each training party, the encrypted intermediate-result data output by its local model.
In this embodiment, after the local model training is completed, the weights of each layer (dimension) are obtained, the names of the intermediate layers whose outputs are required are specified, a function object mapping the model input to those outputs is established, the input and output data dimensions are obtained, and the output values of the intermediate layers are then computed; this intermediate-layer output data constitutes the intermediate result.
In this embodiment, encrypting the intermediate result may consist of encoding the intermediate-result data and representing it as a single vector.
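For illustration only, the following is a minimal sketch of how a training party might extract and encode such an intermediate result, assuming a TensorFlow/Keras-style model; the layer name, shapes, and the flattening step are illustrative assumptions and are not prescribed by the disclosure (the encryption step is omitted):

```python
# Sketch only: assumes a TensorFlow/Keras-style model. The layer name
# "cut_layer" and the flattening step are illustrative assumptions.
import tensorflow as tf

def intermediate_vector(local_model: tf.keras.Model, layer_name: str, x_batch):
    # Build a function object mapping the model input to the named
    # intermediate layer's output, as described above.
    cut = tf.keras.Model(inputs=local_model.input,
                         outputs=local_model.get_layer(layer_name).output)
    h = cut(x_batch)  # intermediate-layer output values on local data
    # Encode the intermediate result as one vector per sample; the vector
    # would then be encrypted before being uploaded to the cloud server.
    return tf.reshape(h, (tf.shape(h)[0], -1))
```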
And step S104, training and updating the initial cloud model based on the encrypted data to obtain a target cloud model.
In the technical scheme provided in step S104 of the present disclosure, the target cloud model may be a jointly trained model with a plurality of training participants. After the vectors representing the intermediate results are encrypted and uploaded to the cloud, the cloud splices the vectors of the different training parties and then trains and updates the joint training model, obtaining an updated joint training model.
In this embodiment, training and updating the initial cloud model based on the encrypted data to obtain the target cloud model may include: performing aggregation processing on the encrypted data based on the data dimension of the encrypted data to obtain a processing result; and training and updating the initial cloud model by using the processing result to obtain a target cloud model.
In this embodiment, the process of training and updating the initial cloud model based on the encrypted data may be implemented by means of federated learning. For example, vertical federated learning applies when the users of two data sets overlap substantially but their features overlap little: the data sets are partitioned vertically (i.e., along the feature dimension), and the portion of the data whose users are the same but whose features are not identical is taken out for training. Vertical federated learning aggregates these different features in an encrypted state to enhance the model's capability.
In this embodiment, preferably, the cloud server adjusts how the intermediate-result data are aggregated according to the data dimensions of each model training party. When different data parties hold data with the same dimensions, the cloud server trains and updates the target cloud model using weighted averaging; when different data parties hold data with different dimensions, the cloud server longitudinally splices the received intermediate-result data and completes the subsequent training.
In this embodiment, preferably, training and updating the initial cloud model further involves a reward mechanism: the cloud server evaluates the vectors provided by the different training parties and scores each data party's contribution according to its performance on a test set. The higher the score, the higher the reward, and the larger the proportion that party carries in the subsequent training process.
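As a rough, non-authoritative sketch of the aggregation choice described above, the following uses plain NumPy arrays as stand-ins for the uploaded encrypted vectors; the dimension check and the weighting rule are assumptions, not the disclosure's prescribed method:

```python
import numpy as np

def aggregate(party_vectors, weights=None):
    # party_vectors: one 2-D array per training party (rows = samples).
    dims = {v.shape[1] for v in party_vectors}
    if len(dims) == 1:
        # Same data dimension for all parties: weighted average,
        # with weights supplied by the reward mechanism if available.
        w = np.ones(len(party_vectors)) if weights is None else np.asarray(weights, float)
        w = w / w.sum()
        return sum(wi * vi for wi, vi in zip(w, party_vectors))
    # Different data dimensions: longitudinal (feature-wise) splicing.
    return np.concatenate(party_vectors, axis=1)
```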
And step S106, returning the updated network parameters corresponding to the target cloud model to the plurality of terminals so that each terminal in the plurality of terminals updates the local model.
In the technical solution provided in the above step S106 of the present disclosure, the network parameters may be the vector gradients of the individual training parties. After the target cloud model has been trained and updated based on the encrypted intermediate-result data, the cloud server computes the gradient with respect to each training party's vector and returns it, so that each training party can update its local model.
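The sketch below illustrates one way such a server-side step could look in a SplitNN-style setup, assuming a PyTorch-like framework (the disclosure does not prescribe one); it covers the longitudinal-splicing case and returns per-party gradients:

```python
import torch

def cloud_step(cloud_model, optimizer, party_vectors, labels, loss_fn):
    # Treat each party's uploaded vector as a leaf tensor so that its
    # gradient can be computed and returned to that party.
    acts = [v.clone().detach().requires_grad_(True) for v in party_vectors]
    joint = torch.cat(acts, dim=1)      # longitudinal splicing of the vectors
    optimizer.zero_grad()
    loss = loss_fn(cloud_model(joint), labels)
    loss.backward()
    optimizer.step()                    # train and update the cloud model
    # Vector gradients returned to the terminals so they can update
    # their local models.
    return [a.grad for a in acts]
```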
Through the above steps S102 to S106, encrypted data are received from a plurality of terminals, where the encrypted data are obtained by each terminal encrypting the intermediate result of training its local model; the initial cloud model is trained and updated based on the encrypted data to obtain the target cloud model; and the updated network parameters corresponding to the target cloud model are returned to the terminals so that each terminal updates its local model. In other words, the data are protected by federated learning: each training party encodes its intermediate-result data as a vector, encrypts it, and uploads it to the cloud; the cloud aggregates the vectors of the different training parties by weighted averaging or longitudinal splicing, jointly trains and updates the cloud model, and returns the computed vector gradient of each training party so that the local models can be updated.
The above-described method of this embodiment is described in further detail below.
As an optional implementation manner, in step S104, training and updating the initial cloud model based on the encrypted data, and obtaining the target cloud model includes: performing aggregation processing on the encrypted data based on the data dimension of the encrypted data to obtain a processing result; and training and updating the initial cloud model by using the processing result to obtain a target cloud model.
In this embodiment, the encrypted data is aggregated based on the data dimension of the encrypted data to obtain a processing result, for example, the training party encrypts an intermediate result in the model training and uploads the intermediate result to the cloud server, and the cloud server adjusts an aggregation mode of the intermediate result data according to different scenes to obtain the processing result.
In this embodiment, the processing result may be data obtained by the cloud server by using a weighted average method for the encrypted intermediate result data uploaded by each training party and/or data obtained by the cloud server by using a longitudinal splicing method for the encrypted intermediate result data uploaded by each training party.
As an optional implementation manner, performing aggregation processing on the encrypted data based on the data dimension of the encrypted data, and obtaining a processing result includes: and in response to the fact that the encrypted data from different terminals have the same data dimension, performing aggregation processing on the encrypted data in a weighted average mode to obtain a processing result.
In this embodiment, when the encrypted data from different terminals have the same data dimension, the encrypted data may be aggregated by weighted averaging to obtain the processing result. For example, when the cloud server detects that the data dimensions (features) of the encrypted intermediate results uploaded by different data parties are the same, it issues an instruction signal setting the aggregation mode of the encrypted data set to weighted averaging, and in response to this signal aggregates the encrypted data set by weighted averaging to obtain the processing result.
In this embodiment, the method may further include: testing the target cloud model by using a test set corresponding to the target cloud model to obtain a test result, wherein the test result is used for evaluating the data quality of encrypted data sent by each terminal in the plurality of terminals; and in the process of aggregating the encrypted data by adopting the weighted average mode again, adjusting the weight of the encrypted data sent by each terminal in the plurality of terminals in the weighted average calculation based on the test result.
As an optional implementation manner, performing aggregation processing on the encrypted data based on the data dimension of the encrypted data, and obtaining a processing result includes: and responding to the fact that the encrypted data from different terminals have different data dimensions, and performing aggregation processing on the encrypted data in a longitudinal splicing mode to obtain a processing result.
In this embodiment, when the encrypted data from different terminals have different data dimensions, the encrypted data can be aggregated by longitudinal splicing to obtain the processing result. For example, when the cloud server detects that the data dimensions (features) of the encrypted intermediate results uploaded by different data parties differ, it issues an instruction signal setting the aggregation mode of the encrypted data set to longitudinal splicing, and in response to this signal aggregates the encrypted data set by longitudinal splicing to obtain the processing result.
In this embodiment, a scenario in which the encrypted data of different terminals have different data dimensions is, for example, one in which one party holds behavior data and labels while the other party holds content data and labels.
As an optional implementation manner, in response to that the encrypted data from different terminals have the same data dimension, performing aggregation processing on the encrypted data in a weighted average manner to obtain a processing result, the method further includes: and testing the target cloud model by using the test set corresponding to the target cloud model to obtain a test result, wherein the test result is used for evaluating the data quality of the encrypted data sent by each terminal in the plurality of terminals.
In this embodiment, the target cloud model may be tested with a test set corresponding to the target cloud model to obtain a test result, where the test result is used to evaluate the data quality of the encrypted data sent by each of the plurality of terminals. To prevent a training participant from maliciously providing low-quality data that would harm the training of the cloud model, the cloud server may evaluate the quality of the vectors provided by each party.
For example, the cloud server performs a test using the test set corresponding to the cloud model. If it finds that the intermediate-result data provided by a certain participant has little effect on, or even a negative effect on, the model training, the cloud server reduces that participant's proportion in the next round of training; if the intermediate-result data provided by a participant positively affects the model training, that participant is rewarded and given a higher training weight.
In this embodiment, the data provided by the different training parties follow different distributions, and if a certain party accounts for a higher proportion of the training, the cloud model performs better on that party's data. From the perspectives of both the reward and the model effect, each participant is therefore more inclined to provide higher-quality data.
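A rough sketch of such a scoring scheme is shown below; the evaluate helper and the leave-one-party-out scoring rule are hypothetical illustrations, not the disclosure's prescribed mechanism:

```python
def contribution_weights(evaluate, parties):
    # evaluate(excluded=None) -> test-set metric of the cloud model when the
    # given party's vectors are left out of aggregation (hypothetical helper).
    base = evaluate()
    raw = {p: max(base - evaluate(excluded=p), 0.0) for p in parties}
    total = sum(raw.values()) or 1.0
    # Higher contribution -> higher reward and a larger weight in the next
    # round of weighted-average aggregation.
    return {p: s / total for p, s in raw.items()}
```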
As an optional implementation manner, the target cloud model is tested by using the test set corresponding to the target cloud model, so as to obtain a test result, and the method further includes: and in the process of aggregating the encrypted data by adopting the weighted average mode again, adjusting the weight of the encrypted data sent by each terminal in the plurality of terminals in the weighted average calculation based on the test result.
In this embodiment, in the process of aggregating the encrypted data again by using the weighted average method, the weight of the encrypted data sent by each of the multiple terminals in the weighted average calculation may be adjusted based on the test result, for example, when different data parties hold data of the same dimension, the cloud server adjusts the weight of the encrypted data sent by each of the multiple terminals in the weighted average calculation based on the test result of the data quality provided by the multiple data parties, and then aggregates the encrypted data by using the weighted average method.
In the embodiment of the disclosure, data are protected by federated learning. Each training party encrypts the intermediate-result data of its local model training and uploads them to the cloud server. The cloud server adjusts the aggregation mode of the intermediate results according to whether the data dimensions held by the different data parties are the same: if the different data parties hold data with the same dimensions, the cloud server aggregates the encrypted intermediate-result data by weighted averaging; otherwise, it aggregates them by longitudinal splicing. Before the next weighted-average processing, the cloud server evaluates the data provided by each party and adjusts the weight of the encrypted data sent by each terminal in the weighted-average calculation according to each party's score on the test set, so that every participant tends to provide data of higher quality. This solves the technical problems of low efficiency and low data security in the joint training of cloud models, and achieves the technical effects of improving the efficiency and data security of joint training.
The method of training the model of the present disclosure is further described below in conjunction with the preferred embodiments.
In the related art, the joint training method mainly has the following disadvantages:
(1) Conventional distributed techniques require the data of all parties to be aggregated, but data protection is now stricter and data holders are not allowed to easily migrate users' private data out of their private-domain servers; as a result, the data cannot be fully aggregated and the model cannot be fully trained.
(2) Conventional distributed techniques only support training on data of the same dimension (the data provided by all data parties must have the same data features), so if different parties provide different data features, the data must be spliced together, and the security of the data cannot be guaranteed.
To solve the above problems, the present solution introduces the split neural network (SplitNN) technique, which trains the cloud model jointly without the data ever leaving the private domain, thereby ensuring data security. At the same time, SplitNN supports training on data of different dimensions, meeting the needs of different data holders who wish to train on different feature data.
Fig. 2 is a schematic diagram of a method for jointly training a model in the related art. As shown in fig. 2, the data of different data holders are aggregated into the same cluster and the model is trained using conventional distributed techniques. The training mainly involves two roles, which undertake the following steps in the training process:
First, the training parties (Trainers) train on the data: each trainer uses its local data to train the local model, outputs the gradient of each parameter, and uploads it to a server (Server).
Second, the server aggregates the gradients uploaded by all training parties and updates the cloud model; the updated server then transmits the new parameters back to the training parties.
Finally, the training parties update their local models and start a new round of training.
Fig. 3 is a schematic diagram of longitudinal federated learning according to an embodiment of the present disclosure. As shown in fig. 3, during training a training party only needs to encode its own data, represent it as a vector, encrypt that vector representation, and upload it to the cloud. The cloud splices the vectors of the different training parties, then trains and updates its own model, computes the vector gradient of each training party, and returns it so that the local models can be updated.
Meanwhile, the cloud server evaluates the vectors provided by the different training parties and scores each data party's contribution according to its performance on the test set: the higher the score, the higher the reward, and the larger the proportion that party carries in the subsequent training process.
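On the terminal side, the returned vector gradient can be used to finish the local backward pass. A minimal sketch follows, again assuming a PyTorch-like framework; the function and argument names are illustrative:

```python
import torch

def local_update(local_model, optimizer, x_local, returned_grad):
    optimizer.zero_grad()
    h = local_model(x_local)       # recompute the intermediate-result vector
    # Back-propagate the gradient returned by the cloud server through the
    # local layers, then update the local model's parameters.
    h.backward(returned_grad)
    optimizer.step()
```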
Solution of key problems in design and implementation:
(1) data privacy issues
This scheme abandons the data-aggregation approach of the original scheme and instead protects the data by means of federated learning. A training party only needs to encrypt the intermediate result of its model training before transmitting it to the cloud, so that neither the cloud server nor a deliberate attacker can restore the training party's data even if they obtain the uploaded data. Likewise, after the cloud server aggregates the data and returns results to each training party, no training party can restore the data of the other participants. The scheme therefore greatly protects data security.
(2) Training problem for data of different dimensions
Before training begins, the different data training parties can agree on whether their data have the same data dimensions, and the cloud server adjusts the aggregation mode of the intermediate results accordingly. If the different data parties hold data with the same dimensions, the cloud server uses weighted averaging; otherwise, it longitudinally splices the received intermediate results and completes the subsequent training.
(3) Quality problems of data and intermediate vectors of data providers
To prevent a training participant from maliciously providing low-quality contributions that would harm the cloud model, the cloud server evaluates the quality of the vectors provided by each participant. If it finds that a certain participant's contribution has little effect on the model, or that the parameters it provides even have a negative effect, the cloud server reduces that participant's proportion in the next round of training; if a participant provides data that strongly improves the model, it is rewarded and given a higher training proportion.
In this training paradigm, the data held by each party follow different distributions, and if a certain party accounts for a higher proportion of the training, the cloud model performs better on that party's data. From the perspectives of both the reward mechanism and the model effect, each participant is therefore more inclined to provide higher-quality data.
(4) Problem of training efficiency
Traditional federated learning involves many cryptographic computations, so its training efficiency is nearly two orders of magnitude lower than that of an ordinary deep learning framework. Testing shows that SplitNN trains very efficiently, with a training speed close to that of an ordinary deep learning framework, so it can help the participants train usable models more quickly.
Fig. 4 is a schematic diagram of a SplitNN-based cloud model training scheme and reward mechanism according to an embodiment of the present disclosure. As shown in fig. 4, taking the training of a face recognition model as an example, different localities hold face photos of their local residents, but the photos cannot be shared between cities.
In the above embodiments of the present disclosure, a training party only needs to encode its own data, represent it as a vector, encrypt that vector representation, and upload it to the cloud. The cloud splices the vectors of the different training parties, then trains and updates its own model, computes the vector gradient of each training party, and returns it so that the local models can be updated. Meanwhile, the cloud server evaluates the vectors provided by the different training parties and scores each data party's contribution according to its performance on the test set; the higher the score, the higher the reward, and the larger the proportion that party carries in the subsequent training process. In this way the trained and updated cloud model is obtained, the technical problems of low efficiency and low data security in the joint training of cloud models are solved, and the technical effects of improving the efficiency and data security of joint training are achieved.
The embodiment of the disclosure also provides a device for executing the method for training the model of the embodiment shown in fig. 1.
Fig. 5 is a schematic diagram of an apparatus for training a model according to an embodiment of the present disclosure, and as shown in fig. 5, the apparatus 50 for training a model may include: a receiving module 51, a processing module 52 and a feedback module 53.
A receiving module 51, configured to receive encrypted data from multiple terminals, where the encrypted data is obtained by encrypting an intermediate result obtained after each terminal in the multiple terminals trains a local model;
the processing module 52 is configured to train and update the initial cloud model based on the encrypted data, so as to obtain a target cloud model;
and a feedback module 53, configured to return the updated network parameters corresponding to the target cloud model to the multiple terminals, so that each terminal in the multiple terminals updates the local model.
Optionally, the processing module 52 comprises: a first processing unit and a second processing unit, wherein the first processing unit may include: a first processing sub-module and a second processing sub-module, wherein the first processing sub-module may include: a test module, wherein the test module may include: an adjusting module.
The first processing unit is used for carrying out aggregation processing on the encrypted data based on the data dimension of the encrypted data to obtain a processing result; the second processing unit is used for training and updating the initial cloud model by using the processing result to obtain a target cloud model; the first processing submodule is used for responding that the encrypted data from different terminals have the same data dimension and performing aggregation processing on the encrypted data in a weighted average mode to obtain a processing result; the second processing submodule is used for responding to the fact that the encrypted data from different terminals have different data dimensions, and conducting aggregation processing on the encrypted data in a longitudinal splicing mode to obtain a processing result; the test module is used for testing the target cloud model by using the test set corresponding to the target cloud model to obtain a test result, wherein the test result is used for evaluating the data quality of the encrypted data sent by each terminal in the plurality of terminals; and the adjusting module is used for adjusting the weight of the encrypted data sent by each terminal in the plurality of terminals in the weighted average calculation based on the test result in the process of aggregating the encrypted data by adopting the weighted average mode again.
In the embodiment of the present disclosure, the receiving module 51 receives encrypted data from a plurality of terminals, where the encrypted data is obtained by encrypting an intermediate result obtained by training a local model by each terminal in the plurality of terminals; the processing module 52 trains and updates the initial cloud model based on the encrypted data to obtain a target cloud model; the feedback module 53 returns the updated network parameters corresponding to the target cloud model to the plurality of terminals, so that each terminal in the plurality of terminals updates the local model, the technical problems of low efficiency and low data security of joint training of the cloud model are solved, and the technical effects of improving the efficiency and the data security of the joint training of the cloud model are achieved.
In the embodiment of the disclosure, the acquisition, storage, application and the like of the personal information of the related user in the technical scheme of the disclosure all conform to the regulations of related laws and regulations, and do not violate the good custom of the public order.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
Embodiments of the present disclosure provide an electronic device, which may include: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of training a model of an embodiment of the present disclosure.
Optionally, the electronic device may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Alternatively, in the present embodiment, the above-mentioned nonvolatile storage medium may be configured to store a computer program for executing the steps of:
step S1, receiving encrypted data from a plurality of terminals, wherein the encrypted data is obtained by encrypting an intermediate result obtained by training a local model by each terminal in the plurality of terminals;
step S2, training and updating the initial cloud model based on the encrypted data to obtain a target cloud model;
step S3, returning the updated network parameters corresponding to the target cloud model to the multiple terminals, so that each terminal in the multiple terminals updates the local model.
Alternatively, in the present embodiment, the non-transitory computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to an embodiment of the present disclosure, the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, realizes the steps of:
step S1, receiving encrypted data from a plurality of terminals, wherein the encrypted data is obtained by encrypting an intermediate result obtained by training a local model by each terminal in the plurality of terminals;
step S2, training and updating the initial cloud model based on the encrypted data to obtain a target cloud model;
step S3, returning the updated network parameters corresponding to the target cloud model to the multiple terminals, so that each terminal in the multiple terminals updates the local model.
FIG. 6 is a schematic block diagram of an example electronic device 600 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 6, the device 600 includes a computing unit 601, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 602 or a computer program loaded from a storage unit 608 into a random access memory (RAM) 603. Various programs and data necessary for the operation of the device 600 can also be stored in the RAM 603. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 601 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 601 performs the methods and processes described above, for example the method of training and updating the initial cloud model based on the encrypted data to obtain the target cloud model. For example, in some embodiments this method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured in any other suitable manner (e.g., by means of firmware) to perform the method of training and updating the initial cloud model based on the encrypted data to obtain the target cloud model.
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), system on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server with a combined blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A method of training a model, comprising:
receiving encrypted data from a plurality of terminals, wherein the encrypted data is obtained by encrypting an intermediate result obtained by training a local model by each terminal in the plurality of terminals;
training and updating the initial cloud model based on the encrypted data to obtain a target cloud model;
and returning updated network parameters corresponding to the target cloud model to the terminals, so that each terminal in the terminals updates the local model.
2. The method of claim 1, wherein training and updating the initial cloud model based on the encrypted data to obtain the target cloud model comprises:
based on the data dimension of the encrypted data, carrying out aggregation processing on the encrypted data to obtain a processing result;
and training and updating the initial cloud model by using the processing result to obtain the target cloud model.
3. The method of claim 2, wherein aggregating the encrypted data based on the data dimension of the encrypted data to obtain the processing result comprises:
and in response to the encrypted data from different terminals having the same data dimension, performing aggregation processing on the encrypted data in a weighted average mode to obtain the processing result.
4. The method of claim 2, wherein aggregating the encrypted data based on the data dimension of the encrypted data to obtain the processing result comprises:
and responding to the encrypted data from different terminals with different data dimensions, and performing aggregation processing on the encrypted data in a longitudinal splicing mode to obtain the processing result.
5. The method of claim 3, wherein the method further comprises:
and testing the target cloud model by using the test set corresponding to the target cloud model to obtain a test result, wherein the test result is used for evaluating the data quality of the encrypted data sent by each terminal in the plurality of terminals.
6. The method of claim 5, wherein the method further comprises:
and in the process of aggregating the encrypted data by adopting the weighted average mode again, adjusting the weight of the encrypted data sent by each terminal in the plurality of terminals in weighted average calculation based on the test result.
7. An apparatus for training a model, comprising:
the system comprises a receiving module, a processing module and a processing module, wherein the receiving module is used for receiving encrypted data from a plurality of terminals, and the encrypted data is obtained by encrypting an intermediate result obtained by training a local model by each terminal in the plurality of terminals;
the processing module is used for training and updating the initial cloud model based on the encrypted data to obtain a target cloud model;
and the feedback module is used for returning the updated network parameters corresponding to the target cloud model to the plurality of terminals so that each terminal in the plurality of terminals updates the local model.
8. The apparatus according to claim 7, wherein the processing module is configured to perform aggregation processing on the encrypted data based on a data dimension of the encrypted data to obtain a processing result; and training and updating the initial cloud model by using the processing result to obtain the target cloud model.
9. The apparatus of claim 8, wherein the processing module is configured to perform aggregation processing on the encrypted data in a weighted average manner to obtain the processing result, in response to that the encrypted data from different terminals have the same data dimension.
10. The apparatus of claim 8, wherein the processing module is configured to perform aggregation processing on the encrypted data in a vertical splicing manner to obtain the processing result, in response to that the encrypted data from different terminals have different data dimensions.
11. The apparatus of claim 9, wherein the apparatus further comprises:
the test module is used for testing the target cloud model by using the test set corresponding to the target cloud model to obtain a test result, wherein the test result is used for evaluating the data quality of the encrypted data sent by each terminal in the plurality of terminals.
12. The apparatus of claim 11, wherein the apparatus further comprises:
and the adjusting module is used for adjusting the weight of the encrypted data sent by each terminal in the weighted average calculation based on the test result in the process of aggregating the encrypted data by adopting the weighted average mode again.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN202210122134.5A 2022-02-09 2022-02-09 Method and device for training model, electronic equipment and storage medium Pending CN114492854A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210122134.5A CN114492854A (en) 2022-02-09 2022-02-09 Method and device for training model, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210122134.5A CN114492854A (en) 2022-02-09 2022-02-09 Method and device for training model, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114492854A true CN114492854A (en) 2022-05-13

Family

ID=81477854

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210122134.5A Pending CN114492854A (en) 2022-02-09 2022-02-09 Method and device for training model, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114492854A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115719116A (en) * 2022-11-21 2023-02-28 重庆大学 Power load prediction method and device and terminal equipment
CN116595384A (en) * 2023-07-14 2023-08-15 支付宝(杭州)信息技术有限公司 Model training method and device
CN116595384B (en) * 2023-07-14 2023-11-24 支付宝(杭州)信息技术有限公司 Model training method and device
CN116611536A (en) * 2023-07-19 2023-08-18 支付宝(杭州)信息技术有限公司 Model training method and device, electronic equipment and storage medium
CN116611536B (en) * 2023-07-19 2023-09-29 支付宝(杭州)信息技术有限公司 Model training method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN114492854A (en) Method and device for training model, electronic equipment and storage medium
CN111861569B (en) Product information recommendation method and device
CN111768231B (en) Product information recommendation method and device
CN112214775B (en) Injection attack method, device, medium and electronic equipment for preventing third party from acquiring key diagram data information and diagram data
US20190147539A1 (en) Method and apparatus for outputting information
US20230080230A1 (en) Method for generating federated learning model
CN113657269A (en) Training method and device for face recognition model and computer program product
CN112994981B (en) Method and device for adjusting time delay data, electronic equipment and storage medium
EP4123595A2 (en) Method and apparatus of rectifying text image, training method and apparatus, electronic device, and medium
CN113627536A (en) Model training method, video classification method, device, equipment and storage medium
CN113051239A (en) Data sharing method, use method of model applying data sharing method and related equipment
CN114186256A (en) Neural network model training method, device, equipment and storage medium
CN112381074B (en) Image recognition method and device, electronic equipment and computer readable medium
CN112152879B (en) Network quality determination method, device, electronic equipment and readable storage medium
CN113037489A (en) Data processing method, device, equipment and storage medium
CN113724398A (en) Augmented reality method, apparatus, device and storage medium
CN115700548A (en) Method, apparatus and computer program product for user behavior prediction
CN116629379A (en) Federal learning aggregation method and device, storage medium and electronic equipment
US20230418794A1 (en) Data processing method, and non-transitory medium and electronic device
CN114339252B (en) Data compression method and device
CN112598127A (en) Federal learning model training method and device, electronic equipment, medium and product
CN116521377B (en) Service computing unloading method, system, device, equipment and medium
CN108055193A (en) Using activity as the communication means of medium
US20240037410A1 (en) Method for model aggregation in federated learning, server, device, and storage medium
WO2023241248A1 (en) Method and apparatus for users to obtain models in metaverse environment, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination