WO2021218167A1 - Data processing model generation method and apparatus and data processing method and apparatus - Google Patents

Data processing model generation method and apparatus and data processing method and apparatus

Info

Publication number
WO2021218167A1
WO2021218167A1 PCT/CN2020/135350 CN2020135350W WO2021218167A1 WO 2021218167 A1 WO2021218167 A1 WO 2021218167A1 CN 2020135350 W CN2020135350 W CN 2020135350W WO 2021218167 A1 WO2021218167 A1 WO 2021218167A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
data
service
intersection
terminal
Prior art date
Application number
PCT/CN2020/135350
Other languages
French (fr)
Chinese (zh)
Inventor
周学立
朱恩东
张茜
蔡满天
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021218167A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/602: Providing cryptographic facilities or services
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • This application relates to the field of data processing, and in particular to a method and device for generating a data processing model and to a method and device for data processing.
  • The inventors recognized that processing user data in an application often requires sufficient sample data to support the processing.
  • Where that sample data is stored, however, is uncertain. For example, data about the same user is often held by several departments, and each department declines to disclose its samples in order to protect the privacy of the data it stores.
  • As a result, the sample data is often incomplete, which leads to the problem of data islands and makes it impossible to generate accurate demand information.
  • The embodiments of the present application provide a method and device for generating a data processing model and a method and device for data processing, to solve the problem of data islands.
  • A method for generating a data processing model, including:
  • obtaining, through the data support terminal, a model training request that is sent by the service terminal and includes a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining, through the data support terminal, the second ID information according to the data partition;
  • wherein the service partition includes service data and corresponding first ID information, and the data partition includes support data and corresponding second ID information;
  • after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation results, containing intersection IDs, sent by the service terminal, where each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
  • when the federation is successful, generating, through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is an intersection ID and from the service data and support data corresponding to that intersection ID;
  • performing, through the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model;
  • wherein the federated data processing model is used, when the federation is successful, to output a federated prediction result after receiving input ID information to be processed, and the federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed.
  • A data processing model generating device, including:
  • an information determining module, used to obtain, through the data support terminal, the model training request that is sent by the service terminal and includes the service partition, to determine the data partition in the data support terminal that corresponds to the service partition, and to determine, through the data support terminal, the second ID information according to the data partition; wherein the service partition contains service data and corresponding first ID information, and the data partition contains support data and corresponding second ID information;
  • a first engine calculation module, used to have the data support terminal obtain, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the engine calculation results, containing intersection IDs, sent by the service terminal, where each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
  • an intersection training set generation module, used to generate, when the federation is successful and through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is an intersection ID and from the service data and support data corresponding to that intersection ID;
  • a federated learning module, used to perform, through the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model;
  • wherein the federated data processing model is used, when the federation is successful, to output a federated prediction result after receiving input ID information to be processed, and the federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed.
  • A data processing method, including:
  • when the federation is currently in a successful state, inputting the ID information to be processed into a preset federated data processing model to obtain the federated prediction result output by that model, where the federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed; the preset federated data processing model is generated according to the data processing model generation method described above.
  • A data processing device, including:
  • a federation state detection module, used to detect whether the federation is currently in a successful state when the data support terminal receives, from the service terminal, a data support request containing the ID information to be processed;
  • a federated prediction module, used to input the ID information to be processed into a preset federated data processing model when the federation is currently in a successful state, and to obtain the federated prediction result output by that model;
  • wherein the federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed, and the preset federated data processing model is generated according to the data processing model generation method described above.
  • A computer device, including a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, where the processor implements the following steps when executing the computer-readable instructions:
  • obtaining, through the data support terminal, a model training request that is sent by the service terminal and includes a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining, through the data support terminal, the second ID information according to the data partition;
  • wherein the service partition includes service data and corresponding first ID information, and the data partition includes support data and corresponding second ID information;
  • after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation results, containing intersection IDs, sent by the service terminal, where each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
  • when the federation is successful, generating, through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is an intersection ID and from the service data and support data corresponding to that intersection ID;
  • performing, through the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model;
  • wherein the federated data processing model is used, when the federation is successful, to output a federated prediction result after receiving input ID information to be processed, and the federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed.
  • A computer device, including a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, where the processor implements the following steps when executing the computer-readable instructions:
  • when the data support terminal receives, from the service terminal, a data support request containing the ID information to be processed, detecting whether the federation is currently in a successful state;
  • when the federation is currently in a successful state, inputting the ID information to be processed into a preset federated data processing model to obtain the federated prediction result output by that model, where the federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed;
  • wherein the preset federated data processing model is generated according to the data processing model generation method described above.
  • One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • obtaining, through the data support terminal, a model training request that is sent by the service terminal and includes a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining, through the data support terminal, the second ID information according to the data partition;
  • wherein the service partition includes service data and corresponding first ID information, and the data partition includes support data and corresponding second ID information;
  • after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation results, containing intersection IDs, sent by the service terminal, where each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
  • when the federation is successful, generating, through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is an intersection ID and from the service data and support data corresponding to that intersection ID;
  • performing, through the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model;
  • wherein the federated data processing model is used, when the federation is successful, to output a federated prediction result after receiving input ID information to be processed, and the federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed.
  • One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • when the data support terminal receives, from the service terminal, a data support request containing the ID information to be processed, detecting whether the federation is currently in a successful state;
  • when the federation is currently in a successful state, inputting the ID information to be processed into a preset federated data processing model to obtain the federated prediction result output by that model, where the federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed;
  • wherein the preset federated data processing model is generated according to the data processing model generation method described above.
  • By performing intersection processing on the ID information in the service terminal and the data support terminal, the above data processing model generation method, device, computer equipment and storage medium generate an intersection data set from the service data and support data corresponding to the intersection ID information. This finds the ID information held by both parties while keeping the data of both terminals secure. Without leaking either party's ID information, the two parties cooperate to train the federated model and build the federated prediction, so that the federated prediction result is very close to the result that a model trained on both parties' data in the clear would produce, thus solving the problem of data islands.
  • The intersection data set is used for federated learning training, which improves the efficiency of data processing.
  • The federated data processing model obtained by training is used for federated prediction, which improves the integrity of the entire system, and with the support of multi-party data the prediction results are more accurate.
  • By using the federated data processing model for prediction when the federation is currently in a successful state, the above data processing method, device, computer equipment and storage medium can solve the problem of data islands and improve prediction accuracy while keeping the data secure.
  • When the federation is not in a successful state, the local data processing model is used for prediction instead. This local prediction serves as a contingency plan that ensures system tasks can still be completed, which improves the comprehensiveness of the system and reduces the risk posed by communication failure or interruption.
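The choice between federated and local prediction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `predict` function, the `source` field, and both stand-in models are hypothetical names introduced here.

```python
from typing import Callable, Dict


def predict(id_info: str,
            federation_ok: bool,
            federated_model: Callable[[str], Dict],
            local_model: Callable[[str], Dict]) -> Dict:
    """Use the federated model when the federation is successful, else the local one."""
    model = federated_model if federation_ok else local_model
    result = dict(model(id_info))
    # Record which model answered so callers can tell a fallback apart.
    result["source"] = "federated" if federation_ok else "local"
    return result


# Hypothetical stand-in models, for illustration only.
def federated_model(id_info: str) -> Dict:
    return {"id": id_info, "support_data": {"region": "north"}}


def local_model(id_info: str) -> Dict:
    return {"id": id_info, "support_data": None}


print(predict("u1", True, federated_model, local_model))
print(predict("u1", False, federated_model, local_model))
```

The local branch returns no support data here because, in this sketch, only the federated model has access to the other party's contribution.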
  • FIG. 1 is a schematic diagram of an application environment of a data processing model generation method and a data processing method in an embodiment of the present application;
  • FIG. 2 is a flowchart of a method for generating a data processing model in an embodiment of the present application;
  • FIG. 3 is another flowchart of a method for generating a data processing model in an embodiment of the present application;
  • FIG. 4 is a functional block diagram of a data processing model generating device in an embodiment of the present application;
  • FIG. 5 is another functional block diagram of the data processing model generating device in an embodiment of the present application;
  • FIG. 6 is a flowchart of a data processing method in an embodiment of the present application;
  • FIG. 7 is a functional block diagram of a data processing device in an embodiment of the present application;
  • FIG. 8 is a schematic diagram of a computer device in an embodiment of the present application.
  • The embodiment of the present application provides a method for generating a data processing model, which can be applied in the application environment shown in FIG. 1.
  • Specifically, the method is applied in a data processing model generation system.
  • The system includes a client and a server, as shown in FIG. 1.
  • The client and the server communicate over a network to address the problem of data islands.
  • The client, also called the user side, is the program that corresponds to the server and provides local services to the user.
  • The client can be installed on, but is not limited to, personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices.
  • The server can be implemented as an independent server or as a cluster of multiple servers. Further, the server includes a service terminal and a data support terminal.
  • In one embodiment, a method for generating a data processing model is provided.
  • Taking the application of the method to the server in FIG. 1 as an example, the method includes the following steps:
  • S11: Obtain, through the data support terminal, the model training request that is sent by the service terminal and includes the service partition; determine the data partition in the data support terminal that corresponds to the service partition; and determine, through the data support terminal, the second ID information according to the data partition. The service partition contains service data and corresponding first ID information, and the data partition contains support data and corresponding second ID information.
  • The data support terminal is a terminal that receives service requests from the service terminal and provides data support and model calculation.
  • The data support terminal can be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or computer cluster.
  • The service terminal is a terminal that sends service requests to the data support terminal.
  • The service terminal can be, but is not limited to, a personal computer, notebook computer, smartphone, or tablet computer.
  • Model training is performed based on artificial intelligence technology, and the model training request is a request that the service terminal sends to the data support terminal to obtain the corresponding data support.
  • The first ID information is the ID information corresponding to the service data in the service partition of the service terminal.
  • The service data is data in the service terminal, and each piece of first ID information has corresponding service data.
  • The data partition is the area obtained after the data support terminal calculates and classifies the ID information corresponding to the support data.
  • The number of data partitions is determined by the number of binning machines in the data support terminal and the data volume of the second ID information.
  • The service partition is the area obtained when the service terminal applies to the ID information corresponding to the service data the same calculation and classification as the data support terminal uses, and each service partition may correspond to one data partition.
  • The second ID information is the ID information corresponding to the support data in the data partition that, in the data support terminal, corresponds to the first ID information.
  • The support data is data in the data support terminal, and each piece of second ID information has corresponding support data.
  • Specifically, the data support terminal obtains the model training request that is sent by the service terminal and includes the service partition, determines the data partition in the data support terminal that corresponds to the service partition, and determines the second ID information according to that data partition. The service partition contains service data and corresponding first ID information, and the data partition contains support data and corresponding second ID information.
  • Before the model training request including the service partition sent by the service terminal is obtained through the data support terminal, the method includes:
  • The data support terminal receives the preset rule sent by the service terminal and uses it to unify the format of the ID information corresponding to the support data in the data support terminal. It then applies a uniform encryption algorithm to the unified ID information and the corresponding support data so that they are evenly distributed, which yields the data to be binned.
  • The data support terminal bins the data to be binned according to a preset binning strategy, and obtains the binning information and the data partitions corresponding to the data to be binned.
  • The data support terminal sends the binning information to the service terminal, and the service terminal hashes the ID information corresponding to the service data according to the binning information to obtain the bin number.
  • The data support request is generated by the service terminal, which determines the first ID information from the ID information corresponding to the service data in the service partition and then generates the request according to the bin number and the first ID information.
  • The preset rule is a rule sent by the service terminal to the data support terminal. In essence, it unifies the data format of the ID information of the service terminal and of the data support terminal, so that the ID information in both terminals has a consistent format.
  • The unification operation means that the data support terminal brings the data format of the ID information in its database into line with that of the ID information of the service terminal.
  • The uniform encryption algorithm ensures that the processed data to be binned is irreversible and that it can be evenly distributed. The uniform encryption algorithm can be a uniform hash algorithm.
  • The data to be binned is the data in the database that is waiting to be binned.
  • The data to be binned can include ID identity information together with the keyword data of the search records for that ID, ID identity information together with the data of the pages visited by that ID, or ID identity information together with the data of the application software downloaded by that ID.
  • The data to be binned may also include ID identity information and the static data corresponding to that ID.
  • The static data may be the age, gender, or residential area associated with the ID identity information.
  • Hashing is a method that transforms an input of any length (also called a pre-image) into a fixed-length output through a hash algorithm.
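The unification, hashing, and binning steps can be sketched as follows. This is a minimal illustration under stated assumptions: the source names neither the preset rule nor the hash function, so the trim-and-lowercase rule, SHA-256, the modulo bin assignment, and the helper names `normalize_id`, `hash_id`, `bin_number`, and `partition` are all hypothetical choices made here.

```python
import hashlib
from typing import Dict, Iterable, List


def normalize_id(raw_id: str) -> str:
    """Stand-in for the preset rule: trim whitespace and lower-case the ID."""
    return raw_id.strip().lower()


def hash_id(raw_id: str) -> str:
    """Hash the unified ID (SHA-256 is an assumption, not named in the source)."""
    return hashlib.sha256(normalize_id(raw_id).encode("utf-8")).hexdigest()


def bin_number(raw_id: str, n_bins: int) -> int:
    """Map the hashed ID to a bin; a uniform hash spreads IDs evenly over bins."""
    return int(hash_id(raw_id), 16) % n_bins


def partition(ids: Iterable[str], n_bins: int) -> Dict[int, List[str]]:
    """Group hashed IDs by bin number, giving one partition per bin."""
    bins: Dict[int, List[str]] = {b: [] for b in range(n_bins)}
    for raw_id in ids:
        bins[bin_number(raw_id, n_bins)].append(hash_id(raw_id))
    return bins


ids = ["Alice", " bob ", "CAROL", "dave"]
print({b: len(members) for b, members in partition(ids, 4).items()})
```

Because both terminals apply the same rule and the same hash, an ID held by both lands in the same bin number on each side, which is what lets a service partition be matched to a data partition.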
  • Intersection processing is a processing method that determines the ID information shared by the first ID information and the second ID information.
  • The engine calculation result is the result obtained after intersection processing of the first ID information and the second ID information.
  • An intersection ID is ID information shared by the service terminal and the data support terminal, that is, the first ID information and the second ID information are the same.
  • Specifically, the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information to obtain the engine calculation result.
  • The engine calculation results include intersection IDs and non-intersection IDs.
  • After the service terminal obtains the engine calculation result, the data support terminal obtains from the service terminal the engine calculation result containing the intersection ID.
  • For example, an RSA encryption method may be used to perform intersection processing on the first ID information and the second ID information.
  • A non-intersection ID is ID information held by the service terminal but not by the data support terminal, that is, the first ID information and the second ID information are different.
  • One service partition in the service terminal includes at least one piece of first ID information, and one data partition in the data support terminal includes at least one piece of second ID information. Therefore, the engine calculation result may contain multiple intersection IDs and multiple non-intersection IDs.
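One way an RSA-based intersection could work is the well-known RSA blind-signature approach to private set intersection, sketched below. The source only says that an RSA encryption method may be used, so everything here is an assumption: the toy key built from two small Mersenne primes (far too small for real use), the treatment of the service terminal's secret as a random blinding factor, and all function names.

```python
import hashlib
import math
import secrets

# Toy RSA key of the data support terminal (illustrative only, insecure size).
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
E = 65537                                  # public exponent shared with the service terminal
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent, kept by the data support terminal


def h(id_info: str) -> int:
    """Hash an ID into the RSA group."""
    return int.from_bytes(hashlib.sha256(id_info.encode("utf-8")).digest(), "big") % N


def tag(value: int) -> str:
    """Second hash, so raw signatures are never compared directly."""
    return hashlib.sha256(str(value).encode("ascii")).hexdigest()


def blind(id_info: str):
    """Service terminal: hide H(id) behind a random factor r known only to itself."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    return (h(id_info) * pow(r, E, N)) % N, r


def sign(blinded: int) -> int:
    """Data support terminal: apply its private exponent to the blinded value."""
    return pow(blinded, D, N)


def unblind(signed: int, r: int) -> int:
    """Service terminal: remove r, leaving H(id)^D mod N."""
    return (signed * pow(r, -1, N)) % N


def intersect(first_ids, second_ids):
    # Data support terminal publishes tags of its own signed IDs.
    support_tags = {tag(pow(h(y), D, N)) for y in second_ids}
    result = {}
    for x in first_ids:
        blinded, r = blind(x)
        sig = unblind(sign(blinded), r)
        result[x] = "intersection" if tag(sig) in support_tags else "non-intersection"
    return result


print(intersect(["u1", "u2", "u3"], ["u2", "u3", "u4"]))
```

In this sketch the data support terminal never sees the service terminal's IDs in the clear, and the service terminal learns only which of its own IDs are shared.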
  • The intersection training set is, in essence, a training data set.
  • The data in the intersection training set is the ID information shared by the service terminal and the data support terminal, together with the service data and support data corresponding to that ID information.
  • The service data remains stored in the service terminal, and the support data remains stored in the data support terminal.
  • The intersection training set is therefore a data set jointly generated by the service terminal and the data support terminal.
  • Specifically, when the federation is successful, the data support terminal and the service terminal generate the intersection training set from the first ID information whose engine calculation result is an intersection ID and from the service data and support data corresponding to that first ID information.
  • Because an intersection ID is ID information shared by the service terminal and the data support terminal, that is, the first ID information and the second ID information are the same, the description above refers only to the service data and support data corresponding to the first ID information.
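The idea that each party contributes rows to a shared training set without moving its data can be sketched as follows. This is a minimal illustration with made-up records; the `local_view` helper and the convention of sorting the intersection IDs to fix a common row order are assumptions introduced here.

```python
from typing import Dict, List, Sequence


def local_view(ordered_ids: Sequence[str], own_data: Dict[str, List[float]]) -> List[List[float]]:
    """Run by each terminal on its own data: keep only the rows for the
    intersection IDs, in the agreed order. No data leaves the terminal."""
    return [own_data[i] for i in ordered_ids]


# Made-up records held separately by the two terminals.
service_data = {"u1": [0.5], "u2": [0.2], "u3": [0.9]}
support_data = {"u2": [1.0, 3.0], "u3": [0.0, 7.0], "u4": [1.0, 1.0]}

# Both terminals know the intersection IDs and agree on one ordering.
intersection_ids = sorted({"u3", "u2"})

service_rows = local_view(intersection_ids, service_data)   # stays in the service terminal
support_rows = local_view(intersection_ids, support_data)   # stays in the data support terminal

print(intersection_ids, service_rows, support_rows)
```

Row k of each view now refers to the same ID, so the two views together act as one training set even though neither terminal holds the other's columns.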
  • The service terminal and the data support terminal perform federated learning training on the intersection training set to obtain the federated data processing model.
  • The federated data processing model is used, when the federation is successful, to output the federated prediction result after receiving the input ID information to be processed.
  • The federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed.
  • Federated learning training is a training method that can build a machine learning system without directly accessing the training data.
  • The federated data processing model is the model obtained after federated learning training on the intersection training set, and it is used in the subsequent prediction steps.
  • The ID information to be processed is the information that is to be input into the model for federated prediction.
  • The ID information to be processed may be ID information corresponding to document download information from multiple platforms, to historical access information from multiple platforms, or to historical purchase records from multiple platforms.
  • The federated prediction result is the result obtained by running federated prediction on the ID information to be processed with the federated data processing model.
  • The federated prediction result can be target classification information or recommendation information, and this information contains the service data and support data corresponding to the intersection ID.
  • Specifically, the service terminal and the data support terminal perform federated learning training on the intersection training set in a way that keeps the service data and support data in the set from being disclosed, thereby generating the federated data processing model.
  • The federated data processing model can be used to run prediction on ID information to be processed and obtain federated prediction results.
  • The federated prediction result includes the support data corresponding to the second ID information that intersects the ID information to be processed, and also includes the service data corresponding to the first ID information whose engine calculation result is an intersection ID.
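The source does not specify the model or the training protocol, so the following is only a sketch of one common form of federated training over vertically split data: a logistic regression in which each party keeps its own feature columns and weights and shares only partial scores and residuals. The `Party` class, the toy data, and the plaintext exchange are assumptions; real systems typically protect the exchanged values, for example with homomorphic encryption.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class Party:
    """Holds one terminal's feature rows (aligned by intersection ID) and weights."""

    def __init__(self, features: List[List[float]]):
        self.x = features                      # raw features never leave this object
        self.w = [0.0] * len(features[0])

    def partial_scores(self) -> List[float]:
        return [sum(wj * xj for wj, xj in zip(self.w, row)) for row in self.x]

    def apply_gradient(self, residuals: List[float], lr: float) -> None:
        n = len(self.x)
        for j in range(len(self.w)):
            grad = sum(r * row[j] for r, row in zip(residuals, self.x)) / n
            self.w[j] -= lr * grad


def log_loss(scores: List[float], labels: List[int]) -> float:
    eps = 1e-12
    return -sum(y * math.log(sigmoid(s) + eps) + (1 - y) * math.log(1 - sigmoid(s) + eps)
                for s, y in zip(scores, labels)) / len(labels)


def train(service: Party, support: Party, labels: List[int],
          rounds: int = 200, lr: float = 0.5) -> List[float]:
    history = []
    for _ in range(rounds):
        # Only per-row partial scores are combined, not the features themselves.
        scores = [a + b for a, b in zip(service.partial_scores(), support.partial_scores())]
        history.append(log_loss(scores, labels))
        residuals = [sigmoid(s) - y for s, y in zip(scores, labels)]
        service.apply_gradient(residuals, lr)
        support.apply_gradient(residuals, lr)
    return history


service = Party([[1.0, 0.2], [1.0, 0.9], [1.0, 0.1], [1.0, 0.8]])  # toy service data
support = Party([[0.1], [0.7], [0.3], [0.9]])                       # toy support data
labels = [0, 1, 0, 1]                                               # held by the service side
history = train(service, support, labels)
print(round(history[0], 4), round(history[-1], 4))
```

The loss falls over the rounds even though neither party ever reads the other's feature rows, which is the property the text attributes to federated learning training.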
  • In one embodiment, the method further includes the following step: storing the federated prediction result in a blockchain.
  • A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • The blockchain can include the underlying blockchain platform, the platform product service layer, and the application service layer.
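The linking of data blocks by cryptographic hashes can be sketched as follows. This is a minimal single-process illustration of a hash chain, not a decentralized blockchain platform; the block layout and the `make_block` and `verify` helpers are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64


def block_hash(prev_hash: str, payload: Dict) -> str:
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def make_block(prev_hash: str, payload: Dict) -> Dict:
    """Create a block whose hash covers both its payload and the previous hash."""
    return {"prev": prev_hash, "payload": payload, "hash": block_hash(prev_hash, payload)}


def verify(chain: List[Dict]) -> bool:
    """Recompute every hash and check that each block points at its predecessor."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(block["prev"], block["payload"]):
            return False
        prev = block["hash"]
    return True


chain: List[Dict] = []
for prediction in ({"id": "u2", "label": 1}, {"id": "u3", "label": 0}):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(make_block(prev_hash, prediction))

print(verify(chain))
```

Changing a stored prediction changes that block's hash and breaks the link from the next block, which is how tampering becomes detectable.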
  • In this embodiment, the service data and support data corresponding to the intersection ID information are generated into an intersection data set. This finds the ID information held by both parties while keeping the data of the service terminal and the data support terminal secure. Without leaking either party's ID information, the two parties cooperate to train the federated model and build the federated prediction, so that the federated prediction result is extremely close to the result that a model trained on both parties' data in the clear would produce, thus solving the problem of data islands.
  • The intersection data set is used for federated learning training, which improves the efficiency of data processing.
  • The federated data processing model obtained by training is used for federated prediction, which improves the integrity of the entire system, and with the support of multi-party data the prediction results are more accurate.
  • Step S20, that is, after intersection processing is performed on the first ID information and the second ID information, obtaining through the data support terminal the engine calculation result containing the intersection ID sent by the service terminal, specifically includes the following steps:
  • S121 Send the encryption key to the service terminal through the data support terminal, and obtain the first encrypted information obtained after the service terminal uses the encryption key and the first private key to encrypt the first ID information.
  • the encryption key is a public key provided by the data support terminal to the service terminal, and the encryption key is used by the service terminal to encrypt the first ID information.
  • the first private key is a key used by the service terminal to encrypt the first ID information, and the first private key is owned only by the service terminal.
  • the first encrypted information is information obtained by the service terminal using the encryption key and the first private key to encrypt the first ID information.
  • After the second ID information of the data partition corresponding to the first ID information is determined in the data support terminal, the data support terminal sends the encryption key to the service terminal. After receiving the encryption key, the service terminal uses the encryption key and the first private key to encrypt the first ID information to obtain the first encrypted information, and then sends the first encrypted information to the data support terminal.
  • S122 Use the second private key to encrypt the first encrypted information through the data support terminal to obtain the second encrypted information.
  • The second private key is a key used by the data support terminal for encryption, and it is held only by the data support terminal.
  • the second encrypted information is information obtained by the data support terminal using the second private key to encrypt the first encrypted information.
  • The data support terminal uses the second private key to encrypt the first encrypted information to obtain the second encrypted information, which further improves the security of the data.
  • S123 Use the encryption key and the second private key to encrypt the second ID information through the data support terminal to obtain the third encrypted information.
  • the third encrypted information is information obtained by encrypting the second ID information by the data support terminal using the encryption key and the second private key.
  • The data support terminal uses the encryption key and the second private key to encrypt the second ID information to obtain the third encrypted information.
  • If the data support terminal used the encryption key and the second private key to encrypt the second ID information after every data support request sent by the service terminal, these operations would greatly increase the response time, waste a large amount of computing resources, and place higher demands on the peak computing capacity of the data support terminal.
  • Instead, the data support terminal can use the encryption key and the second private key to encrypt, in advance, all the ID information corresponding to the support data. After obtaining the data support request sent by the service terminal, the data support terminal can then determine, from the data partition, the partition corresponding to the first ID information in the service partition of the service terminal.
  • the encryption key can be updated through the data support terminal to further ensure the security of the data.
  • The first encrypted information is generated through the service terminal and the third encrypted information is generated through the data support terminal.
  • The order of these two steps is not fixed: the first encrypted information may be obtained through the service terminal first, the third encrypted information may be obtained through the data support terminal first, or the two steps may be performed simultaneously.
  • S124 Send the second encrypted information and the third encrypted information to the service terminal through the data support terminal, and obtain, through the data support terminal, the first intermediate result obtained after the service terminal performs the intersection engine calculation on the second encrypted information and the third encrypted information; the first intermediate result contains the intersection ID.
  • the first intermediate result is the result obtained by the service terminal after the intersection engine calculation of the second encrypted information and the third encrypted information.
  • The intersection engine calculation is essentially a calculation method for determining the ID information shared by the service terminal and the data support terminal.
  • The service terminal performs the intersection engine calculation on the second encrypted information and the third encrypted information to obtain the first intermediate result, and sends the first intermediate result, which includes the intersection ID, to the data support terminal.
  • S125 Perform decryption calculation on the first intermediate result through the data support terminal to obtain the second intermediate result, send the second intermediate result to the service terminal through the data support terminal, and obtain, through the data support terminal, the engine calculation result containing the intersection ID that the service terminal obtains after integrating the second intermediate result.
  • the decryption calculation is calculation for the data support terminal to decrypt the first intermediate result.
  • the second intermediate result is the result obtained after decrypting the first intermediate result.
  • The data support terminal performs the decryption calculation on the first intermediate result to obtain the second intermediate result and sends the second intermediate result to the service terminal; the data support terminal then obtains the engine calculation result that the service terminal produces by integrating the second intermediate result.
  • The data support terminal continues the intersection engine calculation on the first intermediate result to determine whether an intersection ID exists, obtains a third intermediate result, and sends the third intermediate result to the service terminal.
  • The service terminal then continues the intersection engine calculation until the intersection ID is determined. If the third intermediate result contains the intersection ID, the data support terminal performs the decryption calculation on the third intermediate result and sends the decrypted third intermediate result to the service terminal.
  • The encryption key, the first private key, and the second private key are used to encrypt the data, which ensures that the ID information of the service terminal and of the data support terminal is not visible to the other party while the ID information shared by the two terminals is found.
  • This further improves the security of both parties' data and at the same time protects the privacy of the users of the service terminal.
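The double-encryption flow of steps S121 to S124 can be illustrated with a Diffie-Hellman-style commutative blinding scheme. This is a swapped-in technique chosen for illustration, not necessarily the scheme used in this application: the text above does not specify how the encryption key and the private keys are combined. In this sketch the public modulus `P` stands in for the shared "encryption key", the exponents `a` and `b` stand in for the first and second private keys, and all function names are hypothetical. Both parties are simulated in one process, the decryption exchange of S125 is omitted, and the modulus is far too small for real use.

```python
import hashlib
import secrets

# Public parameter known to both parties (stand-in for the "encryption key").
# 2**127 - 1 is a Mersenne prime; it is used here only to keep the demo small.
P = 2**127 - 1


def hash_to_group(id_value):
    # Map an ID string to a non-zero integer modulo P.
    digest = hashlib.sha256(id_value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P or 1


def new_private_key():
    # Random exponent in [2, P - 2]; never leaves the party that created it.
    return secrets.randbelow(P - 3) + 2


def blind(value, private_key):
    # Exponentiation commutes: blind(blind(x, a), b) == blind(blind(x, b), a).
    return pow(value, private_key, P)


def private_set_intersection(service_ids, support_ids):
    a = new_private_key()  # first private key (service terminal)
    b = new_private_key()  # second private key (data support terminal)

    # S121: the service terminal blinds its first ID information.
    first = [blind(hash_to_group(i), a) for i in service_ids]
    # S122: the data support terminal blinds the result again (order kept).
    second = [blind(c, b) for c in first]
    # S123: the data support terminal blinds its own second ID information.
    third = {blind(hash_to_group(i), b) for i in support_ids}
    # S124: the service terminal applies its own key to the third values
    # and compares doubly blinded values; equal values mean equal IDs.
    third_double = {blind(c, a) for c in third}
    return [i for i, c in zip(service_ids, second) if c in third_double]
```

For example, `private_set_intersection(["u1", "u2", "u3"], ["u3", "u1", "u9"])` returns the shared IDs `["u1", "u3"]` while neither side ever sees the other side's unblinded IDs.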
  • the data processing model generation method further includes:
  • When the federation is successful, the service terminal generates the complement training set according to the first ID information in the service partition whose engine calculation result is a non-intersection ID and the service data corresponding to that non-intersection ID.
  • The complement training set is essentially a data training set. Its data includes the first ID information in the service partition of the service terminal whose engine calculation result is a non-intersection ID, together with the service data corresponding to that first ID information; that is, the first ID information in the complement training set does not overlap with the second ID information in the data partition of the data support terminal.
  • The engine calculation result contains both intersection IDs and non-intersection IDs. Therefore, when the federation is successful, the service terminal generates the complement training set from the first ID information in the service partition whose engine calculation result is a non-intersection ID and the service data corresponding to that first ID information.
  • the data processing model generation method further includes the following steps:
  • the service terminal generates a local training set according to all the first ID information in the service partition and all the service data corresponding to each first ID information in the service partition of the service terminal.
  • the essence of the local training set is a data training set, and the data in the local training set is all the first ID information in the service partition of the service terminal and all the service data corresponding to each first ID information in the service partition of the service terminal.
  • The service terminal generates the local training set from all the first ID information in the service partition and all the service data corresponding to each first ID information in the service partition of the service terminal.
  • The intersection ID part is generated into the intersection training set, the non-intersection ID part is generated into the complement training set, and all the ID information of the service terminal with its corresponding service data is generated into the local training set.
  • This avoids the traditional practice of using only the intersection IDs and discarding the non-intersection IDs, which requires an additional machine learning platform to do extra work. Because corresponding training sets are generated for the intersection IDs, the non-intersection IDs, and all the ID information of the service terminal, all data can be used effectively and costs are saved.
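The split into the three training sets described above can be sketched as follows. The function name `build_training_sets` and the dictionary layout are assumptions made for illustration. The sketch also joins the service data and support data in a single process, whereas in actual federated learning each party keeps its own data locally.

```python
def build_training_sets(service_partition, data_partition, intersection_ids):
    """Split the service partition into intersection, complement and local sets.

    service_partition: dict mapping first ID information -> service data
    data_partition:    dict mapping second ID information -> support data
    intersection_ids:  set of IDs the engine calculation marked as shared
    """
    intersection_set = {}
    complement_set = {}
    for id_info, service_data in service_partition.items():
        if id_info in intersection_ids:
            # Shared ID: pair the service data with the matching support data.
            intersection_set[id_info] = {
                "service": service_data,
                "support": data_partition[id_info],
            }
        else:
            # Non-intersection ID: only the service terminal's own data exists.
            complement_set[id_info] = {"service": service_data}
    # The local set always covers every ID in the service partition.
    local_set = {
        id_info: {"service": service_data}
        for id_info, service_data in service_partition.items()
    }
    return intersection_set, complement_set, local_set
```

No record of the service terminal is discarded: every ID lands in the local set, and in exactly one of the intersection set or the complement set.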
  • the data processing model generation method further includes:
  • the service terminal performs local learning and training according to the complementary training set to obtain the first local data processing model.
  • The first local data processing model is used to output the first local prediction result after receiving the input ID information to be processed; the first local prediction result contains the service data corresponding to the non-intersection ID; and/or
  • the service terminal performs local learning and training according to the local training set to obtain a second local data processing model. The second local data processing model is used to output a second local prediction result after receiving the input ID information to be processed; the second local prediction result includes the service data corresponding to the first ID information in the service partition of the service terminal.
  • the local learning training is a method of training using the service data of the service terminal.
  • Both the first local data processing model and the second local data processing model are models for local prediction.
  • the local prediction result may be target classification information, recommendation information, etc.
  • the target classification information or recommendation information includes the service data corresponding to the non-intersection ID or all the service data in the service partition of the service terminal.
  • The service terminal performs local learning training according to the complement training set to obtain the first local data processing model, and performs local learning training according to the local training set to obtain the second local data processing model.
  • When the federation is successful, the first local data processing model and the second local data processing model can perform local prediction on the ID information to be processed after receiving it, to obtain the local prediction result. The local prediction result can then be fused with the federated prediction result generated in the above embodiment to improve accuracy.
  • The first local data processing model and the second local data processing model are mainly used when the federation is interrupted: after receiving the input ID information to be processed, they perform local prediction on it to obtain the local prediction result.
  • The local prediction result can serve as an emergency plan that compensates for the inability to generate the federated prediction result during a federation interruption, making the system more comprehensive.
  • local learning and training can support and use multiple machine learning algorithms, for example, LR, XGB, NB, or DNN, etc.
  • the tasks of local learning and training can be supervised regression problems, supervised classification problems, or unsupervised machine learning problems.
  • the local data processing model is obtained by performing local learning training according to the complement training set and/or the local training set, and the local data processing model is used to perform local prediction on the ID information to be processed to obtain the local prediction result.
  • Local learning and training on the complement training set and the local training set yields prediction results for all data, which makes the prediction results more comprehensive, avoids the need for an additional machine learning platform for local training, and improves the comprehensiveness and flexibility of the system.
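As one concrete instance of local learning training, the sketch below fits a logistic regression (LR, one of the algorithms named above) with plain stochastic gradient descent on data held only by the service terminal. The feature layout, learning rate, epoch count and function names are assumptions chosen for illustration; the description does not prescribe them.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(features, labels, lr=0.1, epochs=500):
    # features: list of equal-length numeric lists; labels: list of 0/1 values.
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(model, x):
    # Return the predicted class (0 or 1) for one feature vector.
    weights, bias = model
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0
```

The same routine could be run once on the complement training set and once on the local training set to obtain the first and second local data processing models respectively.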
  • a data processing model generating device is provided, and the data processing model generating device corresponds one-to-one to the data processing model generating method in the above-mentioned embodiment.
  • the data processing model generation device includes an information determination module 11, an intersection processing module 12, an intersection training set generation module 13 and a federated learning module 14. The detailed description of each functional module is as follows:
  • the information determining module 11 is configured to obtain the model training request including the service partition sent by the service terminal through the data support terminal, determine the data partition corresponding to the service partition in the data support terminal, and determine the second ID information according to the data partition through the data support terminal; Wherein, the service partition includes service data and corresponding first ID information, and the data partition includes support data and corresponding second ID information.
  • The intersection processing module 12 is used to obtain, through the data support terminal, the engine calculation result containing the intersection ID sent by the service terminal after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information; each first ID information whose engine calculation result is an intersection ID corresponds to one second ID information that intersects with it.
  • The intersection training set generation module 13 is used to generate, through the data support terminal and the service terminal when the federation is successful, the intersection training set according to the first ID information whose engine calculation result is an intersection ID and the service data and support data corresponding to that intersection ID.
  • The federated learning module 14 is used to perform federated learning training according to the intersection training set through the service terminal and the data support terminal to obtain the federated data processing model.
  • The federated data processing model is used to output the federated prediction result after receiving the input ID information to be processed when the federation is successful; the federated prediction result includes the support data corresponding to the second ID information that intersects with the ID information to be processed.
  • intersection processing module 12 further includes:
  • The first encrypted information generating module 121 is configured to send the encryption key to the service terminal through the data support terminal, and to obtain, through the data support terminal, the first encrypted information obtained after the service terminal uses the encryption key and the first private key to encrypt the first ID information.
  • the second encrypted information generating module 122 is configured to encrypt the first encrypted information by using the second private key through the data support terminal to obtain the second encrypted information.
  • The third encrypted information generating module 123 is configured to use the encryption key and the second private key to encrypt the second ID information through the data support terminal to obtain the third encrypted information.
  • The intersection engine calculation module 124 is configured to send the second encrypted information and the third encrypted information to the service terminal through the data support terminal, and to obtain, through the data support terminal, the first intermediate result obtained after the service terminal performs the intersection engine calculation on the second encrypted information and the third encrypted information; the first intermediate result contains the intersection ID.
  • The decryption calculation module 125 is used to perform decryption calculation on the first intermediate result through the data support terminal to obtain the second intermediate result, send the second intermediate result to the service terminal through the data support terminal, and obtain, through the data support terminal, the engine calculation result containing the intersection ID that the service terminal obtains after integrating the second intermediate result.
  • the data processing model generating device further includes:
  • The complement training set generation module is used to generate, through the service terminal when the federation is successful, the complement training set according to the first ID information in the service partition whose engine calculation result is a non-intersection ID and the service data corresponding to that non-intersection ID.
  • the data processing model generating device further includes:
  • the local training set generating module is used for generating the local training set by the service terminal according to all the first ID information in the service partition and all the service data corresponding to each first ID information in the service partition of the service terminal.
  • the data processing model generating device further includes:
  • The local learning module is used to perform local learning training according to the complement training set through the service terminal to obtain the first local data processing model. The first local data processing model is used to output the first local prediction result after receiving the input ID information to be processed; the first local prediction result includes the service data corresponding to the non-intersection ID; and/or
  • to perform local learning training according to the local training set through the service terminal to obtain the second local data processing model. The second local data processing model is used to output the second local prediction result after receiving the input ID information to be processed; the second local prediction result includes the service data corresponding to the first ID information in the service partition of the service terminal.
  • Each module in the above-mentioned data processing model generating device can be implemented in whole or in part by software, hardware, or a combination thereof.
  • The above-mentioned modules may be embedded in, or independent of, the processor of the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • the embodiment of the present application also provides a data processing method, which can be applied in the application environment shown in FIG. 1.
  • the data processing method is applied in a data processing system, and the data processing system includes a client and a server as shown in FIG.
  • The client, also called the user terminal, is a program that corresponds to the server and provides local services to the user.
  • the client can be installed on, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers. Further, the server includes a service terminal and a data support terminal.
  • As shown in FIG. 6, a data processing method is provided. The method is described by taking its application to the server in FIG. 1 as an example, and includes the following steps:
  • the data support request containing the ID information to be processed is a support request for the service terminal to request the data support terminal to perform predictive processing on the ID information to be processed.
  • The ID information to be processed is the information that is to be input into the model for federated prediction in order to obtain result information.
  • After the data support terminal receives the data support request containing the ID information to be processed sent by the service terminal, it detects the current federation state to determine whether the federation is currently in a successful state.
  • the federated data processing model is used to perform federated predictions on the ID information to be processed and generate federated prediction results.
  • the federated prediction result includes supporting data corresponding to the second ID information that has an intersection with the ID information to be processed.
  • the preset federated data processing model is generated according to the data processing model generating method in the foregoing embodiment.
  • the data processing method further includes:
  • The local data processing model is used to perform local prediction on the ID information to be processed and generate local prediction results.
  • the local data processing model is generated according to the data processing model generation method in the above-mentioned embodiment. Further, the local data processing model may be the first local data processing model or the second local data processing model.
  • the local data processing model can be used to output the local prediction result after receiving the input ID information to be processed when the federation is successful or when the federation is interrupted.
  • When the federation is successful, the local data processing model is used to locally predict the ID information to be processed to obtain the local prediction result, which can make the prediction result more comprehensive; when the federation is interrupted, the local prediction result is output after the input ID information to be processed is received.
  • the federated interruption state may be a state in which the communication is cut off or the communication is unstable during the federated learning process, or may be a state where the federated learning is not responding during the federated learning process.
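The choice between federated and local prediction described above can be sketched as a small dispatch function. The names `FederationState` and `predict_pending_id`, and the use of `ConnectionError` and `TimeoutError` to represent a cut-off or unresponsive federation, are assumptions for illustration; the two models are represented as plain callables.

```python
from enum import Enum


class FederationState(Enum):
    SUCCESS = "success"
    INTERRUPTED = "interrupted"


def predict_pending_id(pending_id, state, federated_model, local_model):
    """Return (source, result) for the ID information to be processed."""
    if state is FederationState.SUCCESS:
        try:
            return "federated", federated_model(pending_id)
        except (ConnectionError, TimeoutError):
            # Communication dropped or timed out mid-request:
            # treat it as a federation interruption and fall through.
            pass
    # Federation interrupted: the local model acts as the emergency plan.
    return "local", local_model(pending_id)
```

Returning the source alongside the result lets a downstream evaluation step know which model produced the prediction.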
  • the service data prediction method further includes:
  • An evaluation model is set in the service terminal, and the evaluation model is used to evaluate the result predicted for the ID information to be processed by the federated data processing model or the local data processing model.
  • the method for obtaining the evaluation model may be to fuse the federated prediction result or the local prediction result with the ID information to be processed.
  • the method of fusion includes, but is not limited to, voting mechanism, stacking training mechanism, reinforcement learning or bandit, etc.
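Of the fusion methods listed, a voting mechanism is the simplest to illustrate. The sketch below assumes that each model emits a class label and that the evaluation model holds one weight per prediction source; the function name `weighted_vote` and the source names in the example are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Fuse labels from several models by weighted voting.

    predictions: dict mapping source name -> predicted label
    weights:     dict mapping source name -> evaluation weight (default 1.0)
    """
    scores = defaultdict(float)
    for source, label in predictions.items():
        scores[label] += weights.get(source, 1.0)
    # The label with the highest total weight wins.
    return max(scores, key=scores.get)
```

With predictions `{"federated": "A", "local_1": "B", "local_2": "B"}`, equal weights select `"B"`, while giving the federated source a weight of 3.0 selects `"A"`.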
  • The PSI index is used to measure the stability and accuracy of the federated data processing model or the local data processing model.
  • A threshold is set for the PSI index. If the PSI index exceeds the threshold, updating the data customer group of the service terminal and the data support terminal is considered, and retraining is performed on the updated customer group.
  • After the federated prediction result or the local prediction result is received, accuracy analysis is performed on it, and the evaluation model is updated or the evaluation weights in the evaluation model are adjusted according to the result of the accuracy analysis.
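The text does not expand the abbreviation PSI. Assuming it refers to the Population Stability Index commonly used to monitor model drift, the threshold check could be sketched as below. The bucketed-count inputs, the smoothing constant `eps` and the default threshold of 0.25 (a common rule of thumb, not a value taken from this application) are all assumptions.

```python
import math


def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    # Both arguments are per-bucket counts over the same score buckets:
    # expected = distribution at training time, actual = current distribution.
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        e_pct = max(expected / expected_total, eps)  # avoid log(0)
        a_pct = max(actual / actual_total, eps)
        psi += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return psi


def needs_retraining(psi, threshold=0.25):
    # Exceeding the threshold signals that the customer group has shifted
    # and that the models should be retrained on updated data.
    return psi > threshold
```

Identical distributions give a PSI of exactly 0.0, and the value grows as the current score distribution moves away from the one seen at training time.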
  • the federated data processing model is used for prediction, which can solve the data island problem under the premise of ensuring data security, and also improve the accuracy of data prediction.
  • The local data processing model is used for prediction as an emergency plan that ensures the system's tasks can still be completed. This improves the comprehensiveness of the system and reduces the risk posed by communication failure or communication interruption.
  • a data processing device is provided, and the data processing device corresponds to the data processing method in the above-mentioned embodiment one-to-one.
  • the data processing device includes a federation state detection module 21 and a federated prediction module 22.
  • the detailed description of each functional module is as follows:
  • the federation state detection module 21 is used to detect whether the current federation is in a successful state when the data support terminal receives a data support request containing ID information to be processed sent by the service terminal;
  • The federated prediction module 22 is used to, when the federation is currently in a successful state, input the ID information to be processed into the preset federated data processing model and obtain the federated prediction result output by the preset federated data processing model. The federated prediction result contains the support data corresponding to the second ID information that intersects with the ID information to be processed; the preset federated data processing model is generated according to the data processing model generation method in the foregoing embodiment.
  • Each module in the above-mentioned data processing device can be implemented in whole or in part by software, hardware, or a combination thereof.
  • The above-mentioned modules may be embedded in, or independent of, the processor of the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 8.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a readable storage medium and an internal memory.
  • the readable storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer readable instructions in the readable storage medium.
  • the database of the computer equipment is used to store the data used in the above-mentioned data processing model generation method and the above-mentioned data processing method.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instruction is executed by the processor to realize a data processing model generation method, or the computer-readable instruction is executed by the processor to realize a data processing method.
  • the readable storage medium provided in this embodiment includes a non-volatile readable storage medium and a volatile readable storage medium.
  • a computer device including a memory, a processor, and computer readable instructions stored in the memory and capable of running on the processor, and the processor implements the following steps when the processor executes the computer readable instructions:
  • Obtain, through the data support terminal, the model training request including the service partition sent by the service terminal, determine the data partition corresponding to the service partition in the data support terminal, and determine the second ID information according to the data partition through the data support terminal; wherein the service partition includes service data and corresponding first ID information, and the data partition includes support data and corresponding second ID information;
  • After the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, obtain, through the data support terminal, the engine calculation result containing the intersection ID sent by the service terminal; each first ID information whose engine calculation result is an intersection ID corresponds to one second ID information that intersects with it;
  • When the federation is successful, generate, through the data support terminal and the service terminal, the intersection training set according to the first ID information whose engine calculation result is an intersection ID and the service data and support data corresponding to that intersection ID;
  • Perform federated learning training according to the intersection training set through the service terminal and the data support terminal to obtain a federated data processing model. The federated data processing model is used to output the federated prediction result after receiving the input ID information to be processed when the federation is successful; the federated prediction result includes the support data corresponding to the second ID information that intersects with the ID information to be processed.
  • a computer device including a memory, a processor, and computer readable instructions stored in the memory and capable of running on the processor, and the processor implements the following steps when the processor executes the computer readable instructions:
  • When the data support terminal receives the data support request containing the ID information to be processed sent by the service terminal, detect whether the federation is currently in a successful state;
  • When the federation is currently in a successful state, input the ID information to be processed into a preset federated data processing model to obtain the federated prediction result output by the preset federated data processing model; the federated prediction result includes the support data corresponding to the second ID information that intersects with the ID information to be processed;
  • The preset federated data processing model is generated according to the data processing model generation method described above.
  • One or more readable storage media storing computer-readable instructions are provided.
  • The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media; the readable storage media store computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the one or more processors implement the following steps:
  • Obtain, through the data support terminal, the model training request that is sent by the service terminal and contains the service partition; determine the data partition in the data support terminal that corresponds to the service partition; and determine the second ID information from the data partition through the data support terminal;
  • The service partition includes service data and the corresponding first ID information;
  • The data partition includes support data and the corresponding second ID information;
  • After the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the data support terminal obtains the engine calculation result, sent by the service terminal, that contains the intersection IDs; each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information that intersects with it;
  • When federation is successful, the data support terminal and the service terminal generate an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
  • The service terminal and the data support terminal perform federated learning training on the intersection training set to obtain a federated data processing model. When federation is successful, the federated data processing model outputs a federated prediction result after receiving the input ID information to be processed; the federated prediction result includes the support data corresponding to the second ID information that intersects with the ID information to be processed.
  • One or more readable storage media storing computer-readable instructions are provided.
  • The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media; the readable storage media store computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the one or more processors implement the following steps:
  • When the data support terminal receives a data support request that is sent by the service terminal and contains the ID information to be processed, detect whether federation is currently in a successful state;
  • When federation is currently in a successful state, input the ID information to be processed into a preset federated data processing model and obtain the federated prediction result output by that model; the federated prediction result includes the support data corresponding to the second ID information that intersects with the ID information to be processed;
  • The preset federated data processing model is generated according to the data processing model generation method described above.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Storage Device Security (AREA)

Abstract

A data processing model generation method and apparatus and a data processing method and apparatus. In the data processing model generation phase, intersection processing is performed on the determined first ID information and second ID information to obtain an engine calculation result comprising an intersection ID; an intersection training set is generated from the first ID information corresponding to the intersection ID and from the service data and support data corresponding to the intersection ID; and, in a successful federation state, federated learning training is performed on the intersection training set to generate a federated data processing model. In the data processing phase, when federation is successful, the ID information to be processed is input into the federated data processing model to perform federated prediction and generate a federated prediction result, thereby solving the data island problem while ensuring data transmission security. In addition, a local data processing model is generated during the model generation phase, and a local prediction result is generated by inputting the ID information to be processed into the local data processing model, thereby increasing the accuracy and security of data prediction results. The application also relates to artificial intelligence and blockchain technology.

Description

Data processing model generation method and apparatus, and data processing method and apparatus
This application claims priority to Chinese patent application No. 202010356458.6, filed with the Chinese Patent Office on April 29, 2020 and entitled "Data processing model generation method and apparatus, and data processing method and apparatus", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of data processing, and in particular to a data processing model generation method and apparatus and a data processing method and apparatus.
Background
With the advent of the big data era, data processing technology is developing ever more rapidly in applications such as recommendation systems, voice assistants, and precision advertising systems. As a result, processing the user data in these applications has become particularly important.
The inventors realized that processing user data in an application usually requires sufficient sample data to support the processing. However, where the sample data is stored is uncertain: for example, data about the same user is often stored across several departments, and each department, to protect the privacy of the sample data it holds, does not disclose its data samples. As a result, processing user data in an application often suffers from incomplete sample data, which leads to the data island problem and makes it impossible to generate accurate demand information.
Summary
Embodiments of this application provide a data processing model generation method and apparatus and a data processing method and apparatus to solve the data island problem.
A data processing model generation method, including:
obtaining, through a data support terminal, a model training request that is sent by a service terminal and contains a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining second ID information from the data partition through the data support terminal; wherein the service partition contains service data and the first ID information corresponding to it, and the data partition contains support data and the second ID information corresponding to it;
after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation result, sent by the service terminal, that contains intersection IDs, wherein each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information that intersects with it;
when federation is successful, generating, through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
performing federated learning training on the intersection training set through the service terminal and the data support terminal to obtain a federated data processing model, wherein the federated data processing model is used, when federation is successful, to output a federated prediction result after receiving input ID information to be processed, the federated prediction result containing the support data corresponding to the second ID information that intersects with the ID information to be processed.
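The intersection and training-set steps above can be sketched as follows. This is only a minimal illustration under stated assumptions: a plain in-memory set intersection stands in for the joint intersection processing (which in the application is carried out between the two terminals without disclosing IDs), the partitions are hypothetical dictionaries keyed by ID, and the names `intersect_ids` and `build_intersection_training_set` are invented for the example. No federated training is performed here.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def intersect_ids(first_ids: Iterable[str], second_ids: Iterable[str]) -> List[str]:
    """Return the IDs present on both sides, sorted for a stable order."""
    return sorted(set(first_ids) & set(second_ids))


def build_intersection_training_set(
    service_partition: Dict[str, Record],
    data_partition: Dict[str, Record],
) -> List[Tuple[str, Record, Record]]:
    """Pair the service data and support data of every intersection ID."""
    intersection = intersect_ids(service_partition, data_partition)
    return [(i, service_partition[i], data_partition[i]) for i in intersection]


# Hypothetical partitions (assumption): first/second ID -> associated record.
service_partition = {"u1": {"label": 1}, "u2": {"label": 0}, "u3": {"label": 1}}
data_partition = {"u2": {"income": 5.0}, "u3": {"income": 7.5}, "u4": {"income": 2.0}}

training_set = build_intersection_training_set(service_partition, data_partition)
```

Only IDs held by both terminals contribute rows, which is why the training set can be smaller than either partition.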
A data processing model generation apparatus, including:
an information determination module, configured to obtain, through a data support terminal, a model training request that is sent by a service terminal and contains a service partition, determine the data partition in the data support terminal that corresponds to the service partition, and determine second ID information from the data partition through the data support terminal; wherein the service partition contains service data and the first ID information corresponding to it, and the data partition contains support data and the second ID information corresponding to it;
a first engine calculation module, configured so that, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the data support terminal obtains the engine calculation result, sent by the service terminal, that contains intersection IDs, wherein each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information that intersects with it;
an intersection training set generation module, configured to generate, when federation is successful and through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
a federated learning module, configured to perform federated learning training on the intersection training set through the service terminal and the data support terminal to obtain a federated data processing model, wherein the federated data processing model is used, when federation is successful, to output a federated prediction result after receiving input ID information to be processed, the federated prediction result containing the support data corresponding to the second ID information that intersects with the ID information to be processed.
A data processing method, including:
when a data support terminal receives a data support request that is sent by a service terminal and contains ID information to be processed, detecting whether federation is currently in a successful state;
when federation is currently in a successful state, inputting the ID information to be processed into a preset federated data processing model and obtaining the federated prediction result output by the preset federated data processing model, the federated prediction result containing the support data corresponding to the second ID information that intersects with the ID information to be processed; wherein the preset federated data processing model is generated according to the data processing model generation method described above.
A data processing apparatus, including:
a federation state detection module, configured to detect whether federation is currently in a successful state when a data support terminal receives a data support request that is sent by a service terminal and contains ID information to be processed;
a federated prediction module, configured to, when federation is currently in a successful state, input the ID information to be processed into a preset federated data processing model and obtain the federated prediction result output by the preset federated data processing model, the federated prediction result containing the support data corresponding to the second ID information that intersects with the ID information to be processed; wherein the preset federated data processing model is generated according to the data processing model generation method described above.
A computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
obtaining, through a data support terminal, a model training request that is sent by a service terminal and contains a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining second ID information from the data partition through the data support terminal; wherein the service partition contains service data and the first ID information corresponding to it, and the data partition contains support data and the second ID information corresponding to it;
after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation result, sent by the service terminal, that contains intersection IDs, wherein each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information that intersects with it;
when federation is successful, generating, through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
performing federated learning training on the intersection training set through the service terminal and the data support terminal to obtain a federated data processing model, wherein the federated data processing model is used, when federation is successful, to output a federated prediction result after receiving input ID information to be processed, the federated prediction result containing the support data corresponding to the second ID information that intersects with the ID information to be processed.
A computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
when a data support terminal receives a data support request that is sent by a service terminal and contains ID information to be processed, detecting whether federation is currently in a successful state;
when federation is currently in a successful state, inputting the ID information to be processed into a preset federated data processing model and obtaining the federated prediction result output by the preset federated data processing model, the federated prediction result containing the support data corresponding to the second ID information that intersects with the ID information to be processed; wherein the preset federated data processing model is generated according to the data processing model generation method described above.
One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining, through a data support terminal, a model training request that is sent by a service terminal and contains a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining second ID information from the data partition through the data support terminal; wherein the service partition contains service data and the first ID information corresponding to it, and the data partition contains support data and the second ID information corresponding to it;
after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation result, sent by the service terminal, that contains intersection IDs, wherein each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information that intersects with it;
when federation is successful, generating, through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
performing federated learning training on the intersection training set through the service terminal and the data support terminal to obtain a federated data processing model, wherein the federated data processing model is used, when federation is successful, to output a federated prediction result after receiving input ID information to be processed, the federated prediction result containing the support data corresponding to the second ID information that intersects with the ID information to be processed.
One or more readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
when a data support terminal receives a data support request that is sent by a service terminal and contains ID information to be processed, detecting whether federation is currently in a successful state;
when federation is currently in a successful state, inputting the ID information to be processed into a preset federated data processing model and obtaining the federated prediction result output by the preset federated data processing model, the federated prediction result containing the support data corresponding to the second ID information that intersects with the ID information to be processed; wherein the preset federated data processing model is generated according to the data processing model generation method described above.
With the above data processing model generation method, apparatus, computer device, and storage medium, intersection processing is performed on the ID information in the service terminal and the data support terminal, and an intersection data set is generated from the service data and support data corresponding to the intersecting ID information. This makes it possible to find the ID information that both parties hold while keeping the data of the service terminal and the data support terminal secure, and to complete federated model training and federated prediction collaboratively without either party's ID information being leaked, so that the federated prediction results closely approach those of a model trained with both parties' data in the clear, thereby solving the data island problem. Using the intersection data set for federated learning training also improves data processing efficiency. Finally, using the trained federated data processing model for federated prediction improves the completeness of the whole system, and with the support of multi-party data the prediction results are more accurate.
With the above data processing method, apparatus, computer device, and storage medium, the federated data processing model is used for prediction when federation is currently in a successful state, which solves the data island problem and improves prediction accuracy while keeping data secure. When federation is currently in an interrupted state, the local data processing model is used for prediction; serving as a contingency plan, local prediction ensures that system tasks can still be completed, makes the system more comprehensive, and reduces the risk posed by communication failures or interruptions.
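The choice between federated and local prediction described above can be sketched as a simple dispatch. This is a hedged illustration, not the application's implementation: the function name `handle_data_support_request`, the boolean `federation_ok` flag, and the two stand-in "models" (plain Python callables) are all assumptions introduced for the example.

```python
from typing import Any, Callable, Dict, Optional

Model = Callable[[str], Any]


def handle_data_support_request(
    pending_id: str,
    federation_ok: bool,
    federated_model: Model,
    local_model: Model,
) -> Dict[str, Any]:
    """Use the federated model when federation is up, else fall back to local."""
    if federation_ok:
        return {"source": "federated", "result": federated_model(pending_id)}
    return {"source": "local", "result": local_model(pending_id)}


# Hypothetical stand-ins for trained models (assumption).
_support_data = {"u2": {"income": 5.0}}


def federated_model(pending_id: str) -> Optional[Dict[str, float]]:
    # Returns the support data matched to the ID, if any.
    return _support_data.get(pending_id)


def local_model(pending_id: str) -> Dict[str, float]:
    # Returns a fixed placeholder estimate without contacting the other party.
    return {"estimate": 0.5}


online = handle_data_support_request("u2", True, federated_model, local_model)
offline = handle_data_support_request("u2", False, federated_model, local_model)
```

Tagging each result with its source lets a caller tell a federated prediction from the contingency result.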
The details of one or more embodiments of this application are set forth in the drawings and description below; other features and advantages of this application will become apparent from the description, the drawings, and the claims.
Description of the drawings
To explain the technical solutions of the embodiments of this application more clearly, the drawings needed to describe the embodiments are briefly introduced below. The drawings described below are merely some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment of the data processing model generation method and the data processing method in an embodiment of this application;
FIG. 2 is a flowchart of the data processing model generation method in an embodiment of this application;
FIG. 3 is another flowchart of the data processing model generation method in an embodiment of this application;
FIG. 4 is a schematic block diagram of the data processing model generation apparatus in an embodiment of this application;
FIG. 5 is another schematic block diagram of the data processing model generation apparatus in an embodiment of this application;
FIG. 6 is a flowchart of the data processing method in an embodiment of this application;
FIG. 7 is a schematic block diagram of the data processing apparatus in an embodiment of this application;
FIG. 8 is a schematic diagram of the computer device in an embodiment of this application.
Detailed description
An embodiment of this application provides a data processing model generation method that can be applied in the application environment shown in FIG. 1. Specifically, the method is applied in a data processing model generation system that includes the client and the server shown in FIG. 1; the client and the server communicate over a network to address the data island problem. The client, also called the user side, is a program that corresponds to the server and provides local services to the user. The client can be installed on, but is not limited to, personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices. The server can be implemented as a standalone server or as a cluster of multiple servers. Further, the server includes the service terminal and the data support terminal.
In an embodiment, as shown in FIG. 2, a data processing model generation method is provided. Taking its application to the server in FIG. 1 as an example, the method includes the following steps:
S11: Obtain, through the data support terminal, a model training request that is sent by the service terminal and contains a service partition; determine the data partition in the data support terminal that corresponds to the service partition; and determine second ID information from the data partition through the data support terminal. The service partition contains service data and the first ID information corresponding to it, and the data partition contains support data and the second ID information corresponding to it.
The data support terminal is the terminal that receives service requests from the service terminal and provides data support and model computation; it can be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or computer cluster. The service terminal is the terminal that sends service requests to the data support terminal; it can be, but is not limited to, a personal computer, notebook computer, smartphone, or tablet computer. Model training is performed based on artificial intelligence technology, and the model training request is the request that the service terminal sends to the data support terminal to obtain the corresponding data support. The first ID information is the ID information corresponding to the service data in a service partition of the service terminal. The service data is the data held in the service terminal, and every piece of first ID information has corresponding service data. A data partition is a region obtained after the data support terminal computes on and classifies the ID information corresponding to the support data; the number of data partitions is determined by the number of binning machines in the data support terminal and the data volume of the second ID information. A service partition is a region obtained after the service terminal applies the same computation and classification to the ID information corresponding to the service data; each service partition may correspond to one data partition. The second ID information is the ID information corresponding to the support data in the data partition, within the data support terminal, that corresponds to the first ID information. The support data is the data held in the data support terminal, and every piece of second ID information has corresponding support data.
Specifically, the data support terminal obtains the model training request containing the service partition sent by the service terminal, the data partition in the data support terminal corresponding to the service partition is determined, and the data support terminal determines the second ID information according to the data partition. The service partition contains service data and the corresponding first ID information, and the data partition contains support data and the corresponding second ID information.
In a specific embodiment, before the data support terminal obtains the model training request containing the service partition sent by the service terminal, the method includes:
The data support terminal receives a preset rule sent by the service terminal, uses the preset rule to perform a unification operation on the ID information corresponding to the support data in the data support terminal, and then applies a uniform encryption algorithm to the unified ID information and its corresponding support data so that they are uniformly distributed, obtaining the data to be binned.
The data support terminal bins the data to be binned according to a preset binning strategy, obtaining the binning information and the data partitions corresponding to the data to be binned.
The data support terminal sends the binning information to the service terminal, and the service terminal performs a hash operation on the ID information corresponding to the service data according to the binning information, obtaining a bin number.
The data support terminal obtains a data support request sent by the service terminal. The data support request is generated by the service terminal from the bin number and the first ID information, after the service terminal has determined the first ID information from the ID information corresponding to the service data in the service partition.
Here, the preset rule is a rule sent by the service terminal to the data support terminal. In essence it unifies the data format of the ID information of the service terminal and of the data support terminal, so that the ID information on both sides has the same format. The unification operation is the operation by which the data support terminal brings the format of the ID information in its database into line with that of the service terminal. The uniform encryption algorithm ensures that the resulting data to be binned is irreversible (the original ID information cannot be traced back from it) and that it is evenly distributed; the uniform encryption algorithm may be a uniform hash algorithm. The data to be binned is the data in the database that is waiting to be binned. It may include ID identity information together with keyword data from the search records of that ID, with data on the pages visited by that ID, or with data on the application software downloaded by that ID. It may also include ID identity information together with static data for that ID, such as age, gender, or region of residence. A hash operation transforms an input of arbitrary length (also called the pre-image) into an output of fixed length by means of a hash algorithm.
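The unification, uniform hashing, and bin-number steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method fixed by the application: the normalization rule (trim and lowercase), the use of SHA-256 as the uniform hash, and the function names `normalize_id`, `bin_number`, and `partition` are all hypothetical.

```python
import hashlib


def normalize_id(raw_id: str) -> str:
    """Hypothetical 'preset rule': trim whitespace and lowercase the ID."""
    return raw_id.strip().lower()


def bin_number(raw_id: str, num_bins: int) -> int:
    """Map an ID to a bin with a uniform hash (SHA-256 is an assumption)."""
    digest = hashlib.sha256(normalize_id(raw_id).encode("utf-8")).digest()
    # The first 8 bytes of the digest are treated as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_bins


def partition(ids, num_bins):
    """Group IDs by bin number; both terminals would apply the same mapping."""
    bins = {b: [] for b in range(num_bins)}
    for raw_id in ids:
        bins[bin_number(raw_id, num_bins)].append(raw_id)
    return bins
```

Because both terminals apply the same rule and the same hash, an ID shared by both sides falls into the same bin on each side, so only matching partitions need to be compared.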
S12: After the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the data support terminal obtains the engine calculation result, sent by the service terminal, that contains the intersection IDs. Each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects.
Here, intersection processing is the processing that determines the ID information common to the first ID information and the second ID information. The engine calculation result is the result obtained after intersection processing of the first ID information and the second ID information. An intersection ID is ID information held by both the service terminal and the data support terminal, that is, a case in which the first ID information and the second ID information are the same.
Specifically, after the first ID information and the second ID information are determined, the data support terminal and the service terminal perform intersection processing on them to obtain the engine calculation result, which contains intersection IDs and non-intersection IDs. After the service terminal obtains the engine calculation result, the data support terminal obtains the engine calculation result containing the intersection IDs sent by the service terminal. Optionally, an RSA encryption method may be used for the intersection processing. A non-intersection ID is ID information that the service terminal holds but the data support terminal does not, that is, first ID information that matches no second ID information.
Further, a service partition in the service terminal contains at least one piece of first ID information, and a data partition in the data support terminal likewise contains at least one piece of second ID information. The engine calculation result may therefore contain multiple intersection IDs and multiple non-intersection IDs.
S13: When federation succeeds, the data support terminal and the service terminal generate an intersection training set from the first ID information whose engine calculation result is an intersection ID, together with the service data and the support data that both correspond to that intersection ID.
Here, the intersection training set is in essence a training data set. Its data is the ID information held by both the service terminal and the data support terminal, together with the service data and support data corresponding to that ID information. The service data remains stored in the service terminal and the support data remains stored in the data support terminal; the intersection training set is a data set generated jointly by the two terminals.
Specifically, after the data support terminal obtains the engine calculation result containing the intersection IDs from the service terminal, and when federation succeeds, the data support terminal and the service terminal generate the intersection training set from the first ID information whose engine calculation result is an intersection ID, together with the service data and support data corresponding to that first ID information.
Because an intersection ID is ID information held by both terminals, that is, the set of IDs for which the first ID information and the second ID information are the same, the description above refers only to the service data and support data corresponding to the first ID information.
S14: The service terminal and the data support terminal perform federated learning training on the intersection training set to obtain a federated data processing model. When federation succeeds, the federated data processing model receives input ID information to be processed and outputs a federated prediction result, which contains the support data corresponding to the second ID information that intersects with the ID information to be processed.
Here, federated learning training is a training method that builds a machine learning system without direct access to the training data. The federated data processing model is the model obtained by federated learning training on the intersection training set, and it is used in the subsequent prediction step. The ID information to be processed is the information waiting to be input to the model for federated prediction. It may be ID information corresponding to document download information from multiple platforms, to historical access information from multiple platforms, or to historical purchase records from multiple platforms. The federated prediction result is the result obtained by applying the federated data processing model to the ID information to be processed. It may be target classification information or recommendation information, and that information contains the service data and support data corresponding to the intersection ID.
Specifically, after the data support terminal and the service terminal generate the intersection training set from the first ID information whose engine calculation result is an intersection ID and the corresponding service data and support data, the two terminals perform federated learning training on the intersection training set. This keeps the service data and support data in the intersection training set from being disclosed while the federated data processing model is generated. The model can then be used to predict on ID information to be processed and obtain the federated prediction result. The federated prediction result contains the support data corresponding to the second ID information that intersects with the ID information to be processed, as well as the service data corresponding to IDs whose engine calculation result is an intersection ID.
Further, to better protect the privacy and security of the sample data, this embodiment also includes the following step: storing the federated prediction result in a blockchain.
It should be noted that the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is in essence a decentralized database: a chain of data blocks linked to one another by cryptographic methods. Each data block contains the information of one batch of network transactions, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
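As a rough illustration of how blocks are linked by cryptographic hashes, the sketch below stores prediction results in an in-memory hash chain. It assumes SHA-256 and a JSON block layout, and it leaves out distribution and consensus entirely; `make_block`, `append_result`, and `verify_chain` are hypothetical names, and the application does not specify any particular blockchain platform.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def make_block(prev_hash: str, payload: dict) -> dict:
    """Build a block whose hash covers the previous hash and the payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    block_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"prev": prev_hash, "payload": payload, "hash": block_hash}


def append_result(chain: list, payload: dict) -> None:
    """Append a prediction result as a new block linked to the chain tip."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(make_block(prev_hash, payload))


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check that each block points at its predecessor."""
    prev_hash = GENESIS_HASH
    for block in chain:
        expected = make_block(block["prev"], block["payload"])["hash"]
        if block["prev"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True
```

Changing the payload of any stored block changes its recomputed hash, so `verify_chain` returns `False`; this is the anti-counterfeiting property the paragraph above refers to.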
In this embodiment, intersection processing is performed on the ID information of the service terminal and the data support terminal, and the service data and support data corresponding to the intersection ID information are assembled into an intersection data set. The ID information held by both parties can thus be found while the data of both terminals stays secure, and the two parties cooperate to complete federated model training and federated prediction without leaking either party's ID information. The federated prediction results closely approach those of a model trained with both parties' data in the clear, which solves the data-island problem. Using the intersection data set for federated learning training also improves data processing efficiency. Finally, performing federated prediction with the trained federated data processing model makes the overall system more complete, and with multi-party data support the prediction results are more accurate.
In one embodiment, as shown in FIG. 3, step S20, in which the data support terminal is instructed to obtain the engine calculation result containing the intersection IDs sent by the service terminal after intersection processing of the first ID information and the second ID information, specifically includes the following steps:
S121: The data support terminal sends the encryption key to the service terminal and obtains the first encrypted information, which the service terminal produces by encrypting the first ID information with the encryption key and the first private key.
Here, the encryption key is a public key provided by the data support terminal to the service terminal, and the service terminal uses it to encrypt the first ID information. The first private key is a key with which the service terminal encrypts the first ID information, and only the service terminal holds it. The first encrypted information is the information obtained when the service terminal encrypts the first ID information with the encryption key and the first private key.
Specifically, after the second ID information of the data partition in the data support terminal corresponding to the first ID information has been determined, the data support terminal sends the encryption key to the service terminal. On receiving the encryption key, the service terminal encrypts the first ID information with the encryption key and the first private key to obtain the first encrypted information, and sends the first encrypted information to the data support terminal.
S122: The data support terminal encrypts the first encrypted information with the second private key to obtain the second encrypted information.
Here, the second private key is an encryption key held only by the data support terminal. The second encrypted information is the information obtained when the data support terminal encrypts the first encrypted information with the second private key.
Specifically, after the data support terminal obtains the first encrypted information that the service terminal produced by encrypting the first ID information with the encryption key and the first private key, the data support terminal encrypts the first encrypted information with the second private key to obtain the second encrypted information, which further improves data security.
S123: The data support terminal encrypts the second ID information with the encryption key and the second private key to obtain the third encrypted information.
Here, the third encrypted information is the information obtained when the data support terminal encrypts the second ID information with the encryption key and the second private key.
Specifically, after the second ID information of the data partition in the data support terminal corresponding to the first ID information has been determined, the data support terminal encrypts the second ID information with the encryption key and the second private key to obtain the third encrypted information.
Further, if the data support terminal encrypted the second ID information with the encryption key and the second private key only after each data support request from the service terminal, response time would increase greatly, a large amount of computing resources would be wasted, and the peak computing capacity required of the data support terminal would be higher.
Optionally, therefore, the data support terminal may encrypt the ID information corresponding to all support data in advance with the encryption key and the second private key. Once the data support terminal receives a data support request from the service terminal, it determines, among the data partitions, the partition corresponding to the first ID information in the service partition of the service terminal. In this way step S123 does not have to be performed when the service terminal sends its request, which reduces response time.
In addition, for the security of the service terminal's requests to the data support terminal, the data support terminal may update the encryption key after a certain period, which further protects the data.
Further, the order in which the service terminal generates the first encrypted information and the data support terminal generates the third encrypted information is not fixed. The first encrypted information may be obtained first, the third encrypted information may be obtained first, or the two steps may be performed at the same time.
S124: The data support terminal sends the second encrypted information and the third encrypted information to the service terminal, and obtains the first intermediate result, which the service terminal produces by performing the intersection engine calculation on the second encrypted information and the third encrypted information. The first intermediate result contains the intersection IDs.
Here, the first intermediate result is the result obtained when the service terminal performs the intersection engine calculation on the second encrypted information and the third encrypted information. The intersection engine calculation is in essence the computation that determines the ID information common to the service terminal and the data support terminal.
Specifically, after the data support terminal sends the second encrypted information and the third encrypted information to the service terminal, the service terminal performs the intersection engine calculation on them to obtain the first intermediate result and sends the first intermediate result, which contains the intersection IDs, to the data support terminal.
S125: The data support terminal decrypts the first intermediate result to obtain the second intermediate result and sends the second intermediate result to the service terminal. The data support terminal then obtains the engine calculation result containing the intersection IDs, which the service terminal produces by consolidating the second intermediate result.
Here, the decryption calculation is the calculation by which the data support terminal decrypts the first intermediate result. The second intermediate result is the result obtained after decrypting the first intermediate result.
Specifically, after the data support terminal obtains the first intermediate result, and because the first intermediate result contains the intersection IDs, the data support terminal decrypts the first intermediate result to obtain the second intermediate result and sends it to the service terminal. After that, the data support terminal obtains the engine calculation result that the service terminal produces by consolidating the second intermediate result.
In a specific embodiment, if the first intermediate result does not contain an intersection ID, then after obtaining the first intermediate result the data support terminal continues the intersection engine calculation on it to determine whether an intersection ID exists, obtaining a third intermediate result. If the third intermediate result does not contain an intersection ID, the data support terminal sends it to the service terminal, and the service terminal continues the intersection engine calculation until the intersection IDs are determined. If the third intermediate result contains an intersection ID, the data support terminal decrypts the third intermediate result and sends the decrypted third intermediate result to the service terminal.
In this embodiment, the data is encrypted with the encryption key, the first private key, and the second private key, so that the ID information common to the service terminal and the data support terminal is found while neither terminal's ID information is visible to the other. This further improves the security of both parties' data and protects the privacy of the service terminal's users.
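The text above says only that an RSA encryption method may be used and does not fix the arithmetic. One well-known construction with the same message pattern is RSA blind-signature private set intersection, sketched below as an assumption rather than as the application's exact procedure. In this sketch the public exponent `E` plays the role of the encryption key, the per-ID random blinding factor plays the role of the first private key, and the private exponent `D` plays the role of the second private key. The key size is a toy value chosen for readability, all function names are hypothetical, and the intermediate-result exchange of steps S124 and S125 is collapsed into a single comparison on the service side. Python 3.8 or later is assumed for `pow(x, -1, n)`.

```python
import hashlib
import secrets
from math import gcd

# Toy RSA key held by the data support terminal. Both factors are Mersenne
# primes; a modulus this small is for illustration only and is not secure.
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
E = 65537                              # public exponent ("encryption key")
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent ("second private key")


def hash_to_int(identifier: str) -> int:
    """Hash an ID into the RSA group."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N


def tag(value: int) -> str:
    """Second hash, so that only digests of signatures are ever compared."""
    return hashlib.sha256(str(value).encode("ascii")).hexdigest()


def service_blind(first_ids):
    """Service terminal: blind each hashed ID (first encrypted information)."""
    blinded, factors = [], []
    for identifier in first_ids:
        r = secrets.randbelow(N - 2) + 2
        while gcd(r, N) != 1:          # the factor must be invertible mod N
            r = secrets.randbelow(N - 2) + 2
        factors.append(r)
        blinded.append(hash_to_int(identifier) * pow(r, E, N) % N)
    return blinded, factors


def support_sign(blinded):
    """Data support terminal: sign blinded values (second encrypted information)."""
    return [pow(b, D, N) for b in blinded]


def support_encrypt_own(second_ids):
    """Data support terminal: tags of its own signed IDs (third encrypted information)."""
    return {tag(pow(hash_to_int(i), D, N)) for i in second_ids}


def service_intersect(first_ids, factors, second_enc, third_enc):
    """Service terminal: unblind, tag, and keep the IDs whose tags match."""
    shared = []
    for identifier, r, signed in zip(first_ids, factors, second_enc):
        unblinded = signed * pow(r, -1, N) % N   # equals hash(id)^D mod N
        if tag(unblinded) in third_enc:
            shared.append(identifier)
    return shared
```

The data support terminal sees only blinded values, and the service terminal sees only tags of signatures it cannot forge, so neither side learns the other's non-shared IDs. The output of `support_encrypt_own` does not depend on any request, which is why it can be computed ahead of time, as the optional precomputation described for step S123 suggests.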
In one embodiment, after step S12, that is, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the data processing model generation method further includes:
When federation succeeds, the service terminal generates a complement training set from the first ID information in the service partition whose engine calculation result is a non-intersection ID, together with the service data corresponding to that non-intersection ID.
Here, the complement training set is in essence a training data set. Its data comprises the first ID information in the service partition of the service terminal whose engine calculation result is a non-intersection ID, together with the service data corresponding to that first ID information. In other words, the first ID information in the complement training set has no intersection with the second ID information in the data partition of the data support terminal.
Specifically, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the engine calculation result contains both intersection IDs and non-intersection IDs. When federation succeeds, the service terminal therefore generates the complement training set from the first ID information in the service partition whose engine calculation result is a non-intersection ID and the service data corresponding to that first ID information.
In one embodiment, after step S12, that is, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the data processing model generation method further includes the following step:
The service terminal generates a local training set from all the first ID information in the service partition and all the service data in the service partition of the service terminal corresponding to each piece of first ID information.
Here, the local training set is in essence a training data set. Its data is all the first ID information in the service partition of the service terminal, together with all the service data in that partition corresponding to each piece of first ID information.
Specifically, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the service terminal generates the local training set from all the first ID information in the service partition and all the service data in the service partition corresponding to each piece of first ID information.
In this embodiment, once the intersection IDs and non-intersection IDs are obtained, the intersection training set is generated from the intersection IDs, the complement training set is generated from the non-intersection IDs, and the local training set is generated from all the ID information of the service terminal and the corresponding service data. This avoids the conventional practice of using only the intersection IDs and discarding the non-intersection IDs, in which case a separate machine learning platform would be needed to do additional work. Generating a training set for each of the intersection IDs, the non-intersection IDs, and the full ID information of the service terminal allows all the data to be used effectively and saves cost.
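The three training sets can be sketched with plain dictionaries keyed by ID. This is an illustration under assumptions: `service_rows` and `support_rows` are hypothetical stand-ins for the two terminals' data, and the two sides are shown in one process only for readability. In the scheme described above the support data never leaves the data support terminal, so the intersection set is kept here as two views aligned by ID rather than as joined rows.

```python
def build_training_sets(service_rows: dict, support_rows: dict) -> dict:
    """Split the service partition into intersection, complement and local sets.

    Both arguments map an ID to that side's feature record.
    """
    shared_ids = sorted(service_rows.keys() & support_rows.keys())
    service_only_ids = sorted(service_rows.keys() - support_rows.keys())
    return {
        # Two aligned views of the intersection training set, one per terminal.
        "intersection_service": [(i, service_rows[i]) for i in shared_ids],
        "intersection_support": [(i, support_rows[i]) for i in shared_ids],
        # Service data for IDs the data support terminal does not hold.
        "complement": [(i, service_rows[i]) for i in service_only_ids],
        # Every ID in the service partition, intersecting or not.
        "local": [(i, service_rows[i]) for i in sorted(service_rows)],
    }
```

The intersection and complement sets together cover exactly the IDs of the local set, so no service-side record is discarded.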
In one embodiment, the data processing model generation method further includes:
The service terminal performs local learning training on the complement training set to obtain a first local data processing model. The first local data processing model receives input ID information to be processed and outputs a first local prediction result, which contains the service data corresponding to the non-intersection ID; and/or
The service terminal performs local learning training on the local training set to obtain a second local data processing model. The second local data processing model receives input ID information to be processed and outputs a second local prediction result, which contains the service data corresponding to the first ID information in the service partition of the service terminal.
其中,本地学习训练为使用服务终端的服务数据进行训练的方法。第一本地数据处理模型和第二本地数据处理模型均是用于进行本地预测的模型。本地预测结果可以为目标分类信息、推荐信息等,该目标分类信息或者推荐信息包含非交集ID对应的服务数据或者服务终端的服务分区中的所有服务数据。Among them, the local learning training is a method of training using the service data of the service terminal. Both the first local data processing model and the second local data processing model are models for local prediction. The local prediction result may be target classification information, recommendation information, etc. The target classification information or recommendation information includes the service data corresponding to the non-intersection ID or all the service data in the service partition of the service terminal.
具体地,在生成补集训练集和本地训练集之后,通过服务终端根据补集训练集进行本地学习训练,得到第一本地数据处理模型;通过所述服务终端根据所述本地训练集进行本地学习训练,得到第二本地数据处理模型。其中,该第一本地数据处理模型和第二本地数据处理模型均可用于联邦成功时,接收输入的待处理ID之后,对待处理ID信息进行本地预测,以获取本地预测结果,此时的本地预测结果可以与上述实施例中生成的联邦预测结果进行融合,以提高准确率。进一步地,该第一本地数据处理模型和第二本地数据处理模型主要用于在联邦中断时,接收输入的待处理ID之后,对待处理ID信息进行本地预测,以获取本地预测结果。在联邦中断时,本地预测结果能够作为应急方案,弥补联邦中断时不能生成联邦预测结果的情况,使得***更加全面。Specifically, after the supplementary training set and the local training set are generated, the service terminal performs local learning training according to the supplementary training set to obtain the first local data processing model; the service terminal performs local learning according to the local training set Train to get the second local data processing model. Wherein, the first local data processing model and the second local data processing model can be used when the federation is successful, after receiving the input ID to be processed, the ID information to be processed is locally predicted to obtain the local prediction result. The local prediction at this time The result can be fused with the federated prediction result generated in the above embodiment to improve the accuracy. Further, the first local data processing model and the second local data processing model are mainly used to perform local prediction on the ID information to be processed after receiving the input ID to be processed when the federation is interrupted, so as to obtain the local prediction result. In the event of a federal interruption, the local prediction result can be used as an emergency plan to compensate for the failure to generate the federal prediction result during the federal interruption, making the system more comprehensive.
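The fusion of a local prediction result with a federated prediction result can be illustrated with a simple weighted average. The text does not fix a fusion rule at this point, so the rule, the function name `fuse_predictions`, and the default weight are all assumptions made for illustration.

```python
from typing import Optional


def fuse_predictions(
    federated: Optional[float],
    local: float,
    federated_weight: float = 0.5,
) -> float:
    """Combine a federated score and a local score into one prediction.

    federated is None when federation is interrupted; the local score is
    then returned unchanged, acting as the contingency result.
    """
    if not 0.0 <= federated_weight <= 1.0:
        raise ValueError("federated_weight must lie in [0, 1]")
    if federated is None:
        return local
    # Hypothetical fusion rule: convex combination of the two scores.
    return federated_weight * federated + (1.0 - federated_weight) * local
```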
Optionally, local learning training can support and use a variety of machine learning algorithms, for example LR, XGB, NB, or DNN. The task of local learning training may be a supervised regression problem, a supervised classification problem, an unsupervised machine learning problem, or the like.
In this embodiment, local learning training is performed on the complement training set and/or the local training set to obtain a local data processing model, and the local data processing model performs local prediction on the ID information to be processed to obtain a local prediction result. Training locally on the complement training set and the local training set makes it possible to obtain prediction results for all of the data, so the prediction results are more complete. It also avoids the need to add a separate machine learning platform for local training, which improves the completeness and flexibility of the system.
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
In an embodiment, a data processing model generation apparatus is provided, which corresponds one-to-one to the data processing model generation method in the above embodiment. As shown in FIG. 4, the data processing model generation apparatus includes an information determination module 11, an intersection processing module 12, an intersection training set generation module 13, and a federated learning module 14. The functional modules are described in detail as follows:
Information determination module 11, configured to obtain, through the data support terminal, a model training request that is sent by the service terminal and contains a service partition, determine the data partition in the data support terminal that corresponds to the service partition, and determine, through the data support terminal, the second ID information according to the data partition; where the service partition contains service data and the first ID information corresponding to it, and the data partition contains support data and the second ID information corresponding to it.
Intersection processing module 12, configured to, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, obtain through the data support terminal the engine calculation result that is sent by the service terminal and contains the intersection IDs, where each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects.
Intersection training set generation module 13, configured to, when federation succeeds, generate through the data support terminal and the service terminal an intersection training set from the first ID information whose engine calculation result is an intersection ID, together with the service data and the support data that both correspond to that first ID information.
Federated learning module 14, configured to perform federated learning training on the intersection training set through the service terminal and the data support terminal to obtain a federated data processing model, where the federated data processing model is used, when federation succeeds, to output a federated prediction result after receiving input ID information to be processed, and the federated prediction result contains the support data corresponding to the second ID information that intersects with the ID information to be processed.
Optionally, as shown in FIG. 5, the intersection processing module 12 further includes:
First encrypted information generation module 121, configured to send an encryption key to the service terminal through the data support terminal, and obtain through the data support terminal the first encrypted information that the service terminal produces by encrypting the first ID information with the encryption key and a first private key.
Second encrypted information generation module 122, configured to encrypt the first encrypted information with a second private key through the data support terminal to obtain second encrypted information.
Third encrypted information generation module 123, configured to encrypt the second ID information with the encryption key and the second private key through the data support terminal to obtain third encrypted information.
Intersection engine calculation module 124, configured to send the second encrypted information and the third encrypted information to the service terminal through the data support terminal, and obtain through the data support terminal the first intermediate result that the service terminal produces by performing intersection engine calculation on the second encrypted information and the third encrypted information, where the first intermediate result contains the intersection IDs.
Decryption calculation module 125, configured to perform decryption calculation on the first intermediate result through the data support terminal to obtain a second intermediate result, send the second intermediate result to the service terminal through the data support terminal, and obtain through the data support terminal the engine calculation result containing the intersection IDs that the service terminal produces by integrating the second intermediate result.
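The exchange of encrypted ID information between the two terminals resembles a commutative-encryption private set intersection. The sketch below is one possible reading under stated assumptions, not the cipher used in the application, which the text does not name: the shared "encryption key" is modeled as a public prime modulus `P`, each private key as a secret exponent, and encryption as modular exponentiation of a hashed ID. The decryption and integration steps of module 125 are omitted, and the parameters are toy values that are not suitable for production use.

```python
import hashlib
import math
import secrets
from typing import List

# Assumption: the shared "encryption key" is a public prime modulus.
P = 2**127 - 1  # a Mersenne prime, used here only for illustration


def hash_to_group(identifier: str) -> int:
    """Map an ID string to an integer in [1, P - 1]."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_private_key() -> int:
    """Draw a secret exponent coprime to P - 1, so blinding is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def blind(value: int, key: int) -> int:
    """Commutative 'encryption': raise value to the secret exponent mod P."""
    return pow(value, key, P)


def private_intersection(first_ids: List[str], second_ids: List[str]) -> List[str]:
    key_a = new_private_key()  # first private key (service terminal)
    key_b = new_private_key()  # second private key (data support terminal)

    # Service terminal: first encrypted information.
    first_encrypted = [blind(hash_to_group(i), key_a) for i in first_ids]
    # Data support terminal: second encrypted information (order preserved).
    second_encrypted = [blind(c, key_b) for c in first_encrypted]
    # Data support terminal: third encrypted information.
    third_encrypted = [blind(hash_to_group(i), key_b) for i in second_ids]

    # Service terminal: blind the third set with its own key, then compare.
    double_blinded = {blind(c, key_a) for c in third_encrypted}
    return [i for i, c in zip(first_ids, second_encrypted) if c in double_blinded]
```

Because exponentiation commutes, an ID held by both terminals produces the same doubly blinded value regardless of which key was applied first, while neither terminal sees the other's raw IDs.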
Optionally, the data processing model generation apparatus further includes:
Complement training set generation module, configured to, when federation succeeds, generate through the service terminal a complement training set from the first ID information in the service partition whose engine calculation result is a non-intersection ID, together with the service data corresponding to that first ID information.
Optionally, the data processing model generation apparatus further includes:
Local training set generation module, configured to generate through the service terminal a local training set from all of the first ID information in the service partition and all of the service data in the service partition of the service terminal corresponding to each piece of first ID information.
Optionally, the data processing model generation apparatus further includes:
Local learning module, configured to perform, through the service terminal, local learning training on the complement training set to obtain a first local data processing model, where the first local data processing model is used to output a first local prediction result after receiving input ID information to be processed, and the first local prediction result contains the service data corresponding to the non-intersection IDs; and/or
perform, through the service terminal, second local learning training on the local training set to obtain a second local data processing model, where the second local data processing model is used to output a second local prediction result after receiving input ID information to be processed, and the second local prediction result contains the service data corresponding to the first ID information in the service partition of the service terminal.
For the specific limitations of the data processing model generation apparatus, refer to the limitations of the data processing model generation method above; they are not repeated here. Each module in the data processing model generation apparatus may be implemented in whole or in part by software, hardware, or a combination of the two. The modules may be embedded in, or independent of, a processor in a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
An embodiment of the present application further provides a data processing method, which can be applied in the application environment shown in FIG. 1. Specifically, the data processing method is applied in a data processing system that includes the client and the server shown in FIG. 1. The client and the server communicate over a network, and the system is used to address the data silo problem. The client, also called the user end, is a program that corresponds to the server and provides local services to the user. The client can be installed on, but is not limited to, personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers. Further, the server includes a service terminal and a data support terminal.
In an embodiment, as shown in FIG. 6, a data processing method is provided. Taking the application of the method to the server in FIG. 1 as an example, it includes the following steps:
S21: When the data support terminal receives a data support request that is sent by the service terminal and contains ID information to be processed, detect whether the system is currently in the federation-success state.
Here, the data support request containing the ID information to be processed is a request by which the service terminal asks the data support terminal to perform prediction processing on the ID information to be processed. The ID information to be processed is the information that is waiting to be input into the model so that a federated prediction result can be obtained for it.
Specifically, after the data support terminal receives the data support request that is sent by the service terminal and contains the ID information to be processed, it checks the current federation state to detect whether the system is currently in the federation-success state.
S22: When the system is currently in the federation-success state, input the ID information to be processed into a preset federated data processing model and obtain the federated prediction result output by that model, where the federated prediction result contains the support data corresponding to the second ID information that intersects with the ID information to be processed, and the preset federated data processing model is generated according to the data processing model generation method in the above embodiment.
Specifically, after the data support terminal receives the data support request that is sent by the service terminal and contains the ID information to be processed, and detects that the system is currently in the federation-success state, the received ID information to be processed is input into the preset federated data processing model, which performs federated prediction on it and generates a federated prediction result. The federated prediction result contains the support data corresponding to the second ID information that intersects with the ID information to be processed.
Here, the preset federated data processing model is generated according to the data processing model generation method in the above embodiment.
In a specific implementation, the data processing method further includes:
inputting the ID information to be processed into a preset local data processing model, and obtaining the local prediction result output by the local data processing model.
Specifically, after the data support terminal receives the data support request that is sent by the service terminal and contains the ID information to be processed, the received ID information to be processed is input into the preset local data processing model, which performs local prediction on it and generates a local prediction result.
Here, the local data processing model is generated according to the data processing model generation method in the above embodiment, and may be either the first local data processing model or the second local data processing model. The local data processing model can be used both when federation succeeds and when federation is interrupted to output a local prediction result after receiving input ID information to be processed. When federation succeeds, performing local prediction on the ID information to be processed with the local data processing model makes the overall prediction result more complete. When federation is interrupted, the federated data processing model becomes unavailable, and the local data processing model serves as a contingency plan that increases the fault tolerance of the system. The federation-interrupted state may be a state in which communication is cut off or unstable during federated learning, or a state in which federated learning does not respond.
In a specific embodiment, after the federated prediction result or the local prediction result is obtained, the service data prediction method further includes:
Optionally, an evaluation model is set up in the service terminal to evaluate the result that the federated data processing model or the local data processing model predicts for the ID information to be processed. Optionally, the evaluation model may be obtained by fusing the federated prediction result or the local prediction result with the ID information to be processed. Fusion methods include, but are not limited to, a voting mechanism, a stacking training mechanism, reinforcement learning, and bandit methods.
Further, the evaluation model includes a PSI index, which is used to measure the stability and accuracy of the federated data processing model or the local data processing model. The PSI index has a threshold; if the PSI index exceeds the threshold, updating the data customer group of the service terminal and the data support terminal is considered, followed by retraining on the updated data customer group. Optionally, after the federated prediction result or the local prediction result is received, an accuracy analysis is performed on it, and according to the accuracy analysis result the evaluation model is updated or the evaluation weights in the evaluation model are adjusted.
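The threshold check on the PSI index can be sketched as follows. The text does not define the index, so this sketch assumes it refers to the population stability index commonly used in model monitoring (not private set intersection). The function names and the 0.25 threshold are illustrative assumptions; the threshold is a frequently quoted rule of thumb, not a value taken from the application.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """Compute PSI over matching bins of two count (or share) histograms.

    expected: bin counts from the population the model was trained on.
    actual:   bin counts from the population currently being scored.
    """
    if len(expected) != len(actual) or len(expected) == 0:
        raise ValueError("histograms must be non-empty and of equal length")
    expected_total = sum(expected)
    actual_total = sum(actual)
    if expected_total <= 0 or actual_total <= 0:
        raise ValueError("histograms must have positive totals")

    psi = 0.0
    for e, a in zip(expected, actual):
        # Clamp shares away from zero so the logarithm stays defined.
        e_share = max(e / expected_total, eps)
        a_share = max(a / actual_total, eps)
        psi += (a_share - e_share) * math.log(a_share / e_share)
    return psi


def needs_retraining(psi: float, threshold: float = 0.25) -> bool:
    """Hypothetical policy: retrain once PSI exceeds the threshold."""
    return psi > threshold
```

A PSI of zero means the two populations have identical bin shares; the value grows as the scored population drifts away from the training population.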
In this embodiment, when the current state is the federation-success state, the federated data processing model is used for prediction, which solves the data silo problem while keeping the data secure and also improves the accuracy of data prediction. When the current state is the federation-interrupted state, the local data processing model is used for prediction, so that local prediction serves as a contingency plan. This ensures that system tasks can still be completed, improves the completeness of the system, and reduces the risk caused by communication failure or interruption.
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
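The choice between the two models according to the federation state can be sketched as a small dispatcher. All names here (`FederationState`, `predict`, the model callables) are hypothetical, and treating a `ConnectionError` raised during a federated call as an interruption is an assumption made for illustration.

```python
from enum import Enum
from typing import Any, Callable, Dict

Model = Callable[[str], Any]


class FederationState(Enum):
    SUCCESS = "success"
    INTERRUPTED = "interrupted"


def predict(
    pending_id: str,
    state: FederationState,
    federated_model: Model,
    local_model: Model,
) -> Dict[str, Any]:
    """Route ID information to the federated model, falling back to local."""
    if state is FederationState.SUCCESS:
        try:
            return {"source": "federated", "result": federated_model(pending_id)}
        except ConnectionError:
            # Assumption: a communication failure during the call is
            # handled the same way as a detected interruption.
            pass
    # Contingency path: the local model answers without the other terminal.
    return {"source": "local", "result": local_model(pending_id)}
```

Tagging each result with its source lets a downstream evaluation step weigh federated and local results differently.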
In an embodiment, a data processing apparatus is provided, which corresponds one-to-one to the data processing method in the above embodiment. As shown in FIG. 7, the data processing apparatus includes a federation state detection module 21 and a federated prediction module 22. The functional modules are described in detail as follows:
Federation state detection module 21, configured to detect whether the system is currently in the federation-success state when the data support terminal receives a data support request that is sent by the service terminal and contains ID information to be processed;
Federated prediction module 22, configured to, when the system is currently in the federation-success state, input the ID information to be processed into a preset federated data processing model and obtain the federated prediction result output by that model, where the federated prediction result contains the support data corresponding to the second ID information that intersects with the ID information to be processed, and the preset federated data processing model is generated according to the data processing model generation method in the above embodiment.
For the specific limitations of the data processing apparatus, refer to the limitations of the data processing method above; they are not repeated here. Each module in the data processing apparatus may be implemented in whole or in part by software, hardware, or a combination of the two. The modules may be embedded in, or independent of, a processor in a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 8. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a readable storage medium and an internal memory. The readable storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the readable storage medium. The database of the computer device stores the data used in the data processing model generation method and the data processing method described above. The network interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement a data processing model generation method or a data processing method. The readable storage medium provided in this embodiment includes non-volatile and volatile readable storage media.
In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When executing the computer-readable instructions, the processor implements the following steps:
obtaining, through a data support terminal, a model training request that is sent by a service terminal and contains a service partition, determining the data partition in the data support terminal that corresponds to the service partition, and determining, through the data support terminal, second ID information according to the data partition; where the service partition contains service data and the first ID information corresponding to it, and the data partition contains support data and the second ID information corresponding to it;
after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, obtaining through the data support terminal the engine calculation result that is sent by the service terminal and contains the intersection IDs, where each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
when federation succeeds, generating, through the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID, together with the service data and the support data that both correspond to the intersection ID;
performing federated learning training on the intersection training set through the service terminal and the data support terminal to obtain a federated data processing model, where the federated data processing model is used, when federation succeeds, to output a federated prediction result after receiving input ID information to be processed, and the federated prediction result contains the support data corresponding to the second ID information that intersects with the ID information to be processed.
In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When executing the computer-readable instructions, the processor implements the following steps:
when the data support terminal receives a data support request that is sent by the service terminal and contains ID information to be processed, detecting whether the system is currently in the federation-success state;
when the system is currently in the federation-success state, inputting the ID information to be processed into a preset federated data processing model and obtaining the federated prediction result output by the preset federated data processing model, where the federated prediction result contains the support data corresponding to the second ID information that intersects with the ID information to be processed, and the preset federated data processing model is generated according to the data processing model generation method described above.
In one embodiment, one or more readable storage media storing computer-readable instructions are provided. The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media. The computer-readable instructions stored on the readable storage media, when executed by one or more processors, cause the one or more processors to implement the following steps:
obtaining, via a data support terminal, a model training request that is sent by a service terminal and contains a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining, via the data support terminal, second ID information from the data partition; wherein the service partition contains service data and first ID information corresponding thereto, and the data partition contains support data and second ID information corresponding thereto;
after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, obtaining, via the data support terminal, an engine calculation result that is sent by the service terminal and contains an intersection ID, wherein each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
when federation is successful, generating, via the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
performing, via the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model, wherein the federated data processing model is configured to, when federation is successful, receive input to-be-processed ID information and then output a federated prediction result, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information.
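The training-set step above can be illustrated with a minimal Python sketch. It is not the implementation described in this application: the partitions are modeled as hypothetical in-memory dictionaries keyed by ID, the function name `build_intersection_training_set` is invented for illustration, and a plain set intersection stands in for the encrypted intersection processing performed between the two terminals. In a real deployment each terminal would presumably keep its own columns; the sketch joins them in one process only to show which rows qualify.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory partitions: ID -> list of feature values.
ServicePartition = Dict[str, List[float]]
DataPartition = Dict[str, List[float]]


def build_intersection_training_set(
    service_partition: ServicePartition,
    data_partition: DataPartition,
) -> Tuple[List[str], List[List[float]]]:
    """Join service data and support data on the IDs present in both partitions."""
    # Assumption: a plain set intersection replaces the encrypted
    # intersection processing between the two terminals.
    intersection_ids = sorted(service_partition.keys() & data_partition.keys())
    # One training row per intersection ID: service data followed by support data.
    rows = [service_partition[i] + data_partition[i] for i in intersection_ids]
    return intersection_ids, rows


# Toy data: first ID information with service data, second ID information with support data.
service = {"u1": [0.2, 1.0], "u2": [0.5, 0.0], "u3": [0.9, 1.0]}
support = {"u2": [7.0], "u3": [3.0], "u4": [1.0]}

ids, rows = build_intersection_training_set(service, support)
print(ids)
print(rows)
```

Only `u2` and `u3` appear in both partitions, so only those two IDs contribute rows; `u1` would fall into the non-intersection part of the service partition.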
In one embodiment, one or more readable storage media storing computer-readable instructions are provided. The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media. The computer-readable instructions stored on the readable storage media, when executed by one or more processors, cause the one or more processors to implement the following steps:
when a data support terminal receives a data support request that is sent by a service terminal and contains to-be-processed ID information, detecting whether the federation-success state currently holds;
when the federation-success state currently holds, inputting the to-be-processed ID information into a preset federated data processing model and obtaining the federated prediction result output by the preset federated data processing model, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information; wherein the preset federated data processing model is generated by the data processing model generation method described above.
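The two steps above amount to a state check followed by a model call. The following Python sketch shows that control flow under stated assumptions: the function name `handle_data_support_request`, the boolean `federation_ok` flag, and the dictionary lookup standing in for the trained federated model are all hypothetical, and returning an empty result when federation is not successful is a choice made for the sketch, not behavior stated in these steps.

```python
from typing import Callable, Dict, List, Optional


def handle_data_support_request(
    pending_ids: List[str],
    federation_ok: bool,
    federated_model: Callable[[str], Optional[List[float]]],
) -> Dict[str, List[float]]:
    """Return support data for the pending IDs, only while federation is successful."""
    if not federation_ok:
        # Assumption: no federated prediction is produced outside the success state.
        return {}
    results: Dict[str, List[float]] = {}
    for pending_id in pending_ids:
        support_data = federated_model(pending_id)
        if support_data is not None:
            # Only IDs that intersect with second ID information yield support data.
            results[pending_id] = support_data
    return results


# A dictionary lookup stands in for the trained federated data processing model.
support_by_id = {"u2": [7.0], "u3": [3.0]}
model = support_by_id.get

print(handle_data_support_request(["u2", "u9"], True, model))
print(handle_data_support_request(["u2"], False, model))
```

Passing the model as a callable keeps the state check separate from the prediction logic, so the same gate could wrap whatever model the training procedure actually produces.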
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by computer-readable instructions that direct the relevant hardware. The computer-readable instructions may be stored in a non-volatile or a volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is given only as an example. In practical applications, the functions described above may be assigned to different functional units or modules as required; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above.
The embodiments described above are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may still be modified, or some of their technical features may be replaced with equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the scope of protection of this application.

Claims (20)

  1. A data processing model generation method, comprising:
    obtaining, via a data support terminal, a model training request that is sent by a service terminal and contains a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining, via the data support terminal, second ID information from the data partition; wherein the service partition contains service data and first ID information corresponding thereto, and the data partition contains support data and second ID information corresponding thereto;
    after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, obtaining, via the data support terminal, an engine calculation result that is sent by the service terminal and contains an intersection ID, wherein each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
    when federation is successful, generating, via the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
    performing, via the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model, wherein the federated data processing model is configured to, when federation is successful, receive input to-be-processed ID information and then output a federated prediction result, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information.
  2. The data processing model generation method according to claim 1, wherein, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation result that is sent by the service terminal and contains the intersection ID comprises:
    sending, via the data support terminal, an encryption key to the service terminal, and obtaining, via the data support terminal, first encrypted information obtained after the service terminal encrypts the first ID information using the encryption key and a first private key;
    encrypting, via the data support terminal, the first encrypted information using a second private key to obtain second encrypted information;
    encrypting, via the data support terminal, the second ID information using the encryption key and the second private key to obtain third encrypted information;
    sending, via the data support terminal, the second encrypted information and the third encrypted information to the service terminal, and obtaining, via the data support terminal, a first intermediate result obtained after the service terminal performs intersection engine calculation on the second encrypted information and the third encrypted information, the first intermediate result containing the intersection ID;
    performing, via the data support terminal, decryption calculation on the first intermediate result to obtain a second intermediate result; sending, via the data support terminal, the second intermediate result to the service terminal; and obtaining, via the data support terminal, the engine calculation result containing the intersection ID that is obtained after the service terminal integrates the second intermediate result.
  3. The data processing model generation method according to claim 1, wherein the engine calculation result further contains a non-intersection ID, and, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the data processing model generation method further comprises:
    when federation is successful, generating, via the service terminal, a complement training set from the first ID information in the service partition whose engine calculation result is a non-intersection ID and from the service data corresponding to the non-intersection ID.
  4. The data processing model generation method according to claim 3, wherein, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the data processing model generation method further comprises:
    generating, via the service terminal, a local training set from all of the first ID information in the service partition and all of the service data in the service partition of the service terminal that corresponds to each piece of the first ID information.
  5. The data processing model generation method according to claim 4, wherein the data processing model generation method further comprises:
    performing, via the service terminal, local learning training on the complement training set to obtain a first local data processing model, the first local data processing model being configured to output a first local prediction result after receiving input to-be-processed ID information, the first local prediction result containing the service data corresponding to the non-intersection ID; and/or
    performing, via the service terminal, local learning training on the local training set to obtain a second local data processing model, the second local data processing model being configured to output a second local prediction result after receiving input to-be-processed ID information, the second local prediction result containing the service data corresponding to the first ID information in the service partition of the service terminal.
  6. A data processing method, comprising:
    when a data support terminal receives a data support request that is sent by a service terminal and contains to-be-processed ID information, detecting whether the federation-success state currently holds;
    when the federation-success state currently holds, inputting the to-be-processed ID information into a preset federated data processing model and obtaining the federated prediction result output by the preset federated data processing model, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information; wherein the preset federated data processing model is generated by the data processing model generation method according to any one of claims 1 to 5.
  7. A data processing model generation apparatus, comprising:
    an information determination module, configured to obtain, via a data support terminal, a model training request that is sent by a service terminal and contains a service partition, determine the data partition in the data support terminal that corresponds to the service partition, and determine, via the data support terminal, second ID information from the data partition; wherein the service partition contains service data and first ID information corresponding thereto, and the data partition contains support data and second ID information corresponding thereto;
    a first engine calculation module, configured to, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, obtain, via the data support terminal, an engine calculation result that is sent by the service terminal and contains an intersection ID, wherein each piece of the first ID information whose engine calculation result is an intersection ID corresponds to one piece of the second ID information with which it intersects;
    an intersection training set generation module, configured to, when federation is successful, generate, via the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
    a federated learning module, configured to perform, via the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model, wherein the federated data processing model is configured to, when federation is successful, receive input to-be-processed ID information and then output a federated prediction result, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information.
  8. A data processing apparatus, comprising:
    a federation state detection module, configured to detect, when a data support terminal receives a data support request that is sent by a service terminal and contains to-be-processed ID information, whether the federation-success state currently holds;
    a federated prediction module, configured to, when the federation-success state currently holds, input the to-be-processed ID information into a preset federated data processing model and obtain the federated prediction result output by the preset federated data processing model, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information; wherein the preset federated data processing model is generated by the data processing model generation method according to any one of claims 1 to 5.
  9. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
    obtaining, via a data support terminal, a model training request that is sent by a service terminal and contains a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining, via the data support terminal, second ID information from the data partition; wherein the service partition contains service data and first ID information corresponding thereto, and the data partition contains support data and second ID information corresponding thereto;
    after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, obtaining, via the data support terminal, an engine calculation result that is sent by the service terminal and contains an intersection ID, wherein each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
    when federation is successful, generating, via the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
    performing, via the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model, wherein the federated data processing model is configured to, when federation is successful, receive input to-be-processed ID information and then output a federated prediction result, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information.
  10. The computer device according to claim 9, wherein, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation result that is sent by the service terminal and contains the intersection ID comprises:
    sending, via the data support terminal, an encryption key to the service terminal, and obtaining, via the data support terminal, first encrypted information obtained after the service terminal encrypts the first ID information using the encryption key and a first private key;
    encrypting, via the data support terminal, the first encrypted information using a second private key to obtain second encrypted information;
    encrypting, via the data support terminal, the second ID information using the encryption key and the second private key to obtain third encrypted information;
    sending, via the data support terminal, the second encrypted information and the third encrypted information to the service terminal, and obtaining, via the data support terminal, a first intermediate result obtained after the service terminal performs intersection engine calculation on the second encrypted information and the third encrypted information, the first intermediate result containing the intersection ID;
    performing, via the data support terminal, decryption calculation on the first intermediate result to obtain a second intermediate result; sending, via the data support terminal, the second intermediate result to the service terminal; and obtaining, via the data support terminal, the engine calculation result containing the intersection ID that is obtained after the service terminal integrates the second intermediate result.
  11. The computer device according to claim 9, wherein the engine calculation result further contains a non-intersection ID, and, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the processor, when executing the computer-readable instructions, further implements the following step:
    when federation is successful, generating, via the service terminal, a complement training set from the first ID information in the service partition whose engine calculation result is a non-intersection ID and from the service data corresponding to the non-intersection ID.
  12. The computer device according to claim 11, wherein, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the processor, when executing the computer-readable instructions, further implements the following step:
    generating, via the service terminal, a local training set from all of the first ID information in the service partition and all of the service data in the service partition of the service terminal that corresponds to each piece of the first ID information.
  13. The computer device according to claim 12, wherein the processor, when executing the computer-readable instructions, further implements the following steps:
    performing, via the service terminal, local learning training on the complement training set to obtain a first local data processing model, the first local data processing model being configured to output a first local prediction result after receiving input to-be-processed ID information, the first local prediction result containing the service data corresponding to the non-intersection ID; and/or
    performing, via the service terminal, local learning training on the local training set to obtain a second local data processing model, the second local data processing model being configured to output a second local prediction result after receiving input to-be-processed ID information, the second local prediction result containing the service data corresponding to the first ID information in the service partition of the service terminal.
  14. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
    when a data support terminal receives a data support request that is sent by a service terminal and contains to-be-processed ID information, detecting whether the federation-success state currently holds;
    when the federation-success state currently holds, inputting the to-be-processed ID information into a preset federated data processing model and obtaining the federated prediction result output by the preset federated data processing model, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information; wherein the preset federated data processing model is generated by the data processing model generation method according to any one of claims 1 to 5.
  15. One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
    obtaining, via a data support terminal, a model training request that is sent by a service terminal and contains a service partition; determining the data partition in the data support terminal that corresponds to the service partition; and determining, via the data support terminal, second ID information from the data partition; wherein the service partition contains service data and first ID information corresponding thereto, and the data partition contains support data and second ID information corresponding thereto;
    after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, obtaining, via the data support terminal, an engine calculation result that is sent by the service terminal and contains an intersection ID, wherein each piece of first ID information whose engine calculation result is an intersection ID corresponds to one piece of second ID information with which it intersects;
    when federation is successful, generating, via the data support terminal and the service terminal, an intersection training set from the first ID information whose engine calculation result is the intersection ID and from the service data and the support data that both correspond to the intersection ID;
    performing, via the service terminal and the data support terminal, federated learning training on the intersection training set to obtain a federated data processing model, wherein the federated data processing model is configured to, when federation is successful, receive input to-be-processed ID information and then output a federated prediction result, the federated prediction result containing the support data corresponding to the second ID information that intersects with the to-be-processed ID information.
  16. The readable storage medium according to claim 15, wherein, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, instructing the data support terminal to obtain the engine calculation result that is sent by the service terminal and contains the intersection ID comprises:
    sending, via the data support terminal, an encryption key to the service terminal, and obtaining, via the data support terminal, first encrypted information obtained after the service terminal encrypts the first ID information using the encryption key and a first private key;
    encrypting, via the data support terminal, the first encrypted information using a second private key to obtain second encrypted information;
    encrypting, via the data support terminal, the second ID information using the encryption key and the second private key to obtain third encrypted information;
    sending, via the data support terminal, the second encrypted information and the third encrypted information to the service terminal, and obtaining, via the data support terminal, a first intermediate result obtained after the service terminal performs intersection engine calculation on the second encrypted information and the third encrypted information, the first intermediate result containing the intersection ID;
    performing, via the data support terminal, decryption calculation on the first intermediate result to obtain a second intermediate result; sending, via the data support terminal, the second intermediate result to the service terminal; and obtaining, via the data support terminal, the engine calculation result containing the intersection ID that is obtained after the service terminal integrates the second intermediate result.
  17. The readable storage medium according to claim 15, wherein the engine calculation result further contains a non-intersection ID, and wherein, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following step:
    when federation succeeds, generating, by the service terminal, a complement training set from the first ID information in the service partition whose engine calculation result is a non-intersection ID, together with the service data corresponding to the non-intersection ID.
  18. The readable storage medium according to claim 17, wherein, after the data support terminal and the service terminal perform intersection processing on the first ID information and the second ID information, the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following step:
    generating, by the service terminal, a local training set from all of the first ID information in the service partition and all of the service data in the service partition of the service terminal that corresponds to each item of the first ID information.
  19. The readable storage medium according to claim 18, wherein the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    performing, by the service terminal, local learning and training on the complement training set to obtain a first local data processing model, the first local data processing model being used to output a first local prediction result after receiving input ID information to be processed, the first local prediction result containing the service data corresponding to the non-intersection ID; and/or
    performing, by the service terminal, local learning and training on the local training set to obtain a second local data processing model, the second local data processing model being used to output a second local prediction result after receiving input ID information to be processed, the second local prediction result containing the service data corresponding to the first ID information in the service partition of the service terminal.
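The two training sets described in claims 17 to 19 can be sketched as follows. This is only an illustration: it assumes the service partition is a mapping from first ID information to service data and that the engine calculation result is available as a set of intersection IDs. The function name `build_training_sets` and its parameters are hypothetical, and the learning step itself is omitted because the claims do not fix a model type.

```python
def build_training_sets(service_partition, intersection_ids, federation_succeeded):
    """Return (complement_set, local_set) as lists of (id, service_data) pairs.

    service_partition: dict mapping first ID information to service data.
    intersection_ids: set of IDs the engine calculation marked as intersecting.
    federation_succeeded: whether federation with the data support terminal succeeded.
    """
    # Local training set: every ID in the service partition with its service data.
    local_set = list(service_partition.items())

    # Complement training set: only built when federation succeeded, and only
    # from IDs that the engine calculation marked as non-intersecting.
    complement_set = []
    if federation_succeeded:
        complement_set = [
            (id_, data)
            for id_, data in service_partition.items()
            if id_ not in intersection_ids
        ]
    return complement_set, local_set
```

The first local data processing model would then be trained on `complement_set` and the second on `local_set`, each mapping an input ID to the corresponding service data.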
  20. One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
    when a data support terminal receives a data support request, sent by a service terminal, that contains ID information to be processed, detecting whether federation is currently in a successful state;
    when federation is currently in a successful state, inputting the ID information to be processed into a preset federated data processing model and obtaining a federated prediction result output by the preset federated data processing model, the federated prediction result containing support data corresponding to second ID information that intersects with the ID information to be processed; wherein the preset federated data processing model is generated by the data processing model generation method according to any one of claims 1 to 5.
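The request handling in claim 20 can be sketched as a simple dispatch on the federation state. The sketch assumes a model is any callable that takes an ID and returns a prediction dictionary. The name `handle_support_request` is hypothetical, and so is the optional `fallback_model` branch: the claim only states what happens when federation has succeeded.

```python
from typing import Callable, Optional

# Assumption: a "model" is any callable from an ID string to a prediction dict.
Model = Callable[[str], dict]


def handle_support_request(
    pending_id: str,
    federation_succeeded: bool,
    federated_model: Model,
    fallback_model: Optional[Model] = None,
) -> Optional[dict]:
    """Route a data support request according to the federation state."""
    if federation_succeeded:
        # Claimed path: the federated model returns the support data for the
        # second ID information that intersects with the pending ID.
        return federated_model(pending_id)
    if fallback_model is not None:
        # Hypothetical path, not part of claim 20.
        return fallback_model(pending_id)
    return None
```

With a stub such as `lambda i: {"id": i, "support_data": "..."}` as the federated model, a call with `federation_succeeded=True` returns that stub's output, and a call with `federation_succeeded=False` and no fallback returns `None`.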
PCT/CN2020/135350 2020-04-29 2020-12-10 Data processing model generation method and apparatus and data processing method and apparatus WO2021218167A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010356458.6A CN111666576B (en) 2020-04-29 2020-04-29 Data processing model generation method and device, and data processing method and device
CN202010356458.6 2020-04-29

Publications (1)

Publication Number Publication Date
WO2021218167A1 WO2021218167A1 (en) 2021-11-04

Family

ID=72383037

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/135350 WO2021218167A1 (en) 2020-04-29 2020-12-10 Data processing model generation method and apparatus and data processing method and apparatus

Country Status (2)

Country Link
CN (1) CN111666576B (en)
WO (1) WO2021218167A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114564742A (en) * 2022-02-18 2022-05-31 北京交通大学 Lightweight federated recommendation method based on Hash learning
CN116582341A (en) * 2023-05-30 2023-08-11 连连银通电子支付有限公司 Abnormality detection method, abnormality detection device, abnormality detection apparatus, and storage medium
CN116582341B (en) * 2023-05-30 2024-06-04 连连银通电子支付有限公司 Abnormality detection method, abnormality detection device, abnormality detection apparatus, and storage medium

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111666576B (en) * 2020-04-29 2023-08-04 平安科技(深圳)有限公司 Data processing model generation method and device, and data processing method and device
CN111666493B (en) * 2020-04-30 2024-05-03 平安科技(深圳)有限公司 Page data generation method, device, computer equipment and storage medium
CN112132292B (en) * 2020-09-16 2024-05-14 建信金融科技有限责任公司 Longitudinal federation learning data processing method, device and system based on block chain
CN112184474B (en) * 2020-09-25 2023-01-17 江苏中利集团股份有限公司 Intelligent material manufacturing method and device based on block chain
CN112231768B (en) * 2020-10-27 2021-06-18 腾讯科技(深圳)有限公司 Data processing method and device, computer equipment and storage medium
CN113781082B (en) * 2020-11-18 2023-04-07 京东城市(北京)数字科技有限公司 Method and device for correcting regional portrait, electronic equipment and readable storage medium
CN112232528B (en) * 2020-12-15 2021-03-09 之江实验室 Method and device for training federated learning model and federated learning system
CN113127916B (en) * 2021-05-18 2023-07-28 腾讯科技(深圳)有限公司 Data set processing method, data processing method, device and storage medium
CN113190871B (en) * 2021-05-28 2023-10-31 脸萌有限公司 Data protection method and device, readable medium and electronic equipment
CN113807415A (en) * 2021-08-30 2021-12-17 中国再保险(集团)股份有限公司 Federal feature selection method and device, computer equipment and storage medium
CN116167057B (en) * 2023-04-19 2023-07-28 国网江苏省电力有限公司信息通信分公司 Code dynamic safe loading method and device based on key code semantic detection
CN116579020B (en) * 2023-07-04 2024-04-05 深圳前海环融联易信息科技服务有限公司 Campus risk prediction method, device, equipment and medium based on privacy protection

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180204111A1 (en) * 2013-02-28 2018-07-19 Z Advanced Computing, Inc. System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform
CN110309587A (en) * 2019-06-28 2019-10-08 京东城市(北京)数字科技有限公司 Decision model construction method, decision-making technique and decision model
CN110797124A (en) * 2019-10-30 2020-02-14 腾讯科技(深圳)有限公司 Model multi-terminal collaborative training method, medical risk prediction method and device
CN111666576A (en) * 2020-04-29 2020-09-15 平安科技(深圳)有限公司 Data processing model generation method and device and data processing method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165683B (en) * 2018-08-10 2023-09-12 深圳前海微众银行股份有限公司 Sample prediction method, device and storage medium based on federal training
CN109492420B (en) * 2018-12-28 2021-07-20 深圳前海微众银行股份有限公司 Model parameter training method, terminal, system and medium based on federal learning
CN110263908B (en) * 2019-06-20 2024-04-02 深圳前海微众银行股份有限公司 Federal learning model training method, apparatus, system and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114564742A (en) * 2022-02-18 2022-05-31 北京交通大学 Lightweight federated recommendation method based on Hash learning
CN114564742B (en) * 2022-02-18 2024-05-14 北京交通大学 Hash learning-based lightweight federal recommendation method
CN116582341A (en) * 2023-05-30 2023-08-11 连连银通电子支付有限公司 Abnormality detection method, abnormality detection device, abnormality detection apparatus, and storage medium
CN116582341B (en) * 2023-05-30 2024-06-04 连连银通电子支付有限公司 Abnormality detection method, abnormality detection device, abnormality detection apparatus, and storage medium

Also Published As

Publication number Publication date
CN111666576B (en) 2023-08-04
CN111666576A (en) 2020-09-15

Similar Documents

Publication Publication Date Title
WO2021218167A1 (en) Data processing model generation method and apparatus and data processing method and apparatus
US11153072B2 (en) Processing blockchain data based on smart contract operations executed in a trusted execution environment
US10860710B2 (en) Processing and storing blockchain data under a trusted execution environment
US11650955B2 (en) Systems and methods for distributed data storage and delivery using blockchain
US10917249B2 (en) Processing data elements stored in blockchain networks
CN107948152B (en) Information storage method, information acquisition method, information storage device, information acquisition device and information acquisition equipment
US11546348B2 (en) Data service system
Reen et al. Decentralized patient centric e-health record management system using blockchain and IPFS
CA2627936A1 (en) Data matching using data clusters
WO2021208701A1 (en) Method, apparatus, electronic device, and storage medium for generating annotation for code change
WO2023134055A1 (en) Privacy-based federated inference method and apparatus, device, and storage medium
CN111709860A (en) Homote advice processing method, device, equipment and storage medium
CN111756684B (en) Method, system and non-transitory computer-readable storage medium for transmitting critical data
CN114417364A (en) Data encryption method, federal modeling method, apparatus and computer device
KR20220092811A (en) Method and device for storing encrypted data
CN112506481A (en) Service data interaction method and device, computer equipment and storage medium
WO2021218177A1 (en) Page data generation method and apparatus, computer device, and storage medium
US20230147654A1 (en) Method and system for providing privacy-preserving data analysis
TW202119229A (en) Data management method and system capable of safely accessing and deleting data wherein operations are performed by using a management server
US11562352B1 (en) Data storage and management and methods of thereof
US20230246847A1 (en) Methods and devices for storing information in a distributed ledger database
JP7011513B2 (en) Information processing equipment, systems, information processing methods and programs
CN117151856A (en) Resource borrowing service handling method, device, computer equipment and storage medium
CN114254311A (en) System and method for anonymously collecting data related to malware from a client device
CN117813797A (en) Matching cryptographic computing resources to predicted requirements for decrypting encrypted communications

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20933970

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20933970

Country of ref document: EP

Kind code of ref document: A1