CN114547658B - Data processing method, device, equipment and computer readable storage medium


Info

Publication number
CN114547658B
Authority
CN
China
Prior art keywords
model
data
feature
trained
preset
Prior art date
Legal status: Active
Application number
CN202210198357.XA
Other languages
Chinese (zh)
Other versions
CN114547658A (en)
Inventor
何元钦
康焱
骆家焕
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202210198357.XA
Publication of CN114547658A
Application granted
Publication of CN114547658B

Classifications

    • G06F 21/602: Providing cryptographic facilities or services
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06F 18/24: Classification techniques
    • G06F 21/6245: Protecting personal data, e.g. for financial or medical purposes
    • G06N 3/084: Backpropagation, e.g. using gradient descent

Abstract

The application provides a data processing method, apparatus, device, and computer readable storage medium, applied to a first participant device. The method includes the following steps: acquiring first feature data, second feature data, and an encrypted feature obtained by a second participant device by encrypting third feature data, where the second feature data and the third feature data are data of different features of the same users; training based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model; sending the trained first model to a target device for aggregation into a global model; and acquiring attribute information of an object to be recommended and processing the attribute information with the global model sent by the target device to obtain a recommendation result. Through this manner of model pre-training, the model effect of joint modeling can be improved, the recommendation result is determined from extracted features with high discrimination, and the recommendation success rate can be improved.

Description

Data processing method, device, equipment and computer readable storage medium
Technical Field
The present application relates to artificial intelligence technology, and in particular, to a data processing method, apparatus, electronic device, computer readable storage medium, and computer program product.
Background
As industries increasingly strengthen data privacy protection, federated learning, a technology that enables multiple parties to jointly build machine learning models while protecting data privacy, has become one of the keys to cooperation between enterprises and industries. Vertical federated learning applies when the participants' data features overlap little but their users overlap heavily: the shared users, together with their differing data features, are taken out for joint modeling, and by improving model performance it enables the participants to provide better services to their customers.
In the vertical federated scenario, the participants perform supervised joint modeling only on the labeled users they have in common. In practice, however, label data is difficult to obtain, so both sides hold only a small amount of labeled data. In addition, because the parties' business scenarios differ, the number of overlapping users may also be small, leaving little data that can ultimately be used for joint modeling and degrading the model effect of the joint modeling. For example, in cross-platform recommendation, company A and company B may have highly overlapping user groups but different products, so that from the vertical federation perspective their features do not overlap: one feature is a book ID, the other a movie ID. Each company has a matrix of user ratings of its products, but the rating data is sparse, and joint modeling on only a small amount of rating data yields a poor model and hurts the recommendation effect.
Disclosure of Invention
The embodiments of the application provide a data processing method, apparatus, electronic device, computer readable storage medium, and computer program product, which can improve the model effect of joint modeling, determine recommendation results based on extracted features with high discrimination, and improve the recommendation success rate.
The technical scheme of the embodiment of the application is realized as follows:
The embodiment of the application provides a data processing method based on a federated learning system, where the federated learning system includes a first participant device and at least one second participant device, and the method is applied to the first participant device and includes the following steps:
acquiring first feature data, second feature data, and an encrypted feature sent by the second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users;
training a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model;
sending the trained first model to a target device, so that the target device aggregates the trained first model to obtain a global model, the target device being a server device or a participant device in the federated learning system;
receiving the global model sent by the target device;
and acquiring attribute information of an object to be recommended, and processing the attribute information using the global model to obtain a recommendation result.
An embodiment of the present application provides a data processing apparatus, including:
a first acquisition module, configured to acquire first feature data, second feature data, and an encrypted feature sent by a second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users;
a training module, configured to train a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model;
a first sending module, configured to send the trained first model to a target device, so that the target device aggregates the trained first model to obtain a global model, the target device being a server device or a participant device in the federated learning system;
a receiving module, configured to receive the global model sent by the target device;
a second acquisition module, configured to acquire attribute information of an object to be recommended;
and a processing module, configured to process the attribute information using the global model to obtain a recommendation result.
In the above scheme, the training module is further configured to:
convert the first feature data and the second feature data respectively using a preset conversion model, obtaining converted first feature data and converted second feature data;
perform feature extraction on the converted first feature data using the preset first model, obtaining a local feature corresponding to the first feature data;
determine a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data, and the encrypted feature;
process the local feature and the first private feature using a preset first loss function to determine a first loss value;
and back-propagate the first loss value to the preset first model to adjust the parameters of the preset first model, obtaining the trained first model.
In the above scheme, the training module is further configured to:
train a preset second model based on the converted second feature data and the encrypted feature, obtaining a trained second model;
and perform feature extraction on the converted first feature data using the trained second model, obtaining the first private feature corresponding to the first feature data.
In the above scheme, the training module is further configured to:
project the converted second feature data using the preset second model, obtaining a projection feature;
process the projection feature and the encrypted feature using a preset second loss function to determine a second loss value;
and back-propagate the second loss value to the preset second model to adjust the parameters of the preset second model, obtaining the trained second model.
In the above scheme, the training module is further configured to:
acquire the preset second model, where the preset second model includes an initial first sub-model and an initial second sub-model;
perform feature extraction on the converted second feature data using the initial first sub-model, obtaining a second private feature;
and project the second private feature using the initial second sub-model, obtaining the projection feature.
In the above scheme, the training module is further configured to:
back-propagate the second loss value to the initial first sub-model to adjust the parameters of the initial first sub-model, obtaining a trained first sub-model;
back-propagate the second loss value to the initial second sub-model to adjust the parameters of the initial second sub-model, obtaining a trained second sub-model;
and determine the trained first sub-model and the trained second sub-model as the trained second model.
In the above scheme, the apparatus further includes:
an adjustment module, configured to back-propagate the first loss value to the trained first sub-model to adjust the parameters of the trained first sub-model, obtaining an updated first sub-model;
correspondingly, the training module is further configured to:
perform feature extraction on the converted first feature data using the updated first sub-model, obtaining the first private feature corresponding to the first feature data.
In the above scheme, when the first participant device holds label data, the processing module is further configured to:
acquire training data and label data corresponding to the training data;
construct an initial classification model from the conversion model, a feature extraction model, and a preset classifier, where the feature extraction model includes the global model and/or the trained second sub-model;
train the initial classification model based on the training data and the label data, obtaining a trained classification model;
and process the attribute information using the trained classification model to obtain the recommendation result.
In the above scheme, the processing module is further configured to:
process the training data using the initial classification model to obtain a training result;
process the label data and the training result using a preset third loss function to determine a third loss value;
and back-propagate the third loss value to the initial classification model to adjust the parameters of the initial classification model, obtaining the trained classification model.
The embodiment of the application further provides a data processing method based on a federated learning system, where the federated learning system includes a first participant device and at least one second participant device, and the method is applied to the second participant device and includes the following steps:
acquiring third feature data held by the second participant device and a preset third model, where the preset third model includes an initial third sub-model;
performing feature extraction on the third feature data using the initial third sub-model, obtaining a third private feature;
encrypting the third private feature, obtaining an encrypted feature;
and sending the encrypted feature to the first participant device, so that the first participant device determines, based on the encrypted feature, a global model for processing attribute information of an object to be recommended.
The embodiment of the application further provides a data processing apparatus, including:
a third acquisition module, configured to acquire third feature data held by the second participant device and a preset third model, where the preset third model includes an initial third sub-model;
a feature extraction module, configured to perform feature extraction on the third feature data using the initial third sub-model, obtaining a third private feature;
an encryption module, configured to encrypt the third private feature, obtaining an encrypted feature;
and a second sending module, configured to send the encrypted feature to the first participant device, so that the first participant device determines, based on the encrypted feature, a global model for processing attribute information of an object to be recommended.
An embodiment of the present application provides an electronic device, including:
a memory, configured to store executable instructions;
and a processor, configured to implement the data processing method provided by the embodiments of the application when executing the executable instructions stored in the memory.
An embodiment of the application provides a computer readable storage medium storing executable instructions that, when executed by a processor, implement the data processing method provided by the embodiments of the application.
An embodiment of the application provides a computer program product including a computer program that, when executed by a processor, implements the data processing method provided by the embodiments of the application.
The embodiments of the application have the following beneficial effects:
The data processing method provided by the embodiments of the application introduces a label-free joint modeling process, addressing the problems that, in the vertical scenario, label data among the users overlapping between the active party and the passive party is scarce and the quality of the passive party's data source is difficult to evaluate accurately. Joint modeling with the data of unlabeled overlapping users and of non-overlapping users optimizes the joint model of the feature extraction part, which can improve the model effect of the joint modeling; the recommendation result is then determined based on extracted features with high discrimination, which can improve the recommendation success rate.
Drawings
FIG. 1 is a schematic architecture diagram of a data processing system according to an embodiment of the present application;
FIGS. 2A-2B are schematic structural diagrams of an electronic device according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of training a preset first model according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of training a second model according to an embodiment of the present application;
FIG. 6 is a schematic flowchart of projecting the converted second feature data according to an embodiment of the present application;
FIG. 7 is a schematic flowchart of processing attribute information according to an embodiment of the present application;
FIG. 8 is another schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a vertical federated learning system architecture according to an embodiment of the present application;
FIG. 10 is a schematic diagram of the overall structure of a vertical federated learning model pre-training method according to an embodiment of the present application;
FIG. 11 is a schematic flowchart of private model training based on aligned data according to an embodiment of the present application;
FIG. 12 is a schematic flowchart of local model training based on local data according to an embodiment of the present application;
FIG. 13 is a schematic flowchart of local model aggregation and distribution according to an embodiment of the present application;
FIG. 14 is a schematic flowchart of labeled supervised learning based on a pre-trained model according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those of ordinary skill in the art without inventive effort fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.
Where "first/second" or similar descriptions appear in this document, the terms "first/second/third" merely distinguish similar objects and do not imply a particular ordering of those objects; where permitted, the specific order or precedence may be interchanged, so that the embodiments of the application described herein can be practiced in an order other than that illustrated or described.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in further detail, the terms involved in the embodiments of the present application are explained; the following explanations apply to these terms.
1) Federated learning refers to a method of machine learning in which different participants (also called parties, data owners, or clients) jointly train a model. In federated learning, participants do not need to expose their own data to other participants or to the coordinator (also referred to as the parameter server or aggregation server), so federated learning can well protect user privacy and data security.
2) Horizontal federated learning: when the participants' data features overlap heavily but their users overlap little, the portions of data that share the same features but belong to different users are taken out for joint machine learning.
3) Vertical federated learning: when the participants' data features overlap little but their users overlap heavily, the shared users, together with their differing data features, are taken out for joint machine learning training.
4) Contrastive learning is a method of teaching a machine learning model which things are similar and which are dissimilar. With this approach, machine learning models can be trained to distinguish between similar and dissimilar samples.
The embodiments of the application provide a data processing method, apparatus, electronic device, computer readable storage medium, and computer program product, which can improve the model effect of joint modeling, determine recommendation results based on extracted features with high discrimination, and improve the recommendation success rate.
Based on the above explanation of the terms involved in the embodiments of the application, refer first to FIG. 1, which is a schematic architecture diagram of a data processing system provided by an embodiment of the present application. The data processing system 100 includes a first participant device 400 and second participant devices 410 (two second participant devices are shown as an example and are denoted 410-1 and 410-2 to distinguish them). The first participant device 400 and the second participant devices 410 are connected to each other, and to the server device, through a network 300, which may be a wide area network, a local area network, or a combination of the two, with data transmission implemented over wireless links.
In some embodiments, the first participant device 400 and the second participant device 410 may be, but are not limited to, a notebook computer, tablet computer, desktop computer, smart phone, dedicated messaging device, portable game device, smart speaker, or smart watch, and may also be a client terminal of a federated learning participant, such as a participant device storing user feature data at a bank or other financial institution. The server device may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN, Content Delivery Network), big data, and artificial intelligence platforms, and it assists the participant devices in federated learning to obtain a federated learning model. The first participant device 400 and the second participant device 410 may be connected directly or indirectly through wired or wireless communication, which is not limited in the embodiments of the present application.
The first participant device 400 is configured to: acquire first feature data and second feature data held by the first participant device, and receive an encrypted feature sent by the second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users; train a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model; send the trained first model to a target device, so that the target device aggregates the trained models sent by the participant devices to obtain a global model, the target device being a server device or any participant device in the federated learning system; receive the global model sent by the target device; and acquire attribute information of an object to be recommended and process the attribute information using the global model to obtain a recommendation result.
The second participant device 410 is configured to: acquire third feature data and a preset third model, where the preset third model includes an initial third sub-model; perform feature extraction on the third feature data using the initial third sub-model to obtain a third private feature; encrypt the third private feature to obtain an encrypted feature; and finally send the encrypted feature to the first participant device, so that the first participant device determines, based on the encrypted feature, a global model for processing attribute information of an object to be recommended.
Referring to FIGS. 2A-2B, which are schematic structural diagrams of an electronic device according to an embodiment of the present application: in practical applications, the electronic device 500 may be implemented as the first participant device 400 or the second participant device 410 in FIG. 1. The electronic device 500 shown in FIGS. 2A-2B includes at least one processor 510, a memory 550, at least one network interface 520, and a user interface 530. The components of the electronic device 500 are coupled together by a bus system 540. It should be understood that the bus system 540 enables communication among these components; in addition to a data bus, it includes a power bus, a control bus, and a status signal bus. For clarity of illustration, the various buses are all labeled as the bus system 540 in FIGS. 2A-2B.
The processor 510 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor (for example a microprocessor or any conventional processor), a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
The user interface 530 includes one or more output devices 531 that enable presentation of media content, including one or more speakers and/or one or more visual displays. The user interface 530 also includes one or more input devices 532, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 550 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 550 may optionally include one or more storage devices physically located remote from processor 510.
The memory 550 includes volatile memory or non-volatile memory, and may also include both. The non-volatile memory may be a read-only memory (ROM, Read Only Memory), and the volatile memory may be a random access memory (RAM, Random Access Memory). The memory 550 described in the embodiments of the present application is intended to comprise any suitable type of memory.
In some embodiments, memory 550 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 551 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
a network communication module 552, used to reach other computing devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 including Bluetooth, Wireless Fidelity (Wi-Fi), Universal Serial Bus (USB, Universal Serial Bus), and the like;
A presentation module 553 for enabling presentation of information (e.g., a user interface for operating a peripheral device and displaying content and information) via one or more output devices 531 (e.g., a display screen, speakers, etc.) associated with the user interface 530;
The input processing module 554 is configured to detect one or more user inputs or interactions from one of the one or more input devices 532 and translate the detected inputs or interactions.
In some embodiments, the data processing apparatus provided by the embodiments of the present application may be implemented in software. FIG. 2A shows a schematic structural diagram of the electronic device provided as the first participant device 400; the data processing apparatus 555 stored in the memory 550 may be software in the form of a program, a plug-in, or the like, including the following software modules: a first acquisition module 5551, a training module 5552, a first sending module 5553, a receiving module 5554, a second acquisition module 5555, and a processing module 5556. These modules are logical and thus may be arbitrarily combined or further split according to the functions implemented. The functions of each module are described below.
In some embodiments, as shown in FIG. 2B, which illustrates a schematic structural diagram of the electronic device provided as the second participant device 410, the software modules stored in the data processing apparatus 555 of the memory 550 may include: a third acquisition module 5557, a feature extraction module 5558, an encryption module 5559, and a second sending module 5560. These modules are logical and thus may be arbitrarily combined or further split according to the functions implemented. The functions of each module are described below.
In other embodiments, the data processing apparatus provided by the embodiments of the present application may be implemented in hardware. As an example, it may be a processor in the form of a hardware decoding processor programmed to perform the data processing method provided by the embodiments of the present application; for example, the processor in the form of a hardware decoding processor may employ one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.
The data processing method provided by the embodiments of the present application is described below in connection with exemplary applications and implementations of the first participant device provided by the embodiments of the present application. The method is based on a federated learning system that includes a first participant device and at least one second participant device. Referring to FIG. 3, which is a schematic flowchart of a data processing method according to an embodiment of the present application, the steps shown in FIG. 3 are described below.
Step S301: acquire first feature data and second feature data held by the first participant device, and an encrypted feature sent by a second participant device.
Here, the encrypted feature is obtained by the second participant device encrypting the third feature data it holds. The second feature data and the third feature data are data of different features of the same users, i.e., the second feature data and the third feature data are aligned.
In actual implementations of the vertical federated setting, at least two participants are typically involved: a first participant holding feature data and label data (also referred to as the Guest or active party) and a second participant holding feature data (also referred to as the Host or passive party). The method provided by the embodiments of the application can be applied to a vertical federated model in which one Guest party and at least one Host party participate. In this scenario, among the users held in common by the first and second participants, the first participant has no labeled data or only a small amount of labeled data, so conventional joint modeling via labeled data cannot be performed. The embodiments of the application therefore provide joint modeling in the label-free data scenario.
The embodiments of the application take an active party A and a passive party B as an example: the first participant corresponds to the active party and the second participant to the passive party. In practice, the same scheme can be extended to model training with several passive parties jointly. The first feature data of the first participant is denoted X_A,unaligned, the second feature data of the first participant device X_A,aligned, and the third feature data of the second participant X_B,aligned. The second participant device converts, feature-extracts, projects, and encrypts the third feature data to obtain a second encrypted feature f_B,aligned (the "second encrypted feature" is the encrypted feature sent by the second participant; to distinguish it from the feature obtained by the first participant device's own encryption, the latter is hereinafter called the first encrypted feature). The first participant device acquires the first feature data X_A,unaligned and second feature data X_A,aligned from its own storage, and acquires the second encrypted feature f_B,aligned from the second participant device. The second feature data X_A,aligned and the third feature data X_B,aligned constitute the two parties' aligned data.
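For concreteness, the following is a minimal sketch, not part of the original disclosure, of the data layout just described, assuming a Python/PyTorch implementation; all variable names, sample counts, and dimensions are illustrative assumptions. Later sketches in this description reuse these names.

```python
# Illustrative only: the data held by party A (Guest/active) and party B (Host/passive).
import torch

n_unaligned, n_aligned = 1000, 200   # A-side non-overlapping / overlapping sample counts (assumed)
d_A, d_B = 64, 32                    # raw feature dimensions held by A and B (assumed)
d_common = 128                       # predefined common dimension after conversion (assumed)

X_A_unaligned = torch.randn(n_unaligned, d_A)  # first feature data (A only)
X_A_aligned   = torch.randn(n_aligned, d_A)    # second feature data (overlapping users)
X_B_aligned   = torch.randn(n_aligned, d_B)    # third feature data (held by B)
```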
Step S302, training a preset first model based on the first feature data, the second feature data and the encryption feature to obtain a trained first model.
The preset first model is trained with (X_A,unaligned, X_A,aligned, f_B,aligned) to obtain the trained first model. The preset first model here is the local feature extraction model initialized by the first participant device. Step S302 performs model pre-training based on unlabeled data, realizing the joint modeling.
FIG. 4 is a schematic flowchart of training a preset first model according to an embodiment of the present application. As shown in FIG. 4, in one implementation, step S302 may be implemented by steps S3021 to S3025:
In step S3021, the first feature data and the second feature data are respectively converted using a preset conversion model, obtaining converted first feature data and converted second feature data.
Each participant in the joint training has a conversion model C, with which it converts its raw data X into features (matrices or vectors) of a predefined common dimension, obtaining conversion features h. For example, the first feature data X_A,unaligned and second feature data X_A,aligned of the A side are input into the preset conversion model C_A for conversion, obtaining converted first feature data h_A,unaligned and converted second feature data h_A,aligned. The B side performs the same operation, inputting its third feature data X_B,aligned into its preset conversion model C_B to obtain converted third feature data h_B,aligned.
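A minimal sketch of such a conversion model, continuing the illustrative PyTorch setup above; a single linear layer is an assumption, since the patent only requires that raw data be mapped to a predefined common dimension.

```python
import torch.nn as nn

class ConversionModel(nn.Module):
    """Conversion model C: maps a party's raw features to the common dimension."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.proj = nn.Linear(d_in, d_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

C_A = ConversionModel(d_A, d_common)       # A's conversion model C_A
h_A_unaligned = C_A(X_A_unaligned)         # converted first feature data
h_A_aligned   = C_A(X_A_aligned)           # converted second feature data
```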
In step S3022, feature extraction is performed on the converted first feature data using the preset first model, obtaining local features corresponding to the first feature data.
Each participant has its own local feature extraction model M_local (the first model), and these local models share the same structure. Each local feature extraction model M_local acts on the output of the party's conversion model, that is, on the converted feature data h, and extracts the local features f corresponding to the raw data. For example, the preset first model of the A side is M_A,local; the converted first feature data h_A,unaligned obtained in step S3021 is input into M_A,local for feature extraction, obtaining the local features f_A,unaligned corresponding to the first feature data X_A,unaligned.
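A sketch of a feature extraction model of this kind, continuing the example above; the two-layer MLP structure is an assumption (the patent fixes only that all local models share one structure).

```python
class FeatureExtractor(nn.Module):
    """Feature extraction model (used here for M_local; the same shape is reused
    below for the private extractors): maps converted data h to features f."""
    def __init__(self, d_in: int, d_feat: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, d_feat), nn.ReLU(),
                                 nn.Linear(d_feat, d_feat))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.net(h)

M_A_local = FeatureExtractor(d_common)     # A's local feature extraction model
f_A_unaligned = M_A_local(h_A_unaligned)   # local features for X_A,unaligned
```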
Step S3023: determine a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data, and the encrypted feature.
In the embodiments of the application, determining the first private feature may be implemented as follows: train a preset second model based on the converted second feature data and the encrypted feature, obtaining a trained second model; then perform feature extraction on the converted first feature data using the trained second model, obtaining the first private feature corresponding to the first feature data.
Each participant has its own private feature extraction model, and the preset second model includes this private feature extraction model. The private feature extraction models of the participants may have the same or different structures; the structure and parameter weights of each private model are the party's private data and are unknown to the other participants.
The first participant device trains the preset second model based on the converted second feature data and the encrypted feature to obtain a trained second model. The second model includes a private feature extraction model, denoted M_A,private, that can extract private features of the data. The converted first feature data h_A,unaligned is processed with the private feature extraction model M_A,private to extract the first private feature f_A,aligned corresponding to the first feature data X_A,unaligned.
The second encrypted feature f_B,aligned is obtained by the second participant device performing private feature extraction on the third feature data X_B,aligned to obtain a private feature and then encrypting that private feature.
In step S3024, the local feature and the first private feature are processed by using a preset first loss function to determine a first loss value.
This step is a model update process whose purpose is to make the features of the same user as close as possible in the feature space and the features of different users as far apart as possible. In the embodiments of the application, this objective can be realized through the preset first loss function, concretely as follows: the A side calculates the first loss function based on (f_A,unaligned, f_A,aligned). The preset first loss function may be a contrastive loss function or a similarity loss function; in the embodiments of the application, the similarity loss function may, for example, take the following form (1):

$$L_{sim}(f_i, f_j) = -\log \frac{\exp(\mathrm{sim}(f_i, f_j)/\tau)}{\sum_{k=1}^{N} \exp(\mathrm{sim}(f_i, f_k)/\tau)} \tag{1}$$

where sim(·,·) denotes a similarity measure such as cosine similarity and τ is a temperature coefficient. Here i is a positive integer in [1, N], where N, the number of local features, equals the number of first feature data, and f_i denotes the i-th local feature; j is a positive integer in [1, M], where M, the number of first private features, equals the number of second feature data, and f_j denotes the j-th first private feature. With i and j traversing the samples in the A side's current training pool, the first loss function is calculated according to formula (1). The similarity loss of each sample is calculated and summed to obtain the final first loss value, as shown in formula (2):

$$L_{A,sim} = \sum_{i} L_{sim}(f_i, f_j) \tag{2}$$
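The following sketch shows how a loss of the form (1)-(2) could be computed, continuing the PyTorch example; the InfoNCE-style formulation and the temperature value are assumptions consistent with the contrastive objective described above.

```python
import torch.nn.functional as F

def similarity_loss(f_a: torch.Tensor, f_b: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """InfoNCE-style similarity loss: row i of f_a and row i of f_b belong to the
    same sample and are pulled together; all other rows are pushed apart."""
    f_a = F.normalize(f_a, dim=1)
    f_b = F.normalize(f_b, dim=1)
    logits = f_a @ f_b.T / tau                 # pairwise cosine similarities / temperature
    targets = torch.arange(f_a.size(0))        # the positive pair sits on the diagonal
    return F.cross_entropy(logits, targets, reduction="sum")  # summed over samples
```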
In step S3025, the first loss value is back-propagated to the preset first model to adjust the parameters of the preset first model, obtaining the trained first model.
The A side calculates the first loss value L_A,sim from (f_A,unaligned, f_A,aligned) as above, back-propagates the first loss value to the first model, and adjusts the parameters of the first model by gradient descent to update it, obtaining the trained first model.
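A sketch of one such update step, continuing the example; whether the conversion model C_A is updated together with M_A,local, and the optimizer and learning rate, are assumptions.

```python
M_A_private = FeatureExtractor(d_common)    # A's private extractor (first sub-model)
optimizer = torch.optim.SGD(list(M_A_local.parameters()) + list(C_A.parameters()), lr=0.01)

f_local   = M_A_local(C_A(X_A_unaligned))             # local features f_A,unaligned
f_private = M_A_private(C_A(X_A_unaligned)).detach()  # first private features (held fixed here)

loss = similarity_loss(f_local, f_private)  # first loss value L_A,sim
optimizer.zero_grad()
loss.backward()                             # back-propagate to the preset first model
optimizer.step()                            # gradient-descent parameter update
```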
The same procedure also applies to the training of the B side's local feature extraction model M_B,local.
Step S303: send the trained first model to the target device, so that the target device aggregates the trained first model to obtain a global model.
The target device here is a server device or a participant device in the federated learning system.
The first participant device sends the trained first model, i.e., the trained local feature extraction model, to the server device or to a preset participant device, which aggregates the local feature extraction models trained by the respective participant devices to obtain a global model and then sends the global model back to each participant device.
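A sketch of the aggregation step on the target device, continuing the example; unweighted FedAvg-style parameter averaging is an assumption, since the patent does not fix the aggregation rule.

```python
def aggregate(state_dicts):
    """Target-device aggregation: element-wise average of the trained local extractors."""
    return {k: torch.stack([sd[k].float() for sd in state_dicts]).mean(dim=0)
            for k in state_dicts[0]}

M_B_local = FeatureExtractor(d_common)      # B's local extractor (same structure)
global_state = aggregate([M_A_local.state_dict(), M_B_local.state_dict()])
M_A_local.load_state_dict(global_state)     # global model sent back to each participant
```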
Step S304, receiving the global model sent by the target device.
In some embodiments, before processing the data to be processed, if the first participant device holds a small amount of label data, the global model may be further adjusted according to the label data, further improving the effect of the joint model and making the processing results more accurate.
In step S305, attribute information of the object to be recommended is acquired.
In step S306, the attribute information is processed using the global model to obtain a recommendation result.
After the global model is obtained, the first participant device can process the data to be processed with the global model to obtain a processing result. When the method is applied to information recommendation, the data to be processed is the attribute information of the object to be recommended; the global model extracts highly discriminative features from the attribute information, the recommendation result is determined based on those features, and the recommendation success rate can thereby be improved.
In summary, the data processing method provided by the embodiments of the application, based on a federated learning system including a first participant device and at least one second participant device and applied to the first participant device, includes: acquiring first feature data, second feature data, and an encrypted feature sent by the second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users; training a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model; sending the trained first model to a target device, so that the target device aggregates the trained first model to obtain a global model, the target device being a server device or a participant device in the federated learning system; receiving the global model sent by the target device; and acquiring attribute information of an object to be recommended and processing the attribute information with the global model to obtain a recommendation result. Through this manner of model pre-training, the model effect of joint modeling can be improved, the recommendation result is determined based on extracted features with high discrimination, and the recommendation success rate can be improved.
FIG. 5 is a schematic flowchart of training the second model according to an embodiment of the present application. As shown in FIG. 5, in some embodiments, "training a preset second model based on the converted second feature data and the encrypted feature to obtain a trained second model" may be implemented by the following steps:
Step S501: project the converted second feature data using the preset second model to obtain projection features.
Here, the preset second model includes an initial first sub-model and an initial second sub-model. The initial first sub-model is a private feature extraction model capable of extracting private features, denoted M_A,private; the initial second sub-model is a projection model for projecting features, denoted H_A.
FIG. 6 is a schematic flowchart of projecting the converted second feature data according to an embodiment of the present application. As shown in FIG. 6, in some embodiments, step S501 may be implemented by the following steps:
Step S5011: acquire the preset second model.
Step S5012: perform feature extraction on the converted second feature data using the initial first sub-model, obtaining a second private feature.
The converted second feature data h_A,aligned is input into the initial first sub-model M_A,private, which extracts the private feature of h_A,aligned, obtaining the second private feature f_A,aligned corresponding to the second feature data X_A,aligned. In the embodiments of the application, to ensure that data privacy is not leaked, the second private feature can be encrypted using an encryption method such as differential privacy.
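A sketch of such a differential-privacy release, continuing the example; the clip-and-add-Gaussian-noise mechanism and its parameters are assumptions (the patent names differential privacy only as one possible encryption method).

```python
def dp_encrypt(f: torch.Tensor, clip: float = 1.0, sigma: float = 0.1) -> torch.Tensor:
    """Differential-privacy style release of private features:
    clip each row's L2 norm, then add Gaussian noise."""
    norms = f.norm(dim=1, keepdim=True).clamp(min=1e-12)
    f_clipped = f * (clip / norms).clamp(max=1.0)
    return f_clipped + sigma * torch.randn_like(f_clipped)
```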
Step S5013: project the second private feature using the initial second sub-model, obtaining the projection feature.
The second private feature f_A,aligned is input into the initial second sub-model H_A, which projects it; the resulting projection feature corresponding to the second feature data X_A,aligned is denoted z_A.
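A sketch of the projection sub-model H_A, continuing the example; its MLP structure is an assumption.

```python
class ProjectionHead(nn.Module):
    """H_A (initial second sub-model): projects private features so they can be
    compared with the other party's encrypted features."""
    def __init__(self, d_feat: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_feat, d_feat), nn.ReLU(),
                                 nn.Linear(d_feat, d_feat))

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        return self.net(f)

H_A = ProjectionHead()
z_A = H_A(M_A_private(C_A(X_A_aligned)))    # projection features z_A for the aligned data
```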
Step S502: process the projection feature and the encrypted feature using a preset second loss function to determine a second loss value.
Similar to step S3024 above, this step is also a model update procedure in which the features of the same user are made as close as possible in the feature space and the features of different users as far apart as possible. In the embodiments of the application, this objective can be realized through the preset second loss function, concretely as follows: the A side calculates the loss function based on (z_A,aligned, f_B,aligned). The preset second loss function may be a contrastive loss function or a similarity loss function; in the embodiments of the application, the similarity loss function may, for example, take the following form (3):

$$L_{sim2}(f_i, z_j) = -\log \frac{\exp(\mathrm{sim}(f_i, z_j)/\tau)}{\sum_{k=1}^{N} \exp(\mathrm{sim}(f_i, z_k)/\tau)} \tag{3}$$

where i is a positive integer in [1, N], where N, the number of second encrypted features, equals the number of third feature data, and f_i denotes the i-th second encrypted feature; j is a positive integer in [1, M], where M, the number of projection features, equals the number of second feature data, and z_j denotes the j-th projection feature. Here i and j traverse the samples in the current training pools of the A and B sides respectively, and f_i and z_j are the features or encrypted features of the same user, i.e., a sample j of the A side is aligned with the sample i of the B side; the loss is calculated as in formula (3). The similarity loss of each aligned sample is calculated and summed to obtain the final second loss value, as shown in formula (4):

$$L_{A,sim2} = \sum_{i} L_{sim2}(f_i, z_j) \tag{4}$$
Step S503: back-propagate the second loss value to the preset second model to adjust the parameters of the preset second model, obtaining the trained second model.
The A side calculates the second loss value L_A,sim2 from (z_A,aligned, f_B,aligned), back-propagates it to the second model, and adjusts the parameters of the second model by gradient descent to update it, obtaining the trained second model.
Since the preset second model includes the initial first sub-model and the initial second sub-model, the parameters of each sub-model can be adjusted separately during back-propagation to obtain the trained second model. In actual implementation, step S503 may be realized as follows: back-propagate the second loss value to the initial first sub-model to adjust its parameters, obtaining a trained first sub-model; back-propagate the second loss value to the initial second sub-model to adjust its parameters, obtaining a trained second sub-model; and determine the trained first sub-model and the trained second sub-model as the trained second model.
In some embodiments, after the private feature extraction model (i.e., the first sub-model) is updated with the second loss value, the model may be further updated with the first loss value: the first loss value is back-propagated to the trained first sub-model to adjust its parameters, obtaining an updated first sub-model. At the next private feature extraction, the updated first sub-model can be used to perform feature extraction on the converted first feature data, obtaining the first private feature corresponding to the first feature data.
In some embodiments, when the first participant device holds label data, the global model may, after it is obtained, be further adjusted according to the label data, further improving the effect of the joint model and making the recommendation results more accurate. FIG. 7 is a schematic flowchart of processing attribute information according to an embodiment of the present application. As shown in FIG. 7, the step "processing the attribute information using the global model to obtain a recommendation result" may include the following steps:
Step S3051: acquire training data and label data corresponding to the training data.
The acquired training data is denoted X_train, and the label data corresponding to the training data is denoted Y_train.
Step S3052: construct an initial classification model from the conversion model, a feature extraction model, and a preset classifier.
The feature extraction model here includes the global model M_A,local and/or the trained first sub-model M_A,private. The initial classification model constructed from the A side's conversion model C_A, feature extraction model M_A, and preset classifier P_A is denoted (C_A, M_A, P_A). In the embodiments of the application, the feature extraction model M_A may include only the global model M_A,local, i.e., the attribute information is processed with the trained local feature extraction model; it may include only the trained first sub-model M_A,private, i.e., the attribute information is processed with the trained private feature extraction model; or it may include both the global model M_A,local and the trained first sub-model M_A,private, in which case the attribute information is processed with the trained local and private feature extraction models jointly, either by summing their outputs or by concatenating them.
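A sketch of such an initial classification model, continuing the example; the concatenation variant is shown, and the classifier structure and class count are assumptions.

```python
class ClassificationModel(nn.Module):
    """Initial classification model (C_A, M_A, P_A): conversion model, feature
    extractor(s), and classifier. Outputs of the global and private extractors
    are concatenated here; the patent also allows summing them."""
    def __init__(self, conv, m_local, m_private, d_feat: int = 128, n_classes: int = 2):
        super().__init__()
        self.conv, self.m_local, self.m_private = conv, m_local, m_private
        self.classifier = nn.Linear(2 * d_feat, n_classes)   # preset classifier P_A

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.conv(x)                                     # conversion model C_A
        f = torch.cat([self.m_local(h), self.m_private(h)], dim=1)
        return self.classifier(f)
```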
Step S3053, training the initial classification model based on the training data and the label data to obtain a trained classification model.
This step may be implemented as: processing the training data with the initial classification model to obtain a training result; processing the label data and the training result with a preset third loss function to determine a third loss value; and back-propagating the third loss value to the initial classification model to adjust its parameters, obtaining the trained classification model. A minimal sketch of this training step follows.
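A minimal sketch of the step above, assuming PyTorch; the layer widths, the number of classes, and cross-entropy as the preset third loss function are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Illustrative classification model: conversion + feature extraction + classifier.
conversion = nn.Linear(30, 64)
extractor = nn.Linear(64, 64)
classifier = nn.Linear(64, 2)
model = nn.Sequential(conversion, nn.ReLU(), extractor, nn.ReLU(), classifier)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()        # stand-in for the preset third loss function

X_train = torch.randn(64, 30)          # training data
Y_train = torch.randint(0, 2, (64,))   # label data

logits = model(X_train)                # training result
third_loss = loss_fn(logits, Y_train)  # third loss value
opt.zero_grad()
third_loss.backward()                  # back-propagate to adjust the model parameters
opt.step()
```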
Step S3054, processing the attribute information by using the trained classification model to obtain a recommendation result.
The attribute information of the object to be recommended is input into the trained classification model for feature extraction, and information recommendation is then performed based on the extracted features to obtain the recommendation result.
In the embodiment of the application, adjusting the model according to the tag data further improves the effect of the joint model; processing the attribute information with the adjusted model then makes the recommendation result more accurate, further improving the recommendation success rate.
Next, the data processing method provided by the embodiment of the present application will be described in connection with exemplary applications and implementations of the second participant device provided by the embodiment of the present application. Referring to fig. 8, fig. 8 is another flow chart of the data processing method according to the embodiment of the present application, and the steps shown in fig. 8 will be described.
Step S801, obtaining third feature data and a preset third model held by the second participant device.
Here, the preset third model includes an initial third sub-model for performing feature extraction on the third feature data.
Step S802, performing feature extraction processing on the third feature data by using the initial third sub-model to obtain a third private feature.
Step S803, performing encryption processing on the third private feature to obtain an encrypted feature.
The encryption feature here is the second encryption feature in the above embodiment.
In the embodiment of the application, in order to protect the data privacy of the second participant device, the third private feature may be encrypted to obtain the second encrypted feature; one illustrative protection scheme is sketched below.
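The embodiment does not fix a particular protection scheme; as a later passage notes, differential-privacy-style noise is one option. A minimal sketch under that assumption (clip per-sample norms, add Gaussian noise) — real deployments might instead use homomorphic encryption or secure aggregation:

```python
import torch

def protect_feature(f: torch.Tensor, clip_norm: float = 1.0, sigma: float = 0.1):
    """Clip per-sample norms and add Gaussian noise before sending (DP-style).

    Illustrative stand-in for the encryption processing; the function name and
    parameters are assumptions, not the patent's scheme.
    """
    norms = f.norm(dim=-1, keepdim=True).clamp(min=1e-12)
    clipped = f * (clip_norm / norms).clamp(max=1.0)
    return clipped + sigma * torch.randn_like(clipped)

f3 = torch.randn(8, 64)               # third private feature
encrypted_like = protect_feature(f3)  # what gets sent to the first participant device
```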
Step S804, the encryption feature is sent to the first participant device, so that the first participant device determines a global model for processing attribute information of the object to be recommended based on the encryption feature.
In some embodiments, the preset third model may further include an initial fourth sub-model configured to perform projection processing on the second encrypted feature to obtain a second projection feature. By executing the data processing method provided by this embodiment, the second participant device can train a joint model suited to itself, and the attribute information of its object to be recommended can then be processed through joint modeling.
In the following, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
Longitudinal (vertical) federated learning applies when the participants' data features overlap little but their users overlap heavily: the portion of users common to the participants, together with their differing user-data features, is taken out for joint machine-learning training. For example, consider two participants A and B in the same region, where participant A is a bank and participant B is an e-commerce platform. A and B share many of the same users in the region, but their services differ, so the user-data features they record differ and may, in particular, be complementary. In such a scenario, vertical federated learning can help A and B build a joint machine-learning prediction model and provide better service to their customers.
Fig. 9 is a schematic diagram of a longitudinal federated learning system architecture according to an embodiment of the present application; to assist A and B in joint modeling, a coordinator C is required. First part: participants A and B perform encrypted sample alignment, as illustrated in Fig. 9. Since the user groups of enterprises A and B do not coincide completely, the system uses an encryption-based user sample alignment technique to confirm the common users of the two parties without either party disclosing its data, and without exposing the users that do not overlap, so that the features of the common users can be combined for modeling. A simplified sketch of the idea follows.
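Production systems use genuine private-set-intersection protocols (for example, blinded-RSA or Diffie–Hellman based); the following salted-hash sketch only conveys the idea and is weaker than real PSI. All identifiers and the shared salt are illustrative assumptions:

```python
import hashlib

def blind(user_ids, salt: bytes):
    """Hash each ID with a shared secret salt; parties exchange only the digests."""
    return {hashlib.sha256(salt + uid.encode()).hexdigest(): uid for uid in user_ids}

salt = b"shared-secret"  # assumed to be pre-agreed out of band
digests_A = blind(["u1", "u2", "u3"], salt)
digests_B = blind(["u2", "u3", "u4"], salt)

# Each side learns only the intersection of digests, hence the common users,
# without seeing the other side's non-overlapping users in the clear.
common = digests_A.keys() & digests_B.keys()
aligned_users_A = sorted(digests_A[d] for d in common)  # ['u2', 'u3']
```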
Second part: encrypted model training. After the common user group is determined, the machine learning model can be trained on this data. To keep the data confidential during training, encrypted training is carried out with the help of coordinator C. Taking a linear regression model as an example, the training process can be divided into the following four steps. At step ①, coordinator C distributes a public key to A and B for encrypting the data to be exchanged during training. At step ②, participants A and B exchange, in encrypted form, the intermediate results used to compute the gradients. At step ③, participants A and B each compute their encrypted gradient values, participant B additionally computes the loss function from its label data, and both aggregate their results to coordinator C, which computes the total gradient value and decrypts it. At step ④, coordinator C sends the decrypted gradients back to participants A and B respectively, and each party updates the parameters of its own model accordingly.
The participants and the coordinator iterate these steps until the loss function converges, the model parameters converge, the maximum number of iterations is reached, or the maximum training time is reached, completing the whole model training process. A toy sketch of the four encrypted-training steps follows.
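A toy sketch of the four steps above using the python-paillier (`phe`) library, whose additively homomorphic ciphertexts let the encrypted gradient shares be summed before the coordinator decrypts. The scalar values are placeholders, and this is an illustration of the protocol shape, not the patent's implementation:

```python
from phe import paillier  # python-paillier; assumed available

# Step ①: coordinator C generates a keypair and distributes the public key to A and B.
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Step ②: A and B encrypt the intermediate results they exchange; here each party
# encrypts its share of one gradient component (toy scalars for illustration).
grad_share_A = public_key.encrypt(0.12)
grad_share_B = public_key.encrypt(-0.05)

# Step ③: the encrypted shares are combined homomorphically (ciphertext addition)
# and the aggregate is sent to coordinator C.
encrypted_total = grad_share_A + grad_share_B

# Step ④: C decrypts the total gradient and returns it so A and B can update.
total_gradient = private_key.decrypt(encrypted_total)  # approximately 0.07
```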
It should be noted that in both horizontal federated learning and vertical federated learning, the encryption operation and encrypted transmission are optional and depend on the specific application scenario; not all application scenarios require them.
As described above, in the vertical federated scenario the active party owns the label data Y and feature data X_1, while the passive party owns feature data X_2 (X_2 and X_1 are not identical); the active party hopes to improve the performance of its own model through joint modeling. The data quality of the passive party (X_2) determines how much the active party's model improves, so evaluating passive-party data quality is an important step in the federated modeling process. Data quality covers various evaluation indexes, such as user overlap ratio and the modeling-effect improvement relative to reference data. The modeling-effect improvement is usually evaluated after joint modeling with one or more groups of labeled data from the passive party: if the model performance improves substantially, the passive party's data is considered to be of good quality. In practice, however, the active party sometimes has little label data, the labeled data that the active and passive parties can match is scarce, and with so little data the quality evaluation either cannot be performed or its result does not reflect the actual data quality. If the passive party's data quality could instead be evaluated through joint modeling based on a large amount of unlabeled data, the range of accessible data sources could be expanded on the one hand, and the accuracy of the data evaluation improved on the other.
The embodiment of the application provides a longitudinal federated learning model pre-training method that is applicable to scenarios where only part of the data is aligned. Fig. 10 is a schematic diagram of the overall structure of the method; as shown in Fig. 10, the pre-training process comprises three steps: first, the participants perform self-supervised training of their respective private models based on the aligned users; second, each participant trains its local model based on local data; third, the local models from the second step are aggregated at a server and sent back.
Parties A and B hold data (X_A,aligned, X_A,unaligned) and (X_B,aligned, X_B,unaligned) respectively, where X_A,aligned and X_B,aligned are the aligned portions of the two parties' data. The two parties have conversion models C_A and C_B respectively, which convert the respective raw data into predefined features (matrices or vectors) of the same dimension, h_A and h_B; each participant has exactly one conversion model C. The two parties have respective private feature models M_A,private and M_B,private, which act on the outputs of the respective conversion models and output features f_A and f_B; the private models may differ in structure, and their structure and weights are private data of each party, unknown to the other. The two parties also have respective local feature extraction models M_A,local and M_B,local, which share the same structure. The inputs and outputs of the private and local feature extraction models have the same dimensions, respectively. In addition, the A-side has a projection model P_A and the B-side a projection model P_B, acting on f_A and f_B respectively to produce projections z_A and z_B. A minimal sketch of these components follows.
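A minimal sketch of how these components could be realized, assuming PyTorch; the common dimension, layer widths, and raw feature width are illustrative assumptions, not values fixed by the embodiment:

```python
import torch
import torch.nn as nn

DIM = 64  # assumed common feature dimension for h_A / h_B

class ConversionModel(nn.Module):
    """C_A / C_B: maps a party's raw data to the predefined common dimension."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.fc = nn.Linear(in_dim, DIM)

    def forward(self, x):
        return torch.relu(self.fc(x))

def make_feature_extractor() -> nn.Module:
    """M_private / M_local: input and output dimensions match (DIM -> DIM)."""
    return nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU(), nn.Linear(DIM, DIM))

def make_projection_head() -> nn.Module:
    """P_A / P_B: projects features f into the space where similarity is computed."""
    return nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU(), nn.Linear(DIM, DIM))

# Party A's stack (party B is symmetric, with its own raw input width).
C_A = ConversionModel(in_dim=30)      # raw feature width of A: assumed 30
M_A_private = make_feature_extractor()
M_A_local = make_feature_extractor()  # same architecture across parties
P_A = make_projection_head()

x_A = torch.randn(8, 30)              # a batch of A's raw data
h_A = C_A(x_A)                        # converted features
f_A = M_A_private(h_A)                # private features
z_A = P_A(f_A)                        # projections used in the similarity loss
```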
Basic training flow:
First, private model training based on the aligned users.
Fig. 11 is a schematic flow chart of private model training based on aligned data according to an embodiment of the present application. Training starts with initialization: each participant initializes its models M_A,private, M_B,private, P_A and P_B. The A-side and B-side achieve data alignment in an encrypted manner.
① The A-side and B-side obtain the features f_A and f_B of the data through their respective models M_A,private and M_B,private, and obtain z_A and z_B through P_A and P_B respectively. The B-side sends f_B to the A-side and the A-side sends f_A to the B-side; in addition, the security of f_A and f_B can be further improved through various encryption means, such as a differential privacy method.
② This step is the model updating process; the aim is to make the features of the same user as close as possible in the feature space and the features of different users as far apart as possible. There are a number of ways to achieve this goal, such as by choosing a specific loss function; a typical way is as follows. The A-side and the B-side compute the similarity loss function based on (z_A, f_B) and (z_B, f_A), respectively.
For two aligned samples f_i and z_j, the loss function takes the specific form of formula (5), which is rendered as an image in the source and not reproduced here; a representative form is given below. In it, i and j traverse the samples in the current training batch of the A-side and the B-side respectively. For a sample i of the A-side aligned with a sample j of the B-side, the overall loss function is calculated in this manner; the similarity loss of each aligned sample is computed and summed to obtain the final result.
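A representative instantiation consistent with the stated goal (aligned pairs pulled together, other samples in the batch pushed apart) is an InfoNCE-style contrastive loss; this is an assumption, not the patent's exact formula (5):

```latex
% Per-pair similarity loss for A-side projection z_i aligned with B-side feature f_j:
\ell(z_i, f_j) = -\log
    \frac{\exp\big(\operatorname{sim}(z_i, f_j)/\tau\big)}
         {\sum_{k=1}^{N} \exp\big(\operatorname{sim}(z_i, f_k)/\tau\big)},
\qquad
\operatorname{sim}(u, v) = \frac{u^{\top} v}{\lVert u \rVert \, \lVert v \rVert}

% Batch-level result: sum over all aligned pairs of the current batch
L_{A,\mathrm{sim}} = \sum_{(i,j)\,\text{aligned}} \ell(z_i, f_j)
```

Here τ is a temperature hyper-parameter, k ranges over the N B-side samples of the current batch, and the aligned pair (i, j) plays the role of the positive example.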
③ The A-side and B-side calculate their respective loss functions L_A,sim and L_B,sim from (z_A, f_B) and (z_B, f_A), and update their respective models (M_A,private, P_A) and (M_B,private, P_B) by gradient descent.
Second, local model training based on local data.
Fig. 12 is a schematic flow chart of local model training based on local data according to an embodiment of the present application. Training again starts with initialization: at this point the A-side and B-side have completed the training of M_A,private and M_B,private from the previous step. The training process is described for party A only; party B proceeds in the same way.
① Party A obtains the corresponding features from the local data (X_A,aligned, X_A,unaligned) through M_A,private and M_A,local respectively, and then computes a contrastive or similarity loss function L between them. There are various ways to tie the outputs of the two models together, such as directly using a mean-squared-error loss; if an additional projection model is added, the features are pulled together by the method of the first step.
② M_A,private and M_A,local are updated based on the loss function computed in the previous step; whether to also update M_A,private can be chosen as in the previous step. A sketch of one such update follows.
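A sketch of one such update, assuming PyTorch and a mean-squared-error tying loss; here M_A,private is kept frozen, which corresponds to choosing not to update it. Widths and names are illustrative:

```python
import torch
import torch.nn.functional as F

DIM = 64
# Stand-ins for the already-trained conversion and private models, and the local model.
C_A = torch.nn.Linear(30, DIM)
M_A_private = torch.nn.Linear(DIM, DIM)
M_A_local = torch.nn.Linear(DIM, DIM)

optimizer = torch.optim.SGD(M_A_local.parameters(), lr=0.01)

x_local = torch.randn(32, 30)   # batch drawn from (X_A,aligned, X_A,unaligned)
with torch.no_grad():           # keep M_A,private fixed (updating it is optional)
    h = C_A(x_local)
    f_private = M_A_private(h)

f_local = M_A_local(h)          # h carries no graph, so only M_A,local gets gradients
loss = F.mse_loss(f_local, f_private)  # one possible way to tie the two outputs
optimizer.zero_grad()
loss.backward()
optimizer.step()
```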
Third, local model aggregation and distribution.
Fig. 13 is a schematic flow chart of local model aggregation and distribution provided in an embodiment of the present application. Initialization: at this point the A-side and B-side have completed the training of M_A,local and M_B,local from the current step.
① Each participant uploads its local model (M_A,local and M_B,local) to a server, or to one pre-designated participant, for model aggregation; a common aggregation method is FedAvg.
② The aggregated global model is distributed to each participant. The simplest approach replaces the local model directly with the global model; alternatively, the global model can be used to update the local model on local data, as in the second step. A FedAvg sketch follows.
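A sketch of the aggregation and distribution, assuming PyTorch and unweighted FedAvg (real FedAvg typically weights each participant by its sample count); model shapes are illustrative:

```python
import copy
import torch

def fedavg(state_dicts):
    """Unweighted FedAvg: average every parameter tensor across participants."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# The two uploaded local models share one architecture (illustrative width 64);
# in practice each lives on its own participant device.
M_A_local = torch.nn.Linear(64, 64)
M_B_local = torch.nn.Linear(64, 64)

# Server side: aggregate, then distribute the global model back to the parties;
# the simplest option replaces each local model outright.
global_state = fedavg([M_A_local.state_dict(), M_B_local.state_dict()])
M_A_local.load_state_dict(global_state)
M_B_local.load_state_dict(global_state)
```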
Supervised training/fine-tuning based on the pre-trained models. Fig. 14 is a schematic flow chart of labeled supervised learning based on the pre-trained models according to an embodiment of the present application. As shown in Fig. 14, after pre-training is completed, the trained local model and the trained private model can be used as the feature extraction model (M_A,private, M_A,local) and combined with a classification model Q to form a complete model (C, M_A,private, M_A,local, Q). This model is then used for local training on the limited labeled data, or for joint modeling with the participants that previously took part in pre-training. The outputs of M_A,private and M_A,local can be feature-aggregated according to need and scenario: for example, only the output of M_A,private or of M_A,local is used, or the sum of the two outputs, or their concatenation.
The steps of the method can be extended to multiple participants, so the whole scheme also applies to multi-participant scenarios. The data of each participant, including the unlabeled aligned data and the unlabeled unaligned data, is fully utilized, and pre-training can yield a feature extraction model with excellent performance, producing highly discriminative features; determining the recommendation result based on these highly discriminative features improves the recommendation success rate. Extracting features with such a model also alleviates the poor performance of supervised federated modeling when label data is scarce.
The method provided by the embodiment of the application combines self-supervised learning and model aggregation, and fully utilizes the unlabeled aligned data and the unlabeled unaligned data; by improving data utilization efficiency, it improves the performance of the feature extraction model and thereby the performance of the model on the final task.
The following continues with an exemplary architecture of the data processing apparatus 555, implemented as software modules, provided in an embodiment of the present application. In some embodiments, as shown in Fig. 2A, which illustrates a schematic structural diagram of a first participant device 400 provided in an embodiment of the present application, the software modules stored in the data processing apparatus 555 of the memory 540 may include:
A first obtaining module 5551, configured to obtain first feature data held by a first participant device, second feature data, and an encryption feature sent by a second participant device, where the encryption feature is obtained by performing encryption processing on third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same user;
The training module 5552 is configured to train a preset first model based on the first feature data, the second feature data, and the encryption feature, to obtain a trained first model;
A first sending module 5553, configured to send the trained first model to a target device, so that the target device aggregates the trained first model to obtain a global model; the target equipment is server equipment or participant equipment in the federal learning system;
a receiving module 5554, configured to receive the global model sent by the target device;
a second obtaining module 5555, configured to obtain attribute information of an object to be recommended;
and the processing module 5556 is configured to process the attribute information by using the global model to obtain a recommendation result.
In some embodiments, the training module 5552 is further configured to:
Respectively converting the first characteristic data and the second characteristic data by using a preset conversion model to obtain converted first characteristic data and converted second characteristic data;
Performing feature extraction processing on the converted first feature data by using a preset first model to obtain local features corresponding to the first feature data;
determining a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data and the encrypted feature;
Processing the local feature and the first private feature by using a preset first loss function to determine a first loss value;
And reversely transmitting the first loss value to the preset first model to adjust the parameters of the preset first model, so as to obtain a trained first model.
In some embodiments, the training module 5552 is further configured to:
Training a preset second model based on the converted second characteristic data and the encryption characteristic to obtain a trained second model;
and carrying out feature extraction processing on the converted first feature data by using the trained second model to obtain first private features corresponding to the first feature data.
In some embodiments, the training module 5552 is further configured to:
carrying out projection processing on the converted second characteristic data by using a preset second model to obtain projection characteristics;
Processing the projection characteristic and the encryption characteristic by using a preset second loss function to determine a second loss value;
and reversely transmitting the second loss value to the preset second model to adjust the parameters of the preset second model, so as to obtain a trained second model.
In some embodiments, the training module 5552 is further configured to:
Acquiring a preset second model, wherein the preset second model comprises an initial first sub-model and an initial second sub-model;
Performing feature extraction processing on the converted second feature data by using the initial first sub-model to obtain second private features;
And carrying out projection processing on the second private feature by using the initial second sub-model to obtain a projection feature.
In some embodiments, the training module 5552 is further configured to:
Back-propagating the second loss value to the initial first sub-model to adjust parameters of the initial first sub-model to obtain a trained first sub-model;
Back-propagating the second loss value to the initial second sub-model to adjust parameters of the initial second sub-model to obtain a trained second sub-model;
and determining the trained first sub-model and the trained second sub-model as a trained second model.
In some embodiments, the apparatus further comprises:
the adjustment module is used for reversely transmitting the first loss value to the trained first sub-model so as to adjust the parameters of the trained first sub-model and obtain an updated first sub-model;
correspondingly, the training module 5552 is further configured to:
and carrying out feature extraction processing on the converted first feature data by using the updated first sub-model to obtain a first private feature corresponding to the first feature data.
In some embodiments, when the first participant device holds tag data, the processing module 5556 is further configured to:
acquiring training data and label data corresponding to the training data;
Constructing an initial classification model based on the conversion model, a feature extraction model and a preset classifier, wherein the feature extraction model comprises the global model and/or the trained first sub-model;
Training the initial classification model based on the training data and the label data to obtain a trained classification model;
And processing the attribute information by using the trained classification model to obtain a recommendation result.
In some embodiments, the processing module 5556 is further configured to:
processing the training data by using the initial classification model to obtain a training result;
processing the tag data and the training result by using a preset third loss function, and determining a third loss value;
And reversely transmitting the third loss value to the initial classification model to adjust parameters of the initial classification model so as to obtain a trained classification model.
In some embodiments, as shown in fig. 2B, fig. 2B illustrates a schematic structural diagram of a second participant device 410 according to an embodiment of the present application, where software modules stored in the data processing device 555 of the memory 540 may include:
A third obtaining module 5557, configured to obtain third feature data held by the second participant device and a preset third model, where the preset third model includes an initial third sub-model;
the feature extraction module 5558 is configured to perform feature extraction processing on the third feature data by using the initial third sub-model to obtain a third private feature;
An encryption module 5559, configured to encrypt the third private feature to obtain an encrypted feature;
A second sending module 5560, configured to send the encryption feature to a first participant device, so that the first participant device determines a global model for processing attribute information of the object to be recommended based on the encryption feature.
It should be noted that, the description of the apparatus according to the embodiment of the present application is similar to the description of the embodiment of the method described above, and has similar beneficial effects as the embodiment of the method, so that a detailed description is omitted.
The embodiment of the application provides a computer program product, which comprises a computer program, wherein the computer program is executed by a processor to realize the data processing method provided by the embodiment of the application.
Embodiments of the present application provide a computer readable storage medium having stored therein executable instructions which, when executed by a processor, cause the processor to perform a method provided by embodiments of the present application, for example, a data processing method as shown in fig. 3.
In some embodiments, the computer readable storage medium may be FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; but may be a variety of devices including one or any combination of the above memories.
In some embodiments, the executable instructions may be in the form of programs, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, executable instructions may, but need not, correspond to files in a file system, may be stored as part of a file that holds other programs or data, such as in one or more scripts in a hypertext markup language (HTML, hyper Text Markup Language) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
As an example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices located at one site or distributed across multiple sites and interconnected by a communication network.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (12)

1. A data processing method, characterized in that it is based on a federal learning system comprising a first participant device and at least one second participant device, the method being applied to the first participant device, the method comprising:
acquiring first characteristic data, second characteristic data, and an encryption characteristic sent by second participant equipment, wherein the encryption characteristic is obtained by carrying out encryption processing on third characteristic data held by the second participant equipment, and the second characteristic data and the third characteristic data are data of different characteristics of the same user;
Training a preset first model based on the first characteristic data, the second characteristic data and the encryption characteristic to obtain a trained first model;
The training of the preset first model based on the first feature data, the second feature data and the encryption feature to obtain a trained first model includes:
Respectively converting the first characteristic data and the second characteristic data by using a preset conversion model to obtain converted first characteristic data and converted second characteristic data;
Performing feature extraction processing on the converted first feature data by using a preset first model to obtain local features corresponding to the first feature data;
determining a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data and the encrypted feature;
Processing the local feature and the first private feature by using a preset first loss function to determine a first loss value;
The first loss value is reversely transmitted to the preset first model, so that parameters of the preset first model are adjusted, and a trained first model is obtained;
Transmitting the trained first model to target equipment so that the target equipment aggregates the trained first model to obtain a global model; the target device is a server device or a participant device in the federal learning system;
receiving the global model sent by the target equipment;
and acquiring attribute information of the object to be recommended, and processing the attribute information by utilizing the global model to obtain a recommendation result.
2. The method of claim 1, wherein the determining a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data, and the encrypted feature comprises:
Training a preset second model based on the converted second characteristic data and the encryption characteristic to obtain a trained second model;
and carrying out feature extraction processing on the converted first feature data by using the trained second model to obtain first private features corresponding to the first feature data.
3. The method according to claim 2, wherein training the preset second model based on the converted second feature data and the encrypted feature to obtain a trained second model includes:
carrying out projection processing on the converted second characteristic data by using a preset second model to obtain projection characteristics;
Processing the projection characteristic and the encryption characteristic by using a preset second loss function to determine a second loss value;
and reversely transmitting the second loss value to the preset second model to adjust the parameters of the preset second model, so as to obtain a trained second model.
4. A method according to claim 3, wherein the projecting the converted second feature data with a preset second model to obtain a projection feature includes:
Acquiring a preset second model, wherein the preset second model comprises an initial first sub-model and an initial second sub-model;
Performing feature extraction processing on the converted second feature data by using the initial first sub-model to obtain second private features;
And carrying out projection processing on the second private feature by using the initial second sub-model to obtain a projection feature.
5. The method of claim 4, wherein back-propagating the second loss value to the preset second model to adjust parameters of the preset second model to obtain a trained second model comprises:
Back-propagating the second loss value to the initial first sub-model to adjust parameters of the initial first sub-model to obtain a trained first sub-model;
Back-propagating the second loss value to the initial second sub-model to adjust parameters of the initial second sub-model to obtain a trained second sub-model;
and determining the trained first sub-model and the trained second sub-model as a trained second model.
6. The method of claim 5, wherein the method further comprises:
back-propagating the first loss value to the trained first sub-model to adjust parameters of the trained first sub-model to obtain an updated first sub-model;
correspondingly, the feature extraction processing is performed on the converted first feature data by using the trained second model to obtain a first private feature corresponding to the first feature data, which includes:
and carrying out feature extraction processing on the converted first feature data by using the updated first sub-model to obtain a first private feature corresponding to the first feature data.
7. The method of claim 5, wherein said processing said attribute information using said global model to obtain a recommendation result when said first participant device holds tag data, comprises:
acquiring training data and label data corresponding to the training data;
Constructing an initial classification model based on the conversion model, a feature extraction model and a preset classifier, wherein the feature extraction model comprises the global model and/or the trained first sub-model;
Training the initial classification model based on the training data and the label data to obtain a trained classification model;
And processing the attribute information by using the trained classification model to obtain a recommendation result.
8. The method of claim 7, wherein training the initial classification model based on the training data and the tag data results in a trained classification model, comprising:
processing the training data by using the initial classification model to obtain a training result;
processing the tag data and the training result by using a preset third loss function, and determining a third loss value;
And reversely transmitting the third loss value to the initial classification model to adjust parameters of the initial classification model so as to obtain a trained classification model.
9. A data processing apparatus, comprising:
The first acquisition module is used for acquiring first characteristic data, second characteristic data and an encryption characteristic sent by second party equipment, wherein the encryption characteristic is obtained by carrying out encryption processing on third characteristic data held by the second party equipment, and the second characteristic data and the third characteristic data are data of different characteristics of the same user;
the training module is used for training a preset first model based on the first characteristic data, the second characteristic data and the encryption characteristic to obtain a trained first model, wherein the training module is specifically used for respectively converting the first characteristic data and the second characteristic data by using a preset conversion model to obtain converted first characteristic data and converted second characteristic data; performing feature extraction processing on the converted first feature data by using a preset first model to obtain local features corresponding to the first feature data; determining a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data and the encrypted feature; processing the local feature and the first private feature by using a preset first loss function to determine a first loss value; the first loss value is reversely transmitted to the preset first model, so that parameters of the preset first model are adjusted, and a trained first model is obtained;
the first sending module is used for sending the trained first model to target equipment so that the target equipment can aggregate the trained first model to obtain a global model; the target equipment is server equipment or participant equipment in the federal learning system;
The receiving module is used for receiving the global model sent by the target equipment;
The second acquisition module is used for acquiring attribute information of the object to be recommended;
and the processing module is used for processing the attribute information by utilizing the global model to obtain a recommendation result.
10. An electronic device, comprising:
A memory for storing executable instructions;
A processor for implementing the data processing method of any one of claims 1 to 8 when executing executable instructions stored in said memory.
11. A computer readable storage medium storing executable instructions for implementing the data processing method of any one of claims 1 to 8 when executed by a processor.
12. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the data processing method of any of claims 1 to 8.
CN202210198357.XA 2022-03-02 2022-03-02 Data processing method, device, equipment and computer readable storage medium Active CN114547658B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210198357.XA CN114547658B (en) 2022-03-02 2022-03-02 Data processing method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210198357.XA CN114547658B (en) 2022-03-02 2022-03-02 Data processing method, device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN114547658A CN114547658A (en) 2022-05-27
CN114547658B true CN114547658B (en) 2024-06-04

Family

ID=81662210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210198357.XA Active CN114547658B (en) 2022-03-02 2022-03-02 Data processing method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114547658B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116863309B (en) * 2023-09-04 2024-01-09 中电科网络安全科技股份有限公司 Image recognition method, device, system, electronic equipment and storage medium
CN117853212B (en) * 2024-03-06 2024-06-18 之江实验室 Longitudinal federal financial wind control method based on knowledge migration and self-supervision learning


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021189906A1 (en) * 2020-10-20 2021-09-30 平安科技(深圳)有限公司 Target detection method and apparatus based on federated learning, and device and storage medium
CN113516255A (en) * 2021-07-28 2021-10-19 深圳前海微众银行股份有限公司 Federal learning modeling optimization method, apparatus, readable storage medium, and program product

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
An improved hybrid recommendation algorithm based on stacked denoising autoencoders; Yang Shuai; Wang Juan; Journal of Computer Applications; 2018-03-27 (Issue 07); full text *

Also Published As

Publication number Publication date
CN114547658A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
CN113159327B (en) Model training method and device based on federal learning system and electronic equipment
CN110189192B (en) Information recommendation model generation method and device
CN114547658B (en) Data processing method, device, equipment and computer readable storage medium
US20230078061A1 (en) Model training method and apparatus for federated learning, device, and storage medium
US10677607B2 (en) Blockchain-based crowdsourcing of map applications
CN112749749B (en) Classification decision tree model-based classification method and device and electronic equipment
CN113240524B (en) Account anomaly detection method and device in federal learning system and electronic equipment
CN111081337B (en) Collaborative task prediction method and computer readable storage medium
CN112529101B (en) Classification model training method and device, electronic equipment and storage medium
US20220391642A1 (en) Method and apparatus for evaluating joint training model
CN112199709A (en) Multi-party based privacy data joint training model method and device
CN113722753B (en) Private data processing method, device and system based on blockchain
CN113051239A (en) Data sharing method, use method of model applying data sharing method and related equipment
CN113409134A (en) Enterprise financing trust method and device based on federal learning
CN111897890A (en) Financial business processing method and device
CN116186769A (en) Vertical federal XGBoost feature derivation method based on privacy calculation and related equipment
CN112949866A (en) Poisson regression model training method and device, electronic equipment and storage medium
CN113807157A (en) Method, device and system for training neural network model based on federal learning
CN110837657B (en) Data processing method, client, server and storage medium
CN111368314A (en) Modeling and predicting method, device, equipment and storage medium based on cross features
CN116957112A (en) Training method, device, equipment and storage medium of joint model
CN113033209B (en) Text relation extraction method and device, storage medium and computer equipment
CN114723012A (en) Computing method and device based on distributed training system
US20230418794A1 (en) Data processing method, and non-transitory medium and electronic device
CN114996741A (en) Data interaction method, device, equipment and storage medium based on federal learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant