CN114861829A - Training method, device, equipment, medium and program product of feature extraction model - Google Patents


Info

Publication number
CN114861829A
CN114861829A
Authority
CN
China
Prior art keywords
feature
user
target
model
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210624491.1A
Other languages
Chinese (zh)
Inventor
康焱
何元钦
骆家焕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN202210624491.1A priority Critical patent/CN114861829A/en
Publication of CN114861829A publication Critical patent/CN114861829A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/22 - Matching criteria, e.g. proximity measures
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/24 - Classification techniques
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present application provides a training method and apparatus for a feature extraction model, an electronic device, a computer-readable storage medium, and a computer program product. The method includes: a first participant device obtains a local first user feature and receives a second user feature sent by a second participant device; trains a shared feature submodel in the feature extraction model based on the first user feature and the second user feature, to obtain a trained shared feature submodel; performs feature extraction on a local target user sample through a local feature submodel in the feature extraction model to obtain a first target feature, and performs feature extraction on the target user sample through the trained shared feature submodel to obtain a second target feature; and updates the model parameters of the feature extraction model by combining the first target feature and the second target feature. With the method and apparatus, the comprehensiveness of the extracted features can be guaranteed when feature extraction is performed with a feature extraction model under a federated architecture.

Description

Method, apparatus, device, medium, and program product for training feature extraction model
Technical Field
The present application relates to artificial intelligence technologies, and in particular, to a method and an apparatus for training a feature extraction model, an electronic device, a computer-readable storage medium, and a computer program product.
Background
In a traditional vertical federated scenario, participant devices can only perform supervised joint modeling based on labeled training samples aligned across the devices. In practice, however, label data is hard to obtain, and each participant device holds only a small number of labeled training samples, so few usable training samples are available. Moreover, because the participant devices serve different scenarios, joint training on only the aligned samples of the devices easily loses part of the feature information, which degrades the effect of joint modeling.
Disclosure of Invention
The embodiments of the present application provide a training method and apparatus for a feature extraction model, an electronic device, a computer-readable storage medium, and a computer program product, which can guarantee the comprehensiveness of the extracted features when feature extraction is performed with a feature extraction model under a federated architecture.
The technical solution of the embodiments of the present application is implemented as follows:
The embodiments of the present application provide a training method for a feature extraction model, based on a federated learning system that includes a first participant device and a second participant device, where the feature extraction model includes a local feature submodel and a shared feature submodel. The method includes the following steps:
the first participant device obtains a local first user feature and receives a second user feature sent by the second participant device;
where the first user feature and the second user feature correspond to the same user, and the first user feature and the second user feature are partially the same;
trains the shared feature submodel based on the first user feature and the second user feature, to obtain a trained shared feature submodel;
performs feature extraction on a local target user sample through the local feature submodel to obtain a first target feature, and performs feature extraction on the target user sample through the trained shared feature submodel to obtain a second target feature;
and updates the model parameters of the feature extraction model by combining the first target feature and the second target feature.
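The steps above can be sketched as a single toy training step. This is an illustrative assumption, not the patent's actual implementation: the "submodels" are trivial element-wise weightings, the loss is taken as one minus the cosine similarity of the two feature views, and the parameter update is a simple interpolation rather than gradient descent.

```python
def extract(weights, sample):
    # Toy "submodel": an element-wise weighting of the sample as a feature vector.
    return [w * x for w, x in zip(weights, sample)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def training_step(local_w, shared_w, target_sample, lr=0.1):
    # First target feature: extracted by the local feature submodel.
    first = extract(local_w, target_sample)
    # Second target feature: extracted by the trained shared feature submodel.
    second = extract(shared_w, target_sample)
    # Combine the two feature views into one loss (1 - cosine similarity).
    loss = 1.0 - cosine(first, second)
    # Toy parameter update: nudge the local weights toward the shared ones.
    new_local = [w + lr * (s - w) for w, s in zip(local_w, shared_w)]
    return new_local, loss
```

A usage sketch: `training_step([1.0, 0.0], [0.5, 0.5], [1.0, 1.0])` returns updated local weights and a loss strictly between 0 and 1, since the two feature views only partially agree.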
The embodiments of the present application provide a training apparatus for a feature extraction model, based on a federated learning system that includes a first participant device and a second participant device, where the feature extraction model includes a local feature submodel and a shared feature submodel. The apparatus includes:
an acquisition module, configured for the first participant device to obtain a local first user feature and receive a second user feature sent by the second participant device;
where the first user feature and the second user feature correspond to the same user, and the first user feature and the second user feature are partially the same;
a training module, configured to train the shared feature submodel based on the first user feature and the second user feature, to obtain a trained shared feature submodel;
a feature extraction module, configured to perform feature extraction on a local target user sample through the local feature submodel to obtain a first target feature, and to perform feature extraction on the target user sample through the trained shared feature submodel to obtain a second target feature;
and an updating module, configured to update the model parameters of the feature extraction model by combining the first target feature and the second target feature.
In the foregoing solution, the updating module is further configured to determine a similarity between the first target feature and the second target feature; determining a loss of the feature extraction model based on a similarity of the first target feature and the second target feature; updating model parameters of the feature extraction model based on the loss.
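A minimal sketch of the similarity-based loss just described: the loss shrinks as the first and second target features agree. The cosine form of the similarity and the `1 - similarity` loss are assumptions for illustration; the patent does not fix a particular formula here.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def feature_alignment_loss(first_target, second_target):
    # Higher similarity between the two feature views means lower loss.
    return 1.0 - cosine_similarity(first_target, second_target)
```

Identical feature vectors give a loss of zero; orthogonal ones give a loss of one.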
In the above scheme, the updating module is further configured to obtain a first loss function of the local feature sub-model and a second loss function of the shared feature sub-model, and construct a target loss function of the feature extraction model based on the first loss function and the second loss function; determining a first loss of the first loss function and a second loss of the second loss function based on a similarity of the first target feature and the second target feature; determining a loss of the feature extraction model based on the first loss, the second loss, and the target loss function.
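One way the target loss function could combine the first loss (local feature submodel) and the second loss (shared feature submodel) is a weighted sum; the weighting is an assumption, since the patent only states that the target loss is constructed from the two loss functions.

```python
def target_loss(first_loss, second_loss, alpha=0.5):
    # Combine the local-submodel loss and the shared-submodel loss into
    # one target loss for the whole feature extraction model.
    return alpha * first_loss + (1.0 - alpha) * second_loss
```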
In the above scheme, the target user sample is obtained by performing data enhancement on a training user sample, and the apparatus further includes a first enhancement module, where the first enhancement module is configured to perform data enhancement on the training user sample again to obtain an enhanced training user sample; copying the feature extraction model to obtain a reference feature extraction model; performing feature extraction on the enhanced training user sample through a local feature sub-model in the reference feature extraction model to obtain a third target feature, and performing feature extraction on the enhanced training user sample through a shared feature sub-model in the reference feature extraction model to obtain a fourth target feature; the updating module is further configured to update the model parameters of the feature extraction model in combination with the first target feature, the second target feature, the third target feature, and the fourth target feature.
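The "reference feature extraction model" step above copies the current model so the copy can extract the third and fourth target features from the re-enhanced sample. A hedged sketch, representing model parameters as a plain dictionary (an assumption for illustration):

```python
import copy

def make_reference_model(model_params):
    # Deep-copy the current model parameters so later updates to the live
    # feature extraction model do not alter the reference model.
    return copy.deepcopy(model_params)
```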
In the foregoing solution, the updating module is further configured to perform feature splicing on the first target feature and the second target feature to obtain a first spliced feature, and to perform feature splicing on the third target feature and the fourth target feature to obtain a second spliced feature; and to update the model parameters of the feature extraction model in a contrastive learning manner based on the first spliced feature and the second spliced feature.
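The splice-then-contrast step can be sketched as follows. Feature splicing is taken to mean simple concatenation, and the contrastive objective is an InfoNCE-style score; both the function names and the temperature value are illustrative assumptions, not the patent's specification.

```python
import math

def splice(local_feat, shared_feat):
    # Feature splicing: concatenate the local and shared feature vectors.
    return local_feat + shared_feat

def contrastive_score(anchor, positive, negatives, temperature=0.5):
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    pos = math.exp(cos(anchor, positive) / temperature)
    neg = sum(math.exp(cos(anchor, n) / temperature) for n in negatives)
    # InfoNCE-style loss: small when the anchor and the positive spliced
    # feature are close while the negatives are far.
    return -math.log(pos / (pos + neg))
```

Spliced views of the same sample form a positive pair and should score a lower loss than a mismatched pair.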
In the above solution, the apparatus further includes a second enhancement module configured to: obtain at least two training user samples; perform data enhancement on one of the at least two training user samples to obtain a first enhanced training user sample and a second enhanced training user sample, and use the first enhanced training user sample as the target user sample; perform feature extraction, through the local feature submodel, on the second enhanced training user sample and on the training user sample that has not undergone data enhancement, to obtain a first enhanced target feature of the second enhanced training user sample and a second enhanced target feature of the non-enhanced training user sample; and perform feature extraction, through the trained shared feature submodel, on the second enhanced training user sample and the non-enhanced training user sample, to obtain a third enhanced target feature of the second enhanced training user sample and a fourth enhanced target feature of the non-enhanced training user sample. The updating module is further configured to perform feature splicing on the first target feature and the second target feature to obtain a target spliced feature, to perform feature splicing on the first enhanced target feature and the third enhanced target feature to obtain a first enhanced feature, and to perform feature splicing on the second enhanced target feature and the fourth enhanced target feature to obtain a second enhanced feature; and to update the model parameters of the feature extraction model in a contrastive learning manner based on the target spliced feature, the first enhanced feature, and the second enhanced feature.
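The data-enhancement step above produces two augmented views of the same training user sample. For tabular user features, one plausible augmentation is randomly zeroing features; this masking scheme is purely an assumption, as the patent does not fix a particular enhancement method.

```python
import random

def augment(sample, drop_prob=0.3, seed=None):
    rng = random.Random(seed)
    # Randomly zero out features so each call yields a different "view"
    # of the same underlying training user sample.
    return [0.0 if rng.random() < drop_prob else x for x in sample]

def two_views(sample):
    # The first enhanced sample plays the role of the target user sample;
    # the second is the extra view used in the contrastive update.
    return augment(sample, seed=1), augment(sample, seed=2)
```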
In the foregoing solution, the updating module is further configured to determine a similarity between the target spliced feature and the first enhanced feature and a similarity between the target spliced feature and the second enhanced feature, and to update the model parameters of the feature extraction model based on these two similarities.
In the above solution, the apparatus further includes an application module configured to: obtain a classification model to be trained; perform feature extraction on a local first labeled user sample (a user sample carrying a label) through the feature extraction model with updated model parameters, to obtain a first labeled user feature corresponding to the local first labeled user sample; receive a second labeled user feature sent by the second participant device, where the second labeled user feature is feature data obtained by the second participant device performing feature extraction on a local second labeled user sample through its locally updated feature extraction model; and update the model parameters of the classification model based on the first labeled user feature and the second labeled user feature.
In the foregoing solution, the first labeled user feature and the second labeled user feature correspond to the same user sample, and the application module is further configured to: input the first labeled user feature and the second labeled user feature into the classification model respectively, to obtain a first classification result corresponding to the first labeled user sample and a second classification result corresponding to the second labeled user sample; determine a first label corresponding to the first classification result and a second label corresponding to the second classification result; obtain a first difference between the first classification result and the corresponding first label and a second difference between the second classification result and the corresponding second label; and update the model parameters of the classification model based on the first difference and the second difference.
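An illustrative sketch of this downstream classification update: each participant's classification result is compared against its label with cross-entropy, and the two differences are combined into one loss. The softmax/cross-entropy formulation and the simple averaging are assumptions; the patent only says the classification model is updated based on the two differences.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Difference between a classification result and its label.
    return -math.log(softmax(logits)[label])

def classification_loss(first_logits, first_label, second_logits, second_label):
    # Combine the first and second differences (here: a simple average).
    first_diff = cross_entropy(first_logits, first_label)
    second_diff = cross_entropy(second_logits, second_label)
    return 0.5 * (first_diff + second_diff)
```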
An embodiment of the present application provides an electronic device, including:
a memory for storing executable instructions;
and a processor, configured to implement the training method for the feature extraction model provided in the embodiments of the present application when executing the executable instructions stored in the memory.
The embodiment of the present application provides a computer-readable storage medium, which stores executable instructions for causing a processor to execute the method for training a feature extraction model provided in the embodiment of the present application.
The embodiment of the present application provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the training method of the feature extraction model provided in the embodiment of the present application.
The embodiment of the application has the following beneficial effects:
the first participant device extracts the characteristics of a local target user sample based on a local characteristic submodel to obtain a first target characteristic, extracts the characteristics of the target user sample based on a shared characteristic submodel obtained by training under a longitudinal federated learning architecture to obtain a second target characteristic, and updates the model parameters of the characteristic extraction model by combining the first target characteristic and the second target characteristic.
Drawings
Fig. 1 is a schematic view of an implementation scenario of a training method of a feature extraction model provided in an embodiment of the present application;
fig. 2 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of a training method of a feature extraction model provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a process for training a shared feature sub-model according to an embodiment of the present application;
fig. 5 is a schematic diagram of a process of feature extraction based on a shared feature sub-model and a local feature sub-model according to an embodiment of the present application;
FIG. 6 is a process diagram of a training method of a feature extraction model provided in an embodiment of the present application;
FIG. 7 is a flow chart illustrating a training process of a feature extraction model provided in an embodiment of the present application;
FIG. 8 is a schematic flow chart illustrating a downstream federated learning task performed by a trained feature extraction model according to an embodiment of the present application;
FIG. 9 is a process diagram of a downstream federated learning task performed by a trained feature extraction model provided in an embodiment of the present application;
FIG. 10 is a schematic flow chart diagram illustrating an alternative method for training a feature extraction model according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a training apparatus 254 for a feature extraction model according to an embodiment of the present application.
Detailed Description
In order to make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments", which describes a subset of all possible embodiments; "some embodiments" may be the same subset or different subsets of all possible embodiments, and the embodiments may be combined with one another where there is no conflict.
In the following description, the terms "first \ second \ third" are used only to distinguish similar objects and do not denote a particular order. It should be understood that "first \ second \ third" may be interchanged in a specific order or sequence where permitted, so that the embodiments of the application described herein can be practiced in an order other than that shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Federated Machine Learning (also known as federated learning, joint learning, or alliance learning) refers to machine learning carried out jointly by different participants (also called data owners or clients). In federated learning, a participant does not need to expose its own data to other participants or to the coordinator (also called the parameter server or aggregation server), so federated learning can protect user privacy and guarantee data security well.
Federated learning can be classified into three categories: Horizontal Federated Learning, Vertical Federated Learning, and Federated Transfer Learning.
Horizontal federated learning is also called Feature-Aligned Federated Learning: the data features of the participants are aligned. It suits the case where the participants' data features overlap heavily while their sample identifiers (IDs) overlap little; that is, when the participants share many sample features but few user samples, the portion of data whose features are the same but whose users are not identical is taken out for joint machine learning. For example, consider two banks in different regions: their user groups come from their respective regions and intersect very little, but their businesses are similar, so the recorded user data features are largely the same. Horizontal federated learning can be used to help the two banks build a joint model to predict their customers' behavior.
Vertical federated learning is also called Sample-Aligned Federated Learning: the training samples of the participants are aligned. It suits the case where the participants' training-sample IDs overlap heavily while their data features overlap little; that is, when the participants share many user samples but few sample features, the portion of data covering the same users but with different user data features is taken out for joint machine learning training. For example, consider two participants A and B in the same region, where participant A is a bank and participant B is an e-commerce platform. A and B share many users in that region, but their businesses differ, so the user data features they record differ and may even be complementary. In such a scenario, vertical federated learning can be used to help A and B build a joint machine learning prediction model and provide better service to their customers.
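To make sample alignment in the vertical setting concrete: participants first identify the users they share before joint training. In practice this is done privately (for example with private set intersection); the plain set intersection below is only a toy illustration, and the variable names are assumptions.

```python
def align_samples(bank_user_ids, ecommerce_user_ids):
    # Users present at both participants form the aligned training set
    # used for vertical federated learning.
    return sorted(set(bank_user_ids) & set(ecommerce_user_ids))
```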
2) Contrastive learning: a method of describing the task of similar and dissimilar things for a machine learning model; with it, a model can be trained to distinguish similar inputs from dissimilar ones. Contrastive learning focuses on learning the features common to similar examples while distinguishing them from dissimilar examples. Compared with generative learning, contrastive learning does not need to attend to complex per-example details; it only needs to learn to discriminate data in a feature space at the level of abstract semantics, which makes the model and its optimization simpler and its generalization ability stronger. The goal of contrastive learning is to learn an encoder that encodes data of the same type similarly while making the encodings of different types of data as different as possible.
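A toy illustration of the contrastive-learning goal stated above: an encoder should map same-type data to similar codes and different-type data to dissimilar codes. The "encoder" here is a trivial L2-normalizer, chosen purely for illustration, not a trained model.

```python
import math

def encode(x):
    # L2-normalize so only the direction of the feature vector matters.
    norm = math.sqrt(sum(v * v for v in x)) or 1.0
    return [v / norm for v in x]

def similarity(a, b):
    # Dot product of the two encodings (cosine similarity after normalization).
    return sum(x * y for x, y in zip(encode(a), encode(b)))
```

Scaled copies of the same vector (same "type") encode identically and score near 1; orthogonal vectors score near 0.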
In a traditional vertical federated scenario, participants can only perform supervised joint modeling based on labeled samples aligned across the participants' devices. In practice, however, label data is hard to obtain, so each participant holds only a small amount of labeled sample data. Meanwhile, because the participant devices serve different scenarios, the amount of aligned sample data may also be small, leaving little aligned labeled data available for joint modeling and degrading its effect. In addition, because joint training uses only the aligned samples of the participant devices, part of the feature information is easily lost during training, further degrading the effect of joint modeling.
Based on this, the embodiments of the present application provide a training method and apparatus for a feature extraction model, an electronic device, a computer-readable storage medium, and a computer program product, which make full use of each participant device's local unlabeled sample data as well as the unlabeled aligned sample data of all participants to train a feature extraction model with excellent performance, thereby obtaining highly discriminative features.
Based on the above explanation of the terms involved in the embodiments of the present application, an implementation scenario of the training method for the feature extraction model is described below. Referring to fig. 1, fig. 1 is a schematic diagram of an implementation scenario of the training method for the feature extraction model provided in the embodiments of the present application. To support an exemplary application, a first participant device 200-1 is connected to a second participant device 200-2 through a network 300. The first participant device 200-1 and the second participant device 200-2 may each belong to an entity holding training user samples, such as a hospital, a bank, a shopping mall, or a supermarket. The two devices assist each other in federated learning so that the first participant device 200-1 obtains the feature extraction model. The network 300 may be a wide area network, a local area network, or a combination of the two, using wireless or wired links.
Wherein the first participant device (including the first participant device 200-1) is configured to obtain a local first user characteristic;
a second participant device (comprising a second participant device 200-2) for sending a second user characteristic of a user corresponding to the same user as the first user characteristic to the first participant device;
a first participant device (comprising first participant device 200-1) further configured to receive a second user characteristic transmitted by a second participant device; training the shared feature submodel based on the first user feature and the second user feature to obtain a trained shared feature submodel; performing feature extraction on a local target user sample through a local feature submodel to obtain a first target feature, and performing feature extraction on the target user sample through a trained shared feature submodel to obtain a second target feature; and updating the model parameters of the feature extraction model by combining the first target feature and the second target feature.
In practical applications, the first participant device 200-1 and the second participant device 200-2 may be independent physical servers, a server cluster or distributed system formed by multiple physical servers, or cloud servers providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, Content Delivery Networks (CDN), and big data and artificial intelligence platforms. The first participant device 200-1 and the second participant device 200-2 may likewise be, but are not limited to, a smartphone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The two devices may be directly or indirectly connected through wired or wireless communication, which is not limited in this application.
The hardware structure of the electronic device implementing the training method for feature extraction models provided in the embodiments of the present application is described in detail below, where the electronic device includes, but is not limited to, a server or a terminal. Referring to fig. 2, fig. 2 is a schematic structural diagram of an electronic device provided in an embodiment of the present application, and the electronic device 200 shown in fig. 2 includes: at least one processor 210, memory 250, at least one network interface 220, and a user interface 230. The various components in electronic device 200 are coupled together by a bus system 240. It will be appreciated that the bus system 240 is used to enable communications among the components of the connection. The bus system 240 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 240 in fig. 2.
The processor 210 may be an integrated circuit chip having signal processing capabilities, such as a general-purpose processor, a Digital Signal Processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, where the general-purpose processor may be a microprocessor or any conventional processor.
The user interface 230 includes one or more output devices 231, including one or more speakers and/or one or more visual display screens, that enable the presentation of media content. The user interface 230 also includes one or more input devices 232, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 250 optionally includes one or more storage devices physically located remote from processor 210.
The memory 250 includes volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The memory 250 described in embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 250 is capable of storing data, examples of which include programs, modules, and data structures, or a subset or superset thereof, to support various operations, as exemplified below.
An operating system 251 including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
a network communication module 252 for communicating to other computing devices via one or more (wired or wireless) network interfaces 220, exemplary network interfaces 220 including: bluetooth, wireless compatibility authentication (WiFi), and Universal Serial Bus (USB), etc.;
an input processing module 253 for detecting one or more user inputs or interactions from one of the one or more input devices 232 and translating the detected inputs or interactions.
In some embodiments, the training device for the feature extraction model provided in the embodiments of the present application may be implemented in software, and fig. 2 shows the training device 254 for the feature extraction model stored in the memory 250, which may be software in the form of programs and plug-ins, and includes the following software modules: an acquisition module 2541, a training module 2542, a feature extraction module 2543 and an update module 2544, which are logical and thus can be arbitrarily combined or further split according to the implemented functions, which will be described below.
In other embodiments, the training Device of the feature extraction model provided in the embodiments of the present Application may be implemented by a combination of hardware and software, and as an example, the training Device of the feature extraction model provided in the embodiments of the present Application may be a processor in the form of a hardware decoding processor, which is programmed to execute the training method of the feature extraction model provided in the embodiments of the present Application, for example, the processor in the form of the hardware decoding processor may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
Based on the above description of the implementation scenario of the training method for the feature extraction model according to the embodiment of the present application and the electronic device, the following description describes the training method for the feature extraction model according to the embodiment of the present application. Referring to fig. 3, fig. 3 is a schematic flowchart of a training method of a feature extraction model provided in the embodiment of the present application, where the training method of the feature extraction model provided in the embodiment of the present application includes:
step 101, a first participant device obtains a local first user characteristic, and receives a second user characteristic sent by a second participant device.
The first user characteristic and the second user characteristic correspond to the same user, and the first user characteristic and the second user characteristic are partially the same.
It should be noted that the first user characteristics at least include a first service characteristic and a basic characteristic, and the second user characteristics at least include a second service characteristic and a basic characteristic, where the first service characteristic corresponds to a first service, the second service characteristic corresponds to a second service, the first service is a service corresponding to a first party device, the second service is a service corresponding to a second party device, and the first service and the second service are two different types of services.
It should be noted that the first participant device and the second participant device may each correspond to an institution in the same area, such as a bank, a supermarket, an internet company, a school, or a hospital. For example, when the first participant device corresponds to a bank and the second participant device corresponds to a supermarket, the first service may be a deposit service or a loan service the user has at the bank, and the second service may be a purchase service the user has at the supermarket. The first user characteristics include, but are not limited to, the user's name, gender, age, educational background, occupation, loan amount, loan time, repayment amount, repayment time, deposit amount, and overdue repayment data; among these, the first service characteristics include, but are not limited to, the loan amount, loan time, repayment amount, repayment time, deposit amount, and overdue repayment data, and the basic characteristics include, but are not limited to, the user's name, gender, age, educational background, and occupation. The second user characteristics include, but are not limited to, the user's name, gender, age, educational background, occupation, consumption amount, consumption time, consumption mode (e.g., credit card consumption or savings card consumption), and common payment mode (e.g., two-dimensional code payment or card-swiping payment); among these, the second service characteristics include, but are not limited to, the consumption amount, consumption time, consumption mode, and common payment mode. The basic characteristics in the first user characteristics are the same as the basic characteristics in the second user characteristics, including, but not limited to, the user's name, gender, age, educational background, and occupation.
In practical implementation, the first user characteristic is obtained by the first participant device performing feature extraction, through its local shared feature sub-model, on a training user sample local to the first participant device. This local training user sample corresponds to a user held locally by the first participant device and may be a user sample that carries a tag or one that does not; the first user characteristic may be the feature data of that user, and the tag may be the real feature data of the corresponding user.
In practical implementation, the second user characteristic is likewise obtained by the second participant device performing feature extraction, through its local shared feature sub-model, on a training user sample local to the second participant device. This local training user sample corresponds to a user held locally by the second participant device and may be a user sample that carries a tag or one that does not; the second user characteristic may be the feature data of that user, and the tag may be the real feature data of the corresponding user.
It should be noted that, because the first user feature corresponds to the same user as the second user feature, before the first participant device performs feature extraction on the local training user sample of the first participant device through the local shared feature sub-model and the second participant device performs feature extraction on the local training user sample of the second participant device through the local shared feature sub-model, user sample alignment needs to be performed on all user samples local to the first participant device and all user samples local to the second participant device, so as to obtain the training user sample local to the first participant device and the training user sample local to the second participant device for training.
Next, the process of performing user sample alignment on all user samples local to the first participant device and all user samples local to the second participant device will be described.
In actual implementation, a process of aligning all local user samples of the first participant device and all local user samples of the second participant device to obtain local training user samples of the first participant device and local training user samples of the second participant device for training specifically includes obtaining user identifications of all local user samples of the first participant device and user identifications of all local user samples of the second participant device, and matching the user identifications of all local user samples of the first participant device with the user identifications of all local user samples of the second participant device to obtain matching results; and when the matching result represents that the user identification of the corresponding user sample local to the first participant device is the same as the user identification of the corresponding user sample local to the second participant device, determining a training user sample local to the first participant device, and determining a training user sample local to the second participant device.
It should be noted that the user identifier is used to identify the identity of the user, such as an identification number or a mobile phone number of the user.
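To make the alignment step above concrete, the following is a minimal illustrative sketch of matching user identifiers and keeping only the overlapping samples. The function name and the dict-based sample store are assumptions for illustration, not the patent's implementation; in a real federated setting a privacy-preserving protocol such as private set intersection would typically be used so that non-overlapping identifiers are not revealed to the other party.

```python
def align_user_samples(samples_a, samples_b):
    """Keep only the user samples whose identifiers appear on both sides."""
    shared_ids = sorted(samples_a.keys() & samples_b.keys())
    return (
        {uid: samples_a[uid] for uid in shared_ids},
        {uid: samples_b[uid] for uid in shared_ids},
    )

# Toy data: the keys stand in for identification numbers / phone numbers.
bank_samples = {"u001": {"loan": 5000}, "u002": {"loan": 800}, "u003": {"loan": 0}}
market_samples = {"u002": {"spend": 120}, "u003": {"spend": 65}, "u004": {"spend": 30}}

aligned_bank, aligned_market = align_user_samples(bank_samples, market_samples)
print(sorted(aligned_bank))  # ['u002', 'u003']
```

Both parties end up with training user samples for the same set of users, which is the precondition for treating the two extracted feature views as corresponding to the same user.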
And 102, training the shared characteristic submodel based on the first user characteristic and the second user characteristic to obtain the trained shared characteristic submodel.
In practical implementation, after the first user characteristic and the second user characteristic are obtained, the shared characteristic submodel is trained based on the first user characteristic and the second user characteristic to obtain a trained shared characteristic submodel.
In some embodiments, the similarity between the first user characteristic and the second user characteristic is first determined, and then a preset contrastive learning loss function is obtained, i.e.

L = -log [ exp(sim(u, v+)/τ) / Σ_i exp(sim(u, v_i)/τ) ]    (1)

where u in formula (1) is the first user characteristic, v+ is the second user characteristic (the positive sample), sim(·,·) is the similarity, the sum in the denominator runs over the positive sample and any negative samples, and τ is a temperature hyperparameter.
The loss of the shared feature sub-model is then determined based on the similarity between the first user characteristic and the second user characteristic and the obtained contrastive learning loss function, and the model parameters of the shared feature sub-model are updated based on this loss to obtain the trained shared feature sub-model.
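A minimal numerical sketch of this training signal, assuming the standard InfoNCE form of the contrastive loss with cosine similarity (the function names and toy vectors are illustrative assumptions, not taken from the patent):

```python
import math

def cosine_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(u, v_pos, negatives=(), tau=0.1):
    """-log( exp(sim(u, v+)/tau) / sum over v+ and negatives of exp(sim/tau) )."""
    pos = math.exp(cosine_sim(u, v_pos) / tau)
    denom = pos + sum(math.exp(cosine_sim(u, v) / tau) for v in negatives)
    return -math.log(pos / denom)

f1 = [1.0, 0.0]  # first user characteristic u
aligned = contrastive_loss(f1, [1.0, 0.0], negatives=[[0.0, 1.0]])
misaligned = contrastive_loss(f1, [0.0, 1.0], negatives=[[1.0, 0.0]])
print(aligned < misaligned)  # the loss rewards agreement with the positive sample
```

Minimizing this loss pulls the two parties' shared feature views of the same user together while pushing views of different users apart, which is exactly the effect the shared feature sub-model training described above aims at.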
For example, referring to fig. 4, fig. 4 is a schematic diagram of a process of training a shared feature sub-model according to an embodiment of the present application. Based on fig. 4, the first participant device first inputs the local, user-sample-aligned training user sample x1 into the local shared feature sub-model B1 to obtain a first user characteristic f1; it then receives the second user characteristic f2, obtained by the second participant performing feature extraction, through its local shared feature sub-model B2, on its local, user-sample-aligned training user sample x2; it then determines the similarity between f1 and f2, obtains the preset contrastive learning loss function, determines the loss of the shared feature sub-model based on that similarity and the loss function, and updates the model parameters of the shared feature sub-model based on the loss to obtain the trained shared feature sub-model B1.
And 103, performing feature extraction on a local target user sample through the local feature submodel to obtain a first target feature, and performing feature extraction on the target user sample through the trained shared feature submodel to obtain a second target feature.
It should be noted that the feature extraction model includes a local feature sub-model and a shared feature sub-model, where the local feature sub-model may be used to extract features locally unique to the first participant device, and the shared feature sub-model is used to extract features common to the first participant device and the second participant device.
In practical implementation, after the trained shared feature sub-model is obtained, feature extraction is performed on the local target user sample through the local feature sub-model to obtain a first target feature, and feature extraction is performed on the target user sample through the trained shared feature sub-model to obtain a second target feature. Exemplarily, referring to fig. 5, fig. 5 is a schematic diagram of a process of feature extraction based on the shared feature sub-model and the local feature sub-model provided in an embodiment of the present application. Based on fig. 5, feature extraction is performed on the local target user sample x through the local feature sub-model A1 to obtain a first target feature g1, and feature extraction is performed on the target user sample x through the trained shared feature sub-model B1 to obtain a second target feature g2.
And step 104, updating the model parameters of the feature extraction model by combining the first target feature and the second target feature.
In actual implementation, after the first target feature and the second target feature are obtained, the similarity between the first target feature and the second target feature is determined; the loss of the feature extraction model is determined based on this similarity; and the model parameters of the feature extraction model are updated based on the loss. It should be noted that the method for determining the similarity may include, but is not limited to, vector space cosine similarity, Euclidean distance, and the like.
In practical implementation, the process of determining the loss of the feature extraction model based on the similarity between the first target feature and the second target feature specifically includes: obtaining a first loss function of the local feature sub-model and a second loss function of the shared feature sub-model, and constructing a target loss function of the feature extraction model based on the first loss function and the second loss function; determining a first loss of the first loss function and a second loss of the second loss function based on the similarity between the first target feature and the second target feature; and determining the loss of the feature extraction model based on the first loss, the second loss, and the target loss function. It should be noted that the target loss function of the feature extraction model may be a cosine-similarity-based function or the like, so that over multiple rounds of training the similarity between the features extracted by the local feature sub-model and the features extracted by the shared feature sub-model becomes smaller and smaller; in this way, the feature extraction model can extract highly distinguishable features, and the comprehensiveness of the extracted features is ensured.
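The paragraph above states that training drives down the similarity between the locally extracted and shared-extracted features so the two sub-models capture complementary information. One plausible form of such a term is sketched below, under the assumption that the squared cosine similarity is penalized; the patent only says the target loss may be a cosine-similarity-style function, so this exact form is an illustrative choice:

```python
import math

def cosine_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def distinctness_loss(g_local, g_shared):
    # Penalize alignment in either direction: the minimum (zero) is reached
    # when the local and shared feature vectors are orthogonal.
    return cosine_sim(g_local, g_shared) ** 2

print(distinctness_loss([1.0, 0.0], [0.0, 1.0]))  # 0.0 (fully distinct features)
print(distinctness_loss([1.0, 0.0], [1.0, 0.0]))  # 1.0 (fully redundant features)
```

Squaring the cosine means anti-parallel features are penalized as strongly as parallel ones, so the optimizer is steered toward orthogonal, i.e. non-redundant, feature views.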
In some embodiments, the target user sample is obtained by performing data enhancement on the training user sample, and after the trained shared feature sub-model is obtained, the training user sample may be further subjected to data enhancement again to obtain an enhanced training user sample; copying the feature extraction model to obtain a reference feature extraction model; performing feature extraction on the enhanced training user sample through a local feature submodel in the reference feature extraction model to obtain a third target feature, and performing feature extraction on the enhanced training user sample through a shared feature submodel in the reference feature extraction model to obtain a fourth target feature; and updating the model parameters of the feature extraction model by combining the first target feature, the second target feature, the third target feature and the fourth target feature. It should be noted that, by fixing the model parameters of the reference feature extraction model, it is ensured that the reference feature extraction model does not update the model parameters; in addition, in some embodiments, the model parameters of the reference feature extraction model may not be fixed, so that the model parameters of the reference feature extraction model are also updated, which is not limited in this embodiment.
In actual implementation, the process of updating the model parameters of the feature extraction model by combining the first target feature, the second target feature, the third target feature and the fourth target feature specifically includes: performing feature splicing on the first target feature and the second target feature to obtain a first splicing feature, and performing feature splicing on the third target feature and the fourth target feature to obtain a second splicing feature; and updating the model parameters of the feature extraction model in a contrastive learning manner based on the first splicing feature and the second splicing feature. The feature splicing method may be, for example, adding the features, and the embodiments of the present application are not limited thereto.
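Feature splicing itself can be sketched as follows. The text above mentions element-wise addition as one option; concatenation is the other common reading of "splicing", and both are shown here as illustrative assumptions:

```python
def splice_concat(g1, g2):
    """Concatenate two feature vectors end to end."""
    return list(g1) + list(g2)

def splice_add(g1, g2):
    """Element-wise addition; requires equal dimensionality."""
    assert len(g1) == len(g2)
    return [a + b for a, b in zip(g1, g2)]

g1, g2 = [0.25, 0.75], [0.5, 0.125]
print(splice_concat(g1, g2))  # [0.25, 0.75, 0.5, 0.125]
print(splice_add(g1, g2))     # [0.75, 0.875]
```

Concatenation preserves both feature views in full (at twice the dimensionality), while addition keeps the dimensionality fixed but mixes the two views; either produces a single vector that the contrastive loss can operate on.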
Exemplarily, referring to fig. 6, fig. 6 is a schematic diagram of a training process of the feature extraction model according to an embodiment of the present application. Based on fig. 6, data enhancement is first performed on the training user sample x to obtain a target user sample x1 and an enhanced training user sample x2, and the feature extraction model is copied to obtain a reference feature extraction model.
Feature extraction is performed on the target user sample x1 through the local feature sub-model A1 in the feature extraction model to obtain a first target feature g1, and through the shared feature sub-model B1 in the feature extraction model to obtain a second target feature g2; feature extraction is performed on the enhanced training user sample x2 through the local feature sub-model A2 in the reference feature extraction model to obtain a third target feature g3, and through the shared feature sub-model B2 in the reference feature extraction model to obtain a fourth target feature g4. The first target feature g1 and the second target feature g2 are then spliced to obtain a first splicing feature G1, and the third target feature g3 and the fourth target feature g4 are spliced to obtain a second splicing feature G2; based on G1 and G2, the model parameters of the feature extraction model are updated in a contrastive learning manner.
In practical implementation, the process of updating the model parameters of the feature extraction model in a contrastive learning manner based on the first splicing feature and the second splicing feature specifically includes determining the similarity between the first splicing feature and the second splicing feature, and updating the model parameters of the feature extraction model based on that similarity. Here, the method for determining the similarity may include, but is not limited to, vector space cosine similarity, Euclidean distance, and the like, and the process of updating the model parameters based on the similarity between the first splicing feature and the second splicing feature may be to obtain a preset contrastive learning loss function, i.e.

L = -log [ exp(sim(u, v+)/τ) / Σ_i exp(sim(u, v_i)/τ) ]    (2)

where u in formula (2) is the first splicing feature, v+ is the second splicing feature (the positive sample), the sum in the denominator runs over the positive sample and any negative samples, and τ is a temperature hyperparameter.
The loss of the feature extraction model is then determined based on the similarity between the first splicing feature and the second splicing feature and the obtained contrastive learning loss function, and the model parameters of the feature extraction model are updated based on this loss.
In some embodiments, after obtaining the trained shared feature sub-model, at least two training user samples may be further obtained, and then the feature extraction model is trained based on the at least two training user samples, see fig. 7, where fig. 7 is a flowchart of a training process of the feature extraction model provided in this embodiment of the present application, and based on fig. 7, after step 102, the following steps may be further performed:
step 201, a first participant device obtains at least two training user samples, performs data enhancement on one of the at least two training user samples to obtain a first enhanced training user sample and a second enhanced training user sample, and uses the first enhanced training user sample as a target user sample.
Step 202, respectively performing feature extraction on the second enhanced training user sample and the training user sample which is not subjected to data enhancement through the local feature sub-model to obtain a first enhanced target feature of the second enhanced training user sample and a second enhanced target feature of the training user sample which is not subjected to data enhancement.
In actual implementation, firstly copying the local characteristic submodel to obtain a first reference local characteristic submodel and a second reference local characteristic submodel; and then, carrying out feature extraction on the second enhanced training user sample through the first reference local feature sub-model to obtain a first enhanced feature corresponding to the second enhanced training user sample, and carrying out feature extraction on the training user sample which is not subjected to data enhancement through the second reference local feature sub-model to obtain a second enhanced feature corresponding to the training user sample which is not subjected to data enhancement.
And 203, respectively performing feature extraction on the second enhanced training user sample and the training user sample which is not subjected to data enhancement through the trained shared feature sub-model to obtain a third enhanced target feature of the second enhanced training user sample and a fourth enhanced target feature of the training user sample which is not subjected to data enhancement.
In actual implementation, firstly copying the trained shared feature submodel to obtain a first reference shared feature submodel and a second reference shared feature submodel; and performing feature extraction on the second enhanced training user sample through the first reference shared feature sub-model to obtain a third enhanced feature corresponding to the second enhanced training user sample, and performing feature extraction on the training user sample which is not subjected to data enhancement through the second reference shared feature sub-model to obtain a fourth enhanced feature corresponding to the training user sample which is not subjected to data enhancement.
It should be noted that, by fixing the first reference local feature sub-model, the second reference local feature sub-model, the first reference shared feature sub-model and the second reference shared feature sub-model, it is ensured that the first reference local feature sub-model, the second reference local feature sub-model, the first reference shared feature sub-model and the second reference shared feature sub-model do not update the model parameters; in addition, in some embodiments, model parameters of the first reference local feature sub-model, the second reference local feature sub-model, the first reference shared feature sub-model, and the second reference shared feature sub-model may not be fixed, so that the first reference local feature sub-model, the second reference local feature sub-model, the first reference shared feature sub-model, and the second reference shared feature sub-model also update the model parameters, which is not limited in this embodiment of the present application.
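The "fixed reference model" idea described above amounts to working on a deep copy whose parameters no later training step can touch. A sketch with a plain parameter dict standing in for a sub-model (the data structures are illustrative, not the patent's):

```python
import copy

live_model = {"W": [0.0, 0.0, 0.0], "b": 0.0}

# Copy the sub-model to obtain a reference sub-model with fixed parameters.
reference_model = copy.deepcopy(live_model)

# A training step updates only the live model ...
live_model["W"] = [w + 0.1 for w in live_model["W"]]
live_model["b"] += 0.05

# ... while the reference copy keeps the parameters it was created with.
print(reference_model["W"], reference_model["b"])  # [0.0, 0.0, 0.0] 0.0
```

A deep copy is essential here: a shallow copy would share the inner parameter list, so updates to the live model would leak into the reference and the "fixed" guarantee would be lost.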
And 204, performing feature splicing on the first target feature and the second target feature to obtain a target splicing feature, performing feature splicing on the first enhanced target feature and the third enhanced target feature to obtain a first enhanced feature, and performing feature splicing on the second enhanced target feature and the fourth enhanced target feature to obtain a second enhanced feature.
The feature splicing method may be, for example, adding the features, and the embodiments of the present application are not limited thereto.
And step 205, updating the model parameters of the feature extraction model in a contrastive learning manner based on the target splicing feature, the first enhancement feature and the second enhancement feature.
In actual implementation, updating the model parameters of the feature extraction model in a contrastive learning manner based on the target splicing feature, the first enhancement feature and the second enhancement feature specifically includes: determining the similarity between the target splicing feature and the first enhancement feature and the similarity between the target splicing feature and the second enhancement feature; and updating the model parameters of the feature extraction model based on these two similarities. It should be noted that the methods for determining the similarity between the target splicing feature and the first enhancement feature and between the target splicing feature and the second enhancement feature may include, but are not limited to, vector space cosine similarity, Euclidean distance, and the like.
In practical implementation, the process of updating the model parameters of the feature extraction model based on the similarity between the target splicing feature and the first enhancement feature and the similarity between the target splicing feature and the second enhancement feature specifically includes obtaining a preset contrastive learning loss function, i.e.

L = -log [ exp(sim(u, v+)/τ) / ( exp(sim(u, v+)/τ) + exp(sim(u, v-)/τ) ) ]    (3)

where u in formula (3) is the target splicing feature, v+ is the first enhancement feature (the positive sample), v- is the second enhancement feature (the negative sample), and τ is a temperature hyperparameter.
The loss of the feature extraction model is then determined based on the similarity between the target splicing feature and the first enhancement feature, the similarity between the target splicing feature and the second enhancement feature, and the preset contrastive learning loss function, and the model parameters of the feature extraction model are updated based on this loss.
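A numerical sketch of this positive/negative contrast, assuming the single-negative InfoNCE form suggested by the variables u, v+, v- and τ (function names and toy vectors are illustrative assumptions):

```python
import math

def cosine_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def contrastive_loss_pos_neg(u, v_pos, v_neg, tau=0.1):
    """-log( exp(sim(u,v+)/tau) / (exp(sim(u,v+)/tau) + exp(sim(u,v-)/tau)) )."""
    pos = math.exp(cosine_sim(u, v_pos) / tau)
    neg = math.exp(cosine_sim(u, v_neg) / tau)
    return -math.log(pos / (pos + neg))

target = [1.0, 0.2]  # target splicing feature u
good = contrastive_loss_pos_neg(target, v_pos=[1.0, 0.2], v_neg=[-0.2, 1.0])
bad = contrastive_loss_pos_neg(target, v_pos=[-0.2, 1.0], v_neg=[1.0, 0.2])
print(good < bad)  # pulling v+ close while pushing v- away lowers the loss
```

The temperature τ controls how sharply the loss concentrates on hard cases: a small τ amplifies small similarity differences, a large τ softens them.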
In some embodiments, after the model parameters of the feature extraction model are updated, a downstream local machine learning task or a downstream federated learning task may further be performed through the trained feature extraction model. Referring to fig. 8, fig. 8 is a schematic flowchart of performing a downstream federated learning task through the trained feature extraction model according to an embodiment of the present application; based on fig. 8, after step 104, the following steps may further be performed:
step 301, a first participant device obtains a classification model to be trained.
In practical implementation, the classification model may be a binary classification model or a multi-class classification model. For example, when the classification model is a binary classification model, it may be a binary classification model applied to risk-control management or risk-control prediction; the form of the classification model is not limited in this application.
Step 302, performing feature extraction on the first tag user sample locally carrying the tag through the feature extraction model after the model parameter is updated, so as to obtain the first tag user feature corresponding to the local first tag user sample.
In actual implementation, performing feature extraction on a first tag user sample carrying a tag locally through a local feature sub-model in a feature extraction model after model parameters are updated to obtain a first local feature; performing feature extraction on a first label user sample carrying a label locally through a shared feature sub-model in a feature extraction model after model parameters are updated to obtain a first shared feature; and splicing the first local feature and the first shared feature to obtain a first label user feature corresponding to the local first label user sample.
By way of example, referring to fig. 9, fig. 9 is a schematic diagram of a process of performing a federated learning task through the trained feature extraction model according to an embodiment of the present application. Based on fig. 9, feature extraction is performed on the local first tag user sample P1 through the local feature sub-model A1 in the feature extraction model after the model parameters are updated, to obtain a first local feature h1; feature extraction is performed on the local first tag user sample P1 through the shared feature sub-model B1 in the feature extraction model after the model parameters are updated, to obtain a first shared feature h2; and the first local feature h1 is spliced with the first shared feature h2 to obtain a first tag user feature H1 corresponding to the local first tag user sample P1.
It should be noted that the tag here indicates the classification result corresponding to a training user sample. When the classification model is a binary classification model, the tag is a binary classification tag indicating the binary classification result corresponding to the training user sample; this binary classification tag may be a risk-control tag, for example, a tag identifying whether the user is trusted or untrusted, whether the user is a high-quality customer or not, or whether the user's loyalty is high or low.
Step 303, receiving a second tag user characteristic sent by the second participant device.
The second label user characteristics are characteristic data obtained by the second participant device performing characteristic extraction on a local second label user sample through a locally updated characteristic extraction model.
In actual implementation, the second participant equipment performs feature extraction on a local second tag user sample through a local feature sub-model in the feature extraction model after local model parameters are updated to obtain a second local feature; performing feature extraction on a local second label user sample through a shared feature sub-model in the feature extraction model after the model parameters are updated to obtain a second shared feature; and splicing the second local characteristic and the second sharing characteristic to obtain a second label user characteristic corresponding to the local second label user sample.
Continuing with the above example and referring to fig. 9, the second participant device first performs feature extraction on the local second tag user sample P2 through the local feature sub-model A2 in the feature extraction model after the local model parameters are updated, to obtain a second local feature h3; performs feature extraction on the local second tag user sample P2 through the shared feature sub-model B2 in the feature extraction model after the model parameters are updated, to obtain a second shared feature h4; and splices the second local feature h3 with the second shared feature h4 to obtain a second tag user feature H2 corresponding to the local second tag user sample P2.
It should be noted that the first tag user feature and the second tag user feature correspond to the same user sample; however, only the first participant device holds the tag of that user sample, and the second participant device does not.
Step 304, updating model parameters of the classification model based on the first label user characteristics and the second label user characteristics.
In actual implementation, updating the model parameters of the classification model based on the first label user characteristics and the second label user characteristics, specifically including inputting the first label user characteristics and the second label user characteristics into the classification model respectively to obtain a first classification result corresponding to the first label user sample and a second classification result corresponding to the second label user sample; determining a first label corresponding to the first classification result and a second label corresponding to the second classification result; acquiring a first difference between the first classification result and the corresponding first label and a second difference between the second classification result and the corresponding second label; based on the first difference and the second difference, model parameters of the classification model are updated.
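The update based on the two differences can be illustrated with a binary cross-entropy sketch; the loss form and the numeric classification results are assumptions for illustration only, not values prescribed by the application.

```python
import numpy as np

def binary_cross_entropy(p, y):
    """Difference between a classification result p (a probability) and label y."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# First and second classification results with their labels (assumed values)
first_result, first_label = 0.9, 1.0
second_result, second_label = 0.2, 0.0

first_diff = binary_cross_entropy(first_result, first_label)
second_diff = binary_cross_entropy(second_result, second_label)

# The classification model's parameters are updated on both differences,
# here combined by simple summation (an assumed combination rule).
total_loss = first_diff + second_diff
print(round(total_loss, 4))
```

In practice the gradient of this loss with respect to the classification model's parameters drives the update.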
In some embodiments, when the first participant device is a bank, the second participant device is a supermarket, and the training user samples are local residents, the training method of the feature extraction model herein specifically includes: first, the first participant device and the second participant device perform sample alignment to obtain aligned user samples, such as users who have both handled business at the bank and consumed at the supermarket; then the first participant device obtains a first user feature (such as a feature of the user stored by the bank), where the feature includes but is not limited to the user's name, gender, age, education background, occupation, loan amount, loan time, repayment amount, repayment time, overdue repayment data, and the like; and receives a second user feature sent by the second participant device, where the second user feature is a feature of the user stored by the supermarket and acquired by the second participant device, including but not limited to the user's name, gender, age, education background, occupation, consumption amount, consumption time, consumption mode (e.g., credit card or debit card consumption), common payment mode (e.g., QR code payment or card swiping payment), and the like; trains the shared feature sub-model in the feature extraction model based on the first user feature and the second user feature to obtain a trained shared feature sub-model; performs feature extraction on a local target user sample (such as all users handling business at the bank) through the local feature sub-model in the feature extraction model to obtain first target features, namely the features specific to the user at the bank, such as the user's loan amount, loan time, repayment amount, repayment time, overdue repayment data, and the like; performs feature extraction on the target user sample through the trained shared feature sub-model to obtain second target features, namely the features common to the user at the bank and the supermarket, such as the user's name, gender, age, education background, occupation, and the like; and updates the model parameters of the feature extraction model by combining the first target feature and the second target feature.
Then, a binary classification model to be trained is obtained, where the binary classification model is applied to risk control management; feature extraction is then performed, through the feature extraction model after the model parameters are updated, on a local first label user sample carrying a label, namely an aligned user sample, to obtain a first label user feature corresponding to the local first label user sample; meanwhile, a second label user feature is received from the second participant device, obtained by the second participant device performing feature extraction on its local second label user sample, namely the aligned user sample; here, the label is a binary risk-control label used for indicating the binary classification result corresponding to the training user sample, such as a binary label identifying whether the user is trustworthy or untrustworthy. Then, the first label user feature and the second label user feature are respectively input into the classification model to obtain a first classification result, such as trustworthy or untrustworthy, corresponding to the first label user sample, and a second classification result, such as trustworthy or untrustworthy, corresponding to the second label user sample; a first label corresponding to the first classification result, such as the user's actual credit standing, and a second label corresponding to the second classification result, such as the user's actual credit standing, are determined; a first difference between the first classification result and the corresponding first label and a second difference between the second classification result and the corresponding second label are acquired; and based on the first difference and the second difference, the model parameters of the classification model are updated.
By applying the embodiment of the application, the first participant device performs feature extraction on a local target user sample based on the local feature submodel to obtain a first target feature, performs feature extraction on the target user sample based on the shared feature submodel trained under the vertical federated learning architecture to obtain a second target feature, and updates the model parameters of the feature extraction model by combining the first target feature and the second target feature. In this way, both the features shared across participants and the features unique to the local participant are captured, ensuring the comprehensiveness of the extracted features and improving the accuracy of the feature extraction model.
Next, taking a specific application scenario as an example, the training method for a feature extraction model provided in the embodiment of the present application is introduced. Referring to fig. 10, fig. 10 is an optional flowchart of the training method for a feature extraction model provided in the embodiment of the present application; based on fig. 10, the method is cooperatively implemented by a first participant device and a second participant device, each of which may be a server or a terminal. Referring to fig. 10, the training method for the feature extraction model provided in the embodiment of the present application further includes:
step 401, the first party device initializes a local model to obtain a shared feature sub-model.
In practical implementation, each participant device initializes its local model at the beginning of model training, and since the model training procedure of each participant is the same, the training process is described only by taking the first participant device as an example, and it should be noted that all participants have a small number of labels.
Step 402, performing feature extraction on a local training user sample through a local shared feature sub-model to obtain a first user feature.
Step 403, the second participant device performs feature extraction on the local training user sample through the local shared feature sub-model to obtain a second user feature.
It should be noted that, here, the training user sample local to the second participant device and the training user sample local to the first participant device are training user samples after user sample alignment, and therefore, the first user feature and the second user feature here correspond to the same user.
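Sample alignment in vertical federated learning is normally carried out with a privacy-preserving protocol such as private set intersection (PSI); the plain-ID intersection below only sketches the outcome of alignment, and the user IDs are hypothetical.

```python
# Non-private sketch of user sample alignment: in practice a PSI
# (private set intersection) protocol is used so that neither party
# reveals its full user list. The IDs below are illustrative.
bank_users = {"u01", "u02", "u03", "u05"}         # first participant's user IDs
supermarket_users = {"u02", "u03", "u04", "u06"}  # second participant's user IDs

aligned = sorted(bank_users & supermarket_users)  # users present at both parties
print(aligned)
```

Only these aligned users supply the first and second user features used to train the shared feature sub-model.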
Step 404, sending the second user characteristic to the first participant device.
Step 405, the first participant device inputs the first user feature and the second user feature into a contrastive learning loss function to obtain a gradient of the shared feature sub-model, and updates the model parameters of the shared feature sub-model through the gradient to obtain the trained shared feature sub-model.
It should be noted that the contrastive learning loss function here may be the loss function in formula (1).
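Formula (1) itself is not reproduced here; an InfoNCE-style contrastive loss is a common instance of such a loss, sketched below under the assumption that aligned rows of the two feature matrices belong to the same user (the temperature tau and all values are illustrative).

```python
import numpy as np

def info_nce(first_feats, second_feats, tau=0.5):
    """InfoNCE-style contrastive loss: aligned rows (same user) are
    treated as positive pairs, all other rows as negatives."""
    a = first_feats / np.linalg.norm(first_feats, axis=1, keepdims=True)
    b = second_feats / np.linalg.norm(second_feats, axis=1, keepdims=True)
    sim = a @ b.T / tau                          # pairwise cosine similarities
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # pull the positive pairs together

rng = np.random.default_rng(1)
u = rng.normal(size=(4, 6))                      # first user features, 4 aligned users
loss_aligned = info_nce(u, u)                    # identical views: low loss
loss_random = info_nce(u, rng.normal(size=(4, 6)))  # unrelated views: higher loss
print(loss_aligned < loss_random)
```

The gradient of this loss with respect to the shared sub-model parameters yields the update in step 405.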
Step 406, the first participant device copies the feature extraction model to obtain a reference feature extraction model, and performs data enhancement on the local training user sample to obtain a target user sample and an enhanced training user sample.
It should be noted that after the copied reference feature extraction model is obtained, the model parameters of the reference feature extraction model need to be fixed, so as to ensure that the reference feature extraction model's parameters are not updated in subsequent steps.
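The copy-and-freeze step can be sketched as follows; the FeatureExtractionModel class and its freeze flag are assumed stand-ins, not an API from the application.

```python
import copy

class FeatureExtractionModel:
    """Minimal stand-in: holds parameters and a trainable flag (assumed API)."""
    def __init__(self, params):
        self.params = dict(params)
        self.trainable = True

    def freeze(self):
        self.trainable = False  # the reference model must not be updated later

model = FeatureExtractionModel({"w": 0.5})
reference = copy.deepcopy(model)  # step 406: copy the feature extraction model
reference.freeze()                # fix the reference model's parameters

model.params["w"] = 0.9           # later updates touch only the live model
print(reference.params["w"], reference.trainable)
```

A deep copy is essential here: a shallow copy would let later updates to the live model leak into the reference model.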
Step 407, performing feature extraction on the target user sample through a local feature submodel in the feature extraction model to obtain a first target feature, and performing feature extraction on the target user sample through a shared feature submodel in the feature extraction model to obtain a second target feature.
It should be noted that after the first target feature and the second target feature are determined, the first target feature and the second target feature need to be input into a similarity loss function, such as a cosine similarity function, so as to calculate a gradient of the local feature sub-model and a gradient of the shared feature sub-model, and the corresponding models are updated based on these gradients. Here, the subsequent steps may be continued after performing multiple rounds of updating on the local feature sub-model and the shared feature sub-model in the feature extraction model, or may be performed directly after a single round of updating, which is not limited in the embodiment of the present application.
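A cosine-similarity loss of the kind mentioned here can be sketched as follows; the feature values are illustrative, and the (1 - similarity) loss form is one common choice rather than the application's prescribed formula.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# First target feature (local sub-model) and second target feature (shared
# sub-model) for the same target user sample — values are illustrative.
first_target = np.array([1.0, 2.0, 0.0])
second_target = np.array([2.0, 4.0, 0.0])

# A similarity loss pulls the two views of the same sample together:
# maximizing cosine similarity is minimizing (1 - similarity).
loss = 1.0 - cosine_similarity(first_target, second_target)
print(round(abs(loss), 6))
```

Because the two example vectors are parallel, the loss is (numerically) zero; for dissimilar features it grows toward 2.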
Step 408, performing feature extraction on the enhanced training user sample through the local feature sub-model in the reference feature extraction model to obtain a third target feature, and performing feature extraction on the enhanced training user sample through the shared feature sub-model in the reference feature extraction model to obtain a fourth target feature.
Step 409, performing feature splicing on the first target feature and the second target feature to obtain a first splicing feature, and performing feature splicing on the third target feature and the fourth target feature to obtain a second splicing feature.
Step 410, inputting the first splicing feature and the second splicing feature into a contrastive learning loss function to obtain a gradient of the feature extraction model, and updating the model parameters of the feature extraction model through the gradient.
It should be noted that the contrastive learning loss function here may be the loss function in formula (2).
In actual implementation, after the training of the feature extraction model is completed, the trained feature extraction model is obtained, and a downstream federated learning task, such as steps 411 to 418, can then be performed based on the trained feature extraction model.
Step 411, obtaining a classification model to be trained.
Step 412, performing feature extraction on the local first tag user sample carrying the tag through the feature extraction model after the model parameter is updated, so as to obtain a first tag user feature corresponding to the local first tag user sample.
In actual implementation, performing feature extraction on a first tag user sample carrying a tag locally through a local feature sub-model in a feature extraction model after model parameters are updated to obtain a first local feature; performing feature extraction on a first label user sample carrying a label locally through a shared feature sub-model in a feature extraction model after model parameters are updated to obtain a first shared feature; and splicing the first local feature and the first shared feature to obtain a first label user feature corresponding to the local first label user sample.
It should be noted that, when the federated learning task performed with the trained feature extraction model is a binary classification task, the real label is a binary classification label and the classification model to be trained is a binary classification model, where the binary classification label is used to indicate the binary classification result corresponding to the training user sample. For example, when the classification model corresponding to the binary classification task is applied to risk control management or risk control prediction, the binary classification tag may be a risk-control tag, for example, a binary tag identifying whether a user is trustworthy or untrustworthy, a binary tag identifying whether the user has high or low loyalty, or a binary tag used to evaluate whether the user is a good customer.
Step 413, the second participant device performs feature extraction on the local second tag user sample through the locally updated feature extraction model to obtain a second tag user feature.
In actual implementation, the second participant device performs feature extraction on the local second tag user sample through the local feature sub-model in the feature extraction model after the local model parameters are updated, to obtain a second local feature; performs feature extraction on the local second tag user sample through the shared feature sub-model in the feature extraction model after the model parameters are updated, to obtain a second shared feature; and splices the second local feature and the second shared feature to obtain the second tag user feature corresponding to the local second tag user sample.
It should be noted that the first tagged user characteristic corresponds to the same user sample as the second tagged user characteristic, that is, the first tagged user sample is the same as the second tagged user sample, but here, only the first participant device holds the tag of the user sample, and the second participant device does not have the tag of the user sample.
Step 414, send the second tag user characteristic to the first participant device.
Step 415, the first participant device inputs the first label user characteristic and the second label user characteristic into the classification model respectively to obtain a first classification result corresponding to the first label user sample and a second classification result corresponding to the second label user sample.
At step 416, a first label corresponding to the first classification result and a second label corresponding to the second classification result are determined.
Step 417, a first difference between the first classification result and the corresponding first label and a second difference between the second classification result and the corresponding second label are obtained.
Step 418, updating the model parameters of the classification model based on the first difference and the second difference.
In this way, the local unlabeled training user samples of each participant device and the unlabeled aligned user samples across the participants are fully utilized, and a feature extraction model with excellent performance is obtained through pre-training, so that highly discriminative features are extracted; based on such feature extraction models, the problem of poor federated modeling performance under supervised learning in scenarios with little labeled user sample data is solved.
By applying the embodiment of the application, the first participant device performs feature extraction on a local target user sample based on the local feature submodel to obtain a first target feature, performs feature extraction on the target user sample based on the shared feature submodel trained under the vertical federated learning architecture to obtain a second target feature, and updates the model parameters of the feature extraction model by combining the first target feature and the second target feature. In this way, both the features shared across participants and the features unique to the local participant are captured, ensuring the comprehensiveness of the extracted features and improving the accuracy of the feature extraction model.
Continuing with the description of the training device 254 for feature extraction model provided in the embodiment of the present application, referring to fig. 11, fig. 11 is a schematic structural diagram of the training device 254 for feature extraction model provided in the embodiment of the present application, and the training device 254 for feature extraction model provided in the embodiment of the present application includes:
an obtaining module 2541, configured to, by a first participant device, obtain a local first user feature, and receive a second user feature sent by a second participant device; the first user characteristic and the second user characteristic correspond to the same user, and the first user characteristic and the second user characteristic are partially the same;
a training module 2542, configured to train the shared feature submodel based on the first user feature and the second user feature, to obtain a trained shared feature submodel;
a feature extraction module 2543, configured to perform feature extraction on a local target user sample through the local feature sub-model to obtain a first target feature, and perform feature extraction on the target user sample through the trained shared feature sub-model to obtain a second target feature;
an updating module 2544, configured to update the model parameters of the feature extraction model in combination with the first target feature and the second target feature.
In some embodiments, the update module 2544 is further configured to determine a similarity between the first target feature and the second target feature; determining a loss of the feature extraction model based on a similarity of the first target feature and the second target feature; updating model parameters of the feature extraction model based on the loss.
In some embodiments, the updating module 2544 is further configured to obtain a first loss function of the local feature sub-model and a second loss function of the shared feature sub-model, and construct a target loss function of the feature extraction model based on the first loss function and the second loss function; determining a first loss of the first loss function and a second loss of the second loss function based on a similarity of the first target feature and the second target feature; determining a loss of the feature extraction model based on the first loss, the second loss, and the target loss function.
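One simple way to construct a target loss function from a first loss function and a second loss function is a weighted sum; the weighted-sum form and the weight alpha below are assumptions for illustration, not the construction claimed by the application.

```python
def target_loss(first_loss, second_loss, alpha=0.5):
    """Combine the local sub-model loss and shared sub-model loss.
    The weighted-sum form and alpha=0.5 are illustrative assumptions."""
    return alpha * first_loss + (1 - alpha) * second_loss

# e.g. first loss from the local feature sub-model, second from the shared one
loss = target_loss(0.8, 0.4)
print(round(loss, 2))
```

Both component losses are themselves determined from the similarity of the first and second target features, as described above.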
In some embodiments, the target user sample is obtained by performing data enhancement on a training user sample, and the apparatus further includes a first enhancement module, configured to perform data enhancement again on the training user sample to obtain an enhanced training user sample; copying the feature extraction model to obtain a reference feature extraction model; performing feature extraction on the enhanced training user sample through a local feature sub-model in the reference feature extraction model to obtain a third target feature, and performing feature extraction on the enhanced training user sample through a shared feature sub-model in the reference feature extraction model to obtain a fourth target feature; the updating module 2544 is further configured to update the model parameters of the feature extraction model in combination with the first target feature, the second target feature, the third target feature and the fourth target feature.
In some embodiments, the update module 2544 is further configured to perform feature splicing on the first target feature and the second target feature to obtain a first spliced feature, and perform feature splicing on the third target feature and the fourth target feature to obtain a second spliced feature; and update the model parameters of the feature extraction model in a contrastive learning manner based on the first spliced feature and the second spliced feature.
In some embodiments, the apparatus further includes a second enhancement module, where the second enhancement module is configured to obtain at least two training user samples, perform data enhancement on one of the at least two training user samples to obtain a first enhanced training user sample and a second enhanced training user sample, and use the first enhanced training user sample as the target user sample; respectively perform feature extraction on the second enhanced training user sample and the training user sample that is not subjected to data enhancement through the local feature sub-model to obtain a first enhanced target feature of the second enhanced training user sample and a second enhanced target feature of the training user sample that is not subjected to data enhancement; and respectively perform feature extraction on the second enhanced training user sample and the training user sample that is not subjected to data enhancement through the trained shared feature sub-model to obtain a third enhanced target feature of the second enhanced training user sample and a fourth enhanced target feature of the training user sample that is not subjected to data enhancement; the update module 2544 is further configured to perform feature splicing on the first target feature and the second target feature to obtain a target splicing feature, perform feature splicing on the first enhanced target feature and the third enhanced target feature to obtain a first enhanced feature, and perform feature splicing on the second enhanced target feature and the fourth enhanced target feature to obtain a second enhanced feature; and update the model parameters of the feature extraction model in a contrastive learning manner based on the target splicing feature, the first enhanced feature and the second enhanced feature.
In some embodiments, the update module 2544 is further configured to determine a similarity between the target stitching feature and the first enhancement feature and a similarity between the target stitching feature and the second enhancement feature; updating model parameters of the feature extraction model based on the similarity between the target stitching feature and the first enhanced feature and the similarity between the target stitching feature and the second enhanced feature.
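The comparison of the target splicing feature against the first and second enhanced features can be sketched with cosine similarities and a two-way softmax; the perturbation scale, temperature, and all values below are illustrative assumptions.

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)
target_stitch = rng.normal(size=8)                         # target splicing feature
first_enhanced = target_stitch + 0.1 * rng.normal(size=8)  # same-sample view (positive)
second_enhanced = rng.normal(size=8)                       # other view (negative)

s_pos = cos(target_stitch, first_enhanced)
s_neg = cos(target_stitch, second_enhanced)

# A contrastive update increases s_pos relative to s_neg, e.g. via a
# softmax over the two similarities (temperature tau is assumed).
tau = 0.5
loss = -np.log(np.exp(s_pos / tau) / (np.exp(s_pos / tau) + np.exp(s_neg / tau)))
print(s_pos > s_neg)
```

Minimizing this loss drives the target splicing feature toward its own enhanced view and away from unrelated features.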
In some embodiments, the apparatus further comprises an application module, configured to obtain a classification model to be trained; performing feature extraction on a first tag user sample carrying a tag locally through a feature extraction model after model parameters are updated to obtain a first tag user feature corresponding to the local first tag user sample; receiving a second tag user characteristic sent by the second participant device, wherein the second tag user characteristic is characteristic data obtained by the second participant device performing characteristic extraction on a local second tag user sample through a locally updated characteristic extraction model; updating model parameters of the classification model based on the first label user characteristic and the second label user characteristic.
In some embodiments, the first tagged user feature and the second tagged user feature correspond to the same user sample, and the application module is further configured to input the first tagged user feature and the second tagged user feature into the classification model respectively, so as to obtain a first classification result corresponding to the first tagged user sample and a second classification result corresponding to the second tagged user sample; determining a first label corresponding to the first classification result and a second label corresponding to the second classification result; acquiring a first difference between the first classification result and the corresponding first label and a second difference between the second classification result and the corresponding second label; updating model parameters of the classification model based on the first difference and the second difference.
By applying the embodiment of the application, the first participant device performs feature extraction on a local target user sample based on the local feature submodel to obtain a first target feature, performs feature extraction on the target user sample based on the shared feature submodel trained under the vertical federated learning architecture to obtain a second target feature, and updates the model parameters of the feature extraction model by combining the first target feature and the second target feature. In this way, both the features shared across participants and the features unique to the local participant are captured, ensuring the comprehensiveness of the extracted features and improving the accuracy of the feature extraction model.
An embodiment of the present application further provides an electronic device, where the electronic device includes:
a memory for storing executable instructions;
and the processor is used for realizing the training method of the feature extraction model provided by the embodiment of the application when the executable instructions stored in the memory are executed.
The embodiment of the present application provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the training method of the feature extraction model provided in the embodiment of the present application.
Embodiments of the present application provide a computer-readable storage medium storing executable instructions, which when executed by a processor, cause the processor to execute the training method for feature extraction model provided in embodiments of the present application.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
In summary, according to the embodiment of the present application, the common features of the participant devices are extracted based on the shared feature submodel in the feature extraction model, and the local unique features of the participant devices are extracted based on the local feature submodel in the feature extraction model, so that the comprehensiveness of the features extracted by the feature extraction model during feature extraction is ensured, and the accuracy of the feature extraction model is improved.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (13)

1. A training method of a feature extraction model is characterized in that the method is based on a federal learning system, the federal learning system comprises a first participant device and a second participant device, the feature extraction model comprises a local feature submodel and a shared feature submodel, and the method comprises the following steps:
the method comprises the steps that first participant equipment obtains local first user characteristics and receives second user characteristics sent by second participant equipment;
the first user characteristic and the second user characteristic correspond to the same user, and the first user characteristic and the second user characteristic are partially the same;
training the shared characteristic submodel based on the first user characteristic and the second user characteristic to obtain a trained shared characteristic submodel;
performing feature extraction on a local target user sample through the local feature submodel to obtain a first target feature, and performing feature extraction on the target user sample through the trained shared feature submodel to obtain a second target feature;
and updating the model parameters of the feature extraction model by combining the first target feature and the second target feature.
2. The method of claim 1, wherein said updating model parameters of the feature extraction model in combination with the first target feature and the second target feature comprises:
determining a similarity of the first target feature and the second target feature;
determining a loss of the feature extraction model based on a similarity of the first target feature and the second target feature;
updating model parameters of the feature extraction model based on the loss.
3. The method of claim 2, wherein determining the loss of the feature extraction model based on the similarity of the first target feature to the second target feature comprises:
acquiring a first loss function of the local feature submodel and a second loss function of the shared feature submodel, and constructing a target loss function of the feature extraction model based on the first loss function and the second loss function;
determining a first loss of the first loss function and a second loss of the second loss function based on a similarity of the first target feature and the second target feature;
determining a loss of the feature extraction model based on the first loss, the second loss, and the target loss function.
4. The method of claim 1, wherein the target user sample is obtained by data enhancement of a training user sample, the method further comprising:
performing data enhancement on the training user sample again to obtain an enhanced training user sample;
copying the feature extraction model to obtain a reference feature extraction model;
performing feature extraction on the enhanced training user sample through a local feature sub-model in the reference feature extraction model to obtain a third target feature, and performing feature extraction on the enhanced training user sample through a shared feature sub-model in the reference feature extraction model to obtain a fourth target feature;
the updating the model parameters of the feature extraction model by combining the first target feature and the second target feature comprises:
updating model parameters of the feature extraction model in combination with the first target feature, the second target feature, the third target feature and the fourth target feature.
5. The method of claim 4, wherein said updating model parameters of the feature extraction model in combination with the first target feature, the second target feature, the third target feature, and the fourth target feature comprises:
performing feature splicing on the first target feature and the second target feature to obtain a first spliced feature, and performing feature splicing on the third target feature and the fourth target feature to obtain a second spliced feature;
and updating the model parameters of the feature extraction model in a contrastive learning manner based on the first splicing feature and the second splicing feature.
6. The method of claim 1, wherein the method further comprises:
acquiring at least two training user samples, performing data enhancement on one of the at least two training user samples to obtain a first enhanced training user sample and a second enhanced training user sample, and taking the first enhanced training user sample as the target user sample;
respectively extracting features of the second enhanced training user sample and the training user sample which is not subjected to data enhancement through the local feature sub-model to obtain a first enhanced target feature of the second enhanced training user sample and a second enhanced target feature of the training user sample which is not subjected to data enhancement;
respectively extracting the features of the second enhanced training user sample and the training user sample which is not subjected to data enhancement through the trained shared feature sub-model to obtain a third enhanced target feature of the second enhanced training user sample and a fourth enhanced target feature of the training user sample which is not subjected to data enhancement;
the updating the model parameters of the feature extraction model by combining the first target feature and the second target feature comprises:
performing feature splicing on the first target feature and the second target feature to obtain a target splicing feature, performing feature splicing on the first enhanced target feature and the third enhanced target feature to obtain a first enhanced feature, and performing feature splicing on the second enhanced target feature and the fourth enhanced target feature to obtain a second enhanced feature;
and updating the model parameters of the feature extraction model by contrastive learning based on the target splicing feature, the first enhanced feature and the second enhanced feature.
7. The method of claim 6, wherein the updating the model parameters of the feature extraction model by contrastive learning based on the target splicing feature, the first enhanced feature and the second enhanced feature comprises:
determining a similarity between the target splicing feature and the first enhanced feature and a similarity between the target splicing feature and the second enhanced feature;
updating the model parameters of the feature extraction model based on the similarity between the target splicing feature and the first enhanced feature and the similarity between the target splicing feature and the second enhanced feature.
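Claims 6 and 7 can be read as an InfoNCE-style objective with one positive (the first enhanced feature, derived from the same underlying sample as the target splicing feature) and one negative (the second enhanced feature, from a different, non-enhanced sample). A hypothetical sketch of that reading (the function name and the choice of cosine similarity are assumptions, not taken from the patent):

```python
import numpy as np

def pairwise_update_signal(target, first_enhanced, second_enhanced,
                           temperature=0.5):
    # Score the target splicing feature against a positive (same sample)
    # and a negative (different sample), InfoNCE-style with one negative.
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    pos = np.exp(cos(target, first_enhanced) / temperature)
    neg = np.exp(cos(target, second_enhanced) / temperature)
    return float(-np.log(pos / (pos + neg)))
```

The loss is small when the target splicing feature is similar to the positive and dissimilar from the negative, and large in the opposite case.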
8. The method of claim 1, wherein after updating the model parameters of the feature extraction model, the method further comprises:
obtaining a classification model to be trained;
performing feature extraction on a local first labeled user sample carrying a label through the feature extraction model with the updated model parameters, to obtain a first labeled user feature corresponding to the local first labeled user sample;
receiving a second labeled user feature sent by the second participant device, wherein the second labeled user feature is feature data obtained by the second participant device by performing feature extraction on a local second labeled user sample through its locally updated feature extraction model;
updating model parameters of the classification model based on the first labeled user feature and the second labeled user feature.
9. The method of claim 8, wherein the first labeled user feature and the second labeled user feature correspond to the same sample user, and wherein the updating the model parameters of the classification model based on the first labeled user feature and the second labeled user feature comprises:
inputting the first labeled user feature and the second labeled user feature into the classification model respectively, to obtain a first classification result corresponding to the first labeled user sample and a second classification result corresponding to the second labeled user sample;
determining a first label corresponding to the first classification result and a second label corresponding to the second classification result;
acquiring a first difference between the first classification result and the corresponding first label and a second difference between the second classification result and the corresponding second label;
updating model parameters of the classification model based on the first difference and the second difference.
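The two differences in claim 9 are most naturally read as per-party classification losses that are summed into one update signal. A sketch assuming the differences are cross-entropy terms (an assumption; the claim only says "difference", and all names below are hypothetical):

```python
import numpy as np

def classification_loss(first_logits, first_label,
                        second_logits, second_label):
    # Claim 9 sketch: the first difference compares the first party's
    # classification result with its label, the second difference does the
    # same for the second party; the classifier update uses their sum.
    def cross_entropy(logits, label):
        shifted = logits - logits.max()          # numerically stable softmax
        log_probs = shifted - np.log(np.exp(shifted).sum())
        return -float(log_probs[label])
    first_difference = cross_entropy(first_logits, first_label)
    second_difference = cross_entropy(second_logits, second_label)
    return first_difference + second_difference
```

Confident, correct classification results on both parties yield a near-zero combined loss; mislabeled results yield a large one.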
10. A training apparatus for a feature extraction model, applied to a federated learning system, wherein the federated learning system comprises a first participant device and a second participant device, the feature extraction model comprises a local feature sub-model and a shared feature sub-model, and the apparatus comprises:
an acquisition module, configured to acquire, at the first participant device, a local first user feature and receive a second user feature sent by the second participant device,
wherein the first user feature and the second user feature correspond to the same user and are partially the same;
a training module, configured to train the shared feature sub-model based on the first user feature and the second user feature to obtain a trained shared feature sub-model;
a feature extraction module, configured to perform feature extraction on a local target user sample through the local feature sub-model to obtain a first target feature, and perform feature extraction on the target user sample through the trained shared feature sub-model to obtain a second target feature;
and an updating module, configured to update the model parameters of the feature extraction model in combination with the first target feature and the second target feature.
11. An electronic device, characterized in that the electronic device comprises:
a memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 9 when executing executable instructions stored in the memory.
12. A computer-readable storage medium having stored thereon executable instructions for, when executed by a processor, implementing the method of any one of claims 1 to 9.
13. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 9.
CN202210624491.1A 2022-06-02 2022-06-02 Training method, device, equipment, medium and program product of feature extraction model Pending CN114861829A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210624491.1A CN114861829A (en) 2022-06-02 2022-06-02 Training method, device, equipment, medium and program product of feature extraction model

Publications (1)

Publication Number Publication Date
CN114861829A true CN114861829A (en) 2022-08-05

Family

ID=82623914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210624491.1A Pending CN114861829A (en) 2022-06-02 2022-06-02 Training method, device, equipment, medium and program product of feature extraction model

Country Status (1)

Country Link
CN (1) CN114861829A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116028820A (en) * 2023-03-20 2023-04-28 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment
CN116028820B (en) * 2023-03-20 2023-07-04 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication