CN111832591A - Machine learning model training method and device - Google Patents

Machine learning model training method and device

Info

Publication number
CN111832591A
CN111832591A (application CN201910327485.8A)
Authority
CN
China
Prior art keywords
sample data
global
model
value
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910327485.8A
Other languages
Chinese (zh)
Other versions
CN111832591B (en)
Inventor
周俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201910327485.8A priority Critical patent/CN111832591B/en
Publication of CN111832591A publication Critical patent/CN111832591A/en
Application granted granted Critical
Publication of CN111832591B publication Critical patent/CN111832591B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides a method and apparatus for model training. On the global model side, a training sample data set is divided into a plurality of independent training sample data subsets, a global sub-model is trained independently on each subset, and the global sub-models are fused to obtain a global model. For local model training, the local model training device sends unlabeled sample data to the global model side and obtains label values for that sample data from the global model; the user's local model is then trained locally using the sample data and the corresponding label values, and the trained local model is deployed locally at the user for model prediction services. With this model training method and apparatus, leakage of the training data can be prevented.

Description

Machine learning model training method and device
Technical Field
The present disclosure relates generally to the field of computer technology, and more particularly, to a method and apparatus for training a machine learning model.
Background
In some machine learning applications, training a machine learning model involves sensitive data. For example, training a model that detects whether a picture contains a face requires a large amount of face data, and training a model for medical diagnosis requires a large amount of personal privacy data.
Research has shown that the training data used in model training can be reconstructed from a machine learning model's prediction results using reverse engineering techniques. Conventional model training methods therefore carry a high risk of leaking personal privacy data: for example, an attacker can collect a large number of model prediction results through repeated queries and then reconstruct the training data from those results, thereby obtaining the personal privacy data used as model training data.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a model training method and apparatus. With this model training method and apparatus, leakage of the training data can be prevented.
According to an aspect of the present disclosure, there is provided a method for model training, comprising: sending at least one first sample data to a global model device to obtain a label value of the at least one first sample data based on a global model at the global model device, the first sample data being unlabeled sample data; and training a local model locally at the user using the at least one first sample data and the corresponding label values, wherein the global model comprises at least one global sub-model, each global sub-model being trained using an independent second sample data set.
Optionally, in an example of the above aspect, the second sample data set is obtained by dividing the sample data set or is acquired by a different data acquisition device.
Optionally, in an example of the above aspect, the method may further include: the at least one first sample data is collected locally at the user.
Optionally, in one example of the above aspect, the at least one first sample data is public sample data.
Optionally, in an example of the above aspect, the label value of each of the at least one first sample data is obtained by inputting the first sample data into each of the at least one global sub-model for prediction and fusing the obtained prediction values of the global sub-models.
Optionally, in an example of the above aspect, the predicted value of each global sub-model is a predicted value after noise addition processing.
According to another aspect of the present disclosure, there is provided a method for model training, comprising: receiving at least one first sample data from the user's local side, the first sample data being unlabeled sample data; providing the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and sending the obtained label value of the at least one first sample data to the user's local side to train a local model locally at the user using the at least one first sample data and the corresponding label value, wherein the global model comprises at least one global sub-model, each global sub-model being trained using an independent second sample data set.
Optionally, in an example of the above aspect, the second sample data set is obtained by dividing the sample data set or is acquired by a different data acquisition device.
Optionally, in an example of the above aspect, providing the at least one first sample data to a global model to obtain the label value of the at least one first sample data includes: inputting each of the at least one first sample data into each of the at least one global sub-model for prediction; and fusing the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data.
Optionally, in an example of the above aspect, providing the at least one first sample data to a global model to obtain the label value of the at least one first sample data further includes: performing noise addition processing on the obtained predicted values of the global sub-models, wherein fusing the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data includes: fusing the noise-added predicted values for each first sample data to obtain the label value of that sample data.
According to another aspect of the present disclosure, there is provided an apparatus for model training, comprising: a sample data sending unit configured to send at least one first sample data to a global model device to obtain a label value of the at least one first sample data based on a global model at the global model device, the first sample data being unlabeled sample data; a label value receiving unit configured to receive the label value of the at least one first sample data; and a local model training unit configured to train a local model locally at the user using the at least one first sample data and the corresponding label values, wherein the global model comprises at least one global sub-model, each global sub-model being trained using an independent second sample data set.
Optionally, in an example of the above aspect, the apparatus may further include: a sample data acquisition unit configured to acquire the at least one first sample data locally at a user.
Optionally, in one example of the above aspect, the at least one first sample data is public sample data.
According to another aspect of the present disclosure, there is provided an apparatus for model training, comprising: a sample data receiving unit configured to receive at least one first sample data from the user's local side, the first sample data being unlabeled sample data; a label value obtaining unit configured to provide the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and a label value sending unit configured to send the obtained label value of the at least one first sample data to the user's local side to train a local model locally at the user using the at least one first sample data and the corresponding label value, wherein the global model comprises at least one global sub-model, each global sub-model being trained using an independent second sample data set.
Optionally, in an example of the above aspect, the apparatus may further include: at least one global sub-model training unit configured to train each of the at least one global sub-model using an independent second sample data set.
Optionally, in an example of the above aspect, the label value obtaining unit includes: a prediction module configured to input each of the at least one first sample data into each of the at least one global sub-model for prediction; and a data fusion module configured to fuse the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data.
Optionally, in an example of the above aspect, the apparatus may further include: a noise adding module configured to perform noise addition processing on the obtained predicted values of the global sub-models, wherein the data fusion module is configured to fuse the noise-added predicted values for each first sample data to obtain the label value of that sample data.
According to another aspect of the present disclosure, there is provided a system for model training, comprising: the apparatus for model training on the user's local side as described above; and the apparatus for model training on the remote side as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a method for user local model training as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium having stored thereon executable instructions that, when executed, cause the machine to perform the method for user local model training as described above.
According to another aspect of the present disclosure, there is provided a computing device comprising: at least one processor, and a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform a method for global model training as described above.
According to another aspect of the present disclosure, there is provided a machine-readable storage medium having stored thereon executable instructions that, when executed, cause the machine to perform the method for global model training as described above.
Drawings
A further understanding of the nature and advantages of the present disclosure may be realized by reference to the following drawings. In the drawings, similar components or features may have the same reference numerals.
FIG. 1 shows a block diagram of a system for model training in accordance with an embodiment of the present disclosure;
FIG. 2 shows a flow diagram of a method for model training in accordance with an embodiment of the present disclosure;
FIG. 3 shows a flowchart of a process for obtaining a label value of first sample data, according to an embodiment of the present disclosure;
FIG. 4 illustrates a block diagram of one example of a local model training apparatus in accordance with an embodiment of the present disclosure;
FIG. 5 shows a block diagram of one example of a global model device in accordance with an embodiment of the present disclosure;
FIG. 6 shows a block diagram of one implementation example of a label value obtaining unit according to an embodiment of the present disclosure;
FIG. 7 illustrates a block diagram of a computing device for local model training in accordance with an embodiment of the present disclosure;
FIG. 8 illustrates a block diagram of a computing device for model training on the remote side, in accordance with an embodiment of the present disclosure.
Detailed Description
The subject matter described herein will now be discussed with reference to example embodiments. It should be understood that these embodiments are discussed only to enable those skilled in the art to better understand and thereby implement the subject matter described herein, and are not intended to limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements discussed without departing from the scope of the disclosure. Various examples may omit, substitute, or add various procedures or components as needed. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with respect to some examples may also be combined in other examples.
As used herein, the term "include" and its variants mean open-ended terms in the sense of "including, but not limited to". The term "based on" means "based at least in part on". The terms "one embodiment" and "an embodiment" mean "at least one embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first," "second," and the like may refer to different or the same object. Other definitions, whether explicit or implicit, may be included below. The definition of a term is consistent throughout the specification unless the context clearly dictates otherwise.
Embodiments of the present disclosure propose a new model training scheme. In this scheme, the sample data set used for model training is divided into a plurality of independent sample data subsets, and a plurality of global sub-models are trained independently, one on each subset (alternatively, each global sub-model in the global model is trained by a training party that holds its own sample data subset). The resulting global sub-models are then fused to obtain the global model. Here, "the global model is obtained by fusing the global sub-models" means that the global model consists of the plurality of global sub-models, and that when the global model is used for prediction, its prediction result is obtained by fusing the prediction results of the individual global sub-models through some mechanism. Further, public sample data, which is unlabeled sample data, is collected at the user's local side. The collected public sample data is sent to the global model side, where the global model is used to obtain a label value for the public sample data (that is, the predicted value of the global model). The public sample data and the corresponding label values are then used to train the user's local model locally, and the trained local model is deployed on the user's local device, such as a mobile phone, to provide model prediction services. By combining the global model with the local model in this way, the training data is distributed across multiple parties during global model training, which avoids the possibility of all the data leaking at once; training a local model speeds up model prediction; and because the local model uses only public data, leakage of private data is avoided.
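To make the scheme concrete, the following sketch illustrates the global-model side: a training sample data set is divided into independent subsets, one global sub-model is trained on each subset, and prediction fuses the sub-model outputs. This is a minimal sketch under assumptions the disclosure does not fix: the sub-models are logistic regression classifiers (scikit-learn), the fusion mechanism is probability averaging, and names such as train_global_sub_models are illustrative only.

import numpy as np
from sklearn.linear_model import LogisticRegression

def train_global_sub_models(X, y, num_partitions=3, seed=0):
    """Divide (X, y) into disjoint subsets and train one global sub-model per subset."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    sub_models = []
    for part in np.array_split(indices, num_partitions):
        model = LogisticRegression(max_iter=1000)
        model.fit(X[part], y[part])  # each sub-model sees only its own independent subset
        sub_models.append(model)
    return sub_models

def global_model_predict(sub_models, X):
    """Fuse the sub-model predictions (here: average of class probabilities)."""
    probs = np.mean([m.predict_proba(X) for m in sub_models], axis=0)
    return probs.argmax(axis=1)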
A method and apparatus for model training according to an embodiment of the present disclosure will be described in detail below with reference to the accompanying drawings.
FIG. 1 shows a block diagram of a system for model training (hereinafter referred to as model training system 10) according to an embodiment of the present disclosure. As shown in FIG. 1, model training system 10 includes a local model training device 110 and a global model device 120.
In performing model training, the local model training device 110 sends at least one first sample data to the global model device 120, the at least one first sample data being unlabeled sample data, i.e., having no label value. In one example of the present disclosure, the at least one first sample data may be public data collected at the user's local side; that is, the first sample data used for local model training is not the user's own private local sample data.
The global model device 120 has a global model. The global model comprises at least one global sub-model, each global sub-model being trained using an independent second sample data set. In this disclosure, a second sample data set may be a sample data subset obtained by dividing a sample data set used for global model training, or may be acquired by a different data acquisition device. Each second sample data set is an independent sample data set, and the corresponding global sub-model is trained in an independent training environment, for example on an independent training device. When the global model is used for prediction, the prediction result of the global model is obtained by fusing the prediction results of all the global sub-models through a certain mechanism.
After the global model device 120 receives the at least one first sample data sent by the local model training device 110, it obtains a corresponding label value for each first sample data based on the global model. How the label value is obtained based on the global model will be described in detail later with reference to the drawings.
Then, the global model device 120 sends the obtained label value of each first sample data to the local model training device 110. The local model training device 110 uses the at least one first sample data and the corresponding label values to train the local model locally at the user. The trained local model is then deployed locally for subsequent model prediction.
FIG. 2 shows a flow diagram of a method for model training in accordance with an embodiment of the present disclosure.
As shown in FIG. 2, at block 210, the local model training device 110 collects at least one first sample data. The collected at least one first sample data is then sent to the global model device 120 at block 220.
At block 230, the global model device 120 predicts the label value of each first sample data based on its global model.
Fig. 3 shows a flowchart of a process for obtaining a label value of first sample data according to an embodiment of the present disclosure.
As shown in fig. 3, at block 310, each first sample data of the at least one first sample data is respectively input to each global sub-model in the global model for prediction, so as to obtain a corresponding predicted value.
Next, at block 320, noise addition processing is performed on the obtained predicted values of the global sub-models for each first sample data. Here, the noise may be, for example, Gaussian noise or Laplacian noise. For example, noise matched to the data distribution of the at least one first sample data may be generated using the sample mean or variance.
Then, at block 330, the predicted values of the global sub-models for each first sample data are fused to obtain the label value of that sample data.
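As an illustration of blocks 310 to 330, the hypothetical function below scores a batch of unlabeled first sample data with every global sub-model, adds Laplace noise to the predicted values, and fuses the noisy predictions into label values. The noise scale (a fraction of the predictions' standard deviation) is an assumption made for illustration; the disclosure only states that Gaussian or Laplacian noise may be generated using, for example, the sample mean or variance.

import numpy as np

def label_values_for_samples(sub_models, X_public, rng=None):
    rng = rng or np.random.default_rng()
    # Block 310: per-sub-model predicted class probabilities, shape (k, n, c).
    preds = np.stack([m.predict_proba(X_public) for m in sub_models])
    # Block 320: add Laplace noise; the scale used here is an illustrative assumption.
    scale = max(float(preds.std()), 1e-3)
    noisy = preds + rng.laplace(loc=0.0, scale=0.1 * scale, size=preds.shape)
    # Block 330: fuse the noisy predictions by averaging and take the most likely class.
    fused = noisy.mean(axis=0)
    return fused.argmax(axis=1)  # one label value per first sample data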
Here, it is to be noted that, in other examples of the present disclosure, the process of obtaining the label value of the first sample data shown in fig. 3 may not include the operation of block 320.
After the label values of the respective first sample data are obtained as above, the global model device 120 sends the obtained label values to the local model training device 110 at block 240.
After receiving the label values of the respective first sample data, the local model training device 110 trains a local model locally at the user using each first sample data and the corresponding label value, at block 250.
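A minimal sketch of block 250 is given below: the local model is trained purely from the public first sample data and the label values returned by the global model device, so no private local data is involved. Using a small decision tree as the local model is an assumption made for illustration; the disclosure does not prescribe the local model type.

from sklearn.tree import DecisionTreeClassifier

def train_local_model(X_public, label_values):
    """Block 250: train the user's local model on public samples and their label values."""
    local_model = DecisionTreeClassifier(max_depth=5)
    local_model.fit(X_public, label_values)  # only public data and global-model labels are used
    return local_model  # deployed locally at the user, e.g. on a mobile phone, for prediction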
With the method and apparatus for model training according to the embodiments of the present disclosure, model training is performed by combining a global model with a local model. On the one hand, when the global model is trained, the training data is divided into a plurality of parts, each kept in an independent environment (for example, with an independent training party), which avoids the possibility of all the training data leaking. On the other hand, training a local model for prediction at the user's local side, using unlabeled sample data and the corresponding label values obtained based on the global model, speeds up model prediction.
With the method and apparatus for model training according to the disclosed embodiments, the global sub-models can be protected from reverse engineering by adding noise to the predicted values of the respective global sub-models. For example, a malicious user could run model prediction on a large number of input samples to collect a large set of prediction results; with a large enough input sample size, the malicious user could use those prediction results to recover the global sub-models and the training sample data they used, leaking both. After noise is added, a malicious user cannot recover the global sub-models, which ensures the security of the global sub-models and their training sample data.
Furthermore, with the method and apparatus for model training according to embodiments of the present disclosure, since the first sample data used in local model training is public data collected at the user's local side, leakage of the user's local private data is avoided.
The method for model training according to the embodiment of the present disclosure is described above with reference to fig. 1 to 3, and the apparatus for model training according to the embodiment of the present disclosure will be described below with reference to fig. 4 to 6.
FIG. 4 shows a block diagram of the local model training apparatus 110 according to an embodiment of the present disclosure. As shown in fig. 4, the local model training apparatus 110 includes a sample data sending unit 111, a label value receiving unit 113, and a local model training unit 115.
The sample data sending unit 111 is configured to send the at least one first sample data to the global model device 120 to obtain a label value of the at least one first sample data based on the global model at the global model device 120, the first sample data being unlabeled sample data. The global model comprises at least one global sub-model, each global sub-model being trained using an independent second sample data set. The operation of the sample data sending unit 111 may refer to the operation of block 220 described above with reference to fig. 2.
The label value receiving unit 113 is configured to receive the label value of the at least one first sample data. The operation of the label value receiving unit 113 may refer to the operation of block 240 described above with reference to fig. 2.
The local model training unit 115 is configured to train the local model at the user's local using the at least one first sample data and the corresponding label value. The operation of the local model training unit 115 may refer to the operation of block 250 described above with reference to FIG. 2.
Furthermore, in another example of the present disclosure, the local model training device 110 may further include a sample data acquisition unit (not shown). The sample data acquisition unit is configured to acquire at least one first sample data locally at a user.
FIG. 5 shows a block diagram of the global model device 120 according to an embodiment of the present disclosure. As shown in fig. 5, the global model device 120 includes a sample data receiving unit 121, a label value obtaining unit 123, and a label value sending unit 125.
The sample data receiving unit 121 is configured to receive at least one first sample data from the user's local side, the first sample data being unlabeled sample data. The operation of the sample data receiving unit 121 may refer to the operation of block 220 described above with reference to fig. 2.
The label value obtaining unit 123 is configured to provide the at least one first sample data to the global model to obtain the label value of the at least one first sample data. The operation of the label value obtaining unit 123 may refer to the operation of block 230 described above with reference to fig. 2 and the operations described with reference to fig. 3.
The label value sending unit 125 is configured to send the obtained label value of the at least one first sample data to the user's local side, so as to train the local model locally at the user using the at least one first sample data and the corresponding label value. The operation of the label value sending unit 125 may refer to the operation of block 240 described above with reference to fig. 2.
Fig. 6 shows a block diagram of an implementation example of the label value obtaining unit 123 according to an embodiment of the present disclosure. As shown in fig. 6, the label value obtaining unit 123 includes a prediction module 124 and a data fusion module 126.
The prediction module 124 is configured to input each of the at least one first sample data into each of the at least one global sub-model for prediction.
The data fusion module 126 is configured to fuse the obtained predicted values of the global sub-models for each first sample data to obtain the label value of that sample data.
In another example of the present disclosure, the label value obtaining unit 123 may further include a noise adding module (not shown). The noise adding module is configured to perform noise addition processing on the obtained predicted values of the global sub-models. The data fusion module then fuses the noise-added predicted values for each first sample data to obtain the label value of that sample data.
Embodiments of a method for model training and an apparatus for model training according to the present disclosure are described above with reference to fig. 1 to 6. The above means for model training may be implemented in hardware, or may be implemented in software, or a combination of hardware and software.
FIG. 7 illustrates a hardware block diagram of a computing device 700 for local model training according to an embodiment of the disclosure. As shown in fig. 7, the computing device 700 may include at least one processor 710, a storage 720, a memory 730, and a communication interface 740, which are connected together via a bus 760. The at least one processor 710 executes at least one computer-readable instruction (i.e., an element described above as being implemented in software) stored or encoded in the memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 710 to: send at least one first sample data to a global model device to obtain a label value of the at least one first sample data based on a global model at the global model device, the first sample data being unlabeled sample data; and train a local model locally at the user using the at least one first sample data and the corresponding label values, wherein the global model comprises at least one global sub-model, each global sub-model being trained using an independent second sample data set.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 710 to perform the various operations and functions described above in connection with fig. 1-6 in the various embodiments of the present disclosure.
Fig. 8 shows a hardware block diagram of a computing device 800 for model training on the remote side (i.e., the global model apparatus above) according to an embodiment of the present disclosure. As shown in fig. 8, the computing device 800 may include at least one processor 810, a storage 820, a memory 830, and a communication interface 840, which are coupled together via a bus 860. The at least one processor 810 executes at least one computer-readable instruction (i.e., an element described above as being implemented in software) stored or encoded in the memory.
In one embodiment, computer-executable instructions are stored in the memory that, when executed, cause the at least one processor 810 to: receive at least one first sample data from the user's local side, the first sample data being unlabeled sample data; provide the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and send the obtained label value of the at least one first sample data to the user's local side to train a local model locally at the user using the at least one first sample data and the corresponding label value, wherein the global model comprises at least one global sub-model, each global sub-model being trained using an independent second sample data set.
It should be understood that the computer-executable instructions stored in the memory, when executed, cause the at least one processor 810 to perform the various operations and functions described above in connection with fig. 1-6 in the various embodiments of the present disclosure.
In the present disclosure, computing device 700/800 may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handheld devices, messaging devices, wearable computing devices, consumer electronics, and so forth.
According to one embodiment, a program product, such as a machine-readable medium, is provided. The machine-readable medium may have instructions (i.e., elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform the various operations and functions described above in connection with figs. 1-6 in the various embodiments of the disclosure. Specifically, a system or apparatus may be provided with a readable storage medium on which software program code implementing the functions of any of the above embodiments is stored, and a computer or processor of the system or apparatus reads out and executes the instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
It will be understood by those skilled in the art that various changes and modifications may be made in the above-disclosed embodiments without departing from the spirit of the invention. Accordingly, the scope of the invention should be determined from the following claims.
It should be noted that not all steps and units in the above flows and system structure diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical entity, or some units may be implemented by a plurality of physical entities, or some units may be implemented by some components in a plurality of independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments but does not represent all embodiments that may be practiced or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (22)

1. A method for model training, comprising:
sending at least one first sample data to a global model device to obtain a label value of the at least one first sample data based on a global model at the global model device, the first sample data being unlabeled sample data; and
training a local model locally at the user using the at least one first sample data and the corresponding label value,
wherein the global model comprises at least one global submodel, each global submodel being trained using an independent second sample data set.
2. The method of claim 1, wherein the second set of sample data is obtained by partitioning the set of sample data or acquired by a different data acquisition device.
3. The method of claim 1, further comprising:
the at least one first sample data is collected locally at the user.
4. The method of claim 3, wherein the at least one first sample data is public sample data.
5. The method of claim 1, wherein the label value of each of the at least one first sample data is obtained by inputting the first sample data into each of the at least one global submodel for prediction and fusing the obtained prediction values of the global submodels.
6. The method of claim 5, wherein the predicted value of each global sub-model is a predicted value after noise addition processing.
7. A method for model training, comprising:
receiving at least one first sample data from the user's local side, the first sample data being unlabeled sample data;
providing the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and
sending the obtained label value to the user's local side to train a local model locally at the user using the at least one first sample data and the corresponding label value,
wherein the global model comprises at least one global submodel, each global submodel being trained using an independent second sample data set.
8. The method of claim 7, wherein the second set of sample data is obtained by partitioning the set of sample data or acquired by a different data acquisition device.
9. The method of claim 7, wherein providing the at least one first sample data to a global model to obtain the label value of the at least one first sample data comprises:
inputting each of the at least one first sample data into each of the at least one global submodel for prediction; and
fusing the obtained predicted values of the global submodels for each first sample data to obtain the label value of that sample data.
10. The method of claim 9, wherein providing the at least one first sample data to a global model to obtain the label value of the at least one first sample data further comprises:
performing noise addition processing on the obtained predicted values of the global submodels,
wherein fusing the obtained predicted values of the global submodels for each first sample data to obtain the label value of that sample data comprises:
fusing the noise-added predicted values for each first sample data to obtain the label value of that sample data.
11. An apparatus for model training, comprising:
a sample data sending unit configured to send at least one first sample data to a global model device to obtain a label value of the at least one first sample data based on a global model at the global model device, the first sample data being unlabeled sample data;
a label value receiving unit configured to receive the label value of the at least one first sample data; and
a local model training unit configured to train a local model locally at the user using the at least one first sample data and the corresponding label value,
wherein the global model comprises at least one global submodel, each global submodel being trained using an independent second sample data set.
12. The apparatus of claim 11, further comprising:
a sample data acquisition unit configured to acquire the at least one first sample data locally at a user.
13. The apparatus of claim 12, wherein the at least one first sample data is public sample data.
14. An apparatus for model training, comprising:
a sample data receiving unit configured to receive at least one first sample data from the user's local side, the first sample data being unlabeled sample data;
a label value obtaining unit configured to provide the at least one first sample data to a global model to obtain a label value of the at least one first sample data; and
a label value sending unit configured to send the obtained label value of the at least one first sample data to the user's local side to train a local model locally at the user using the at least one first sample data and the corresponding label value,
wherein the global model comprises at least one global submodel, each global submodel being trained using an independent second sample data set.
15. The apparatus of claim 14, further comprising:
at least one global submodel training unit configured to train each of the at least one global submodel using an independent second sample data set.
16. The apparatus of claim 14, wherein the label value obtaining unit comprises:
a prediction module configured to input each of the at least one first sample data into each of the at least one global submodel for prediction; and
a data fusion module configured to fuse the obtained predicted values of the global submodels for each first sample data to obtain the label value of that sample data.
17. The apparatus of claim 16, wherein the label value obtaining unit further comprises:
a noise adding module configured to perform noise addition processing on the obtained predicted values of the respective global submodels,
wherein the data fusion module is configured to: fuse the noise-added predicted values for each first sample data to obtain the label value of that sample data.
18. A system for model training, comprising:
the apparatus for model training as claimed in any one of claims 11 to 13; and
the apparatus for model training as claimed in any one of claims 14 to 17.
19. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 1-6.
20. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any of claims 1 to 6.
21. A computing device, comprising:
at least one processor, and
a memory coupled with the at least one processor, the memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 7-10.
22. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any of claims 7 to 10.
CN201910327485.8A 2019-04-23 2019-04-23 Machine learning model training method and device Active CN111832591B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910327485.8A CN111832591B (en) 2019-04-23 2019-04-23 Machine learning model training method and device

Publications (2)

Publication Number Publication Date
CN111832591A 2020-10-27
CN111832591B 2024-06-04

Family

ID=72912298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910327485.8A Active CN111832591B (en) 2019-04-23 2019-04-23 Machine learning model training method and device

Country Status (1)

Country Link
CN (1) CN111832591B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573720A (en) * 2014-12-31 2015-04-29 北京工业大学 Distributed training method for kernel classifiers in wireless sensor network
WO2018033890A1 (en) * 2016-08-19 2018-02-22 Linear Algebra Technologies Limited Systems and methods for distributed training of deep learning models
CN107169573A (en) * 2017-05-05 2017-09-15 第四范式(北京)技术有限公司 Using composite machine learning model come the method and system of perform prediction
CN108289115A (en) * 2017-05-10 2018-07-17 腾讯科技(深圳)有限公司 A kind of information processing method and system
US20180336486A1 (en) * 2017-05-17 2018-11-22 International Business Machines Corporation Training a machine learning model in a distributed privacy-preserving environment
CN107967491A (en) * 2017-12-14 2018-04-27 北京木业邦科技有限公司 Machine learning method, device, electronic equipment and the storage medium again of plank identification
CN108491720A (en) * 2018-03-20 2018-09-04 腾讯科技(深圳)有限公司 A kind of application and identification method, system and relevant device
CN108764065A (en) * 2018-05-04 2018-11-06 华中科技大学 A kind of method of pedestrian's weight identification feature fusion assisted learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022111398A1 (en) * 2020-11-26 2022-06-02 华为技术有限公司 Data model training method and apparatus
CN113420322A (en) * 2021-05-24 2021-09-21 阿里巴巴新加坡控股有限公司 Model training and desensitizing method and device, electronic equipment and storage medium
CN113420322B (en) * 2021-05-24 2023-09-01 阿里巴巴新加坡控股有限公司 Model training and desensitizing method and device, electronic equipment and storage medium
CN113689000A (en) * 2021-08-25 2021-11-23 深圳前海微众银行股份有限公司 Federal learning model training method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111832591B (en) 2024-06-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant