JP2021193533A - Machine learning device, machine learning method, and machine learning program

Info

Publication number: JP2021193533A
Application number: JP2020099778A
Authority: JP
Inventors: 清良披田野; Seira Hidano; 晋作清本; Shinsaku Kiyomoto
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2020-06-09
Filing date: 2020-06-09
Publication date: 2021-12-23
Anticipated expiration: 2040-06-09
Also published as: JP7290606B2

Abstract

To provide a machine learning device, a machine learning method, and a machine learning program capable of creating a trained model with sufficient performance from a data set containing private data while protecting privacy.
SOLUTION: A machine learning device 1 includes: a data division unit 11 that divides training data for a machine learning model into a first set containing private data and a second set containing no private data; a first learning unit 12 that trains the machine learning model using the first set; a model division unit 13 that divides the machine learning model trained by the first learning unit 12 into an input-side model and an output-side model; a model replacement unit 14 that replaces the output-side model with an initial model in a format suited to the task of the second set; and a second learning unit 15 that uses the second set to train a portion of the machine learning model that includes at least the initial model.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a machine learning device, a machine learning method, and a machine learning program that protect privacy.

Conventionally, attacks that infer training data from the output of a trained model have been a known threat to privacy in machine learning. Typical attacks include the model inversion attack, which reconstructs the training data itself, and the membership inference attack, which estimates whether a given data item was included in the training data (see, for example, Non-Patent Document 1).

Membership privacy, defined on the basis of the membership inference attack, is the strongest security definition for privacy in machine learning. Under this concept, if the output of a trained model is the same for training data and test data, it cannot be estimated whether a given data item was in the training data, and the model is therefore said to satisfy membership privacy.

S. Reza, et al., “Membership Inference Attacks Against Machine Learning Models,” IEEE S&P 2017.

However, with general learning methods it is difficult to fully generalize a machine learning model, and the predicted values (probabilities) output for training data diverge from those output for test data, so membership privacy could not be achieved.

If private data are excluded from the training data, membership privacy is not a concern even if information about the training data leaks from the output. However, when private data make up the majority of the training data, excluding them greatly reduces the total amount of data, and the performance of the trained model drops significantly. A method is therefore needed that excludes private data without degrading the performance of the model.

An object of the present invention is to provide a machine learning device, a machine learning method, and a machine learning program capable of creating, from a data set that includes private data, a trained model with sufficient performance while protecting privacy.

The machine learning device according to the present invention includes: a data division unit that divides training data for a machine learning model into a first set that includes private data and a second set that includes no private data; a first learning unit that trains the machine learning model using the first set; a model division unit that divides the machine learning model trained by the first learning unit into an input-side model and an output-side model; a model replacement unit that replaces the output-side model with an initial model in a format suited to the task of the second set; and a second learning unit that uses the second set to train a portion of the machine learning model that includes at least the initial model.

The second learning unit may update the entire machine learning model.

The data division unit may form the first set from private data only.

In the machine learning method according to the present invention, a computer executes: a data division step of dividing training data for a machine learning model into a first set that includes private data and a second set that includes no private data; a first learning step of training the machine learning model using the first set; a model division step of dividing the machine learning model trained in the first learning step into an input-side model and an output-side model; a model replacement step of replacing the output-side model with an initial model in a format suited to the task of the second set; and a second learning step of using the second set to train a portion of the machine learning model that includes at least the initial model.

The machine learning program according to the present invention causes a computer to function as the machine learning device.

According to the present invention, a trained model with sufficient performance can be created from a data set that includes private data while privacy is protected.

FIG. 1 is a diagram showing the functional configuration of the machine learning device in the embodiment.
FIG. 2 is a conceptual diagram illustrating the machine learning method in the embodiment.
FIG. 3 is a flowchart showing the flow of the machine learning method in the embodiment.

An example of an embodiment of the present invention is described below.
In the present embodiment, it is assumed that a training data set D consisting of n data items has been prepared for machine learning. Each data item x is given a label y representing the class into which it is classified. In machine learning, a model f that associates data x with label y is learned.

Here, given a data item x, an attacker inputs x into the model f and obtains the output f(x). If it is difficult for the attacker to estimate from f(x) whether x was included in the training data of the model f, the model f is said to satisfy membership privacy.
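The definition above can be illustrated with a minimal sketch. It is not the attack of Non-Patent Document 1; it assumes a simplified attacker who guesses "member" whenever the confidence the model assigns to the true label reaches a threshold. The function names and the confidence values are hypothetical and chosen only for illustration.

```python
from statistics import mean

def confidence_gap(train_conf, test_conf):
    """Mean confidence on training members minus mean confidence on non-members."""
    return mean(train_conf) - mean(test_conf)

def threshold_attack_accuracy(train_conf, test_conf, threshold):
    """Guess 'member' when confidence >= threshold; return the fraction of correct guesses."""
    hits = sum(c >= threshold for c in train_conf)   # members correctly flagged
    hits += sum(c < threshold for c in test_conf)    # non-members correctly rejected
    return hits / (len(train_conf) + len(test_conf))

# Hypothetical confidences of a model that has overfitted its training data.
overfit_train = [0.99, 0.98, 0.97, 0.99]
overfit_test = [0.70, 0.55, 0.80, 0.60]

# Hypothetical confidences of a model whose output is the same for both groups.
private_train = [0.80, 0.60, 0.75, 0.65]
private_test = [0.80, 0.60, 0.75, 0.65]

leaky = threshold_attack_accuracy(overfit_train, overfit_test, 0.9)
safe = threshold_attack_accuracy(private_train, private_test, 0.7)
```

When the two confidence distributions coincide, the thresholding attacker in this sketch does no better than a coin flip, which is the situation the definition describes.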

Because the training data set D contains private data, the present embodiment generates, for privacy protection, a machine learning model for classifying data that does not include private information.
To maintain the classification performance of the trained model, the machine learning method of the present embodiment also uses the data set that includes private information for training, and applies a transfer learning technique to eliminate the discrepancy in output between training data and test data that arises from this data set.

FIG. 1 is a diagram showing the functional configuration of the machine learning device 1 in the present embodiment.
The machine learning device 1 is an information processing device (computer) such as a server device or a personal computer, and includes a control unit 10 and a storage unit 20, as well as input/output devices for various data, communication devices, and the like.

The control unit 10 controls the machine learning device 1 as a whole, and realizes each function of the present embodiment by reading and executing, as appropriate, the various programs stored in the storage unit 20. The control unit 10 may be a CPU.

The storage unit 20 is a storage area for the various programs and data that cause the hardware group to function as the machine learning device 1, and may be a ROM, a RAM, a flash memory, a hard disk drive (HDD), or the like. Specifically, the storage unit 20 stores a program (machine learning program) that causes the control unit 10 to execute each function of the present embodiment, as well as the machine learning model, the training data, and the like.

The control unit 10 includes a data division unit 11, a first learning unit 12, a model division unit 13, a model replacement unit 14, and a second learning unit 15.

The data division unit 11 divides the training data set D of the machine learning model into a first set D_1 that includes private data and a second set D_2 that includes no private data.
The data division unit 11 may form the first set D_1 from private data only, but the first set D_1 may also include non-private data.
No constraint on the label y is imposed as a condition of the division.
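The division can be sketched as follows, as a minimal illustration rather than the device's actual implementation. It assumes the variant in which D_1 consists of private data only, and it assumes that each record carries a flag saying whether it is private; the record layout, the `is_private` predicate, and the function name are hypothetical.

```python
def divide_training_data(dataset, is_private):
    """Return (D_1, D_2): D_1 holds the private records, D_2 holds the rest."""
    d1, d2 = [], []
    for record in dataset:
        (d1 if is_private(record) else d2).append(record)
    return d1, d2

# Hypothetical records: (features, label, private flag).
D = [
    ([0.2, 0.7], "a", True),
    ([0.9, 0.1], "b", True),
    ([0.4, 0.4], "c", False),
    ([0.8, 0.6], "c", True),
    ([0.1, 0.3], "d", False),
]

# The label is not consulted, matching the absence of a label constraint.
D1, D2 = divide_training_data(D, is_private=lambda record: record[2])
```

The only hard requirement carried over from the text is that D_2 contains no private record; moving some non-private records into D_1 would also be permitted.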

The first learning unit 12 trains the machine learning model using the first set D_1 as training data.
At this time, the parameters of the machine learning model are designed so that the output corresponds to the labels y included in the first set D_1.

The model division unit 13 divides the machine learning model trained by the first learning unit 12 into an input-side model A and an output-side model B_1.
The point at which a machine learning model composed of multiple layers is divided may be determined as appropriate according to the classification task of the model.
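A minimal sketch of the division, assuming the trained model can be represented as an ordered list of layers, each a callable. The list representation, the `split_index` parameter, and the stand-in layers are assumptions made for illustration; the text leaves the choice of division point open.

```python
def split_model(layers, split_index):
    """Divide an ordered list of layers into input-side A and output-side B_1."""
    if not 0 < split_index < len(layers):
        raise ValueError("split_index must leave at least one layer on each side")
    return layers[:split_index], layers[split_index:]

def run(layers, x):
    """Apply the layers in order, from the input side to the output side."""
    for layer in layers:
        x = layer(x)
    return x

# Hypothetical stand-ins for the trained layers of f_1.
f1 = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]

A, B1 = split_model(f1, split_index=2)
```

Running A and then B_1 gives the same result as running f_1, so the division by itself does not change the behaviour of the model.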

The model replacement unit 14 replaces the divided output-side model B_1 with an initial model B_2 in a format suited to the task of the second set D_2.
The first set D_1 and the second set D_2 may have different classification tasks, and the parameters of the initial model B_2 are designed so that the output corresponds to the labels y assigned to the second set D_2.
That is, B_1 and B_2 need not have the same format, and may differ in structure, for example in the number of units in the neural network or the number of classification classes.
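The replacement can be sketched as below, assuming for illustration that the output-side model is a single linear layer and that the model is held as a dictionary with keys "A" and "B". The dictionary layout, the function names, and the small Gaussian initialisation are hypothetical choices, not details given in the text.

```python
import random

def make_initial_head(n_features, labels, rng):
    """Create a freshly initialised output-side model B_2 sized for the labels of D_2."""
    classes = sorted(set(labels))
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_features)] for _ in classes]
    biases = [0.0 for _ in classes]
    return {"classes": classes, "weights": weights, "biases": biases}

def replace_output_side(model, new_head):
    """Return a new model that keeps the trained A and swaps B_1 for B_2."""
    return {"A": model["A"], "B": new_head}

# Hypothetical trained model f_1 whose head B_1 predicts the labels a, b, c.
f1 = {"A": "trained input-side layers",
      "B": {"classes": ["a", "b", "c"], "weights": [], "biases": []}}

# Labels found in D_2; the task differs from that of D_1.
labels_d2 = ["c", "d", "e", "d"]

B2 = make_initial_head(n_features=4, labels=labels_d2, rng=random.Random(0))
f2 = replace_output_side(f1, B2)
```

Here B_2 has one output row per class in D_2, so it can differ from B_1 both in the set of classes and in shape, while A is carried over unchanged.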

The second learning unit 15 uses the second set D_2 as training data and trains a portion of the machine learning model that includes at least the initial model B_2.
At this time, the second learning unit 15 may update the entire machine learning model (A + B_2), or may update only the initial model B_2.

FIG. 2 is a conceptual diagram illustrating the machine learning method in the present embodiment.
When the training data set D is divided to give a first set D_1 that includes private data and a second set D_2 that includes no private data, a model f_1 is first trained using the first set.
The trained model f_1 outputs predicted values corresponding to the labels included in D_1 (for example, a, b, c).

Next, the model f_1 is divided into a first half (A) close to the input and a second half (B_1) close to the output, and the second half is replaced from B_1 to B_2 in accordance with the transfer learning procedure.
At this time, the model B_2 is initialized to suit the task of the second set D_2, and the model f_2 outputs predicted values corresponding to the labels included in D_2 (for example, c, d, e).

Subsequently, the model f_2 is trained using the second set D_2, and at least the second half (B_2) is updated.

FIG. 3 is a flowchart showing the flow of the machine learning method in the present embodiment.
In step S1, the data division unit 11 divides the training data set D into a first set D_1 that includes private data and a second set D_2 that includes no private data.

In step S2, the first learning unit 12 uses the first set D_1 as training data and trains a model f_1, composed of multiple layers, for classifying the classes of the data included in D_1.

In step S3, the model division unit 13 divides the trained model f_1 into two models, designating the first half on the input side as A and the second half on the output side as B_1.

In step S4, the model replacement unit 14 creates a second-half model B_2 for classifying the classes of the data included in the second set D_2, connects it to the first-half model A of the trained model f_1, and thereby initializes the model f_2.

In step S5, the second learning unit 15 uses the second set D_2 as training data and updates the model f_2 for classifying the classes of the data included in D_2. The second learning unit 15 may update only B_2 without updating A, or may update both A and B_2.
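Steps S1 to S5 can be traced end to end in a small sketch. It assumes a toy network with one tanh hidden layer as A and one softmax layer as B, trained by plain stochastic gradient descent, with the division point fixed after the hidden layer. The network shape, the hyperparameters, the record layout, and all function names are hypothetical; the text does not prescribe a model architecture or optimizer.

```python
import copy
import math
import random

def init_linear(n_in, n_out, rng):
    """n_out rows, each holding n_in weights followed by one bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def linear(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) + row[-1] for row in w]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(a, b, x):
    h = [math.tanh(v) for v in linear(a, x)]  # input-side model A
    return h, softmax(linear(b, h))           # output-side model B

def train(a, b, data, epochs, lr, update_a):
    """Plain SGD on cross-entropy; A changes only when update_a is True."""
    for _ in range(epochs):
        for x, y in data:
            h, p = forward(a, b, x)
            dz = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
            if update_a:
                # Back-propagate through B and tanh before B is modified.
                dh = [sum(dz[k] * b[k][j] for k in range(len(b))) * (1.0 - h[j] ** 2)
                      for j in range(len(h))]
                for j, row in enumerate(a):
                    for i, xi in enumerate(x):
                        row[i] -= lr * dh[j] * xi
                    row[-1] -= lr * dh[j]
            for k, row in enumerate(b):
                for j, hj in enumerate(h):
                    row[j] -= lr * dz[k] * hj
                row[-1] -= lr * dz[k]

def index_labels(data):
    """Map the labels of one set to class indices 0..K-1."""
    classes = sorted({y for _, y in data})
    return classes, [(x, classes.index(y)) for x, y in data]

def run_steps(dataset, n_hidden=4, epochs=20, lr=0.1, update_a=False, seed=0):
    rng = random.Random(seed)
    # S1: divide D into D_1 (private) and D_2 (non-private).
    d1 = [(x, y) for x, y, private in dataset if private]
    d2 = [(x, y) for x, y, private in dataset if not private]
    classes1, d1 = index_labels(d1)
    classes2, d2 = index_labels(d2)
    # S2: train f_1 = (A, B_1) on D_1.
    a = init_linear(len(dataset[0][0]), n_hidden, rng)
    b1 = init_linear(n_hidden, len(classes1), rng)
    train(a, b1, d1, epochs, lr, update_a=True)
    # S3: divide f_1 into A and B_1; a copy of A is kept for inspection.
    a_after_s2 = copy.deepcopy(a)
    # S4: discard B_1 and attach a freshly initialised B_2 sized for D_2's labels.
    b2 = init_linear(n_hidden, len(classes2), rng)
    # S5: train f_2 = (A, B_2) on D_2 only.
    train(a, b2, d2, epochs, lr, update_a=update_a)
    return {"A": a, "B2": b2, "classes": classes2, "A_after_S2": a_after_s2}

# Hypothetical records: (features, label, private flag).
dataset = [
    ([0.0, 1.0], "a", True), ([1.0, 0.0], "b", True), ([1.0, 1.0], "c", True),
    ([0.0, 0.0], "c", False), ([0.5, 1.0], "d", False), ([1.0, 0.5], "e", False),
]
frozen = run_steps(dataset, update_a=False)  # S5 updates only B_2
full = run_steps(dataset, update_a=True)     # S5 updates A and B_2
```

With `update_a=False` the input-side model A leaves S5 exactly as it left S2, while with `update_a=True` it is adjusted further on D_2; in both cases the output layer has one unit per label found in D_2.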

According to the present embodiment, the machine learning device 1 divides the training data D into a first set D_1 that includes private data and a second set D_2 that includes no private data. After training the machine learning model using the first set D_1, the machine learning device 1 transfers only the input-side model A, replaces the output-side model B_1 with an initial model B_2 in a format suited to the task of the second set D_2, and then updates the model using D_2.

As a result, the output of the model f_2 depends only on the second set D_2, which includes no private data, and the difference between training data and test data for the first set D_1, which includes private data, disappears. Consequently, it can no longer be estimated from the output of the model f_2 that private data was included in the training data.
In addition, because the portion of the model f_2 close to the input (A) was trained using the first set, which includes private data, it reflects the tendencies of the data set D as a whole; feature extraction is performed sufficiently, and the performance of the trained model f_2 is maintained.
Therefore, by applying a transfer learning technique, the machine learning device 1 can create a trained model with sufficient performance from a data set that includes private data while protecting privacy.

The machine learning device 1 may update the entire machine learning model (A + B_2) using the second set D_2, which includes no private data.
In this way, the first half (A) is trained so as to also incorporate the features of the second set D_2, and an improvement in the performance of the trained model can be expected.

By forming the first set from private data only, the machine learning device 1 can, when private data are numerous, maximize the amount of data in the second set D_2, which includes no private data, and suppress degradation of the performance of the trained model.

Although an embodiment of the present invention has been described above, the present invention is not limited to the embodiment described. The effects described in the embodiment are merely a list of the most suitable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiment.

The machine learning method of the machine learning device 1 is realized by software. When it is realized by software, the programs constituting the software are installed in an information processing device (computer). These programs may be recorded on a removable medium such as a CD-ROM and distributed to users, or may be distributed by being downloaded to a user's computer via a network. Furthermore, these programs may be provided to a user's computer as a Web service via a network without being downloaded.

1 Machine learning device
10 Control unit
11 Data division unit
12 First learning unit
13 Model division unit
14 Model replacement unit
15 Second learning unit
20 Storage unit

Claims

A machine learning device comprising:
a data division unit that divides training data for a machine learning model into a first set that includes private data and a second set that includes no private data;
a first learning unit that trains the machine learning model using the first set;
a model division unit that divides the machine learning model trained by the first learning unit into an input-side model and an output-side model;
a model replacement unit that replaces the output-side model with an initial model in a format suited to the task of the second set; and
a second learning unit that uses the second set to train a portion of the machine learning model that includes at least the initial model.

The machine learning device according to claim 1, wherein the second learning unit updates the entire machine learning model.

The machine learning device according to claim 1 or claim 2, wherein the data division unit forms the first set from private data only.

A machine learning method in which a computer executes:
a data division step of dividing training data for a machine learning model into a first set that includes private data and a second set that includes no private data;
a first learning step of training the machine learning model using the first set;
a model division step of dividing the machine learning model trained in the first learning step into an input-side model and an output-side model;
a model replacement step of replacing the output-side model with an initial model in a format suited to the task of the second set; and
a second learning step of using the second set to train a portion of the machine learning model that includes at least the initial model.

A machine learning program for causing a computer to function as the machine learning device according to any one of claims 1 to 3.