JP6649875B2

JP6649875B2 - Information processing apparatus, information processing method, program, and information processing system

Info

Publication number: JP6649875B2
Application number: JP2016252628A
Authority: JP
Inventors: 慧米川; 塁木村; 村松　茂樹; 茂樹村松; 亜令小林
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2016-12-27
Filing date: 2016-12-27
Publication date: 2020-02-19
Anticipated expiration: 2036-12-27
Also published as: JP2018106453A

Description

The present invention relates to an information processing device, an information processing method, a program, an information processing system, and a communication terminal.

In recent years, recognition and estimation techniques based on machine learning have developed rapidly. For example, Patent Literature 1 discloses a technique that uses a large-scale data set to automatically recognize what each of a plurality of objects (or targets) contained in an image is.

Patent Literature 1: Japanese Patent No. 5823270

In techniques such as the above, recognition and estimation processing is executed based on some feature amount used to recognize the recognition target. Therefore, when the device that executes the recognition and estimation processing is located on a network, the feature amount must be transmitted to that device via a communication network.

However, if the information transmitted via the communication network is a feature amount derived from, for example, personal information held by a company, transmitting that feature amount to an external party as it is raises the risk of a personal information leak and is therefore hard to accept. It is thus desirable to be able to execute the recognition and estimation processing from the other feature amounts, with such feature amounts excluded.

The present invention has been made in view of these points, and an object thereof is to provide a technique that uses some of a plurality of feature amounts to complement the other feature amounts.

A first aspect of the present invention is an information processing device. The device includes: a feature factor matrix storage unit that stores a feature factor matrix F, obtained when a matrix formed by arranging a plurality of feature vectors, one generated for each of M types of feature amounts (M is an integer of 2 or more) and each having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F; a test matrix acquisition unit that acquires a test matrix T formed by arranging a plurality of feature vectors, one generated for each of Q types (Q is an integer of 1 or more and less than M) of the M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more); a feature factor extraction unit that generates a test feature factor matrix Ft, obtained by extracting from the feature factor matrix F the components corresponding to the feature vectors that constitute the test matrix T, and a missing feature factor matrix Fl composed of the remaining components; a user factor extraction unit that calculates a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft; and a missing information estimation unit that obtains, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of the missing feature vectors that are not included in the feature vectors constituting the test matrix T.
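The flow of the first aspect can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the implementation of the embodiment: the function name `estimate_missing_features`, the argument `observed_idx` (the column positions of F that correspond to the columns of T), and the use of `numpy.linalg.lstsq` to obtain Ut are all assumptions introduced here.

```python
import numpy as np

def estimate_missing_features(T, F, observed_idx):
    """Estimate the feature columns that are absent from the test matrix T.

    T            : (P, Q) test matrix holding only the observed feature columns.
    F            : (K, M) feature factor matrix from the earlier decomposition.
    observed_idx : the Q column indices of F that correspond to T's columns
                   (hypothetical bookkeeping, not named in the source).
    """
    M = F.shape[1]
    observed_idx = np.asarray(observed_idx)
    missing_idx = np.setdiff1d(np.arange(M), observed_idx)

    Ft = F[:, observed_idx]   # test feature factor matrix, shape (K, Q)
    Fl = F[:, missing_idx]    # missing feature factor matrix, shape (K, M - Q)

    # Least-squares solution of Ut @ Ft ≈ T, solved as Ft.T @ Ut.T ≈ T.T
    Ut = np.linalg.lstsq(Ft.T, T.T, rcond=None)[0].T   # shape (P, K)

    # Product of Ut and Fl gives the estimated missing feature vectors.
    return Ut @ Fl, missing_idx
```

When T is exactly of the form Ut × Ft and Ft has full row rank, this sketch recovers the missing columns exactly; with real data it returns a least-squares estimate.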

The feature factor extraction unit may further acquire information for identifying missing feature amounts, that is, feature amounts not included in the feature vectors that constitute the test matrix T.

The information processing device may further include: a teacher matrix acquisition unit that acquires a matrix formed by arranging a plurality of feature vectors, one generated for each of the M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of the N users; and a matrix decomposition unit that decomposes the matrix acquired by the teacher matrix acquisition unit into the product of two matrices, the user factor matrix U and the feature factor matrix F, by using the alternating least squares (ALS) method. The feature factor matrix storage unit may store the feature factor matrix F obtained by the matrix decomposition unit.

The user factor extraction unit may calculate the test user factor matrix Ut from the test matrix T and the test feature factor matrix Ft by using the least squares method.
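For reference, the least squares problem for Ut has a well-known closed form. The expression is a sketch that assumes Ft (a K × Q matrix) has full row rank, which requires Q ≥ K; the source does not state this condition, and the Frobenius-norm formulation is likewise an assumption about what "least squares" means here.

```latex
U_t \;=\; \arg\min_{U}\ \lVert T - U F_t \rVert_F^{2}
\quad\Longrightarrow\quad
U_t F_t F_t^{\top} = T F_t^{\top}
\quad\Longrightarrow\quad
U_t \;=\; T F_t^{\top} \left( F_t F_t^{\top} \right)^{-1}
```

The middle relation is the normal equation obtained by setting the gradient with respect to U to zero. When Ft does not have full row rank, a pseudo-inverse would be one possible substitute for the matrix inverse.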

The feature amounts may be feature amounts relating to personal information of the users.

The information processing device may further include: a classifier storage unit that stores a classifier obtained in advance by machine learning based on the feature amounts corresponding to the M types of feature vectors; an estimation matrix generation unit that uses the test matrix T and the missing feature vectors to generate an estimation matrix Me composed of M types of feature vectors for the P users; an identification unit that executes identification processing for each of the P users based on the feature vectors constituting the estimation matrix Me and the classifier; and a transmission unit that transmits the result of the identification processing to the communication terminal that transmitted the test matrix T.
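One way to picture the estimation matrix Me is as the P × M matrix obtained by putting the received columns of T and the estimated missing columns back into their original feature positions. The sketch below assumes that layout; the function names, the index arguments, and the linear threshold classifier are hypothetical stand-ins, since the source does not specify the type of classifier.

```python
import numpy as np

def build_estimate_matrix(T, missing_est, observed_idx, missing_idx):
    """Place observed and estimated columns into a single P x M matrix Me."""
    P = T.shape[0]
    M = len(observed_idx) + len(missing_idx)
    Me = np.empty((P, M), dtype=float)
    Me[:, observed_idx] = T            # columns received from the terminal
    Me[:, missing_idx] = missing_est   # columns estimated from Ut and Fl
    return Me

def classify(Me, weights, bias=0.0):
    """Hypothetical stand-in classifier: 1 if the linear score is positive, else 0."""
    return (Me @ weights + bias > 0).astype(int)
```

Any classifier trained on all M feature types could take the place of `classify`, because Me has the same column layout as the training data.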

A second aspect of the present invention is an information processing method. In this method, a processor executes the steps of: acquiring a feature factor matrix F, obtained when a matrix formed by arranging a plurality of feature vectors, one generated for each of M types of feature amounts (M is an integer of 2 or more) and each having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F; acquiring a test matrix T formed by arranging a plurality of feature vectors, one generated for each of Q types (Q is an integer of 1 or more and less than M) of the M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more); generating a test feature factor matrix Ft, obtained by extracting from the feature factor matrix F the components corresponding to the feature vectors that constitute the test matrix T, and a missing feature factor matrix Fl composed of the remaining components; calculating a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft; and obtaining, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of the missing feature vectors that are not included in the feature vectors constituting the test matrix T.

A third aspect of the present invention is a program. The program causes a computer to realize the functions of: acquiring a feature factor matrix F, obtained when a matrix formed by arranging a plurality of feature vectors, one generated for each of M types of feature amounts (M is an integer of 2 or more) and each having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F; acquiring a test matrix T formed by arranging a plurality of feature vectors, one generated for each of Q types (Q is an integer of 1 or more and less than M) of the M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more); generating a test feature factor matrix Ft, obtained by extracting from the feature factor matrix F the components corresponding to the feature vectors that constitute the test matrix T, and a missing feature factor matrix Fl composed of the remaining components; calculating a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft; and obtaining, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of the missing feature vectors that are not included in the feature vectors constituting the test matrix T.

A fourth aspect of the present invention is an information processing system including a communication terminal and an information processing device that communicates with the communication terminal via a network. In this system, the information processing device includes: a feature factor matrix storage unit that stores a feature factor matrix F, obtained when a matrix formed by arranging a plurality of feature vectors, one generated for each of M types of feature amounts (M is an integer of 2 or more) and each having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F; a test matrix acquisition unit that acquires, from the communication terminal via the network, a test matrix T formed by arranging a plurality of feature vectors, one generated for each of Q types (Q is an integer of 1 or more and less than M) of the M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more); a feature factor extraction unit that generates a test feature factor matrix Ft, obtained by extracting from the feature factor matrix F the components corresponding to the feature vectors that constitute the test matrix T, and a missing feature factor matrix Fl composed of the remaining components; a user factor extraction unit that calculates a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft; and a missing information estimation unit that obtains, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of the missing feature vectors that are not included in the feature vectors constituting the test matrix T. The communication terminal includes: a test matrix generation unit that generates the test matrix T, composed of Q types of feature vectors, by removing the feature vectors corresponding to predetermined feature amounts from a matrix formed by arranging a plurality of feature vectors, one generated for each of the M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of the P users; and a test matrix transmission unit that transmits the test matrix T to the information processing device via the network.

A fifth aspect of the present invention is the communication terminal in an information processing system that includes a communication terminal and an information processing device that communicates with the communication terminal via a network. The communication terminal includes: a test matrix generation unit that generates a test matrix T, composed of Q types of feature vectors, by removing the feature vectors corresponding to predetermined feature amounts from a matrix formed by arranging a plurality of feature vectors, one generated for each of M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more); a test matrix transmission unit that transmits the test matrix T to the information processing device via the network; and an identification result reception unit that receives the result of identification processing executed in the information processing device for each of the P users from the test matrix T and the missing feature vectors, the missing feature vectors having been estimated based on a feature factor matrix F determined in advance for estimating the feature vectors removed by the test matrix generation unit.
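On the terminal side, generating the test matrix T amounts to dropping the withheld feature columns before anything is transmitted. The following is a minimal sketch of that step only; the function name `make_test_matrix` and the argument `withheld_idx` are hypothetical, and the transmission itself is left out because the source does not describe a protocol.

```python
import numpy as np

def make_test_matrix(full_matrix, withheld_idx):
    """Drop the withheld feature columns; return T and the indices of the kept columns."""
    M = full_matrix.shape[1]
    kept_idx = np.setdiff1d(np.arange(M), np.asarray(withheld_idx))
    return full_matrix[:, kept_idx], kept_idx
```

Returning the kept column indices alongside T is one way the receiving side could learn which feature types are missing; whether the embodiment conveys this information in the same manner is not stated in this section.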

According to the present invention, some of a plurality of feature amounts can be used to complement the other feature amounts.

FIG. 1 is a diagram schematically illustrating the configuration of the information processing system according to the embodiment.
FIG. 2 is a diagram schematically illustrating the functional configuration of the information processing device according to the embodiment.
FIG. 3 is a diagram schematically illustrating the data structure of the feature matrix.
FIG. 4 is a diagram schematically illustrating the functional configuration of the feature matrix generation unit according to the embodiment.
FIG. 5 is a diagram for explaining the relationship between the feature matrix and the feature factor matrix.
FIG. 6 is a diagram schematically illustrating the functional configuration of the information estimation unit according to the embodiment.
FIG. 7 is a schematic diagram for explaining the estimation processing executed by the information estimation unit according to the embodiment.
FIG. 8 is a diagram schematically illustrating the functional configuration of the identification execution unit according to the embodiment.
FIG. 9 is a diagram schematically illustrating the functional configuration of the communication terminal.
FIG. 10 is a flowchart for explaining the flow of information processing executed by the information processing device according to the embodiment.

<Overview of Embodiment>
An outline of the embodiment is described with reference to FIG. 1.
FIG. 1 is a diagram schematically illustrating the configuration of an information processing system S according to the embodiment. The information processing system S includes an information processing device 1 and a communication terminal 2 that communicates with the information processing device 1 via a network N. The information processing device 1 is composed of, for example, a known computer such as a blade server or a cloud server, and a mass storage device. The information processing device 1 holds a classifier obtained in advance by machine learning based on a plurality of feature amounts, and can execute identification processing based on feature amounts acquired from the communication terminal 2 via the network N.

The communication terminal 2 is a computer, such as a PC (Personal Computer) or a workstation, managed by a user. To have the information processing device 1 execute identification processing, the user of the communication terminal 2 transmits the feature amounts required for the identification processing to the information processing device 1 via the network N.

Although FIG. 1 shows only one communication terminal 2, the information processing device 1 can also communicate with a plurality of different communication terminals 2, each managed by a different user. The administrator who maintains the information processing device 1 can acquire information that is difficult for individual users to acquire, such as specific personal information. The information processing device 1 can therefore hold a classifier that uses feature amounts extracted from such information. This allows a user to have the information processing device 1 execute identification processing with a classifier that the user could not hold on their own.

In general, when a device executes identification processing with a classifier, it needs feature amounts of the same types as those used at training time. This means that, if the classifier held by the information processing device 1 uses feature amounts that are generally expected to be kept confidential, such as feature amounts relating to personal information, the user of the communication terminal 2 must provide those feature amounts to the information processing device 1 in order to have it execute the identification processing. The user of the communication terminal 2 could provide such feature amounts to the information processing device 1 offline, for example by mail, but in view of convenience, speed, and the risk of loss, this is not necessarily the best solution.

The information processing device 1 according to the embodiment therefore acquires from the communication terminal 2 the feature amounts required to run the classifier, excluding those that are to be kept confidential. Based on the feature amounts acquired from the communication terminal 2, the information processing device 1 estimates and complements the excluded feature amounts. The information processing device 1 then executes identification processing with the classifier based on the acquired feature amounts and the complemented feature amounts, and transmits the result to the communication terminal 2. In this way, the user can obtain the result of identification processing by a classifier that uses confidential feature amounts without transmitting those feature amounts to the information processing device 1.
The information processing device 1 according to the embodiment is described in more detail below.

<Functional Configuration of Information Processing Device 1>
FIG. 2 is a diagram schematically illustrating the functional configuration of the information processing device 1 according to the embodiment. The information processing device 1 includes a communication unit 10, a storage unit 20, and a control unit 30.

The communication unit 10 is a communication interface through which the information processing device 1 exchanges information with the communication terminal 2. The storage unit 20 comprises a ROM (Read Only Memory) that stores an OS (Operating System), application programs, and the like, a RAM (Random Access Memory) that serves as a work area for the information processing device 1, and a mass storage device such as an HDD (Hard Disk Drive) or SSD (Solid State Drive) that stores various information including the classifier. The mass storage device of the storage unit 20 holds a feature factor matrix storage unit 21 and a classifier storage unit 22.

The control unit 30 is a processor of the information processing device 1, such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit), and functions as a feature matrix generation unit 31, an information estimation unit 32, and an identification execution unit 33 by executing programs stored in the storage unit 20.

The feature factor matrix storage unit 21 stores the feature factor matrix F obtained when a matrix, formed by arranging feature vectors generated for each of M types of feature amounts (M is an integer of 2 or more), each feature vector having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F.

[Feature Matrix Ms]
The feature factor matrix F stored in the feature factor matrix storage unit 21 is described below, beginning with the feature matrix Ms on which it is based. The feature matrix Ms is generated by the feature matrix generation unit 31.

FIG. 3 is a diagram schematically illustrating the data structure of the feature matrix Ms. FIG. 4 is a diagram schematically illustrating the functional configuration of the feature matrix generation unit 31 according to the embodiment. The feature matrix generation unit 31 includes a teacher matrix acquisition unit 311 and a matrix decomposition unit 312.

As described above, the feature matrix Ms is a matrix formed by arranging a plurality of feature vectors, one generated for each of the M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of N individuals. In FIG. 3 it is indicated by the thick-lined rectangle. In the example shown in FIG. 3, the user ID is an identifier that uniquely identifies each of the N individuals, and the feature amount ID is an identifier that uniquely identifies each of the M types of feature amounts.

In FIG. 3, the feature amount identified by the feature amount ID F00001 is a numerical value indicating the age of each individual. The feature amount identified by the feature amount ID F00002 is the gender of each individual, encoded as 0 (male) or 1 (female). For example, FIG. 3 shows that the individual identified by the user ID U00001 is 23 years old, the individual identified by U00002 is 34 years old, and the individual identified by U0XXXX is 18 years old.

Thus, each column of the feature matrix Ms is a column vector whose elements are the numerical values of one particular feature amount. The feature matrix Ms is an N-row, M-column matrix formed by arranging the M column vectors corresponding to the respective feature amounts. The feature matrix Ms is data that contains the personal information of each individual and, in the information processing system S according to the embodiment, includes information that is difficult to obtain for anyone other than the administrator who maintains the information processing device 1.

The teacher matrix acquisition unit 311 generates the feature matrix Ms by arranging a plurality of feature vectors, one generated for each of the M types of feature amounts and each having as elements numerical values indicating the feature amount associated with each of the N individuals. As an example, N is in the tens of thousands and M is in the thousands.

[Relationship between Feature Matrix Ms and Feature Factor Matrix F]
The matrix decomposition unit 312 decomposes the feature matrix Ms acquired by the teacher matrix acquisition unit 311 into the product of two matrices, a user factor matrix U and a feature factor matrix F, by using a known matrix decomposition technique such as alternating least squares (ALS). The matrix decomposition unit 312 stores the feature factor matrix F obtained by decomposing the feature matrix Ms in the feature factor matrix storage unit 21.
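As an illustration of the kind of decomposition described here, a basic alternating least squares loop can be sketched as follows. This is a generic textbook-style sketch, not the embodiment's procedure: the function name `als_factorize`, the iteration count, the random initialisation, and the small ridge term `reg` (added so that each linear solve stays well conditioned) are all assumptions.

```python
import numpy as np

def als_factorize(Ms, K, n_iters=50, reg=1e-3, seed=0):
    """Approximate Ms (N x M) as U (N x K) @ F (K x M) by alternating least squares."""
    rng = np.random.default_rng(seed)
    N, M = Ms.shape
    U = rng.normal(size=(N, K))
    F = rng.normal(size=(K, M))
    ridge = reg * np.eye(K)
    for _ in range(n_iters):
        # Hold F fixed and solve the regularised least-squares problem for U.
        U = np.linalg.solve(F @ F.T + ridge, F @ Ms.T).T
        # Hold U fixed and solve the regularised least-squares problem for F.
        F = np.linalg.solve(U.T @ U + ridge, U.T @ Ms)
    return U, F
```

Only F needs to be kept for later estimation; U describes the N training users and plays no part when new users are processed.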

FIG. 5 is a diagram for explaining the relationship among the feature matrix Ms, the user factor matrix U, and the feature factor matrix F. As shown in FIG. 5, the matrix decomposition unit 312 approximates the feature matrix Ms by the product of two matrices, the user factor matrix U and the feature factor matrix F. Here, the user factor matrix U is an N-row, K-column matrix, and the feature factor matrix F is a K-row, M-column matrix.

Here, K is a factor parameter that the matrix decomposition unit 312 sets when decomposing the feature matrix Ms, and may be any positive integer. A specific value of K may be determined experimentally, taking into account, for example, the rank of the feature matrix Ms. As one example, the K that minimizes the squared error e_sqr of the error matrix E may be chosen as follows.

The error matrix E is defined by the following equation (1).
E = Ms - U × F   (1)

誤差行列Ｅのｉ行ｊ列の成分をｅ_ｉｊとしたとき、Ｋの値は以下の式（２）で与えられる。

ここでａｒｇｍｉｎ（ｘ）は、ｘが最小となるパラメータを意味する。 _Assuming that the component at the i-th row and the j-th column of the error matrix E is e _ij , the value of K is given by the following equation (2).

Here, argmin (x) means a parameter that minimizes x.
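The selection of K by squared error can be illustrated with the following hedged sketch. It evaluates the sum of squared components of E = Ms − U × F for a few candidate values of K and picks the candidate with the smallest error. Truncated singular value decomposition is used here purely as a stand-in factorization to keep the example short; the function name squared_error, the candidate list, and the toy matrix are assumptions of this sketch.

```python
import numpy as np

def squared_error(Ms, K):
    """Return the sum of squared components of E = Ms - U @ F for a rank-K factorization."""
    # Truncated SVD is only a stand-in for the factorization step in this sketch.
    W, s, Vt = np.linalg.svd(Ms, full_matrices=False)
    U = W[:, :K] * s[:K]   # N x K user factor matrix
    F = Vt[:K, :]          # K x M feature factor matrix
    E = Ms - U @ F         # error matrix, equation (1)
    return float(np.sum(E ** 2))

rng = np.random.default_rng(0)
Ms = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 5))  # rank-2 toy matrix
candidates = [1, 2, 3]
errors = {K: squared_error(Ms, K) for K in candidates}
best_K = min(candidates, key=lambda K: errors[K])  # argmin over the candidates
print(errors, best_K)
```

With a truncated SVD, the error measured on Ms itself does not increase as K grows, so in this sketch the range of candidates effectively bounds the value that is selected.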

As shown in FIG. 5, the user factor matrix U has N rows, which matches the number of individuals, while the feature factor matrix F has M columns, which matches the number of feature amounts. From the relation feature matrix Ms ≈ user factor matrix U × feature factor matrix F, each row of the user factor matrix U can be regarded as holding information that represents one individual, and each column of the feature factor matrix F can be regarded as representing one feature amount. For example, the value of the feature amount identified by feature amount ID F00001 for the individual identified by user ID U00001 is obtained, as an approximation, by taking the inner product of the first row of the user factor matrix U and the first column of the feature factor matrix F. The teacher matrix acquisition unit 311 and the matrix decomposition unit 312 generate the feature factor matrix F each time the administrator who maintains the information processing apparatus 1 updates the feature matrix Ms, and update the feature factor matrix F stored in the feature factor matrix storage unit 21.
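The inner-product relation can be checked with a small numerical sketch. The matrices below are made-up values, and the assumption that user ID U00001 and feature amount ID F00001 correspond to index 0 is introduced only for illustration.

```python
import numpy as np

U = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])             # N = 3 individuals, K = 2 factors
F = np.array([[1.0, 0.0, 2.0, 1.0],
              [0.0, 1.0, 1.0, 2.0]])   # K = 2 factors, M = 4 feature types

user_row = 0      # assumed position of user ID U00001
feature_col = 0   # assumed position of feature amount ID F00001

# Inner product of one row of U and one column of F.
value = U[user_row, :] @ F[:, feature_col]
print(value)                                          # 1.0
print(np.isclose(value, (U @ F)[user_row, feature_col]))  # True
```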

[Estimation of Missing Data Using Feature Factor Matrix F]
On the premise that the feature factor matrix storage unit 21 stores the feature factor matrix F, the process of estimating missing data using the feature factor matrix F will now be described.

FIG. 6 is a diagram schematically illustrating the functional configuration of the information estimation unit 32 according to the embodiment. The information estimation unit 32 according to the embodiment includes a test matrix acquisition unit 321, a feature factor extraction unit 322, a user factor extraction unit 323, and a missing information estimation unit 324.
The test matrix acquisition unit 321 acquires a test matrix T from the communication terminal 2 via the network N. The test matrix T is a matrix formed by arranging a plurality of feature vectors, one for each of Q types (Q is an integer of 1 or more and less than M) among the M types of feature amounts, where each feature vector has as its elements numerical values indicating the feature amount associated with each of P individuals (P is an integer of 2 or more).

Here, the user of the communication terminal 2 causes the communication terminal 2 to generate the test matrix T by removing, from the M types of feature amounts associated with each of the P individuals, the (M−Q) types of feature amounts that the user does not wish to provide to the information processing apparatus 1. The P individuals need not be the individuals associated with the feature vectors constituting the feature matrix Ms, but the Q types of feature amounts are of the same kinds as feature amounts included in the feature matrix Ms.
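On the terminal side, removing the withheld feature amounts amounts to dropping columns before transmission. The following sketch illustrates this under assumed data: the feature amount IDs, the withheld ID, and the numerical values are all hypothetical.

```python
import numpy as np

feature_ids = ["F00001", "F00002", "F00003", "F00004"]  # M = 4 feature types
withheld_ids = {"F00003"}                               # the (M - Q) types to conceal

# Full P x M matrix held only by the terminal (P = 3 individuals).
full = np.array([[0.2, 1.0, 35.0, 0.0],
                 [0.7, 0.0, 52.0, 1.0],
                 [0.1, 1.0, 28.0, 1.0]])

# Keep only the columns that may be sent; their order is preserved.
keep = [j for j, fid in enumerate(feature_ids) if fid not in withheld_ids]
T = full[:, keep]                         # P x Q test matrix
sent_ids = [feature_ids[j] for j in keep]
print(T.shape, sent_ids)                  # (3, 3) ['F00001', 'F00002', 'F00004']
```

Sending sent_ids together with T is one way to let the receiving side identify which feature amounts are missing.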

The feature factor extraction unit 322 reads the feature factor matrix F from the feature factor matrix storage unit 21. From the feature factor matrix F thus read, the feature factor extraction unit 322 generates a test feature factor matrix Ft, obtained by extracting the components corresponding to the feature vectors constituting the test matrix T, and a missing feature factor matrix Fl composed of the remaining components. For this purpose, the feature factor extraction unit 322 also acquires from the communication terminal 2 information for identifying the missing feature amounts, that is, the feature amounts not included in the feature vectors constituting the test matrix T.

The information for identifying the missing feature amounts acquired by the feature factor extraction unit 322 may be, for example, the feature amount IDs that are not included in the feature vectors constituting the test matrix T, or the feature amount IDs that are included in them. In the latter case, the feature factor extraction unit 322 may identify as missing feature amounts the feature amounts whose IDs are other than those included in the feature vectors constituting the test matrix T.
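The split of F into Ft and Fl can be sketched as a column selection driven by feature amount IDs. The function name split_feature_factors, the flag ids_are_missing, and the sample IDs are assumptions made for this illustration; the sketch accepts either the IDs that are present in T or the IDs that are missing.

```python
import numpy as np

def split_feature_factors(F, feature_ids, ids, ids_are_missing=False):
    """Split F (K x M) column-wise into Ft (observed) and Fl (missing)."""
    listed = set(ids)
    in_list = [j for j, fid in enumerate(feature_ids) if fid in listed]
    not_in_list = [j for j, fid in enumerate(feature_ids) if fid not in listed]
    # Interpret the ID list either as the missing IDs or as the observed IDs.
    obs_cols, miss_cols = (not_in_list, in_list) if ids_are_missing else (in_list, not_in_list)
    return F[:, obs_cols], F[:, miss_cols]

feature_ids = ["F00001", "F00002", "F00003", "F00004"]
F = np.arange(8.0).reshape(2, 4)   # K = 2, M = 4 placeholder values

Ft, Fl = split_feature_factors(F, feature_ids, ["F00001", "F00002", "F00004"])
Ft2, Fl2 = split_feature_factors(F, feature_ids, ["F00003"], ids_are_missing=True)
print(Ft.shape, Fl.shape)          # (2, 3) (2, 1)
```

The columns of Ft must follow the same order as the columns of T for the subsequent least-squares step to be meaningful.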

The user factor extraction unit 323 calculates a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft. More specifically, the user factor extraction unit 323 calculates the test user factor matrix Ut from the test matrix T and the test feature factor matrix Ft using the least squares method. The missing information estimation unit 324 obtains, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, a missing matrix Ml composed of estimated values of the missing feature vectors not included in the feature vectors constituting the test matrix T. Each column of the missing matrix Ml is an estimated value of one missing feature vector. In this way, the missing information estimation unit 324 can obtain estimated values of the (M−Q) types of feature amounts that are not included in the test matrix T acquired from the communication terminal 2.
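The two calculations described above, least squares for Ut followed by the product Ut × Fl, can be sketched as follows. The function name estimate_missing and the toy data are assumptions of this sketch. The toy data are exactly low rank, so the estimate reproduces the withheld column; real data would only be approximated.

```python
import numpy as np

def estimate_missing(T, Ft, Fl):
    """Return (Ut, Ml): Ut minimizes ||T - Ut @ Ft|| and Ml = Ut @ Fl."""
    # T ~= Ut @ Ft is equivalent to Ft.T @ Ut.T ~= T.T, which lstsq solves directly.
    Ut_transposed, *_ = np.linalg.lstsq(Ft.T, T.T, rcond=None)
    Ut = Ut_transposed.T   # P x K test user factor matrix
    Ml = Ut @ Fl           # P x (M - Q) matrix of estimated missing feature amounts
    return Ut, Ml

rng = np.random.default_rng(0)
F = rng.standard_normal((2, 4))           # K = 2, M = 4 (stands in for the stored F)
full = rng.standard_normal((3, 2)) @ F    # P = 3 individuals, all M feature amounts
T, Ft, Fl = full[:, [0, 1, 3]], F[:, [0, 1, 3]], F[:, [2]]

Ut, Ml = estimate_missing(T, Ft, Fl)
print(Ut.shape, Ml.shape)                 # (3, 2) (3, 1)
```

For the least-squares solution to be unique, Ft needs at least as many columns as rows (Q ≥ K) and full row rank; otherwise np.linalg.lstsq returns the minimum-norm solution.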

FIG. 7 is a schematic diagram for explaining the estimation process executed by the information estimation unit 32 according to the embodiment. The steps (1) to (6) described below correspond to (1) to (6) in FIG. 7. The dashed rectangle in FIG. 7 is the same as the diagram of FIG. 5 explaining the relationship among the feature matrix Ms, the user factor matrix U, and the feature factor matrix F. The hatched matrices in FIG. 7 are the matrices obtained by the respective estimation steps.

(1) The teacher matrix acquisition unit 311 generates the feature matrix Ms by arranging a plurality of feature vectors, one for each of the M types of feature amounts, where each feature vector has as its elements numerical values indicating the feature amount associated with each of the N individuals. (2) The matrix decomposition unit 312 decomposes the feature matrix Ms acquired by the teacher matrix acquisition unit 311 into the product of the user factor matrix U and the feature factor matrix F using alternating least squares. The feature factor matrix storage unit 21 stores the feature factor matrix F that the matrix decomposition unit 312 generated by decomposing the feature matrix Ms. These steps are, so to speak, pre-processing that the information processing apparatus 1 executes in advance.

(3) The test matrix acquisition unit 321 acquires the test matrix T from the communication terminal 2 via the network N. (4) From the feature factor matrix F read from the feature factor matrix storage unit 21, the feature factor extraction unit 322 generates the test feature factor matrix Ft, obtained by extracting the components corresponding to the feature vectors constituting the test matrix T, and the missing feature factor matrix Fl composed of the remaining components. (5) The user factor extraction unit 323 uses the least squares method to calculate the test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft.

(6) The missing information estimation unit 324 obtains, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, the missing matrix Ml composed of estimated values of the missing feature vectors not included in the feature vectors constituting the test matrix T. Each column of the missing matrix Ml is a vector whose elements are estimated values of one of the (M−Q) types of feature amounts that are not included in the test matrix T acquired from the communication terminal 2.
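Steps (1) to (6) can be traced end to end in a short, hedged sketch. The sizes, the synthetic low-rank data, the choice of withheld column, and the use of truncated singular value decomposition in place of alternating least squares in step (2) are all assumptions made to keep the example compact; they are not part of the embodiment.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, M, K = 20, 3, 5, 2
B = rng.standard_normal((K, M))   # hypothetical structure shared by all individuals

# (1) Feature matrix Ms for N individuals and M feature types.
Ms = rng.standard_normal((N, K)) @ B
# (2) Factorize Ms ~= U @ F (truncated SVD as a stand-in for ALS) and keep F.
_, _, Vt = np.linalg.svd(Ms, full_matrices=False)
F = Vt[:K, :]

# (3) Test matrix T for P other individuals, with column 2 withheld by the terminal.
full_test = rng.standard_normal((P, K)) @ B
obs_cols, miss_cols = [0, 1, 3, 4], [2]
T = full_test[:, obs_cols]
# (4) Split F into the test part Ft and the missing part Fl.
Ft, Fl = F[:, obs_cols], F[:, miss_cols]
# (5) Least squares: find Ut such that T ~= Ut @ Ft.
Ut = np.linalg.lstsq(Ft.T, T.T, rcond=None)[0].T
# (6) Estimated values of the withheld feature amounts.
Ml = Ut @ Fl

print(Ml.shape)                                     # (3, 1)
print(np.abs(Ml - full_test[:, miss_cols]).max())   # estimation error on the toy data
```

Because the toy data are exactly rank K, the estimation error here is at the level of floating-point rounding; with real feature amounts the estimate is an approximation whose quality depends on how well Ms ≈ U × F holds.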

[Identification Processing Using Estimated Feature Amounts]
As described above, the information processing apparatus 1 according to the embodiment holds a classifier that uses feature amounts extracted from information that is difficult for individual users to obtain. Specifically, the classifier storage unit 22 of the information processing apparatus 1 stores a classifier obtained in advance by machine learning based on the feature amounts corresponding to the M types of feature vectors. From the test matrix T that the test matrix acquisition unit 321 acquired from the communication terminal 2 via the network N and the missing matrix Ml estimated by the missing information estimation unit 324, the information processing apparatus 1 can obtain a matrix with P rows and M columns that contains all M types of feature amounts. The identification execution unit 33 of the information processing apparatus 1 executes identification processing based on this matrix and the classifier.

FIG. 8 is a diagram schematically illustrating the functional configuration of the identification execution unit 33 according to the embodiment. The identification execution unit 33 according to the embodiment includes an estimation matrix generation unit 331, an identification unit 332, and a transmission unit 333. The estimation matrix generation unit 331 uses the test matrix T and the missing matrix Ml to generate an estimation matrix Me composed of M types of feature vectors for each of the P individuals.
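Generating the estimation matrix Me can be illustrated as placing the columns of T and Ml back at their original column positions. The function name assemble_estimation_matrix and the index lists obs_cols and miss_cols are assumptions of this sketch; the embodiment does not specify how column positions are tracked.

```python
import numpy as np

def assemble_estimation_matrix(T, Ml, obs_cols, miss_cols):
    """Combine T (P x Q) and Ml (P x (M - Q)) into Me (P x M) in original column order."""
    P = T.shape[0]
    M = len(obs_cols) + len(miss_cols)
    Me = np.empty((P, M))
    Me[:, obs_cols] = T      # feature amounts received from the terminal
    Me[:, miss_cols] = Ml    # feature amounts estimated on the apparatus side
    return Me

T = np.array([[1.0, 2.0, 4.0],
              [5.0, 6.0, 8.0]])   # observed columns 0, 1, 3
Ml = np.array([[3.0],
               [7.0]])            # estimate for missing column 2
Me = assemble_estimation_matrix(T, Ml, obs_cols=[0, 1, 3], miss_cols=[2])
print(Me)
```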

The identification unit 332 executes identification processing for each of the P individuals based on the feature vectors constituting the estimation matrix Me and the classifier. The transmission unit 333 transmits the result of the identification processing to the communication terminal 2 that transmitted the test matrix T.

[Functional Configuration of Communication Terminal 2]
FIG. 9 is a diagram schematically illustrating the functional configuration of the communication terminal 2. The communication terminal 2 includes a test matrix generation unit 201, a test matrix transmission unit 202, and an identification result reception unit 203.
The test matrix generation unit 201 starts from a matrix formed by arranging a plurality of feature vectors, one for each of the M types of feature amounts, where each feature vector has as its elements numerical values indicating the feature amount associated with each of the P individuals. From this matrix it removes the feature vectors corresponding to predetermined feature amounts, thereby generating the test matrix T composed of Q types of feature vectors. Here, the "predetermined feature amounts" are the feature amounts that the communication terminal 2 conceals from the information processing apparatus 1, namely the (M−Q) types of feature amounts described above.

The test matrix transmission unit 202 transmits the test matrix T to the information processing apparatus 1 via the network N. The identification result reception unit 203 receives the result of the identification processing that the information processing apparatus 1 executed for each of the P users using the test matrix T and the missing feature vectors estimated based on the feature factor matrix F. As a result, the user of the communication terminal 2 can obtain the result of identification processing by a classifier that presupposes the use of all of the feature amounts, while transmitting to the information processing apparatus 1 only information from which the (M−Q) types of feature amounts that the user does not wish to provide have been removed.

<Processing flow of the information processing method executed by information processing apparatus 1>
FIG. 10 is a flowchart for explaining the flow of information processing executed by the information processing apparatus 1 according to the embodiment. The processing in this flowchart starts, for example, when the information processing apparatus 1 is powered on.

The test matrix acquisition unit 321 acquires the test matrix T from the communication terminal 2 via the network N (S2). The feature factor extraction unit 322 reads the feature factor matrix F from the feature factor matrix storage unit 21 (S4). From the feature factor matrix F thus read, the feature factor extraction unit 322 acquires the test feature factor matrix Ft by extracting the components corresponding to the feature vectors constituting the test matrix T (S6).

The feature factor extraction unit 322 also extracts, from the feature factor matrix F thus read, the missing feature factor matrix Fl composed of the components other than those corresponding to the feature vectors constituting the test matrix T (S8). The user factor extraction unit 323 calculates the test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft (S10). The missing information estimation unit 324 obtains, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of the missing feature vectors not included in the feature vectors constituting the test matrix T (S12). When the missing information estimation unit 324 has obtained the estimated values of the missing feature vectors, the processing in this flowchart ends.

<Effects of Information Processing Apparatus 1>
As described above, the information processing apparatus 1 according to the embodiment can use some of the plurality of feature amounts used by a classifier to supplement the remaining feature amounts. Consequently, a person who wishes to use the classifier can obtain an identification result without providing confidential information such as personal information, even if the classifier relies on such information.

The present invention has been described above using an embodiment, but the technical scope of the present invention is not limited to the scope described in the above embodiment. It is apparent to those skilled in the art that various changes or improvements can be made to the above embodiment. In particular, the specific form of distribution and integration of the apparatus is not limited to what is illustrated above; all or part of the apparatus may be functionally or physically distributed or integrated in arbitrary units according to various additions and the like, or according to functional load.

The above description dealt with the case where each column of the feature matrix Ms is a vector formed by arranging a given feature amount for each individual. Here, the feature matrix generation unit 31 may generate the feature matrix Ms so that each column of the feature matrix Ms satisfies a predetermined normalization condition. The "predetermined normalization condition" is a condition imposed in common on each column or row of the feature matrix Ms. Specifically, the feature matrix generation unit 31 generates the feature matrix Ms so that the sum of each column or row of the feature matrix Ms is constant. Alternatively, the feature matrix generation unit 31 may generate the feature matrix Ms so that the variance of each column or row of the feature matrix Ms is constant. This makes it possible to equalize the weights associated with the columns or rows of the matrix being processed when the matrix decomposition unit 312 uses alternating least squares or when the user factor extraction unit 323 uses the least squares method.
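The two normalization conditions mentioned above can be illustrated for the column-wise case with the following sketch. The function names, the target values, and the sample matrix are assumptions of this sketch, and it assumes that no column has a zero sum or zero variance.

```python
import numpy as np

def normalize_column_sums(Ms, total=1.0):
    """Scale each column so that its sum equals `total` (column sums must be nonzero)."""
    return Ms * (total / Ms.sum(axis=0, keepdims=True))

def normalize_column_variances(Ms, variance=1.0):
    """Scale each column so that its variance equals `variance` (columns must not be constant)."""
    return Ms * (np.sqrt(variance) / Ms.std(axis=0, keepdims=True))

Ms = np.array([[1.0, 10.0],
               [2.0, 20.0],
               [3.0, 60.0]])   # two feature amounts on very different scales

by_sum = normalize_column_sums(Ms)
by_var = normalize_column_variances(Ms)
print(by_sum.sum(axis=0))   # each column now sums to the same value
print(by_var.var(axis=0))   # each column now has the same variance
```

Without such scaling, a least-squares fit tends to be dominated by the columns with the largest numerical range, which is the imbalance the normalization condition is meant to reduce.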

DESCRIPTION OF SYMBOLS
1 ... Information processing apparatus
2 ... Communication terminal
10 ... Communication unit
20 ... Storage unit
21 ... Feature factor matrix storage unit
22 ... Classifier storage unit
30 ... Control unit
31 ... Feature matrix generation unit
32 ... Information estimation unit
33 ... Identification execution unit
201 ... Test matrix generation unit
202 ... Test matrix transmission unit
203 ... Identification result reception unit
311 ... Teacher matrix acquisition unit
312 ... Matrix decomposition unit
321 ... Test matrix acquisition unit
322 ... Feature factor extraction unit
323 ... User factor extraction unit
324 ... Missing information estimation unit
331 ... Estimation matrix generation unit
332 ... Identification unit
333 ... Transmission unit
N ... Network
S ... Information processing system

Claims

An information processing apparatus comprising:
a feature factor matrix storage unit that stores a feature factor matrix F obtained when a matrix, formed by arranging a plurality of feature vectors generated respectively for M types of feature amounts (M is an integer of 2 or more), each feature vector having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F;
a test matrix acquisition unit that acquires a test matrix T formed by arranging a plurality of feature vectors generated respectively for Q types of feature amounts (Q is an integer of 1 or more and less than M) among the M types of feature amounts, each feature vector having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more);
a feature factor extraction unit that generates a test feature factor matrix Ft, obtained by extracting from the feature factor matrix F the components corresponding to the feature vectors constituting the test matrix T, and a missing feature factor matrix Fl composed of the remaining components;
a user factor extraction unit that calculates a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft; and
a missing information estimation unit that obtains, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of missing feature vectors not included in the feature vectors constituting the test matrix T.

The information processing apparatus according to claim 1, wherein the feature factor extraction unit further acquires information for identifying missing feature amounts, which are feature amounts not included in the feature vectors constituting the test matrix T.

The information processing apparatus according to claim 1 or 2, further comprising:
a teacher matrix acquisition unit that acquires a matrix formed by arranging a plurality of feature vectors generated respectively for the M types of feature amounts, each feature vector having as elements numerical values indicating the feature amount associated with each of the N users; and
a matrix decomposition unit that decomposes the matrix acquired by the teacher matrix acquisition unit into the product of two matrices, the user factor matrix U and the feature factor matrix F, using alternating least squares,
wherein the feature factor matrix storage unit stores the feature factor matrix F obtained by the decomposition performed by the matrix decomposition unit.

The information processing apparatus according to any one of claims 1 to 3, wherein the user factor extraction unit calculates the test user factor matrix Ut from the test matrix T and the test feature factor matrix Ft using the least squares method.

The information processing apparatus according to any one of claims 1 to 4, wherein the feature amounts are feature amounts relating to personal information of the users.

The information processing apparatus according to any one of claims 1 to 5, further comprising:
a classifier storage unit that stores a classifier obtained in advance by machine learning based on the feature amounts corresponding to the M types of feature vectors;
an estimation matrix generation unit that uses the test matrix T and the missing feature vectors to generate an estimation matrix Me composed of M types of feature vectors for each of the P users;
an identification unit that executes identification processing for each of the P users based on the feature vectors constituting the estimation matrix Me and the classifier; and
a transmission unit that transmits the result of the identification processing to the communication terminal that transmitted the test matrix T.

An information processing method in which a processor executes:
a step of acquiring a feature factor matrix F obtained when a matrix, formed by arranging a plurality of feature vectors generated respectively for M types of feature amounts (M is an integer of 2 or more), each feature vector having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F;
a step of acquiring a test matrix T formed by arranging a plurality of feature vectors generated respectively for Q types of feature amounts (Q is an integer of 1 or more and less than M) among the M types of feature amounts, each feature vector having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more);
a step of generating a test feature factor matrix Ft, obtained by extracting from the feature factor matrix F the components corresponding to the feature vectors constituting the test matrix T, and a missing feature factor matrix Fl composed of the remaining components;
a step of calculating a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft; and
a step of obtaining, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of missing feature vectors not included in the feature vectors constituting the test matrix T.

A program that causes a computer to realize:
a function of acquiring a feature factor matrix F obtained when a matrix, formed by arranging a plurality of feature vectors generated respectively for M types of feature amounts (M is an integer of 2 or more), each feature vector having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F;
a function of acquiring a test matrix T formed by arranging a plurality of feature vectors generated respectively for Q types of feature amounts (Q is an integer of 1 or more and less than M) among the M types of feature amounts, each feature vector having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more);
a function of generating a test feature factor matrix Ft, obtained by extracting from the feature factor matrix F the components corresponding to the feature vectors constituting the test matrix T, and a missing feature factor matrix Fl composed of the remaining components;
a function of calculating a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft; and
a function of obtaining, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of missing feature vectors not included in the feature vectors constituting the test matrix T.

An information processing system comprising:
a communication terminal; and
an information processing apparatus that communicates with the communication terminal via a network,
wherein the information processing apparatus comprises:
a feature factor matrix storage unit that stores a feature factor matrix F obtained when a matrix, formed by arranging a plurality of feature vectors generated respectively for M types of feature amounts (M is an integer of 2 or more), each feature vector having as elements numerical values indicating the feature amount associated with each of N users (N is an integer of 2 or more), is approximated by the product of two matrices, a user factor matrix U and the feature factor matrix F;
a test matrix acquisition unit that acquires, from the communication terminal via the network, a test matrix T formed by arranging a plurality of feature vectors generated respectively for Q types of feature amounts (Q is an integer of 1 or more and less than M) among the M types of feature amounts, each feature vector having as elements numerical values indicating the feature amount associated with each of P users (P is an integer of 2 or more);
a feature factor extraction unit that generates a test feature factor matrix Ft, obtained by extracting from the feature factor matrix F the components corresponding to the feature vectors constituting the test matrix T, and a missing feature factor matrix Fl composed of the remaining components;
a user factor extraction unit that calculates a test user factor matrix Ut such that the test matrix T is approximated by the product of the test user factor matrix Ut and the test feature factor matrix Ft; and
a missing information estimation unit that obtains, from the product of the test user factor matrix Ut and the missing feature factor matrix Fl, estimated values of missing feature vectors not included in the feature vectors constituting the test matrix T,
and wherein the communication terminal comprises:
a test matrix generation unit that generates the test matrix T composed of Q types of feature vectors by removing feature vectors corresponding to predetermined feature amounts from a matrix formed by arranging a plurality of feature vectors generated respectively for the M types of feature amounts, each feature vector having as elements numerical values indicating the feature amount associated with each of the P users; and
a test matrix transmission unit that transmits the test matrix T to the information processing apparatus via the network.