JP5931021B2

JP5931021B2 - Personal recognition tendency model learning device, personal recognition state estimation device, personal recognition tendency model learning method, personal recognition state estimation method, and program

Info

Publication number: JP5931021B2
Application number: JP2013162921A
Authority: JP
Inventors: 史朗熊野; 和弘大塚; 昌史松田; 亮石井; 淳司大和
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2013-08-06
Filing date: 2013-08-06
Publication date: 2016-06-08
Anticipated expiration: 2033-08-06
Also published as: JP2015032233A

Description

The present invention relates to a technique for learning how multiple observers tend to interpret a person's state and for estimating how a specific observer interprets that state.

Face-to-face dialogue is the most basic form of communication for conveying and sharing information, understanding other people's feelings, and making decisions. Conducting a dialogue well is not always easy, however, and minor misunderstandings often damage personal relationships. There is therefore a demand for information technology, such as machine-mediated dialogue and dialogue agents, that improves the quality and efficiency of dialogue. Achieving this requires not only recognizing the interlocutors' behavior, such as facial expressions and gestures, but also understanding their emotions and personalities and the relationships among them. Research on automatic dialogue analysis began with automatic behavior recognition, but in recent years it has shifted toward estimating these internal states of the interlocutors.

In communication, an interlocutor's internal state has two distinct aspects. One is the true state, that is, the state that actually arises within the person. The other is the perceived state, that is, how the state is interpreted by others. Most studies to date have aimed at estimating the true state. However, the perceived state is also very important, particularly in dialogue, where internal states are repeatedly expressed through behavior and perceived from that behavior.

In recent years, estimating perceived states such as emotion and mood has attracted attention in the field of Kansei (affective) information engineering. Previous studies have focused exclusively on associating the perceived state with the verbal and nonverbal behavior of the perceived person; for example, if the target person is smiling, the perceived state is happiness. Psychological research has shown that how an observer interprets such behavior is subjective, that is, it depends on the observer, and that interpretations vary widely. To eliminate this subjectivity and variation, previous studies have collected interpretations from multiple observers, treated a representative value (for example, the mean or mode for Japanese people, or for men or women in their thirties) as the objective interpretation, and tried to estimate that value correctly. One technique for estimating an objective interpretation of an interlocutor's internal state from the interlocutor's behavior is described in Patent Document 1.

Patent Document 1: JP 2011-185727 A

In the prior art, estimating the subjective interpretation of one particular observer, such as how that observer's interpretations are biased and how the observer interprets a given target, required training a model by supervised learning on labels of that observer's own interpretations. In other words, most prior art cannot estimate a single observer's subjective interpretation without using that observer's interpretation data.

Several methods have been proposed for estimating how the empathy state between two people in a dialogue is interpreted by a group of external observers. With the empathy state taking one of the three values {empathy / antipathy / neither}, these methods estimate the distribution of the proportion of observers who interpret a given dialogue as each state. However, most conventional models take no account of the observers' personal characteristics, so they cannot estimate the subjective interpretation of a specific observer.

An object of the present invention is to correctly estimate the subjective interpretation that a specific observer holds of an internal state, such as a person's emotion or the relationship between people in a dialogue.

To solve the above problem, an interpersonal recognition tendency model learning device according to one aspect of the present invention includes an interpersonal recognition labeling unit, a behavior recognition unit, an observer characteristic input unit, an objective behavior recognition tendency model learning unit, and a subjective interpersonal recognition tendency model learning unit. The interpersonal recognition labeling unit generates an interpretation label set in which interpersonal recognition states are labeled in time series; an interpersonal recognition state is the state of a person in a learning video, which captures at least one person, as interpreted by each of multiple observers on the basis of that video. The observer characteristic input unit inputs observer characteristics representing the personal characteristics of each of the observers. The objective behavior recognition tendency model learning unit uses the interpretation label set and a learning behavior time series to learn the parameters of an objective behavior recognition tendency model, which represents the likelihood of an interpersonal recognition state given a behavior time series. The subjective interpersonal recognition tendency model learning unit uses the observer characteristics and the interpretation label set to learn the parameters of a subjective interpersonal recognition tendency model, which represents the likelihood of an interpersonal recognition state given personal characteristics.

An interpersonal recognition state estimation device according to another aspect of the present invention includes a subjective interpersonal recognition tendency model storage unit, an objective behavior recognition tendency model storage unit, a subjective interpersonal recognition tendency estimation unit, an objective behavior recognition tendency estimation unit, and a subjective interpersonal recognition state estimation unit. The subjective interpersonal recognition tendency model storage unit stores the subjective interpersonal recognition tendency model learned by the interpersonal recognition tendency model learning device. The objective behavior recognition tendency model storage unit stores the objective behavior recognition tendency model learned by the same learning device. The subjective interpersonal recognition tendency estimation unit uses the subjective interpersonal recognition tendency model to obtain the subjective interpersonal recognition tendency of a target observer from the input personal characteristics of that observer. The objective behavior recognition tendency estimation unit uses the objective behavior recognition tendency model to obtain the objective behavior recognition tendency from an estimation behavior time series, in which the behavior of each person in an estimation video, which captures at least one person, is labeled in time series on the basis of that video. The subjective interpersonal recognition state estimation unit obtains a subjective interpersonal recognition state estimation result using the subjective interpersonal recognition tendency and the objective behavior recognition tendency.

According to the present invention, it is possible to correctly estimate the subjective interpretation that a specific observer holds of an internal state, such as a person's emotion or the relationship between people in a dialogue.

FIG. 1 is a diagram illustrating the functional configuration of the interpersonal recognition tendency model learning device.
FIG. 2 is a diagram illustrating the processing flow of the interpersonal recognition tendency model learning method.
FIG. 3 is a diagram illustrating the functional configuration of the interpersonal recognition state estimation device.
FIG. 4 is a diagram illustrating the processing flow of the interpersonal recognition state estimation method.
FIG. 5 is a diagram for explaining the subjective interpersonal recognition tendency model.
FIG. 6 is a diagram illustrating the functional configuration of an interpersonal recognition state estimation device of an application example.

Embodiments of the present invention are described in detail below. In the drawings, components having the same function are given the same reference numeral, and redundant description is omitted.

[Points of the Invention]
Before the embodiments are described, the key points of the invention are explained. The central point is the construction of a model that associates an observer's subjective interpersonal recognition tendency with that observer's personal characteristics (for example, gender and psychological scale scores). The subjective interpersonal recognition tendency is the tendency of how an observer interprets other people's internal states, and it is expressed here as a one-dimensional probability distribution. Examples include a tendency toward positively or negatively biased interpretations, a tendency toward neutral interpretations (central tendency), and a tendency to avoid neutral interpretations (extreme response tendency).

In the embodiment described below, an observer's personality is represented as a latent variable that cannot be observed directly, and each observer's personality is expressed as a mixture of typical observer personalities (hereinafter, typical personalities) acquired automatically from the training data. Each typical personality is assumed to have its own parameters for the interpersonal recognition tendency and for the personal characteristics, and the tendency and the characteristics are assumed to be determined probabilistically from those parameters.

The subjective interpersonal recognition tendency estimated from such a probabilistic model is taken as prior information and combined with the objective behavior recognition tendency, estimated by existing techniques, as posterior information. This makes it possible to estimate how the observer interprets other people's emotions. The objective behavior recognition tendency is the tendency of how interpretations vary across a group of observers when they interpret emotions from other people's behavior. Two assumptions make this combination possible. The first is that the way an observer's subjective interpretation changes from one behavior to another is the same for all observers; for example, how much the interpretation changes, in relative terms, between another person's smiling face and angry face does not depend on the observer. The second is that the interpretation bias (the subjective interpersonal recognition tendency) differs from observer to observer. Together, the two assumptions imply that observers differ in how they interpret a given behavior of another person.
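The combination described above can be illustrated with a small sketch. The text in this section does not give the combination formula, so the elementwise product followed by renormalization used here is an assumption, as are the function name `combine` and all numeric values. The objective tendency is shared by both observers (first assumption), while each observer has a different bias (second assumption).

```python
# Hypothetical sketch: the multiply-and-renormalize rule is an assumption,
# not a formula taken from the patent text.
STATES = ("empathy", "antipathy", "neither")


def combine(subjective_tendency, objective_tendency):
    """Return the normalized elementwise product of two distributions over STATES."""
    if len(subjective_tendency) != len(STATES) or len(objective_tendency) != len(STATES):
        raise ValueError("both inputs need one probability per state")
    unnormalized = [p * q for p, q in zip(subjective_tendency, objective_tendency)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("the two distributions share no support")
    return [u / total for u in unnormalized]


# Illustrative values only.
objective = [0.5, 0.2, 0.3]           # same for every observer
positive_observer = [0.6, 0.1, 0.3]   # bias toward "empathy"
negative_observer = [0.1, 0.6, 0.3]   # bias toward "antipathy"

for name, bias in (("positive", positive_observer), ("negative", negative_observer)):
    result = combine(bias, objective)
    best = STATES[max(range(len(STATES)), key=result.__getitem__)]
    print(name, [round(p, 3) for p in result], "->", best)
```

With these illustrative numbers the two observers arrive at different most-probable interpretations of the same behavior, which is the situation the two assumptions are meant to capture.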

The interpersonal recognition tendency to be handled may use data on any of the nominal (categorical), ordinal, interval, and ratio scales. The psychological scales may be scores measured with existing psychological questionnaires.

In this invention, the internal state to be estimated may be a basic emotion of a single person, such as joy, anger, or sadness, or a state of a dialogue among two or more people, such as empathy or antipathy. The embodiment described below targets the state of empathy or antipathy between two people in a dialogue. Empathy and antipathy in dialogue are closely related to conformity pressure (the feeling that one must go along when many others share an opinion different from one's own) and are basic elements in building consensus and forming human relationships.

[Embodiment]
One embodiment of the present invention consists of an interpersonal recognition tendency model learning device, which learns a subjective interpersonal recognition tendency model from the results of a group of observers interpreting a learning video, and an interpersonal recognition state estimation device, which uses the learned interpersonal recognition tendency model to estimate how a specific observer would interpret the internal state of a person in an estimation video. The two devices need not be separate; they can be configured as a single device that includes each unit of both devices without duplication.

<Interpersonal recognition tendency model learning device>
An example of the functional configuration of the interpersonal recognition tendency model learning device according to the embodiment is described with reference to FIG. 1. The device produces, as intermediate outputs, video of one or more persons and interpretation labels assigned manually by multiple observers on the basis of that video, and finally outputs the parameters of the subjective interpersonal recognition tendency model and the parameters of the objective behavior recognition tendency model.

The interpersonal recognition tendency model learning device 1 according to the embodiment includes a video input unit 10, an interpersonal recognition labeling unit 11, an observer characteristic input unit 12, a behavior recognition unit 13, a subjective interpersonal recognition tendency model learning unit 14, an objective behavior recognition tendency model learning unit 15, a subjective interpersonal recognition tendency model storage unit 20, and an objective behavior recognition tendency model storage unit 21. The device 1 is, for example, a special device configured by loading a special program into a known or dedicated computer that has a central processing unit (CPU), a main storage device (random access memory, RAM), and the like. The device 1 executes each process under the control of the central processing unit, for example. Data input to the device 1 and data obtained in each process are stored, for example, in the main storage device, and are read out as needed for use in other processes. Each storage unit of the device 1 can be implemented by, for example, a main storage device such as RAM; an auxiliary storage device composed of a hard disk, an optical disc, or a semiconductor memory element such as flash memory; or middleware such as a relational database or a key-value store. The storage units need only be logically separate and may reside on a single physical storage device.

An example of the processing flow of the interpersonal recognition tendency model learning method is described with reference to FIG. 2, in the order in which the steps are actually performed.

In step S10, the video input unit 10 inputs and stores a learning video that captures one or more persons. The learning video is used to train the subjective interpersonal recognition tendency model and the objective behavior recognition tendency model, so there must be enough of it for training. When the subject is a multi-party dialogue, one camera may be provided for each interlocutor, or a single omnidirectional camera, for example one with a fisheye lens, may capture all interlocutors. The captured video is stored on a medium such as a hard disk drive (HDD). If the dialogue video contains three or more people, every scene in which two of them are interacting is extracted. If the dialogue is between two people, the video does not need to be divided. If the target is the internal state of a single person, such as a basic emotion, each interlocutor is extracted individually. The result of the division may be output as a separate video file for each scene, or the time and position at which each scene is cut from the original dialogue video may be recorded. The subsequent processing of the learning device is performed on every scene in the video.

In step S11, the interpersonal recognition labeling unit 11 presents each scene of the learning video to multiple observers (hereinafter, the learning observer group), and each observer assigns to each scene a label indicating one interpretation from {empathy / antipathy / neither}. The complete set of interpretation labels assigned by the observers is hereinafter called the interpretation label set. The generated interpretation label set is output to the subjective interpersonal recognition tendency model learning unit 14 and the objective behavior recognition tendency model learning unit 15.
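As a rough illustration of what an interpretation label set might look like in code, the sketch below stores one label per observer and scene and derives two simple summaries: the share of observers choosing each label for a scene, and the share of scenes to which one observer assigned each label. The nested-dictionary layout, the identifiers, and the helper names are hypothetical; the patent text does not prescribe a data format.

```python
from collections import Counter

LABELS = ("empathy", "antipathy", "neither")

# Hypothetical layout: label_set[observer_id][scene_id] -> one label.
label_set = {
    "obs1": {"scene1": "empathy", "scene2": "neither"},
    "obs2": {"scene1": "empathy", "scene2": "antipathy"},
    "obs3": {"scene1": "neither", "scene2": "antipathy"},
}


def _fractions(votes):
    """Turn label counts into fractions over LABELS."""
    n = sum(votes.values())
    if n == 0:
        raise ValueError("no labels to count")
    return {label: votes[label] / n for label in LABELS}


def scene_distribution(labels_by_observer, scene_id):
    """Share of observers who chose each label for one scene."""
    votes = Counter(
        scenes[scene_id]
        for scenes in labels_by_observer.values()
        if scene_id in scenes
    )
    return _fractions(votes)


def observer_frequencies(labels_by_observer, observer_id):
    """Share of scenes to which one observer assigned each label."""
    return _fractions(Counter(labels_by_observer[observer_id].values()))


print(scene_distribution(label_set, "scene1"))
print(observer_frequencies(label_set, "obs2"))
```

The per-observer frequencies are only an empirical summary of one observer's labels; they are not the learned subjective interpersonal recognition tendency model itself.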

In step S12, the observer characteristic input unit 12 inputs the observers' personal characteristics (gender and psychological scale scores). The input personal characteristics are output to the subjective interpersonal recognition tendency model learning unit 14. For the psychological scale scores, a psychological questionnaire is presented, the answers to its questions are entered, and the scores are calculated automatically. Alternatively, the scores may be calculated in advance, for example from paper questionnaires, and then entered. The observers handled by the observer characteristic input unit 12 are the learning observer group whose members labeled their interpretations in the interpersonal recognition labeling unit 11. In the processing of the interpersonal recognition state estimation device described later, the observer handled is the observer whose interpretation is to be estimated. That observer may or may not belong to the learning observer group.

The observers' personal characteristics are now described in more detail. As psychological scales, for example, the Big Five personality inventory and Davis's Interpersonal Reactivity Index (IRI) can be used. In this invention, any number of characteristics of any kind can be used, provided they are quantified in a form a computer can process. The Big Five personality inventory consists of five factors: extraversion, neuroticism (emotional instability), openness, conscientiousness, and agreeableness. For each factor, 12 trait words are provided, 60 in total (for example, "talkative", "prone to worry", "original"). For each word the respondent selects one answer on a scale from "applies very well" (score = 7) to "does not apply at all" (score = 1), and the answers are summed per factor to give the five factor scores (a five-dimensional vector whose elements each take a value between 12 and 84). Davis's IRI is handled in the same way: for each of the four factors, perspective taking, empathic concern, fantasy, and personal distress, a total is calculated from the answers to seven questions such as "When making a decision, I try to consider the position of people whose opinions are opposite to mine", which yields a four-dimensional vector. Thus, when both the Big Five personality inventory and Davis's IRI are used, the characteristics are expressed as a 5 + 4 = 9-dimensional vector.
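The scoring just described can be sketched as follows. The factor identifiers, the dictionary input format, and the function names are hypothetical. The 1 to 7 range check applies only to the Big Five items, because the text does not state the response scale of the IRI items, and reverse-scored items are not handled since the text does not mention them.

```python
BIG_FIVE = ("extraversion", "neuroticism", "openness",
            "conscientiousness", "agreeableness")
IRI = ("perspective_taking", "empathic_concern", "fantasy", "personal_distress")


def big_five_scores(answers):
    """answers: dict factor -> list of 12 item scores, each from 1 to 7."""
    scores = []
    for factor in BIG_FIVE:
        items = answers[factor]
        if len(items) != 12 or any(not 1 <= a <= 7 for a in items):
            raise ValueError(f"{factor}: expected 12 answers in the range 1..7")
        scores.append(sum(items))  # each factor score lies between 12 and 84
    return scores


def iri_scores(answers):
    """answers: dict factor -> list of 7 item scores (scale not given in the text)."""
    scores = []
    for factor in IRI:
        items = answers[factor]
        if len(items) != 7:
            raise ValueError(f"{factor}: expected 7 answers")
        scores.append(sum(items))
    return scores


def trait_vector(big_five_answers, iri_answers):
    """Concatenate the 5 Big Five scores and the 4 IRI scores into 9 dimensions."""
    return big_five_scores(big_five_answers) + iri_scores(iri_answers)


# Illustrative answers: every Big Five item answered 4, every IRI item answered 2.
example = trait_vector({f: [4] * 12 for f in BIG_FIVE}, {f: [2] * 7 for f in IRI})
print(len(example), example)
```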

In step S13, the behavior recognition unit 13 generates, from the learning video acquired by the video input unit 10, a learning behavior time series in which the behavior of each person in the video is labeled in time series. The generated learning behavior time series is output to the objective behavior recognition tendency model learning unit 15. This embodiment targets four types of behavior: facial expression, gaze, head gesture, and presence or absence of speech. Facial expression is the main channel for expressing emotion. Gaze is needed to indicate to whom an emotion is directed and to observe other people's behavior. Head gestures are often produced as an expression of attitude toward another person's opinion. The presence or absence of speech is the main indicator of the dialogue role of speaker or listener. Facial expression takes six states: {neutral / smile / laughter / wry smile / thinking / other}. Gaze indicates which interlocutor a person is looking at; its set of states is {one of the others (as many states as there are interlocutors) / not looking at anyone}, giving (number of interlocutors + 1) states. Head gesture takes four basic states, {none / nod / shake / tilt}, and combinations of these may also be used. Speech takes two states: {speaking / silent}. Each behavior may be recognized with a separate external recognition device. For facial expression recognition, for example, the recognition device described in Reference 1 below can be used. Like the interpersonal recognition labeling unit 11, the behavior recognition unit 13 may instead output the results of manual labeling.
[Reference 1] Japanese Patent No. 4942197
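The four behavior types and their state sets could be represented as below. This is only a sketch: the class names, the `BehaviorFrame` record, and its fields are hypothetical, and the gaze states are generated from a list of the other interlocutors so that the count is always that list's length plus one for "not looking at anyone".

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Expression(Enum):
    NEUTRAL = "neutral"
    SMILE = "smile"
    LAUGHTER = "laughter"
    WRY_SMILE = "wry smile"
    THINKING = "thinking"
    OTHER = "other"


class HeadGesture(Enum):
    NONE = "none"
    NOD = "nod"
    SHAKE = "shake"
    TILT = "tilt"


class Speech(Enum):
    SPEAKING = "speaking"
    SILENT = "silent"


def gaze_states(others: List[str]) -> List[Optional[str]]:
    """One state per other interlocutor, plus None for 'not looking at anyone'."""
    return list(others) + [None]


@dataclass(frozen=True)
class BehaviorFrame:
    """One time step of one person's labeled behavior (hypothetical record)."""
    time_sec: float
    person: str
    expression: Expression
    gaze_target: Optional[str]
    head_gesture: HeadGesture
    speech: Speech


# A behavior time series is then an ordered list of frames.
series = [
    BehaviorFrame(0.0, "A", Expression.SMILE, "B", HeadGesture.NOD, Speech.SILENT),
    BehaviorFrame(0.5, "A", Expression.NEUTRAL, None, HeadGesture.NONE, Speech.SPEAKING),
]
print(len(gaze_states(["B", "C"])), series[0].expression.value)
```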

In step S14, the subjective interpersonal recognition tendency model learning unit 14 uses the observer characteristics input from the observer characteristic input unit 12 and the interpretation label set output from the interpersonal recognition labeling unit 11 to learn the parameters of the subjective interpersonal recognition tendency model, which associates an observer's interpersonal recognition tendency with the observer's personal characteristics.

<<Overview of the subjective interpersonal recognition tendency model>>
The subjective interpersonal recognition tendency model is described in detail below with reference to FIG. 5. This embodiment uses a probabilistic topic model to associate the subjective interpersonal recognition tendency with the personal characteristics of the target observer. A probabilistic topic model can estimate the latent topics hidden in various kinds of discrete data. A probabilistic model is used because it allows the various external factors and missing information involved in the recognition process, such as ambiguity and contextual information, to be handled effectively within the framework of Bayesian estimation.

In this embodiment, an observer's personality is represented as a latent variable that cannot be observed directly, and each observer's personality is expressed as a mixture of K typical personalities acquired automatically from the training data. Each typical personality is assumed to have its own parameters for the interpersonal recognition tendency and for the personal characteristics, from which the tendency and the characteristics are determined probabilistically. A typical personality corresponds to a topic in the probabilistic topic model. The number K of typical personalities may be chosen by weighing the time required to train the model against the level of detail desired in the estimated personality; in this embodiment, K = 4. A typical personality is represented by the K-dimensional vector z of equation (1).

Here, z is a 1-of-K representation: exactly one of its K elements is 1 and all the others are 0. For example, for the k-th typical personality (1 ≤ k ≤ K), z_k = 1 and the remaining elements are 0.

When there are N observers, the personality of observer j (j ∈ {1, …, N}) is expressed by the following equation (2) using this vector z.

Here, P(z_{j,k} = 1) represents the proportion of the k-th typical personality in the personality of observer j, and the following equation (3) holds.
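The images for equations (1) to (3) are not reproduced in this text. Judging only from the surrounding definitions, they plausibly take a form like the following. The exact notation is an assumption rather than the patent's own typesetting; in particular, the symbol p_j in (2) is introduced here purely for illustration.

```latex
\begin{align}
\mathbf{z} &= (z_1, \dots, z_K)^{\top}, \qquad z_k \in \{0, 1\}, \qquad \sum_{k=1}^{K} z_k = 1 \tag{1} \\
\mathbf{p}_j &= \bigl( P(z_{j,1} = 1), \dots, P(z_{j,K} = 1) \bigr)^{\top} \tag{2} \\
\sum_{k=1}^{K} P(z_{j,k} = 1) &= 1 \tag{3}
\end{align}
```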

In this embodiment, a generative model is used as the subjective interpersonal cognitive tendency model. Here, "generative model" means that the personality of observer j, expressed by the latent variable, is first determined probabilistically according to the prior probability π of the typical personalities, and the subjective interpersonal cognitive tendency and the personal characteristics are then determined probabilistically according to that personality. The prior probability π is expressed by the following equation (4).

Note that the prior probability π of the typical personalities is assumed to be the same for all observers.

The subjective interpersonal cognitive tendency of each observer is represented by a probability distribution α, shown in the following equation (5).

Here, the probability α_e represents the probability that the interpersonal cognitive state e ∈ {1, …, N_e} is selected by the target observer, and the following equation (6) holds.
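Equations (4) to (6) are likewise missing from this text. A reading consistent with the definitions of π, α, and α_e given in the description would be the following; this is a reconstruction under that assumption, not the original typesetting.

```latex
\begin{align}
\boldsymbol{\pi} &= (\pi_1, \dots, \pi_K)^{\top}, \qquad \pi_k = P(z_k = 1) \tag{4} \\
\boldsymbol{\alpha} &= (\alpha_1, \dots, \alpha_{N_e})^{\top} \tag{5} \\
\sum_{e=1}^{N_e} \alpha_e &= 1 \tag{6}
\end{align}
```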

In this embodiment, the process by which the k-th typical personality generates an interpretation of the interpersonal cognitive state e is modeled by a multinomial distribution, written M(e | 1, α_k). The parameter α_k of the multinomial distribution represents the subjective interpersonal cognitive tendency of the k-th typical personality. Similarly, the k-th typical personality is assumed to determine the observer's gender g ∈ {0, 1} according to the binomial distribution B(g | 1, β_k), whose parameter β_k represents the proportion of men (g = 1) in the k-th typical personality. The psychological scale score s is assumed to be generated by the k-th typical personality according to the normal distribution N(s | μ_k, Σ_k). The psychological scale is assumed to consist of N_s scales; that is, with s_i denoting the score on the i-th scale, s can be expressed by the following equation (7).

The parameters μ_k and Σ_k of the normal distribution represent the mean vector and the covariance matrix, respectively, of the psychological scale score s of the k-th typical personality. In this embodiment, for mathematical simplicity, the psychological scale scores are assumed to be independent across scales given the latent personality of the observer. The multivariate normal distribution N(s | μ_k, Σ_k) is then a product of univariate normal distributions.
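Written out, the independence assumption amounts to a diagonal covariance matrix. The first line below is a plausible form of the missing equation (7); the per-scale symbols μ_{k,i} and σ_{k,i}^2 are notation assumed here for illustration.

```latex
\begin{align}
\mathbf{s} &= (s_1, \dots, s_{N_s})^{\top} \tag{7} \\
\mathcal{N}(\mathbf{s} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)
  &= \prod_{i=1}^{N_s} \mathcal{N}\bigl(s_i \mid \mu_{k,i}, \sigma_{k,i}^{2}\bigr),
  \qquad \boldsymbol{\Sigma}_k = \operatorname{diag}\bigl(\sigma_{k,1}^{2}, \dots, \sigma_{k,N_s}^{2}\bigr) \notag
\end{align}
```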

The parameters of the subjective interpersonal cognitive tendency model described above can be summarized as shown in the following equation (8).

In this embodiment, the joint probability P(e_j, g_j, s_j | Θ) of the interpersonal cognitive state e_j generated by observer j, the gender g_j of observer j, and the psychological scale score s_j of observer j, given the subjective interpersonal cognitive tendency model parameters Θ, is modeled. Marginalizing out the latent variable yields the following equation (9), from which this joint probability can be computed.
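Equation (9) is not reproduced in this text. Given the component distributions named above, marginalizing the 1-of-K latent variable presumably gives a mixture of the form P(e, g, s | Θ) = Σ_k π_k M(e | 1, α_k) B(g | 1, β_k) N(s | μ_k, Σ_k). The following is a minimal sketch under that assumption; the dictionary layout of `theta`, the function names, and the toy values are all hypothetical.

```python
import math

def diag_normal(s, mu, var):
    """Product of univariate normal densities (scales assumed independent)."""
    p = 1.0
    for x, m, v in zip(s, mu, var):
        p *= math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
    return p

def joint_probability(e, g, s, theta):
    """Assumed form of eq. (9): sum over k of pi_k * M(e) * B(g) * N(s)."""
    total = 0.0
    for k, pi_k in enumerate(theta["pi"]):
        p_e = theta["alpha"][k][e]                    # M(e | 1, alpha_k)
        beta_k = theta["beta"][k]
        p_g = beta_k if g == 1 else 1.0 - beta_k      # B(g | 1, beta_k)
        p_s = diag_normal(s, theta["mu"][k], theta["var"][k])
        total += pi_k * p_e * p_g * p_s
    return total

# Hypothetical toy parameters: K = 2 personalities, 3 states, 1 scale.
theta = {
    "pi": [0.5, 0.5],
    "alpha": [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]],
    "beta": [0.8, 0.3],
    "mu": [[0.0], [1.0]],
    "var": [[1.0], [1.0]],
}
print(joint_probability(0, 1, [0.3], theta))
```

Summing this quantity over all states e and both genders g leaves the marginal density of s, which is a quick sanity check on any implementation.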

<< Learning method of the subjective interpersonal cognitive tendency model >>
The learning method of the subjective interpersonal cognitive tendency model is described in detail below. The subjective interpersonal cognitive tendency model learning unit 14 learns the model parameters using the gender and psychological scale scores of each observer in the learning observer group, together with the set of interpersonal cognition labels that each observer in the group assigned to each scene of the learning video. In this model, once the latent variable z is determined, the model parameters Θ can be computed analytically. The problem can therefore be solved efficiently with the EM algorithm, treating the latent variable z as missing data.

The EM algorithm is an iterative method that repeats an E step and an M step until convergence. In the E step, the mixing ratio γ(z_{j,k}) of the typical personalities for observer j, that is, the posterior probability of z, is updated according to the following equation (10) using the information of all explanatory variables (subjective interpersonal cognitive tendency, gender, and psychological scale scores).

Here, the learning video is assumed to contain L scenes, and h_j is the frequency distribution (histogram) of the interpersonal cognition labels that observer j assigned to those L scenes. A superscript containing the variable t denotes the iteration number of the EM algorithm.
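Equation (10) is missing here, but for a mixture of this kind the E step normally sets each responsibility proportional to the prior weight times the likelihood of the observer's data under that component. The sketch below assumes exactly that, with the label histogram h_j scored by a multinomial whose coefficient cancels on normalization. Function names and the `theta` layout are hypothetical, and all α and β entries are assumed to lie strictly between 0 and 1.

```python
import math

def log_component(h, g, s, pi_k, alpha_k, beta_k, mu_k, var_k):
    """Log of pi_k * prod_e alpha_k[e]**h[e] * B(g | beta_k) * prod_i N(s_i)."""
    lp = math.log(pi_k)
    lp += sum(n * math.log(a) for n, a in zip(h, alpha_k) if n > 0)
    lp += math.log(beta_k if g == 1 else 1.0 - beta_k)
    for x, m, v in zip(s, mu_k, var_k):
        lp += -0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
    return lp

def e_step(h, g, s, theta):
    """Responsibilities gamma(z_{j,k}) for one observer, normalized over k."""
    logs = [
        log_component(h, g, s, theta["pi"][k], theta["alpha"][k],
                      theta["beta"][k], theta["mu"][k], theta["var"][k])
        for k in range(len(theta["pi"]))
    ]
    top = max(logs)                      # subtract the max for numerical stability
    weights = [math.exp(lp - top) for lp in logs]
    total = sum(weights)
    return [w / total for w in weights]
```

Working in log space avoids underflow when L is large, since the product of α terms shrinks geometrically with the number of labeled scenes.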

In the M step, the subjective interpersonal cognitive tendency model parameters Θ are updated. The update equations can be derived, for example, in the same way as the parameter updates for a Gaussian mixture distribution described in Reference 2 below.
[Reference 2] A. Dempster, N. Laird, and D. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society, Series B, vol. 39, no. 1, pp. 1-38, 1977.

Specifically, the parameters are updated by the following equations (11) to (15).

Here, h_{j,e} is the frequency of the interpersonal cognitive state e in the frequency distribution h_j, and N_k can be expressed by the following equation (16).
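Equations (11) to (16) are not reproduced in this text. The sketch below shows the standard responsibility-weighted updates one would expect for the five parameter groups (π, α, β, μ, and a diagonal Σ), with N_k taken as the sum of responsibilities for component k. That correspondence is an assumption, as are the function name and data layout.

```python
def m_step(gamma, H, G, S):
    """Re-estimate theta from responsibilities.

    gamma[j][k]: responsibility of personality k for observer j
    H[j][e]:     label histogram of observer j
    G[j]:        gender of observer j (0 or 1)
    S[j][i]:     score of observer j on scale i
    """
    n_obs, n_comp = len(gamma), len(gamma[0])
    n_states, n_scales = len(H[0]), len(S[0])
    theta = {"pi": [], "alpha": [], "beta": [], "mu": [], "var": []}
    for k in range(n_comp):
        n_k = sum(gamma[j][k] for j in range(n_obs))     # assumed form of N_k
        theta["pi"].append(n_k / n_obs)
        counts = [sum(gamma[j][k] * H[j][e] for j in range(n_obs))
                  for e in range(n_states)]
        total = sum(counts)
        theta["alpha"].append([c / total for c in counts])
        theta["beta"].append(sum(gamma[j][k] * G[j] for j in range(n_obs)) / n_k)
        mu = [sum(gamma[j][k] * S[j][i] for j in range(n_obs)) / n_k
              for i in range(n_scales)]
        var = [sum(gamma[j][k] * (S[j][i] - mu[i]) ** 2 for j in range(n_obs)) / n_k
               for i in range(n_scales)]
        theta["mu"].append(mu)
        theta["var"].append(var)
    return theta
```

A practical implementation would likely floor the variances and smooth α and β away from 0 and 1, since a component that captures a single observer otherwise collapses to a degenerate distribution.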

The parameters of the subjective interpersonal cognitive tendency model learned in this way are stored in the subjective interpersonal cognitive tendency model storage unit 20.

In step S15, the objective behavior recognition tendency model learning unit 15 uses the interpretation label set output from the interpersonal cognition labeling unit 11 and the learning behavior time series output from the behavior recognition unit 13 to learn the parameters of an objective behavior recognition tendency model that associates the observers' interpersonal cognitive tendency with the behavior of the persons in the video. More specifically, the parameter of the objective behavior recognition tendency model is the likelihood P(B | e) that the interpersonal cognitive state e is obtained given the behavior time series B. In other words, it is the proportion of observers who interpret the behavior time series B as the interpersonal cognitive state e.

The objective behavior recognition tendency model learning unit 15 can learn the parameters of the objective behavior recognition tendency model by, for example, the method described in Patent Document 1. In that method, the parameters are learned as follows. The procedure below is only an outline; see Patent Document 1 for the detailed procedure.

The objective behavior recognition tendency model learning unit 15 learns the hyperparameters of the objective behavior recognition tendency model using the learning behavior time series and the interpretation label set as teacher labels. Here, the probability with which the event represented by each hyperparameter occurred in the learning video is set as the value of that hyperparameter. For example, for the likelihood P(B = b | e = e'), the number of times in the teacher labels that the interpersonal cognitive state is e' and the behavior is b is counted, and this count divided by the number of times that the interpersonal cognitive state is e' (a probability) is set as the hyperparameter value.
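The counting rule in this example can be sketched as follows. The input format (a flat list of state and behavior pairs, one per labeled time step) and the label names in the usage note are hypothetical; the actual data structures are those of Patent Document 1.

```python
from collections import Counter

def estimate_likelihood(labels):
    """Estimate P(B = b | e = e') by relative frequency.

    labels: list of (state, behavior) pairs from the teacher labels.
    Returns a dict mapping (behavior, state) to the estimated probability.
    """
    pair_counts = Counter(labels)
    state_counts = Counter(state for state, _ in labels)
    return {
        (behavior, state): n / state_counts[state]
        for (state, behavior), n in pair_counts.items()
    }
```

For instance, if a state labeled "empathy" co-occurs with "smile" twice and with "nod" once, the estimate for ("smile", "empathy") is 2/3. Pairs that never occur are simply absent from the table, so a caller that needs nonzero probabilities would have to add smoothing.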

The parameters of the objective behavior recognition tendency model learned in this way are stored in the objective behavior recognition tendency model storage unit 21.

<Interpersonal cognitive state estimation device>
An example of the functional configuration of the interpersonal cognitive state estimation device according to the embodiment is described with reference to FIG. 3. The interpersonal cognitive state estimation device receives the subjective interpersonal cognitive tendency model parameters and the objective behavior recognition tendency model parameters as input and outputs a subjective interpersonal cognitive state estimation result.

Like the interpersonal cognitive tendency model learning device 1, the interpersonal cognitive state estimation device 2 according to the embodiment includes the video input unit 10, the observer characteristic input unit 12, the behavior recognition unit 13, the subjective interpersonal cognitive tendency model storage unit 20, and the objective behavior recognition tendency model storage unit 21. The interpersonal cognitive state estimation device 2 further includes a subjective interpersonal cognitive tendency estimation unit 31, an objective behavior recognition tendency estimation unit 32, and a subjective interpersonal cognitive state estimation unit 33. The interpersonal cognitive state estimation device 2 is, for example, a special device configured by loading a special program into a known or dedicated computer having a central processing unit (CPU), a main storage device (random access memory, RAM), and the like. It executes each process, for example, under the control of the central processing unit. Data input to the device and data obtained in each process are stored, for example, in the main storage device, and are read out as needed for use in other processing. Each storage unit of the device can be implemented by, for example, a main storage device such as RAM, an auxiliary storage device composed of a hard disk, an optical disc, or a semiconductor memory element such as flash memory, or middleware such as a relational database or a key-value store. The storage units need only be logically separated and may reside on a single physical storage device.

An example of the processing flow of the interpersonal cognitive state estimation method is described with reference to FIG. 4, in the order in which the procedures are actually performed.

The subjective interpersonal cognitive tendency model storage unit 20 stores the subjective interpersonal cognitive tendency model learned by the interpersonal cognitive tendency model learning device 1.

The objective behavior recognition tendency model storage unit 21 stores the objective behavior recognition tendency model learned by the interpersonal cognitive tendency model learning device 1.

In step S10, the video input unit 10 receives and stores an estimation video in which one or more persons are captured. The estimation video is the video for which interpersonal cognitive states are to be estimated. Its content is of the same kind as the learning video, but it is assumed not to be included in the learning video. Whereas the interpersonal cognitive tendency model learning device performs the subsequent processing on all scenes in the video, the interpersonal cognitive state estimation device may process either a single scene or all scenes in the video.

In step S12, the observer characteristic input unit 12 receives the personal characteristics of the observer whose interpersonal cognitive state is to be estimated. The personal characteristics are of the same kind as those of the observers used for learning by the interpersonal cognitive tendency model learning device 1; in this embodiment they are, for example, gender and psychological scale scores. The observer to be estimated may or may not be included in the learning observer group.

In step S13, the behavior recognition unit 13 generates, based on the estimation video acquired by the video input unit 10, an estimation behavior time series in which the behavior of each person in the video is labeled in time series. The generated estimation behavior time series is output to the objective behavior recognition tendency estimation unit 32. The method for recognizing the behavior of the persons in the video is the same as in the interpersonal cognitive tendency model learning device 1.

In step S31, the subjective interpersonal cognitive tendency estimation unit 31 uses the subjective interpersonal cognitive tendency model stored in the subjective interpersonal cognitive tendency model storage unit 20 to obtain the subjective interpersonal cognitive tendency of the target observer from the personal characteristics of the target observer received from the observer characteristic input unit 12. The obtained subjective interpersonal cognitive tendency is output to the subjective interpersonal cognitive state estimation unit 33.

The subjective interpersonal cognitive tendency output by the subjective interpersonal cognitive tendency estimation unit 31 can be written as P(e_j | g_j, s_j, Θ), which, as shown in the following equation (17), ultimately reduces to equation (9) up to a constant factor.

Here, the final proportionality is derived using the fact that every element of P(g_j, s_j | Θ) is known, so that this term can be treated as a proportionality constant.

In this case, the posterior probability distribution of the mixing ratio z can be computed by the following equation (18).

The subjective interpersonal cognitive tendency at this mixing ratio is then given by the following equation (19).
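Equations (17) to (19) are not reproduced in this text. One natural reading, assumed in the sketch below, is that the posterior over typical personalities is computed from the observer's gender and scores alone, and that the observer's tendency is the posterior-weighted mixture of the per-personality tendencies α_k. Function names and the `theta` layout are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Univariate normal density."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def personality_posterior(g, s, theta):
    """Assumed eq. (18): posterior over personalities given gender and scores."""
    weights = []
    for k, pi_k in enumerate(theta["pi"]):
        beta_k = theta["beta"][k]
        w = pi_k * (beta_k if g == 1 else 1.0 - beta_k)
        for x, m, v in zip(s, theta["mu"][k], theta["var"][k]):
            w *= normal_pdf(x, m, v)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def subjective_tendency(g, s, theta):
    """Assumed eq. (19): P(e | g, s) as a posterior-weighted mix of alpha_k."""
    post = personality_posterior(g, s, theta)
    n_states = len(theta["alpha"][0])
    return [
        sum(post[k] * theta["alpha"][k][e] for k in range(len(post)))
        for e in range(n_states)
    ]
```

Because each α_k sums to one and the posterior weights sum to one, the returned tendency is itself a probability distribution over the interpersonal cognitive states.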

In step S32, the objective behavior recognition tendency estimation unit 32 uses the objective behavior recognition tendency model stored in the objective behavior recognition tendency model storage unit 21 to obtain the objective behavior recognition tendency from the estimation behavior time series output by the behavior recognition unit 13. The obtained objective behavior recognition tendency is output to the subjective interpersonal cognitive state estimation unit 33.

The objective behavior recognition tendency estimation unit 32 can obtain the objective behavior recognition tendency by, for example, the method described in Patent Document 1. In that method, the objective behavior recognition tendency is obtained as follows. The procedure below is only an outline; see Patent Document 1 for the detailed procedure.

The objective behavior recognition tendency estimation unit 32 first sets the initial value of each parameter of the objective behavior recognition tendency model based on the hyperparameters stored in the objective behavior recognition tendency model storage unit 21 and, based on these initial values, sets the initial value of the estimate of the objective behavior recognition tendency for the persons in the estimation video. Next, it determines the estimate of the objective behavior recognition tendency by sampling from a probability distribution obtained from the hyperparameters of the model together with either the initial parameter values and initial estimate or the parameter values and estimate determined in the previous iteration. It then determines the value of each parameter of the model by sampling from a probability distribution obtained from the behavior time series output by the behavior recognition unit 13 and the hyperparameters of the model. The unit uses a predetermined number of iterations as the convergence condition: if that number has not been reached, it judges that the convergence condition is not satisfied and again determines the estimate of the objective behavior recognition tendency and then the parameters of the model; if that number has been reached, it judges that the convergence condition is satisfied and proceeds to the subsequent processing.

Subsequently, the objective behavior recognition tendency estimation unit 32 determines the estimate of the objective behavior recognition tendency by sampling from a probability distribution obtained from the hyperparameters of the model and the parameter values and estimate determined in the previous iteration. It then determines the value of each parameter of the model by sampling from a probability distribution obtained from the behavior time series output by the behavior recognition unit 13 and the hyperparameters of the model. The unit repeats this processing a predetermined number of times (here, R times) and computes the final estimate of the objective behavior recognition tendency and of each model parameter from the R sets of sampled estimates and parameters.

Specifically, the estimate P(B | e) of the objective behavior recognition tendency may be obtained by, for example, computing the estimation result at each time and averaging these results over the target time span. In this case, the objective behavior recognition tendency P(B | e) is expressed by the following equation (20).

In step S33, the subjective interpersonal cognitive state estimation unit 33 obtains the subjective interpersonal cognitive state estimation result using the subjective interpersonal cognitive tendency output by the subjective interpersonal cognitive tendency estimation unit 31 and the objective behavior recognition tendency output by the objective behavior recognition tendency estimation unit 32. Specifically, the subjective interpersonal cognitive state e_{i,j} that observer j, with gender g_j and psychological scale score s_j, generates for the target scene i containing the set B_i of the interlocutors' behaviors is computed as their joint probability P(e_{i,j}, B_i, g_j, s_j). This joint probability can be transformed as shown in equation (21).

To output the estimation result with a probability for every interpersonal cognitive state, the joint probability distribution P(e_{i,j}, B_i, g_j, s_j) may be output as is. To output a single value rather than a probability distribution, the interpersonal cognitive state with the highest joint probability may be output, as shown in the following equation (22).
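Equations (21) and (22) are not reproduced in this text. The sketch below assumes that the transformation in (21) makes the joint probability proportional, for each state e, to the product of the objective tendency P(B_i | e) and the subjective tendency P(e | g_j, s_j), and that (22) is an argmax over states. Both the factorization and the function name are assumptions for illustration.

```python
def estimate_state(subjective, objective):
    """Combine the two tendencies and pick the most probable state.

    subjective[e]: P(e | g_j, s_j) from the subjective tendency model
    objective[e]:  P(B_i | e) from the objective tendency model
    Returns (normalized distribution over states, index of the best state).
    """
    scores = [p_e * p_b for p_e, p_b in zip(subjective, objective)]
    total = sum(scores)
    dist = [x / total for x in scores]
    best = max(range(len(dist)), key=dist.__getitem__)
    return dist, best
```

Returning both values mirrors the two output modes described above: the full distribution when probabilities are wanted, or only the index when a single state is enough.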

<Application example>
When the subjective interpersonal cognitive tendency of an observer is known, the subjective interpersonal cognitive tendency model can be used to estimate the personal characteristics of that observer.

An example of the functional configuration of the interpersonal cognitive state estimation device according to the application example is described with reference to FIG. 6. Like the interpersonal cognitive state estimation device 2 of the embodiment, the interpersonal cognitive state estimation device 3 according to the application example includes the subjective interpersonal cognitive tendency model storage unit 20. The interpersonal cognitive state estimation device 3 further includes a subjective interpersonal cognitive tendency storage unit 22 and a target observer characteristic estimation unit 34.

The subjective interpersonal cognitive tendency storage unit 22 stores a known subjective interpersonal cognitive tendency. The known subjective interpersonal cognitive tendency is typically one estimated earlier by the interpersonal cognitive tendency model learning device 1, but it may also be one generated by setting the parameters artificially, for example by manual input.

The target observer characteristic estimation unit 34 uses the subjective interpersonal cognitive tendency model stored in the subjective interpersonal cognitive tendency model storage unit 20 to estimate, from the subjective interpersonal cognitive tendency stored in the subjective interpersonal cognitive tendency storage unit 22, the personal characteristics of an observer having that tendency.

As described above, the subjective interpersonal cognitive tendency model models the joint probability P(e_j, g_j, s_j | Θ) of the interpersonal cognitive state e_j generated by observer j, the gender g_j of observer j, and the psychological scale score s_j of observer j. Therefore, if the interpersonal cognitive state e_j generated by observer j is known, the gender g_j and the psychological scale score s_j of observer j can be obtained in the same manner as in the subjective interpersonal cognitive tendency estimation unit 31 included in the interpersonal cognitive state estimation device 2 of the embodiment.
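The patent does not spell out this reverse inference. A minimal sketch of one way it could work, assuming the same mixture structure, is to compute the posterior over typical personalities from the observer's known label histogram and then read off the implied probability of g = 1 and the expected scale scores. The function name, the `theta` layout, and the use of a histogram h as the "known tendency" are all assumptions.

```python
import math

def estimate_characteristics(h, theta):
    """Infer P(g = 1) and expected scores from a known label histogram h."""
    logs = [
        math.log(pi_k) + sum(n * math.log(a) for n, a in zip(h, alpha_k) if n > 0)
        for pi_k, alpha_k in zip(theta["pi"], theta["alpha"])
    ]
    top = max(logs)                               # stabilize the exponentials
    weights = [math.exp(lp - top) for lp in logs]
    total = sum(weights)
    post = [w / total for w in weights]           # posterior over personalities
    p_male = sum(r * b for r, b in zip(post, theta["beta"]))
    n_scales = len(theta["mu"][0])
    mean_scores = [
        sum(post[k] * theta["mu"][k][i] for k in range(len(post)))
        for i in range(n_scales)
    ]
    return p_male, mean_scores
```

With the toy parameters used in the test, a histogram dominated by the first state points almost entirely to the first typical personality, so the result approaches that personality's own β and μ.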

[Program, recording medium]
The present invention is not limited to the embodiment described above, and it goes without saying that modifications can be made as appropriate without departing from the spirit of the invention. The various processes described in the embodiment need not be executed only in time series in the order described; they may also be executed in parallel or individually, depending on the processing capability of the device that executes them or as otherwise required.

また、上記実施形態で説明した各装置における各種の処理機能をコンピュータによって実現する場合、各装置が有すべき機能の処理内容はプログラムによって記述される。そして、このプログラムをコンピュータで実行することにより、上記各装置における各種の処理機能がコンピュータ上で実現される。 When various processing functions in each device described in the above embodiment are realized by a computer, the processing contents of the functions that each device should have are described by a program. Then, by executing this program on a computer, various processing functions in each of the above devices are realized on the computer.

この処理内容を記述したプログラムは、コンピュータで読み取り可能な記録媒体に記録しておくことができる。コンピュータで読み取り可能な記録媒体としては、例えば、磁気記録装置、光ディスク、光磁気記録媒体、半導体メモリ等どのようなものでもよい。 The program describing the processing contents can be recorded on a computer-readable recording medium. As the computer-readable recording medium, for example, any recording medium such as a magnetic recording device, an optical disk, a magneto-optical recording medium, and a semiconductor memory may be used.

また、このプログラムの流通は、例えば、そのプログラムを記録したＤＶＤ、ＣＤ−ＲＯＭ等の可搬型記録媒体を販売、譲渡、貸与等することによって行う。さらに、このプログラムをサーバコンピュータの記憶装置に格納しておき、ネットワークを介して、サーバコンピュータから他のコンピュータにそのプログラムを転送することにより、このプログラムを流通させる構成としてもよい。 The program is distributed by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM in which the program is recorded. Furthermore, the program may be distributed by storing the program in a storage device of the server computer and transferring the program from the server computer to another computer via a network.

このようなプログラムを実行するコンピュータは、例えば、まず、可搬型記録媒体に記録されたプログラムもしくはサーバコンピュータから転送されたプログラムを、一旦、自己の記憶装置に格納する。そして、処理の実行時、このコンピュータは、自己の記録媒体に格納されたプログラムを読み取り、読み取ったプログラムに従った処理を実行する。また、このプログラムの別の実行形態として、コンピュータが可搬型記録媒体から直接プログラムを読み取り、そのプログラムに従った処理を実行することとしてもよく、さらに、このコンピュータにサーバコンピュータからプログラムが転送されるたびに、逐次、受け取ったプログラムに従った処理を実行することとしてもよい。また、サーバコンピュータから、このコンピュータへのプログラムの転送は行わず、その実行指示と結果取得のみによって処理機能を実現する、いわゆるＡＳＰ（Application Service Provider）型のサービスによって、上述の処理を実行する構成としてもよい。なお、本形態におけるプログラムには、電子計算機による処理の用に供する情報であってプログラムに準ずるもの（コンピュータに対する直接の指令ではないがコンピュータの処理を規定する性質を有するデータ等）を含むものとする。 A computer that executes such a program first stores, for example, a program recorded on a portable recording medium or a program transferred from a server computer in its own storage device. When executing the process, the computer reads a program stored in its own recording medium and executes a process according to the read program. As another execution form of the program, the computer may directly read the program from a portable recording medium and execute processing according to the program, and the program is transferred from the server computer to the computer. Each time, the processing according to the received program may be executed sequentially. Also, the program is not transferred from the server computer to the computer, and the above-described processing is executed by a so-called ASP (Application Service Provider) type service that realizes the processing function only by the execution instruction and result acquisition. It is good. Note that the program in this embodiment includes information that is used for processing by an electronic computer and that conforms to the program (data that is not a direct command to the computer but has a property that defines the processing of the computer).

In this embodiment, the apparatus is configured by executing a predetermined program on a computer, but at least part of this processing may instead be implemented in hardware.

Description of symbols
10 Video input unit
11 Interpersonal cognitive labeling unit
12 Observer characteristic input unit
13 Behavior recognition unit
14 Subjective interpersonal cognitive tendency model learning unit
15 Objective behavioral cognitive tendency model learning unit
20 Subjective interpersonal cognitive tendency model storage unit
21 Objective behavioral cognitive tendency model storage unit
31 Subjective interpersonal cognitive tendency estimation unit
32 Objective behavioral cognitive tendency estimation unit
33 Subjective interpersonal cognitive state estimation unit
34 Target observer characteristic estimation unit

Claims

An interpersonal cognitive tendency model learning device comprising:
an interpersonal cognitive labeling unit that generates an interpretation label set in which interpersonal cognitive states, obtained when a plurality of observers interpret the state of a person in a learning video of at least one person, are labeled in time series based on the learning video;
a behavior recognition unit that generates, based on the learning video, a learning behavior time series in which the behavior of the person in the learning video is labeled in time series;
an observer characteristic input unit that inputs observer characteristics representing the personal characteristics of each of the plurality of observers;
an objective behavioral cognitive tendency model learning unit that uses the interpretation label set and the learning behavior time series to learn the parameters of an objective behavioral cognitive tendency model representing the likelihood of an interpersonal cognitive state given a behavior time series; and
a subjective interpersonal cognitive tendency model learning unit that uses the interpretation label set and the observer characteristics to learn the parameters of a subjective interpersonal cognitive tendency model representing the likelihood of an interpersonal cognitive state given personal characteristics.

The interpersonal cognitive tendency model learning device according to claim 1, wherein
the subjective interpersonal cognitive tendency model is a probabilistic topic model in which the personal characteristics of typical personalities extracted from the observer characteristics are the topics and the mixing ratio of the typical personalities is a latent variable.

The interpersonal cognitive tendency model learning device according to claim 1 or 2, wherein
the personal characteristics include gender and a psychological scale score, and
the subjective interpersonal cognitive tendency model is computed from the joint probability given by the following formula, where N is the number of observers, 1 ≦ j ≦ N, e_j is the interpersonal cognitive state of the j-th observer, g_j is the gender of the j-th observer, s_j is the psychological scale score of the j-th observer, K is the number of typical personalities, 1 ≦ k ≦ K, π_k is the prior probability of the k-th typical personality, α_k is the interpersonal cognitive tendency of the k-th typical personality, β_k is the gender of the k-th typical personality, μ_k is the mean vector of the psychological scale scores of the k-th typical personality, Σ_k is the covariance matrix of the psychological scale scores of the k-th typical personality, and, as distributions, M denotes the multinomial distribution, B the binomial distribution, and N the normal distribution.

[Formula not reproduced in this text]
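
The formula referred to in the claim above does not survive in this text. As a sketch only, assuming a finite mixture over the K typical personalities with the observer's state, gender, and score conditionally independent given the personality, one form consistent with the symbols defined in the claim would be the following. The factorization and the product over observers are assumptions and may differ from the published formula.

```latex
% Assumed mixture form; not taken from the published claim.
% \mathcal{M}, \mathcal{B}, \mathcal{N}: multinomial, binomial, and normal distributions.
P\bigl(\{e_j, g_j, s_j\}_{j=1}^{N}\bigr)
  = \prod_{j=1}^{N} \sum_{k=1}^{K}
    \pi_k \,
    \mathcal{M}(e_j \mid \alpha_k) \,
    \mathcal{B}(g_j \mid \beta_k) \,
    \mathcal{N}(s_j \mid \mu_k, \Sigma_k)
```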

An interpersonal cognitive state estimation device comprising:
a subjective interpersonal cognitive tendency model storage unit that stores the subjective interpersonal cognitive tendency model learned by the interpersonal cognitive tendency model learning device according to any one of claims 1 to 3;
an objective behavioral cognitive tendency model storage unit that stores the objective behavioral cognitive tendency model learned by the interpersonal cognitive tendency model learning device according to any one of claims 1 to 3;
a subjective interpersonal cognitive tendency estimation unit that uses the subjective interpersonal cognitive tendency model to obtain the subjective interpersonal cognitive tendency of a target observer from the input personal characteristics of the target observer;
an objective behavioral cognitive tendency estimation unit that uses the objective behavioral cognitive tendency model to obtain an objective behavioral cognitive tendency from an estimation behavior time series in which the behavior of a person in an estimation video of at least one person is labeled in time series based on the estimation video; and
a subjective interpersonal cognitive state estimation unit that obtains a subjective interpersonal cognitive state estimation result using the subjective interpersonal cognitive tendency and the objective behavioral cognitive tendency.
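
The estimation flow of this claim can be illustrated with a minimal Python sketch. It rests on several assumptions that the claim does not state: gender is modeled as a Bernoulli variable, the score covariance is diagonal, the subjective tendency is the posterior-weighted average of the per-personality tendencies, and the two tendencies are combined by an element-wise product followed by normalization. All function and parameter names are hypothetical.

```python
import math


def personality_posterior(gender, scores, pi, beta, mu, var):
    """Posterior probability of each typical personality given gender and scores.

    Assumption: Bernoulli gender term and diagonal-covariance Gaussian score term.
    gender is 0 or 1; beta[k] is P(gender == 1 | personality k).
    """
    log_w = []
    for k in range(len(pi)):
        log_b = math.log(beta[k] if gender == 1 else 1.0 - beta[k])
        log_n = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(scores, mu[k], var[k])
        )
        log_w.append(math.log(pi[k]) + log_b + log_n)
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def subjective_tendency(posterior, alpha):
    """Posterior-weighted average of per-personality tendencies (assumed form)."""
    n_states = len(alpha[0])
    return [sum(p * a[c] for p, a in zip(posterior, alpha)) for c in range(n_states)]


def estimate_state(subjective, objective):
    """Combine both tendencies by a normalized product (assumed combination rule)."""
    prod = [s * o for s, o in zip(subjective, objective)]
    total = sum(prod)
    return [p / total for p in prod]


# Illustrative values only: two typical personalities, three cognitive states.
pi = [0.5, 0.5]
beta = [0.8, 0.3]
mu = [[1.0, 0.0], [-1.0, 0.0]]
var = [[1.0, 1.0], [1.0, 1.0]]
alpha = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]

posterior = personality_posterior(1, [1.0, 0.0], pi, beta, mu, var)
tendency = subjective_tendency(posterior, alpha)
estimate = estimate_state(tendency, [0.2, 0.5, 0.3])
```

Here the list passed to `estimate_state` stands in for the objective behavioral cognitive tendency, which in the claimed device would come from the objective model applied to the estimation behavior time series.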

An interpersonal cognitive state estimation device comprising:
an interpersonal cognitive tendency model storage unit that stores the subjective interpersonal cognitive tendency model learned by the interpersonal cognitive tendency model learning device according to any one of claims 1 to 3; and
a target observer characteristic estimation unit that uses the subjective interpersonal cognitive tendency model to obtain the personal characteristics of a target observer from the input subjective interpersonal cognitive tendency of the target observer.

An interpersonal cognitive tendency model learning method comprising:
an interpersonal cognitive labeling step in which an interpersonal cognitive labeling unit generates an interpretation label set in which interpersonal cognitive states, obtained when a plurality of observers interpret the state of a person in a learning video of at least one person, are labeled in time series based on the learning video;
a behavior recognition step in which a behavior recognition unit generates, based on the learning video, a learning behavior time series in which the behavior of the person in the learning video is labeled in time series;
an observer characteristic input step in which an observer characteristic input unit inputs observer characteristics representing the personal characteristics of each of the plurality of observers;
an objective behavioral cognitive tendency model learning step in which an objective behavioral cognitive tendency model learning unit uses the interpretation label set and the learning behavior time series to learn the parameters of an objective behavioral cognitive tendency model representing the likelihood of an interpersonal cognitive state given a behavior time series; and
a subjective interpersonal cognitive tendency model learning step in which a subjective interpersonal cognitive tendency model learning unit uses the interpretation label set and the observer characteristics to learn the parameters of a subjective interpersonal cognitive tendency model representing the likelihood of an interpersonal cognitive state given personal characteristics.

An interpersonal cognitive state estimation method comprising:
a subjective interpersonal cognitive tendency estimation step in which a subjective interpersonal cognitive tendency estimation unit uses the subjective interpersonal cognitive tendency model learned by the interpersonal cognitive tendency model learning method according to claim 6 to obtain the subjective interpersonal cognitive tendency of a target observer from the input personal characteristics of the target observer;
an objective behavioral cognitive tendency estimation step in which an objective behavioral cognitive tendency estimation unit uses the objective behavioral cognitive tendency model learned by the interpersonal cognitive tendency model learning method according to claim 6 to obtain an objective behavioral cognitive tendency from an estimation behavior time series in which the behavior of a person in an estimation video of at least one person is labeled in time series based on the estimation video; and
a subjective interpersonal cognitive state estimation step in which a subjective interpersonal cognitive state estimation unit obtains a subjective interpersonal cognitive state estimation result using the subjective interpersonal cognitive tendency and the objective behavioral cognitive tendency.

A program for causing a computer to function as the interpersonal cognitive tendency model learning device according to any one of claims 1 to 3 or as the interpersonal cognitive state estimation device according to claim 4 or 5.