JP6821611B2

JP6821611B2 - Estimator, its method, and program

Info

Publication number: JP6821611B2
Application number: JP2018008252A
Authority: JP
Inventors: 遼平渋江; 惇米家
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2018-01-22
Filing date: 2018-01-22
Publication date: 2021-01-27
Anticipated expiration: 2038-01-22
Also published as: JP2019126425A

Description

The present invention relates to a technique for estimating the parameters of a gaze point model that models the time series of human gaze points.

A saliency map is a computational model for expressing bottom-up attention to visual stimuli. It is calculated as a two-dimensional grayscale image from features of the original image such as color and luminance. Because a saliency map indicates the locations in an image to which people tend to direct their gaze, it has been studied in a variety of fields, not only computational neuroscience.

Non-Patent Document 1 describes the concept of the saliency map in detail. It assumes that the line of sight is directed to the location with the largest value in the saliency map. This assumption is called the winner-take-all rule; under this rule, our line of sight is determined deterministically from image information alone.

Non-Patent Document 1: C. Koch and S. Ullman, "Shifts in selective visual attention: towards the underlying neural circuitry", Human Neurobiology, vol. 4, pp. 219-27, 1985.

However, the winner-take-all rule used in Non-Patent Document 1 is unrealistic in that we do not always attend to the same location when we look at an image. Furthermore, because the winner-take-all rule assigns meaning only to the maximum value of the saliency map, it cannot give an interpretation to the absolute value assigned to each pixel or to the ratio of values between different pixels.

These problems arise because the conventional saliency map is a purely computational model calculated from image information only. To solve them, it is necessary to prepare training data that serves as a proxy for the attentional state while viewing an image, and to perform supervised learning.

Therefore, based on the finding that the saliency map is represented in the superior colliculus, which is closely related to saccade generation, the present invention models the saliency map using a point process model. An object of the present invention is to provide an estimation device, a method, and a program that easily estimate the model parameters of the saliency map from a time series of gaze points measured by an eye tracker.

In order to solve the above problem, according to one aspect of the present invention, the estimation device includes a model estimation unit that estimates the model parameters of a gaze point generation model from a time series of gaze points obtained by measuring the eye movements of a subject. The gaze point generation model is defined as follows. The time series of saccades is generated by a marked point process whose marks are the features of each saccade. The person's true point of interest at time t is the position obtained by moving the true point of interest at time t-1 according to the direction and magnitude of a saccade when a saccade occurs in the vicinity of time t, and the position obtained by moving the true point of interest at time t-1 in a random direction and by a random magnitude when no saccade occurs in the vicinity of time t. The delay of the person's gaze point at time t follows an AR(2) model. The person's true gaze point at time t is the position obtained by correcting the true point of interest at time t by the gaze point delay at time t. The gaze point actually measured at time t is the true gaze point at time t with noise added.

According to the present invention, the model parameters of the gaze point model can be easily estimated from the time series of gaze points measured by an eye tracker.

FIG. 1 is a diagram showing an example of a sample generated from a marked point process.
FIG. 2 is a diagram illustrating a neuron firing at the moment a saccade is made toward its movement field.
FIG. 3 is a diagram showing the correspondence between the arrangement of neurons and the center positions of their receptive fields.
FIG. 4 is a diagram illustrating the decomposition of the true gaze point into the true point of interest a_r and the gaze point delay s_r.
FIG. 5 is a functional block diagram of the estimation device according to the first embodiment.
FIG. 6 is a diagram showing an example of the processing flow of the estimation device according to the first embodiment.
FIG. 7 is a diagram showing an example of posterior expected values and their interpretation.
FIG. 8 is a diagram showing the estimation result at the 1st iteration for a subject's data.
FIG. 9 is a diagram showing the estimation result at the 11th iteration for a subject's data.
FIG. 10 is a diagram showing the frequency distribution of gaze points computed from a subject's data and the initial values obtained by applying support vector regression to the saliency map.
FIG. 11 is a diagram showing the estimation result at the 11th iteration for a subject's data.
FIG. 12 is a functional block diagram of the estimation device according to a modification of the first embodiment.

Embodiments of the present invention are described below. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and duplicate description is omitted. In the following description, symbols such as "^", "~", and "^-" used in the text should properly be written directly above the character that immediately follows them, but due to the limitations of text notation they are written immediately before that character. In the formulas, these symbols are written in their proper positions. Unless otherwise noted, processing performed on each element of a vector or matrix is applied to all elements of that vector or matrix.

<Saliency map model>
First, the model of the saliency map is described.

Assume a situation in which a person is looking at a single image and the time series of gaze points during that time is measured by an eye tracker. In this embodiment, the position toward which the eyeball is directed, as measured by the eye tracker, is called the "gaze point", and the position to which the person is truly attending is called the "point of interest". The purpose of this embodiment is to use image information to remove the delay component and the noise component caused by eye movement from the trajectory of the gaze point, and thereby estimate the trajectory of the point of interest. During this estimation, the saliency map is computed at the same time as a latent variable that affects the trajectory of the point of interest.

In this embodiment, the time series of gaze points is represented by a generative model, and its posterior distribution is estimated.

<Definition of the saliency map by a conditional intensity function>
The conventional saliency map is defined so that its value is large at points that are likely to be saccade destinations (see Reference 1).
(Reference 1) L. Itti, C. Koch and E. Niebur, "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 1254-1259, 1998.

Following this concept of the saliency map, this embodiment defines the saliency value at each point by the frequency of saccades directed toward that point.

A saccade is a rapid, jump-like component of eye movement that occurs instantaneously. A time series of saccades is therefore a sequence of instantaneous events. In statistical modeling, point processes are commonly used as stochastic processes for describing time series of such randomly occurring events. In particular, when each event carries some feature, a marked point process is used. A marked point process is a stochastic process that describes the probabilistic structure of a sequence in which a feature called a mark is attached to each event; typical modeling targets include earthquakes and foreign exchange transactions. FIG. 1 shows an example of a sample generated from a marked point process. In FIG. 1, the horizontal axis represents time and the vertical axis represents a given mark. When the time series of saccades is regarded as a marked point process, the occurrence time of a saccade can be regarded as the event time, and features of the saccade such as its amplitude, direction, and arrival point can be regarded as marks.
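As a rough illustration of the sampling picture described above, the following Python sketch draws one sample from a simple marked point process. It assumes a homogeneous Poisson process for the event times and independent Gaussian 2-D marks standing in for saccade vectors. The function name `sample_marked_point_process` and all parameter values are hypothetical and are not taken from the patent, whose model is history dependent.

```python
import random


def sample_marked_point_process(rate, duration, mark_std, seed=0):
    """Draw (time, mark) pairs on [0, duration).

    Event times follow a homogeneous Poisson process with the given rate.
    Each mark is a 2-D vector drawn independently from an isotropic Gaussian.
    Both choices are illustrative assumptions, not the patent's model.
    """
    rng = random.Random(seed)
    events = []
    t = rng.expovariate(rate)              # waiting time to the first event
    while t < duration:
        mark = (rng.gauss(0.0, mark_std), rng.gauss(0.0, mark_std))
        events.append((t, mark))
        t += rng.expovariate(rate)         # exponential inter-event interval
    return events


events = sample_marked_point_process(rate=3.0, duration=10.0, mark_std=5.0)
for t, (dx, dy) in events[:3]:
    print(f"t={t:.3f}s  saccade vector=({dx:+.2f}, {dy:+.2f})")
```

Plotting one mark coordinate against the event time would give a picture of the same kind as FIG. 1.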

In point process modeling, the goal is to estimate the probabilistic structure of the point process from data. However, it is difficult to estimate its probability density function directly, because the structure of the metric space on which a point process is defined is complicated and the likelihood is hard to compute. Therefore, when modeling time series data with a point process, a function called the conditional intensity function is often estimated instead of the probability density function.

The conditional intensity function corresponds to the expected number of events occurring per unit time given the past history. Specifically, it is defined as

where N is the counting measure, κ∈Κ is the mark attached to each event, and H_t is the history up to time t. H_t may include all available information, such as information on events that occurred before t and the values of covariates observed at the same time.

Because the conditional intensity function uniquely determines the probabilistic structure of most point processes (see, for example, Reference 2), it suffices to consider estimating the conditional intensity function when estimating a point process.
(Reference 2) D. J. Daley and D. Vere-Jones, "An Introduction to the Theory of Point Processes", Springer Science and Business Media, New York, 2003.

In this embodiment, the sequence of saccades is assumed to be generated from a marked point process whose marks are the saccade vectors. Because the stimulus presented in this problem setting is an image, which is temporally stationary, the occurrence of saccades is also assumed to be temporally stationary. That is, the conditional intensity function is assumed to factorize as

and the function λ(t|H_t), which describes the occurrence of saccades, is assumed to be stationary. Note that the conditional intensity function is itself a stochastic process. Here, λ_S(κ|H_t) is the term that describes toward which point on the image plane a saccade is likely to occur, and is the function corresponding to the saliency map.

Next, consider what kind of function is suitable for λ_S(κ|H_t). In this embodiment, the form of λ_S(κ|H_t) is determined with reference to the superior colliculus of the midbrain, which is closely related to saccade generation. The superior colliculus contains neurons that fire only when a saccade of a particular amplitude and direction occurs. That is, each neuron has a particular region of the visual field called its movement field (denoted MF in FIG. 2), and the neuron fires at the moment a saccade is made toward that region. FIG. 2 illustrates this. In the figure, "spike train" indicates the firing state of the neuron and "scan path" indicates the gaze point. It is also known that the arrangement of these neurons (the motor map on the SC) forms a map that corresponds spatially to the amplitude and direction to which each neuron responds. FIG. 3 shows this correspondence: the arrangement of neurons in the superior colliculus corresponds spatially to the center positions of their receptive fields. On the basis of this correspondence, the saliency map is sometimes said to be represented in the superior colliculus.

If the saliency map is represented by the superior colliculus, it can be considered to be expressed as a sum of the receptive fields of superior colliculus neurons (a receptive field being the region of space in which a presented stimulus changes the neuron's response). With reference to this physiological finding, this embodiment expresses λ_S(κ|H_t) as the following non-negative combination of Gaussian kernels.

where

and c_j is the firing frequency, μ_j is the center position of the receptive field, and Λ_j is a parameter representing the spread of the receptive field. In particular, c_j is allowed to depend on the history H_t of past saccades. x_t- is the gaze point immediately before time t. When a saccade occurs at time t, x_t- corresponds to the departure point of the saccade and κ+x_t- to its arrival point.
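The following Python sketch evaluates a non-negative combination of Gaussian kernels of the kind described above. The exact kernel expression is not reproduced in this text, so the sketch assumes an unnormalised Gaussian bump evaluated at the arrival point κ+x_t-, with Λ_j taken to be diagonal and treated as a variance-like spread. The function names and numerical values are hypothetical.

```python
import math


def gaussian_kernel(y, mu, var):
    """Unnormalised Gaussian bump with diagonal spread `var` (assumed form)."""
    q = sum((yi - mi) ** 2 / vi for yi, mi, vi in zip(y, mu, var))
    return math.exp(-0.5 * q)


def saliency_intensity(kappa, x_prev, components):
    """Intensity of a saccade vector `kappa` leaving from gaze point `x_prev`.

    `components` is a list of (c_j, mu_j, var_j) with every c_j >= 0.
    """
    arrival = tuple(k + x for k, x in zip(kappa, x_prev))   # kappa + x_{t-}
    return sum(c * gaussian_kernel(arrival, mu, var) for c, mu, var in components)


components = [
    (2.0, (100.0, 100.0), (400.0, 400.0)),   # (c_1, mu_1, diagonal of Lambda_1)
    (1.0, (300.0, 200.0), (900.0, 900.0)),   # (c_2, mu_2, diagonal of Lambda_2)
]
x_prev = (150.0, 120.0)
to_first = saliency_intensity((-50.0, -20.0), x_prev, components)   # lands on mu_1
to_second = saliency_intensity((150.0, 80.0), x_prev, components)   # lands on mu_2
print(to_first / to_second)   # roughly 2, the ratio c_1 / c_2
```

Because the two kernels barely overlap in this toy setting, the ratio of the two intensities is close to the ratio of the weights c_1 and c_2.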

The above is the definition of the saliency map in this embodiment. The advantages of defining the saliency map by a conditional intensity function are as follows. The greatest advantage is that the saliency value is the frequency of saccades. This makes it possible to explain probabilistic behavior of the gaze point that cannot be explained by the winner-take-all rule. It also gives meaning to the ratio of values between different pixels and to the ratio of values between different images. For example, if the saliency value at point A is twice that at point B, point A can be interpreted as being twice as likely to be gazed at as point B. Furthermore, the saliency map can be directly associated with the intensity function of the firing activity of superior colliculus neurons.

Image information is used as follows. In advance, {(c_j, μ_j, Λ_j)}_{j=1}^J is estimated by applying support vector regression with Gaussian kernels to a saliency map image S computed by a conventional method. Because the conditional intensity function must be non-negative, a non-negativity constraint is imposed on the weight coefficients ^-c_j when applying support vector regression. In the posterior estimation stage, {(μ_j, Λ_j)}_{j=1}^J among the parameters estimated in advance are used as fixed hyperparameters. {c_j}_{j=1}^J is used as the prior distribution when estimating c_j(H_t).
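The effect of the non-negativity constraint can be sketched on a one-dimensional toy problem. The code below does not implement support vector regression. As a simpler stand-in, it fits the kernel weights by non-negative least squares solved with projected gradient descent, with the kernel centers and width held fixed. The function names, the grid, and the target values are all hypothetical.

```python
import math


def kernel(y, mu, sigma):
    """1-D Gaussian bump (assumed kernel form)."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2)


def fit_nonnegative_weights(ys, target, centers, sigma, iters=300):
    """Minimise 0.5 * ||Phi c - target||^2 subject to c >= 0 by projected gradient."""
    phi = [[kernel(y, mu, sigma) for mu in centers] for y in ys]
    # Step size 1 / trace(Phi^T Phi) never exceeds 1 / (largest eigenvalue).
    step = 1.0 / sum(v * v for row in phi for v in row)
    c = [0.0] * len(centers)
    for _ in range(iters):
        resid = [sum(p * w for p, w in zip(row, c)) - s
                 for row, s in zip(phi, target)]
        grad = [sum(row[j] * r for row, r in zip(phi, resid))
                for j in range(len(c))]
        c = [max(0.0, w - step * g) for w, g in zip(c, grad)]   # project onto c >= 0
    return c


# Toy "saliency profile" built from known weights, then recovered by the fit.
ys = list(range(100))
centers = [30.0, 50.0, 70.0]
true_weights = [2.0, 0.0, 0.5]
target = [sum(w * kernel(y, mu, 5.0) for w, mu in zip(true_weights, centers))
          for y in ys]
weights = fit_nonnegative_weights(ys, target, centers, sigma=5.0)
print([round(w, 4) for w in weights])
```

The projection step `max(0.0, ...)` is what keeps every weight usable as an intensity; an unconstrained regression could return negative weights on noisy data.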

<Generative model of the gaze point>
Let T be the observation time, R the number of samples taken by the eye tracker, and Δt = T/R the sampling interval. Let x_r ∈ R², r = 1, …, R, be the gaze point measured by the eye tracker, and let a_r ∈ R², r = 1, …, R, be the position of the point to which the person is truly attending (that is, the point of interest). The point of interest a_r is assumed to change continuously most of the time but to jump by a large amount when the object of attention changes. In this embodiment, this instantaneous jump of the point of interest is called a saccade.

Here, let κ_r be a random variable that takes the vector of the saccade if a saccade occurs at time r, and the empty set {} if none occurs (in the formulas, the empty set is also written as

). Using κ_r, the point of interest a_r is assumed to be generated by the following random walk with jumps.

Under this assumption, the point of interest a_r is a discontinuous time series. That is, the person's true point of interest a_r at each time r is modeled relative to the true point of interest a_r-1 one time step earlier. When a saccade occurs in the vicinity of time r (between time r-1 and time r), a_r is the position obtained by moving a_r-1 according to the direction and magnitude of the saccade (that is, the saccade vector κ_r). When no saccade occurs in the vicinity of time r, a_r is the position obtained by moving a_r-1 in a random direction and by a random magnitude; in other words, the destination is generated by a random walk step u_r ~ N(0,U).
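The random walk with jumps described above can be sketched in Python as follows. The sketch assumes an isotropic covariance U = step_std² I for the random walk step and represents "no saccade" by `None`. The function name `simulate_attention` and the numbers are hypothetical.

```python
import random


def simulate_attention(a0, kappas, step_std, seed=0):
    """Simulate the point of interest a_r as a random walk with jumps.

    kappas[r] is a 2-D saccade vector, or None when no saccade occurs at step r.
    """
    rng = random.Random(seed)
    path = [a0]
    for kappa in kappas:
        prev = path[-1]
        if kappa is not None:
            move = kappa                                   # jump by the saccade vector
        else:
            # u_r ~ N(0, U) with U = step_std^2 * I (assumed isotropic)
            move = (rng.gauss(0.0, step_std), rng.gauss(0.0, step_std))
        path.append((prev[0] + move[0], prev[1] + move[1]))
    return path


kappas = [None, None, (50.0, -20.0), None]
path = simulate_attention((100.0, 100.0), kappas, step_std=0.5)
for r, (ax, ay) in enumerate(path):
    print(r, round(ax, 2), round(ay, 2))
```

The trajectory drifts slightly while no saccade occurs and shifts by exactly the saccade vector at the third step.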

However, because of the physical constraints of the eyeball, the gaze point cannot move instantaneously: after we decide to attend to a point, the gaze point reaches it only after the delay needed for the muscles to move the eyeball. The lag of the gaze point behind the point of interest a_r caused by this constraint on eye movement is denoted s_r, and the delay s_r of the person's gaze point at each time is assumed to be generated by the following AR(2) model.

where F₁ and F₂ are parameters that determine the characteristics of eye movement.

Under this model, the trajectory of the true gaze point is given by a_r+s_r. That is, the true gaze point is modeled as the position obtained by correcting the true point of interest a_r by the gaze point delay s_r. FIG. 4 illustrates this decomposition. The gaze point actually measured is then modeled as the true gaze point a_r+s_r with noise w_r added, as follows.
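To make the decomposition into point of interest, delay, and noise concrete, the following one-dimensional Python sketch simulates a measured gaze point as a_r + s_r + w_r. The exact AR(2) equation is not reproduced in this text. The sketch therefore assumes scalar coefficients F₁ and F₂ and, as a further assumption of its own, that a saccade of size κ_r kicks the delay by -κ_r so that the gaze point does not jump at the instant the point of interest does. All names and values are hypothetical.

```python
import random


def simulate_gaze(a, kappas, f1, f2, obs_std=0.0, seed=0):
    """1-D sketch: a[r] is the point of interest, kappas[r] the jump size (0.0 if none)."""
    rng = random.Random(seed)
    s_prev, s_prev2 = 0.0, 0.0
    xs = []
    for a_r, kappa in zip(a, kappas):
        # AR(2) recursion for the delay; the "- kappa" kick is an assumption.
        s_r = f1 * s_prev + f2 * s_prev2 - kappa
        xs.append(a_r + s_r + rng.gauss(0.0, obs_std))   # x_r = a_r + s_r + w_r
        s_prev, s_prev2 = s_r, s_prev
    return xs


# The point of interest jumps from 0 to 10 at step 5 and then stays there.
kappas = [0.0] * 5 + [10.0] + [0.0] * 14
a, pos = [], 0.0
for k in kappas:
    pos += k
    a.append(pos)

x = simulate_gaze(a, kappas, f1=0.7, f2=-0.1)   # noise-free for clarity
print([round(v, 2) for v in x[4:10]])
```

With these stable coefficients the simulated gaze point stays at 0 at the moment of the jump and then approaches 10 smoothly, which is the lagging behavior the delay term is meant to capture.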

Next, consider how the saccades {κ_r} are generated. In this embodiment, as explained in <Definition of the saliency map by a conditional intensity function>, the saccades are assumed to be generated from the marked point process defined by the conditional intensity functions of Eqs. (2) and (3). Note, however, that because the observations are discrete, the conditional intensity function is defined in discrete time. Immediately after a saccade occurs, a new saccade is unlikely to occur, and the destination of a saccade is considered to depend on the gaze point immediately before it occurs. Therefore, a latent variable

indicating toward which mixture component of the saliency map each saccade was directed is introduced, and {(κ_r, j_r)} is assumed to be generated from the following marked point process.

Here, r^* is the index of the occurrence time of the last (most recent) saccade before time r, and i^* is the index indicating toward which mixture component that saccade was directed. If no saccade has occurred before r, r^*=0. A={A_ij} is a transition probability matrix: A_ij is the probability that, given that a saccade occurs, it goes from mixture component i to mixture component j. π_j is the probability that the first saccade is directed toward mixture component j. h(・) is a correction term that prevents saccades from occurring at short time intervals; in this embodiment, the hazard function of the negative binomial distribution

is used, where

and θ is the mean of the intervals between saccades and m is a parameter that controls how unlikely a saccade is to occur immediately after the previous one. In particular, when m=1 the negative binomial distribution coincides with the geometric distribution and the hazard is constant.
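The role of m can be checked numerically. The parametrization used in the patent is not reproduced in this text, so the Python sketch below assumes that the interval D between saccades takes values 1, 2, ... and that D - 1 follows a negative binomial distribution with shape m and mean θ - 1 (which requires θ > 1). The hazard is the probability of a saccade at interval d given that none has occurred earlier. The function names are hypothetical.

```python
import math


def interval_pmf(d, theta, m):
    """P(D = d) for d = 1, 2, ..., with D - 1 negative binomial (assumed form)."""
    p = m / (m + theta - 1.0)               # chosen so that E[D] = theta
    k = d - 1
    log_coef = math.lgamma(k + m) - math.lgamma(k + 1) - math.lgamma(m)
    return math.exp(log_coef + k * math.log1p(-p) + m * math.log(p))


def hazard(d, theta, m):
    """P(D = d | D >= d): chance of a saccade at interval d given none so far."""
    survival = 1.0 - sum(interval_pmf(j, theta, m) for j in range(1, d))
    return interval_pmf(d, theta, m) / survival


for m in (1.0, 2.0):
    print(m, [round(hazard(d, 5.0, m), 3) for d in range(1, 6)])
```

Under this assumed parametrization, m = 1 gives a flat hazard of 1/θ, while m = 2 gives a hazard that starts low right after a saccade and rises with the elapsed interval.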

Furthermore, the following prior distributions are set for A, π, and θ.

where

and {(c_j, μ_j, Λ_j)} are the parameters computed in advance by support vector regression. α₀ is a parameter that adjusts the degree to which image information is reflected in the prior distribution, and is set to an appropriate value in advance.

The above is the definition of the generative model in this embodiment. Finally, the model is summarized.

<Estimation of the posterior distribution>
In the estimation step, the goal is to estimate the posterior distribution of the latent variables and the hyperparameters under the model of Eq. (10). The estimation method is described step by step. First, the generative model of Eq. (10) is rewritten as a model that is easier to handle. Next, a method for obtaining an approximation of the posterior distribution under the rewritten model is described. Finally, how to interpret the resulting posterior distribution is discussed.

<Reduction to a switching linear Gaussian state-space model>
The generative model of Eq. (10) can be transformed into an equivalent hidden semi-Markov switching linear Gaussian state-space model. Rewriting it in this form makes it possible to estimate the posterior distribution of the latent variables within the variational Bayes framework.

In place of the latent variables {j_r}_{r=1}^R, which indicate toward which mixture component each saccade was directed, new latent variables {z_r}_{r=1}^R are defined as

where j_(r+1)^* is the index indicating toward which mixture component the last saccade before time r+1 was directed. Within superscripts and subscripts, A^B denotes A with superscript B and A_B denotes A with subscript B. For events related to {z_r}_{r=1}^R, the notation

is used. Under these definitions, {z_r}_{r=1}^R follows a semi-Markov process whose transition probabilities are

and whose initial probabilities are


Next, the generative model of {(x_r, a_r, s_r, κ_r)}_{r=1}^R is rewritten. The key point of the rewriting is to eliminate κ_r.

Here, new variables are defined as follows.

With the variables defined in this way, the generative model of {(x_r, b_r)}_{r=1}^R can be rewritten as the following linear state-space model.

Therefore, the generative model of Eq. (10) can be expressed as a model in which a latent variable following a semi-Markov process exists in the background and linear Gaussian state-space models are connected locally according to that latent variable. Such a model is called a switching linear state-space model.

Finally, the model obtained by rewriting Eq. (10) is summarized.

<Variational Bayes>
Hereafter, for simplicity of notation, the following shorthand is used.

In the estimation step, the goal is to compute the posterior distribution of b, z, and φ given x. However, because the structure of the model is complicated, the true posterior distribution cannot be computed analytically. Therefore, an approximation of the true posterior distribution is computed here using the variational Bayes framework.

Variational Bayes is a method that, from a family of distributions specified in advance so as to simplify computation, finds the distribution that minimizes the KL divergence from the true posterior distribution and uses it as an approximation of the true posterior. The marginal log-likelihood log p(x|ψ) of x can be decomposed, using an arbitrary distribution q(b,z,φ), as

where KL(q||p(b,z,φ|x,ψ)) is the KL divergence between q(b,z,φ) and the true posterior distribution p(b,z,φ|x,ψ). Because the left-hand side of the first line of Eq. (19) does not depend on q, maximizing L(q) is equivalent to minimizing the KL divergence. In particular, L(q) attains its maximum when q coincides with the true posterior distribution p(b,z,φ|x,ψ).
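The decomposition of the marginal log-likelihood into L(q) plus a KL divergence can be verified numerically on a toy model. The Python sketch below uses a single discrete latent variable with three states in place of (b, z, φ) and an arbitrary q. The probabilities are made up for illustration, and L(q) is taken to be the usual lower bound, the expectation under q of log p(x, z) - log q(z).

```python
import math

# Toy joint p(x, z) for one fixed observation x and a latent z with three states.
joint = [0.10, 0.25, 0.05]                    # p(x, z = k)
evidence = sum(joint)                         # p(x)
posterior = [j / evidence for j in joint]     # p(z | x)


def lower_bound(q):
    """L(q) = sum_k q_k * (log p(x, z=k) - log q_k)."""
    return sum(qk * (math.log(jk) - math.log(qk))
               for qk, jk in zip(q, joint) if qk > 0)


def kl(q, p):
    """KL(q || p) for discrete distributions."""
    return sum(qk * math.log(qk / pk) for qk, pk in zip(q, p) if qk > 0)


q = [0.5, 0.3, 0.2]                           # an arbitrary approximating distribution
lhs = math.log(evidence)
rhs = lower_bound(q) + kl(q, posterior)
print(lhs, rhs)                               # the two values agree
print(lower_bound(posterior))                 # the bound is tight at the true posterior
```

Since the KL term is never negative, L(q) cannot exceed the log evidence, and the gap closes exactly when q equals the true posterior.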

Therefore, when finding the approximation q of the posterior distribution, it suffices to consider maximizing L(q) instead of minimizing the KL divergence from the true posterior distribution. However, it is difficult to compute L(q) for an arbitrary distribution q. Therefore, a family Q of distributions for which L(q) is easy to compute is specified, and the distribution that maximizes L(q) within this family is sought. That is, the estimation of the posterior distribution is reduced to solving the following optimization problem.

本実施形態ではこの分布族Qとして、独立性制約 In the present embodiment, the distribution family Q is specified by the independence constraint

を満たす分布の族を指定する。このような制約を導入することにより、座標降下法によってL(q)の最適化を効率的に行うことができる。具体的には、q(b)の更新にカルマンスムーザ、q(z)の更新にforward-backwardアルゴリズムが利用できる。さらに、q(φ)の更新も解析的な最適解を導出できる。また、ψについてL(q)を最大化することにより、これらのハイパーパラメータをもデータから決定することが可能である。 That is, Q consists of the distributions satisfying this constraint. Introducing such a constraint allows L(q) to be optimized efficiently by coordinate descent, that is, by updating one factor at a time while the others are held fixed. Specifically, a Kalman smoother can be used to update q(b), and the forward-backward algorithm can be used to update q(z). The update of q(φ) also admits an analytical optimal solution. Furthermore, by maximizing L(q) with respect to ψ, these hyperparameters can also be determined from the data.

以下、上述の処理を実現する推定装置について説明する。 Hereinafter, an estimation device that realizes the above processing will be described.

＜第一実施形態に係る推定装置＞
注視点の時系列を入力として、注視点モデルのパラメータを推定する推定装置について説明する。 <Estimating device according to the first embodiment>
An estimation device that estimates the parameters of the gazing point model using the gazing point time series as input will be described.

推定装置は、例えば、中央演算処理装置（CPU: Central Processing Unit）、主記憶装置（RAM: Random Access Memory）などを有する公知又は専用のコンピュータに特別なプログラムが読み込まれて構成された特別な装置である。推定装置は、例えば、中央演算処理装置の制御のもとで各処理を実行する。推定装置に入力されたデータや各処理で得られたデータは、例えば、主記憶装置に格納され、主記憶装置に格納されたデータは必要に応じて中央演算処理装置へ読み出されて他の処理に利用される。推定装置の各処理部は、少なくとも一部が集積回路等のハードウェアによって構成されていてもよい。推定装置が備える各記憶部は、例えば、RAM（Random Access Memory）などの主記憶装置、ハードディスクや光ディスクもしくはフラッシュメモリ（Flash Memory）のような半導体メモリ素子により構成される補助記憶装置、またはリレーショナルデータベースやキーバリューストアなどのミドルウェアにより構成することができる。 The estimation device is, for example, a special device configured by loading a special program into a known or dedicated computer having a central processing unit (CPU), a main storage device (RAM: Random Access Memory), and the like. The estimation device executes each process under the control of the central processing unit, for example. The data input to the estimation device and the data obtained by each process are stored, for example, in the main storage device, and the data stored in the main storage device is read out to the central processing unit as needed and used for other processing. At least a part of each processing unit of the estimation device may be configured by hardware such as an integrated circuit. Each storage unit included in the estimation device can be configured by, for example, a main storage device such as RAM (Random Access Memory), an auxiliary storage device composed of a hard disk, an optical disk, or a semiconductor memory element such as a flash memory, or middleware such as a relational database or a key-value store.

図５は第一実施形態に係る推定装置の機能ブロック図を、図６はその処理フローの例を示す図である。 FIG. 5 is a functional block diagram of the estimation device according to the first embodiment, and FIG. 6 is a diagram showing an example of the processing flow.

推定装置は、モデル推定部１０１と出力部１７０とを含む。 The estimation device includes a model estimation unit 101 and an output unit 170.

モデル推定部１０１は、初期設定部１１０と、第１更新部１２０と、第２更新部１３０と、第３更新部１４０と、第４更新部１５０と、制御部１６０とを含む。 The model estimation unit 101 includes an initial setting unit 110, a first update unit 120, a second update unit 130, a third update unit 140, a fourth update unit 150, and a control unit 160.

<<モデル推定部１０１>>
[入力]：アイトラッカーにより計測された注視点の時系列x_r（r=1,2,…,R)
[出力]：モデルパラメータの推定結果q(b),q(z),q(φ),ψ
[処理]：モデル推定部１０１は、アイトラッカーにより計測された注視点(対象者の眼の動きを計測して得た注視点)の時系列x_r（r=1,2,…,R)を入力として、(17)式により表される注視点の時系列モデルの各パラメータb,z,φ,ψの事後分布を推定することにより、注視点の時系列モデル（学習済みモデル）を求める。 << Model estimation unit 101 >>
[Input]: Time series x_r (r=1,2,…,R) of gaze points measured by an eye tracker
[Output]: Model parameter estimation results q(b), q(z), q(φ), ψ
[Processing]: The model estimation unit 101 takes as input the time series x_r (r=1,2,…,R) of gaze points measured by an eye tracker (gaze points obtained by measuring the subject's eye movements), and obtains the gaze point time series model (trained model) by estimating the posterior distribution of each parameter b, z, φ, ψ of the gaze point time series model expressed by Eq. (17).

例えば、独立性制約 For example, independence constraint

を満たす分布族の下で、 Under a distribution family that meets

のL(q)が最大となる事後分布を求める。 Find the posterior distribution that maximizes L (q) of.

以下、モデル推定部１０１の具体的な処理について説明する。 Hereinafter, specific processing of the model estimation unit 101 will be described.

〔初期設定部１１０〕
初期設定部１１０は、q(b,z,φ),ψに適当な初期値を設定し(Ｓ１１０)、出力する。このとき、q(b,z,φ)については、(21)式 [Initial setting unit 110]
The initial setting unit 110 sets appropriate initial values for q (b, z, φ) and ψ (S110), and outputs the data. At this time, for q (b, z, φ), Eq. (21)

を満たすようにq(b),q(z),q(φ)を与える。ここで、q(b),q(z),q(φ)は、以後の処理で更新の際に用いる各変数の事後期待値が計算できるような分布を与える。例えば、各分布を更新する際に更新先となる分布と、同じ形の分布となるようにパラメータを決定する。本実施形態では、
q(b):正規分布
q(φ):q(A)とq(π)はディリクレ分布、q(θ)はベータ分布
となるようにパラメータを決定する。なお、q(z)は、離散確率変数なので、どのような値でもよい。 q(b), q(z), q(φ) are given so as to satisfy Eq. (21). Here, q(b), q(z), q(φ) are chosen as distributions for which the posterior expectations of the variables used in the subsequent updates can be computed. For example, the parameters are determined so that each distribution has the same form as the distribution it will be updated to. In this embodiment, the parameters are determined so that
q(b): normal distribution
q(φ): q(A) and q(π) are Dirichlet distributions, and q(θ) is a beta distribution. Since z is a discrete random variable, q(z) may take any values.

〔第１更新部１２０〕
第１更新部１２０は、q(z),q(φ),ψを入力とし、q(z),q(φ),ψが与えられているもとで、(19)式のL(q)を最大にするq(b)を求め、q(b)の値を求めた値で更新する(Ｓ１２０)。例えば、初回の処理では初期設定部１１０で設定したq(z),q(φ),ψを用い、2回目以降の処理ではそれぞれ第２更新部１３０、第３更新部１４０、第４更新部１５０で更新した最新のq(z),q(φ),ψを用いる。以下に例を示す。 [First update unit 120]
The first update unit 120 takes q(z), q(φ), ψ as inputs and, given q(z), q(φ), ψ, finds the q(b) that maximizes L(q) in Eq. (19) and updates q(b) to the result (S120). For example, the first pass uses the q(z), q(φ), ψ set by the initial setting unit 110, and the second and subsequent passes use the latest q(z), q(φ), ψ updated by the second update unit 130, the third update unit 140, and the fourth update unit 150, respectively. An example is shown below.

q(z),q(φ)およびψが与えられているもとで、L(q)を最大にするq(b)は Given q (z), q (φ) and ψ, q (b) that maximizes L (q) is

となる。ここで、b以外の変数についての完全対数尤度の事後期待値は、次のような線形ガウス状態空間モデルの対数尤度と一致する。 Here, the posterior expectation of the complete-data log-likelihood, taken over the variables other than b, coincides with the log-likelihood of the following linear Gaussian state-space model.

ただし、~F_r,~o_r,~P_rは、 where ~F_r, ~o_r, ~P_r are

であり、{ξ_rj}_{r=1,…,R,j=1,…,J}は、 and {ξ_rj}_{r=1,…,R, j=1,…,J} is

である。したがって、通常の線形ガウス状態空間モデルと同様、カルマンスムーザを適用することによってq(b)が計算可能である。既存のカルマンスムーザを適用することができるため、その詳細は省略する。 Is. Therefore, q (b) can be calculated by applying the Kalman smoother, as in the normal linear Gaussian state-space model. Since the existing Kalman smoother can be applied, the details are omitted.
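Since the text omits the details of the Kalman smoother, the following is a minimal sketch of a Rauch-Tung-Striebel smoother for a generic linear Gaussian state-space model. It assumes time-invariant matrices `F`, `H`, `Q`, `R` and a prior `m0`, `P0` on the first state; all of these names are hypothetical. The model in this embodiment uses the time-varying quantities ~F_r, ~o_r, ~P_r, whose definitions are not reproduced here, so this is an illustration of the technique rather than the update actually used for q(b).

```python
import numpy as np

def kalman_smoother(y, F, H, Q, R, m0, P0):
    """RTS smoother for x_t = F x_{t-1} + w_t, y_t = H x_t + v_t.

    w_t ~ N(0, Q), v_t ~ N(0, R); (m0, P0) is the prior on the first state.
    Returns smoothed means (T, d) and covariances (T, d, d).
    """
    T, d = len(y), len(m0)
    m_pred = np.zeros((T, d)); P_pred = np.zeros((T, d, d))
    m_filt = np.zeros((T, d)); P_filt = np.zeros((T, d, d))
    m, P = m0, P0
    for t in range(T):
        if t > 0:  # time update (the prior already describes t = 0)
            m, P = F @ m, F @ P @ F.T + Q
        m_pred[t], P_pred[t] = m, P
        S = H @ P @ H.T + R                 # innovation covariance
        K = np.linalg.solve(S, H @ P).T     # Kalman gain P H^T S^{-1}
        m = m + K @ (y[t] - H @ m)          # measurement update
        P = P - K @ S @ K.T
        m_filt[t], P_filt[t] = m, P
    m_s, P_s = m_filt.copy(), P_filt.copy()
    for t in range(T - 2, -1, -1):          # backward pass
        G = np.linalg.solve(P_pred[t + 1], F @ P_filt[t]).T  # P_filt F^T P_pred^{-1}
        m_s[t] = m_filt[t] + G @ (m_s[t + 1] - m_pred[t + 1])
        P_s[t] = P_filt[t] + G @ (P_s[t + 1] - P_pred[t + 1]) @ G.T
    return m_s, P_s
```

The smoothed means and covariances are the moments of a Gaussian q(b) in a model of this generic form; a time-varying variant would index `F`, `Q`, and the observations by r.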

〔第２更新部１３０〕
第２更新部１３０は、q(b),q(φ),ψを入力とし、q(b),q(φ),ψが与えられているもとで,(19)式のL(q)を最大にするq(z)を求め、q(z)の値を求めた値で更新する(Ｓ１３０)。例えば、初回の処理では初期設定部１１０で設定したq(φ),ψと第１更新部１２０で更新したq(b)を用い、2回目以降の処理ではそれぞれ第１更新部１２０、第３更新部１４０、第４更新部１５０で更新した最新のq(b),q(φ),ψを用いる。以下に例を示す。 [Second update unit 130]
The second update unit 130 takes q(b), q(φ), ψ as inputs and, given q(b), q(φ), ψ, finds the q(z) that maximizes L(q) in Eq. (19) and updates q(z) to the result (S130). For example, the first pass uses the q(φ), ψ set by the initial setting unit 110 and the q(b) updated by the first update unit 120, and the second and subsequent passes use the latest q(b), q(φ), ψ updated by the first update unit 120, the third update unit 140, and the fourth update unit 150, respectively. An example is shown below.

q(b),q(φ)およびψが与えられているもとで,L(q)を最大にするq(z)は Given q (b), q (φ) and ψ, the q (z) that maximizes L (q) is

となる。ここで、z以外の変数についての完全対数尤度の事後期待値は、次のようなセミマルコフ過程の対数尤度と一致する。 Here, the posterior expectation of the complete-data log-likelihood, taken over the variables other than z, coincides with the log-likelihood of the following semi-Markov process.

ただし、 However,

であり、これらの事後期待値は解析的に計算可能である。 and these posterior expectations can be computed analytically.

したがって、隠れセミマルコフモデルの推定に用いられるforward-backwardアルゴリズムを適用することによってq(z)が計算可能である。forward-backwardアルゴリズムでは、次式で定義されるforward-message(α,α^*)およびbackward-message(β,β^*)を再帰的に計算する。 Therefore, q(z) can be computed by applying the forward-backward algorithm used for estimating hidden semi-Markov models. In the forward-backward algorithm, the forward messages (α,α^*) and backward messages (β,β^*) defined by the following equations are computed recursively.

ただし、o_rは時刻rでの観測に対応する変数である。計算されたメッセージを用いることで、z_rの事後分布を得ることができる。具体的な更新式の詳細については、参考文献３等を参照されたい。
(参考文献３)S.-Z. Yu, "Hidden semi-Markov models", Artificial Intelligence, vol. 174, pp. 215-243, 2010. Here, o_r is the variable corresponding to the observation at time r. The posterior distribution of z_r can be obtained from the computed messages. For details of the specific update formulas, see Reference 3 and the like.
(Reference 3) S.-Z. Yu, "Hidden semi-Markov models", Artificial Intelligence, vol. 174, pp. 215-243, 2010.

しかしながら、通常のforward-backwardアルゴリズムに必要な計算量はO(J²R²)であり、時系列の長さRが長いときは望ましくない。そこで、本実施形態では参考文献４で提案された手法を用いて計算量を削減する。
(参考文献４)M. J. Johnson and A. S. Willsky, "Stochastic Variational Inference for Bayesian Time Series Models", International Conference on Machine Learning, 2014. However, the computational cost of the ordinary forward-backward algorithm is O(J²R²), which is undesirable when the time series length R is large. Therefore, in the present embodiment, the computational cost is reduced by using the method proposed in Reference 4.
(Reference 4) M. J. Johnson and A. S. Willsky, "Stochastic Variational Inference for Bayesian Time Series Models", International Conference on Machine Learning, 2014.

この手法の主なアイディアは、(27)式の隠れセミマルコフモデルを等価な隠れマルコフモデルへ変換するというものである。 The main idea of this method is to convert the hidden semi-Markov model of Eq. (27) into an equivalent hidden Markov model.

ここで、新たな潜在変数^-z_rを Now we have a new latent variable ^- z _r

とし、 and let

とする。ただし、 And. However,

とする。以上の設定のもとで、対数尤度が And. Under the above settings, the log-likelihood is

と表されるマルコフ過程を考えると、 Considering the Markov process expressed as

となる（参考文献５参照）。
(参考文献５)M. J. Johnson, "Bayesian time series models and scalable inference", PhD thesis, Massachusetts Institute of Technology, 2014. (See Reference 5).
(Reference 5) M. J. Johnson, "Bayesian time series models and scalable inference", PhD thesis, Massachusetts Institute of Technology, 2014.

すなわち、{^-z_r}^R _r=1の分布が得られれば、{z_r}^R _r=1の分布も得られるということになる。とくに、￣z_rについてのforward-message ^-αおよびbackward-message ^-βを That is, once the distribution of {^-z_r}_{r=1}^R is obtained, the distribution of {z_r}_{r=1}^R is obtained as well. In particular, if the forward message ^-α and the backward message ^-β for ^-z_r are defined as

とすれば、(α,α^*,β,β^*)および(^-α,^-β)には then, between (α,α^*,β,β^*) and (^-α,^-β), the relationship

という関係が成り立つ。 The relationship holds.

以上のような関係性を利用してq(z)の更新を行う。まず、隠れマルコフモデルにおけるforward-backwardアルゴリズムを用いて^-z_rについてのメッセージ(^-α,^-β)を計算する。そして、その結果を用いてz_rについてのメッセージ(α,α^*,β,β^*)を計算し、q(z)をそのメッセージに対応する分布に更新する。隠れマルコフモデルのforward-backwardアルゴリズムに必要な計算量はO((mJ)²R)であるため、この更新は時系列の長さRについて高々線形時間で済む。 q(z) is updated using the above relationships. First, the messages (^-α,^-β) for ^-z_r are computed using the forward-backward algorithm for hidden Markov models. Then, using the result, the messages (α,α^*,β,β^*) for z_r are computed, and q(z) is updated to the distribution corresponding to those messages. Since the computational cost of the forward-backward algorithm for the hidden Markov model is O((mJ)²R), this update takes at most linear time in the time series length R.
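The linear-in-R cost comes from the ordinary hidden Markov model forward-backward recursion, sketched below with per-step normalization for numerical stability. The function and argument names are hypothetical, and the construction of the augmented transition matrix over the mJ states (References 4 and 5) is not reproduced in this text, so `A` and `lik` are simply taken as given.

```python
import numpy as np

def hmm_forward_backward(pi, A, lik):
    """Scaled forward-backward recursion for a K-state hidden Markov model.

    pi  : (K,)   initial state probabilities
    A   : (K, K) transition matrix, rows sum to 1
    lik : (R, K) likelihood of the observation at each time under each state
    Returns the state posteriors gamma (R, K) and the log-likelihood.
    Cost is O(K^2 R); with K = mJ states this is O((mJ)^2 R).
    """
    R, K = lik.shape
    alpha = np.zeros((R, K))
    beta = np.ones((R, K))
    c = np.zeros(R)                      # per-step normalizers
    a = pi * lik[0]
    c[0] = a.sum()
    alpha[0] = a / c[0]
    for r in range(1, R):                # forward pass
        a = (alpha[r - 1] @ A) * lik[r]
        c[r] = a.sum()
        alpha[r] = a / c[r]
    for r in range(R - 2, -1, -1):       # backward pass
        beta[r] = (A @ (lik[r + 1] * beta[r + 1])) / c[r + 1]
    gamma = alpha * beta                 # posterior state marginals
    return gamma, np.log(c).sum()
```

Each time step costs one K-by-K matrix-vector product in each direction, which is where the O(K²R) total comes from.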

〔第３更新部１４０〕
第３更新部１４０は、q(b),q(z),ψが与えられているもとで、(19)式のL(q)を最大にするq(φ)を求め、q(φ)の値を求めた値で更新する(Ｓ１４０)。例えば、初回の処理では初期設定部１１０で設定したψと、それぞれ第１更新部１２０、第２更新部１３０で更新したq(b)、q(z)を用い、2回目以降の処理ではそれぞれ第１更新部１２０、第２更新部１３０、第４更新部１５０で更新した最新のq(b),q(z),ψを用いる。 [Third update unit 140]
Given q(b), q(z), and ψ, the third update unit 140 finds the q(φ) that maximizes L(q) in Eq. (19) and updates q(φ) to the result (S140). For example, the first pass uses the ψ set by the initial setting unit 110 and the q(b) and q(z) updated by the first update unit 120 and the second update unit 130, respectively, and the second and subsequent passes use the latest q(b), q(z), ψ updated by the first update unit 120, the second update unit 130, and the fourth update unit 150, respectively.

q(b),q(z)およびψが与えられているもとで、L(q)を最大にするq(φ)は Given q (b), q (z) and ψ, the q (φ) that maximizes L (q) is

となる。 Will be.

以降、それぞれの項についての更新式を示す。まず、q(A_i),i= 1,…,Jを Hereafter, the update formulas for each item are shown. First, q (A _i ), i = 1,…, J

と更新する。ただし、 And update. However,

である。 Is.

また、q(π)を Also, q (π)

と更新する。ただし、 And update. However,

である。 Is.

さらに、q(θ)を Furthermore, q (θ)

と更新する。ただし、 And update. However,

である。このようにして更新したq(A_i)、q(π)、q(θ)を用いて、(36)式によりq(φ)を更新する。 Is. Using q (A _i ), q (π), and q (θ) updated in this way, q (φ) is updated by Eq. (36).
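The update equations themselves are not reproduced in this text. As a rough illustration only: with Dirichlet factors for q(A_i) and q(π) and a beta factor for q(θ), conjugate variational updates generically add expected sufficient statistics to the prior parameters. The sketch below assumes that form; the function names and the meaning of the expected counts are hypothetical, and which statistics enter each update depends on the model equations not shown here.

```python
import numpy as np

def update_dirichlet(prior_conc, expected_counts):
    """Dirichlet update: prior concentration plus expected counts under q(z)."""
    return np.asarray(prior_conc, dtype=float) + np.asarray(expected_counts, dtype=float)

def dirichlet_mean(conc):
    """Posterior mean of a Dirichlet (normalizes along the last axis)."""
    conc = np.asarray(conc, dtype=float)
    return conc / conc.sum(axis=-1, keepdims=True)

def update_beta(a0, b0, expected_successes, expected_failures):
    """Beta update: prior pseudo-counts plus expected success/failure counts."""
    return a0 + expected_successes, b0 + expected_failures

# Example: one row A_i with a flat prior and expected transition counts.
row_post = update_dirichlet([1.0, 1.0, 1.0], [2.0, 0.5, 0.0])
row_mean = dirichlet_mean(row_post)   # posterior expectation of that row of A
```

The posterior mean of each row is what a later analysis of the expected transition matrix A would consume.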

〔第４更新部１５０〕
第４更新部１５０は、q(b),q(z),q(φ)を入力とし、q(b),q(z),q(φ)に基づいて、ハイパーパラメータU,V,Wを更新する(Ｓ１５０)。例えば、それぞれ第１更新部１２０、第２更新部１３０、第３更新部１４０で更新した最新のq(b),q(z),q(φ)を用いる。 [4th update unit 150]
The fourth update unit 150 takes q (b), q (z), q (φ) as input, and hyperparameters U, V, W based on q (b), q (z), q (φ). Is updated (S150). For example, the latest q (b), q (z), and q (φ) updated by the first update unit 120, the second update unit 130, and the third update unit 140 are used.

ハイパーパラメータそれぞれについての更新式を示す。まず、Uを The update formula for each hyperparameter is shown. First, U

により更新する。 Update by.

また、Vを Also, V

により更新する。 Update by.

さらに、Wを In addition, W

により更新する。 Update by.

更新したU,V,Wを用いて、ΨとΩを Using the updated U, V, W, Ψ and Ω

と設定し、このΨとΩを用いた行列方程式 And the matrix equation using this Ψ and Ω

の解fを求める。ただし、vecはベクトル化作用素であり、vec(Ω)はΩの各行ベクトルを並べたベクトルに対応する。この解fを用いて、F₁,F₂を Find the solution f of. However, vec is a vectorization operator, and vec (Ω) corresponds to a vector in which each row vector of Ω is arranged. Using this solution f, F ₁ , F ₂

により、更新する。 To update.
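The vectorization convention matters when solving for f: the text states that vec(Ω) stacks the row vectors of Ω, which differs from the column-stacking convention common in the linear algebra literature. A minimal sketch of the row-wise convention follows; the helper names are hypothetical, and the matrix equation itself is not reproduced here.

```python
import numpy as np

def vec_rowwise(M):
    """Stack the rows of M into one vector (row-major order)."""
    return np.asarray(M).reshape(-1)            # NumPy's default order is row-major

def unvec_rowwise(f, shape):
    """Inverse of vec_rowwise: rebuild a matrix of the given shape from f."""
    return np.asarray(f).reshape(shape)

Omega = np.array([[1, 2, 3],
                  [4, 5, 6]])
f = vec_rowwise(Omega)                          # -> [1 2 3 4 5 6]
f_colwise = Omega.reshape(-1, order="F")        # column stacking, for contrast: [1 4 2 5 3 6]
```

Mixing the two conventions silently permutes the entries of the solution, so the same helper pair should be used on both sides of the equation.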

以上の更新式は各パラメータについてL(q)を最大にするものとなっているが、外れ値に強いロバストな手法によって置き換えてもよい。 The above update formula maximizes L (q) for each parameter, but it may be replaced by a robust method that is resistant to outliers.

以上の処理により、ハイパーパラメータψ=(U,V,W,F₁,F₂)を更新することができる。 By the above processing, the hyperparameter ψ = (U, V, W, F ₁ , F ₂ ) can be updated.

〔制御部１６０〕
制御部１６０は、所定の終了条件を満たすまで第１更新部１２０〜第４更新部１５０を繰り返し実行させる（Ｓ１６０）。例えば、予め定めた繰り返し回に達したことを終了条件とし、所定の繰り返し回数に到達するまで第１更新部１２０〜第４更新部１５０を繰り返し実行させるよう制御する。 [Control unit 160]
The control unit 160 repeatedly executes the first update unit 120 to the fourth update unit 150 until a predetermined end condition is satisfied (S160). For example, the end condition is that the number of repetitions reached a predetermined number, and the first update unit 120 to the fourth update unit 150 are controlled to be repeatedly executed until the predetermined number of repetitions is reached.

或いは、第１更新部１２０によりq(b)を更新する前のq(b),q(z),q(φ),ψに基づいて計算されるL(q)と、第１更新部１２０〜第４更新部１５０により更新された後のq(b),q(z),q(φ),ψに基づいて計算されるL(q)の差が所定の閾値以下となることを終了条件とし、それまで第１更新部１２０〜第４更新部１５０を繰り返し実行させるよう制御する。 Alternatively, L (q) calculated based on q (b), q (z), q (φ), ψ before updating q (b) by the first update unit 120, and the first update unit 120. ~ Ends that the difference of L (q) calculated based on q (b), q (z), q (φ), ψ after being updated by the fourth update unit 150 is equal to or less than a predetermined threshold value. As a condition, control is performed so that the first update unit 120 to the fourth update unit 150 are repeatedly executed until then.

要するに、十分L(q)が大きくなるまで（L(q)が最大化に近づくまで）第１更新部１２０〜第４更新部１５０を繰り返し実行させればよい。 In short, the first update unit 120 to the fourth update unit 150 may be repeatedly executed until L (q) becomes sufficiently large (until L (q) approaches maximization).
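The two termination conditions described for the control unit 160 can be sketched as a single loop. This is a minimal sketch under assumptions: `state`, `update_steps`, and `objective` are hypothetical stand-ins for (q(b), q(z), q(φ), ψ), the four update units, and L(q) respectively.

```python
def run_until_converged(state, update_steps, objective, tol=1e-6, max_iter=100):
    """Apply the update steps in order until the objective stops changing.

    Stops when the change in objective over one full sweep is at most `tol`,
    or after `max_iter` sweeps, whichever comes first.
    Returns the final state and the number of sweeps performed.
    """
    prev = objective(state)
    for it in range(1, max_iter + 1):
        for step in update_steps:        # e.g. update q(b), q(z), q(phi), psi in turn
            state = step(state)
        cur = objective(state)
        if abs(cur - prev) <= tol:       # threshold on the change in the objective
            return state, it
        prev = cur
    return state, max_iter               # iteration-count condition
```

Setting `tol` to a negative value disables the threshold test, which gives the fixed-iteration-count variant.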

〔出力部１７０〕
出力部１７０は、所定の終了条件を満たした時点のパラメータをモデルパラメータの推定結果q(b),q(z),q(φ),ψとして出力する（Ｓ１７０）。 [Output unit 170]
The output unit 170 outputs the parameters at the time when the predetermined end condition is satisfied as the estimation results q (b), q (z), q (φ), ψ of the model parameters (S170).

＜効果＞
以上の構成により、アイトラッカーによって計測された注視点の時系列からサリエンシーマップのモデルパラメータを簡便に推定することができる。 <Effect>
With the above configuration, the model parameters of the saliency map can be easily estimated from the time series of the gazing point measured by the eye tracker.

また、変分事後分布qについての潜在変数の事後期待値や推定されたハイパーパラメータを観察することで、注視点の時系列に含まれる様々な情報を得ることができる。例えば、a_r+s_rの事後期待値を計算することで注視点からノイズを除去した時系列が得られる。とくに、背後にある注視点のジャンプの性質を加味した雑音除去がなされているという点で、通常の平滑化に比べて有用である。また、Aの事後期待値から、被験者の注意遷移の振る舞いを観察できる。具体的には、Aの事後期待値についてマルコフクラスタリングアルゴリズムを適用することで、条件付き強度関数を表現するために用いたガウシアンカーネルをいくつかのクラスタに分割できる。分割されたカーネルのクラスタそれぞれをオブジェクトだとみなすことで、画像内に存在する注意を引く対象の数を同定することも可能である。図７に代表的なものについての解釈を示す。 In addition, by examining the posterior expectations of the latent variables under the variational posterior distribution q and the estimated hyperparameters, various kinds of information contained in the gaze point time series can be obtained. For example, computing the posterior expectation of a_r+s_r yields a time series in which noise has been removed from the gaze points. This is more useful than ordinary smoothing in that the denoising takes into account the jump behavior of the underlying gaze point. The behavior of the subject's attention transitions can also be observed from the posterior expectation of A. Specifically, by applying the Markov clustering algorithm to the posterior expectation of A, the Gaussian kernels used to represent the conditional intensity function can be divided into several clusters. By regarding each cluster of kernels as an object, it is also possible to identify the number of attention-attracting objects present in the image. FIG. 7 shows the interpretation of representative quantities.
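The clustering step can be sketched as follows. This is a minimal sketch of the Markov clustering (MCL) algorithm, alternating expansion and inflation. The symmetrization, the added self-loops, the inflation value, and the thresholds are assumptions made for this illustration and are not taken from the text; `A_mean` stands for the posterior expectation of A.

```python
import numpy as np

def markov_clustering(A_mean, inflation=2.0, n_iter=100, tol=1e-9, eps=1e-6):
    """Cluster the states of a transition matrix with the MCL algorithm."""
    W = np.asarray(A_mean, dtype=float)
    M = W + W.T + np.eye(len(W))              # assumed preprocessing: symmetrize, add self-loops
    M /= M.sum(axis=0, keepdims=True)         # make columns sum to 1
    for _ in range(n_iter):
        M_new = np.linalg.matrix_power(M, 2)  # expansion: two-step random walk
        M_new = M_new ** inflation            # inflation: sharpen strong transitions
        M_new /= M_new.sum(axis=0, keepdims=True)
        converged = np.abs(M_new - M).max() < tol
        M = M_new
        if converged:
            break
    # Read clusters off as connected components of the surviving entries.
    n = len(M)
    linked = (M > eps) | (M.T > eps)
    labels = -np.ones(n, dtype=int)
    k = 0
    for s in range(n):
        if labels[s] >= 0:
            continue
        labels[s] = k
        stack = [s]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(linked[i]):
                if labels[j] < 0:
                    labels[j] = k
                    stack.append(j)
        k += 1
    return labels
```

The number of distinct labels then plays the role of the number of attention-attracting objects; larger `inflation` values tend to produce more, smaller clusters.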

＜シミュレーション結果＞
参考文献６のデータセットに第一実施形態を適用した結果を示す。このデータは、被験者が画像を見ている間の3秒間の注視点の軌跡をアイトラッカーによって計測したものである。
（参考文献６）T. Judd, K. Ehinger, F. Durand and A. Torralba, "Learning to predict where humans look", IEEE International Conference on Computer Vision, 2009 <Simulation result>
The result of applying the first embodiment to the data set of Reference 6 is shown. This data is obtained by measuring the trajectory of the gazing point for 3 seconds while the subject is viewing the image with an eye tracker.
(Reference 6) T. Judd, K. Ehinger, F. Durand and A. Torralba, "Learning to predict where humans look", IEEE International Conference on Computer Vision, 2009

このデータセットに含まれるひとつの画像のデータについて、第一実施形態を適用した。 15人の被験者のデータのうち、4人の被験者の注視点の時系列データに対して本実施形態を適用した。図８にある被験者のデータの１反復目の推定結果を、図９に１１反復目の推定結果を示す。「Saccade delay」の縦方向の破線は、推定したサッカードの発生時刻を示す。最初の反復では、ノイズによって多くの偽のサッカードが推定されてしまっているものの、反復を繰り返すことによって真のサッカードのみを分離することができているのが見て取れる。また、図１０は推定に用いた4人を含む15人の被験者のデータから計算した注視点の頻度分布と、参考文献１のサリエンシーマップにサポートベクター回帰を適用して得られた初期値とを表す。図１１は１１反復目の推定結果を示す。本実施形態を適用することで、１１反復目の推定結果が初期値よりも注視点の頻度分布に近い画像となっているのが見て取れる。 The first embodiment was applied to the data for one image included in this data set. Of the data from 15 subjects, the embodiment was applied to the gaze point time series data of 4 subjects. FIG. 8 shows the estimation result at the first iteration for one subject's data, and FIG. 9 shows the estimation result at the 11th iteration. The vertical dashed lines in "Saccade delay" indicate the estimated saccade occurrence times. In the first iteration, many false saccades are estimated because of noise, but it can be seen that repeating the iterations isolates only the true saccades. FIG. 10 shows the frequency distribution of gaze points calculated from the data of 15 subjects, including the 4 used for estimation, together with the initial value obtained by applying support vector regression to the saliency map of Reference 1. FIG. 11 shows the estimation result at the 11th iteration. It can be seen that, by applying this embodiment, the estimation result at the 11th iteration is an image closer to the frequency distribution of gaze points than the initial value is.

＜変形例＞
本実施形態では、q(b),q(z),q(φ),ψの順で、パラメータを更新しているが、更新の順番は変更してもよい。初期設定部１１０で設定した初期値、または、更新した最新のパラメータを用いて、第１更新部１２０、第２更新部１３０、第３更新部１４０、第４更新部１５０においてパラメータを更新すればよい。 <Modification example>
In this embodiment, the parameters are updated in the order q(b), q(z), q(φ), ψ, but the update order may be changed. It suffices for the first update unit 120, the second update unit 130, the third update unit 140, and the fourth update unit 150 to update the parameters using either the initial values set by the initial setting unit 110 or the latest updated parameters.

また、推定装置は、サッカード発生時刻推定部１８１、注視点系列推定部１８２、注目点系列推定部１８３、注目範囲推定部１８４、サリエンシーマップ生成部１８５と、の少なくともいずれかをさらに含む構成としてもよい（図１２参照）。 Further, the estimation device may be configured to further include at least one of a saccade occurrence time estimation unit 181, a gaze point sequence estimation unit 182, a point of interest sequence estimation unit 183, an attention range estimation unit 184, and a saliency map generation unit 185 (see FIG. 12).

例えば、サッカード発生時刻推定部１８１は、モデル推定部１０１で学習したモデルパラメータの推定結果q(b),q(z),q(φ),ψを入力とし、これらの値に基づいて、 For example, the saccade occurrence time estimation unit 181 inputs the estimation results q (b), q (z), q (φ), ψ of the model parameters learned by the model estimation unit 101, and based on these values,

により、サッカードの発生時刻を推定し、出力する。 Estimates the occurrence time of saccade and outputs it.

注視点系列推定部１８２は、モデル推定部１０１で学習したモデルパラメータの推定結果q(b),q(z),q(φ),ψを入力とし、これらの値に基づいて、 The gazing point series estimation unit 182 inputs the estimation results q (b), q (z), q (φ), ψ of the model parameters learned by the model estimation unit 101, and based on these values,

により、注視点の時系列の推定結果を求め、出力する。 To obtain and output the time-series estimation result of the gazing point.

注目点系列推定部１８３は、モデル推定部１０１で学習したモデルパラメータの推定結果q(b),q(z),q(φ),ψを入力とし、これらの値に基づいて、 The point of interest series estimation unit 183 inputs the estimation results q (b), q (z), q (φ), ψ of the model parameters learned by the model estimation unit 101, and based on these values,

により、注目点の時系列の推定結果を求め、出力する。 To obtain and output the time-series estimation result of the point of interest.

注目範囲推定部１８４は、モデル推定部１０１で学習したｚの事後分布q(z)に基づいて、注目範囲または注目対象の推定結果を求め、出力する。例えば、どの混合成分に向かってサッカードが発生したかを示すzの事後分布と、受容野の中心位置{μ_j}_j=1 ^Jと、受容野の広がり具合{Λ_j}_j=1 ^Jと、から、注目範囲または注目対象の推定結果を求めることができる。 The attention range estimation unit 184 obtains and outputs an estimate of the attention range or attention target based on the posterior distribution q(z) of z learned by the model estimation unit 101. For example, the estimate of the attention range or attention target can be obtained from the posterior distribution of z, which indicates toward which mixture component each saccade occurred, the center positions {μ_j}_{j=1}^J of the receptive fields, and the spreads {Λ_j}_{j=1}^J of the receptive fields.

サリエンシーマップ生成部１８５は、モデル推定部１０１で学習したモデルパラメータの推定結果q(b),q(z),q(φ),ψを入力とし、これらの値に基づいて、 The saliency map generation unit 185 inputs the estimation results q (b), q (z), q (φ), ψ of the model parameters learned by the model estimation unit 101, and based on these values,

により、サリエンシーマップを生成し、出力する。 Generates and outputs a saliency map.

出力部１７０は、サリエンシーマップのモデルパラメータの推定結果、サッカードの発生時刻、注視点の時系列の推定結果、注目点の時系列の推定結果、注目範囲または注目対象の推定結果、サリエンシーマップ、の少なくとも何れかを入力とし、推定装置及び出力部１７０は、少なくとも何れかを出力する。 The output unit 170 receives as input at least one of the estimation results of the model parameters of the saliency map, the saccade occurrence times, the estimate of the gaze point time series, the estimate of the point of interest time series, the estimate of the attention range or attention target, and the saliency map, and the estimation device and the output unit 170 output at least one of them.

＜その他の変形例＞
本発明は上記の実施形態及び変形例に限定されるものではない。例えば、上述の各種の処理は、記載に従って時系列に実行されるのみならず、処理を実行する装置の処理能力あるいは必要に応じて並列的にあるいは個別に実行されてもよい。その他、本発明の趣旨を逸脱しない範囲で適宜変更が可能である。 <Other variants>
The present invention is not limited to the above embodiments and modifications. For example, the various processes described above may not only be executed in chronological order according to the description, but may also be executed in parallel or individually as required by the processing capacity of the apparatus that executes the processes. In addition, changes can be made as appropriate without departing from the spirit of the present invention.

＜プログラム及び記録媒体＞
また、上記の実施形態及び変形例で説明した各装置における各種の処理機能をコンピュータによって実現してもよい。その場合、各装置が有すべき機能の処理内容はプログラムによって記述される。そして、このプログラムをコンピュータで実行することにより、上記各装置における各種の処理機能がコンピュータ上で実現される。 <Programs and recording media>
In addition, various processing functions in each device described in the above-described embodiment and modification may be realized by a computer. In that case, the processing content of the function that each device should have is described by the program. Then, by executing this program on the computer, various processing functions in each of the above devices are realized on the computer.

この処理内容を記述したプログラムは、コンピュータで読み取り可能な記録媒体に記録しておくことができる。コンピュータで読み取り可能な記録媒体としては、例えば、磁気記録装置、光ディスク、光磁気記録媒体、半導体メモリ等どのようなものでもよい。 The program describing the processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be, for example, a magnetic recording device, an optical disk, a photomagnetic recording medium, a semiconductor memory, or the like.

また、このプログラムの流通は、例えば、そのプログラムを記録したＤＶＤ、ＣＤ−ＲＯＭ等の可搬型記録媒体を販売、譲渡、貸与等することによって行う。さらに、このプログラムをサーバコンピュータの記憶装置に格納しておき、ネットワークを介して、サーバコンピュータから他のコンピュータにそのプログラムを転送することにより、このプログラムを流通させてもよい。 Further, the distribution of this program is performed, for example, by selling, transferring, renting, or the like a portable recording medium such as a DVD or a CD-ROM in which the program is recorded. Further, the program may be distributed by storing the program in the storage device of the server computer and transferring the program from the server computer to another computer via a network.

このようなプログラムを実行するコンピュータは、例えば、まず、可搬型記録媒体に記録されたプログラムもしくはサーバコンピュータから転送されたプログラムを、一旦、自己の記憶部に格納する。そして、処理の実行時、このコンピュータは、自己の記憶部に格納されたプログラムを読み取り、読み取ったプログラムに従った処理を実行する。また、このプログラムの別の実施形態として、コンピュータが可搬型記録媒体から直接プログラムを読み取り、そのプログラムに従った処理を実行することとしてもよい。さらに、このコンピュータにサーバコンピュータからプログラムが転送されるたびに、逐次、受け取ったプログラムに従った処理を実行することとしてもよい。また、サーバコンピュータから、このコンピュータへのプログラムの転送は行わず、その実行指示と結果取得のみによって処理機能を実現する、いわゆるＡＳＰ（Application Service Provider）型のサービスによって、上述の処理を実行する構成としてもよい。なお、プログラムには、電子計算機による処理の用に供する情報であってプログラムに準ずるもの（コンピュータに対する直接の指令ではないがコンピュータの処理を規定する性質を有するデータ等）を含むものとする。 A computer that executes such a program first, for example, first stores a program recorded on a portable recording medium or a program transferred from a server computer in its own storage unit. Then, when the process is executed, the computer reads the program stored in its own storage unit and executes the process according to the read program. Further, as another embodiment of this program, a computer may read the program directly from a portable recording medium and execute processing according to the program. Further, every time the program is transferred from the server computer to this computer, the processing according to the received program may be executed sequentially. In addition, the above processing is executed by a so-called ASP (Application Service Provider) type service that realizes the processing function only by the execution instruction and the result acquisition without transferring the program from the server computer to this computer. May be. In addition, the program shall include information used for processing by a computer and equivalent to the program (data that is not a direct command to the computer but has a property of defining the processing of the computer, etc.).

また、コンピュータ上で所定のプログラムを実行させることにより、各装置を構成することとしたが、これらの処理内容の少なくとも一部をハードウェア的に実現することとしてもよい。 Further, although each device is configured by executing a predetermined program on a computer, at least a part of these processing contents may be realized by hardware.

Claims

サッカードの時系列は、各サッカードの特徴量をマークとするマーク付き点過程で生成されるものとし、
時刻ｔの人の真の注目点は、時刻ｔ近傍にサッカードが発生する場合は時刻t-1の真の注目点を当該サッカードの方向及び大きさに応じて移動させた位置とし、時刻ｔ近傍にサッカードが発生しない場合は時刻t-1の真の注目点をランダムな方向及び大きさに応じて移動させた位置とし、
時刻ｔの人の注視点の遅れは、AR(2)モデルに従うものとし、
時刻ｔの人の真の注視点は、前記時刻ｔの真の注目点を前記時刻ｔの注視点の遅れにより補正した位置とし、
実際に計測される時刻ｔの注視点を、上記時刻ｔの真の注視点にノイズが加わったものとしてモデル化したものを注視点の生成モデルとして、
対象者の眼の動きを計測して得た注視点の時系列から、前記注視点の生成モデルのモデルパラメータを推定するモデル推定部を含む、
推定装置。 The saccade time series shall be generated in a marked point process with each saccade feature as a mark.
The true point of interest of a person at time t is the position where the true point of interest at time t-1 is moved according to the direction and size of the saccade when a saccade occurs in the vicinity of time t. If no saccade occurs in the vicinity of t, the true point of interest at time t-1 is set to a position moved according to a random direction and size.
The delay of the gaze point of the person at time t shall follow the AR (2) model.
The true gazing point of the person at time t is the position where the true point of interest at time t is corrected by the delay of the gazing point at time t.
A model of the gaze point at time t actually measured as a true gaze point at time t with noise added is used as a gaze point generation model.
It includes a model estimation unit that estimates the model parameters of the gaze point generation model from the gaze point time series obtained by measuring the movement of the eyes of the subject.
Estimator.

請求項１記載の推定装置であって、
前記モデル推定部で学習したモデルパラメータに基づいて、サッカードの発生時刻を推定するサッカード発生時刻推定部と、
前記モデル推定部で学習したモデルパラメータに基づいて、注視点の時系列の推定結果を求める注視点系列推定部と、
前記モデル推定部で学習したモデルパラメータに基づいて、注目点の時系列の推定結果を求める注目点系列推定部と、
前記モデル推定部で学習したモデルパラメータに基づいて、注目範囲または注目対象の推定結果を求める注目範囲推定部と、
前記モデル推定部で学習したモデルパラメータに基づいて、サリエンシーマップを生成するサリエンシーマップ生成部と、の少なくともいずれかをさらに含む、
推定装置。 The estimation device according to claim 1.
further including at least one of: a saccade occurrence time estimation unit that estimates saccade occurrence times based on the model parameters learned by the model estimation unit;
a gaze point sequence estimation unit that obtains an estimate of the gaze point time series based on the model parameters learned by the model estimation unit;
a point of interest sequence estimation unit that obtains an estimate of the point of interest time series based on the model parameters learned by the model estimation unit;
an attention range estimation unit that obtains an estimate of the attention range or attention target based on the model parameters learned by the model estimation unit; and
a saliency map generation unit that generates a saliency map based on the model parameters learned by the model estimation unit.
Estimator.

請求項１記載の推定装置であって、
κ_rを、時刻rにサッカードが発生した場合はそのサッカードのベクトル、発生しなかった場合は空集合をとる確率変数とし、j_rを時刻rにおけるサッカードがサリエンシーマップのどの混合成分に向かって発生したものかを示す潜在変数とし、H_rを時刻rまでの履歴とし、Sをサリエンシーマップ画像とし、r^*を時刻r以前で発生した最後のサッカードの発生時刻とし、時刻r以前に発生したサッカードがない場合はr^*=0とし、a_rを時刻rにおける真の注目点とし、s_rを時刻rにおける注視点の遅れとし、h_NBをサッカードが短い時間間隔で発生しないようにするための修正項とし、i^*を時刻r以前で発生した最後のサッカードがどの混合成分に向かって発生したものかを示すものとし、
前記生成モデルは、

により与えられる、
ことを特徴とする推定装置。 The estimation device according to claim 1.
Let κ_r be a random variable that takes the saccade vector if a saccade occurs at time r and the empty set otherwise; let j_r be a latent variable indicating toward which mixture component of the saliency map the saccade at time r occurred; let H_r be the history up to time r; let S be the saliency map image; let r^* be the occurrence time of the last saccade that occurred at or before time r, with r^*=0 if no saccade occurred at or before time r; let a_r be the true point of interest at time r; let s_r be the delay of the gaze point at time r; let h_NB be a correction term that prevents saccades from occurring at short time intervals; and let i^* indicate toward which mixture component the last saccade that occurred at or before time r occurred.
The generative model is

Given by,
An estimation device characterized by that.

請求項３記載の推定装置であって、
j_(r+1)^*を時刻r+1以前に発生した最後のサッカードが、どの混合成分に向かって発生したものかを示すものとし、
前記生成モデルは、

により与えられる、
ことを特徴とする推定装置。 The estimation device according to claim 3.
Let j_(r+1)^* indicate which mixture component the last saccade that occurred before time r+1 was directed toward.
The generative model is given by

An estimation device characterized by the above.

請求項４記載の推定装置であって、

とし、q(b,z,φ)を任意の事後分布とし、p(b,z,φ|x,ψ)を真の事後分布とし、前記モデル推定部は、次式により与えられるL(q)が大きくなるように、事後分布q(b),q(z),q(φ),ψを繰り返し更新することにより前記モデルパラメータを学習する、

ことを特徴とする推定装置。 The estimation device according to claim 4.

Let q(b,z,φ) be an arbitrary posterior distribution and let p(b,z,φ|x,ψ) be the true posterior distribution. The model estimation unit learns the model parameters by repeatedly updating the posterior distributions q(b), q(z), q(φ) and ψ so that L(q), given by the following expression, increases.

An estimation device characterized by the above.
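For orientation, the quantity L(q) in this claim can be read against the lower bound used in standard variational Bayes. The block below is a sketch under the assumption that L(q) is that usual bound for observations x, latent variables b, z, φ, and parameters ψ; the patent's own expression may differ in detail, and integrals over discrete variables would be sums.

```latex
\ln p(x \mid \psi)
  = L(q) + \mathrm{KL}\bigl(q(b,z,\varphi)\,\|\,p(b,z,\varphi \mid x,\psi)\bigr),
\qquad
L(q) = \int q(b,z,\varphi)\,
       \ln\frac{p(x,b,z,\varphi \mid \psi)}{q(b,z,\varphi)}\,
       db\,dz\,d\varphi .
```

Because the KL term is non-negative, L(q) never exceeds ln p(x|ψ), and increasing L(q) with respect to q brings q closer to the true posterior. If q is further assumed to factorize as q(b)q(z)q(φ), each factor can be updated in turn with the others held fixed, followed by an update of ψ, which matches the repeated updates named in the claim.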

サッカードの時系列は、各サッカードの特徴量をマークとするマーク付き点過程で生成されるものとし、
時刻ｔの人の真の注目点は、時刻ｔ近傍にサッカードが発生する場合は時刻t-1の真の注目点を当該サッカードの方向及び大きさに応じて移動させた位置とし、時刻ｔ近傍にサッカードが発生しない場合は時刻t-1の真の注目点をランダムな方向及び大きさに応じて移動させた位置とし、
時刻ｔの人の注視点の遅れは、AR(2)モデルに従うものとし、
時刻ｔの人の真の注視点は、前記時刻ｔの真の注目点を前記時刻ｔの注視点の遅れにより補正した位置とし、
実際に計測される時刻ｔの注視点を、上記時刻ｔの真の注視点にノイズが加わったものとしてモデル化したものを注視点の生成モデルとして、
対象者の眼の動きを計測して得た注視点の時系列から、前記注視点の生成モデルのモデルパラメータを推定するモデル推定ステップを含む、
推定方法。 The saccade time series is assumed to be generated by a marked point process whose marks are the features of each saccade.
The true point of interest of a person at time t is the position obtained by moving the true point of interest at time t-1 according to the direction and magnitude of the saccade when a saccade occurs in the vicinity of time t, and the position obtained by moving the true point of interest at time t-1 by a random direction and magnitude when no saccade occurs in the vicinity of time t.
The delay of the person's gaze point at time t follows an AR(2) model.
The true gaze point of the person at time t is the position obtained by correcting the true point of interest at time t by the delay of the gaze point at time t.
The generative model of the gaze point is obtained by modeling the actually measured gaze point at time t as the true gaze point at time t with noise added.
The method includes a model estimation step of estimating the model parameters of the generative model of the gaze point from a gaze point time series obtained by measuring the eye movements of a subject.
An estimation method.
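The generative model recited in this method claim can be illustrated by a forward simulation. The following is a minimal sketch, not the patented estimator: the function name, every parameter and default value, the constant per-step saccade probability (standing in for the history-dependent marked point process), the Gaussian saccade marks, and the sign convention for the delay correction are all assumptions made for illustration.

```python
import random


def simulate_gaze(n_steps, saccade_prob=0.05, saccade_scale=5.0,
                  drift_scale=0.1, ar_coefs=(1.2, -0.4), delay_scale=0.05,
                  noise_scale=0.2, seed=0):
    """Simulate a 2-D gaze point series (all parameters are hypothetical)."""
    rng = random.Random(seed)

    def gauss2(scale):
        # Independent zero-mean Gaussian draw for the x and y coordinates.
        return (rng.gauss(0.0, scale), rng.gauss(0.0, scale))

    attention = (0.0, 0.0)   # true point of interest at the previous step
    delay_1 = (0.0, 0.0)     # gaze delay at time t-1
    delay_2 = (0.0, 0.0)     # gaze delay at time t-2
    out = {"saccade_times": [], "attention": [], "observed": []}

    for t in range(n_steps):
        # Simplification: a saccade occurs with a fixed probability per step,
        # and its mark (the saccade vector) is Gaussian.
        if rng.random() < saccade_prob:
            step = gauss2(saccade_scale)
            out["saccade_times"].append(t)
        else:
            # No saccade: the point of interest moves by a small random step.
            step = gauss2(drift_scale)
        attention = (attention[0] + step[0], attention[1] + step[1])

        # AR(2) delay: a linear combination of the two previous delays
        # plus Gaussian innovation noise, per coordinate.
        eps = gauss2(delay_scale)
        delay = tuple(ar_coefs[0] * d1 + ar_coefs[1] * d2 + e
                      for d1, d2, e in zip(delay_1, delay_2, eps))
        delay_2, delay_1 = delay_1, delay

        # True gaze point: point of interest corrected by the delay
        # (subtraction is an assumed sign convention).
        true_gaze = (attention[0] - delay[0], attention[1] - delay[1])

        # Measured gaze point: true gaze point plus observation noise.
        noise = gauss2(noise_scale)
        out["attention"].append(attention)
        out["observed"].append((true_gaze[0] + noise[0],
                                true_gaze[1] + noise[1]))
    return out


if __name__ == "__main__":
    sim = simulate_gaze(500, seed=42)
    print(len(sim["saccade_times"]), "saccades in",
          len(sim["observed"]), "samples")
```

The model estimation step of the claim works in the opposite direction: given only a measured series such as `observed`, it estimates the parameters that this sketch takes as inputs.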

請求項１から請求項５の何れかの推定装置としてコンピュータを機能させるためのプログラム。 A program for causing a computer to function as the estimation device according to any one of claims 1 to 5.