JP2015210695A

JP2015210695A - Apparatus for detecting object in image

Info

Publication number: JP2015210695A
Application number: JP2014092361A
Authority: JP
Inventors: 寿乃田村; Hisano Tamura
Original assignee: Isuzu Motors Ltd
Current assignee: Isuzu Motors Ltd
Priority date: 2014-04-28
Filing date: 2014-04-28
Publication date: 2015-11-24

Abstract

PROBLEM TO BE SOLVED: To provide an apparatus for detecting an object in an image, configured to obtain an objective result when detecting an object to be visually recognized from an image including the object. SOLUTION: An apparatus for detecting an object in an image includes: a point-of-gaze detection section 7 which inputs an image including a plurality of objects to a pulse neural network and thereby detects, as time series data, a point of gaze of a viewer that changes with time in the image; a point-of-gaze field area setting section 8 which sets a point-of-gaze field area in the image with the point of gaze detected by the point-of-gaze detection section 7 as a reference; an object detection section 11 which detects, as an object to be visually recognized, an object among the plurality of objects that is included in the point-of-gaze field area set by the point-of-gaze field area setting section 8; and a detection information storage section 12 which accumulates and stores predetermined detection information on the detected object each time the object detection section 11 detects an object to be visually recognized.

Description

The present invention relates to an image object detection apparatus that detects an object from an image.

Japanese Laid-Open Patent Publication No. 2004-280597 describes a layout evaluation system, a layout evaluation program, and a layout evaluation method. The layout evaluation system uses the concept of "degree of attraction" as the criterion for extracting attention regions, that is, regions a person is likely to look at. The degree of attraction is a parameter matched to human subjectivity and is calculated from physical features such as heterogeneity, which is based on the heterogeneity of a region's color, texture, shape, and area. Because attention regions are evaluated according to physical features in a way that matches human subjectivity, attention regions that fit human subjectivity can be extracted. Specifically, the original image is segmented to extract figure regions, the degree of attraction of each figure region is calculated, and attention regions are extracted based on the calculated degrees of attraction. When a plurality of attention regions are extracted, the top N regions with the highest degree of attraction are extracted.

Patent Document 1: JP 2004-280597 A

The system described in Patent Document 1 ranks the attention level of every figure region extracted from the image using the degree of attraction, which is a physical feature of the figure region. However, because attention regions are extracted on the basis of physical features matched to human subjectivity, the evaluation of the attention level of an extracted region may differ depending on the subjectivity of the viewer. In addition, a viewer looking at an image tends to show the following visual characteristic: the viewer first gazes at the part of the image that attracts attention (the gazing point), then views the whole image by moving the gazing point over time, and gazes repeatedly at highly visible regions. It is therefore difficult to judge the attention level of all figure regions simultaneously, as the system described in Patent Document 1 does. Consequently, if the system described in Patent Document 1 is used to detect an object to be visually recognized (a visual recognition target object) from an image including objects, treating the attention regions as visual recognition target objects, individual differences due to viewer subjectivity may arise in the extraction of the target objects and in the ranking of their attention levels, and an objective detection result may not be obtained.

Accordingly, an object of the present invention is to provide an image object detection apparatus that can obtain an objective detection result when detecting a visual recognition target object from an image including objects.

To achieve the above object, the image object detection apparatus of the present invention detects a visual recognition target object from an image including a plurality of objects, and comprises gazing point detection means, gazing point visual field region setting means, object detection means, and storage means. The gazing point detection means inputs the image to a pulse neural network and thereby detects, as time series data, the gazing point of a viewer as it changes over time within the image. The gazing point visual field region setting means sets a gazing point visual field region in the image with reference to the gazing point detected by the gazing point detection means. The object detection means detects, as a visual recognition target object, an object among the plurality of objects that is included in the gazing point visual field region set by the gazing point visual field region setting means. The storage means accumulates and stores predetermined detection information about the detected visual recognition target object each time the object detection means detects one.

In the above configuration, when the gazing point detection means inputs the image to the pulse neural network, neurons corresponding to a particular part of the image respond strongly and output pulses. The position of the strongly responding neurons then changes as time passes, and the pulse neural network outputs the neuron responses as time series data. By regarding the part of the image where the neurons respond strongly as the part to which a viewer of the image directs visual attention (the gazing point), the time series data detected by the gazing point detection means becomes data representing how the viewer's point of attention changes over time. The gazing point visual field region setting means sets a gazing point visual field region centered on the detected gazing point, for example a region in which the image luminance attenuates with distance from the gazing point, and the object detection means detects an object included in the set region as a visual recognition target object. As the detected gazing point changes over time, the gazing point visual field region changes with it, and the object detection means detects the visual recognition target objects included in the region over time. Each time a visual recognition target object is detected, the storage means accumulates and stores predetermined detection information about it, for example its position and shape in the image.

As a result, visual recognition target objects are detected in accordance with the viewer's visual characteristics: among the objects in the image, an object in the gazing point visual field region to which the viewer's attention is first directed is detected as a visual recognition target object, and then target objects in the gazing point visual field regions that follow the change of the gazing point over time are detected. This suppresses the influence of viewer subjectivity on the detection and makes it possible to obtain an objective detection result. Because the result is objective, the need to verify it with test subjects is reduced, so that, for example, the development period for technologies involving object detection from images can be shortened significantly and development costs can be reduced.

The image object detection apparatus may further comprise object counting means. The object counting means counts, for each of the plurality of objects, the number of times the object was detected as a visual recognition target object, based on the predetermined detection information stored in the storage means.

In the above configuration, the object counting means counts, for each of the plurality of objects in the image, the number of times it was detected as a visual recognition target object. Because target objects are detected within the gazing point visual field region as the gazing point changes over time, if detection continues from the time the gazing point detection means starts outputting gazing points and the detections are counted, the counts may differ from object to object. This follows from the viewer's visual characteristic of gazing repeatedly at objects that are easy to see: the higher an object's visibility (the degree to which it is easy to see), the greater its detection count. Using the counted detections therefore allows the visibility of the detected objects to be evaluated objectively relative to one another. The results can be used, for example, to study image designs that take the viewer's attention into account, such as evaluating whether an object placed in an image will be properly seen by the viewer.

The image object detection apparatus may further comprise contour extraction means, surrounding coupling region setting means, and object direction determination means. The contour extraction means extracts contours from the image. The surrounding coupling region setting means sets, in the image and according to the contours extracted by the contour extraction means, surrounding coupling regions used to determine which side of a contour is the object. The object direction determination means extracts, from the contours extracted by the contour extraction means, the gazing point visual field contour included in the gazing point visual field region set by the gazing point visual field region setting means, and uses the surrounding coupling regions to determine which side of the extracted contour is the object. The object detection means detects, as the visual recognition target object, a region that is enclosed by the gazing point visual field contour and for which the object direction determination means has determined that the inner side of the contour is the object.

In the above configuration, the object direction determination means uses the surrounding coupling regions, which the surrounding coupling region setting means sets in the image according to the extracted contours, to determine which side of the gazing point visual field contour is the object, and the object detection means detects, as the visual recognition target object, an object that lies in the region enclosed by the gazing point visual field contour, on the inner side of that contour. By using the gazing point visual field contour and the surrounding coupling regions in this way, a visual recognition target object can be detected from an image simply by determining the direction in which the object lies relative to the contour. Compared with, for example, extracting attention regions using physical features matched to human subjectivity, this approach is less susceptible to viewer subjectivity, and the visual recognition target object can be detected more objectively.

According to the image object detection apparatus of the present invention, an objective detection result can be obtained when a visual recognition target object is detected from an image including objects.

FIG. 1 is a block diagram of the object detection apparatus.
FIG. 2 is a flowchart explaining the object detection process.
FIG. 3 is a schematic diagram of an image containing a plurality of objects.
FIG. 4 is a schematic diagram of a gazing point visual field region set in the image.
FIG. 5 is a schematic diagram of a gazing point visual field contour included in the gazing point visual field region.
FIG. 6 is a schematic diagram of an example of detecting a visual recognition target object.
FIG. 7 is a schematic diagram showing how the gazing point changes over time.
FIG. 8 is a schematic diagram of an image in which the arrangement of the objects in the image of FIG. 3 has been changed.
FIG. 9 is a schematic diagram of an example of detecting visual recognition target objects in the image of FIG. 8.
FIG. 10 is a schematic diagram of the structure of the pulse neural network.
FIG. 11 is an explanatory diagram of the surrounding coupling regions.

An embodiment of the present invention is described below with reference to the drawings. As shown in FIG. 1, the object detection apparatus 1 includes an imaging unit 2, an ECU 3, and a display unit 4.

The imaging unit 2 captures an image with a CCD camera or the like and outputs the image to the ECU 3.

The ECU 3 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory). The CPU reads the object detection processing program stored in the ROM, executes the object detection process, and functions as a gazing point detection unit 7, a gazing point visual field region setting unit 8, a contour extraction unit 6, a surrounding coupling region setting unit 9, an object direction determination unit 10, an object detection unit 11, and an object totaling unit 13. The RAM functions as an image storage unit 5 that stores the image output by the imaging unit 2 and as a detection information storage unit (storage means) 12 that stores predetermined detection information about the visual recognition target objects described later. The RAM also serves as a setting area for the various coefficients described later and as temporary storage for CPU calculation results.

The gazing point detection unit 7 inputs the image to a pulse neural network (hereinafter, PCNN) to detect the gazing point, and outputs the change of the gazing point over time as time series data. That is, the gazing point detection unit 7 functions as the gazing point detection means. The PCNN is a known neural network that models the firing of a group of neurons in response to a visual stimulus; its structure is shown schematically in FIG. 10. When the input image is an r × r two-dimensional image, r × r neurons corresponding to the input are arranged two-dimensionally in the PCNN. Each neuron consists of a Feeding part and a Linking part. The Feeding part receives both the internal stimulus Yij and the external stimulus Sij as inputs, whereas the Linking part receives only the internal stimulus. With Fij denoting the response of the Feeding part to its inputs and Lij the response of the Linking part, Fij and Lij are given by equations (1) and (2), respectively.

Each neuron retains its state from the previous time step, and that state decays over time through the decay term e^−α. M and W are the synaptic weight matrices that connect the neurons to one another in the Feeding part and the Linking part, respectively. The neuron output Yij is determined by comparing the neuron's internal state Uij (equation (3)) with the dynamic threshold Θij (equation (4)), and is given by equation (5).

In this embodiment, the PCNN output Yij is produced as a continuous value using a sigmoid function. When a neuron fires, the threshold Θ first increases and then decreases exponentially, and the neuron fires again at some later time. If a neuron fires at time n, the time interval τij since its previous firing is given by equation (6).
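Equations (1) to (6) are not reproduced in this text. As a rough illustration only, the sketch below implements one time step of the commonly published textbook PCNN formulation (decaying Feeding and Linking responses, an internal state U = F(1 + βL), a dynamic threshold, and a sigmoid output as mentioned above). The parameter names (`alpha_F`, `V_F`, `beta`, `gain`, and so on), the 3 × 3 weight kernel, and all numeric values are assumptions for illustration and may differ from the patent's actual equations.

```python
import numpy as np

# Hypothetical 3x3 synaptic weights shared by M (Feeding) and W (Linking).
KERNEL = np.array([[0.5, 1.0, 0.5],
                   [1.0, 0.0, 1.0],
                   [0.5, 1.0, 0.5]])

# Illustrative parameter values only; none of them come from the source.
PARAMS = {"alpha_F": 0.1, "alpha_L": 1.0, "alpha_T": 0.5,
          "V_F": 0.5, "V_L": 0.2, "V_T": 20.0,
          "beta": 0.1, "gain": 10.0, "M": KERNEL, "W": KERNEL}

def neighbor_sum(y, kernel):
    """Weighted sum over each neuron's 3x3 neighbourhood (zero padding)."""
    rows, cols = y.shape
    padded = np.pad(y, 1)
    out = np.zeros((rows, cols))
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

def init_state(shape, theta0=0.5):
    """Feeding, Linking, threshold and output arrays at time zero."""
    return (np.zeros(shape), np.zeros(shape),
            np.full(shape, theta0), np.zeros(shape))

def pcnn_step(S, state, p):
    """Advance the network by one time step for the external stimulus S."""
    F, L, theta, Y = state
    # Feeding part: decayed previous state + internal stimulus + external stimulus.
    F = np.exp(-p["alpha_F"]) * F + p["V_F"] * neighbor_sum(Y, p["M"]) + S
    # Linking part: decayed previous state + internal stimulus only.
    L = np.exp(-p["alpha_L"]) * L + p["V_L"] * neighbor_sum(Y, p["W"])
    U = F * (1.0 + p["beta"] * L)                      # internal state
    Y_new = 1.0 / (1.0 + np.exp(-p["gain"] * (U - theta)))  # sigmoid output
    # Threshold decays exponentially and jumps where the neuron fired.
    theta = np.exp(-p["alpha_T"]) * theta + p["V_T"] * Y_new
    return (F, L, theta, Y_new), U
```

Under these assumptions, the gazing point at a given step could be read off as the position of the largest output, for example `np.unravel_index(np.argmax(Y), Y.shape)`. Because the threshold rises where a neuron has just fired, the position of the strongest response moves to other parts of the image on later steps.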

Thus, when an image is input to the pulse neural network of the gazing point detection unit 7, neurons corresponding to a particular part of the image respond strongly (fire) and output pulses. The position of the strongly responding neurons then changes as time passes, and the pulse neural network outputs the neuron responses as time series data. By regarding the part of the image where the neuron response is at its maximum as the part to which a viewer of the image directs visual attention (the gazing point), the time series data detected by the gazing point detection unit 7 becomes data representing how the viewer's point of attention changes over time.

The gazing point visual field region setting unit 8 generates a Gaussian filter of size b × b centered on the gazing point (i, j) output from the gazing point detection unit 7 and convolves the Gaussian filter with the PCNN output values, thereby setting in the image a gazing point visual field region, centered on the gazing point, in which the image luminance attenuates with distance from the gazing point. The gazing point visual field region is given by equation (7).
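Equation (7) is not reproduced in this text. The sketch below is therefore only a simplified stand-in: instead of the convolution described above, it multiplies a map of values by a b × b Gaussian window centered on the gazing point, which yields the same qualitative effect of attenuation with distance from the gazing point. The function name `gaze_field` and the `sigma` parameter are hypothetical.

```python
import numpy as np

def gaze_field(values, gaze, b, sigma):
    """Weight `values` by a b x b Gaussian window centred on gaze = (i, j).

    Positions outside the b x b window are set to zero, so only the
    neighbourhood of the gazing point remains, fading with distance.
    """
    rows, cols = values.shape
    ii, jj = np.mgrid[0:rows, 0:cols]
    di, dj = ii - gaze[0], jj - gaze[1]
    weight = np.exp(-(di ** 2 + dj ** 2) / (2.0 * sigma ** 2))
    half = b // 2
    weight[(np.abs(di) > half) | (np.abs(dj) > half)] = 0.0
    return values * weight
```

For a uniform input such as `np.ones((9, 9))` with `gaze=(4, 4)`, `b=5`, and `sigma=1.5`, the result peaks at the gazing point, decreases toward the edge of the window, and is zero beyond it.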

That is, the gazing point visual field region setting unit 8 functions as the gazing point visual field region setting means, which sets a gazing point visual field region in the image with reference to the gazing point detected by the gazing point detection unit 7.

The contour extraction unit 6 detects contrast information of the image by convolving the image output from the imaging unit 2 with a Gabor filter, and extracts contours from the contrast information. That is, the contour extraction unit 6 functions as the contour extraction means.
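The source does not give the Gabor filter parameters or say how contours are derived from the contrast information. As a minimal sketch under assumed values, the code below builds odd-phase Gabor kernels at a few orientations, convolves them with the image, and takes the largest absolute response at each pixel as a contrast map. The kernel size, wavelength, `sigma`, `gamma`, and the number of orientations are all illustrative assumptions.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Odd-phase (sine) Gabor kernel; `size` should be odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    return envelope * np.sin(2.0 * np.pi * xr / wavelength)

def convolve2d(image, kernel):
    """Same-size 2-D convolution with zero padding (plain NumPy)."""
    kh, kw = kernel.shape
    rows, cols = image.shape
    flipped = kernel[::-1, ::-1]  # flipping turns correlation into convolution
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((rows, cols))
    for di in range(kh):
        for dj in range(kw):
            out += flipped[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

def contrast_map(image, n_orientations=4, size=7, wavelength=4.0, sigma=2.0):
    """Largest absolute Gabor response over the sampled orientations."""
    responses = [
        np.abs(convolve2d(image, gabor_kernel(
            size, wavelength, k * np.pi / n_orientations, sigma)))
        for k in range(n_orientations)
    ]
    return np.max(responses, axis=0)
```

Contours could then be taken, for instance, as the pixels whose contrast exceeds a chosen threshold; that thresholding step is also an assumption, not something stated in the source. Note that zero padding produces spurious responses along the image border for any image that is not dark at its edges.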

The surrounding coupling region setting unit 9 sets, in the image and according to the contours extracted by the contour extraction unit 6, the surrounding coupling regions used to determine which side of a contour is the object; it functions as the surrounding coupling region setting means. Because a contour belongs to an object, an object can be detected from the image if it can be determined which side of the contour is the object. Based on the contrast information detected by the contour extraction unit 6, the surrounding coupling region setting unit 9 sets the surrounding coupling regions in the image according to the contours extracted from that contrast information. The surrounding coupling regions comprise an excitatory coupling region, which indicates affirmatively the direction in which the object lies, and an inhibitory coupling region, which indicates negatively the direction in which the object lies; they are coupled to the contour through an excitatory coupling coefficient and an inhibitory coupling coefficient.

The object direction determination unit 10 extracts, from the contours extracted by the contour extraction unit 6, the gazing point visual field contour included in the gazing point visual field region that the gazing point visual field region setting unit 8 set in the image, and uses the surrounding coupling regions set by the surrounding coupling region setting unit 9 to determine which side of the extracted contour is the object; it functions as the object direction determination means. For example, as shown in FIG. 11, when an excitatory coupling region is set to the left of a point S on the contour and an inhibitory coupling region is set to its right, the left side (the direction of the arrow in the figure) is determined to be the direction of the object at point S. The object direction determination unit 10 outputs the determination result to the object detection unit 11.

Among the objects included in the image, the object detection unit 11 detects, as the visual recognition target object, a region that is enclosed by the gazing point visual field contour and for which the object direction determination unit 10 has determined that the inner side of the contour is the object, and outputs the detection result to the object totaling unit 13. That is, the object detection unit 11 functions as the object detection means, which detects a visual recognition target object from the plurality of objects included in the image.

Based on the predetermined detection information stored in the detection information storage unit 12, the object totaling unit 13 counts, for each of the plurality of objects included in the image, the number of times the object was detected as a visual recognition target object, and outputs the totals to the display unit 4. That is, the object totaling unit 13 functions as the object counting means.
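The per-object tally can be sketched in a few lines. The record layout used here (a dictionary with an `"object"` key and a `"gaze"` key) is a hypothetical stand-in for the "predetermined detection information"; the source only says that it includes items such as position and shape.

```python
from collections import Counter

def count_detections(detection_log, object_ids):
    """Return how often each object was detected as a target object.

    detection_log: list of records, one per detection (hypothetical format).
    object_ids:    all objects in the image, so undetected ones show up as 0.
    """
    counts = Counter(record["object"] for record in detection_log)
    return {obj: counts.get(obj, 0) for obj in object_ids}
```

For example, a log containing two detections of `"a"` and one each of `"b"` and `"c"` yields `{"a": 2, "b": 1, "c": 1, "d": 0}` for the object list `["a", "b", "c", "d"]`. Listing every object explicitly keeps objects that were never detected in the result, which matters when comparing visibility between objects.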

The display unit 4 displays the totals of the visual recognition target objects output from the object totaling unit 13 on a liquid crystal display or the like.

Next, the operation of the object detection apparatus 1 according to this embodiment is described with reference to the flowchart in FIG. 2. The ECU 3 first resets counters n and t to their initial value of zero (step S1). Counter n counts the number of firings (outputs) of the PCNN neurons, and counter t counts the time elapsed since the PCNN started operating. The ECU 3 then acquires an image from the imaging unit 2 and stores it in the image storage unit 5 (step S2). This embodiment is described using, as an example, an image that schematically contains four kinds of objects a to d, as shown in FIG. 3. The ECU 3 then extracts contours from the acquired image by Gabor filtering (step S3) and proceeds to step S4. In step S4, the image is input to the PCNN, the PCNN starts operating, and the process proceeds to step S5.

In step S5, it is determined whether the PCNN has produced an output. If it has, the process proceeds to step S6, where 1 is added to counter n, and then to step S7. In step S7, a gazing point (P1 in FIG. 3) is detected from the PCNN output, and a gazing point visual field region Q1 (the region shown by the chain-line circle in FIG. 4), centered on the detected gazing point P1 and in which the image luminance attenuates with distance from the gazing point, is set. Setting the gazing point visual field region Q1 limits the search for visual recognition target objects for gazing point P1 to the objects included in Q1, as shown in FIG. 5. Next, the ECU 3 sets the surrounding coupling regions and proceeds to step S9. In step S9, the surrounding coupling regions are used to determine which side of the gazing point visual field contour R1 included in the gazing point visual field region Q1 is the object. The ECU 3 then detects, as the visual recognition target object, the region that is enclosed by the gazing point visual field contour R1 and for which the inner side of R1 has been determined to be the object, as shown in FIG. 5 (step S10). That is, object a inside the gazing point visual field contour R1 of gazing point P1 in FIG. 3 is detected as the visual recognition target object (FIG. 6).

Next, the ECU 3 accumulates and stores detection information about the visual recognition target object in the detection information storage unit 12 for each object included in the image, for example the position of the gazing point at which the target object was detected and the shape of the target object (step S11), and proceeds to step S12. In step S12, it is determined whether the value of counter n exceeds a predetermined count nmax. If the value of counter n is nmax or less, the process returns to step S5 to determine whether the PCNN has produced its next output; if it has, steps S6 to S11 are repeated, and the detection information for the newly detected visual recognition target object is accumulated and stored in the detection information storage unit 12.

If it is determined in step S12 that the value of counter n has exceeded nmax, the predetermined number of detections of visual recognition target objects has been completed, and the process proceeds to step S13. If it is determined in step S5 that the PCNN has produced no output, the process proceeds to step S14, where 1 is added to counter t, and then to step S15. In step S15, it is determined whether the value of counter t exceeds a predetermined time tmax. If the value of counter t is tmax or less, the process returns to step S5 to check for an output from the PCNN. If it is determined in step S15 that the value of counter t has exceeded tmax, the object detection time has exceeded the predetermined time, and the process proceeds to step S13. In other words, detection of visual recognition target objects ends, and the process proceeds to step S13, at whichever comes first: the detection count exceeding the predetermined count nmax, or the detection time exceeding the predetermined time tmax.
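The termination logic of steps S5, S6, S12, S14, and S15 can be sketched as a polling loop. In this sketch, `pcnn_outputs` is a hypothetical iterable that yields a gazing point when the PCNN has produced an output and `None` otherwise, and `detect` is a hypothetical callback standing in for steps S7 to S11; neither name comes from the source.

```python
def run_detection(pcnn_outputs, detect, n_max, t_max):
    """Poll the PCNN until n exceeds n_max or t exceeds t_max.

    Returns the accumulated detection log and the final counter values.
    """
    n = 0          # number of PCNN outputs handled (counter n, step S6)
    t = 0          # number of polls without output (counter t, step S14)
    log = []       # stands in for the detection information storage unit
    for output in pcnn_outputs:      # step S5: was there an output?
        if output is None:
            t += 1                   # step S14
            if t > t_max:            # step S15: time limit exceeded
                break
            continue
        n += 1                       # step S6
        log.append(detect(output))   # steps S7 to S11
        if n > n_max:                # step S12: count limit exceeded
            break
    return log, n, t
```

Because both tests use "exceeds" rather than "reaches", the loop as described stops after nmax + 1 detections or tmax + 1 polls without output, whichever occurs first. The loop also ends if the iterable is exhausted, which is a convenience of this sketch rather than part of the flowchart.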

In step S13, the number of times each object included in the image was detected as a visual recognition target object is tallied from the detection information stored in the detection information storage unit 12. Further, the acquired image stored in the image storage unit 5 is displayed on the display unit 4, and the objects are displayed with emphasis, for example by giving higher luminance to an object the more often it was detected as a visual recognition target object.
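A minimal sketch of the tallying and emphasis in step S13 is given below. The text only says that objects detected more often are given higher luminance; the linear mapping, the gain range of 1.0 to 2.0, and the function names `tally_detections` and `luminance_gain` are assumptions made for illustration.

```python
from typing import Dict, Iterable


def tally_detections(all_objects: Iterable[str], detection_log: Iterable[str]) -> Dict[str, int]:
    """Count detections per object, keeping objects that were never detected at 0."""
    counts = {name: 0 for name in all_objects}
    for name in detection_log:
        counts[name] = counts.get(name, 0) + 1
    return counts


def luminance_gain(counts: Dict[str, int], min_gain: float = 1.0, max_gain: float = 2.0) -> Dict[str, float]:
    """Map each count linearly onto [min_gain, max_gain] (assumed mapping)."""
    peak = max(counts.values(), default=0)
    if peak == 0:
        # Nothing was detected, so no object is emphasized.
        return {name: min_gain for name in counts}
    return {
        name: min_gain + (max_gain - min_gain) * count / peak
        for name, count in counts.items()
    }


counts = tally_detections(["a", "b", "c", "d"], ["a", "b", "a", "c", "a"])
gains = luminance_gain(counts)
```

Here the most frequently detected object receives the maximum gain and an object that was never detected keeps its original luminance.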

In this embodiment, an image is input to the PCNN, and the output of the PCNN is regarded as the part of the image to which a viewer directs visual attention (the gazing point), whereby the transition over time of the viewer's point of visual attention is detected. As shown in FIG. 3, when the PCNN outputs the first gazing point P1 for an image including, for example, four objects a, b, c, and d, a gazing point visual field region Q1 centered on P1 is set (FIG. 4). For the gazing point visual field contour R1 included in the gazing point visual field region Q1, it is determined which side is an object (FIG. 5), and the object a, which is the region surrounded by the gazing point visual field contour R1 and lying on its inner side, is detected as the visual recognition target object. When the PCNN outputs the next gazing point P2 after outputting the gazing point P1, the object b within the gazing point visual field region centered on the gazing point P2 is detected as the visual recognition target object, in the same way as for the gazing point P1. When the PCNN outputs a further gazing point P3 after the gazing point P2, the object c is likewise detected as the visual recognition target object. In the example shown in FIG. 3, the object d is smaller than the other objects and, because of this physical feature, is unlikely to be gazed at and detected as a visual recognition target object. However, as in the example shown in FIG. 8, when the object d is included in the gazing point visual field region Q1 of the gazing point P1, the object d may be detected as a visual recognition target object together with the object a (FIG. 9).
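As a rough illustration of how a gazing point visual field region restricts the candidate objects, the following Python sketch assumes a circular region of fixed radius and represents each object by a single centroid. The Gaussian attenuation, the function names, the coordinates, and the radius are all assumptions for illustration; the embodiment itself decides objects from contours, not centroids.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def attenuation(pixel: Point, gaze: Point, sigma: float) -> float:
    """Weight that is 1.0 at the gazing point and decays with distance (assumed Gaussian)."""
    d2 = (pixel[0] - gaze[0]) ** 2 + (pixel[1] - gaze[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def objects_in_field(objects: Dict[str, Point], gaze: Point, radius: float) -> List[str]:
    """Names of objects whose centroid lies inside the circular field around the gazing point."""
    inside = []
    for name, (cx, cy) in objects.items():
        if math.hypot(cx - gaze[0], cy - gaze[1]) <= radius:
            inside.append(name)
    return sorted(inside)


# Hypothetical layout: the small object d sits close to object a.
layout = {"a": (10.0, 10.0), "b": (50.0, 10.0), "c": (30.0, 40.0), "d": (14.0, 12.0)}
near_p1 = objects_in_field(layout, gaze=(10.0, 10.0), radius=8.0)
```

With this hypothetical layout, a gazing point on object a also captures the nearby small object d, which mirrors the situation described for FIG. 8 and FIG. 9.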

As a result, among the objects a to d included in the image, the object a, which lies in the gazing point visual field region to which the viewer's attention is first directed, is detected as the visual recognition target object, and the visual recognition target objects b and c are then detected in the gazing point visual field regions that shift as the gazing point changes over time (chain-line arrows in FIG. 7). An objective detection result based on the viewer's visual characteristics can thus be obtained.

Further, by using the gazing point visual field contour R1 included in the gazing point visual field region Q1 set with reference to the gazing point P1, together with the surrounding coupling region, the visual recognition target object a can be detected from the image including the objects a to d simply by determining on which side of the gazing point visual field contour R1 the object lies. Compared with, for example, extracting a region of interest using physical features chosen according to human subjectivity, this approach is less affected by the viewer's subjectivity, and the visual recognition target object can be detected more objectively.

The PCNN continues to respond to the input image over time. In the example image shown in FIG. 3, after outputting the gazing points P1, P2, and P3, the PCNN may output the gazing point P1 again, as shown in FIG. 7. In this case, the object a is detected again as a visual recognition target object. Because visual recognition target objects are detected within the gazing point visual field region as the gazing point changes over time, if detection is continued from the time the gazing point detection means starts outputting gazing points until the predetermined time tmax or the predetermined number nmax is reached and the detections are tallied, the number of detections may differ from one visual recognition target object to another. This result reflects the visual characteristic that a viewer repeatedly gazes at objects that are easy to see, so an object with a larger number of detections can be judged to have higher visibility.

For example, when evaluating the visibility of the four objects a, b, c, and d included in the image shown in FIG. 3, if the numbers of detections as a visual recognition target object are 15, 5, 9, and 1, respectively, it can be objectively evaluated that the object a has the highest visibility and that visibility decreases in the order a, c, b, d. Because the visibility of objects can be evaluated objectively, the same evaluation result is obtained whenever the same image is input. Therefore, by comparing the determination result of the object detection device 1 with, for example, experimental visibility results obtained from viewers actually viewing the image, and adjusting the object detection processing program accordingly, the visibility evaluation output by the object detection device 1 can be brought closer to the viewers' evaluation. This makes it possible, for example, to objectively evaluate the relative visibility of objects included in an image, which can be used in studying image designs that take viewer visibility into account.
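The ordering in this example can be reproduced with a short Python sketch. The detection log below is synthetic, built only to match the counts 15, 5, 9, and 1 given in the text, and `rank_by_detections` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_by_detections(detected: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (object, count) pairs ordered from most to least frequently detected."""
    return Counter(detected).most_common()


# Synthetic log matching the counts in the example: a=15, b=5, c=9, d=1.
log = ["a"] * 15 + ["b"] * 5 + ["c"] * 9 + ["d"] * 1
ranking = rank_by_detections(log)
order = [name for name, _ in ranking]
```

For these counts `order` is `['a', 'c', 'b', 'd']`, matching the visibility order stated above.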

In addition, evaluating the visibility of objects included in an image has conventionally required many test subjects and much experiment time before an objective evaluation could be obtained. By using the object detection device 1, objective and highly reproducible evaluation results can be obtained in a short time without test subjects. Using the object detection device 1 in technical development related to object detection from images can therefore greatly shorten the development period and reduce development costs.

The display of the detection results is not limited to emphasizing objects on the image according to the number of times they were detected as visual recognition target objects. For example, the results may be displayed as a histogram in which the horizontal axis represents the type of visual recognition target object and the vertical axis represents the number of detections for each object.

The detection of the visual recognition target object is also not limited to the method of this embodiment, which determines on which side of a contour in the gazing point visual field region the object lies. For example, the visual recognition target object may be detected based on physical features such as the shape and size of the region surrounded by the contour in the gazing point visual field region.

Contour extraction from the image is not limited to the Gabor filter used in this embodiment. Any filter that can extract the light-dark contrast of the image may be used, for example a Laplacian filter.
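As a sketch of the Laplacian alternative mentioned here, the following pure-Python example applies the common 4-neighbour Laplacian kernel to a grayscale image stored as a list of rows. The kernel choice, the zeroed border, and the function name `laplacian_response` are assumptions for illustration and are not specified in the text.

```python
from typing import List

Image = List[List[float]]

# Common 4-neighbour discrete Laplacian kernel (one of several possible variants).
LAPLACIAN_4 = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def laplacian_response(image: Image) -> Image:
    """Absolute Laplacian response per pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    acc += LAPLACIAN_4[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = abs(acc)
    return out


# A vertical light-dark step: dark on the left, bright on the right.
step = [[0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(5)]
edges = laplacian_response(step)
```

Uniform regions produce a zero response, while pixels on either side of the light-dark step produce a non-zero response, which is the contrast information a contour extraction stage would use.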

Further, viewers may be grouped by characteristics such as age and gender, and the object detection processing program may be adjusted by comparing, for each group, the experimental visibility results obtained from the viewers in that group viewing the image with the determination result of the object detection device 1. In this way, the object detection device 1 may be given a function of determining the visibility of objects according to viewer characteristics.

An embodiment to which the invention made by the present inventor is applied has been described above, but the present invention is not limited by the description and drawings of this embodiment, which form part of the disclosure of the present invention. That is, other embodiments, examples, operational techniques, and the like made by those skilled in the art on the basis of this embodiment are of course all included in the scope of the present invention.

The present invention is widely applicable as a device for detecting a visual recognition target object from an image including objects.

1 Object detection device
3 ECU
6 Contour extraction unit (contour extraction means)
7 Gazing point detection unit (gazing point detection means)
8 Gazing point visual field region setting unit (gazing point visual field region setting means)
9 Surrounding coupling region setting unit (surrounding coupling region setting means)
10 Object direction determination unit (object direction determination means)
11 Object detection unit (object detection means)
12 Detection information storage unit (storage means)
13 Object counting unit (object counting means)

Claims

An image object detection device that detects a visual recognition target object from an image including a plurality of objects, comprising:
gazing point detection means for detecting, as time-series data, a gazing point of a viewer that changes over time within the image, by inputting the image to a pulse neural network;
gazing point visual field region setting means for setting a gazing point visual field region in the image with reference to the gazing point detected by the gazing point detection means;
object detection means for detecting, as a visual recognition target object, an object among the plurality of objects that is included in the gazing point visual field region set by the gazing point visual field region setting means; and
storage means for accumulating and storing predetermined detection information relating to the detected visual recognition target object each time the object detection means detects the visual recognition target object.

The image object detection device according to claim 1, further comprising:
object counting means for tallying, for each of the plurality of objects, the number of times the object was detected as the visual recognition target object, based on the predetermined detection information stored in the storage means.

The image object detection device according to claim 1 or claim 2, further comprising:
contour extraction means for extracting contours from the image;
surrounding coupling region setting means for setting in the image, according to the contours extracted by the contour extraction means, a surrounding coupling region used to determine which side of a contour is an object; and
object direction determination means for extracting, from the contours extracted by the contour extraction means, a gazing point visual field contour included in the gazing point visual field region set by the gazing point visual field region setting means, and for determining which side of the extracted gazing point visual field contour is an object by using the surrounding coupling region set by the surrounding coupling region setting means,
wherein the object detection means detects, as the visual recognition target object, a region that is surrounded by the gazing point visual field contour and for which the object direction determination means has determined that the inner side of the gazing point visual field contour is an object.