JP2009104244A

JP2009104244A - Information processor and method, and program

Info

Publication number: JP2009104244A
Application number: JP2007273044A
Authority: JP
Inventors: Akira Nakamura; 章中村; Yoshiaki Iwai; 嘉昭岩井; Takayuki Ashigahara; 隆之芦ヶ原
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2007-10-19
Filing date: 2007-10-19
Publication date: 2009-05-14
Anticipated expiration: 2027-10-19
Also published as: CN101414352B; JP4983539B2; CN101414352A

Abstract

PROBLEM TO BE SOLVED: To recognize an object in an image more reliably. SOLUTION: For each feature point extracted from a model image 21-1 in step S102 of FIG. 6, correlation images 42-11 to 42-NP are generated in step S105 between the feature quantity of that point, described in step S103 of FIG. 6, and both the model image 21-1 from which the point was extracted and one or more other model images 21-2 to 21-N. Based on the correlation images, a discrimination capability value indicating the degree to which the point contributes to identifying the subject of the model image 21-1 is calculated in step S106. The information processor can be applied, for example, to an object recognition device that recognizes an object in an image. COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to an information processing apparatus, an information processing method, and a program, and more particularly to an information processing apparatus, method, and program capable of recognizing an object in an image more reliably.

Conventionally, there is a texture-based general object recognition method that uses local feature quantities (see Patent Document 1). Such a method is robust to changes in illumination and enables robust object recognition, but its discrimination capability drops when it is applied to an object with little texture.

In addition, the specification and other documents attached to Japanese Patent Application No. 2006-100705, already filed by the original applicant of the present application, disclose a technique that makes local feature matching applicable to objects without texture by using edge information and support points. In this technique, feature points are extracted from a model image and a query image, the local feature quantities around them are described, the feature quantities are matched against each other, outliers (mismatches) are removed using a Hough transform, RANSAC, or the like, and the object in the model image and the object in the query image are identified by the number of matching pairs that remain.
Patent Document 1: JP 2004-326693 A

However, these conventional methods have the following three problems. Consequently, there is demand for a method that can recognize an object in an image more reliably than these conventional methods.

The first problem is that the discrimination capability drops significantly when feature points do not reappear at corresponding positions in the model image and the query image. When edges are used, this means that the reproducibility of edges between the model image and the query image strongly affects the discrimination capability.

The second problem is that the model is ultimately identified by the number of inlier match pairs (the pairs remaining after mismatched pairs are removed). As a result, regardless of how similar the object in the model image is to the object in the query image, objects with complex textures or contours, which yield many feature points, tend to produce many match pairs, whereas objects with simple textures or shapes tend to produce few.

The third problem is that, when support points are placed around a base point and used to improve matching accuracy, the criterion for selecting the support points does not take into account the differences among a plurality of model images.

The present invention has been made in view of such circumstances, and makes it possible to recognize an object in an image more reliably.

An information processing apparatus according to one aspect of the present invention is an information processing apparatus that compares a query image with a model image and provides support information for identifying the subject of the model image with the subject of the query image. The apparatus includes: feature point extraction means for extracting one or more feature points from the model image; feature quantity description means for describing the feature quantity of each of the one or more feature points extracted by the feature point extraction means; and discrimination capability value calculation means for generating, for each of the one or more feature points extracted by the feature point extraction means, correlation images between the feature quantity of that feature point described by the feature quantity description means and each of the model image from which the feature point was extracted and one or more other model images, and for calculating, based on those correlation images, a discrimination capability value indicating the degree of contribution to identifying the subject of the model image.

The apparatus further includes support point selection means for taking at least one of the one or more feature points extracted by the feature point extraction means as a base point and selecting, as a support point, a feature point that lies within a certain range of the base point and whose discrimination capability value calculated by the discrimination capability value calculation means is higher than that of the base point.

The discrimination capability value calculation means calculates the discrimination capability value based on at least one of the average value and the maximum value over all of the correlation images.

An information processing method and a program according to one aspect of the present invention are a method and a program corresponding to the information processing apparatus according to one aspect of the present invention described above.

In one aspect of the present invention, the following discrimination capability value is calculated and provided as support information for comparing a query image with a model image and identifying the subject of the model image with the subject of the query image. One or more feature points are extracted from the model image, and the feature quantity of each extracted feature point is described. Then, for each of the extracted feature points, correlation images are generated between the described feature quantity of that feature point and each of the model image from which the feature point was extracted and one or more other model images, and a discrimination capability value indicating the degree of contribution to identifying the subject of the model image is calculated based on those correlation images.

As described above, according to one aspect of the present invention, a discrimination capability value for recognizing an object in an image can be provided. In particular, by using this discrimination capability value, an object in an image can be recognized more reliably.

Embodiments of the present invention are described below. The correspondence between the constituent elements of the present invention and the embodiments described in the specification or the drawings is exemplified as follows. This description is intended to confirm that embodiments supporting the present invention are described in the specification or the drawings. Therefore, even if an embodiment is described in the specification or the drawings but is not listed here as corresponding to a constituent element of the present invention, that does not mean the embodiment does not correspond to that constituent element. Conversely, even if an embodiment is listed here as corresponding to a constituent element, that does not mean the embodiment does not correspond to constituent elements other than that one.

An information processing apparatus according to one aspect of the present invention is
an information processing apparatus (for example, the object recognition apparatus in FIG. 1) that compares a query image (for example, the query image 22 in FIG. 1) with a model image (for example, the model images 21-1 to 21-N in FIG. 1) and provides support information for identifying the subject of the model image with the subject of the query image, the apparatus including:
feature point extraction means (for example, the feature point extraction unit 31 in FIG. 2) for extracting one or more feature points from the model image;
feature quantity description means (for example, the feature quantity description unit 32 in FIG. 2) for describing the feature quantity of each of the one or more feature points extracted by the feature point extraction means; and
discrimination capability value calculation means (for example, the feature point discrimination capability value calculation unit 33 in FIG. 2) for generating, for each of the one or more feature points extracted by the feature point extraction means (for example, the feature points extracted in step S102 of FIG. 6 from the model image 21-1 of FIG. 6), correlation images between the feature quantity of that feature point described by the feature quantity description means (the feature quantity described in step S103 of FIG. 6) and each of the model image from which the feature point was extracted and one or more other model images (for example, generating the correlation images 42-11 to 42-NP of FIG. 6 in step S105), and for calculating, based on those correlation images, a discrimination capability value indicating the degree of contribution to identifying the subject of the model image (for example, calculating the discrimination capability value in step S106 of FIG. 6).

The apparatus further includes
support point selection means (for example, the support point selection unit 34 in FIG. 2) for taking at least one of the one or more feature points extracted by the feature point extraction means as a base point and selecting, as a support point, a feature point that lies within a certain range of the base point and whose discrimination capability value calculated by the discrimination capability value calculation means is higher than that of the base point.

An information processing method and a program according to one aspect of the present invention are a method and a program corresponding to the information processing apparatus according to one aspect of the present invention described above. Although details are described later, this program is recorded on a recording medium such as the removable medium 211 in FIG. 17 or a hard disk included in the storage unit 208, and is executed by a computer having the configuration shown in FIG. 17.

One aspect of the present invention also includes a recording medium on which the program according to one aspect of the present invention described above is recorded.

Hereinafter, embodiments of the present invention are described with reference to the drawings.

FIG. 1 is a block diagram showing the functional configuration of an object recognition apparatus according to an embodiment of the present invention.

In FIG. 1, the object recognition apparatus includes a model feature quantity extraction unit 11, a model feature quantity dictionary 12, and a query image recognition unit 13.

For object recognition, the model feature quantity extraction unit 11 extracts model feature quantities from each of the model images 21-1 to 21-N (N is an integer of 1 or more), each of which contains an object to be recognized, and registers them in the model feature quantity dictionary 12.

Each of the model images 21-1 to 21-N is either a still image or a frame image of a moving image.

The query image recognition unit 13 extracts query feature quantities from the query image 22, which contains the object to be recognized by comparison with the objects in the model images 21-1 to 21-N. It matches the query feature quantities against the model feature quantities registered in the model feature quantity dictionary 12 and, based on the matching result, attempts to identify the object in the query image 22 with each object in the model images 21-1 to 21-N.

Like the model images 21-1 to 21-N, the query image 22 is either a still image or a frame image of a moving image.

The model feature quantity extraction unit 11 and the query image recognition unit 13 are described in detail below, in that order.

In the following, when the model images 21-1 to 21-N need not be distinguished from one another, that is, when referring to any one of the model images 21-1 to 21-N, it is simply called the model image 21.

FIG. 2 is a block diagram showing the detailed functional configuration of the model feature quantity extraction unit 11.

The model feature quantity extraction unit 11 includes a feature point extraction unit 31, a feature quantity description unit 32, a feature point discrimination capability value calculation unit 33, a support point selection unit 34, and a model feature quantity information generation unit 35.

The feature point extraction unit 31 extracts feature points from the model image 21 and provides the extraction result to the feature quantity description unit 32 and the model feature quantity information generation unit 35.

The feature point extraction method used by the feature point extraction unit 31 is not particularly limited.

For example, FIG. 3 shows the result of feature point extraction when a method using a Harris corner detector or the like is employed. The white circles in FIG. 3 indicate feature points. With this method, as shown in FIG. 3, corner points are extracted as feature points.

As another example, FIG. 4 shows the result of feature point extraction when a method using a Canny edge detector or the like is employed. The white circles in FIG. 4 indicate feature points. With this method, as shown in FIG. 4, edge points are extracted as feature points.

The feature quantity description unit 32 describes a local feature quantity around each feature point extracted by the feature point extraction unit 31, and provides the results to the feature point discrimination capability value calculation unit 33 and the model feature quantity information generation unit 35.

The local feature quantity description method used by the feature quantity description unit 32 is not particularly limited.

For example, a method that describes the local feature quantity as a vector using the luminance gradients of pixel values can be employed.

Specifically, as shown in FIG. 5, when the luminance gradients in a 5x5 pixel range around a feature point are described as a vector, the x component and the y component of the luminance gradient of each pixel are each treated as one dimension, and a 50-dimensional vector (Vx(0,0), Vy(0,0), Vx(0,1), Vy(0,1), ..., Vx(4,4), Vy(4,4)) is formed.
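The 50-dimensional description above can be sketched as follows. This is a minimal illustration rather than the implementation of the feature quantity description unit 32: the function name `gradient_descriptor`, the central-difference gradient, and the clamping of coordinates at the image border are all assumptions that do not appear in the specification.

```python
def gradient_descriptor(image, cx, cy, size=5):
    """Return [Vx, Vy, Vx, Vy, ...] over a size x size patch centred on (cx, cy).

    `image` is a list of rows of luminance values (hypothetical input format).
    """
    h, w = len(image), len(image[0])

    def lum(x, y):
        # Assumption: clamp coordinates so patches near the border stay defined.
        x = min(max(x, 0), w - 1)
        y = min(max(y, 0), h - 1)
        return float(image[y][x])

    half = size // 2
    vec = []
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            x, y = cx + dx, cy + dy
            vx = (lum(x + 1, y) - lum(x - 1, y)) / 2.0  # central difference in x
            vy = (lum(x, y + 1) - lum(x, y - 1)) / 2.0  # central difference in y
            vec.extend([vx, vy])
    return vec


# A horizontal luminance ramp: the gradient is (1, 0) at every interior pixel.
ramp = [[x for x in range(9)] for _ in range(9)]
descriptor = gradient_descriptor(ramp, 4, 4)
```

With the default 5x5 patch the returned list has 25 x 2 = 50 entries, matching the dimensionality stated in the text.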

As another method, a description based on a histogram of luminance gradient vector directions can also be employed. For example, when the directions of the luminance gradient vectors around a feature point are accumulated into a histogram with 10-degree bins, a 36-dimensional vector is obtained.
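The histogram description can be sketched as below. The specification only fixes the 10-degree bin width; weighting each vote by the gradient magnitude and skipping zero gradients are assumptions made for this sketch, and `orientation_histogram` is a hypothetical name.

```python
import math


def orientation_histogram(gradients, bin_deg=10):
    """Accumulate (vx, vy) gradient vectors into direction bins of bin_deg degrees."""
    bins = [0.0] * (360 // bin_deg)
    for vx, vy in gradients:
        magnitude = math.hypot(vx, vy)
        if magnitude == 0.0:
            continue  # a zero gradient has no direction (assumption: skip it)
        angle = math.degrees(math.atan2(vy, vx)) % 360.0  # map to [0, 360)
        index = int(angle // bin_deg) % len(bins)
        bins[index] += magnitude  # assumption: magnitude-weighted vote
    return bins


# Four gradients pointing roughly right, down, left, and up in image coordinates.
sample = [(1.0, 0.1), (0.1, 1.0), (-1.0, 0.1), (0.1, -1.0)]
histogram = orientation_histogram(sample)
```

With 10-degree bins the result has 360 / 10 = 36 entries, which is the 36-dimensional vector mentioned in the text.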

As yet another example, the luminance information itself can be used as the feature quantity. For example, when the luminance values in a 5x5 pixel range around a feature point are described directly as a vector, a 25-dimensional vector is obtained.

Furthermore, the description methods above may be combined.

The feature point discrimination capability value calculation unit 33 calculates a discrimination capability value for each feature point extracted by the feature point extraction unit 31 (each feature point whose feature quantity has been described by the feature quantity description unit 32), and provides the results to the support point selection unit 34 and the model feature quantity information generation unit 35.

Here, the discrimination capability value is a value indicating the capability of a feature point to identify the subject (the model discrimination capability). That is, when the subject contained in the model image 21, namely the object to be recognized, is to be distinguished from other objects (such as objects contained in other model images), the value indicates how much the feature point contributes to, or influences, that identification.

FIG. 6 is a flowchart illustrating the series of processes up to the calculation of the discrimination capability values.

In the following, in line with FIG. 6, the processing is described for the case where discrimination capability values are calculated for the feature points extracted from the model image 21-1. In practice, however, the same processing is applied not only to the model image 21-1 but also to the feature points extracted from each of the other model images 21-2 to 21-N, and their discrimination capability values are calculated as well.

In step S100 of FIG. 6, the model feature quantity extraction unit 11 acquires all of the model images 21-1 to 21-N.

In step S102, the feature point extraction unit 31 extracts one or more feature points from the model image 21-1 as described above. In step S103, the feature quantity description unit 32 describes the feature quantity of each feature point extracted from the model image 21-1 as described above.

In parallel with steps S102 and S103, in step S104 the feature point discrimination capability value calculation unit 33 generates feature quantity images 41-1 to 41-N from the model images 21-1 to 21-N, respectively.

Here, the feature quantity image 41-K (K is an integer from 1 to N) is the result of describing a feature quantity at every pixel of the model image 21-K using the same local feature quantity description method as the feature quantity description unit 32. In other words, it is an image whose pixel values are the feature quantities.
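A feature quantity image can be sketched as a grid that stores one descriptor per pixel. In this illustration the descriptor is a raw 3x3 luminance patch chosen only to keep the example short; the names `luminance_patch` and `feature_image`, the patch size, and the border clamping are assumptions. In the described method the same description method as the feature quantity description unit 32 would be used.

```python
def luminance_patch(image, cx, cy, size=3):
    """Describe pixel (cx, cy) by the raw luminance values of its size x size patch."""
    h, w = len(image), len(image[0])
    half = size // 2
    return [
        # Assumption: clamp coordinates at the image border.
        float(image[min(max(cy + dy, 0), h - 1)][min(max(cx + dx, 0), w - 1)])
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
    ]


def feature_image(image, describe=luminance_patch):
    """Return a grid of the same size as `image` holding one descriptor per pixel."""
    return [
        [describe(image, x, y) for x in range(len(image[0]))]
        for y in range(len(image))
    ]


model_image = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
features = feature_image(model_image)
```

Passing a different `describe` function, such as a gradient-based descriptor, changes the per-pixel feature quantity without changing the structure of the feature quantity image.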

In step S105, the feature point discrimination capability value calculation unit 33 generates correlation images for each of P feature points of the model image 21-1 whose discrimination capability is to be calculated. These P feature points are chosen from the feature points extracted in step S102 and described in step S103, and P is an integer not exceeding the number of feature points extracted in step S102.

Here, the correlation image 42-KL (K has the same meaning as K in the feature quantity image 41-K described above; L is any value from 1 to P) is the following image. The P feature points whose discrimination capability is to be calculated are numbered 1 to P, and the feature point numbered L that is currently being processed is called the feature point of interest L. The feature quantity of the feature point of interest L is matched against each pixel value (that is, each feature quantity) of the feature quantity image 41-K, and a correlation (distance) value is obtained for each pixel. The image whose pixel values are these correlation values is the correlation image 42-KL. As the correlation value, for example, the normalized correlation between vectors can be used; as the distance, a measure such as the Euclidean distance can be used.

That is, for the feature point of interest L, N correlation images 42-1L, 42-2L, ..., 42-NL are generated, each indicating the correlation with every pixel of one of the N feature quantity images 41-1, 41-2, ..., 41-N.

In other words, for one feature quantity image 41-K, one correlation image is generated for each of the P feature points numbered 1 to P, giving P correlation images 42-K1, 42-K2, ..., 42-KP.
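The generation of one correlation image 42-KL can be sketched as follows. The text allows normalized correlation or a distance such as the Euclidean distance; this sketch uses the cosine form of normalized correlation (no mean subtraction), which is an assumption, as are the function names and the convention of returning 0 for zero-length vectors.

```python
import math


def normalized_correlation(a, b):
    """Cosine-style normalized correlation between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a zero vector as uncorrelated
    return dot / (norm_a * norm_b)


def correlation_image(feature, feature_img):
    """Correlate one feature vector with the descriptor stored at every pixel."""
    return [[normalized_correlation(feature, f) for f in row] for row in feature_img]


# A 2x2 feature quantity image with 2-dimensional descriptors (toy data).
toy_feature_image = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [-1.0, 0.0]],
]
corr = correlation_image([1.0, 0.0], toy_feature_image)
```

Calling `correlation_image` once per feature quantity image 41-1 to 41-N with the same feature vector yields the N correlation images for one feature point of interest.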

In step S106, the feature point discrimination capability value calculation unit 33 calculates, for each of the P feature points numbered 1 to P, a discrimination capability value from the average or the maximum value of all of its correlation images. That is, the unit assigns discrimination capability values such that the lower this average or maximum value, the higher the model discrimination capability. Here, all of the correlation images means all the correlation images generated for the feature point of interest L, namely the N correlation images 42-1L, 42-2L, ..., 42-NL.
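Step S106 can be sketched as below. The specification states only that a lower average or maximum correlation corresponds to a higher discrimination capability; turning that ordering into a rank-based integer value, and the names `correlation_score` and `discrimination_values`, are assumptions made for this illustration.

```python
def correlation_score(corr_images, mode="mean"):
    """Average (or maximum) over every pixel of all correlation images of one feature point."""
    values = [v for img in corr_images for row in img for v in row]
    return max(values) if mode == "max" else sum(values) / len(values)


def discrimination_values(per_feature_corr_images, mode="mean"):
    """Assign larger values to feature points whose correlation score is lower."""
    scores = [correlation_score(imgs, mode) for imgs in per_feature_corr_images]
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # lowest score first
    values = [0] * len(scores)
    for rank, index in enumerate(order):
        values[index] = len(scores) - rank  # assumption: rank-based value
    return values


# Three feature points, each with N = 2 tiny correlation images (toy data).
toy = [
    [[[1.0, 0.9]], [[0.8, 0.7]]],  # correlates strongly everywhere: weak discriminator
    [[[1.0, 0.1]], [[0.0, 0.1]]],  # correlates almost nowhere else: strong discriminator
    [[[1.0, 0.5]], [[0.5, 0.4]]],
]
values = discrimination_values(toy)
```

A feature point that resembles many locations across the model images receives a low value, while one that resembles little besides itself receives a high value.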

For example, FIGS. 7 and 8 show discrimination capability values rendered as images, in which feature points with higher discrimination capability values appear brighter (whiter). FIG. 7 shows an example of the discrimination capability values when an image containing a frog-shaped object (hereinafter simply called a frog) is used as the model image 21-1. As shown in FIG. 7, the area around the frog's eyes has high discrimination capability values; that is, it is an important part for identifying the object as a frog. FIG. 8 shows an example of the discrimination capability values when an image containing a dog-shaped object (hereinafter simply called a dog) is used as the model image 21-1. As shown in FIG. 8, the area around the dog's tail has high discrimination capability values; that is, it is an important part for identifying the object as a dog.

Although not shown, after step S106 of FIG. 6 the feature point discrimination capability value calculation unit 33 may, for example, renumber the P feature points in descending order of discrimination capability value. After this processing, the numbers of the P feature points indicate their order of importance for model identification.

Returning to FIG. 2, the support point selection unit 34 selects support points using the discrimination capability values calculated by the feature point discrimination capability value calculation unit 33.

Here, a support point is defined as follows. A point selected as a reference from among the feature points extracted by the feature point extraction unit 31 is hereinafter called a base point. A feature point other than the base point that is determined in dependence on the base point is called a support point.

The method of determining support points is not particularly limited. In this embodiment, for example, among the feature points of the model image 21 that lie within a certain range of the base point, those whose discrimination capability value is higher than that of the base point are selected as support points. With this method, a plurality of support points may be selected for one base point. FIG. 9 is a flowchart illustrating an example of the processing performed by the support point selection unit 34 according to this method (hereinafter called support point selection processing).
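The selection rule described above can be sketched as follows. The specification does not define the shape of the certain range; treating it as a circle of radius `radius` measured by Euclidean distance is an assumption, as are the function name `select_support_points` and the input format.

```python
import math


def select_support_points(points, values, base_index, radius):
    """Return indices of feature points near the base point with a higher value.

    points: list of (x, y) feature point positions (hypothetical format)
    values: discrimination capability value of each feature point
    """
    bx, by = points[base_index]
    base_value = values[base_index]
    support = []
    for i, (x, y) in enumerate(points):
        if i == base_index:
            continue  # the base point cannot support itself
        in_range = math.hypot(x - bx, y - by) <= radius  # assumption: circular range
        if in_range and values[i] > base_value:
            support.append(i)
    return support


points = [(0, 0), (1, 0), (0, 2), (10, 10), (2, 2)]
values = [5, 7, 3, 9, 6]
support = select_support_points(points, values, base_index=0, radius=3.0)
```

In this toy data, point 3 has the highest value but lies outside the range, and point 2 is in range but has a lower value than the base point, so neither is selected.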

In step S121 of FIG. 9, the support point selection unit 34 acquires the discrimination capability value of each of the P feature points in the model image 21.

In step S122, the support point selection unit 34 selects one or more base points from the P feature points. The method of selecting base points is not particularly limited.

In step S123, the support point selection unit 34 takes one of the base points as the processing target and extracts the other feature points that lie within a certain range of the position of that base point.

In step S124, the support point selection unit 34 determines whether the discrimination capability value of the extracted feature point is higher than that of the base point.

ここで、ステップＳ１２３の処理で、１つの特徴点も抽出されない場合がある。かかる場合には、ステップＳ１２４の処理でＮＯであると強制的に判定されて、処理はステップＳ１２６に進むとする。なお、ステップＳ１２６以降の処理については後述する。 Here, the process of step S123 may extract no feature point at all. In that case, the determination in step S124 is forced to NO, and the process proceeds to step S126. The processing from step S126 onward will be described later.

逆に、ステップＳ１２３の処理で、複数の特徴点が抽出される場合がある。かかる場合には、複数の特徴点のうちの所定の１つがステップＳ１２４の処理対象となり、その処理対象の特徴点の識別能力値が、ベース点の識別能力値より高いか否かが判定される。 Conversely, a plurality of feature points may be extracted in the process of step S123. In that case, a predetermined one of the plurality of feature points becomes the processing target of step S124, and it is determined whether or not the discrimination ability value of that feature point is higher than the discrimination ability value of the base point.

サポート点選択部３４は、ステップＳ１２４において、抽出された特徴点の識別能力値がベース点の識別能力値より高いと判定した場合、ステップＳ１２５において、抽出された特徴点（複数の特徴点が抽出されている場合には、処理対象の特徴点）をサポート点として選択する。これにより、処理はステップＳ１２６に進む。 If the support point selection unit 34 determines in step S124 that the discrimination ability value of the extracted feature point is higher than that of the base point, then in step S125 it selects the extracted feature point (when a plurality of feature points have been extracted, the feature point being processed) as a support point. The process then proceeds to step S126.

これに対して、ステップＳ１２４において、抽出された特徴点の識別能力値がベース点の識別能力値より低いと判定された場合、ステップＳ１２５の処理は実行されずに、即ち、抽出された特徴点（複数の特徴点が抽出されている場合には、処理対象の特徴点）はサポート点として選択されずに、処理はステップＳ１２６に進む。 On the other hand, if it is determined in step S124 that the discrimination ability value of the extracted feature point is lower than that of the base point, the process of step S125 is not executed; that is, the extracted feature point (when a plurality of feature points have been extracted, the feature point being processed) is not selected as a support point, and the process proceeds to step S126.

ステップＳ１２６において、サポート点選択部３４は、他に抽出された特徴点があるか否かを判定する。 In step S126, the support point selection unit 34 determines whether there are other extracted feature points.

即ち、上述したように、ステップＳ１２３の処理で複数の特徴点が抽出された場合には、ステップＳ１２６の処理でＹＥＳであると判定され、処理はステップＳ１２４に戻され、それ以降の処理が繰り返される。即ち、複数の特徴点のそれぞれが順次処理対象となり、ステップＳ１２４，Ｓ１２５，Ｓ１２６のループ処理が繰り返し実行される。その結果、複数の特徴点のうちの、ベース点よりも識別能力値が高い特徴点のみがサポート点として選択されることになる。複数の特徴点の全てについて上述のループ処理が実行されると、最後のループ処理のステップＳ１２６においてＮＯであると判定されて、処理はステップＳ１２７に進む。 That is, as described above, when a plurality of feature points have been extracted in the process of step S123, the determination in step S126 is YES, the process returns to step S124, and the subsequent processing is repeated. In other words, each of the plurality of feature points becomes the processing target in turn, and the loop of steps S124, S125, and S126 is executed repeatedly. As a result, only those of the plurality of feature points whose discrimination ability value is higher than that of the base point are selected as support points. Once the loop has been executed for all of the plurality of feature points, the determination in step S126 of the final iteration is NO, and the process proceeds to step S127.

また、ステップＳ１２３の処理で１つの特徴点のみが抽出された場合または１つの特徴点も抽出されなかった場合におけるステップＳ１２６の処理では、直ちにＮＯであると判定され、処理はステップＳ１２７に進む。 When only one feature point, or no feature point, has been extracted in the process of step S123, the determination in step S126 is immediately NO, and the process proceeds to step S127.

ステップＳ１２７において、サポート点選択部３４は、他にベース点があるか否かを判定する。 In step S127, the support point selection unit 34 determines whether there is another base point.

まだ、処理対象となっていないベース点が存在する場合には、ステップＳ１２７の処理でＹＥＳであると判定されて、処理はステップＳ１２３に戻されてそれ以降の処理が繰り返される。 If there is still a base point that is not subject to processing, it is determined YES in step S127, the process returns to step S123, and the subsequent processing is repeated.

このようにして、１以上のベース点のそれぞれについて、サポート点が０以上選択されると、ステップＳ１２７の処理でＮＯであると判定されて、サポート点選択処理は終了する。 In this manner, when 0 or more support points are selected for each of one or more base points, it is determined NO in the process of step S127, and the support point selection process ends.
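The flow of FIG. 9 can be sketched in a few lines of Python. This is a minimal illustration only: the function name `select_support_points`, the use of Euclidean distance with a `radius` parameter for the "certain range", and the sample data are all assumptions, since the text leaves both the range and the base point selection method open.

```python
import math

def select_support_points(points, scores, base_indices, radius):
    """Return {base index: [support indices]} following the FIG. 9 flow."""
    support = {}
    for b in base_indices:                      # outer loop closed by S127
        bx, by = points[b]
        # S123: other feature points within a fixed range of the base point
        # (Euclidean distance is an assumption; the text only says "range").
        nearby = [i for i, (x, y) in enumerate(points)
                  if i != b and math.hypot(x - bx, y - by) <= radius]
        # S124 to S126: keep only points whose discrimination ability value
        # is strictly higher than that of the base point.
        support[b] = [i for i in nearby if scores[i] > scores[b]]
    return support

# Hypothetical feature points (x, y) and discrimination ability values.
points = [(0, 0), (1, 0), (0, 2), (10, 10)]
scores = [0.5, 0.9, 0.4, 0.95]
result = select_support_points(points, scores, base_indices=[0, 3], radius=3.0)
print(result)
```

With this data, base point 0 gets one support point and base point 3 gets none, which matches the statement that zero or more support points are selected per base point.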

具体的には例えば、図１０（Ａ）乃至（Ｃ）のそれぞれには、ベース点とサポート点の選択結果が示されている。即ち、同一のモデル画像２１から、３つのベース点が選択されており、それぞれのベース点が○（白丸）として、図１０（Ａ）乃至（Ｃ）のそれぞれに示されている。そして、それらの３つのベース点に対して選択された複数のサポート点が、図１０（Ａ）乃至（Ｃ）に、ベース点を示す○（白丸）よりも小さな●（黒丸）としてそれぞれ示されている。 Specifically, for example, FIGS. 10A to 10C each show a result of selecting a base point and its support points. That is, three base points are selected from the same model image 21, and each base point is shown as ○ (a white circle) in one of FIGS. 10A to 10C. The plurality of support points selected for each of the three base points are shown in FIGS. 10A to 10C as ● (black circles) smaller than the ○ (white circle) indicating the base point.

図２に戻り、モデル特徴量情報生成部３５は、以上説明した特徴点抽出部３１乃至サポート点選択部３４の各種処理結果を示すモデル特徴量情報（ベース点＋サポート点）を生成し、モデル特徴量辞書１２に登録する。即ち、モデル特徴量情報とは、モデル画像２１−１乃至２１−Ｎのそれぞれについて、抽出された各特徴点に関する情報をいう。具体的には例えば、それらの特徴点がベース点とサポート点との別に区別されて、それぞれについての局所特徴量や識別能力値、また、サポート点情報等から構成される情報が、モデル特徴量情報である。 Returning to FIG. 2, the model feature amount information generation unit 35 generates model feature amount information (base points + support points) indicating the results of the various processes of the feature point extraction unit 31 through the support point selection unit 34 described above, and registers it in the model feature amount dictionary 12. That is, the model feature amount information is information on each feature point extracted from each of the model images 21-1 to 21-N. Specifically, for example, the feature points are distinguished as base points or support points, and the model feature amount information consists of the local feature amount and discrimination ability value of each, together with support point information and the like.

以上、図１の物体認識装置のうちのモデル特徴量抽出部１１の詳細について説明した。以下、クエリ画像認識部１３の詳細について説明する。 The details of the model feature quantity extraction unit 11 in the object recognition apparatus of FIG. 1 have been described above. Details of the query image recognition unit 13 will be described below.

図１１は、クエリ画像認識部１３の機能の詳細な構成を示すブロック図である。 FIG. 11 is a block diagram illustrating a detailed configuration of functions of the query image recognition unit 13.

クエリ画像認識部１３は、特徴画像生成部５１、相関画像生成部５２、シフト相関画像生成部５３、相関画像和生成部５４、および判定部５５を含むように構成される。 The query image recognition unit 13 is configured to include a feature image generation unit 51, a correlation image generation unit 52, a shift correlation image generation unit 53, a correlation image sum generation unit 54, and a determination unit 55.

特徴画像生成部５１は、認識させたい物体を含むクエリ画像２２が入力されると、そのクエリ画像２２から特徴量画像を生成する。即ち、上述した図６のステップＳ１０４と同様の処理がクエリ画像２２に施されることになる。 When a query image 22 including an object to be recognized is input, the feature image generation unit 51 generates a feature amount image from the query image 22. That is, the processing similar to step S104 in FIG. 6 described above is performed on the query image 22.

相関画像生成部５２は、クエリ画像２２の特徴量画像の各画素値（即ち、各画素の特徴量）と、モデル特徴量辞書１２に記録された各モデル画像２１−１乃至２１−Ｎの各特徴点（以下、各モデル特徴点と称する）の特徴量とのマッチングを行い、それぞれの相関（距離）値を各画素値として構成する画像、即ち、相関画像を生成する。 The correlation image generation unit 52 matches each pixel value of the feature amount image of the query image 22 (that is, the feature amount of each pixel) against the feature amount of each feature point (hereinafter referred to as a model feature point) of each of the model images 21-1 to 21-N recorded in the model feature amount dictionary 12, and generates an image whose pixel values are the resulting correlation (distance) values, that is, a correlation image.

シフト相関画像生成部５３は、各モデル特徴点の位置に応じて、それぞれ対応する相関画像の各画素位置をシフトさせた画像（以下、シフト相関画像と称する）を生成する。なお、シフト相関画像の生成手法については、図１２乃至図１６を参照して後述する。 The shift correlation image generation unit 53 generates an image (hereinafter referred to as a shift correlation image) by shifting the pixel positions of each corresponding correlation image according to the position of each model feature point. The method of generating a shift correlation image will be described later with reference to FIGS. 12 to 16.

相関画像和生成部５４は、モデル画像２１−１乃至２１−Ｎ毎に、各モデル特徴点の各シフト相関画像、若しくは、それらに対して各種画像処理を施した後の各画像の和を取った画像（以下、相関和画像と称する）を生成する。即ち、相関和画像とは、２以上の画像の各画素値の総和を、それぞれの画素値として構成される画像をいう。 For each of the model images 21-1 to 21-N, the correlation image sum generation unit 54 generates an image (hereinafter referred to as a correlation sum image) by summing the shift correlation images of the model feature points, or the images obtained by applying various kinds of image processing to them. That is, a correlation sum image is an image in which each pixel value is the sum of the corresponding pixel values of two or more images.

なお、相関和画像の生成手法（シフト相関画像に対して施される各種画像処理の例含む）の具体例については、図１２乃至図１６を参照して後述する。 Specific examples of the method of generating a correlation sum image (including examples of the various kinds of image processing applied to the shift correlation images) will be described later with reference to FIGS. 12 to 16.

判定部５５は、モデル画像２１−１乃至２１−Ｎのそれぞれに対して生成された各相関和画像に基づいて、モデル画像２１−１乃至２１−Ｎに含まれる各物体がクエリ画像２２に含まれている物体と同一であるか否かを判定し、その判定結果を出力する。 Based on the correlation sum images generated for the model images 21-1 to 21-N, the determination unit 55 determines whether or not each object included in the model images 21-1 to 21-N is the same as the object included in the query image 22, and outputs the determination result.

即ち、所定のモデル画像２１−Ｋについての相関和画像のうちの、シフト相関画像の生成時のシフト位置（後述する例では中央付近の位置）の画素値が、相関和画像のローカルピークとなる。そして、かかるローカルピークが、モデル画像２１−Ｋに含まれる物体が、クエリ画像２２においてどの程度の割合で存在するのかを示す存在推定度を表すことになる。よって、判定部５５は、モデル画像２１−Ｋの相関和画像のローカルピークが閾値以上の場合、モデル画像２１−Ｋに含まれる物体がクエリ画像２２に含まれる画像と一致すると判定すること、即ち、その物体を認識することができる。 That is, in the correlation sum image for a given model image 21-K, the pixel value at the shift position used when generating the shift correlation images (a position near the center in the examples described later) becomes a local peak of the correlation sum image. This local peak represents an existence estimation degree indicating to what extent the object included in the model image 21-K is present in the query image 22. Therefore, when the local peak of the correlation sum image of the model image 21-K is equal to or greater than a threshold, the determination unit 55 can determine that the object included in the model image 21-K matches the object included in the query image 22, that is, it can recognize the object.
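The thresholding step of the determination unit 55 might look like the following sketch. The names `peak` and `recognized_models`, the model identifiers, the pixel values, and the threshold are all hypothetical, and the global maximum of each image stands in for the local peak near the shift position, which is a simplification.

```python
def peak(image):
    """Return (value, (x, y)) of the largest pixel in a 2-D list."""
    return max((v, (x, y)) for y, row in enumerate(image) for x, v in enumerate(row))

def recognized_models(sum_images, threshold):
    """Keep the models whose correlation sum peak reaches the threshold."""
    hits = {}
    for model_id, image in sum_images.items():
        value, position = peak(image)
        if value >= threshold:          # "equal to or greater than" per the text
            hits[model_id] = (value, position)
    return hits

# Hypothetical correlation sum images for two model images.
sum_images = {
    "model_1": [[0.1, 0.2, 0.1], [0.2, 3.6, 0.3], [0.1, 0.2, 0.1]],
    "model_2": [[0.4, 0.3, 0.2], [0.3, 0.9, 0.4], [0.2, 0.3, 0.1]],
}
hits = recognized_models(sum_images, threshold=2.0)
print(hits)
```

Only `model_1` passes here, and its peak sits at the center pixel, which is where the text expects the peak when the object appears at the same position as in the model image.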

以下、図１２乃至図１６を参照して、クエリ画像認識部１３の動作のうちの、主に相関画像生成部５２乃至相関画像和生成部５４の動作について説明する。 Hereinafter, among the operations of the query image recognition unit 13, mainly the operations of the correlation image generation unit 52 through the correlation image sum generation unit 54 will be described with reference to FIGS. 12 to 16.

即ち、図１３乃至図１６は、図１２（Ａ）の画像がクエリ画像２２として入力された場合における、図１２（Ｂ）のモデル画像２１との相関和画像が生成されるまでの各種処理結果の具体例を示している。 That is, FIGS. 13 to 16 show various processing results until the correlation sum image with the model image 21 of FIG. 12B is generated when the image of FIG. 12A is input as the query image 22. A specific example is shown.

図１３の例では、モデル画像２１の特徴量情報として、４つのベース点b1乃至b4の特徴量のみが利用されて、相関和画像が生成される。換言すると、図１３の例では、後述する他の例のように、サポート点の情報や、識別能力値は一切利用されない。なお、ベース点b1乃至b4は例示にしか過ぎず、ベース点の位置や個数は図１３の例に限定されず任意であることは言うまでもない。 In the example of FIG. 13, only the feature amounts of the four base points b1 to b4 are used as the feature amount information of the model image 21 to generate the correlation sum image. In other words, unlike the other examples described later, the example of FIG. 13 uses neither the support point information nor the discrimination ability values. The base points b1 to b4 are merely examples; needless to say, the positions and number of base points are arbitrary and are not limited to the example of FIG. 13.

図１３のステップＳ１３１において、相関画像生成部５２は、クエリ画像２２の特徴量画像の各画素値（即ち、各画素の特徴量）と、モデル画像２１のベース点b1乃至b4の各特徴量とのマッチングを行うことで、図１３のＳ１３１の枠内に示されるような４つの相関画像を生成する。 In step S131 of FIG. 13, the correlation image generation unit 52 determines each pixel value of the feature amount image of the query image 22 (that is, a feature amount of each pixel) and each feature amount of the base points b1 to b4 of the model image 21. By performing this matching, four correlation images as shown in the frame of S131 of FIG. 13 are generated.

ステップＳ１３２において、シフト相関画像生成部５３は、各ベース点b1乃至b4の位置に応じて、それぞれ対応する相関画像の各画素位置をシフトさせることで、図１３のＳ１３２の枠内に示されるような４つのシフト相関画像を生成する。 In step S132, the shift correlation image generation unit 53 shifts each pixel position of the corresponding correlation image in accordance with the positions of the base points b1 to b4, as shown in the frame of S132 in FIG. Four shift correlation images are generated.

図１３の例のシフト相関画像は、モデル画像２１におけるベース点bn（nは整数値であって、図１３の例ではnは１乃至４のうちの何れかの値）の存在位置（相関画像の対応画素位置）が、画像の中央位置にシフトするように、相関画像の各画素位置がシフトされた結果得られる画像となっている。 Each shift correlation image in the example of FIG. 13 is the image obtained by shifting the pixel positions of the correlation image so that the position of the base point bn in the model image 21 (the corresponding pixel position in the correlation image) moves to the center position of the image, where n is an integer value (one of 1 to 4 in the example of FIG. 13).

ステップＳ１３３において、相関画像和生成部５４は、これらの４つのシフト相関画像を単純に足し合わせることで、図１３のＳ１３３の枠内に示されるような相関和画像を生成する。なお、「足し合わせるとは」、上述の如く、各画素毎に、各画素値を足し合わせることを意味する。このことは、以下の説明でも同様である。 In step S133, the correlation image sum generation unit 54 simply adds these four shift correlation images to generate a correlation sum image as shown in the frame of S133 in FIG. Note that “adding” means adding the pixel values for each pixel as described above. The same applies to the following description.
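Steps S131 to S133 can be illustrated with a toy example in plain Python. Everything concrete here is an assumption made for illustration: features are single numbers rather than local feature vectors, `1 / (1 + |difference|)` is a stand-in similarity score, pixels shifted in from outside the image are filled with zero, and the function names are hypothetical.

```python
def correlation_image(feature_img, feature):
    # Hypothetical similarity score: 1.0 for an exact match, smaller otherwise.
    return [[1.0 / (1.0 + abs(v - feature)) for v in row] for row in feature_img]

def shift_to(img, src, dst):
    """Translate img so the pixel at src=(x, y) lands on dst=(x, y); zero-fill."""
    h, w = len(img), len(img[0])
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out

def correlation_sum(feature_img, base_points, center):
    """base_points: list of ((x, y), feature) taken from the model image."""
    h, w = len(feature_img), len(feature_img[0])
    total = [[0.0] * w for _ in range(h)]
    for pos, feature in base_points:
        cor = correlation_image(feature_img, feature)   # S131
        shifted = shift_to(cor, pos, center)            # S132
        for y in range(h):
            for x in range(w):
                total[y][x] += shifted[y][x]            # S133: per-pixel sum
    return total

# Toy query feature image containing both base point features at the same
# positions as in the model image.
query = [[0] * 5 for _ in range(5)]
query[1][1] = 5   # pixel at (x=1, y=1)
query[2][3] = 9   # pixel at (x=3, y=2)
base_points = [((1, 1), 5), ((3, 2), 9)]
total = correlation_sum(query, base_points, center=(2, 2))
best = max((v, (x, y)) for y, row in enumerate(total) for x, v in enumerate(row))
print(best)
```

Because each shift moves a base point's match onto the image center, the individual peaks line up and add, so the largest value of the sum appears at the center pixel.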

このような図１３の例に対して、図１４の例では、モデル画像２１の特徴量情報として、４つのベース点b1乃至b4の特徴量に加えて、それらの識別能力値に基づく重み値α１乃至α４が利用されて、相関和画像が生成される。 In contrast to the example of FIG. 13, in the example of FIG. 14 the correlation sum image is generated using, as the feature amount information of the model image 21, the weight values α1 to α4 based on the discrimination ability values of the four base points b1 to b4 in addition to their feature amounts.

即ち、ステップＳ１４１において、相関画像生成部５２は、クエリ画像２２の特徴量画像の各画素値（即ち、各画素の特徴量）と、モデル画像２１のベース点b1乃至b4の各特徴量とのマッチングを行うことで、図１４のＳ１４１の枠内に示されるような４つの相関画像を生成する。 That is, in step S141, the correlation image generation unit 52 matches each pixel value of the feature amount image of the query image 22 (that is, the feature amount of each pixel) against each feature amount of the base points b1 to b4 of the model image 21, thereby generating four correlation images as shown in the frame of S141 in FIG. 14.

なお、図１４のＳ１４１の枠内に示される４つの相関画像とは、図１３のＳ１３１の枠内に示される４つの相関画像と同一である。即ち、ステップＳ１４１の処理とステップＳ１３１の処理とは同様の処理である。 Note that the four correlation images shown in the frame of S141 in FIG. 14 are the same as the four correlation images shown in the frame of S131 in FIG. That is, the process of step S141 and the process of step S131 are the same process.

ステップＳ１４２の処理では、シフト相関画像の生成処理が実行される。ただし、ステップＳ１４２の処理は、図１３のステップＳ１３２の処理とは異なる。 In the process of step S142, a shift correlation image generation process is executed. However, the process of step S142 is different from the process of step S132 of FIG.

即ち、ステップＳ１４２−１において、シフト相関画像生成部５３は、各ベース点b1乃至b4の位置に応じて、それぞれ対応する相関画像の各画素位置をシフトさせることで、図１４のＳ１４２−１の点線枠内に示されるような４つのシフト相関画像を生成する。 That is, in step S142-1, the shift correlation image generation unit 53 shifts the pixel positions of each corresponding correlation image according to the position of each of the base points b1 to b4, thereby generating four shift correlation images as shown in the dotted frame of S142-1 in FIG. 14.

なお、図１４のＳ１４２−１の点線枠内に示される４つのシフト相関画像とは、図１３のＳ１３２の枠内に示される４つのシフト相関画像と同一である。即ち、ステップＳ１４２−１の処理とは、図１３のステップＳ１３２の処理と同様の処理である。 Note that the four shift correlation images shown in the dotted frame in S142-1 in FIG. 14 are the same as the four shift correlation images shown in the frame in S132 in FIG. That is, the process of step S142-1 is the same process as the process of step S132 of FIG.

換言すると、ステップＳ１４２の処理とは、図１３のステップＳ１３２（＝ステップＳ１４２−１）の処理に加えて、さらに次のようなステップＳ１４２−２の処理が付加された処理であるといる。なお、ステップＳ１４２−２の処理で最終的に得られるシフト相関画像と、ステップＳ１４２−１の処理の結果得られるシフト相関画像とを個々に区別すべく、以下、前者を重みつきシフト相関画像と称し、後者を単純シフト相関画像と称する。 In other words, the process of step S142 is a process in which the following process of step S142-2 is further added to the process of step S132 (= step S142-1) of FIG. In order to distinguish the shift correlation image finally obtained by the process of step S142-2 from the shift correlation image obtained as a result of the process of step S142-1, hereinafter, the former will be referred to as a weighted shift correlation image. The latter is referred to as a simple shift correlation image.

即ち、ステップＳ１４２−１の処理では、図１４のＳ１４２−１の点線枠内に示される４つの単純シフト相関画像が生成される。そこで、ステップＳ１４２−２において、シフト相関画像生成部５３は、各ベース点b1乃至b4のそれぞれに対応する各単純シフト相関画像の各画素値に対して、各ベース点b1乃至b4の識別能力値に基づく重み値α1乃至α4をそれぞれ掛けることで、識別能力値に応じた重み付けがなされた各画素値により構成される画像、即ち、図１４のＳ１４２−２の点線枠内に示されるような４つの重みつきシフト相関画像を生成する。 That is, the process of step S142-1 generates the four simple shift correlation images shown in the dotted frame of S142-1 in FIG. 14. Then, in step S142-2, the shift correlation image generation unit 53 multiplies each pixel value of the simple shift correlation image corresponding to each of the base points b1 to b4 by the weight value α1 to α4 based on the discrimination ability value of that base point. This yields images composed of pixel values weighted according to the discrimination ability values, that is, the four weighted shift correlation images shown in the dotted frame of S142-2 in FIG. 14.

ステップＳ１４３において、相関画像和生成部５４は、これらの４つの重みつきシフト相関画像を単純に足し合わせることで、図１４のＳ１４３の枠内に示されるような相関和画像を生成する。 In step S143, the correlation image sum generation unit 54 generates a correlation sum image as shown in the frame of S143 in FIG. 14 by simply adding these four weighted shift correlation images.
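The weighting of steps S142-2 and S143 reduces to a per-pixel weighted sum. The sketch below assumes the simple shift correlation images already exist; the function name, the 2×2 images, and the weight values are made up for illustration, and how a weight is derived from a discrimination ability value is not specified here.

```python
def weighted_correlation_sum(simple_shifted, weights):
    """Scale each simple shift correlation image by its weight (S142-2),
    then add the results pixel by pixel (S143)."""
    h, w = len(simple_shifted[0]), len(simple_shifted[0][0])
    return [[sum(a * img[y][x] for a, img in zip(weights, simple_shifted))
             for x in range(w)] for y in range(h)]

# Two hypothetical simple shift correlation images.
simple_shifted = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]
weights = [0.5, 2.0]   # hypothetical weight values for two base points
total = weighted_correlation_sum(simple_shifted, weights)
print(total)
```

A base point with a larger weight contributes proportionally more to every pixel of the correlation sum image, so highly discriminative base points dominate the peak.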

このような図１３，図１４の例に対して、図１５の例では、モデル画像２１の特徴量情報として、４つのベース点b1乃至b4の特徴量に加えてさらに、各ベース点b1乃至b4のサポート点の情報が利用されて、相関和画像が生成される。ただし、図１５の例では、図１４の例のように、識別能力値に基づく重み値α１乃至α４は利用されない。 In contrast to the examples of FIGS. 13 and 14, in the example of FIG. 15 the correlation sum image is generated using, as the feature amount information of the model image 21, the information on the support points of each of the base points b1 to b4 in addition to the feature amounts of the four base points b1 to b4. Unlike the example of FIG. 14, however, the example of FIG. 15 does not use the weight values α1 to α4 based on the discrimination ability values.

ステップＳ１５１の処理では、相関画像の生成処理が実行される。ただし、ステップＳ１５１の処理は、図１３のステップＳ１３１や図１４のステップＳ１４１の処理とは異なる。 In the process of step S151, a correlation image generation process is executed. However, the process of step S151 is different from the process of step S131 of FIG. 13 and step S141 of FIG.

即ち、ステップＳ１５１−１において、相関画像生成部５２は、クエリ画像２２の特徴量画像の各画素値（即ち、各画素の特徴量）と、モデル画像２１のベース点b1乃至b4の各特徴量とのマッチングを行うことで、図１５のＳ１５１−１の枠内に示されるような４つの相関画像を生成する。 That is, in step S151-1, the correlation image generation unit 52 matches each pixel value of the feature amount image of the query image 22 (that is, the feature amount of each pixel) against each feature amount of the base points b1 to b4 of the model image 21, thereby generating four correlation images as shown in the frame of S151-1 in FIG. 15.

なお、図１５のＳ１５１−１の枠内に示される４つの相関画像とは、図１３のＳ１３１の枠内に示される４つの相関画像と同一、即ち図１４のＳ１４１の枠内に示される４つの相関画像と同一である。即ち、ステップＳ１５１−１の処理とは、図１３のステップＳ１３１や図１４のステップＳ１４１の処理と同様の処理である。 Note that the four correlation images shown in the frame of S151-1 in FIG. 15 are the same as the four correlation images shown in the frame of S131 in FIG. 13, and hence the same as the four correlation images shown in the frame of S141 in FIG. 14. That is, the process of step S151-1 is the same as the process of step S131 in FIG. 13 and step S141 in FIG. 14.

換言すると、ステップＳ１５１の処理とは、図１３のステップＳ１３１（＝図１４のステップＳ１４１＝図１５のステップＳ１５１−１）の処理に加えて、さらに次のようなステップＳ１５１−２，Ｓ１５１−３の処理が付加された処理であるといえる。なお、以下、各ステップＳ１５１−１乃至Ｓ１５１−３の各処理の結果得られる相関画像を個々に区別すべく、ステップＳ１５１−１の処理の結果得られる相関画像をベース点相関画像と称し、ステップＳ１５１−２の処理の結果得られる相関画像をサポート点シフト相関画像と称し、ステップＳ１５１−３の処理の結果得られる相関画像を、ベース点bnを中心としたサポート点シフト相関画像和と称する。 In other words, the process of step S151 can be regarded as the process of step S131 in FIG. 13 (= step S141 in FIG. 14 = step S151-1 in FIG. 15) with the following processes of steps S151-2 and S151-3 added. Hereinafter, to distinguish the correlation images obtained in steps S151-1 to S151-3, the correlation image obtained by step S151-1 is referred to as a base point correlation image, the correlation image obtained by step S151-2 as a support point shift correlation image, and the correlation image obtained by step S151-3 as a support point shift correlation image sum centered on the base point bn.

即ち、ステップＳ１５１−１の処理では、図１５のＳ１５１−１の点線枠内に示される４つのベース点相関画像が生成される。 That is, in the process of step S151-1, four base point correlation images shown in the dotted line frame of S151-1 of FIG. 15 are generated.

ステップＳ１５１−２において、相関画像生成部５２は、モデル画像２１のベース点bnについて、クエリ画像２２の特徴量画像の各画素値（即ち、各画素の特徴量）と、ベース点bnにおけるサポート点snm（mは、１以上の整数値）の各特徴量とのマッチングをそれぞれ行うことで、m個の相関画像を生成する。さらに、相関画像生成部５２は、サポート点snmの存在位置（相関画像の対応画素位置）を、ベース点bnの存在位置（相関画像の対応画素位置）にシフトすることで、ベース点b1乃至b4のそれぞれについて、図１５のＳ１５１−２の枠内に示されるようなm個のサポート点シフト相関画像をそれぞれ生成する。 In step S151-2, for the base point bn of the model image 21, the correlation image generation unit 52 matches each pixel value of the feature amount image of the query image 22 (that is, the feature amount of each pixel) against the feature amount of each support point snm of the base point bn (m is an integer value of 1 or more), thereby generating m correlation images. The correlation image generation unit 52 then shifts each of these images so that the position of the support point snm (the corresponding pixel position in the correlation image) moves to the position of the base point bn (the corresponding pixel position in the correlation image), thereby generating, for each of the base points b1 to b4, m support point shift correlation images as shown in the frame of S151-2 in FIG. 15.

即ち、ベース点b1には、2個のサポート点s11，s12が存在する。よって、サポート点s11についてのサポート点シフト相関画像と、サポート点s12についてのサポート点シフト相関画像が生成される。 That is, there are two support points s11 and s12 at the base point b1. Therefore, a support point shift correlation image for the support point s11 and a support point shift correlation image for the support point s12 are generated.

以下同様に、ベース点b2には、3個のサポート点s21，s22，s23が存在する。よって、サポート点s21についてのサポート点シフト相関画像、サポート点s22についてのサポート点シフト相関画像、および、サポート点s23についてのサポート点シフト相関画像が生成される。 Similarly, there are three support points s21, s22, and s23 at the base point b2. Therefore, a support point shift correlation image for the support point s21, a support point shift correlation image for the support point s22, and a support point shift correlation image for the support point s23 are generated.

ベース点b3には、2個のサポート点s31，s32が存在する。よって、サポート点s31についてのサポート点シフト相関画像と、サポート点s32についてのサポート点シフト相関画像が生成される。 There are two support points s31 and s32 at the base point b3. Therefore, a support point shift correlation image for the support point s31 and a support point shift correlation image for the support point s32 are generated.

ベース点b4には、1個のサポート点s41が存在する。よって、サポート点s41についてのサポート点シフト相関画像が生成される。 There is one support point s41 at the base point b4. Therefore, a support point shift correlation image for the support point s41 is generated.

ステップＳ１５１−３において、相関画像生成部５２は、モデル画像２１のベース点bnについて、対応するベース点相関画像（ステップＳ１５１−１の処理の結果得られる画像）と、対応するm個のサポート点シフト相関画像（ステップＳ１５１−２の処理の結果得られる画像）を単純に足し合わせることで、図１５のＳ１５１−３の枠内に示されるような、ベース点bnを中心としたサポート点シフト相関画像和を生成する。 In step S151-3, for the base point bn of the model image 21, the correlation image generation unit 52 simply adds together the corresponding base point correlation image (the image obtained by step S151-1) and the corresponding m support point shift correlation images (the images obtained by step S151-2), thereby generating a support point shift correlation image sum centered on the base point bn, as shown in the frame of S151-3 in FIG. 15.

即ち、ベース点b1については、ベース点b1についてのベース点相関画像、並びに、サポート点s11についてのサポート点シフト相関画像およびサポート点s12についてのサポート点シフト相関画像が足し合わされ、その結果、ベース点b1を中心としたサポート点シフト相関画像和が生成される。 That is, for the base point b1, the base point correlation image for the base point b1, and the support point shift correlation image for the support point s11 and the support point shift correlation image for the support point s12 are added together. A support point shift correlation image sum centered on b1 is generated.

以下同様に、ベース点b2については、ベース点b2についてのベース点相関画像、並びに、サポート点s21についてのサポート点シフト相関画像、サポート点s22についてのサポート点シフト相関画像、および、サポート点s23についてのサポート点シフト相関画像が足し合わされ、その結果、ベース点b2を中心としたサポート点シフト相関画像和が生成される。 Similarly, for the base point b2, the base point correlation image for the base point b2, the support point shift correlation image for the support point s21, the support point shift correlation image for the support point s22, and the support point s23. These support point shift correlation images are added together, and as a result, a support point shift correlation image sum centered on the base point b2 is generated.

ベース点b3については、ベース点b3についてのベース点相関画像、並びに、サポート点s31についてのサポート点シフト相関画像およびサポート点s32についてのサポート点シフト相関画像が足し合わされ、その結果、ベース点b3を中心としたサポート点シフト相関画像和が生成される。 For the base point b3, the base point correlation image for the base point b3, and the support point shift correlation image for the support point s31 and the support point shift correlation image for the support point s32 are added together. A centered support point shift correlation image sum is generated.

ベース点b4については、ベース点b4についてのベース点相関画像、並びに、サポート点s41についてのサポート点シフト相関画像が足し合わされ、その結果、ベース点b4を中心としたサポート点シフト相関画像和が生成される。 For the base point b4, the base point correlation image for the base point b4 and the support point shift correlation image for the support point s41 are added, and as a result, a support point shift correlation image sum centered on the base point b4 is generated. Is done.
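For a single base point, steps S151-2 and S151-3 can be sketched as follows. The correlation images are given directly as small hand-made arrays, and the function names and the zero-fill at the image border are assumptions made for illustration.

```python
def shift_to(img, src, dst):
    """Translate img so the pixel at src=(x, y) lands on dst=(x, y); zero-fill."""
    h, w = len(img), len(img[0])
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out

def support_point_sum(base_cor, base_pos, supports):
    """supports: list of (correlation image, support point position)."""
    total = [row[:] for row in base_cor]             # base point correlation image
    for cor, pos in supports:
        shifted = shift_to(cor, pos, base_pos)       # S151-2: support -> base
        for y, row in enumerate(shifted):
            for x, v in enumerate(row):
                total[y][x] += v                     # S151-3: per-pixel sum
    return total

# Hypothetical 3x3 correlation images, each with a single perfect match.
base_cor = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]  # peak at base (1, 1)
s1_cor = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]    # peak at support (2, 1)
s2_cor = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]    # peak at support (0, 0)
total = support_point_sum(base_cor, (1, 1), [(s1_cor, (2, 1)), (s2_cor, (0, 0))])
print(total)
```

When the support points appear in the query at the same offsets from the base point as in the model image, their shifted peaks coincide with the base point's own peak and reinforce it.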

その後のステップＳ１５２，Ｓ１５３の処理は、図１３のステップＳ１３２，Ｓ１３３の処理と基本的に同様の処理が実行される。ただし、図１３のステップＳ１３２の処理対象は、図１５のステップＳ１５１−１の処理結果であるベース点相関画像となっていた。これに対して、図１５のステップＳ１５２の処理対象は、上述の如く、図１５のステップＳ１５１−１の処理結果であるベース点相関画像に対して、ステップＳ１５１−２の処理結果であるサポート点シフト相関画像が足し合わされた結果得られる画像、即ち、ベース点を中心としたサポート点シフト相関画像和である。 Subsequent processes in steps S152 and S153 are basically the same as the processes in steps S132 and S133 in FIG. However, the processing target in step S132 in FIG. 13 is the base point correlation image that is the processing result in step S151-1 in FIG. On the other hand, as described above, the processing target of step S152 in FIG. 15 is the support point that is the processing result of step S151-2 with respect to the base point correlation image that is the processing result of step S151-1 of FIG. This is an image obtained as a result of adding the shift correlation images, that is, a support point shift correlation image sum centered on the base point.

図１６の例は、図１４の例と図１５の例とを組み合わせた例である。即ち、図１６の例では、モデル画像２１の特徴量情報として、４つのベース点b1乃至b4の特徴量に加えて、それらの識別能力値に基づく重み値α１乃至α４と、各ベース点b1乃至b4のサポート点の情報との両者が利用されて、相関和画像が生成される。 The example of FIG. 16 is an example in which the example of FIG. 14 and the example of FIG. 15 are combined. That is, in the example of FIG. 16, in addition to the feature amounts of the four base points b1 to b4, as the feature amount information of the model image 21, weight values α1 to α4 based on their discrimination ability values and the base points b1 to b4 A correlation sum image is generated using both of the support point information of b4.

換言すると、図１６のステップＳ１６１の処理が、図１５のステップＳ１５１の処理と同様の処理である。即ち、図１６のステップＳ１６１−１乃至Ｓ１６１−３のそれぞれが、図１５のステップＳ１５１−１乃至Ｓ１５１−３のそれぞれと同様の処理である。 In other words, the process in step S161 in FIG. 16 is the same as the process in step S151 in FIG. That is, each of steps S161-1 to S161-3 in FIG. 16 is the same processing as each of steps S151-1 to S151-3 in FIG.

一方、図１６のステップＳ１６２の処理が、図１４のステップＳ１４２の処理と同様の処理である。即ち、図１６のステップＳ１６２−１，Ｓ１６２−２のそれぞれが、図１４のステップＳ１４２−１，Ｓ１４２−２のそれぞれと同様の処理である。 On the other hand, the process of step S162 in FIG. 16 is the same as the process of step S142 in FIG. 14. That is, steps S162-1 and S162-2 in FIG. 16 are the same processing as steps S142-1 and S142-2 in FIG. 14, respectively.

式で表すと、図１６のステップＳ１６１の処理結果は、次の式（１）のように表される。 Expressed as an expression, the processing result of step S161 in FIG. 16 is expressed as the following expression (1).

・・・（１）

... (1)

式（１）の左辺のSumSpCor_bn(x,y)が、ベース点bnを中心としたサポート点シフト相関画像和の座標（x,y）における画素値を示している。なお、nは、図１６の例では１乃至4のうちの何れかの値とされているが、任意の整数値に一般化できることはいうまでもない。 SumSpCor_bn(x, y) on the left side of Equation (1) denotes the pixel value at coordinates (x, y) of the support point shift correlation image sum centered on the base point bn. In the example of FIG. 16, n takes one of the values 1 to 4, but it goes without saying that it can be generalized to any integer value.

また、式（１）の右辺において、Cor_snm(x,y)が、サポート点snmの相関画像の(x,y)における画素値を示している。m_bnは、ベース点bnにおけるサポート点の数を示している。即ち、図１６の例では、m_b1＝２，m_b2＝３，m_b3＝２，m_b4＝１とされている。(bx_n，by_n)は、ベース点bnの座標を示している。(snx_m，sny_m)は、サポート点snmの座標を示している。 On the right side of Equation (1), Cor_snm(x, y) denotes the pixel value at (x, y) of the correlation image of the support point snm. m_bn denotes the number of support points of the base point bn; in the example of FIG. 16, m_b1 = 2, m_b2 = 3, m_b3 = 2, and m_b4 = 1. (bx_n, by_n) denotes the coordinates of the base point bn, and (snx_m, sny_m) denotes the coordinates of the support point snm.
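Equation (1) itself is not reproduced in this text. One form consistent with the symbol definitions above and with the description of step S151-3 is given below as a reconstruction, not as the published equation. It rests on two assumptions: the term Cor_bn(x, y) for the base point correlation image is not among the symbols listed for the right side and is included only because step S151-3 adds that image, and the sign of the offsets follows from requiring that the support point position (snx_m, sny_m) be moved onto the base point position (bx_n, by_n).

```latex
\mathrm{SumSpCor}_{b_n}(x, y)
  = \mathrm{Cor}_{b_n}(x, y)
  + \sum_{m=1}^{m_{b_n}}
    \mathrm{Cor}_{s_{nm}}\bigl(x - bx_n + snx_m,\; y - by_n + sny_m\bigr)
```

As a check on the sign convention, evaluating the m-th term at (x, y) = (bx_n, by_n) gives Cor_snm(snx_m, sny_m), the correlation value at the support point's own position.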

そして、図１６のステップＳ１６３の最終的な処理結果は、次の式（２）のように表される。即ち、式（２）の右辺のΣ内の式が、図１６のステップＳ１６２の処理結果を示している。 Then, the final processing result of step S163 in FIG. 16 is expressed as the following equation (2). That is, the expression in Σ on the right side of Expression (2) indicates the processing result of Step S162 in FIG.

・・・（２）

... (2)

式（２）の左辺のSumCor(x,y)が、ステップＳ１６３の処理の結果得られる相関和画像の座標（x,y）における画素値を示している。 SumCor (x, y) on the left side of Expression (2) indicates the pixel value at the coordinates (x, y) of the correlation sum image obtained as a result of the processing in step S163.

また、式（２）の右辺において、(cx,cy)が、モデル画像２１の中心座標を示している。 Further, on the right side of Expression (2), (cx, cy) indicates the center coordinates of the model image 21.
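Equation (2) is likewise not reproduced in this text. A form consistent with the definitions of SumCor, SumSpCor_bn, (bx_n, by_n), (cx, cy), and the weight values α1 to α4 is sketched below as a reconstruction under assumptions: the upper limit 4 reflects the four base points of the FIG. 16 example, and the sign of the offsets follows from requiring that the base point position be moved onto the center (cx, cy).

```latex
\mathrm{SumCor}(x, y)
  = \sum_{n=1}^{4} \alpha_n \,
    \mathrm{SumSpCor}_{b_n}\bigl(x - cx + bx_n,\; y - cy + by_n\bigr)
```

Evaluated at (x, y) = (cx, cy), each term reduces to α_n · SumSpCor_bn(bx_n, by_n), so the contributions of all base points accumulate at the center pixel, which agrees with the statement that the peak appears near the center.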

以上説明したように、本発明を適用することで、モデル画像と、クエリ画像の特徴点抽出のリピータビリティーを考慮する必要がなくなり、よりロバストな認識が可能になる。 As described above, by applying the present invention, it is not necessary to consider the repeatability of model image and query image feature point extraction, and more robust recognition is possible.

また、相関画像和の所定画素値（例えば中央付近の画素値）、即ち、相関値の総和の値が、物体の存在推定度を表すので、この値を比較することにより、どの物体がどれ位の確率で存在しているかが分かるようになる。 In addition, a predetermined pixel value of the correlation sum image (for example, the pixel value near the center), that is, the total of the correlation values, represents the existence estimation degree of an object, so comparing these values shows which object is present and with what probability.

また、自分のモデル画像の他の部分や、他のモデル画像との相関具合を考慮して、特徴量の識別能力値を演算し、その識別能力値に基づいてサポート点の選択もできるので、マッチングの精度が向上する。 Furthermore, the discrimination ability value of each feature amount is calculated taking into account its correlation with other parts of its own model image and with other model images, and support points can also be selected based on that discrimination ability value, so matching accuracy improves.

上述した一連の処理は、ハードウェアにより実行することもできるし、ソフトウェアにより実行することもできる。一連の処理をソフトウェアにより実行する場合には、そのソフトウェアを構成するプログラムが、専用のハードウェアに組み込まれているコンピュータ、または、各種のプログラムをインストールすることで、各種の機能を実行することが可能な、例えば汎用のパーソナルコンピュータなどに、プログラム記録媒体からインストールされる。 The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, the program constituting the software is installed from a program recording medium onto a computer built into dedicated hardware, or onto, for example, a general-purpose personal computer that can execute various functions by installing various programs.

図１７は、上述した一連の処理をプログラムにより実行するパーソナルコンピュータの構成の例を示すブロック図である。CPU（Central Processing Unit）２０１は、ROM（Read Only Memory）２０２、または記憶部２０８に記憶されているプログラムに従って各種の処理を実行する。RAM（Random Access Memory）２０３には、CPU２０１が実行するプログラムやデータなどが適宜記憶される。これらのCPU２０１、ROM２０２、およびRAM２０３は、バス２０４により相互に接続されている。 FIG. 17 is a block diagram showing an example of the configuration of a personal computer that executes the above-described series of processing by a program. A CPU (Central Processing Unit) 201 executes various processes according to a program stored in a ROM (Read Only Memory) 202 or a storage unit 208. A RAM (Random Access Memory) 203 appropriately stores programs executed by the CPU 201 and data. These CPU 201, ROM 202, and RAM 203 are connected to each other by a bus 204.

CPU２０１にはまた、バス２０４を介して入出力インターフェース２０５が接続されている。入出力インターフェース２０５には、キーボード、マウス、マイクロフォンなどよりなる入力部２０６、ディスプレイ、スピーカなどよりなる出力部２０７が接続されている。CPU２０１は、入力部２０６から入力される指令に対応して各種の処理を実行する。そして、CPU２０１は、処理の結果を出力部２０７に出力する。 An input/output interface 205 is also connected to the CPU 201 via the bus 204. Connected to the input/output interface 205 are an input unit 206 composed of a keyboard, a mouse, a microphone, and the like, and an output unit 207 composed of a display, a speaker, and the like. The CPU 201 executes various processes in response to commands input from the input unit 206. The CPU 201 then outputs the processing results to the output unit 207.

入出力インターフェース２０５に接続されている記憶部２０８は、例えばハードディスクからなり、CPU２０１が実行するプログラムや各種のデータを記憶する。通信部２０９は、インターネットやローカルエリアネットワークなどのネットワークを介して外部の装置と通信する。 The storage unit 208 connected to the input/output interface 205 is composed of, for example, a hard disk, and stores programs executed by the CPU 201 and various data. The communication unit 209 communicates with external devices via a network such as the Internet or a local area network.

また、通信部２０９を介してプログラムを取得し、記憶部２０８に記憶してもよい。 Further, a program may be acquired via the communication unit 209 and stored in the storage unit 208.

入出力インターフェース２０５に接続されているドライブ２１０は、磁気ディスク、光ディスク、光磁気ディスク、或いは半導体メモリなどのリムーバブルメディア２１１が装着されたとき、それらを駆動し、そこに記録されているプログラムやデータなどを取得する。取得されたプログラムやデータは、必要に応じて記憶部２０８に転送され、記憶される。 When a removable medium 211 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is loaded, the drive 210 connected to the input/output interface 205 drives it and acquires the programs, data, and the like recorded on it. The acquired programs and data are transferred to and stored in the storage unit 208 as necessary.

コンピュータにインストールされ、コンピュータによって実行可能な状態とされるプログラムを格納するプログラム記録媒体は、図１７に示されるように、磁気ディスク（フレキシブルディスクを含む）、光ディスク（CD-ROM(Compact Disc-Read Only Memory),DVD(Digital Versatile Disc)を含む）、光磁気ディスク、もしくは半導体メモリなどよりなるパッケージメディアであるリムーバブルメディア２１１、または、プログラムが一時的もしくは永続的に格納されるROM２０２や、記憶部２０８を構成するハードディスクなどにより構成される。プログラム記録媒体へのプログラムの格納は、必要に応じてルータ、モデムなどのインターフェースである通信部２０９を介して、ローカルエリアネットワーク、インターネット、デジタル衛星放送といった、有線または無線の通信媒体を利用して行われる。 As shown in FIG. 17, the program recording medium that stores a program to be installed in a computer and made executable by the computer is constituted by the removable medium 211, which is a package medium such as a magnetic disk (including a flexible disk), an optical disk (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), a magneto-optical disk, or a semiconductor memory; by the ROM 202, in which the program is stored temporarily or permanently; or by the hard disk constituting the storage unit 208. The program is stored in the program recording medium, as necessary, via the communication unit 209, which is an interface such as a router or a modem, using a wired or wireless communication medium such as a local area network, the Internet, or digital satellite broadcasting.

なお、本明細書において、プログラム記録媒体に格納されるプログラムを記述するステップは、記載された順序に沿って時系列的に行われる処理はもちろん、必ずしも時系列的に処理されなくとも、並列的あるいは個別に実行される処理をも含むものである。 In this specification, the steps describing the program stored in the program recording medium include not only processing performed in time series in the described order but also processing that is executed in parallel or individually without necessarily being processed in time series.

また、本発明の実施の形態は、上述した実施の形態に限定されるものではなく、本発明の要旨を逸脱しない範囲において種々の変更が可能である。例えば、以上においては、本発明を物体認識装置に適用した実施の形態について説明したが、本発明は、例えば、画像内の物体を比較し認識する情報処理装置に適用することができる。 The embodiments of the present invention are not limited to the above-described embodiments, and various modifications can be made without departing from the scope of the present invention. For example, the embodiment in which the present invention is applied to an object recognition apparatus has been described above. However, the present invention can be applied to an information processing apparatus that compares and recognizes objects in an image, for example.

【図１】本発明の一実施の形態である物体認識装置の機能の構成を示すブロック図である。 FIG. 1 is a block diagram showing the functional configuration of an object recognition apparatus according to an embodiment of the present invention.
【図２】図１のモデル特徴量抽出部の機能の詳細な構成を示すブロック図である。 FIG. 2 is a block diagram showing the detailed functional configuration of the model feature quantity extraction unit of FIG. 1.
【図３】図２の特徴点抽出部の処理結果の具体例を示す図である。 FIG. 3 is a diagram showing a specific example of a processing result of the feature point extraction unit of FIG. 2.
【図４】図２の特徴点抽出部の処理結果の具体例を示す図である。 FIG. 4 is a diagram showing a specific example of a processing result of the feature point extraction unit of FIG. 2.
【図５】図２の特徴量記述部の処理手法の例を説明する図である。 FIG. 5 is a diagram explaining an example of the processing method of the feature quantity description unit of FIG. 2.
【図６】図２の特徴点識別能力値演算部の処理例を説明するフローチャートである。 FIG. 6 is a flowchart explaining a processing example of the feature point discrimination capability value calculation unit of FIG. 2.
【図７】図６の処理結果の具体例を示す図である。 FIG. 7 is a diagram showing a specific example of a processing result of FIG. 6.
【図８】図６の処理結果の具体例を示す図である。 FIG. 8 is a diagram showing a specific example of a processing result of FIG. 6.
【図９】図２のサポート点選択部によるサポート点選択処理例を説明するフローチャートである。 FIG. 9 is a flowchart explaining an example of support point selection processing by the support point selection unit of FIG. 2.
【図１０】図９の処理結果の具体例を示す図である。 FIG. 10 is a diagram showing a specific example of a processing result of FIG. 9.
【図１１】図１のクエリ画像認識部の機能の詳細な構成を示すブロック図である。 FIG. 11 is a block diagram showing the detailed functional configuration of the query image recognition unit of FIG. 1.
【図１２】図１１のクエリ画像認識部の処理を説明するためのモデル画像とクエリ画像の具体例を示す図である。 FIG. 12 is a diagram showing specific examples of a model image and a query image for explaining the processing of the query image recognition unit of FIG. 11.
【図１３】図１１のクエリ画像認識部の処理結果の具体例を示す図である。 FIG. 13 is a diagram showing a specific example of a processing result of the query image recognition unit of FIG. 11.
【図１４】図１１のクエリ画像認識部の処理結果の具体例を示す図である。 FIG. 14 is a diagram showing a specific example of a processing result of the query image recognition unit of FIG. 11.
【図１５】図１１のクエリ画像認識部の処理結果の具体例を示す図である。 FIG. 15 is a diagram showing a specific example of a processing result of the query image recognition unit of FIG. 11.
【図１６】図１１のクエリ画像認識部の処理結果の具体例を示す図である。 FIG. 16 is a diagram showing a specific example of a processing result of the query image recognition unit of FIG. 11.
【図１７】パーソナルコンピュータの構成の例を示すブロック図である。 FIG. 17 is a block diagram showing an example of the configuration of a personal computer.

符号の説明 Explanation of Symbols

１１モデル特徴量抽出部，１２モデル特徴量辞書，１３クエリ画像認識部，２１モデル画像２１クエリ画像，３１特徴点抽出部，３２特徴量記述部，３３特徴点識別能力値演算部，３４サポート点選択部，５１特徴画像生成部，５２相関画像生成部，５３シフト相関画像生成部，５４相関画像和生成部，５５判定部，２０１ＣＰＵ，２０２ＲＯＭ，２０３ＲＡＭ，２０８記憶部，２１１リムーバブルメディア 11 model feature quantity extraction unit, 12 model feature quantity dictionary, 13 query image recognition unit, 21 model image, 21 query image, 31 feature point extraction unit, 32 feature quantity description unit, 33 feature point discrimination capability value calculation unit, 34 support point selection unit, 51 feature image generation unit, 52 correlation image generation unit, 53 shift correlation image generation unit, 54 correlation image sum generation unit, 55 determination unit, 201 CPU, 202 ROM, 203 RAM, 208 storage unit, 211 removable medium

Claims

クエリ画像とモデル画像とを比較し、前記モデル画像の被写体と前記クエリ画像の被写体とを同定するための支援情報を提供する情報処理装置において、
前記モデル画像から１以上の特徴点を抽出する特徴点抽出手段と、
前記特徴点抽出手段により抽出された１以上の前記特徴点の特徴量をそれぞれ記述する特徴量記述手段と、
前記特徴点抽出手段により抽出された１以上の前記特徴点のそれぞれについて、前記特徴量記述手段により記述された自身の前記特徴量と、自身が抽出された前記モデル画像、および１以上の別モデル画像との相関画像をそれぞれ生成し、それらの相関画像に基づいて、前記モデル画像の前記被写体を識別するための寄与度を示す識別能力値を演算する識別能力値演算手段と
を備える情報処理装置。 An information processing apparatus that compares a query image with a model image and provides support information for identifying the subject of the model image and the subject of the query image, the apparatus comprising:
feature point extraction means for extracting one or more feature points from the model image;
feature quantity description means for describing a feature quantity of each of the one or more feature points extracted by the feature point extraction means; and
discrimination capability value calculation means for generating, for each of the one or more feature points extracted by the feature point extraction means, correlation images between its own feature quantity described by the feature quantity description means and each of the model image from which it was extracted and one or more other model images, and for calculating, based on those correlation images, a discrimination capability value indicating a degree of contribution to identifying the subject of the model image.

前記特徴点抽出手段により抽出された前記１以上の特徴点のうちの少なくとも１つをベース点とし、前記ベース点の一定範囲内に存在する前記特徴点の中から、前記識別能力値演算手段により演算された前記識別能力値が前記ベース点よりも高い前記特徴点を、サポート点として選択するサポート点選択手段
をさらに備える請求項１に記載の情報処理装置。 The information processing apparatus according to claim 1, further comprising support point selection means for taking at least one of the one or more feature points extracted by the feature point extraction means as a base point and selecting, as a support point, from among the feature points existing within a certain range of the base point, a feature point whose discrimination capability value calculated by the discrimination capability value calculation means is higher than that of the base point.

前記識別能力値演算手段は、前記相関画像全体の平均値と最大値の少なくとも一方に基づいて、前記識別能力値を演算する
請求項１に記載の情報処理装置。 The information processing apparatus according to claim 1, wherein the discrimination capability value calculation means calculates the discrimination capability value based on at least one of an average value and a maximum value of the entire correlation image.

クエリ画像とモデル画像とを比較し、前記モデル画像の被写体と前記クエリ画像の被写体とを同定するための支援情報を提供する情報処理装置の情報処理方法において、
前記情報処理装置が実行するステップとして、
前記モデル画像から１以上の特徴点を抽出し、
抽出された１以上の前記特徴点の特徴量をそれぞれ記述し、
抽出された１以上の前記特徴点のそれぞれについて、記述された自身の前記特徴量と、自身が抽出された前記モデル画像、および１以上の別モデル画像との相関画像をそれぞれ生成し、それらの相関画像に基づいて、前記モデル画像の前記被写体を識別するための寄与度を示す識別能力値を演算する
ステップを含む情報処理方法。 An information processing method for an information processing apparatus that compares a query image with a model image and provides support information for identifying the subject of the model image and the subject of the query image, the method comprising the steps, executed by the information processing apparatus, of:
extracting one or more feature points from the model image;
describing a feature quantity of each of the one or more extracted feature points; and
generating, for each of the one or more extracted feature points, correlation images between its own described feature quantity and each of the model image from which it was extracted and one or more other model images, and calculating, based on those correlation images, a discrimination capability value indicating a degree of contribution to identifying the subject of the model image.

クエリ画像とモデル画像とを比較し、前記モデル画像の被写体と前記クエリ画像の被写体とを同定するための支援情報を提供する情報処理装置を制御するコンピュータに、
前記モデル画像から１以上の特徴点を抽出し、
抽出された１以上の前記特徴点の特徴量をそれぞれ記述し、
抽出された１以上の前記特徴点のそれぞれについて、記述された自身の前記特徴量と、自身が抽出された前記モデル画像、および１以上の別モデル画像との相関画像をそれぞれ生成し、それらの相関画像に基づいて、前記モデル画像の前記被写体を識別するための寄与度を示す識別能力値を演算する
ステップを実行させるプログラム。 A program that causes a computer, which controls an information processing apparatus that compares a query image with a model image and provides support information for identifying the subject of the model image and the subject of the query image, to execute the steps of:
extracting one or more feature points from the model image;
describing a feature quantity of each of the one or more extracted feature points; and
generating, for each of the one or more extracted feature points, correlation images between its own described feature quantity and each of the model image from which it was extracted and one or more other model images, and calculating, based on those correlation images, a discrimination capability value indicating a degree of contribution to identifying the subject of the model image.