JP2013073476A - Image retrieval device, image retrieval method, and program

Info

Publication number: JP2013073476A
Application number: JP2011212980A
Authority: JP
Inventors: Koichi Umakai; 浩一馬養; Hirotaka Shiiyama; 弘隆椎山
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2011-09-28
Filing date: 2011-09-28
Publication date: 2013-04-22
Anticipated expiration: 2031-09-28
Also published as: JP5893318B2

Abstract

PROBLEM TO BE SOLVED: To obtain retrieval results in descending order of utility. SOLUTION: In an image retrieval method, scale numbers are assigned to reduced images according to resolution, in descending order of image size, during reduced-image creation. For each combination of local feature amounts that gives the shortest distance between local features, the difference between the scale numbers is defined as the difference in resolution. The most frequent difference in resolution is reflected in a weighting coefficient for the similarity, and the final similarity, obtained by multiplying this weighting coefficient by the final number of votes VoteMax, is output, so that the retrieval result reflects the resolution of the images.

Description

The present invention relates to an image search apparatus, an image search method, and a program that are suitable, for example, for searches based on local feature points and local feature amounts.

Conventionally, methods have been proposed for searching for similar images using local feature amounts of an image. In such a method, characteristic points (local feature points) are first extracted from the image, for example by the method disclosed in Non-Patent Document 1. Then, a feature amount (local feature amount) corresponding to each feature point is calculated from the feature point and the image information around it, for example by the method disclosed in Non-Patent Document 2. The image search is then performed by matching local feature amounts against each other.

In methods that use local feature amounts, a local feature amount is defined as information composed of a plurality of elements that are invariant to rotation and to enlargement/reduction. This makes it possible to retrieve an image even when it has been rotated, enlarged, or reduced. A local feature amount is generally expressed as a vector. However, this invariance holds only in theory: in actual digital images, a local feature amount computed before rotation or enlargement/reduction differs slightly from the corresponding local feature amount computed after the processing.

For example, in the method described in Non-Patent Document 2, to calculate a rotation-invariant local feature amount, a main direction is calculated from the pixel pattern of the local region around a local feature point, and the local region is rotated with respect to the main direction to normalize its orientation when the local feature amount is calculated. To calculate a local feature amount that is invariant to enlargement and reduction, images of different resolutions are generated internally, local feature points are extracted from the image at each resolution, and local feature amounts are calculated for them. The set of internally generated images of different resolutions is generally called a scale space.

Non-Patent Document 1: C. Harris and M. J. Stephens, "A combined corner and edge detector," in Alvey Vision Conference, pp. 147-152, 1988.
Non-Patent Document 2: David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
Non-Patent Document 3: J. J. Koenderink and A. J. van Doorn, "Representation of local geometry in the visual system," Biological Cybernetics, vol. 55, pp. 367-375, 1987.
Non-Patent Document 4: M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Commun. ACM, vol. 24, no. 6, pp. 381-395, June 1981.

In many cases, the order in which images are displayed as search results is determined by a similarity between the query image and each registered image, calculated by matching local feature amounts. Methods for matching local feature amounts include expressing the local feature amounts as vectors and measuring the distance between the vectors, and using a hash table. With such conventional methods, however, the display order of the results may not match the order of utility value of the retrieved images. An example of this phenomenon and its cause are described below.

For example, consider a search that retrieves a first registered image, which has the same resolution as the query image and is not rotated, and a second registered image, which has a much higher resolution than the query image but is slightly rotated. To match local feature amounts, a scale space is generated for each image as described above, and rotation-invariant local feature amounts are calculated from the image at each resolution.

In this case, the scale space of the second registered image includes high-resolution images that have no counterpart in the scale space of the query image, and local feature amounts are calculated from those images as well. These local feature amounts, however, do not contribute to matching against the query image at search time. Only the local feature amounts of the second registered image calculated from the part of its scale space whose resolutions overlap those of the query image contribute to the matching.

Here, the vector distances of the local feature amount pairs matched between the second registered image and the query image are likely to be larger than those of the pairs matched between the first registered image and the query image. This is because the first registered image is not rotated relative to the query image, whereas the second registered image is slightly rotated: the rotation adds noise to the second registered image, and that noise perturbs the local feature amounts.

As a result, the similarity between the query image and the first registered image becomes greater than the similarity between the query image and the high-resolution second registered image. However, if the rotation of the second registered image is too small for the user to notice, the high-resolution second registered image may have a higher utility value than the first registered image. For example, a higher-resolution image is more reusable because it more easily avoids image quality degradation during image processing such as rotation or enlargement.

Thus, the conventional methods have the problem that the display order of the results can easily fail to match the order of utility value of the retrieved images.

In view of the above problem, an object of the present invention is to make it possible to obtain search results in descending order of utility value.

An image search apparatus of the present invention searches registered images, which are registered in advance, using an input image, and comprises: local feature extraction means for extracting local features from the input image and the registered image; first similarity calculation means for calculating a first similarity between the input image and the registered image based on the local features extracted by the local feature extraction means; second similarity calculation means for calculating a second similarity between the input image and the registered image based on the first similarity calculated by the first similarity calculation means and on resolution information of the input image and the registered image; and output means for outputting a search result in accordance with the second similarity calculated by the second similarity calculation means.

According to the present invention, search results can be output in descending order of utility value, with the resolution of the images reflected in that order.

FIG. 1 is a block diagram showing a configuration example of the image search apparatus in the embodiment.
FIG. 2 is a flowchart showing an example of a processing procedure for extracting local feature points and local feature amounts of an image in the embodiment.
FIG. 3 is a diagram explaining an outline of the reduced image generation process in step S204 of FIG. 2.
FIG. 4 is a diagram explaining an outline of local feature point extraction.
FIG. 5 is a flowchart showing an example of a processing procedure for calculating a similarity without considering resolution.
FIG. 6 is a flowchart showing an example of a processing procedure for calculating a similarity in the embodiment.
FIG. 7 is a block diagram showing a hardware configuration example of the image search apparatus in the embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram illustrating a configuration example of an image search apparatus 100 according to the present embodiment.
In FIG. 1, the image search apparatus 100 includes a first image input unit 102, a second image input unit 105, a first local feature point / local feature amount extraction unit 103, a second local feature point / local feature amount extraction unit 106, and a local feature amount comparison unit 107.

In the image search apparatus 100, when an input image 101 is input to the first image input unit 102, the first local feature point / local feature amount extraction unit 103 extracts local feature points and their local feature amounts as local features. Similarly, when an input image 104 to be compared is input to the second image input unit 105, the second local feature point / local feature amount extraction unit 106 extracts local feature points and their local feature amounts as local features. In the present embodiment, the input image 101 is treated as the query image, and the input image 104 to be compared is treated as a registered image registered in a database. The process of extracting local feature points and their local feature amounts is described in detail later.

The local feature point / local feature amount comparison unit 107 compares the local feature points and local feature amounts extracted by the first local feature point / local feature amount extraction unit 103 with those extracted by the second local feature point / local feature amount extraction unit 106, and outputs a comparison result 108 as an image display order. This comparison process is also described in detail later.

FIG. 7 is a block diagram illustrating a hardware configuration example of an image search apparatus 1400 according to the present embodiment.
In FIG. 7, the image search apparatus 1400 is, for example, a personal computer. A ROM 1430 stores a program that, under the control of a CPU 1410, implements the processing of the flowcharts described later. When the program is executed, it is read from the ROM 1430 into a RAM 1420 so that the CPU 1410 can process it.

A bus 1450 carries data among the ROM 1430, the RAM 1420, the CPU 1410, and an HDD 1440. The image search apparatus 1400 receives input from input/output devices such as a keyboard and a mouse connected to a user interface 1460. Furthermore, the image search apparatus 1400 exchanges data with a database (DB) 1510, a client terminal (CLIENT) 1520, and a printer (PRINTER) 1530 over a network 1500 through a network interface 1470.

The processing of the embodiment described later may also be realized by a plurality of pieces of hardware and software working together. For example, the image search apparatus 1400 may receive a search request and a query image from the client terminal 1520 or the printer 1530, perform the processing of the flowcharts described later, and search the database 1510 for images similar to the query image.

[Local feature extraction processing]
Next, the process of extracting local features (local feature points and local feature amounts) from an image will be described. The process consists of a step of reading image data, a step of extracting a luminance component, a step of creating reduced images, a step of extracting local feature points, and a step of calculating local feature amounts.

FIG. 2 is a flowchart illustrating an example of a processing procedure for extracting local feature points and local feature amounts of an image in the present embodiment.
First, processing starts in step S201. In step S202, the first image input unit 102 and the second image input unit 105 each read their respective image data. In step S203, the first local feature point / local feature amount extraction unit 103 and the second local feature point / local feature amount extraction unit 106 each extract the luminance component from the image data and generate a luminance component image.

Next, in step S204, the first local feature point / local feature amount extraction unit 103 and the second local feature point / local feature amount extraction unit 106 each successively reduce the luminance component image created in step S203 according to a magnification p, generating n reduced images. The magnification p and the number n of reduced images are determined in advance.

FIG. 3 is a diagram explaining an outline of the reduced image generation process in step S204 of FIG. 2. FIG. 3 shows an example in which the magnification p is 2^(-1/4) and the number n of reduced images is 9. In this example, the magnification p is a ratio of side lengths, not of areas.

In FIG. 3, reference numeral 301 denotes the luminance component image created in step S203. Reference numeral 302 denotes the reduced image obtained by reducing the luminance component image 301 four times according to the magnification p, which corresponds to the luminance component image 301 reduced to 1/2. Reference numeral 303 denotes the reduced image obtained by reducing the luminance component image 301 eight times according to the magnification p, which corresponds to the luminance component image 301 reduced to 1/4. An image may be reduced by any method, such as simply thinning out pixels, using linear interpolation, or sampling after applying a low-pass filter. In the present embodiment, images are reduced by linear interpolation.

Reference numeral 304 denotes scale numbers, which are assigned to the reduced images according to resolution, in descending order of image size. In the present embodiment the scale numbers start from 1, but they may start from 0.
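The relationship between the magnification p, the number n of images, and the scale numbers can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function name `scale_space_sizes` is hypothetical, and it assumes that scale number 1 is the unreduced luminance component image, which is consistent with image 303 (eight reductions) being the ninth image when n is 9.

```python
def scale_space_sizes(width, height, p=2 ** -0.25, n=9):
    """Return (scale_number, width, height) for each image in the scale space.

    p is a ratio of side lengths, as in the example of FIG. 3.
    Assumption: scale number 1 is the unreduced image, so the image with
    scale number s has been reduced (s - 1) times.
    """
    sizes = []
    for scale_number in range(1, n + 1):
        factor = p ** (scale_number - 1)
        sizes.append((scale_number,
                      round(width * factor),
                      round(height * factor)))
    return sizes
```

For a 640 x 480 luminance image, scale number 5 comes out at 320 x 240 (reduced to 1/2) and scale number 9 at 160 x 120 (reduced to 1/4), matching images 302 and 303.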

Returning to FIG. 2, in step S205 the first local feature point / local feature amount extraction unit 103 and the second local feature point / local feature amount extraction unit 106 each extract local feature points from each of the n reduced images obtained in step S204. The local feature points extracted here are robust ones that are extracted stably from the same location even after image processing such as rotation or reduction is applied to the image. In the present embodiment, the Harris operator described in Non-Patent Document 1 is used to extract such local feature points.

Specifically, for each pixel of the image obtained by applying the Harris operator, the pixel values of nine pixels in total, namely the pixel of interest and its eight neighboring pixels, are examined. If the pixel value of the pixel of interest is equal to or greater than a predetermined value and is a local maximum (that is, the largest of the nine pixel values), the position of the pixel of interest is extracted as a local feature point. The feature point extraction method is not limited to the Harris operator; any method capable of extracting robust local feature points may be used.
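The threshold and local-maximum test described above can be sketched as follows. The sketch takes an already computed response image (for example, the output of the Harris operator) as a list of rows; the function name `extract_local_feature_points` is hypothetical, and skipping border pixels and accepting ties are assumptions that the text does not specify.

```python
def extract_local_feature_points(response, threshold):
    """Return (row, col) of pixels that pass the threshold and are 3x3 local maxima.

    response  -- 2D list of operator responses, one value per pixel
    threshold -- the predetermined minimum response value
    """
    points = []
    rows, cols = len(response), len(response[0])
    # Border pixels have fewer than eight neighbours and are skipped (assumption).
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            value = response[r][c]
            if value < threshold:
                continue
            # The pixel of interest and its eight neighbours: nine pixels in total.
            neighbourhood = [response[r + dr][c + dc]
                             for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            # Ties are accepted here (assumption), so a flat plateau yields several points.
            if value >= max(neighbourhood):
                points.append((r, c))
    return points
```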

Next, in step S206, for each local feature point obtained in step S205, a feature amount (local feature amount) defined so as to be invariant to image rotation is calculated. In the present embodiment, the local feature amount is calculated using a combination of the Local Jet described in Non-Patent Document 3 and its derivatives. A local feature amount calculated in this way can be made relatively robust to enlargement, reduction, and rotation. Specifically, the local feature amount is calculated by the following equation (1).

The symbols used on the right side of equation (1) are defined by the following equations (2) to (7).

Here, G(x, y) on the right side of equation (2) is a Gaussian function, I(x, y) is the pixel value at image coordinates (x, y), and "*" denotes the convolution operation. Equation (3) is the partial derivative with respect to x of the variable L defined in equation (2), and equation (4) is the partial derivative of L with respect to y. Equation (5) is the partial derivative with respect to y of the variable L_x defined in equation (3). Equation (6) is the partial derivative of L_x with respect to x, and equation (7) is the partial derivative with respect to y of L_y defined in equation (4).
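The equation images are not reproduced in this text. Equations (2) to (7) can be written out from the verbal definitions above, as in the following sketch; the left-hand symbols L_xy, L_xx, and L_yy are assumed names, since the text names only L, L_x, and L_y. Equation (1) itself is not described in words and is therefore not reconstructed here.

```latex
\begin{align}
L      &= G(x, y) * I(x, y)                            \tag{2}\\
L_x    &= \frac{\partial L}{\partial x}                \tag{3}\\
L_y    &= \frac{\partial L}{\partial y}                \tag{4}\\
L_{xy} &= \frac{\partial^2 L}{\partial x\,\partial y}  \tag{5}\\
L_{xx} &= \frac{\partial^2 L}{\partial x^2}            \tag{6}\\
L_{yy} &= \frac{\partial^2 L}{\partial y^2}            \tag{7}
\end{align}
```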

The method of calculating the local feature amount is not limited to the one described above; other methods are also applicable, such as the method described in Non-Patent Document 2. In that method, the gradient direction of the pixel value is calculated for each pixel in the local region around a local feature point, and a histogram of those directions is used as the local feature amount.

FIG. 4 is a diagram explaining an outline of local feature point extraction. FIG. 4(a) shows an input image 410, and FIG. 4(b) shows an input image 420 in which local feature points 421 and 422 extracted from the input image 410 are superimposed. In the example of FIG. 4, a local feature amount is calculated for each of the local feature points 421 and 422.

[Similarity calculation method]
First, a general method of calculating the similarity will be described. Various methods can be used; the following description uses, as an example, a similarity calculation method based on RANSAC, which is described in Non-Patent Document 4.

Here, let Q be a local feature point of the input image, Q(x', y') its coordinates, and V_q its local feature amount. Likewise, let S be a local feature point of the input image to be compared, S(x, y) its coordinates, and V_s its local feature amount.

FIG. 5 is a flowchart illustrating an example of a processing procedure for calculating the similarity without considering the resolution.
When processing starts in step S501, in step S502 the local feature amounts V_q and V_s are compared to create a minimum distance corresponding point list. Specifically, the distance between local feature amounts is first calculated for every combination of V_q and V_s. Next, combinations (corresponding points) of V_q and V_s whose distance is equal to or less than a threshold Tv and is the minimum distance are extracted. Registering these corresponding points in the list creates the minimum distance corresponding point list.
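Step S502 can be sketched as a brute-force nearest-neighbour search. This is an illustration under stated assumptions rather than the embodiment's implementation: the function name `build_min_distance_list` is hypothetical, the distance between feature vectors is taken to be Euclidean (the text does not name the metric), and "minimum distance" is interpreted as the nearest registered feature for each query feature.

```python
import math


def build_min_distance_list(query_features, registered_features, tv):
    """Pair each query feature V_q with its nearest registered feature V_s.

    A pair is kept only if its distance is at most the threshold tv (Tv).
    Returns a list of (query_index, registered_index, distance).
    """
    pairs = []
    for qi, vq in enumerate(query_features):
        best = None
        for si, vs in enumerate(registered_features):
            d = math.dist(vq, vs)  # Euclidean distance (assumed metric)
            if d <= tv and (best is None or d < best[1]):
                best = (si, d)
        if best is not None:
            pairs.append((qi, best[0], best[1]))
    return pairs
```

The length of the returned list corresponds to m, the number of registered pairs of corresponding points.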

In the minimum distance corresponding point list, the k-th pair of minimum distance corresponding points is denoted Q_k and S_k, with coordinates Q_k(x_k', y_k') and S_k(x_k, y_k), using matching subscripts. Although two or more local feature amounts may be associated with one local feature point, the following description assumes for simplicity that only one local feature amount is associated with each local feature point. The local feature amounts of Q_k and S_k are denoted V_q(k) and V_s(k), respectively. It is assumed that m pairs of corresponding points are registered in the minimum distance corresponding point list in step S502.

Next, in step S521, it is determined whether m is 3 or more. If m is less than 3, the similarity cannot be calculated, so the process proceeds to step S522 and ends.

If, on the other hand, m is 3 or more in step S521, a variable VoteMax representing the final number of votes is initialized to 0 in step S503. Then, in step S504, a variable Count representing the iteration count of the similarity calculation process is initialized to 0.

Next, in step S505, it is determined whether the iteration count Count exceeds a predetermined maximum number of iterations Rn. If Count does not exceed Rn, the process proceeds to step S506, where a variable Vote representing the number of votes is initialized to 0. If Count exceeds Rn, the process proceeds to step S519, where the final number of votes VoteMax is output as the similarity, and the process ends in step S520.

Next, in step S507, the coordinates of two pairs of corresponding points are extracted at random from the minimum distance corresponding point list. The extracted coordinates are denoted Q_1(x_1', y_1'), S_1(x_1, y_1) and Q_2(x_2', y_2'), S_2(x_2, y_2). Then, in step S508, transformation matrices M and T are calculated. Here, the matrix M is the matrix composed of the variables a to d in the following equation (8), and the matrix T is the matrix composed of the variables e and f.

To simplify the following description, only similarity transformations are considered. In this case, equation (8) can be rewritten as the following equation (9).

The variables a, b, e, and f in equation (9) can be expressed by the following equations (10) to (13) using the coordinate values of Q_1(x_1', y_1'), S_1(x_1, y_1), Q_2(x_2', y_2'), and S_2(x_2, y_2).
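The equation images for (8) to (13) are not reproduced in this text. As a hedged illustration of step S508, the sketch below assumes the usual similarity-transform form x' = a x - b y + e, y' = b x + a y + f, that is, M = [[a, -b], [b, a]] and T = [e, f]; this form is an assumption, not a transcription of equation (9). Under it, the four unknowns follow directly from two pairs of corresponding points, which is conveniently written with complex numbers. The function names are hypothetical.

```python
def similarity_from_two_pairs(q1, s1, q2, s2):
    """Return (a, b, e, f) with Q = [[a, -b], [b, a]] S + [e, f] for both pairs.

    q1, q2 -- coordinates (x', y') of Q_1 and Q_2 in the query image
    s1, s2 -- coordinates (x, y) of S_1 and S_2 in the registered image
    """
    zq1, zs1 = complex(*q1), complex(*s1)
    zq2, zs2 = complex(*q2), complex(*s2)
    if zs1 == zs2:
        raise ValueError("S_1 and S_2 must be distinct points")
    z = (zq1 - zq2) / (zs1 - zs2)  # rotation and scale: a + ib
    t = zq1 - z * zs1              # translation: e + if
    return z.real, z.imag, t.real, t.imag


def apply_similarity(params, s):
    """Transform a point S(x, y) with the parameters returned above."""
    a, b, e, f = params
    x, y = s
    return a * x - b * y + e, b * x + a * y + f
```

Multiplying the complex numbers (a + ib)(x + iy) gives (a x - b y) + i(b x + a y), which is why a single complex division recovers both a and b.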

Next, in step S509, a corresponding point selection variable k is initialized to 3 in order to select points other than the two pairs extracted at random from the minimum distance corresponding point list in step S507. Then, in step S510, it is determined whether k exceeds the number m of pairs of corresponding points registered in the minimum distance corresponding point list. If k exceeds m, the process proceeds to step S516, which is described later.

On the other hand, if the determination in step S510 finds that k does not exceed m, one new corresponding point pair is extracted from the minimum distance corresponding point list in step S511. The extracted coordinates are defined as S_k(x_k, y_k) and Q_k(x_k′, y_k′). As in step S507, the additional information is used for this extraction.

Next, in step S512, the coordinates S_k′(x_k″, y_k″) are obtained by transforming S_k(x_k, y_k) using equation (9). Then, in step S513, the geometric distance between the coordinates S_k′(x_k″, y_k″) and the coordinates Q_k(x_k′, y_k′) is calculated as a Euclidean distance, and it is determined whether this Euclidean distance is less than or equal to a threshold Td. If it is, the process proceeds to step S514, where the vote count Vote is incremented, and then to step S515. If the Euclidean distance is greater than the threshold Td, the process proceeds to step S515 without doing anything.

Next, in step S515, the corresponding point selection variable k is incremented, and the process returns to step S510. The above processing is repeated until k exceeds the number m of corresponding point pairs registered in the minimum distance corresponding point list.

Next, the case where the corresponding point selection variable k exceeds the number m of corresponding point pairs registered in the corresponding point list in step S510 is described. In step S516, the vote count Vote is compared with the final vote count VoteMax to determine whether Vote is greater than VoteMax. If Vote is greater than VoteMax, the process proceeds to step S517.

In step S517, the value of the final vote count VoteMax is replaced with the value of the vote count Vote. Then, in step S518, the iteration count Count is incremented, and the process returns to step S505. On the other hand, if the determination in step S516 finds that Vote is less than or equal to VoteMax, the process moves to step S518, where Count is incremented, and then returns to step S505.
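
The loop of steps S507 to S518 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the similarity model x′ = a·x − b·y + e, y′ = b·x + a·y + f is assumed, and the names `count_votes`, `max_iterations`, and `seed` as well as the default value of `td` are hypothetical. The sketch also skips the two sampled pairs by index instead of using a selection variable k that starts at 3.

```python
import math
import random


def solve_similarity(s1, q1, s2, q2):
    """Return (a, b, e, f) of the assumed similarity model, or None if degenerate."""
    dx, dy = s1[0] - s2[0], s1[1] - s2[1]
    dxq, dyq = q1[0] - q2[0], q1[1] - q2[1]
    den = dx * dx + dy * dy
    if den == 0:
        return None
    a = (dx * dxq + dy * dyq) / den
    b = (dx * dyq - dy * dxq) / den
    e = q1[0] - (a * s1[0] - b * s1[1])
    f = q1[1] - (b * s1[0] + a * s1[1])
    return a, b, e, f


def count_votes(pairs, max_iterations=100, td=2.0, seed=0):
    """pairs: list of (S, Q) with S = (x, y) and Q = (x', y'). Returns VoteMax."""
    rng = random.Random(seed)
    vote_max = 0
    if len(pairs) < 2:
        return vote_max
    for _ in range(max_iterations):                  # iteration count Count
        i, j = rng.sample(range(len(pairs)), 2)      # S507: pick two pairs at random
        params = solve_similarity(pairs[i][0], pairs[i][1],
                                  pairs[j][0], pairs[j][1])  # S508
        if params is None:
            continue
        a, b, e, f = params
        vote = 0
        for k, (s, q) in enumerate(pairs):           # S509 to S515
            if k in (i, j):
                continue                             # skip the two sampled pairs
            x2 = a * s[0] - b * s[1] + e             # S512: transform S_k
            y2 = b * s[0] + a * s[1] + f
            if math.hypot(x2 - q[0], y2 - q[1]) <= td:   # S513: distance test
                vote += 1                            # S514
        if vote > vote_max:                          # S516
            vote_max = vote                          # S517
    return vote_max
```

Because the two sampled pairs are excluded from the vote, the largest possible result for m registered pairs is m − 2.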

Although the description of FIG. 5 considers only similarity transformations, other geometric transformations such as affine transformations can also be handled by obtaining the corresponding transformation matrix in step S508. For an affine transformation, for example, the number of corresponding point pairs randomly selected in step S507 is first set to three. Then, in step S508, equation (8) is used instead of equation (9), and the variables a to f are obtained from the three corresponding point pairs (six points in total) selected in step S507.
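
For the affine case, the six variables can be obtained by solving two 3×3 linear systems, one per output coordinate. The sketch below assumes the common affine form x′ = a·x + b·y + e, y′ = c·x + d·y + f; that form and the names `det3` and `estimate_affine` are assumptions for illustration, since equation (8) itself is not reproduced in this text.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def estimate_affine(src, dst):
    """src: three (x, y) points; dst: the three matching (x', y') points.

    Returns (a, b, c, d, e, f) of the assumed model
        x' = a*x + b*y + e
        y' = c*x + d*y + f
    """
    base = [[x, y, 1.0] for x, y in src]
    d0 = det3(base)
    if abs(d0) < 1e-12:
        raise ValueError("source points are collinear; affine fit is undefined")

    def solve(rhs):
        # Cramer's rule: replace one column at a time with the right-hand side.
        out = []
        for col in range(3):
            m = [row[:] for row in base]
            for r in range(3):
                m[r][col] = rhs[r]
            out.append(det3(m) / d0)
        return out

    a, b, e = solve([p[0] for p in dst])
    c, d, f = solve([p[1] for p in dst])
    return a, b, c, d, e, f
```

Three non-collinear pairs give exactly six equations for the six unknowns, which is why the number of sampled pairs rises from two to three.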

The description of FIG. 5 also covers an example in which the final vote count VoteMax is output as the similarity in step S519, but a different similarity may be calculated. For example, the processing from step S503 onward may be skipped, and the number m of corresponding point pairs registered in the minimum distance corresponding point list created in step S502 may be output as the similarity.

Next, the similarity calculation method of the present embodiment will be described with reference to FIG. 6.
FIG. 6 is a flowchart illustrating an example of a processing procedure in which the local feature point / local feature amount comparison unit 107 calculates the similarity in the present embodiment. In the flowchart of FIG. 6, the process of step S601 is added to the procedure of FIG. 5, and the process of step S519 in FIG. 5 is replaced with the process of step S602. Steps S501 to S518 and S520 to S522, which bear the same reference numerals as in FIG. 5, are the same as in FIG. 5, and their description is omitted.

First, the process of step S601 will be described.

In the procedure shown in FIG. 5, the distance between local feature amounts is first calculated in step S502 for every combination of a local feature amount V_q and a local feature amount V_s. Next, the combinations (corresponding points) of V_q and V_s for which the calculated distance is less than or equal to the threshold Tv and is also the minimum are extracted, and the minimum distance corresponding point list is created by registering these corresponding points.

At this point, the difference in resolution between the images being compared can be estimated by focusing on the scale number 304 shown in FIG. 3. The method is described below. When there is a combination (corresponding point) of a local feature amount V_q and a local feature amount V_s for which the distance between the local feature amounts is less than or equal to the threshold Tv and is also the minimum, let their scale numbers be S_q and S_s, respectively. Assume also that the image to which V_q belongs is the query image and that the similarity of the image to which V_s belongs is being calculated.

Let the difference in resolution be ΔS = S_s − S_q. When the local feature amounts V_q and V_s form a correct corresponding point, ΔS takes a constant value. In contrast, ΔS for an incorrect corresponding point often does not match ΔS for the correct corresponding points. Therefore, by calculating ΔS for all corresponding points extracted by feature amount distance and finding the mode ΔS_mod of ΔS, the corresponding points whose ΔS equals ΔS_mod can be presumed to be correct corresponding points.
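
The mode computation described above can be sketched in a few lines. The function name `scale_difference_mode` and the input layout (a list of (S_q, S_s) scale-number pairs, one per corresponding point) are assumptions for illustration.

```python
from collections import Counter


def scale_difference_mode(scale_pairs):
    """scale_pairs: list of (S_q, S_s) scale numbers, one per corresponding point.

    Returns (delta_s_mod, indices), where indices lists the corresponding
    points whose difference S_s - S_q equals the mode.
    """
    diffs = [s_s - s_q for s_q, s_s in scale_pairs]
    if not diffs:
        return None, []
    # most_common(1) returns the most frequent value; among equally frequent
    # values, the one encountered first is returned.
    delta_s_mod, _count = Counter(diffs).most_common(1)[0]
    indices = [i for i, d in enumerate(diffs) if d == delta_s_mod]
    return delta_s_mod, indices
```

The returned indices correspond to the points that the text presumes to be correct corresponding points.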

The mode ΔS_mod also represents the difference in resolution between the compared images. ΔS_mod > 0 indicates that the image to which the local feature amount V_s belongs has the higher resolution. ΔS_mod = 0 indicates no difference in resolution between the compared images, and ΔS_mod < 0 indicates that the image to which V_s belongs has the lower resolution.

In step S601, a weight coefficient w is determined so that the similarity becomes larger when the image to which the local feature amount V_s belongs has the higher resolution and smaller when that image has the lower resolution. In the present embodiment, the weight coefficient w is determined by equation (14) below. The weight coefficient w determined here is used in step S602, described later.

Next, the process of step S602 will be described.

In step S519 of FIG. 5, the final vote count VoteMax was output as the similarity by functioning as the first similarity calculation means. Taking the similarity output in step S519 as the first similarity, step S602 functions as the second similarity calculation means and calculates a second similarity from the first similarity and the weight coefficient w calculated in step S601. In the present embodiment, the second similarity Sim2 is calculated by equation (15) below and is used as the overall similarity. The display order of the search results is thus determined according to the second similarity Sim2.
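
As a rough illustration of how the second similarity could drive the result order, the sketch below multiplies a weight by VoteMax, which matches the abstract's description of the final similarity as the weighted result multiplied by VoteMax. Equation (14) is not reproduced in this text, so `example_weight` is a purely hypothetical placeholder that only respects the stated monotonic behaviour (larger weight for larger ΔS_mod); the function names and the tuple layout are likewise assumptions.

```python
def example_weight(delta_s_mod):
    """Hypothetical placeholder, NOT equation (14): grows with delta_s_mod."""
    return max(0.1, 1.0 + 0.1 * delta_s_mod)


def rank_by_second_similarity(candidates, weight_fn=example_weight):
    """candidates: list of (image_id, vote_max, delta_s_mod).

    Returns (image_id, sim2) tuples sorted by descending sim2, where
    sim2 = w * VoteMax.
    """
    scored = [(image_id, weight_fn(delta_s_mod) * vote_max)
              for image_id, vote_max, delta_s_mod in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With this placeholder weight, two registered images with the same VoteMax are ordered so that the higher-resolution one comes first.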

As described above, according to the present embodiment, the difference in resolution is estimated by focusing on the scale numbers of corresponding points between the images being compared, and this difference is reflected in the similarity. This makes it possible to calculate a similarity that better matches the user's perception.

In the present embodiment, the second similarity is calculated by weighting the first similarity based on the scale numbers and is used as the overall similarity, but any other similarity calculation method that uses resolution information may be used. For example, when the first similarity is greater than or equal to a predetermined threshold, the result of equation (14) may be used directly as the overall similarity.

Also, in the present embodiment, the difference in resolution is estimated by focusing on the scale numbers, but another estimation method may be used as long as it can estimate the difference in resolution. For example, the difference in resolution can be obtained from parameters usable for aligning the input image 101, which is the query image, with the input image 104, which is the registered image. As one example of such an estimation method, a method that uses the transformation matrix M obtained when the final vote count VoteMax was calculated is described below. Specifically, the reduction ratio R of the input image 101 relative to the input image 104 is calculated from the element values of the transformation matrix M by equation (16) below.

Here, when R < 1, the input image 104 (the registered image) has a higher resolution than the input image 101 (the query image). Calculating the reduction ratio R thus makes it possible to estimate the difference in resolution. Furthermore, when equations (14) and (15) are used to reflect the reduction ratio R in the similarity, the mode ΔS_mod is obtained using equation (17) below.
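
Equations (16) and (17) are not reproduced in this text, so the following is only a sketch of one plausible reading. It assumes that M has the similarity form [[a, −b], [b, a]] and maps registered-image coordinates to query-image coordinates, in which case the scale factor √(a² + b²) is below 1 exactly when the registered image is the larger one. The conversion to a scale-number difference assumes a fixed magnification `step_factor` (greater than 1) between adjacent scale numbers; that parameter and both function names are hypothetical.

```python
import math


def reduction_ratio(a, b):
    """Scale factor of the assumed matrix [[a, -b], [b, a]]."""
    return math.hypot(a, b)


def ratio_to_scale_steps(r, step_factor):
    """Hypothetical stand-in for equation (17).

    Converts a reduction ratio r into a whole number of scale steps,
    positive when r < 1 (registered image has the higher resolution).
    step_factor is the assumed magnification between adjacent scale numbers.
    """
    if r <= 0 or step_factor <= 1:
        raise ValueError("r must be positive and step_factor greater than 1")
    return round(-math.log(r) / math.log(step_factor))
```

The sign convention is chosen so that the result agrees with ΔS_mod = S_s − S_q as used in the scale-number method.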

The difference in resolution can also be obtained by aligning the input image 101 (the query image) with the input image 104 (the registered image) and then comparing the frequency components of the images. A method that uses the transformation matrices M and T obtained when the final vote count VoteMax was calculated in FIGS. 5 and 6 is described below. Specifically, the input image 101 is enlarged or reduced and rotated according to equation (18) below to align it with the input image 104.

Here, Q_k(x_k′, y_k′) are the pixel coordinates of the input image 101 before alignment, and Q_k′(X_k′, Y_k′) are the pixel coordinates of the input image after alignment. That is, the aligned input image is generated by setting the pixel value at position Q_k(x_k′, y_k′) as the pixel value at position Q_k′(X_k′, Y_k′). Linear interpolation may be used at this time.
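
The pixel copy described above can be sketched as a forward mapping. Equation (18) is not reproduced in this text; the sketch assumes that it inverts the similarity model x′ = a·x − b·y + e, y′ = b·x + a·y + f, so that each query pixel is moved into the registered image's coordinate frame. The function name `align_query`, the nested-list image layout, and the nearest-pixel rounding are assumptions for illustration.

```python
def align_query(query, a, b, e, f, out_w, out_h, fill=0):
    """Forward-map each query pixel into the registered image's frame.

    query: image as a list of rows (query[y][x]).
    Assumed mapping: (X, Y) = M^-1 * ((x', y') - (e, f)) with
    M = [[a, -b], [b, a]]. Positions are rounded to the nearest pixel.
    """
    den = a * a + b * b
    if den == 0:
        raise ValueError("degenerate transform")
    out = [[fill] * out_w for _ in range(out_h)]
    for y, row in enumerate(query):
        for x, value in enumerate(row):
            ux, uy = x - e, y - f                 # remove the translation T
            big_x = round((a * ux + b * uy) / den)    # apply the inverse of M
            big_y = round((-b * ux + a * uy) / den)
            if 0 <= big_x < out_w and 0 <= big_y < out_h:
                out[big_y][big_x] = value
    return out
```

A forward mapping of this kind leaves unfilled pixels when the image is enlarged, which is one situation where the interpolation mentioned in the text would be needed.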

Next, to compare the frequency components, the input image 104 and the aligned input image described above are first frequency-transformed. Let fs_max be the maximum frequency of the input image 104 and fq_max be the maximum frequency of the aligned input image. The reduction ratio R of the aligned input image relative to the input image 104 can then be estimated by equation (19) below.

Furthermore, when equations (14) and (15) are used to reflect this reduction ratio R in the similarity, the mode ΔS_mod may be obtained by equation (17), as described above. The reduction ratio R may also be based on the difference between frequency components rather than their ratio. In addition, the frequency components may be compared over the region where the aligned input image and the registered image overlap, or over part of that region.

(Other Embodiments)
The present invention can also be realized by executing the following processing: software (a program) that implements the functions of the above-described embodiment is supplied to a system or apparatus via a network or any of various storage media, and a computer (or a CPU, MPU, or the like) of that system or apparatus reads and executes the program.

103 First local feature point / local feature amount extraction unit
106 Second local feature point / local feature amount extraction unit
107 Local feature point / local feature amount comparison unit

Claims

An image search apparatus that searches registered images, registered in advance, using an input image, the apparatus comprising:
local feature extraction means for extracting local features from the input image and the registered image;
first similarity calculation means for calculating a first similarity between the input image and the registered image based on the local features extracted by the local feature extraction means;
second similarity calculation means for calculating a second similarity between the input image and the registered image based on the first similarity calculated by the first similarity calculation means and resolution information of the input image and the registered image; and
output means for outputting a search result according to the second similarity calculated by the second similarity calculation means.

The image search apparatus according to claim 1, wherein the second similarity calculation means calculates the second similarity by weighting the first similarity using the resolution information.

The image search apparatus according to claim 1 or 2, wherein the resolution information is a difference in resolution between the input image and the registered image.

The image search apparatus according to claim 3, wherein the difference in resolution is a value calculated from parameters used for aligning the input image with the registered image.

The image search apparatus according to claim 3, wherein the difference in resolution is a value calculated by comparing the frequency components of the respective images after the input image and the registered image have been aligned.

The image search apparatus according to claim 5, wherein the comparison of the frequency components is a comparison based on a difference or a ratio of the frequency components.

The image search apparatus according to claim 5 or 6, wherein the comparison of the frequency components is a comparison of the maximum frequencies of the input image and the registered image.

The image search apparatus according to claim 5 or 6, wherein the comparison of the frequency components is performed over a region where the input image after the alignment and the registered image overlap, or over a part of that region.

The image search apparatus according to any one of claims 4 to 8, wherein the alignment is performed by rotating, enlarging, or reducing the input image or the registered image.

The image search apparatus according to claim 3, wherein the difference in resolution is a difference between numbers, assigned according to resolution, of corresponding points obtained by matching local features between the input image and the registered image.

An image search method for searching registered images, registered in advance, using an input image, the method comprising:
a local feature extraction step of extracting local features from the input image and the registered image;
a first similarity calculation step of calculating a first similarity between the input image and the registered image based on the local features extracted in the local feature extraction step;
a second similarity calculation step of calculating a second similarity between the input image and the registered image based on the first similarity calculated in the first similarity calculation step and resolution information of the input image and the registered image; and
an output step of outputting a search result according to the second similarity calculated in the second similarity calculation step.

A program for controlling an image search apparatus that searches registered images, registered in advance, using an input image, the program causing a computer to execute:
a local feature extraction step of extracting local features from the input image and the registered image;
a first similarity calculation step of calculating a first similarity between the input image and the registered image based on the local features extracted in the local feature extraction step;
a second similarity calculation step of calculating a second similarity between the input image and the registered image based on the first similarity calculated in the first similarity calculation step and resolution information of the input image and the registered image; and
an output step of outputting a search result according to the second similarity calculated in the second similarity calculation step.