JP3143532B2

JP3143532B2 - Image retrieval apparatus and method

Info

Publication number: JP3143532B2
Application number: JP04320665A
Authority: JP
Inventors: 宏明佐藤; 祐一坂内; 洋岡崎; 典男志村
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1992-11-30
Filing date: 1992-11-30
Publication date: 2001-03-07
Anticipated expiration: 2016-03-07
Also published as: JPH06168277A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an image retrieval apparatus, such as a database system that stores data including images and retrieves data in response to a specification given by, for example, words or figure/image patterns, and to a method therefor.

[0002]

[Prior Art] Conventionally, some database systems of this type attach keywords to data at registration time and retrieve the data corresponding to a given keyword. More recently, to handle target data that is not suited to keyword assignment and to eliminate the labor of assigning keywords, database systems have been studied that, instead of using keywords assigned in advance, compute the difference from feature quantities measured from images and perform retrieval based on that difference. Examples are proposed in Kato, Shimogaki, and Fujimura, "Image-Interactive Trademark and Design Database TRADEMARK", IEICE Transactions, vol. J72-DII, no. 4, pp. 535-544 (1989) (hereinafter, Reference 1), and in Kurita, Shimogaki, and Kato, "Image Retrieval Adapted to Subjective Similarity", Transactions of the Information Processing Society of Japan, vol. 31, no. 2, pp. 31-38 (1989) (hereinafter, Reference 2). The image database system of Reference 1 takes as input a search pattern such as that shown at 201 in FIG. 2, measures feature quantities such as the gray-level distribution and frequency distribution from it, and obtains a feature vector F = (f1, f2, ..., fn) whose elements are the values of the individual feature quantities. The distance D between the feature vector F and the feature vector Fi = (fi0, fi1, ..., fin) of each image pattern i stored in the database is computed by Equation 1, and data with a small distance is presented as similar data. The image database system of Reference 2 associates words such as "warm", "cold", "soft", and "fresh" with the color distribution features of painting data, converts the words specified for retrieval into feature quantities, and retrieves the image data corresponding to the specification by comparing these with the color feature distribution of each item in the database.

[0003]

(Equation 1)
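
The conventional approach of Reference 1 can be sketched as a linear scan over the database. Equation 1 is not reproduced in this text, so the Euclidean distance used here is an assumption, as are the function name, the data ids, and the sample vectors. The sketch is only meant to show that one distance is computed per stored item.

```python
import math

def exhaustive_search(query, stored_vectors, top_k=3):
    """Rank every stored feature vector Fi by its distance to the query vector F.

    stored_vectors maps a (hypothetical) data id to its feature vector.
    Euclidean distance is an assumed stand-in for Equation 1.
    """
    distances = []
    for data_id, vector in stored_vectors.items():
        # One distance computation per stored item: cost grows with database size.
        distances.append((math.dist(query, vector), data_id))
    distances.sort()
    return [data_id for _, data_id in distances[:top_k]]


stored = {"img1": (0.1, 0.9), "img2": (0.8, 0.2), "img3": (0.15, 0.85)}
print(exhaustive_search((0.1, 0.8), stored, top_k=2))
```

Because the loop visits every entry, the search time is proportional to the number of stored images, which is the drawback the invention addresses.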

[0004]

[Problems to Be Solved by the Invention] However, in the conventional examples above, the distance to the features obtained from the specified pattern or words is computed for every data item at search time, so retrieval takes a long time.

[0005] The present invention has been made in view of the above conventional examples, and its object is to provide an image retrieval apparatus capable of retrieving data in a short search time.

[0006]

[Means for Solving the Problems] and

[Operation] To achieve the above object, the image retrieval apparatus of the present invention has the following configuration.

[0007] The apparatus comprises: storage means for storing image data corresponding to a plurality of images; measuring means for numerically measuring each of a plurality of features of an image as a feature quantity; quantizing means for non-linearly quantizing the plurality of feature quantities using thresholds for each feature; and associating means for generating an index by linearly combining the quantized feature quantities and associating the index with the image data stored in the storage means.

[0008] More preferably, the quantizing means determines the quantization thresholds, based on the feature quantities obtained by measuring the image data stored in the storage means, so that the number of image data items contained in each region delimited by the thresholds does not differ.

[0009]

[0010] The image retrieval method of the present invention has the following configuration.

[0011] The method comprises: a measuring step of numerically measuring each of a plurality of features of an image as a feature quantity; a quantizing step of non-linearly quantizing the plurality of feature quantities using thresholds for each feature; and an associating step of generating an index by linearly combining the quantized feature quantities and associating the index with the image data stored in storage means.

[0012] More preferably, the quantizing step determines the quantization thresholds, based on the feature quantities obtained by measuring the image data stored in the storage means, so that the number of image data items contained in each region delimited by the thresholds does not differ.

[0013]

[0014]

[Embodiments]

[First Embodiment] FIG. 1 shows the main configuration of a database system according to a first embodiment of the present invention. In the figure, 101 is user specification input means that receives a specification from the user for a search and outputs search information i; 102 is feature measurement/conversion means that receives the search information i specified by the user and generates a feature vector j from it; 103 is quantizing means that generates, from the feature vector j, an index k into the association table; 104 is an association table that stores the correspondence between the index k and the data in the database 106; 105 is image retrieval means that receives the feature vector j and the candidate data group l obtained from the association table by the index, and selects the data that matches the search information; 106 is a database storing data including image data; and 107 is threshold array setting means for controlling the association according to the contents of the database.

[0015] FIG. 3 is a flowchart of the search processing performed by the database system of the embodiment. First, in step S301, the user specification input means 101 receives from the user a specification of the target data to be retrieved, given as an image/figure pattern such as that shown in FIG. 2, as words, or as both. Next, in step S302, the feature measurement/conversion means 102 measures feature quantities by image processing in the case of an image/figure pattern, converts the given words into feature quantities using an internal conversion table in the case of words, and generates the feature vector j from these. Then, in step S303, the quantizing means 103 converts and quantizes each element of the feature vector j to generate the index k into the association table. In step S304, the association table 104 is looked up with the index k to obtain the candidate data group l. In step S305, the image retrieval means 105 computes the difference between the feature vector of each item in the candidate data group l and the feature vector j of the search information, and selects the items with small differences as the search result.

[0016] Here, the user specifies the image to be retrieved using a rough sketch such as that shown in FIG. 2 and words that express the impression of the image data. The feature measurement in the feature measurement/conversion means 102 can take many forms, as shown in References 1 and 2; for example, it uses the coefficient sequence obtained by Fourier-transforming the image/figure pattern of the search information, or the area and circularity of the connected regions obtained by binarizing the pattern. The conversion table used to convert words into feature quantities can be created by having many subjects observe many image patterns in advance and rate how well each matches the words, and then either computing the correlation between these match ratings and feature quantities such as color features measured from the image patterns, or training a neural network whose inputs and outputs are the match ratings and the feature quantities. A more detailed explanation of such techniques is given in Reference 2. The feature quantities obtained by feature measurement and feature conversion are gathered into a feature vector having them as elements, and output.

[0017] FIG. 4 illustrates index generation by quantization. The feature vector is a vector whose dimension is the number of feature quantities used, and it is almost never two-dimensional; for simplicity, however, a two-dimensional example using two feature quantities is shown here.

[0018] The two axes in FIG. 4 represent the values of the respective feature quantities. Each axis is divided by several thresholds (x1, x2, ..., xn, y1, y2, ..., ym) into a plurality of regions. The feature vector j corresponds to one point in this feature space, and the number of the region containing that point is the index. The region number is, for example, j*(n+1)+i when x_i < x ≤ x_{i+1} and y_j < y ≤ y_{j+1}, where x_0 and y_0 are minus infinity and x_{n+1} and y_{m+1} are plus infinity.
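
The region numbering of [0018] can be sketched for the two-dimensional case as follows. The function name and the sample thresholds are illustrative assumptions. Using a binary search is an implementation choice made here; the patent only specifies the inequality and the formula j*(n+1)+i.

```python
from bisect import bisect_left

def region_index(x, y, xs, ys):
    """Return j*(n+1)+i for the region containing the point (x, y).

    xs = [x_1, ..., x_n] and ys = [y_1, ..., y_m] are sorted threshold lists.
    bisect_left counts the thresholds strictly below the value, which is the
    i satisfying x_i < x <= x_{i+1} with x_0 = -inf and x_{n+1} = +inf.
    """
    i = bisect_left(xs, x)
    j = bisect_left(ys, y)
    return j * (len(xs) + 1) + i


xs = [1.0, 2.0, 3.0]   # n = 3 thresholds give 4 intervals on the x axis
ys = [10.0, 20.0]      # m = 2 thresholds give 3 intervals on the y axis
print(region_index(0.5, 5.0, xs, ys))    # lowest region on both axes
print(region_index(3.5, 25.0, xs, ys))   # highest region on both axes
```

With n thresholds on one axis and m on the other, the index ranges over (n+1)*(m+1) regions, so the association table needs that many entries.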

[0019] There are various ways to generate the thresholds xi, yi, and so on, but if fixed thresholds are given, the number of data items contained in each region differs, which causes the search time to vary. This can be solved by configuring the quantizing means 103 as described below, so that a variety of threshold arrays, including non-linear ones, can be given.

[0020] FIG. 5 shows one scheme for realizing the quantization described with FIG. 4; it consists of comparison quantizing means 501 and a threshold array 502. The comparison quantizing means 501 compares the feature vector from the search information against the thresholds in the threshold array 502, referring to them in order, and creates the index from the comparison results for each element. When different thresholds are used to quantize the elements of each dimension of the feature vector, a threshold array is prepared for each dimension.
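
A minimal sketch of the comparison quantizer of FIG. 5, generalized to any number of dimensions, might look like this. The function names are hypothetical. Combining the per-dimension levels in mixed-radix fashion is an assumption chosen so that the two-dimensional case reduces to the j*(n+1)+i numbering of [0018].

```python
def quantize_element(value, thresholds):
    """Scan the sorted thresholds in order and count how many lie below value."""
    level = 0
    for t in thresholds:
        if value > t:
            level += 1
        else:
            break
    return level

def make_index(vector, threshold_arrays):
    """Combine the per-dimension quantization levels linearly into one index.

    threshold_arrays holds one sorted threshold list per dimension.
    """
    index = 0
    stride = 1
    for value, thresholds in zip(vector, threshold_arrays):
        index += stride * quantize_element(value, thresholds)
        stride *= len(thresholds) + 1   # number of levels in this dimension
    return index


arrays = [[1.0, 2.0, 3.0], [10.0, 20.0], [0.5]]
print(make_index([1.5, 15.0, 0.2], arrays))
```

Because each dimension has its own threshold list, the spacing between thresholds can be non-uniform and can differ between features.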

[0021] FIG. 6 shows the structure of the association table. Each address of the main table 601 stores a pointer to an array of the identifiers of the corresponding data. 602 is an example of such an array of data identifiers; a 0 is stored to mark the end of the array data. The index is used as a relative address, and the pointer to the array of data identifiers stored there is output.
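
The table of FIG. 6 can be modeled as a list indexed by region number, where each entry is a 0-terminated list of data identifiers. This is a sketch under assumptions: Python lists stand in for the pointers, identifiers are assumed to start at 1 so that 0 can serve as the terminator, and the function names are illustrative.

```python
def build_table(num_regions, entries):
    """Build the main table from (index, data_id) pairs, with data_id >= 1."""
    table = [[] for _ in range(num_regions)]
    for index, data_id in entries:
        table[index].append(data_id)
    for ids in table:
        ids.append(0)   # 0 marks the end of each identifier array
    return table

def lookup(table, index):
    """Return the candidate identifiers stored for the given index."""
    candidates = []
    for data_id in table[index]:
        if data_id == 0:   # terminator reached
            break
        candidates.append(data_id)
    return candidates


table = build_table(4, [(0, 7), (2, 3), (2, 9)])
print(lookup(table, 2))
```

A lookup costs one table access plus a walk over the identifiers of a single region, independent of the total number of stored images.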

[0022] FIG. 7 is a flowchart of the search procedure performed by the image retrieval means 105. First, in step S701, for each item in the identifier array, the feature vector registered in the database is read out, and in step S702 its distance from the feature vector obtained from the search information is computed by Equation 1 above. After this has been done for all the items, in step S704 the items are sorted in ascending order with the computed distance as the key, and the top several (usually about 10 to 100) are output as the search result. As the distance calculation method, any of the various distance measures described in pattern recognition textbooks, such as the Mahalanobis distance, can also be used.
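
The procedure of FIG. 7 can be sketched as below. Equation 1 is not reproduced in this text, so the Euclidean distance is an assumption, and the function and parameter names are illustrative.

```python
import math

def rank_candidates(query, candidate_ids, features, top_k=10):
    """Sort the candidates by distance to the query and keep the closest top_k.

    features maps each data identifier to its registered feature vector.
    """
    scored = []
    for data_id in candidate_ids:
        # Read the registered vector and compute its distance to the query.
        scored.append((math.dist(query, features[data_id]), data_id))
    scored.sort()   # ascending distance, closest first
    return [data_id for _, data_id in scored[:top_k]]


features = {1: (0.0, 0.0), 2: (1.0, 1.0), 3: (5.0, 5.0)}
print(rank_candidates((0.9, 0.9), [1, 2, 3], features, top_k=2))
```

Only the identifiers returned by the association table are passed in, so the number of distance computations is bounded by the size of one region rather than the whole database.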

[0023] FIG. 8 shows the threshold array setting means 107, which sets the contents of the threshold array 502 of the quantizing means described with FIG. 5 according to the data contents. For all or part of the data stored in the database, each feature quantity is computed in turn by the feature measurement means 102, and the feature histogram counter 802 corresponding to its value is incremented by 1, thereby counting the feature histogram for the target data group. Next, the threshold determining means 803 determines the thresholds by referring to the feature histogram and sets them in the threshold array 502. This determination of the threshold array is repeated for each feature quantity.
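
The histogram counting of FIG. 8 might be sketched as follows for one feature. The uniform bin layout, the value range, and the function name are assumptions; the patent only states that the counter corresponding to each measured value is incremented.

```python
def build_histogram(values, num_bins, lo, hi):
    """Count how many feature values fall into each of num_bins equal-width bins."""
    hist = [0] * num_bins
    width = (hi - lo) / num_bins
    for v in values:
        b = int((v - lo) / width)
        b = min(max(b, 0), num_bins - 1)   # clamp out-of-range values to the end bins
        hist[b] += 1                       # increment the matching counter by 1
    return hist


print(build_histogram([0.05, 0.12, 0.17, 0.95], num_bins=10, lo=0.0, hi=1.0))
```

One such histogram would be built per feature quantity, since the thresholds are determined separately for each feature.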

[0024] FIG. 9 is a flowchart showing the operation of the threshold determining means 803. After initialization in step S901, the threshold determining means 803, in steps S902 to S905, refers in turn to the contents H(i) of the feature histogram counters 802 while incrementing i, and adds them to an internal register t. When the value of t exceeds a specific threshold S (computed as the number of target data items N divided by the number of thresholds n), the value of i at that time is set in the threshold array 502 as a threshold in step S907, and S is subtracted from the register t in step S908. The thresholds are determined by repeating this until all the feature histogram counters have been referenced. This setting of the threshold array is used in the following two ways. In the first, it is performed with a standard data group before image data is registered, and the result is given as a standard threshold array. In the second, the threshold array is optimized for the stored data after image data has been registered in the database; in this case, after the threshold array is reset, the index is recomputed for each image data item in the database and the association table is rebuilt accordingly.
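
The loop of FIG. 9 can be sketched directly from the description in [0024]. The function and variable names are illustrative, and the thresholds are recorded as histogram bin numbers i, as in the text; mapping a bin number back to a feature value is left out.

```python
def determine_thresholds(hist, num_thresholds):
    """Pick thresholds so that each region holds roughly the same number of items.

    hist[i] is the histogram count H(i); S = N / n as described in [0024].
    """
    s = sum(hist) / num_thresholds
    t = 0.0                       # internal register t
    thresholds = []
    for i, count in enumerate(hist):
        t += count                # accumulate H(i)
        if t > s:                 # register exceeded S: record i as a threshold
            thresholds.append(i)
            t -= s                # subtract S and keep the remainder
    return thresholds


print(determine_thresholds([1, 1, 1, 1, 1, 1, 1, 1], 4))
print(determine_thresholds([6, 1, 1, 0, 0, 0, 0, 0], 4))
```

For a flat histogram the thresholds come out evenly spaced, while for a skewed histogram they cluster where the data is dense, which is the non-linear quantization the embodiment relies on.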

[0025] As described above, by providing, in addition to the means for measuring image feature quantities, means for quantizing the feature quantities and an association table that associates the quantized feature quantities with image data, the candidate data can be narrowed down from the specified search information without distance calculation, and the search time can be shortened. In addition, providing the quantizing means with a threshold array enables non-linear quantization, which evens out the distribution of data within the association table and thus reduces the variation in search time. Furthermore, providing means for setting the threshold array according to the data stored in the database makes it possible to adjust the thresholds to the contents of the database and further reduce the variation in search time.

[0026]

[Other Embodiments]

[Second Embodiment] FIG. 10 shows another scheme for realizing the quantizing means 103; it consists of comparison quantizing means 1001 and normalization calculating means 1002. The comparison quantizing means 1001 quantizes the feature values, which have been normalized to a certain range by the normalization calculating means 1002, according to preset thresholds, and creates the index from the quantization results for each element. The normalization calculating means performs, for example, the calculation of Equation 2.

[0027]

(Equation 2) output = (a * input + b) mod c

Here, a, b, and c are constants set in advance. Any hash function, logarithmic transformation, or the like can be used in the same way. When the distributions of the element values differ between the dimensions of the feature vector, quantizing means is prepared for each dimension.
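
Equation 2 followed by quantization with preset thresholds can be sketched as below. The uniform spacing of the preset thresholds, the requirement c > 0, and the function names are assumptions made for illustration.

```python
def normalize(value, a, b, c):
    """Equation 2: map a raw feature value into the range [0, c), assuming c > 0."""
    return (a * value + b) % c

def quantize_normalized(value, a, b, c, levels):
    """Quantize the normalized value into `levels` equal-width steps (assumed spacing)."""
    step = c / levels
    return min(int(normalize(value, a, b, c) / step), levels - 1)


print(normalize(7, a=3, b=1, c=10))
print(quantize_normalized(7, a=3, b=1, c=10, levels=5))
```

In Python the % operator returns a result with the sign of the divisor, so a positive c keeps the output non-negative even for negative inputs.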

[0028] This configuration also provides the same effect as the first embodiment, namely that the search time can be shortened.

[Third Embodiment] Next, an image database system according to a third embodiment of the present invention will be described.

[0029] Conventionally, when retrieving a required image from an image database, a first method has been to search using keywords, such as words or symbols, registered as attached information when the image was stored.

[0030] A second method has been as follows: when an image to be stored is input, a plurality of feature quantities representing the characteristics of the image are extracted and stored in the image database in association with the input image; at search time, the feature quantities representing those characteristics are extracted from an input example image, distances are computed between the extracted feature quantities and the corresponding feature quantities of the images stored in the image database, a candidate ranking is determined from the resulting distances, and the images are displayed according to that ranking.

[0031] However, with the first method, when the image database is large and the content represented is complex, it is extremely difficult to assign keywords systematically to all stored images, and in many cases the required image cannot be found in a single search using only keywords such as words or symbols.

[0032] With the second method, the comparison of feature quantities and the determination of the candidate ranking are performed over all images in the image database, so images that are undesirable as candidates appear, the desired image fails to rank among the top candidates, and flexible image retrieval that matches the user's intent is not possible.

[0033] This embodiment describes a flexible image retrieval method that solves the above problems and matches the user's intent.

[0034] The third embodiment is described below with reference to the drawings. FIG. 11 is a block diagram of the apparatus of this embodiment. In FIG. 11, 10 is a stored data input unit for inputting the data to be stored in the database; 20 is a feature calculation unit that calculates various feature quantities for an input image; 30 is a data storage unit that stores the input data; 40 is a search condition input unit for setting the search conditions for obtaining the desired data; 50 is a candidate determination unit that selects candidate data from the data in the data storage unit 30 according to the search conditions given by the search condition input unit 40; 60 is a control unit that controls the entire apparatus of this embodiment; and 70 is a display unit for displaying data.

[0035] The stored data input unit 10 consists of an image input unit 11 for inputting images and an attached information input unit 12 for inputting the information attached to an image (image name, date, and other necessary information). The data storage unit 30 consists of an image storage unit 31 that stores the images input from the image input unit 11, a feature storage unit 32 that stores the image feature quantities calculated by the feature calculation unit 20, and an attached information storage unit 33 that stores the attached information input from the attached information input unit 12. The search condition input unit 40 consists of an example image input unit 41 for presenting an image to serve as the search key, an attached information search condition input unit 42 for specifying search conditions on the attached information, and a search condition specification unit 43 for specifying a search condition that combines the example image presented at the example image input unit 41 with the search conditions given at the attached information search condition input unit 42. In the candidate determination unit 50, the image classification unit 51 classifies the stored images based on the feature vector of each image stored in the feature storage unit 32, and holds the result. The attached information search unit 52 selects the data that matches the search conditions input at the attached information search condition input unit 42. The integrated determination unit 53 then selects the data that matches overall, based on a comparison of the classification results for the example image input from the example image input unit 41 and the stored images, and on the data selected by the attached information search unit 52. The similarity calculation unit 54 calculates the similarity for the data selected by the integrated determination unit 53 and ranks the candidate images.

[0036] In the apparatus configured as in FIG. 11, an image to be stored is input from the image input unit 11. For the input image, the feature calculation unit 20 calculates a plurality of feature quantities representing the characteristics of the image; these are stored in the feature storage unit 32, and the input image is stored in the image storage unit 31. The attached information for the input image (for example, image name, date, and other necessary information) is input from the attached information input unit 12 and stored in the attached information storage unit 33 in association with the input image and the feature quantities. Storing these data in association with one another is easily done with a known database management system such as a relational database, and is therefore not described in detail here.

[0037] Next, the search function is described. The search function in this embodiment is divided into three types: (i) search by example image, (ii) search by attached information, and (iii) search by a combination of example image and attached information. In (i), search by example image, an image serving as the search key is presented at the example image input unit 41; both a similarity search that selects images "similar" to this key image and a dissimilarity search that selects images "not similar" to it are possible. In (ii), search by attached information, when some search condition on the attached information of the image is known in advance, the condition can be input from the attached information search condition input unit 42 to select candidate data. In (iii), search by a combination of example image and attached information, the logical OR or logical AND of the search conditions of (i) and (ii) can be taken. For example, it is possible to search for images that are "not similar" to the presented key image and satisfy the attached-information search condition, or for images that are "similar" to the presented key image or satisfy the attached-information search condition; this is specified at the search condition specification unit 43. These processes are described in more detail below. In (i), search by example image, the feature quantities of the key image input at the example image input unit 41 are calculated by the feature calculation unit 20 in the same way as the feature quantities of the images already stored.

[0038] Meanwhile, the image classification unit 51 divides the n-dimensional space formed by the feature quantities stored in the feature storage unit 32 (n in general) into a plurality of regions, using a discriminant function obtained by statistical means or the like, and classifies the data stored in the image storage unit 31.

[0039] FIG. 12 shows an example in which image data in a two-dimensional feature space is divided into three groups a, b, and c. Because the image presented as the key image can also be represented as a point in this space by its calculated feature quantities, the group to which it belongs can be determined. For a search under the condition "similar" to the example image, the image data of the group to which the example image belongs is sent to the similarity calculation unit 54, the similarity between the key image and the images in the group is calculated using the feature quantities, and the images are displayed on the display unit 70 in ascending order of the similarity value. The feature quantities used for this similarity calculation may be the same as those used for classification, but a preferable approach is to use feature quantities that represent global characteristics of the image for classification and feature quantities that can describe relatively detailed characteristics for the similarity calculation.
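
The "similar" search of [0039] can be sketched as a two-stage procedure. The patent does not specify the discriminant function, so nearest-centroid classification is used here purely as an assumed stand-in. The similarity is treated as a Euclidean distance so that smaller values mean a closer match, which is also an assumption, and all names and sample data are illustrative.

```python
import math

def nearest_group(point, centroids):
    """Assign a point to the group whose centroid is closest (assumed classifier)."""
    return min(centroids, key=lambda g: math.dist(point, centroids[g]))

def similar_search(key_coarse, key_fine, centroids, images):
    """Rank only the images in the key image's group.

    images maps an image id to (group, fine_feature_vector).
    Coarse (global) features select the group; fine features rank within it.
    """
    group = nearest_group(key_coarse, centroids)
    scored = [(math.dist(key_fine, fine), image_id)
              for image_id, (g, fine) in images.items() if g == group]
    scored.sort()
    return [image_id for _, image_id in scored]


centroids = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (0.0, 10.0)}
images = {1: ("a", (0.1, 0.2)), 2: ("a", (0.5, 0.5)), 3: ("b", (9.0, 1.0))}
print(similar_search((1.0, 1.0), (0.4, 0.4), centroids, images))
```

Separating the coarse and fine feature vectors mirrors the preference stated in the text for global features in classification and detailed features in the similarity calculation.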

[0040] Next, for a search under the condition "not similar to" the example image, the image data of all groups other than the group to which the example image belongs is displayed on the display unit 70 as the final candidates. For example, if the key image in FIG. 12 belongs to group a, all images belonging to groups b and c become the final candidates. No similarity calculation is performed under the "not similar" condition because, in the region where the similarity measure is small (that is, "not similar"), a distance measure of similarity has almost no meaning in terms of human perception.
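A minimal sketch of the "not similar" case, assuming each stored image already carries a group label. The ids and labels are hypothetical. No distance is computed; the result is sorted by id only to make the output deterministic.

```python
def search_dissimilar(key_group, group_of):
    """Return the ids of all images outside the key image's group, without ranking."""
    return sorted(image_id for image_id, group in group_of.items() if group != key_group)


# Hypothetical group assignment of stored images.
group_of = {"img1": "a", "img2": "b", "img3": "c", "img4": "a"}
result = search_dissimilar("a", group_of)
```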

[0041] Next, in the search by attached information in (ii), the search conditions entered through the attached-information search condition input unit 42 (string matching; numerical equality, inequality, and magnitude comparison; and logical OR, AND, and NOT of these) are used by the attached-information search unit 52 to search the data stored in the attached information storage unit 33, and the data satisfying the conditions is displayed on the display unit 70 as the final candidates. Search by attached information is well known, for example through the relational database language SQL, and is not described in detail here.

[0042] In the search in (ii) that combines the example image with attached information, the search condition specifying unit 43 is used to specify the logical OR or logical AND of the "similar" or "not similar" condition on the presented key image and the search condition on the attached information. The set of images that satisfies the condition specified by the example image is obtained from the image classification unit 51, and the set of images that satisfies the attached-information condition is obtained from the attached-information search unit 52. The integration determination unit 53 then takes the intersection or the union of these two sets, according to whether "AND" or "OR" was specified in the search condition specifying unit 43, and outputs the resulting set as its result. When the example-image condition is "similar", the similarity calculation unit 54 computes the similarity to the key image for the set obtained by the integration determination unit 53, as in case (i), and the images are displayed on the display unit 70 as candidate images in ascending order of the similarity value. When the example-image condition is "not similar", the set of images obtained by the integration determination unit 53 is displayed on the display unit 70 as the candidate images.
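The integration step can be sketched with ordinary set operations. This is an illustration under assumptions: the function names are hypothetical, and the precomputed `distance_to_key` mapping stands in for the output of the similarity calculation.

```python
def integrate(example_set, info_set, operator):
    """Combine the two candidate sets with the operator chosen by the user."""
    if operator == "AND":
        return example_set & info_set
    if operator == "OR":
        return example_set | info_set
    raise ValueError(f"unsupported operator: {operator!r}")


def combined_search(example_set, info_set, operator, similar, distance_to_key):
    """Rank by distance for a 'similar' condition; otherwise return the set unranked."""
    merged = integrate(example_set, info_set, operator)
    if similar:
        return sorted(merged, key=lambda image_id: (distance_to_key[image_id], image_id))
    return sorted(merged)  # sorted by id only for a deterministic listing


# Hypothetical inputs: one set from the classification, one from the attached information.
example_set = {"img1", "img2", "img4"}
info_set = {"img2", "img3", "img4"}
distance_to_key = {"img1": 0.0, "img2": 0.32, "img3": 0.9, "img4": 0.03}
```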

[0043] As described above, according to this embodiment, image data can be retrieved by an example image search process that finds images similar or dissimilar to a presented key image, an attached-information search process that finds images by the information attached to them, and an integrated search process that combines the two. This provides a variety of search methods for image data and has the advantage of realizing flexible image retrieval that matches the user's intent.

[0044] The image classification unit 51 of this embodiment was described using a linear discriminant function obtained by a statistical method as the discriminant function for classifying images, but a nonlinear discriminant function such as a neural network may be used instead. When it is not known in advance which image classes exist, the images can also be classified by a clustering technique. Search efficiency can also be improved by performing the attached-information search first and then performing the similar or dissimilar search only on the images whose attached information satisfies the condition.

[Fourth Embodiment]

As a fourth embodiment, image retrieval in a database system is described.

[0045] Conventionally, a first method of retrieving a required image from an image database is to search using keywords, such as words or symbols, registered as attached information when the image was stored.

[0046] A second method extracts a plurality of feature amounts representing the characteristics of an image when the image to be stored is input, and stores them in the image database in association with the input image. At search time, the same plurality of feature amounts is extracted from the input example image, distances are calculated between the extracted feature amounts and the corresponding feature amounts of the images stored in the image database, candidate ranks are determined from the calculated distances, and the results are displayed according to that ranking.

[0047] However, with the first method, when the image database is large and the content depicted is complex, keywords such as words and symbols alone often cannot retrieve the required image in a single search. With the second method, feature amounts are compared and candidate ranks are determined for every image in the image database, so images known before the search to be undesirable as candidates are also listed. As a result, the required image is less likely to appear among the top candidates and the search time becomes longer, so images cannot be retrieved in line with the user's intent.

[0048] This embodiment solves the above problems and provides an efficient method of storing and retrieving similar images.

[0049] FIG. 13 is a block diagram of the apparatus according to this embodiment.

[0050] In the figure, 1 is an image database system, 2 is a stored-image input unit, 3 is a feature amount extraction unit for images input through the stored-image input unit 2, 4 is an attached information input unit, 5 is an example image input unit used at search time, 6 is a feature amount extraction unit for example images input through the example image input unit 5, 7 is a search condition input unit for attached information, and 8 is a display unit for search results and the like.

[0051] Within the image database system 1, 11 is a storage unit for input images, 12 is a storage unit for the feature amounts extracted by the feature amount extraction unit 3, 13 is a storage unit for the attached information input through the attached information input unit 4, 14 is a candidate image limiting unit that operates on the conditions input through the search condition input unit 7, 15 is a similarity calculation unit that computes the similarity between the feature amounts in the feature amount storage unit 12 and the feature amounts extracted by the example-image feature amount extraction unit 6, and 16 is a candidate rank determination unit that operates on the similarity obtained by the similarity calculation unit 15.

[0052] This embodiment is an example implemented with a relational database as the image database system 1 and binary images of plant leaves as the target images.

[0053] First, when storing an image, the creator of the image database inputs the image to be stored to the stored-image input unit 2. The feature amount extraction unit 3 then extracts a plurality of feature amounts representing the characteristics of the image. The feature amounts are stored in the feature amount storage unit 12 and the input image is stored in the image storage unit 11.

[0054] FIG. 15 shows examples of how image data may be stored in a relational database. Reference numeral 31 denotes an example in which only the directory of the image file is stored in a column, and 32 denotes an example in which the image data itself is stored directly in a column as a vector sequence. As examples of storing feature amount data in the relational database, where one feature amount is an n-dimensional vector (n differs from feature to feature), the n-dimensional vector may be stored in n columns, or it may be stored in byte-string form in the same way as image data. The feature amounts of an image used here include the number of black pixels in each cell of an 8 × 8 mesh, the number of black/white transitions in each of eight vertical and eight horizontal strips, circularity, and elongation.
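Two of the feature amounts named above, and the byte-string storage form, can be sketched as follows. The details are assumptions made for illustration: the binary image is a list of rows of 0 (white) and 1 (black), the transition count for a strip is summed over every pixel row in that strip, and the vector is packed as little-endian unsigned 16-bit integers. The text does not specify any of these.

```python
import struct


def mesh_black_counts(image, cells=8):
    """Count black pixels (value 1) in each cell of a cells x cells mesh, row-major."""
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // cells, width // cells
    counts = []
    for r in range(cells):
        for c in range(cells):
            counts.append(sum(
                image[y][x]
                for y in range(r * cell_h, (r + 1) * cell_h)
                for x in range(c * cell_w, (c + 1) * cell_w)
            ))
    return counts


def horizontal_strip_transitions(image, strips=8):
    """For each horizontal strip, sum the black/white changes along its pixel rows."""
    strip_h = len(image) // strips
    result = []
    for s in range(strips):
        total = 0
        for y in range(s * strip_h, (s + 1) * strip_h):
            row = image[y]
            total += sum(1 for a, b in zip(row, row[1:]) if a != b)
        result.append(total)
    return result


def pack_vector(values):
    """Store an n-dimensional integer vector as one byte string (one column)."""
    return struct.pack(f"<{len(values)}H", *values)


def unpack_vector(blob):
    """Recover the integer vector from its byte-string form."""
    return list(struct.unpack(f"<{len(blob) // 2}H", blob))


# Hypothetical 16 x 16 binary image: left half black, right half white.
image = [[1] * 8 + [0] * 8 for _ in range(16)]
counts = mesh_black_counts(image)
transitions = horizontal_strip_transitions(image)
blob = pack_vector(counts)
```

Storing the vector in n separate columns instead would keep each component queryable in SQL, at the cost of a wider table; the byte-string form keeps the schema fixed when n differs between features.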

[0055] Attached information belonging to the input image is entered through the attached information input unit 4 and stored in the attached information storage unit 13 in association with the input image. Here, for example, the image id, name, family name, habitat, and flowering season are registered as the attached information of the image.

[0056] Next, retrieval with an example image is described with reference to FIG. 14, a flowchart of the search-time processing of this embodiment.

[0057] First, the user inputs a binary image of a leaf to the example image input unit 5 as the example image (S141). Besides being newly input at search time, the example image may be an image already stored in the image database. When the example image is presented, the feature amount extraction unit 6 extracts from it the same feature amounts that are extracted at storage time (S142). If some search condition on the attached information of the image is known in advance, for example if the family name of the leaf being sought is known, the condition on the attached information is entered through the search condition input unit 7 (S143). When a search condition is entered, the candidate image limiting unit 14 searches the attached information stored in the attached information storage unit 13 of the image database 1 against the entered condition and limits the candidate images to those that satisfy it (S144). Because a relational database is used as the image database here, the search can be carried out with the database language SQL. For example, giving the search condition "the family name is Rosaceae" in the form "select image_id where family='bara'" yields the image_id of only those images that satisfy the condition. If no search condition on the attached information is entered, all stored images are treated as candidate images.
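The candidate-limiting step (S144) can be sketched with Python's built-in `sqlite3` module. The table name `leaf_info`, its columns other than `image_id` and `family`, and the sample rows are hypothetical; the statement quoted in the text does not name a table, so the FROM clause below is an assumption.

```python
import sqlite3

# In-memory database with a hypothetical attached-information table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leaf_info (image_id INTEGER PRIMARY KEY, name TEXT, family TEXT)"
)
conn.executemany(
    "INSERT INTO leaf_info VALUES (?, ?, ?)",
    [(1, "sakura", "bara"), (2, "kaede", "kaede"), (3, "ume", "bara")],
)


def candidate_ids(conn, family=None):
    """Return the image ids matching the family; all ids when no condition is given."""
    if family is None:
        rows = conn.execute("SELECT image_id FROM leaf_info ORDER BY image_id")
    else:
        rows = conn.execute(
            "SELECT image_id FROM leaf_info WHERE family = ? ORDER BY image_id",
            (family,),
        )
    return [row[0] for row in rows]


rosaceae_ids = candidate_ids(conn, "bara")
all_ids = candidate_ids(conn)
```

Passing the family name as a bound parameter rather than splicing it into the statement avoids quoting problems in the condition.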

[0058] Next, the candidate images are ranked. First, the similarity calculation unit 15 calculates the similarity between the feature amounts of the example image and those of the candidate images (S145). Ways of restricting the distance calculation to the limited candidate images include calculating distances only for the feature amounts of the images whose image_id was returned by the attached-information search, and storing the resulting candidate images or image ids once in a table of the relational database and calculating distances against that table. The similarity is obtained here by calculating the distance between each feature amount of the example image and of each stored image, but it is also possible, for example, to let the feature amounts used be selected freely, or to weight each feature amount using a relationship, obtained by learning or the like, between human subjective evaluation and the feature amounts.
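A sketch of the restricted distance calculation (S145), assuming a weighted Euclidean distance. The text does not fix the distance formula or how the weights are learned; the weights, ids, and feature values here are hypothetical. Setting a weight to zero corresponds to leaving that feature amount out.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def score_candidates(key, features, candidate_ids, weights):
    """Compute distances only for the images that survived the attached-information filter."""
    return {
        image_id: weighted_distance(key, features[image_id], weights)
        for image_id in candidate_ids
    }


# Hypothetical stored feature vectors keyed by image_id.
features = {1: [0.0, 0.0], 2: [3.0, 4.0], 3: [1.0, 0.0]}
scores = score_candidates([0.0, 0.0], features, [1, 3], [1.0, 1.0])
```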

[0059] From the similarity obtained in this way, the candidate rank determination unit 16 determines the candidate ranks of the candidate images considered similar to the example image (S146).

[0060] According to the candidate ranks determined in this way, the display unit 8 displays the results in the specified display form (S147): for example, displaying only the first candidate, or displaying the top ten candidates and then showing the next ten when a window button or the like is operated.
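Rank determination (S146) and the ten-at-a-time display (S147) can be sketched as a sort followed by slicing. This assumes smaller scores rank higher and breaks ties by image id; both choices, and the function names, are illustrative.

```python
def rank(scores):
    """Order image ids by ascending score, breaking ties by id."""
    return [image_id for image_id, _ in sorted(scores.items(), key=lambda kv: (kv[1], kv[0]))]


def page(ranked, page_index, page_size=10):
    """Return one screenful of candidates; page_index 0 holds the top candidates."""
    start = page_index * page_size
    return ranked[start:start + page_size]


# Hypothetical scores for 25 candidate images.
scores = {image_id: float(25 - image_id) for image_id in range(25)}
ranked = rank(scores)
first_screen = page(ranked, 0)
```

Showing only the first candidate is the same call with `page_size=1`.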

[0061] Through the above process, the user can obtain the required image from among the stored images.

[0062] As described above, when retrieving similar images from the image database, the apparatus of this embodiment is given an example image together with some attached information as search conditions, so images that are undesirable as candidates can be excluded from the outset. This improves search accuracy and shortens search time, realizing efficient similarity search.

[0063] The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. It goes without saying that the present invention is also applicable where it is achieved by supplying a program to a system or an apparatus.

[0064]

[Effects of the Invention] As described above, the image retrieval apparatus and method according to the present invention have the effect that data can be retrieved in a short search time.

[Brief Description of the Drawings]

FIG. 1 is a diagram showing the main configuration of the first embodiment.

FIG. 2 is a diagram showing an example of data designation information given by a user.

FIG. 3 is a flowchart showing the operation of the database system according to the present invention.

FIG. 4 is an explanatory diagram of the quantization of the feature space.

FIG. 5 is a diagram showing a configuration example of the quantization means.

FIG. 6 is a diagram showing the configuration of the association table.

FIG. 7 is a flowchart showing the operation of the image search means.

FIG. 8 is a configuration diagram of the threshold array setting means, which sets the threshold array of the quantization means according to the data content.

FIG. 9 is a flowchart showing the operation of the threshold determination means.

FIG. 10 is a diagram showing another configuration example of the quantization means.

FIG. 11 is a block diagram of the third embodiment.

FIG. 12 is a diagram showing an example in which image data is grouped in the feature space.

FIG. 13 is a block diagram showing the fourth embodiment.

FIG. 14 is a flowchart showing the flow of search-time processing in the fourth embodiment.

FIG. 15 is a diagram showing an example of a form in which image data is stored in a relational database.

[Explanation of Symbols]

101 user designation input means
102 feature amount measurement and conversion means
103 quantization means
104 association table
105 image search means
106 database storing data including image data
107 threshold array setting means
i search information
j feature amount vector
k index into the association table
l list of candidate data obtained from the association table
501 comparison quantization means within the quantization means 103
502 threshold array
802 feature amount histogram counter within the threshold array setting means
803 threshold determination means

Continuation of the front page

(72) Inventor: Norio Shimura, c/o Canon Inc., 3-30-2 Shimomaruko, Ota-ku, Tokyo

(56) References:
JP-A-64-73460 (JP, A)
JP-A-61-213925 (JP, A)
JP-A-1-102685 (JP, A)
JP-A-4-98463 (JP, A)
Shunichi Kato et al., "Multimedia Trademark and Design Database TRADEMARK", IEICE Technical Report (PRU88-5 to 15), Vol. 88, No. 24, 1988 (1988-05-20), pp. 31-38
Masao Sakauchi, "Image Retrieval Technology", Journal of the IEICE, Vol. 71, No. 99, 1988 (1988-09-25), pp. 911-914
Yoshio Yanagihara et al., "A Method for Creating an Optimal Classification Tree Considering Feature Extraction Time: Application to Similar Image Retrieval by Example Image Input", Transactions of the IECE, Vol. J68-D, No. 6, 1985 (1985-06-25), pp. 1325-1335
Masahiro Shibata et al., "Associative Retrieval Method for Image Databases", Transactions of the IEICE, Vol. J73-D-II, No. 4, 1990 (1990-04-25), pp. 526-534

(58) Field searched (Int. Cl.7, DB name): G06F 17/30; JICST file (JOIS)

Claims

(57) [Claims]

1. An image retrieval apparatus comprising: storage means for storing image data corresponding to a plurality of images; measuring means for numerically measuring a plurality of features of an image as respective feature amounts; quantization means for non-linearly quantizing the plurality of feature amounts with a threshold for each feature; and association means for linearly combining the plurality of quantized feature amounts to generate an index and associating the index with the image data stored in the storage means.

2. The image retrieval apparatus according to claim 1, wherein the quantization means determines the quantization thresholds, based on feature amounts obtained by measuring the image data stored in the storage means, so that the number of image data items contained in each region delimited by the thresholds does not differ.

3. An image retrieval method comprising: a measuring step of numerically measuring a plurality of features of an image as respective feature amounts; a quantization step of non-linearly quantizing the plurality of feature amounts with a threshold for each feature; and an association step of linearly combining the plurality of quantized feature amounts to generate an index and associating the index with image data stored in storage means.

4. The image retrieval method according to claim 3, wherein the quantization step determines the quantization thresholds, based on feature amounts obtained by measuring the image data stored in the storage means, so that the number of image data items contained in each region delimited by the thresholds does not differ.