JP2012064082A

JP2012064082A - Image classification device

Info

Publication number: JP2012064082A
Application number: JP2010208904A
Authority: JP
Inventors: Shuhei Sasakura; 州平笹倉; Eiji Fukumiya; 英二福宮
Original assignee: Panasonic Corp
Current assignee: Panasonic Corp
Priority date: 2010-09-17
Filing date: 2010-09-17
Publication date: 2012-03-29

Abstract

PROBLEM TO BE SOLVED: To reduce the number of comparisons, which grows with the number of collations, in a system that automatically organizes images by person by identifying people through face recognition, so that several thousand or several tens of thousands of stored images can be organized easily.
SOLUTION: In a function that classifies pictures by collating people using facial feature amounts, the image classification device of the present invention detects a person's facial expression when the person is registered and organizes the feature amounts by expression class. When a person is later classified, the device detects the facial expression of that person and compares only against feature amounts belonging to the same expression class, thereby reducing the number of comparisons.

Description

The present invention relates to an image classification device that makes it easy to organize a large number of images in which people have been photographed.

Photographs taken with a digital camera and videos taken with a video camera capture memories and events that are important to the photographer or to the people who appear in them.

Patent Document 1 discloses an apparatus that organizes digital photographs using face recognition. It reads digital images stored on a recording medium, cuts out the face regions, and uses face recognition technology to identify each person from the similarity between faces in multiple digital images, thereby organizing the photographs by person. The document shows a device configuration for realizing this, the user-interface screens, and an example of organized face images.

Patent Document 2 describes a method that, when searching images for a specific person using face recognition technology, improves recognition accuracy by using multiple face photographs of the person to be identified.

Patent Document 3 discloses a mechanism for a digital camera that, when photographing people, detects a subject's smile and lets the user freely select which subject to photograph. At shooting time the camera recognizes the facial expression of the person who is the subject and automatically determines an appropriate shooting timing. With a smile as the target expression, the moment the subject smiles becomes the shooting timing, so a photograph of a smiling face is obtained.

Patent Document 1: JP 2005-174308 A
Patent Document 2: JP 2003-271958 A
Patent Document 3: JP 2008-311817 A

When organizing digital image data on a device with a keyboard and file management functions, such as a personal computer, the user has to sort through photographs one by one, which takes so much effort that organizing was itself a major problem. People unable to use a personal computer, which requires considerable skill, simply ended up accumulating data. Patent Document 1 therefore describes a method for easily organizing digital images without complicated input.

Patent Document 1 describes a method of recognizing and organizing the faces in all of the photographs the user owns. However, the number of people appearing in photographs grows with the number of photographs, and it is not always necessary to compare and retrieve every one of them. In most cases there is no need to organize people who merely appear together in a group photograph or who happen to be captured in a photograph taken at a tourist spot. The people to be organized may therefore be limited to, for example, family and friends. In that case, only those people can be searched for by manually specifying a face photograph of each person to be organized, or by selecting one from candidates.

Patent Document 2 describes a method of raising recognition accuracy in this situation by preparing multiple face images of the same person. Each face image to be registered is a separate image. The method rests on the assumption that, because the images are registered by someone who knows the person, none of them can show the wrong person.

However, human facial expressions convey emotions such as joy, anger, sorrow, and pleasure, and they vary widely even for the same person. Even if the registered images are averaged mathematically, errors remain at comparison time. Moreover, if the registered expressions are biased, recognition can fail because even a different person looks similar when showing the same smiling face. One approach is therefore to improve accuracy by registering more face images as the same person. Another is to also register faces known to belong to people other than the person to be recognized, and to make a relative comparison of which group a face resembles more.

These approaches also have drawbacks. Preparing more face images for the same person, and including non-target people in the comparison, directly increases the number of comparisons, so comparison takes longer. The time can be shortened with higher-performance hardware, but this has limits in devices whose specifications are necessarily constrained, such as embedded devices, and the increased power consumption translates directly into battery drain in portable devices.

To solve the above problem, facial expression is taken into account when faces are compared, restricting the comparison targets so that collation is achieved with fewer comparisons.

With the present invention, face collation can be performed efficiently even when a vast amount of face information is registered in the database. The time required for collation is reduced, and more photographs can be classified in the same amount of time.

FIG. 1 is a block diagram showing the configuration of the image classification device 100 of the embodiment.
FIG. 2 is a block diagram showing the image classification device 100 as used during face registration in the embodiment.
FIG. 3 is a control flowchart of the image classification device 100 during face registration in the embodiment.
FIG. 4 is a diagram showing the recording structure of the face feature DB 105 of the embodiment.
FIG. 5 is a block diagram showing the image classification device 100 as used during image classification in the embodiment.
FIG. 6 is a control flowchart of the image classification device 100 during image classification in the embodiment.

The best mode for carrying out the present invention is described below with reference to the drawings.

<Description of FIG. 1: Configuration of Photo Classification Device 100>
FIG. 1 shows the photo classification device 100 of the present embodiment. The device records the face of a person to be found using face collation and, when given a photograph that may or may not show that person, determines whether the person appears in it. The photo classification device 100 consists of an image input unit 101, a face detection unit 102, an expression detection unit 103, a face feature extraction unit 104, a face feature DB 105 (DB: database), a face recognition unit 106, a result determination unit 107, an image classification unit 108, and a per-classification storage DB 109.

Image data from outside is taken into the photo classification device 100 by the image input unit 101, which sends it to the face detection unit 102. The face detection unit 102 extracts a face image from the image data. The extracted face image is sent to the expression detection unit 103 and the face feature extraction unit 104. The expression detection unit 103 creates expression information from the face image. The expression information is recorded in the face feature DB 105 together with the face feature information extracted by the face feature extraction unit 104. The face recognition unit 106 also collates the newly obtained information against the information already recorded in the face feature DB 105, and the result determination unit 107 uses the collation result to decide whether the faces belong to the same person. Once the person has been determined, the image classification unit 108 classifies the image by person, and the result is saved in the per-classification storage DB 109.

<Description of FIG. 2: Registration Operation of Photo Classification Device 100>
The photo classification device 100 operates quite differently depending on whether it is registering the face information of a person to be found or collating a person against registered face information. FIG. 2 shows the configuration used when the device registers the face information of a person to be found. In FIG. 2, the image input unit 101, the face detection unit 102, the expression detection unit 103, the face feature extraction unit 104, and the face feature DB 105 are in operation. The face recognition unit 106, the result determination unit 107, the image classification unit 108, and the per-classification storage DB 109 shown in FIG. 1 are omitted from FIG. 2 only because they are not used during registration; they have not been physically removed from the photo classification device 100.

<Description of FIG. 3: Processing Flow of FIG. 2>
FIG. 3 shows the procedure by which the photo classification device 100, in the state of FIG. 2, registers the face information of a person to be found. First, a digital image showing the person the user wants to register is input through the image input unit 101 (step 301). The image input unit 101 may be structured to read stored images from a flash memory card, a magnetic recording device, or a disk. It may instead have an optical scanner function that reads printed matter or photographic film and creates digital image data. It may also have the functions of a digital camera, with a camera lens and an optical element that capture a real-world scene and convert it into digital image data. The recording format of the input digital image data is not limited.

The face detection unit 102 determines whether a person appears in the digital image data input through the image input unit 101 and, if so, measures the position and size of the face (step 302). Many face detection methods, such as the AdaBoost method and Haar-type feature detection, are widely known and in practical use. If multiple people are detected in the input digital image data, the device may let the user select which person to register. Alternatively, all of the detected people may be registered, in which case the subsequent processing is repeated for each person.

If no person is detected, no person can be registered, so the process ends (step 303).

If a person is detected, the expression detection unit 103 detects the facial expression in the face region found by the face detection unit 102 (step 304). A human observer classifies expressions as, for example, laughing or angry, and the expression detection unit 103 can detect them in the same way. Specifically, as with face detection, a method such as AdaBoost or Haar-type feature detection is used to score the similarity of the face to modeled expressions, and the expression is judged by which model the face is closest to.

For the purposes of explanation, this embodiment expresses the similarity of the expression as an integer score from 0 to 100. The scoring scheme is not limited to this: other score ranges (such as 0 to 1000), negative scores, real-valued scores, or scores without upper and lower limits may be used. To keep the explanation simple, this embodiment also judges only between a neutral expression and a smile, but the expressions to be distinguished are not limited and may include anger, sadness, and so on. Within the 0 to 100 range, 0 represents a neutral expression, 100 represents the fullest smile, and intermediate scores represent the normalized degree of the expression.

A device that decides its operation upon detecting a smile, as in Patent Document 3, applies a threshold to this score, for example judging the face to be a smile when the score exceeds 80, and so reduces the result to the two values "neutral" and "smile". In this embodiment the score may be divided into any number of classes. For example, 0 to 20 points is expression 1, 20 to 40 points is expression 2, 40 to 60 points is expression 3, 60 to 80 points is expression 4, and 80 to 100 points is expression 5, so that expressions 1 through 5 represent five levels running from neutral to smiling. In this example the expression detection unit uses this five-level judgment, but division into another number of levels, such as 6 or 100, is not excluded.
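The five-level division described above can be sketched as follows. This is a minimal illustration rather than the method of the embodiment: the function name `expression_level` is hypothetical, and because the text gives ranges that share their boundary values (20, 40, and so on), the sketch assumes half-open ranges in which a boundary score belongs to the higher level.

```python
def expression_level(score: int, num_levels: int = 5, max_score: int = 100) -> int:
    """Map a smile score in [0, max_score] to an expression level 1..num_levels.

    Assumption: ranges are half-open, e.g. [0, 20) is level 1 and [20, 40)
    is level 2; the maximum score is placed in the top level.
    """
    if not 0 <= score <= max_score:
        raise ValueError("score out of range")
    width = max_score / num_levels
    # Integer part of score / width gives the zero-based bin index.
    return min(int(score // width) + 1, num_levels)


print(expression_level(10))   # neutral end of the scale
print(expression_level(85))   # smiling end of the scale
```

The same function covers other divisions, such as 6 or 100 levels, by changing `num_levels`.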

The face feature extraction unit 104 extracts face features from the face detected by the face detection unit 102 (step 305). The extracted face features are used for the personal identification described later. Many methods exist, such as the AdaBoost method and Haar-type feature detection mentioned for face detection, and the features are numerical values suited to whichever method is adopted.

The expression level detected by the expression detection unit 103 and the face features extracted by the face feature extraction unit 104 are recorded in the face feature DB 105 (step 306). Whose face the extracted features belong to is recorded as the person, based on input from the user. This completes the processing for registering the face information of a person to be found.

<Description of FIG. 4: Recording Structure of Face Feature DB 105>
FIG. 4 shows the structure in which the expression level detected by the expression detection unit 103 and the face features extracted by the face feature extraction unit 104 are recorded in the face feature DB 105 in step 306 of FIG. 3. The face feature DB 105 divides the records for each person by expression level, so that face feature amounts can be recorded for each expression of each person. If the same expression of the same person has already been registered when a feature is registered, the DB may allow multiple features to be registered, or the new feature may be averaged with the feature amounts already registered.
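One way to picture this recording structure is a two-level mapping from person to expression level to feature vectors. The sketch below is an assumption made for illustration: the class `FaceFeatureDB`, its method names, and the representation of a face feature as a list of floats are all hypothetical, and the averaging shown is a simple element-wise mean of the stored vector and the new one.

```python
from collections import defaultdict


class FaceFeatureDB:
    """Hypothetical in-memory layout: person -> expression level -> feature vectors."""

    def __init__(self):
        self._records = defaultdict(lambda: defaultdict(list))

    def add(self, person, level, feature, average=False):
        """Register a feature; either keep several per slot or average into one."""
        slot = self._records[person][level]
        if average and slot:
            # Element-wise mean of the stored vector and the new vector.
            slot[0] = [(old + new) / 2 for old, new in zip(slot[0], feature)]
        else:
            slot.append(list(feature))

    def features(self, person, level):
        """Return copies of the features stored for one person and expression level."""
        by_level = self._records.get(person)
        if by_level is None:
            return []
        return [list(f) for f in by_level.get(level, [])]


db = FaceFeatureDB()
db.add("A", 2, [1.0, 3.0])
db.add("A", 2, [3.0, 5.0])                 # second feature in the same slot
db.add("B", 1, [0.0, 2.0], average=True)
db.add("B", 1, [2.0, 4.0], average=True)   # merged with the first by averaging
print(db.features("A", 2))
print(db.features("B", 1))
```

Keeping several vectors per slot preserves more information at the cost of more comparisons, while averaging keeps exactly one comparison per person and expression level.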

<Description of FIG. 5: Collation Operation of Photo Classification Device 100>
FIG. 5 shows the configuration used when the photo classification device 100 collates a person against registered face information. All of the blocks described for FIG. 1 are used during collation.

<Description of FIG. 6: Processing Flow of FIG. 5>
FIG. 6 shows the procedure by which the photo classification device 100, in the state of FIG. 5, collates the face information of a person to be found. First, a digital image showing the person the user wants to classify is input through the image input unit 101 (step 601). The image input unit 101 may be structured to read stored images from a flash memory card, a magnetic recording device, or a disk. It may instead have an optical scanner function that reads printed matter or photographic film and creates digital image data. It may also have the functions of a digital camera, with a camera lens and an optical element that capture a real-world scene and convert it into digital image data. The recording format of the input digital image data is not limited.

The face detection unit 102 determines whether a person appears in the digital image data input through the image input unit 101 and, if so, measures the position and size of the face (step 602). Many face detection methods, such as the AdaBoost method and Haar-type feature detection, are widely known and in practical use, and this embodiment does not limit the detection method. If multiple people are detected in the input digital image data, the device may let the user select which person to organize by. Alternatively, all of the detected people may be processed, in which case the subsequent processing is repeated for each person.

If no person is detected, there is no one to collate, so the process ends (step 603).

If a person is detected, the expression detection unit 103 detects the facial expression in the face region found by the face detection unit 102 (step 604). The expression detection here must use the same method and the same criteria as step 304 of FIG. 3. The present invention is characterized by classifying by expression, so correct results cannot be obtained if the expression classification differs between registration and collation.

Next, the face feature extraction unit 104 extracts face features from the face region found by the face detection unit 102 (step 605). The extracted face features are used for the personal identification described later. Many methods exist, such as the AdaBoost method and Haar-type feature detection mentioned for face detection, and the features are numerical values suited to whichever method is adopted. The feature amounts must, however, be extracted with the same method and the same criteria as step 305 of FIG. 3.

The face is then collated against the people registered in the face feature DB 105, using the registered expression that belongs to the same class as the expression detected by the expression detection unit 103, to determine who the person is. In this example the people in FIG. 4 are examined in the order A, B, C, D, and so on, but any order may be used. From the face features of the person being examined, the face features with the same expression as that detected in step 604 are selected from the face feature DB 105 (step 606). If the same expression has not been registered, the face features of a nearby expression are selected. For instance, if the expression detected in step 604 is expression 2 and expression 2 of that person is not registered in the face feature DB, then expression 1 and expression 3 are the nearby expressions. To choose between them, the score in the 20 to 40 range that led to the judgment of expression 2 in step 604 is used to decide which of the two is closer.
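The selection in step 606, including the fallback to a nearby expression, might look like the sketch below. It is an illustration under stated assumptions: the function name `select_level` is hypothetical, levels are assumed to be 20 points wide with half-open ranges, "closer" is taken to mean the smallest distance from the score to the edge of a registered level's range, and ties go to the lower level.

```python
def select_level(score, detected_level, registered_levels, width=20):
    """Pick which registered expression level to compare against.

    Returns the detected level if it is registered; otherwise the registered
    level whose score range lies nearest to the score; None if nothing is
    registered for this person.
    """
    if detected_level in registered_levels:
        return detected_level
    if not registered_levels:
        return None

    def distance(level):
        low, high = (level - 1) * width, level * width
        if score < low:
            return low - score
        if score >= high:
            return score - high
        return 0

    # Sorting first makes the tie-break deterministic (lower level wins).
    return min(sorted(registered_levels), key=distance)


# Expression 2 was detected but only expressions 1 and 3 are registered.
print(select_level(25, 2, {1, 3}))  # score sits near the 20-point boundary
print(select_level(35, 2, {1, 3}))  # score sits near the 40-point boundary
```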

The face recognition unit 106 compares the face features obtained in step 605 with the face features retrieved from the face feature DB 105 in step 606 to determine whether they belong to the same person (step 607). As with face detection, many methods are available for comparison by face recognition, such as the AdaBoost method and Haar-type feature detection, and any of them may be used.

If, as a result of the comparison (step 608), the result determination unit 107 cannot conclude that the faces belong to the same person, the process returns to step 606 and checks another person (step 609). If comparison by face recognition against every person registered in the face feature DB 105 fails to confirm a match, the person is not registered, so the attempt to organize the image is abandoned and the process ends. Alternatively, the face may at this point be treated as a newly found, unregistered person and newly registered in the face feature DB 105.
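The loop through steps 606 to 609 can be sketched as below. The embodiment leaves the comparison method open, so this illustration assumes, purely for the sake of a runnable example, that features are numeric vectors and that two faces match when their Euclidean distance is within a threshold. The function name `identify`, the dictionary layout, and the threshold value are hypothetical, and the fallback to a nearby expression is omitted for brevity.

```python
import math


def identify(query_feature, query_level, db, threshold=0.5):
    """Compare a query face only against features stored at the same expression level.

    db maps person -> expression level -> list of feature vectors.
    Returns (person or None, number of comparisons performed).
    """
    comparisons = 0
    for person, by_level in db.items():                 # steps 606 and 609
        for candidate in by_level.get(query_level, []):
            comparisons += 1                            # step 607
            if math.dist(query_feature, candidate) <= threshold:  # step 608
                return person, comparisons
    return None, comparisons                            # not a registered person


db = {
    "A": {1: [[0.0, 0.0]], 5: [[9.0, 9.0]]},
    "B": {1: [[5.0, 5.0]], 5: [[1.0, 1.0]]},
}
print(identify([5.1, 5.0], 1, db))
print(identify([3.0, 3.0], 1, db))
```

Although four feature vectors are stored in this example, a level-1 query is compared against at most two of them.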

If the comparison in step 608 confirms that the faces belong to the same person, the face feature information is added to the face feature DB 105 at the position for that person and that expression (step 610). If face feature information already exists at that position, multiple face features may be registered for the same person and position. Alternatively, the new features may be averaged with the face features already registered and then recorded. Step 610 is optional: if there is judged to be no need to further enrich the face feature DB 105, the addition need not be performed.

Next, the image classification unit 108 organizes the photograph under the person found in step 608 (step 611). The image organized here is the image data input through the image input unit 101 in step 601. Only organization by person is described here, but other information, such as the date or place at which the image data was captured, may be added to the organization. This information may be taken from the shooting information fields within the image data or accepted as input from the user.

The classification result is saved in the per-classification storage DB 109 according to the classification criteria applied by the image classification unit 108 (step 612). This classification can be used as retrieval information when the user wants to display (play back) image data. In other words, image data can be called up by specifying a person, so from the user's point of view the photographs appear organized.

With the above procedure, individual face features are classified by expression in advance, so a face can be compared against face feature amounts with a similar expression. A human face can take on many expressions even for the same person, and it is not known in advance which one will appear, so comparing only against the matching expression takes less time than comparing against every expression. If the face feature DB 105 held the same number of features for every expression and the features of each expression had entirely different characteristics, the number of comparisons could fall to 1/5, since there are five levels. Furthermore, when expressions clearly differ, the face features differ even for the same person, so restricting the comparison to the same expression also keeps down the likelihood of wrongly judging the face to be a different person (the false detection rate).
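The 1/5 figure follows directly from the share of registered features that sit in the queried expression level. A minimal sketch of that arithmetic, with a hypothetical helper name and invented counts chosen only to match the even-distribution case described above:

```python
def comparison_ratio(counts_per_level, query_level):
    """Fraction of all registered features that must be compared for one query."""
    total = sum(counts_per_level.values())
    if total == 0:
        raise ValueError("no features registered")
    return counts_per_level.get(query_level, 0) / total


# Assumed example: 100 features spread evenly over five expression levels.
even = {level: 20 for level in range(1, 6)}
print(comparison_ratio(even, 3))  # 20 of 100 features are compared
```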

In the present embodiment, the expression-based classification of facial features was described using a reference score of 0 to 100 points divided into five classes at equal intervals of 20 points each. The data structure description of FIG. 4 uses the same division. This is a model chosen to make the explanation easy to follow. In practice, some people appear to be smiling when they are not, and others do not appear to be smiling even when they are. In snapshots, moreover, people often smile only slightly, so classifying at equal intervals may skew the distribution of photographs across the classes. When the distribution is skewed, the reduction in the number of comparisons shrinks and many comparisons may still be required. For example, suppose 100 images of the same person are registered, the reference score boundary is fixed, and the images are divided into two classes, yielding one class with 1 image and another with 99. The class with 1 image needs only one comparison, but the other needs 99, so the reduction is small. The class with 1 image also cannot be compared against multiple images and may therefore produce a wrong answer. The boundary scores are therefore varied according to the distribution. Rather than fixing the classes by score range, the boundaries should be set so that each class holds as nearly the same number of registered facial feature amounts as possible. In this case, the boundary scores may differ from person to person. Even if one person is compared using the facial feature amounts in the second-lowest class while another person is compared using the third class, the comparison result is not affected.
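The following sketch shows one way the boundaries could be set per person so that each class holds about the same number of registered features. The equal-count (quantile) rule and the function names `equal_count_boundaries`, `classify`, and `class_sizes` are assumptions made for illustration; the specification only states that the classes should hold as nearly equal numbers as possible. The sample scores reproduce the 1 versus 99 example above.

```python
from bisect import bisect_right
from collections import Counter


def equal_count_boundaries(scores, num_classes):
    """Return num_classes - 1 boundary scores taken from the sorted scores.

    Assumes len(scores) >= num_classes. Tied scores can still leave the
    classes slightly unequal.
    """
    ordered = sorted(scores)
    n = len(ordered)
    return [ordered[i * n // num_classes] for i in range(1, num_classes)]


def classify(score, boundaries):
    """Class index = number of boundaries at or below the score."""
    return bisect_right(boundaries, score)


def class_sizes(scores, boundaries):
    """Count how many scores fall in each class."""
    counts = Counter(classify(s, boundaries) for s in scores)
    return [counts.get(c, 0) for c in range(len(boundaries) + 1)]


# 100 registered images of one person: one low score, 99 slight smiles.
scores = [5.0] + [60 + 0.25 * i for i in range(99)]

fixed = class_sizes(scores, [50.0])               # fixed boundary at 50 points
person_bounds = equal_count_boundaries(scores, 2)  # boundary chosen per person
adaptive = class_sizes(scores, person_bounds)

print(fixed)     # [1, 99]
print(adaptive)  # [50, 50]
```

Because the boundaries are computed from each person's own registered scores, they would be stored per person, and an input face would be assigned to a class using the boundaries of the person it is being compared against.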

The present embodiment has been described as a photo classification device. However, the same function may be incorporated into other electronic devices not described in this embodiment, such as digital cameras, mobile phones, PDAs, and personal computers. The blocks of the photo classification device may also be divided among a plurality of devices that operate together through connection means such as a network. The function may also be provided as a device, an LSI, or a module that runs as software.

The photo classification device according to the present invention performs person matching using facial feature amounts in order to classify photographs. It detects the expression when a person is registered and organizes the feature amounts by expression class; at classification time, it detects the expression of the face to be classified and compares only against feature amounts belonging to the same expression class. This reduces the number of comparisons, and the device is useful in that it provides a mechanism for organizing digital photographs even in quantities of thousands or tens of thousands, which a person cannot easily organize.

DESCRIPTION OF SYMBOLS
100 Photo classification device
101 Image input unit
102 Face detection unit
103 Expression detection unit
104 Face feature extraction unit
105 Face feature DB
106 Face recognition unit
107 Result determination unit
108 Image classification unit
109 Storage DB for each classification

Claims

An image classification device for classifying image data, comprising:
a face feature database that records data on features of a plurality of facial expressions of persons registered in advance;
a face detection unit that extracts a person's face from input image data;
a facial expression identification unit that identifies the person's facial expression from the extracted face image;
a face identification unit that compares the facial expression feature data in the face feature database corresponding to the facial expression identified by the facial expression recognition unit with the feature data of the face image detected by the face detection unit, and determines whether the person matches; and
an image classification unit that classifies the input image data based on the face identification determination.

The image classification device according to claim 1, wherein the face identification unit uses the feature data classified by the facial expression identification unit and determines whether the person matches by preferentially using the feature data having the same facial expression.

The image classification device according to claim 2, wherein the face recognition unit reclassifies the feature database classified by the facial expression identification unit, varying the classification for each person based on the distribution of that person's facial expressions, and then determines whether the person matches using the feature data of the same facial expression.