JP7034781B2

JP7034781B2 - Image processing apparatus, image processing method, and program

Info

Publication number: JP7034781B2
Application number: JP2018049364A
Authority: JP
Inventors: 保彦岩本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-03-16
Filing date: 2018-03-16
Publication date: 2022-03-14
Anticipated expiration: 2038-03-16
Also published as: JP2019161583A

Description

The present invention relates to an image processing apparatus, an image processing method, and a program that identify a subject area in each of a plurality of acquired images and perform image processing according to the identified subject area.

There is a technique for performing subject search processing, in which image analysis is used to identify the subject area where a subject exists in each of a plurality of images arranged in time series. Patent Document 1 discloses a technique that performs subject search processing using template matching and also performs subject area correction processing, which corrects the subject area by referring to image information of a peripheral area that includes the subject area.

Japanese Unexamined Patent Publication No. 2014-127154

However, the method described in Patent Document 1 separates the subject area from other areas based only on the level of the image signal. When the two cannot be separated by color or luminance, for example when different subjects of the same color are close to each other, the subject area may not be identified correctly.
In view of the above problem, an object of the present invention is to provide an image processing apparatus and an image processing method that accurately identify a subject area using depth information.

The image processing apparatus according to the present invention comprises: first acquisition means for acquiring images; second acquisition means for acquiring, for each of the images, a plurality of defocus maps generated with different block sizes; search means for searching for a target subject based on the image and the defocus map; correction means for correcting the subject area determined by the search means, based on the image and the defocus map; and selection means for selecting, from among the plurality of defocus maps, the defocus map used by the correction means to correct the subject area.

According to the present invention, using defocus information when identifying a subject area in an image allows the subject area to be identified with high accuracy.

A diagram showing a configuration example of the image pickup apparatus in an embodiment of the present invention.
A diagram showing a configuration example of the image sensor in the present embodiment.
A diagram explaining the subject search processing in the present embodiment.
A diagram showing an example of parallax images and a defocus map in the present embodiment.
A flowchart showing an example of the processing of the image pickup apparatus in the present embodiment.
A flowchart showing an example of the subject search processing in the present embodiment.
A flowchart showing an example of the defocus map selection processing in the present embodiment.
A diagram showing an example of parallax images and a defocus map in the present embodiment.
A diagram showing an example of parallax images and a defocus map in the present embodiment.
A diagram showing an example of the labeling processing in the present embodiment.
A flowchart showing an example of the subject area correction processing in the present embodiment.
A diagram showing an example of the subject area correction processing using a labeled defocus map in the present embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

<Imaging system>
An image pickup apparatus according to one embodiment of the present invention will be described.
In the present embodiment, when subject search processing and subject area correction processing like those of Patent Document 1 are performed on a plurality of images arranged in time series, defocus information is used. This defocus information is obtained by the phase difference detection method from the output of an image sensor that receives pupil-divided subject light.

Here, a defocus map generated by the phase difference detection method from the output of a pupil-division image sensor has different properties depending on the size of the minute block used in the correlation calculation. Specifically, if the minute block is small, the amount of data used for the correlation calculation is small, and the defocus amount may vary within a region that lies at a single distance. On the other hand, if the minute block is large, the shape of the subject's contour may not be captured accurately on the defocus map because of perspective conflict (near and far subjects falling within the same block).

In addition, the subject search processing and the subject area correction processing place importance on different kinds of performance. The subject search processing searches a wide area of the image for a rough subject position, so the accuracy of discriminating the boundary between the subject and the background may be low, but the background must be identified stably. The subject area correction processing, on the other hand, finely corrects the subject area starting from the roughly identified subject position, so the background need not be identified stably, but the boundary between the subject and the background must be discriminated with high accuracy.

That is, when a defocus map generated with minute blocks of a single size is used for both the subject search processing and the subject area correction processing, it may not be possible to perform both processes accurately.
Therefore, in the present embodiment, a plurality of defocus maps are generated with different block sizes, the block being the minimum unit for calculating defocus information. These maps are used selectively in the subject search processing and the subject area correction processing, which improves the accuracy of both processes and thus the accuracy with which the subject area is identified.

FIG. 1 is a block diagram showing a configuration example of the image pickup apparatus 100 according to the present embodiment. The image pickup apparatus 100 is embodied as, for example, a digital camera that captures an image of a subject. The image pickup apparatus 100 also functions as a subject area tracking apparatus that tracks a subject included in images arranged in time series. The image pickup apparatus 100 includes an image pickup optical system 101, an image sensor 102, an analog signal processing unit 103, an A/D (analog/digital) conversion unit 104, a control unit 105, an image processing unit 106, a display unit 107, and a recording medium 108. The image pickup apparatus 100 further includes a subject designation unit 109, a subject tracking unit 110, a ROM (Read Only Memory) 121, and a RAM (Random Access Memory) 122.

In the image pickup apparatus 100, light from the subject is collected by the image pickup optical system 101, which includes an image pickup lens, and is incident on the image sensor 102. The image sensor 102 has a plurality of photoelectric conversion elements, each of which outputs an electric signal corresponding to the intensity of the incident light, and photoelectrically converts the subject image (optical image) into an electric signal. Details of the pixel array configuration of the image sensor 102 will be described later. The analog signal processing unit 103 performs analog signal processing such as correlated double sampling (CDS) on the image signal output from the image sensor 102. The A/D conversion unit 104 converts the analog image signal output from the analog signal processing unit 103 into digital data. The digital image signal converted by the A/D conversion unit 104 (hereinafter also simply referred to as an image) is input to the control unit 105 and the image processing unit 106.

The control unit 105 is, for example, a CPU (Central Processing Unit) or a microcontroller, and controls the operation of the image pickup apparatus 100. The control unit 105 is an example of the selection means. The control unit 105 controls each functional unit of the image pickup apparatus 100 by loading the program code stored in the ROM 121 into the work area of the RAM 122 and executing it sequentially. The control unit 105 can also control imaging conditions, such as the focus state and the exposure state, used when the image sensor 102 captures an image. For example, the control unit 105 controls a focus control mechanism and an exposure control mechanism (neither shown) of the image pickup optical system 101 based on the image signal output from the A/D conversion unit 104. The focus control mechanism is, for example, an actuator that drives the image pickup lens included in the image pickup optical system 101 in the optical axis direction, and the exposure control mechanism is, for example, an actuator that drives the aperture and shutter included in the image pickup optical system 101.

The image processing unit 106 performs image processing such as gamma correction and white balance processing on the input digital image signal. The image processing unit 106 is an example of the first acquisition means and the second acquisition means. In the present embodiment, an image of the imaging surface can be acquired by adding the image signals of two different pupil regions, and two images with different parallax (parallax images) can be acquired by handling the image signals of the two pupil regions separately. In the description of the present embodiment, the sum of the image signals of the two different pupil regions is called the (A+B) image, and the separately handled image signals of the two pupil regions are called the A image and the B image. The image processing unit 106 also performs image processing using information about the subject area in the image supplied from the subject tracking unit 110, and generates defocus maps. Further, the image processing unit 106 performs binarization processing and labeling processing on the defocus maps.

The defocus maps may be generated by a known pupil-division phase difference detection method. For example, the image processing unit 106 performs a correlation calculation on the image signals of the two different pupil regions and calculates the phase difference, that is, the amount of image shift between the A image and the B image, which have different parallax. The image processing unit 106 then generates a defocus map by calculating the defocus amount from the calculated phase difference (image shift amount) between the A image and the B image. A defocus map is a map that holds a defocus amount for each pixel, and the defocus amount is expressed in units of Fδ. In the present embodiment, the image processing unit 106 generates a plurality of defocus maps that differ in the size of the minute block on which the correlation calculation is performed. For example, it generates two defocus maps with the minute block size set to one times and two times a reference size. The reference size is determined based on the minimum subject size handled by the subject tracking unit 110. If the parallax images input to defocus map generation and the input image of the subject tracking unit 110 differ in resolution, the minimum subject size converted to the scale of the parallax images is used as the reference size.
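The generation of defocus maps with two block sizes can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the embodiment: the correlation measure is taken to be a sum of absolute differences (SAD), the parallax is assumed to be purely horizontal, image borders are handled by clamping, and the coefficient `k` that converts an image shift into a defocus amount is a hypothetical placeholder. The function names are also hypothetical.

```python
def sad(a, b, y0, x0, shift, block):
    """SAD between a block of A around (y0, x0) and the same block of B
    displaced horizontally by `shift` (assumed correlation measure)."""
    h, w = len(a), len(a[0])
    half = block // 2
    total = 0
    for y in range(y0 - half, y0 - half + block):
        for x in range(x0 - half, x0 - half + block):
            yy = min(max(y, 0), h - 1)          # clamp rows at the border
            xa = min(max(x, 0), w - 1)          # clamp columns of A
            xb = min(max(x + shift, 0), w - 1)  # clamp columns of B
            total += abs(a[yy][xa] - b[yy][xb])
    return total


def shift_map(a, b, block, max_shift):
    """Per-pixel horizontal shift that minimises the SAD for one block size."""
    h, w = len(a), len(a[0])
    shifts = range(-max_shift, max_shift + 1)
    return [[min(shifts, key=lambda s: sad(a, b, y, x, s, block))
             for x in range(w)] for y in range(h)]


def defocus_maps(a, b, reference_size, max_shift=3, k=1.0):
    """Defocus maps for block sizes of 1x and 2x the reference size.
    `k` is a hypothetical shift-to-defocus conversion coefficient."""
    maps = {}
    for scale in (1, 2):
        block = reference_size * scale
        maps[block] = [[k * s for s in row]
                       for row in shift_map(a, b, block, max_shift)]
    return maps
```

The returned dictionary is keyed by block size, so a later stage can pick the map generated with the block size it needs.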

The display unit 107 is, for example, an LCD (Liquid Crystal Display) or an organic EL (electroluminescence) display, and displays an image based on the image signal supplied from the image processing unit 106. By displaying the images captured in time series by the image sensor 102 on the display unit 107, the image pickup apparatus 100 can make the display unit 107 function as an electronic viewfinder (EVF). The display unit 107 can also display, for example, the position of the subject area containing the subject being tracked by the subject tracking unit 110. The image signal output from the image processing unit 106 can be recorded on the recording medium 108. The recording medium 108 is, for example, a memory card that can be attached to and detached from the image pickup apparatus 100. The image signal output from the image processing unit 106 may instead be recorded in a memory built into the image pickup apparatus 100 or in an external device communicably connected to the image pickup apparatus 100.

The subject designation unit 109 designates the subject to be tracked in the image. The subject designation unit 109 is an input interface including, for example, a touch panel and buttons, through which the user (photographer) can designate any subject included in the image as the tracking target. The subject tracking unit 110 is sequentially supplied with image signals and defocus amounts in time series from the image processing unit 106, and tracks the subject included in images captured at different times. The subject tracking unit 110 detects and tracks the subject designated by the subject designation unit 109 based on the pixel pattern and feature amounts of the subject in the image and the defocus amounts in the defocus map. The subject tracking unit 110 includes a template registration unit 111, a color feature amount extraction unit 112, a distance feature amount calculation unit 113, a template matching unit 114, and a size estimation unit 115.

The template registration unit 111 extracts a partial area corresponding to the subject to be tracked from the images sequentially supplied in time series from the image processing unit 106, and registers it as a template image. The registered template image is enlarged or reduced as appropriate according to the processing result of the size estimation unit 115 before being input to the template matching unit 114. The template image may also be updated for each sequentially supplied image to account for changes in the appearance of the subject. In that case, the range of the partial area registered as the template image may be changed according to the processing result of the size estimation unit 115.

The color feature amount extraction unit 112 extracts and holds a color feature amount for the subject to be tracked from the images sequentially supplied in time series from the image processing unit 106. Specifically, the color feature amount extraction unit 112 extracts the color feature amount from the pixel value information of the area corresponding to the subject designated by the subject designation unit 109. For example, when the image supplied from the image processing unit 106 is expressed in the HSV color space, the color feature amount extraction unit 112 uses the hue as the pixel value information and extracts, as the color feature amount, the gradations of the hue histogram whose frequency (number of elements) is equal to or greater than a predetermined threshold. The threshold is, for example, 25%. The hue histogram divides the hue information, expressed as 0 to 359, into 12 gradations, and its frequencies are normalized by the number of elements. Here, normalization means dividing by the area (number of pixels) of the region from which the histogram is acquired.
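The hue-histogram extraction described above can be sketched as follows. The 12 gradations and the 25% threshold come from the description; the function name and the representation of the subject region as a flat list of hue values are assumptions made for illustration.

```python
def dominant_hue_bins(hues, bins=12, threshold=0.25):
    """Return the indices of hue bins whose normalised frequency is at
    least `threshold`.

    `hues` holds the hue (0..359) of every pixel in the subject region.
    """
    hues = list(hues)
    if not hues:
        return []
    hist = [0] * bins
    for h in hues:
        hist[(int(h) % 360) * bins // 360] += 1  # 30-degree bins when bins=12
    n = len(hues)                                # normalise by the pixel count
    return [i for i, count in enumerate(hist) if count / n >= threshold]
```

For a region in which half of the pixels have hue 10, 30% have hue 100, and 20% have hue 200, bins 0 and 3 are kept and bin 6 is dropped because it falls below the 25% threshold.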

The distance feature amount calculation unit 113 calculates and holds a distance feature amount for the subject to be tracked from the defocus maps sequentially supplied in time series from the image processing unit 106. The distance feature amount may be calculated as the average of the defocus amounts in the area corresponding to the subject designated by the subject designation unit 109. It is not limited to the average; other statistics such as the median or the variance may be used. The distance feature amount may also be updated for each sequentially supplied image to account for movement of the subject along the optical axis and changes in its apparent size. In that case, the range of the partial area from which the distance feature amount is calculated may be changed according to the processing result of the size estimation unit 115.
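As a minimal sketch of the distance feature amount, the following function reduces the defocus amounts inside a rectangular subject area to a single value. The `(x, y, width, height)` region layout, the row-major list-of-lists map, and the function name are assumptions; the choice between mean and median mirrors the alternatives mentioned above.

```python
from statistics import mean, median


def distance_feature(defocus_map, region, reducer=mean):
    """Reduce the defocus amounts inside `region` to one feature value.

    `region` is (x, y, width, height) in map coordinates (assumed layout).
    Pass `reducer=median` to use the median instead of the mean.
    """
    x, y, w, h = region
    values = [defocus_map[r][c]
              for r in range(y, y + h)
              for c in range(x, x + w)]
    return reducer(values)
```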

The template matching unit 114 searches for the subject in the (A+B) images and defocus maps sequentially supplied in time series from the image processing unit 106. The template matching unit 114 is an example of the search means. In the search processing, the template matching unit 114 refers, as the subject model, to the template image held in the template registration unit 111 and the distance feature amount held in the distance feature amount calculation unit 113.

The size estimation unit 115 estimates the size of the subject for the input group sequentially supplied in time series from the image processing unit 106, with reference to a reference area and feature amounts. Here, the input group is the image and the defocus map input from the image processing unit 106. The reference area is the subject area determined by the template matching unit 114, and the feature amounts are the color feature amount extracted by the color feature amount extraction unit 112 and the distance feature amount calculated by the distance feature amount calculation unit 113. For each pixel of the supplied image, the size estimation unit 115 determines whether the pixel matches the color feature amount extracted by the color feature amount extraction unit 112 and whether it matches the distance feature amount calculated by the distance feature amount calculation unit 113. The size estimation unit 115 treats pixels determined to match both the color feature amount and the distance feature amount as feature pixels, and determines whether to enlarge or reduce the subject size relative to the reference area according to the distribution of the feature pixels. However, a minimum subject size is set in advance, and the subject is never reduced below it.
The minimum subject size is determined as a ratio to the input image. For example, if the input image is VGA, the minimum subject size is 32 pixels. The subject area is updated based on the subject size estimated by the size estimation unit 115 and is supplied to the control unit 105 and the image processing unit 106. The estimated subject size is also passed to the template registration unit 111 and the distance feature amount calculation unit 113, where the corresponding information is updated. The size estimation unit 115 is an example of the correction means.
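The feature-pixel test used for size estimation might look like the sketch below. The description does not state how a pixel is judged to "match" the distance feature amount, nor the exact rule for enlarging or reducing the area, so the tolerance `tol`, the per-pixel hue image, and both function names are hypothetical; only the idea of requiring both a color match and a distance match, and the lower bound on the subject size, come from the text.

```python
def feature_mask(hue_img, defocus_map, color_bins, distance_feature,
                 bins=12, tol=1.0):
    """Mark pixels that match both the color and the distance feature.

    A pixel matches when its hue falls in one of `color_bins` and its
    defocus amount is within `tol` of `distance_feature` (assumed rule).
    """
    mask = []
    for hue_row, defocus_row in zip(hue_img, defocus_map):
        mask.append([
            ((int(h) % 360) * bins // 360) in color_bins
            and abs(d - distance_feature) <= tol
            for h, d in zip(hue_row, defocus_row)
        ])
    return mask


def clamp_subject_size(size, min_size=32):
    """Never shrink the subject below the preset minimum subject size."""
    return max(size, min_size)
```

The default of 32 pixels corresponds to the VGA example; other input resolutions would need a minimum size scaled by the same ratio.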

<Pixel array configuration of image sensor>
Next, the image sensor 102 in the present embodiment will be described. FIG. 2(a) is a diagram showing the pixel array configuration of the image sensor 102 shown in FIG. 1. In the image sensor 102, as shown in FIG. 2(a), the pixels 200 are regularly arranged in a two-dimensional matrix. As shown in FIG. 2(b), each pixel 200 is composed of a microlens 201 and a pair of photoelectric conversion units 202A and 203B (hereinafter also referred to as pupil division pixels 202A and 203B).

In the present embodiment, the A image and the B image are output as parallax images from the pupil division pixels 202A and 203B, which are regularly arranged in a two-dimensional matrix. With the image sensor 102 configured as shown in FIG. 2(a), a pair of light fluxes passing through different regions of the pupil of the image pickup optical system 101 are formed into a pair of optical images, which can be output as the A image and the B image. In the present embodiment, the image processing unit 106 generates the defocus maps with reference to the A image and the B image. The method of acquiring the A image and the B image is not limited to this. For example, images with mutual parallax acquired from a plurality of cameras installed at spatial intervals may be used as the A image and the B image, or parallax images obtained from a single camera having a plurality of optical systems and imaging units may be used as the A image and the B image.

<Processing in the template matching unit>
Next, the processing in the template matching unit 114 in the present embodiment will be described. FIG. 3 is a diagram explaining the subject search processing in the template matching unit 114. FIG. 3(a) shows the (A+B) image at the time the subject was designated, and FIG. 3(b) shows the binarized defocus map corresponding to the (A+B) image of FIG. 3(a). The area 300 in FIG. 3(a) is the area from which the template image was cut out, and the area 301 in FIG. 3(b) is the corresponding area on the defocus map, from which the distance feature amount was calculated. FIG. 3(c) shows the (A+B) image subject to the subject search processing, and FIG. 3(d) shows the binarized defocus map corresponding to the (A+B) image of FIG. 3(c).

When performing the subject search processing, the template matching unit 114 sets a search area 302 and a window area 303 as shown in FIG. 3(c). The search area 302 is preferably the entire (A+B) image, but it may instead be an area whose centroid coincides with that of the subject area in the previous image and whose sides are N times as long. Further, as shown in FIG. 3(c), the template matching unit 114 sets a plurality of window areas 303 of the same size as the area 300 inside the search area 302, shifting them pixel by pixel in two dimensions. At the same time, the template matching unit 114 sets an area 304 on the defocus map corresponding to each window area 303 in the (A+B) image.
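The search area and window placement described above can be sketched as follows. Rectangles are represented as `(x, y, width, height)` tuples, and the search area is clipped to the image bounds; both the representation and the clipping are assumptions, as are the function names.

```python
def search_area(prev_subject, scale, img_w, img_h):
    """Area centred on the previous subject area with sides `scale` times
    as long, clipped to the image (clipping may move the centre)."""
    x, y, w, h = prev_subject
    cx, cy = x + w / 2, y + h / 2
    sw, sh = w * scale, h * scale
    left = max(0, int(round(cx - sw / 2)))
    top = max(0, int(round(cy - sh / 2)))
    right = min(img_w, int(round(cx + sw / 2)))
    bottom = min(img_h, int(round(cy + sh / 2)))
    return left, top, right - left, bottom - top


def windows(area, win_w, win_h):
    """Yield every window of the template size inside `area`,
    shifted one pixel at a time in both directions."""
    x, y, w, h = area
    for wy in range(y, y + h - win_h + 1):
        for wx in range(x, x + w - win_w + 1):
            yield wx, wy, win_w, win_h
```

Because the defocus map is aligned with the (A+B) image in this sketch, the same window rectangle can be used to read the corresponding area on the defocus map.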

The template matching unit 114 calculates the degree of correlation between the template image cut out from the area 300 and the image cut out from the window area 303, and also calculates the degree of similarity between the distance feature amount calculated from the area 301 and the distance evaluation value calculated from the area 304. The template matching unit 114 then determines the subject area based on the calculated degree of correlation and degree of similarity. Here, the degree of correlation may be the sum of the differences between the pixel values of corresponding pixels in the template image cut out from the area 300 and the image cut out from the window area 303; the smaller the sum, the higher the correlation. The distance evaluation value may be calculated from the defocus amounts at the coordinates of the area 304 in the same way as the distance feature amount. The degree of similarity may be the absolute value of the difference between the distance evaluation value and the distance feature amount; the smaller the value, the higher the similarity. These methods of calculating the degree of correlation, the distance evaluation value, and the degree of similarity are only examples, and the calculation is not limited to them. For example, the degree of correlation may be obtained by another method such as normalized cross-correlation, and the degree of similarity may be calculated using variance.
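As an illustration only, the example calculations above can be written as follows. The symbols are assumptions introduced here and do not appear in the original text: T and W denote the template image and the window image, F the distance feature amount, and E the distance evaluation value. The difference sum is read here as a sum of absolute differences, which is one common interpretation.

```latex
C = \sum_{(x,y)} \left| T(x,y) - W(x,y) \right|, \qquad S = \left| E - F \right|
```

Under this reading, a smaller C indicates a higher degree of correlation, and a smaller S indicates a higher degree of similarity.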

<Defocus map generated by the image processing unit>
Next, the defocus map generated by the image processing unit 106 in the present embodiment will be described. FIG. 4 shows examples of microblocks set in the A image and the B image, and of defocus maps generated using those microblocks. To calculate the defocus amount at a coordinate of interest in the defocus map, a small microblock 400 is set in the A image in FIG. 4(a), and a small microblock 401 is set in the B image in FIG. 4(b). The microblocks 400 and 401 have the same size; to calculate the defocus amount at the coordinate of interest, multiple pairs are set while shifting their positions, and a correlation calculation is performed on the image information each pair contains. FIG. 4(c) shows a binarized example of the defocus map obtained by performing this correlation calculation for all coordinates with the small microblocks. As FIG. 4(c) shows, because the microblocks 400 and 401 used at generation are small, the shape of the subject contour is captured correctly. However, because the microblocks 400 and 401 are small, little information is available for the correlation calculation, and noise, background edge components, and the like produce false correlations, so the defocus amount varies in part of the background region even though that region lies at a single distance.

The examples shown in FIGS. 4(d) and 4(e) are the same as those in FIGS. 4(a) and 4(b), except that large microblocks 402 and 403 are set. FIG. 4(f) shows a binarized representation of the defocus map obtained by performing the correlation calculation for all coordinates with these large microblocks. As FIG. 4(f) shows, because the microblocks 402 and 403 used at generation are large, the defocus amount does not vary within the background region lying at a single distance. However, because the microblocks 402 and 403 are large, the correlation calculation at many coordinates of interest in the defocus map takes in subject components, so the shape of the subject contour is not captured correctly.
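The role of the block size can be seen in a minimal block-matching sketch. Everything in it is an assumption made for illustration: the original text does not specify the correlation calculation, so a sum of absolute differences over a one-dimensional block is used, and the name `block_shift` and its parameters are hypothetical. The function returns the image shift between one row of the A image and the B image at a single coordinate; converting that shift to a defocus amount is outside this sketch.

```python
def block_shift(a_row, b_row, x, half, max_shift):
    """Return the shift s (in pixels) that minimizes the sum of absolute
    differences between the block centered at x in a_row and the block
    centered at x + s in b_row. The block spans 2 * half + 1 pixels."""
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        cost = 0
        for k in range(-half, half + 1):
            # Clamp indices so blocks near the row ends stay in bounds.
            ia = min(max(x + k, 0), len(a_row) - 1)
            ib = min(max(x + k + s, 0), len(b_row) - 1)
            cost += abs(a_row[ia] - b_row[ib])
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift


a_row = [0, 0, 0, 5, 9, 5, 0, 0, 0, 0]
b_row = [0, 0, 0, 0, 0, 5, 9, 5, 0, 0]  # same pattern, moved 2 pixels right
shift = block_shift(a_row, b_row, x=4, half=1, max_shift=3)
```

A larger `half` draws on more pixels per comparison, which makes the match less sensitive to noise but mixes subject and background pixels near a contour, in line with the trade-off described above.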

<Processing in the image pickup apparatus 100>
Next, the flow of processing in the image pickup apparatus 100 in the present embodiment will be described. FIG. 5 is a flowchart showing an example of the processing of the image pickup apparatus 100 in the present embodiment. FIG. 5 shows the processing related to the subject search process and the subject area correction process in the image pickup apparatus 100. Processing other than that is the same as in a general image pickup apparatus, so its description is omitted.

First, in step S500, the image processing unit 106 reads, as input, the (A+B) image generated from the image signal obtained via the image sensor 102 and other components, together with two defocus maps generated with different microblock sizes, and supplies them to the subject tracking unit 110. Next, in step S501, the subject tracking unit 110 accepts the designation made through the subject designation unit 109 and sets the designated area as the subject area in the (A+B) image. In step S502, the template registration unit 111 registers the image cut out from the subject area of the (A+B) image as the template image. In step S503, the color feature amount extraction unit 112 extracts the color feature amount from the subject area of the (A+B) image. In step S504, the distance feature amount calculation unit 113 extracts the distance feature amount from the area corresponding to the subject area on the defocus map generated with the small microblocks. Steps S502, S503, and S504 may be executed in any order, and some or all of them may be executed in parallel.

The following steps S505 to S512 are repeated for each (A+B) image and its defocus maps, which are supplied sequentially in time series. In step S505, as in step S500, the image processing unit 106 reads the (A+B) image and the two defocus maps generated with different microblock sizes as input and supplies them to the subject tracking unit 110. Next, in step S506, the template matching unit 114 performs the subject search process with the (A+B) image and a defocus map as input and, referring to the template image and the distance feature amount, determines the most subject-like area in the (A+B) image as the subject area. Next, in step S507, the control unit 105 selects, from the two defocus maps input in step S505, the defocus map to be used for subject area correction.

Next, in step S508, the image processing unit 106 binarizes the defocus map selected in step S507. In this binarization, each coordinate of the defocus map is assigned one of two values according to whether its defocus amount is close to a reference value. The reference value is the distance evaluation value obtained from the area on the defocus map corresponding to the subject area determined in step S506. Next, in step S509, the image processing unit 106 labels the defocus map binarized in step S508. Next, in step S510, the size estimation unit 115 performs the subject area correction process: taking the subject area in the immediately preceding image as a reference, it estimates the size of the subject based on the color feature amount and the distance feature amount and corrects the subject area. In doing so, the size estimation unit 115 corrects the subject area by preferentially using defocus amounts that belong to a label whose position on the image plane is close to the subject area determined in step S506.
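The binarization of step S508 can be sketched as follows. The text only says that each defocus amount is tested for being close to the reference value, so the `tolerance` parameter, like the function name `binarize`, is an assumption introduced for this example.

```python
def binarize(defocus_map, reference, tolerance):
    """Mark each coordinate 1 if its defocus amount lies within `tolerance`
    of `reference` (assumed closeness test), and 0 otherwise."""
    return [[1 if abs(value - reference) <= tolerance else 0 for value in row]
            for row in defocus_map]


defocus_map = [[0.1, 2.0],
               [0.3, -0.2]]
binary = binarize(defocus_map, reference=0.0, tolerance=0.5)
```

Here `reference` stands for the distance evaluation value taken from the subject area determined in step S506.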

Next, in step S511, the subject area corrected by the size estimation unit 115 in step S510 is supplied to the template registration unit 111 and reflected in the template image, which is thereby updated. In a scheme that retains the template image registered in step S502, step S511 enlarges or reduces the template image according to the ratio between the size of the subject area corrected in step S510 and the size of the subject area at the time of template registration. In a scheme that replaces the template image, step S511 updates the template image with the partial image obtained by cutting out the subject area corrected in step S510 from the (A+B) image input in step S505. By successively updating the size of the template image or the extent of the partial area based on the result of the subject size estimation by the size estimation unit 115 in this way, subject tracking that is robust to changes in the size of the subject on the image can be achieved.

Next, in step S512, the subject area corrected by the size estimation unit 115 in step S510 is supplied to the distance feature amount calculation unit 113 and reflected in the distance feature amount, which is thereby updated. The process then returns to step S505. In a scheme that updates the distance feature amount calculated in step S504, step S512 replaces it with the value calculated from the subject area corrected in step S510 on the defocus map selected in step S507. By successively updating the distance feature amount based on the result of the subject size estimation by the size estimation unit 115 in this way, subject tracking that is robust to changes in the position of the subject along the optical axis can be achieved.

<Subject search process>
Next, the subject search process performed by the template matching unit 114 in step S506 of FIG. 5 will be described. FIG. 6 is a flowchart showing an example of the subject search process in the present embodiment.

First, in step S600, the template matching unit 114 determines a search area for the input image (the (A+B) image). Next, in step S601, it sets a window area inside the search area of the input image. Next, in step S602, it calculates the degree of correlation by referring to the template image and the image information of the window area. Next, in step S603, it refers to the area corresponding to the window area on the defocus map generated with the large microblocks and calculates the distance evaluation value. Next, in step S604, it calculates the degree of similarity from the distance feature amount calculated in step S504 of FIG. 5 and the distance evaluation value calculated in step S603.

Next, in step S605, the template matching unit 114 determines whether any window area inside the search area of the input image has not yet been set. If it determines that such a window area remains, the process returns to step S601; if it determines that none remains, the process proceeds to step S606. In step S606, the template matching unit 114 determines, from among the window areas processed in steps S601 to S605, the area with the highest subject likelihood as the subject area, and ends the subject search process. For example, the template matching unit 114 determines as the subject area the window area whose similarity value calculated in step S604 is smaller than a threshold (that is, whose similarity is high) and whose degree of correlation calculated in step S602 is the highest. Performing the subject search with reference to both pixel information and defocus amounts in this way improves search accuracy for subjects that are difficult to separate using pixel information alone.
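The loop of steps S600 to S606 can be sketched as below, under several assumptions that are not in the original text: the search area is the whole image, images are single-channel lists of lists, the degree of correlation is a sum of absolute differences (smaller means higher correlation), and the distance evaluation value is the mean defocus amount of the window. The function name `search_subject` and its parameters are hypothetical.

```python
def search_subject(image, defocus_map, template, distance_feature, sim_threshold):
    """Return (top, left) of the best window, or None if no window passes
    the distance test. A window qualifies when |mean defocus - distance
    feature| is below sim_threshold; among qualifying windows, the one with
    the smallest sum of absolute differences to the template wins."""
    t_h, t_w = len(template), len(template[0])
    best = None  # (cost, top, left)
    for top in range(len(image) - t_h + 1):
        for left in range(len(image[0]) - t_w + 1):
            # Distance evaluation value for this window (assumed: mean).
            depths = [v for row in defocus_map[top:top + t_h]
                      for v in row[left:left + t_w]]
            similarity = abs(sum(depths) / len(depths) - distance_feature)
            if similarity >= sim_threshold:
                continue  # wrong distance, even if the pixels match
            cost = sum(abs(t - w)
                       for t_row, w_row in zip(template, image[top:top + t_h])
                       for t, w in zip(t_row, w_row[left:left + t_w]))
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return None if best is None else (best[1], best[2])


# Two patches with identical pixel values but different defocus amounts.
image = [[0, 0, 0, 0, 0, 0],
         [0, 7, 0, 0, 7, 0],
         [0, 0, 0, 0, 0, 0]]
defocus = [[5, 5, 5, 5, 5, 5],
           [5, 5, 5, 5, 0, 5],
           [5, 5, 5, 5, 5, 5]]
found = search_subject(image, defocus, [[7]], distance_feature=0.0, sim_threshold=1.0)
```

In this toy input, pixel values alone cannot distinguish the two patches; the distance test is what selects the patch at the expected depth.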

<Defocus map selection process>
Here, the defocus map selection process performed by the control unit 105 in step S507 of FIG. 5 will be described. FIG. 7 is a flowchart showing an example of the defocus map selection process in the present embodiment.

First, in step S700, the control unit 105 determines whether the ISO sensitivity (exposure sensitivity) at which the input image was captured is equal to or higher than a threshold. If the control unit 105 determines that it is, the process proceeds to step S703; if not, the process proceeds to step S701. The switching of the defocus map according to the ISO sensitivity (exposure sensitivity) is described later. In step S701, the control unit 105 determines whether the size of the subject area in the previous image is equal to or larger than a threshold. If the control unit 105 determines that it is, the process proceeds to step S703; if not, the process proceeds to step S702. The switching of the defocus map according to the size of the subject area is described later.

In step S702, the control unit 105 selects the defocus map generated with the small microblocks. In step S703, the control unit 105 selects the defocus map generated with the large microblocks. By selecting between defocus maps generated with different microblock sizes in light of how the shooting conditions and the size of the subject area affect the defocus map, the defocus map used in the subject area correction process can be switched appropriately. Although the present embodiment describes switching between two defocus maps, the invention is not limited to this. A plurality of defocus maps may be supplied, a plurality of thresholds may be set for each of the ISO sensitivity (exposure sensitivity) and the subject size, and the defocus maps may be switched as appropriate according to how the values in the current frame relate to the thresholds.
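The decision flow of steps S700 to S703 reduces to two threshold tests, sketched below. The function name, the string return values, and the concrete threshold numbers in the usage lines are placeholders chosen for illustration; the original text does not give threshold values.

```python
def select_defocus_map(iso, subject_size, iso_threshold, size_threshold):
    """Return which defocus map to use for subject area correction:
    'large' for the map generated with large microblocks, 'small' otherwise."""
    if iso >= iso_threshold:            # step S700: high ISO, noisy image
        return "large"                  # step S703
    if subject_size >= size_threshold:  # step S701: large subject area
        return "large"                  # step S703
    return "small"                      # step S702


choice_noisy = select_defocus_map(iso=3200, subject_size=10,
                                  iso_threshold=1600, size_threshold=50)
choice_small = select_defocus_map(iso=100, subject_size=10,
                                  iso_threshold=1600, size_threshold=50)
```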

The switching of the defocus map according to the ISO sensitivity (exposure sensitivity) is described below. The parallax images and defocus maps obtained when the ISO sensitivity (exposure sensitivity) is high, which is the case tested in step S700, are explained with reference to FIGS. 4 and 8. FIG. 4 is as described above, and FIG. 8 as a whole is the same as FIG. 4 except that gain noise 800 appears because the ISO sensitivity (exposure sensitivity) is high.

Comparing the example in FIG. 8(c) with that in FIG. 4(c), the variation of the defocus amount within a region at a single distance is more pronounced in FIG. 8(c). This is because, in FIGS. 8(a) and 8(b), the microblocks 801 and 802 contain the gain noise 800. Comparing the example in FIG. 8(f) with that in FIG. 8(c), the variation of the defocus amount within a region at a single distance is suppressed in FIG. 8(f). This is because, in FIGS. 8(d) and 8(e), the microblocks 803 and 804 are larger than the microblocks 801 and 802, so the gain noise 800 accounts for a smaller proportion of each microblock. Therefore, when the ISO sensitivity (exposure sensitivity) is determined to be equal to or higher than the threshold, it is preferable to switch preferentially to the defocus map generated with the large microblocks.

Next, the switching of the defocus map according to the size of the subject area will be described. The parallax images and defocus maps obtained when the subject area is large, which is the case tested in step S701, are explained with reference to FIGS. 4 and 9. FIG. 4 is as described above, and FIG. 9 as a whole is the same as FIG. 4 except that the size of the subject area differs.

Comparing the example in FIG. 9(f) with that in FIG. 4(f), the problem that the shape of the subject contour is not captured correctly is minor in FIG. 9(f) but pronounced in FIG. 4(f). This is because the subject areas differ in size, so the proportion of coordinates in the white region whose defocus amounts are falsely correlated is low in FIG. 9(f) and high in FIG. 4(f). Therefore, when the subject area is small, it is desirable to give priority to the defocus map generated with the small microblocks, and when the subject area is large, to the defocus map generated with the large microblocks.

<Labeling process>
Next, the labeling process performed by the image processing unit 106 in step S509 of FIG. 5 will be described. FIG. 10 illustrates the labeling process in the image processing unit 106. FIG. 10(a) shows the labeling process being applied to the defocus map binarized in step S508. In the labeling process, the coordinate of interest 1000 is moved over every coordinate, and at each position it is determined whether a label ID has already been issued to any of the peripheral coordinates 1001 in the 8-neighborhood. If none has been issued, a new label ID is assigned to the coordinate of interest; if one has been issued, the coordinate of interest receives the same label ID as the peripheral coordinate 1001. FIG. 10(b) shows the result of labeling all coordinates of the defocus map. The same label ID is issued to every coordinate in the subject area, and different label IDs are issued to the other areas. Although the example of FIG. 10 examines the 8-neighborhood, the 4-neighborhood may be used instead.
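A minimal labeling sketch follows. It is not the raster-scan procedure described above: it swaps in a breadth-first flood fill, which yields connected regions directly. It also assumes that neighboring coordinates share a label only when they have the same binary value, and that label IDs start at 1. The name `label_regions` is hypothetical.

```python
from collections import deque


def label_regions(binary, connectivity=8):
    """Assign one label ID per connected region of equal binary value,
    using a breadth-first flood fill over the 8- or 4-neighborhood."""
    height, width = len(binary), len(binary[0])
    if connectivity == 8:
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
    else:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    labels = [[0] * width for _ in range(height)]  # 0 means "not yet labeled"
    next_id = 1
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                continue
            labels[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx]
                            and binary[ny][nx] == binary[y][x]):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return labels


binary = [[1, 1, 0, 0],
          [1, 1, 0, 1],
          [0, 0, 0, 1]]
labels = label_regions(binary)
```

The two groups of 1s receive different label IDs because they are separated by a column of 0s, which is the property the subject area correction relies on.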

<Subject area correction process>
Next, the subject area correction process performed by the size estimation unit 115 in step S510 of FIG. 5 will be described. FIG. 11 is a flowchart showing an example of the subject area correction process in the present embodiment. In FIG. 11, steps S1101 to S1104 constitute the enlargement determination process, and steps S1105 to S1108 constitute the reduction determination process. In the following description, the subject area is assumed to be rectangular. The description below covers an example in which the reduction determination process (steps S1105 to S1108) follows the enlargement determination process (steps S1101 to S1104), but the enlargement determination process may instead be performed after the reduction determination process.

First, in step S1100, the size estimation unit 115 takes each pixel of the supplied (A+B) image in turn as the pixel of interest and determines whether it is a feature pixel. The size estimation unit 115 determines that the pixel of interest is a feature pixel when it satisfies both condition (A) and condition (B). Condition (A) is that the pixel information of the pixel of interest matches the color feature amount extracted by the color feature amount extraction unit 112. Condition (B) is that the pixel of interest and the center-of-gravity coordinates of the subject area determined in step S506 belong to the same label in the defocus map labeled in step S509.
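The feature pixel test of step S1100 can be sketched as follows. How a pixel is judged to match the color feature amount is not specified in this section, so the sketch assumes single-channel pixel values and a simple tolerance; `feature_mask`, `color_tolerance`, and the `(row, column)` form of `centroid` are likewise assumptions.

```python
def feature_mask(image, labels, color_feature, color_tolerance, centroid):
    """Return a grid of booleans: True where a pixel satisfies both
    condition (A), color close to color_feature (assumed tolerance test),
    and condition (B), same label as the subject area's center of gravity."""
    cy, cx = centroid
    subject_label = labels[cy][cx]
    return [[abs(pixel - color_feature) <= color_tolerance and label == subject_label
             for pixel, label in zip(pixel_row, label_row)]
            for pixel_row, label_row in zip(image, labels)]


image = [[10, 10, 50],
         [10, 12, 10]]
labels = [[1, 1, 1],
          [1, 1, 2]]
mask = feature_mask(image, labels, color_feature=10, color_tolerance=3, centroid=(0, 0))
```

The bottom-right pixel matches in color but carries a different label, so it is rejected; this is the case where distance information separates a same-colored object from the subject.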

Next, in step S1101, the size estimation unit 115 takes the subject area determined by the template matching unit 114 as a reference and calculates, for each side of the outer peripheral region of that subject area, the proportion of feature pixels. Next, in step S1102, the size estimation unit 115 determines, for one of the four sides of the subject area, whether the proportion of feature pixels calculated in step S1101 is equal to or greater than a predetermined threshold (first threshold). If so, the process proceeds to step S1103; if not, the process proceeds to step S1104.

FIG. 12 shows how the labeled defocus map is referred to. The outer peripheral region 1201 lies to the left of the subject area 1200, and the distance information varies within the outer peripheral region 1201. However, the label ID of the subject is 1, so coordinates whose label ID is 2 or 4 do not satisfy condition (B). Using the labeled defocus map for area correction therefore suppresses the influence of variation in the distance information within the outer peripheral region.

In step S1103, the size estimation unit 115 enlarges the subject area in the direction of the side in question, and the process then proceeds to step S1104. In step S1104, the size estimation unit 115 determines whether the enlargement determination process has been completed for all four sides of the subject area. If so, it ends the enlargement determination process and proceeds to step S1105; otherwise, the process returns to step S1102 to handle an unprocessed side.

In step S1105, the size estimation unit 115 takes the subject area determined by the template matching unit 114 as a reference and calculates, for each side of the inner peripheral region of that subject area, the proportion of feature pixels. Next, in step S1106, the size estimation unit 115 determines, for one of the four sides of the subject area, whether the proportion of feature pixels calculated in step S1105 is less than a predetermined threshold (second threshold). If so, the process proceeds to step S1107; if not, the process proceeds to step S1108.

In step S1107, the size estimation unit 115 reduces the subject area in the direction of the side in question, and the process then proceeds to step S1108. In step S1108, the size estimation unit 115 determines whether the reduction determination process has been completed for all four sides of the subject area. If so, it ends the reduction determination process and the subject area correction process; otherwise, the process returns to step S1106 to handle an unprocessed side. A side that was enlarged in the enlargement determination process may be excluded from the reduction determination process.
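The enlargement and reduction determinations of steps S1101 to S1108 can be sketched as below. Several details are assumptions, since the text leaves them open: the outer and inner peripheral regions are taken as one-pixel-wide strips, each side moves by one pixel, a side that was enlarged is excluded from reduction (the optional variant mentioned above), and the rectangle is `(top, left, bottom, right)` with inclusive bounds. All function and parameter names are hypothetical.

```python
def strip_ratio(mask, cells):
    """Fraction of in-bounds cells that are feature pixels (0.0 if none)."""
    inside = [(y, x) for y, x in cells
              if 0 <= y < len(mask) and 0 <= x < len(mask[0])]
    if not inside:
        return 0.0
    return sum(1 for y, x in inside if mask[y][x]) / len(inside)


def side_strips(rect, offset):
    """One-pixel strips along each side; offset 1 lies just outside the
    rectangle (outer periphery), offset 0 on its border (inner periphery)."""
    top, left, bottom, right = rect
    return {
        "top": [(top - offset, x) for x in range(left, right + 1)],
        "bottom": [(bottom + offset, x) for x in range(left, right + 1)],
        "left": [(y, left - offset) for y in range(top, bottom + 1)],
        "right": [(y, right + offset) for y in range(top, bottom + 1)],
    }


def correct_area(mask, rect, grow_threshold, shrink_threshold):
    """Enlarge a side when its outer strip is rich in feature pixels,
    otherwise reduce it when its inner strip is poor in feature pixels."""
    outer, inner = side_strips(rect, 1), side_strips(rect, 0)
    step = {}
    for side in ("top", "bottom", "left", "right"):
        if strip_ratio(mask, outer[side]) >= grow_threshold:
            step[side] = 1       # steps S1102 and S1103
        elif strip_ratio(mask, inner[side]) < shrink_threshold:
            step[side] = -1      # steps S1106 and S1107
        else:
            step[side] = 0
    top, left, bottom, right = rect
    new_rect = (top - step["top"], left - step["left"],
                bottom + step["bottom"], right + step["right"])
    if new_rect[0] > new_rect[2] or new_rect[1] > new_rect[3]:
        return rect              # guard: never shrink to an empty area
    return new_rect


# Feature pixels occupy rows 1..3 and columns 1..4 of a 5x5 image.
mask = [[1 <= y <= 3 and 1 <= x <= 4 for x in range(5)] for y in range(5)]
corrected = correct_area(mask, (1, 1, 3, 3), grow_threshold=0.5, shrink_threshold=0.5)
```

In the example, the searched area misses the rightmost column of the subject, and the right side is enlarged by one pixel while the other sides stay put.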

As described above, by switching as appropriate among a plurality of defocus maps generated with different microblock sizes according to the nature of the subject search process and the subject area correction process, each process is performed with a suitable defocus map. As a result, even in scenes where the background and the subject cannot be separated using image information alone, both the subject search process and the subject area correction process can be performed accurately, which makes accurate subject tracking possible, for example.

(Other embodiments of the present invention)
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The embodiments described above are merely examples of how the present invention may be embodied, and the technical scope of the present invention must not be interpreted restrictively on their basis. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

100: Image pickup apparatus
101: Image pickup optical system
102: Image sensor
105: Control unit
106: Image processing unit
109: Subject designation unit
110: Subject tracking unit
111: Template registration unit
112: Color feature amount extraction unit
113: Distance feature amount calculation unit
114: Template matching unit
115: Size estimation unit
200: Pixel
201: Microlens
202A, 203B: Photoelectric conversion units
302: Search area
303: Window area
400, 401, 402, 403: Microblocks
1000: Coordinate of interest
1001: Peripheral coordinates
1200: Subject area
1201: Outer peripheral region

Claims

An image processing apparatus comprising:
a first acquisition means for acquiring images;
a second acquisition means for acquiring, for each of the images, a plurality of defocus maps generated with different block sizes;
a search means for searching for a target subject based on the image and the defocus map;
a correction means for correcting the subject area determined by the search means, based on the image and the defocus map; and
a selection means for selecting, from among the plurality of defocus maps, the defocus map to be used by the correction means for correcting the subject area.

The image processing apparatus according to claim 1, wherein the selection means selects a defocus map generated with a smaller block size than the defocus map used by the search means to search for the subject.

The image processing apparatus according to claim 1 or 2, wherein
the first acquisition means acquires images arranged in time series,
the second acquisition means acquires, for each of the images arranged in time series, a plurality of defocus maps generated with different block sizes, and
the search means searches for the subject in the next image using the image in which the subject area has been corrected by the correction means.

The image processing apparatus according to any one of claims 1 to 3, wherein, when the exposure sensitivity of the image acquired by the first acquisition means is equal to or higher than a threshold, the selection means selects a defocus map generated with a larger block size than the defocus map selected when the exposure sensitivity of the image is not equal to or higher than the threshold.

The image processing apparatus according to any one of claims 1 to 3, wherein, when the size of the subject area corrected by the correction means is equal to or larger than a threshold, the selection means selects, as the defocus map to be used for the next image, a defocus map generated with a larger block size than the defocus map selected when the size of the subject area is not equal to or larger than the threshold.

The image processing apparatus according to any one of claims 1 to 5, wherein the correction means applies a labeling process to the defocus map selected by the selection means and corrects the subject area by preferentially using defocus amounts belonging to a label whose position on the image plane is close to the subject area determined by the search means.
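The labeling-based correction recited above can be illustrated with a small sketch. It is offered under stated assumptions only: the claim does not say how the defocus map is binarized or how closeness to the subject area is measured, so the tolerance test against the defocus value at the searched position, the 4-connected labeling, and the squared-distance measure are all choices made here for illustration. The function names are hypothetical.

```python
from collections import deque


def label_components(mask):
    """4-connected labeling of a binary grid; 0 marks background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one component
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def nearest_label(labels, center):
    """Label whose closest cell is nearest to center (y, x); 0 if none."""
    cy, cx = center
    best, best_d = 0, None
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:
                d = (y - cy) ** 2 + (x - cx) ** 2
                if best_d is None or d < best_d:
                    best, best_d = lab, d
    return best


def corrected_mask(defocus, center, tolerance):
    """Keep only the labeled region nearest to the searched position."""
    ref = defocus[center[0]][center[1]]
    # Assumed binarization: cells whose defocus is close to the reference.
    mask = [[abs(v - ref) <= tolerance for v in row] for row in defocus]
    labels = label_components(mask)
    keep = nearest_label(labels, center)
    return [[lab == keep for lab in row] for row in labels]


defocus = [
    [0.0, 0.1, 5.0, 0.0],
    [0.1, 0.0, 5.0, 0.1],
    [5.0, 5.0, 5.0, 0.0],
]
result = corrected_mask(defocus, center=(0, 0), tolerance=0.5)
```

In the example grid, the right-hand column has nearly the same defocus as the subject but is separated from it by cells at a different depth, so it receives a different label and is excluded from the corrected area.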

The image processing apparatus according to any one of claims 1 to 6, wherein the second acquisition means acquires the defocus map by performing a correlation operation using the block on each of the image signals of two different pupil regions relating to one image, calculating the phase difference between the two images corresponding to the two pupil regions, and calculating a defocus amount based on the phase difference.
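A block-based phase-difference calculation of the kind recited above can be sketched in one dimension as follows. This is an illustrative sketch rather than the method of the embodiments: the claim says only "correlation operation", so the sum of absolute differences (SAD) used here is an assumed measure, and the conversion factor `k` from image shift to defocus amount is a placeholder that in practice depends on the optical system. The function names are hypothetical.

```python
def block_shift(sig_a, sig_b, start, block, max_shift):
    """Shift (in pixels) of sig_b that best matches one block of sig_a.

    The block sig_a[start:start + block] is compared with sig_b at each
    candidate shift using the sum of absolute differences (SAD).
    """
    best_s, best_sad = 0, None
    for s in range(-max_shift, max_shift + 1):
        lo = start + s
        if lo < 0 or lo + block > len(sig_b):
            continue  # candidate window falls outside the signal
        sad = sum(abs(sig_a[start + i] - sig_b[lo + i]) for i in range(block))
        if best_sad is None or sad < best_sad:
            best_s, best_sad = s, sad
    return best_s


def defocus_map_1d(sig_a, sig_b, block, max_shift, k=1.0):
    """One defocus value per non-overlapping block: k * phase difference."""
    return [k * block_shift(sig_a, sig_b, start, block, max_shift)
            for start in range(0, len(sig_a) - block + 1, block)]


# Two pupil-divided signals; sig_b is sig_a displaced by one pixel.
sig_a = [0, 0, 1, 5, 1, 0, 0, 0]
sig_b = [0, 0, 0, 1, 5, 1, 0, 0]

coarse = defocus_map_1d(sig_a, sig_b, block=4, max_shift=2)
fine = defocus_map_1d(sig_a, sig_b, block=2, max_shift=2)
```

A smaller block yields more map entries for the same signal (here `fine` has four entries and `coarse` two) but compares fewer samples per entry, so each estimate is more easily disturbed by noise or flat texture. That trade-off is consistent with the claims selecting the larger-block map at high exposure sensitivity.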

An image processing apparatus comprising:
a first acquisition means for acquiring images;
a second acquisition means for generating a defocus map by performing a correlation operation using a block on each of the image signals of two different pupil regions relating to one image, calculating the phase difference between the two images corresponding to the two pupil regions, and calculating a defocus amount based on the phase difference, the second acquisition means acquiring, for each of the images, a first defocus map using a first block and a second defocus map using a second block smaller in size than the first block;
a search means for searching the image for a target subject using the first defocus map;
a correction means for correcting the subject area determined by the search means, using the defocus map selected from among the first defocus map and the second defocus map; and
a selection means for selecting the second defocus map as the defocus map to be used by the correction means when the exposure sensitivity of the image is not equal to or higher than a threshold and the size of the subject area corrected by the correction means is not equal to or larger than a threshold, and for selecting the first defocus map as the defocus map to be used by the correction means when the exposure sensitivity of the image is equal to or higher than the threshold or the size of the subject area corrected by the correction means is equal to or larger than the threshold.

An image processing method comprising:
a first acquisition step of acquiring images;
a second acquisition step of acquiring, for each of the images, a plurality of defocus maps generated with different block sizes;
a search step of searching for a target subject based on the image and the defocus map;
a correction step of correcting the subject area determined in the search step, based on the image and the defocus map; and
a selection step of selecting, from among the plurality of defocus maps, the defocus map to be used in the correction step for correcting the subject area.

An image processing method comprising:
a first acquisition step of acquiring images;
a second acquisition step of generating a defocus map by performing a correlation operation using a block on each of the image signals of two different pupil regions relating to one image, calculating the phase difference between the two images corresponding to the two pupil regions, and calculating a defocus amount based on the phase difference, the second acquisition step acquiring, for each of the images, a first defocus map using a first block and a second defocus map using a second block smaller in size than the first block;
a search step of searching the image for a target subject using the first defocus map;
a correction step of correcting the subject area determined in the search step, using the defocus map selected from among the first defocus map and the second defocus map; and
a selection step of selecting the second defocus map as the defocus map to be used in the correction step when the exposure sensitivity of the image is not equal to or higher than a threshold and the size of the subject area corrected in the correction step is not equal to or larger than a threshold, and of selecting the first defocus map as the defocus map to be used in the correction step when the exposure sensitivity of the image is equal to or higher than the threshold or the size of the subject area corrected in the correction step is equal to or larger than the threshold.

A program for causing a computer to execute:
a first acquisition step of acquiring images;
a second acquisition step of acquiring, for each of the images, a plurality of defocus maps generated with different block sizes;
a search step of searching for a target subject based on the image and the defocus map;
a correction step of correcting the subject area determined in the search step, based on the image and the defocus map; and
a selection step of selecting, from among the plurality of defocus maps, the defocus map to be used in the correction step for correcting the subject area.

A program for causing a computer to execute:
a first acquisition step of acquiring images;
a second acquisition step of generating a defocus map by performing a correlation operation using a block on each of the image signals of two different pupil regions relating to one image, calculating the phase difference between the two images corresponding to the two pupil regions, and calculating a defocus amount based on the phase difference, the second acquisition step acquiring, for each of the images, a first defocus map using a first block and a second defocus map using a second block smaller in size than the first block;
a search step of searching the image for a target subject using the first defocus map;
a correction step of correcting the subject area determined in the search step, using the defocus map selected from among the first defocus map and the second defocus map; and
a selection step of selecting the second defocus map as the defocus map to be used in the correction step when the exposure sensitivity of the image is not equal to or higher than a threshold and the size of the subject area corrected in the correction step is not equal to or larger than a threshold, and of selecting the first defocus map as the defocus map to be used in the correction step when the exposure sensitivity of the image is equal to or higher than the threshold or the size of the subject area corrected in the correction step is equal to or larger than the threshold.