JP6546611B2

JP6546611B2 - Image processing apparatus, image processing method and image processing program

Info

Publication number: JP6546611B2
Application number: JP2017018229A
Authority: JP
Inventors: 弘員柿沼; 長田　秀信; 秀信長田; 広太竹内; 広夢宮下
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-02-03
Filing date: 2017-02-03
Publication date: 2019-07-17
Anticipated expiration: 2037-02-03
Also published as: JP2018124890A

Description

The present invention relates to a technique for extracting an arbitrary region from an image.

Techniques exist for extracting an arbitrary region from video. For example, broadcasting stations and shooting studios use chroma key technology, in which a subject is placed in front of a green or blue screen and only the image region of the subject is extracted from the video. By extracting only the subject's image region with chroma keying and superimposing arbitrary CG (computer graphics) video in real time, video with high added value can be generated. Because of this advantage, arbitrary-region extraction using chroma keying has become indispensable in, for example, virtual studios for television broadcasting and CG editing work for films.

However, chroma keying constrains the background of the subject, for example by requiring a green screen, so it is unsuited to outdoor shooting and to shooting sports events. Methods of extracting an arbitrary region from an arbitrary background have therefore been studied. For example, Non-Patent Document 1 describes a method using background subtraction. In this method, the background is captured in advance, the background image and the captured image (color image) of the subject are compared pixel by pixel, and only pixels with a large lightness difference are extracted from the color image as the target region (the image region in which the subject to be extracted is captured).

Patent Document 1: JP 2014-175837 A

Non-Patent Document 1: Hitoshi Habe and 2 others, "Background Subtraction Method Robust against Illumination Changes", IPSJ SIG Technical Report, Computer Vision and Image Media (CVIM) 115-3, March 18, 1999, pp. 17-24

Non-Patent Document 2: Hidemasa Nakajima and 5 others, "Time Alignment of Color Image and Depth Map for Stable Moving Object Region Extraction with Kinect", Technical Research Report of the Information and Communication Society, PRMU (Pattern Recognition and Media Understanding), 111(379), pp. 321-328; IPSJ SIG Technical Report, Vol. 2012-CVIM-180, No. 59, January 20, 2012

Non-Patent Document 3: Hiromu Miyashita and 4 others, "High-Speed, High-Accuracy Subject Extraction Using Sensors and Cameras", The Institute of Electronics, Information and Communication Engineers, IEICE Technical Report, MVE2016-1, 116(73), pp. 17-22, June 2016

Background subtraction works correctly when the background does not change. In a real shooting environment, however, sunlight and lighting vary, and at sports events the spectators move, so a certain lightness difference from the background image also arises in non-target regions, and the target region alone cannot be extracted accurately. In addition, when a subject other than the one to be extracted enters the scene, that subject is extracted as well.

Non-Patent Document 2, on the other hand, describes a method that uses a depth map together with the color image. This method extracts the image region corresponding to the subject to be extracted as the target region by combining the color information of the pixels in the color image with the distance information of the depth map, which gives the distance from the measuring device to the subject. Because the target region is identified using the difference between the distance to the subject to be extracted and the distance to non-extraction targets (the distance difference between foreground and background), the subject can be extracted more accurately.

However, the following problems remain even when a depth map is used.

(1) Accuracy of the target region and processing speed
The resolution of a depth map is usually far lower than that of a color image. When a low-resolution depth map is used, it must therefore be upsampled (enlarged) to the same size as the color image. This upsampling degrades the accuracy of the depth map; for example, jaggies appear along the contours between image regions at different distances, and the contours of the image regions become inaccurate. Moreover, computing distances precisely enough to capture the contour of the target region accurately requires heavy computation, so the speed of the image extraction processing is lost.

(2) Handling movement of the subject
Suppose a high-resolution depth map could be obtained. The following methods could then be used to extract the target region with that depth map.

(2-1) For example, as in background subtraction, a depth map of the background is stored in advance, the depth map containing the subject to be extracted is compared with the background-only depth map, and the target region of the subject is extracted by taking the distance difference between the two depth maps.

(2-2) Alternatively, the range of distances in which the subject exists is specified in advance, and the image region falling within that range is extracted as the target region of the subject.

(2-3) Alternatively, the range of distances in which the subject exists is specified in advance, the image region falling within that range is taken as a candidate extraction region, and the subject to be extracted is identified by evaluating texture features in the image (Patent Document 1).

With method (2-1), however, when a subject other than the one to be extracted enters the background, that subject is extracted as well, so the method shares the problem of the background subtraction approach.

With methods (2-2) and (2-3), when the subject to be extracted moves substantially in the depth direction and leaves the specified distance range, its target region can no longer be extracted.

With method (2-3), moreover, the texture of the subject changes depending on whether the person faces forward or backward, and it also changes greatly with the person's clothing and movements, so accurately identifying the target region of the subject is difficult.

The present invention has been made in view of the above circumstances, and its object is to improve the accuracy of subject extraction.

To solve the above problems, the image processing apparatus according to claim 1 comprises: a first generation unit that generates, from time-series color images, a first mask image classified into a changed region, in which the image has changed, and an unchanged region; a second generation unit that generates, from a depth map corresponding to the color image, a second mask image classified into an inter-threshold region, which lies between two distance thresholds, and a non-inter-threshold region; a combining unit that combines the first mask image and the second mask image to generate a composite mask image having a target region, which belongs to both the changed region and the inter-threshold region, and a non-target region; an estimation unit that obtains, from the depth map, the mode of the distances corresponding to the target region of the composite mask image; an update unit that updates the two distance thresholds based on the mode of the distances; and an application unit that extracts, from the color image, the region corresponding to the target region of the composite mask image.

The image processing apparatus according to claim 2 is the image processing apparatus according to claim 1, wherein the update unit updates the two distance thresholds so that the mode of the distances lies at the center between the two distance thresholds.

The image processing apparatus according to claim 3 is the image processing apparatus according to claim 1 or 2, wherein the first mask image includes an unclassified region whose degree of image change is smaller than that of the changed region and larger than that of the unchanged region, and the first generation unit classifies the unclassified region into the changed region or the unchanged region using color information or distance information of a peripheral region within a certain range of the unclassified region.

The image processing apparatus according to claim 4 is the image processing apparatus according to any one of claims 1 to 3, wherein the combining unit forms an unclassified region of a certain width at the boundary between the target region and the non-target region, and classifies that unclassified region into the target region or the non-target region using color information or distance information of a peripheral region within a certain range of that unclassified region.

The image processing apparatus according to claim 5 is the image processing apparatus according to any one of claims 1 to 4, wherein the second generation unit modifies the depth map so that it has the same angle of view or resolution as the color image.

The image processing apparatus according to claim 6 is the image processing apparatus according to any one of claims 1 to 5, wherein the combining unit takes as the target region a changed region for which the ratio of the inter-threshold region to the changed region is equal to or greater than a threshold.

The image processing method according to claim 7 is an image processing method performed by an image processing apparatus, comprising the steps of: generating, from time-series color images, a first mask image classified into a changed region, in which the image has changed, and an unchanged region; generating, from a depth map corresponding to the color image, a second mask image classified into an inter-threshold region, which lies between two distance thresholds, and a non-inter-threshold region; combining the first mask image and the second mask image to generate a composite mask image having a target region, which belongs to both the changed region and the inter-threshold region, and a non-target region; obtaining, from the depth map, the mode of the distances corresponding to the target region of the composite mask image; updating the two distance thresholds based on the mode of the distances; and extracting, from the color image, the region corresponding to the target region of the composite mask image.

The image processing program according to claim 8 causes a computer to function as the image processing apparatus according to any one of claims 1 to 6.

According to the present invention, the accuracy of subject extraction can be improved.

FIG. 1 is a diagram showing the configuration of the image processing apparatus.
FIG. 2 is a diagram showing an example of a shooting scene.
FIG. 3 is a diagram showing an example of generating a background difference mask image.
FIG. 4 is a diagram showing an example of generating a distance threshold mask image.
FIG. 5 is a diagram showing an example of generating a composite mask image.
FIG. 6 is a diagram showing an example of calculating the mode of the distances and an example of changing the distance thresholds.
FIG. 7 is a diagram showing the processing operation of the image processing apparatus.
FIG. 8 is a diagram showing an example of generating a composite mask image (modification).

To solve the above problems, the present invention estimates the distance to the subject to be extracted at each time from a simple combination of time-series color images and depth maps, and feeds the estimated distance back to the distance parameters (two distance thresholds indicating the range in which the subject can exist).

This makes the extraction robust to background changes (movement in the distant view, lighting changes, and other subjects entering the scene are not extracted as foreground) and able to follow movement of the subject in the depth direction. High-accuracy subject extraction can thus be achieved with low-load processing, and subject-extracted video can be obtained in real time.

An embodiment of the present invention is described below with reference to the drawings. Although the processing is described for still images in this embodiment, it is also applicable to frames of moving images such as video.

FIG. 1 shows the configuration of the image processing apparatus according to this embodiment. The image processing apparatus 1 is implemented on a computer and, as shown in FIG. 1, comprises a background difference mask generation unit 11, a distance threshold mask generation unit 12, a mask combining unit 13, a distance estimation unit 14, a distance parameter update unit 15, and a mask application unit 16. The image processing apparatus 1 is connected to an imaging device 3 and a measuring device 5; it receives color images from the imaging device 3 and depth maps from the measuring device 5.

FIG. 2 shows an example of the shooting scene used in this embodiment. So that their angles of view coincide, the imaging device 3 and the measuring device 5 are either an integrated device or placed at the same position. The imaging device 3 captures the scene in a given direction and sends the color image of the scene to the image processing apparatus 1, and the measuring device 5 sends the depth map of the scene to the image processing apparatus 1.

Next, the functions of the image processing apparatus 1 are described.

The background difference mask generation unit 11 uses the color image to generate a background difference mask image segmented into foreground and background by background subtraction. Specifically, a color image of the scene without the subject to be extracted is first acquired and stored in advance. Next, a color image of the scene containing the subject is acquired. Then, from the difference between the pixel values of the two color images, pixels whose change is at least a certain value (the changed region) are labeled foreground and pixels whose change is below that value (the unchanged region) are labeled background, producing the background difference mask image (first mask image) (FIG. 3). For example, foreground pixels are set to 1 and background pixels to 0.
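The background difference mask described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `background_difference_mask`, the per-channel maximum used as the difference measure, and the threshold value of 30 are assumptions made for the example, since the text specifies only "a change of a certain value or more". Images are plain nested lists so that the sketch needs no external library.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B), each 0-255
ColorImage = List[List[Pixel]]
Mask = List[List[int]]            # 1 = foreground (changed), 0 = background


def background_difference_mask(background: ColorImage,
                               frame: ColorImage,
                               threshold: int = 30) -> Mask:
    """Label each pixel 1 if it differs from the stored background image
    by at least `threshold`, otherwise 0."""
    mask = []
    for bg_row, fr_row in zip(background, frame):
        row = []
        for bg, fr in zip(bg_row, fr_row):
            # Assumed difference measure: largest absolute channel difference.
            diff = max(abs(a - b) for a, b in zip(bg, fr))
            row.append(1 if diff >= threshold else 0)
        mask.append(row)
    return mask


# 2x2 example: the right-hand column changes, the left-hand column does not.
background = [[(10, 10, 10), (10, 10, 10)],
              [(10, 10, 10), (10, 10, 10)]]
frame = [[(12, 11, 10), (200, 180, 160)],
         [(10, 10, 10), (90, 10, 10)]]
mask = background_difference_mask(background, frame)
```

Small differences such as sensor noise (top-left pixel) stay below the threshold and remain background, which is why a threshold rather than strict inequality of pixel values is used.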

The distance threshold mask generation unit 12 uses the depth map to generate a distance threshold mask image segmented into foreground and background by distance thresholds. Specifically, two distance thresholds indicating the range in which the subject to be extracted can initially exist (hereinafter, the distance threshold range) are first set in advance and stored as distance parameters. Next, the depth map of the scene containing the subject and the distance parameters are acquired, and image regions within the distance threshold range (the inter-threshold region) are labeled foreground and image regions outside it (the non-inter-threshold region) are labeled background, producing the distance threshold mask image (second mask image) (FIG. 4). Geometric correction and resizing may be applied as needed to produce a distance threshold mask image with the same angle of view and resolution as the color image.
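As a rough sketch of the distance threshold mask, the following Python function labels depth values that fall inside the distance threshold range. The name `distance_threshold_mask`, the use of metres, and the choice of inclusive bounds are assumptions for illustration; the text does not state whether the bounds are inclusive.

```python
from typing import List

DepthMap = List[List[float]]   # distance per pixel (metres assumed here)
Mask = List[List[int]]         # 1 = inside the range, 0 = outside


def distance_threshold_mask(depth_map: DepthMap,
                            d_min: float,
                            d_max: float) -> Mask:
    """Label pixels whose distance lies in [d_min, d_max] as foreground."""
    return [[1 if d_min <= d <= d_max else 0 for d in row]
            for row in depth_map]


depth_map = [[1.0, 2.5],
             [3.0, 6.0]]
dist_mask = distance_threshold_mask(depth_map, d_min=2.0, d_max=4.0)
```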

The mask combining unit 13 combines the background difference mask image generated by the background difference mask generation unit 11 with the distance threshold mask image generated by the distance threshold mask generation unit 12 to generate a composite mask image. Specifically, treating foreground pixels as 1 and background pixels as 0 in both mask images, it takes their product, so that only pixels that are foreground in both masks become foreground (the target region) and all other pixels become background (the non-target region) (FIG. 5).
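The product of the two masks is an element-wise multiplication of 0/1 values, equivalent to a logical AND. A minimal sketch, with the hypothetical function name `combine_masks` and masks held as nested lists:

```python
from typing import List

Mask = List[List[int]]


def combine_masks(diff_mask: Mask, dist_mask: Mask) -> Mask:
    """Pixel-wise product: 1 only where both input masks are 1."""
    return [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(diff_mask, dist_mask)]


diff_mask = [[0, 1],
             [0, 1]]
dist_mask = [[0, 1],
             [1, 0]]
combined = combine_masks(diff_mask, dist_mask)
```

Only the top-right pixel survives here: the bottom-left pixel is in range but did not change, and the bottom-right pixel changed but lies outside the distance range.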

The distance estimation unit 14 estimates the latest distance at which the subject to be extracted exists. Specifically, for each pixel labeled foreground in the composite mask image generated by the mask combining unit 13, it first determines the pixel's coordinates and the corresponding coordinates in the depth map, and obtains the depth-map distance associated with each pixel. It then computes the mode of the distances of the foreground pixels and takes that mode as the latest distance (the latest distance to the subject to be extracted) (FIG. 6).
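The mode computation can be sketched with `collections.Counter`. The example assumes that depth values are integers (for instance millimetres), so that identical distances can be counted directly; with floating-point depths some binning would be needed, which the text does not describe. The function name `estimate_mode_distance` and the choice to return the pixel count alongside the mode are illustrative.

```python
from collections import Counter
from typing import List, Optional, Tuple

Mask = List[List[int]]
DepthMap = List[List[int]]     # integer distances, e.g. millimetres (assumed)


def estimate_mode_distance(mask: Mask,
                           depth_map: DepthMap) -> Optional[Tuple[int, int]]:
    """Return (mode distance, pixel count at the mode) over the foreground
    pixels of `mask`, or None when there is no foreground pixel."""
    counts = Counter(
        d
        for mask_row, depth_row in zip(mask, depth_map)
        for m, d in zip(mask_row, depth_row)
        if m == 1
    )
    if not counts:
        return None
    return counts.most_common(1)[0]


mask = [[0, 1, 1],
        [1, 1, 0]]
depth_map = [[5000, 2100, 2100],
             [2100, 2300, 900]]
result = estimate_mode_distance(mask, depth_map)
```

Using the mode rather than the mean keeps a few stray foreground pixels at other distances (such as the 2300 value above) from shifting the estimate.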

The distance parameter update unit 15 updates the distance parameters used by the distance threshold mask generation unit 12. Specifically, for example, it updates them so that the center of the distance threshold range specified as the distance of the subject equals the mode obtained by the distance estimation unit 14 (FIG. 6). However, if the number of foreground pixels at the mode is below a preset number (the minimum pixel count), the subject to be extracted is regarded as absent from the image region being extracted, and the distance parameters are not updated.
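One way to realize this update is to keep the width of the range fixed and re-center it on the mode. Keeping the width constant is an assumption of this sketch; the text only requires that the mode become the center of the range. The function name `update_thresholds` and its parameters are likewise hypothetical.

```python
from typing import Optional, Tuple


def update_thresholds(d_min: float,
                      d_max: float,
                      mode_distance: Optional[float],
                      mode_count: int,
                      min_pixels: int) -> Tuple[float, float]:
    """Re-center [d_min, d_max] on the mode, keeping the width unchanged.
    The thresholds are left as they are when too few pixels support the mode."""
    if mode_distance is None or mode_count < min_pixels:
        return d_min, d_max           # subject regarded as absent: no update
    half_width = (d_max - d_min) / 2
    return mode_distance - half_width, mode_distance + half_width


moved = update_thresholds(2000, 4000, mode_distance=3500,
                          mode_count=120, min_pixels=50)
kept = update_thresholds(2000, 4000, mode_distance=3500,
                         mode_count=10, min_pixels=50)
```

In the first call the range shifts from 2000-4000 to 2500-4500; in the second the mode is supported by too few pixels, so the previous range is retained.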

The mask application unit 16 combines the color image with the composite mask image and extracts only those pixels of the color image whose coordinates coincide with foreground pixels of the composite mask image, thereby generating an image containing only the subject to be extracted.
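Applying the composite mask amounts to copying color pixels where the mask is 1. In the sketch below, non-target pixels are replaced by a fill color; the fill color and the name `apply_mask` are assumptions, since the text does not say how non-extracted pixels are represented in the output.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
ColorImage = List[List[Pixel]]
Mask = List[List[int]]


def apply_mask(frame: ColorImage,
               mask: Mask,
               fill: Pixel = (0, 0, 0)) -> ColorImage:
    """Keep color pixels where the mask is 1; use `fill` elsewhere."""
    return [[px if m == 1 else fill for px, m in zip(frame_row, mask_row)]
            for frame_row, mask_row in zip(frame, mask)]


frame = [[(12, 11, 10), (200, 180, 160)],
         [(10, 10, 10), (90, 10, 10)]]
mask = [[0, 1],
        [0, 0]]
extracted = apply_mask(frame, mask)
```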

Next, the image processing method (subject extraction method) performed by the image processing apparatus 1 described above is explained. FIG. 7 shows the processing flow of the method.

First, in step S1, the background difference mask generation unit 11 reads the color image output from the imaging device 3.

Next, in step S2, the background difference mask generation unit 11 checks whether a background image for background subtraction has been acquired. If not, in step S3 the color image read in step S1 is stored as the background image. This background image is a background-only image obtained by shooting the scene with no subject that could become foreground present in the image.

Next, in step S4, the background difference mask generation unit 11 acquires the stored background image. Then, in step S5, it compares the acquired background image with the color image and, from the pixel-value differences, labels pixels whose change is at least a certain value as foreground and the remaining pixels as background, generating the background difference mask image.

Next, in step S6, the distance threshold mask generation unit 12 reads the depth map output from the measuring device 5.

Next, in step S7, the distance threshold mask generation unit 12 checks whether the two distance thresholds used to specify the range of the subject to be extracted have been set. If they have not, their initial values are set in step S8. The initial values are a minimum distance $D_{\min}^{t=0}$ and a maximum distance $D_{\max}^{t=0}$ chosen to contain the distance $d_0$ at which the subject to be extracted can initially exist ($D_{\min}^{t=0} \le d_0 \le D_{\max}^{t=0}$).

Next, in step S9, the distance threshold mask generation unit 12 reads the two stored distance thresholds. Then, in step S10, it labels the region of the acquired depth map whose distance values lie between the minimum distance $D_{\min}^{t=0}$ and the maximum distance $D_{\max}^{t=0}$ as foreground and the remaining region as background, generating the distance threshold mask image. If the resolution of the generated distance threshold mask image is lower than that of the color image, or its angle of view differs from that of the color image, resizing or geometric correction may be applied as needed to update it to a distance threshold mask image with the same resolution and angle of view as the color image.
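The resizing mentioned in step S10 could, for instance, be done by nearest-neighbor upsampling, which keeps the mask strictly binary. The text does not name an interpolation method, so nearest-neighbor is an assumption here, as is the function name `resize_mask_nearest`; geometric correction is not covered by this sketch.

```python
from typing import List

Mask = List[List[int]]


def resize_mask_nearest(mask: Mask, out_h: int, out_w: int) -> Mask:
    """Resize a 0/1 mask to out_h x out_w by nearest-neighbor sampling."""
    in_h, in_w = len(mask), len(mask[0])
    return [[mask[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)]
            for y in range(out_h)]


low_res = [[1, 0],
           [0, 1]]
high_res = resize_mask_nearest(low_res, 4, 4)
```

Each low-resolution pixel becomes a 2x2 block in the result. The blocky contours this produces are the jaggies discussed earlier, which the combination with the color-based mask is intended to compensate for.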

Next, in step S11, the mask combining unit 13 combines the background difference mask image and the distance threshold mask image to generate a composite mask image in which only pixels that are foreground in both masks are foreground and all other pixels are background.

Next, in step S12, for each pixel labeled foreground in the composite mask image, the distance estimation unit 14 determines the pixel's coordinates and the corresponding coordinates in the depth map, obtains the depth-map distance associated with each pixel, and computes the mode of the distances of the foreground pixels.

Next, in step S13, the distance parameter update unit 15 updates the minimum distance $D_{\min}$ and the maximum distance $D_{\max}$ so that the mode obtained in step S12 lies at the center of the distance threshold range specified as the distance of the subject to be extracted.

Finally, in step S14, the mask application unit 16 extracts only those pixels of the color image whose coordinates coincide with foreground-labeled pixels of the composite mask image. Then, in step S15, it writes out a region-extracted image consisting only of the extracted pixels.
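The flow of steps S5 and S10 to S14 over several frames can be sketched as a single function. To keep the example short, each image is reduced to one row of grayscale values and depths are integer millimetres; these simplifications, the function name `process_frame`, and all parameter values are assumptions for illustration. The point of the example is that the distance thresholds follow a subject moving away from the camera while an intruder in the far background is never extracted.

```python
from collections import Counter


def process_frame(background, frame, depth, d_min, d_max,
                  diff_threshold=30, min_pixels=2):
    """One pass over a single-row image; returns (extracted, d_min, d_max)."""
    # S5: background difference mask
    diff_mask = [1 if abs(f - b) >= diff_threshold else 0
                 for f, b in zip(frame, background)]
    # S10: distance threshold mask
    dist_mask = [1 if d_min <= d <= d_max else 0 for d in depth]
    # S11: composite mask (pixel-wise product)
    combined = [a * b for a, b in zip(diff_mask, dist_mask)]
    # S12: mode of the foreground distances
    counts = Counter(d for m, d in zip(combined, depth) if m)
    # S13: re-center the range on the mode (width kept constant, an assumption)
    if counts:
        mode, n = counts.most_common(1)[0]
        if n >= min_pixels:
            half = (d_max - d_min) // 2
            d_min, d_max = mode - half, mode + half
    # S14: keep only foreground pixels (None marks non-extracted pixels)
    extracted = [f if m else None for f, m in zip(frame, combined)]
    return extracted, d_min, d_max


background = [10, 10, 10, 10, 10, 10]
frame = [10, 200, 200, 10, 10, 150]      # subject at 1-2, intruder at 5
depths = [                               # subject recedes: 2000 -> 2800 mm
    [8000, 2000, 2000, 8000, 8000, 6000],
    [8000, 2400, 2400, 8000, 8000, 6000],
    [8000, 2800, 2800, 8000, 8000, 6000],
]
d_min, d_max = 1500, 2500                # initial distance threshold range
history = []
for depth in depths:
    extracted, d_min, d_max = process_frame(background, frame, depth,
                                            d_min, d_max)
    history.append((d_min, d_max))
```

In the third frame the subject is at 2800 mm, outside the initial 1500-2500 mm range, yet it is still extracted because the range had already moved to 1900-2900 mm after the second frame. The intruder at 6000 mm changes the color image but never enters the range.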

As described above, according to this embodiment, the distance to the subject to be extracted is estimated from a simple combination of two mask images, one based on background subtraction and one on distance thresholds, and the estimated distance is fed back to the distance parameters. The two range thresholds used for the distance threshold mask image therefore follow the movement of the subject, and the subject's position can be identified even when it moves in the depth direction.

That is, because the two range thresholds change dynamically with the subject's position, only the region within a minimal distance range is treated as the extraction target, and a subject other than the one to be extracted that enters a background region far from the subject is not extracted as foreground. The method is therefore robust to physical changes (movement in the distant view, lighting changes, intruding objects, and so on) at positions far from the subject to be extracted. Consequently, the extraction accuracy of the subject can be improved.

In addition, because background subtraction on the color image is used, high-accuracy region extraction can be achieved even with a low-accuracy depth map (one whose resolution is low and from which the contour of the subject cannot be obtained precisely). In other words, even if the depth map is coarse, the subject region can be extracted with high accuracy as long as the background difference mask image is accurate. Furthermore, because no high-accuracy depth map is required and both the mask generation and the combination with the color image are simple, low-load computations, subject-extracted video can be obtained in real time.

<Modification 1>
A modification of the background difference mask generation unit 11 will be described. When the accuracy of the background difference mask image is poor (the boundary of the subject to be extracted is not precise), the background difference mask generation unit 11 generates the background difference mask image by the following method.

First, instead of classifying the image into two regions (foreground and background) by background difference, each pixel is labeled as one of three regions: absolute foreground, which is definitely foreground; absolute background, which is definitely background; and an unclassified region, which cannot be clearly assigned to either. For example, a region whose degree of pixel-value change is equal to or greater than an upper threshold is absolute foreground, a region whose change is equal to or less than a lower threshold is absolute background, and a region whose change lies between the upper and lower thresholds is unclassified.
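The three-way labeling in the example above can be sketched as follows. The sketch assumes that a per-pixel degree of change has already been computed (how it is computed is not specified here), and the label constants and function name are hypothetical.

```python
FOREGROUND, BACKGROUND, UNKNOWN = 1, 0, -1   # hypothetical label values

def label_three_regions(change, lower, upper):
    """Label each pixel from its degree of change.

    change >= upper        -> absolute foreground
    change <= lower        -> absolute background
    lower < change < upper -> unclassified
    """
    return [[FOREGROUND if c >= upper
             else BACKGROUND if c <= lower
             else UNKNOWN
             for c in row]
            for row in change]

labels = label_three_regions([[0, 5, 10, 20, 30]], lower=5, upper=20)
```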

Next, a segmentation method using nearest neighbor search (Nearest Neighbor Classification; Non-Patent Document 3; the same applies to Modification 2) is applied to the region labeled as unclassified. For each unclassified pixel of interest, a weighting calculation is performed with reference to the color information of the absolute foreground or absolute background around that pixel, and the unclassified region is classified as absolute foreground or absolute background based on the result.
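One possible form of this weighting calculation is sketched below. It is not the method of Non-Patent Document 3: the inverse-distance weight, the square search window, the `radius` parameter and all names are assumptions chosen only to illustrate the idea of voting by color similarity to nearby labeled pixels.

```python
FOREGROUND, BACKGROUND, UNKNOWN = 1, 0, -1   # hypothetical label values

def resolve_unknown(labels, colors, radius=2):
    """Assign each unclassified pixel to foreground or background.

    Every labeled pixel within `radius` votes for its own label with a
    weight that grows as its color gets closer to the unclassified pixel.
    Pixels with no labeled neighbour fall back to background.
    """
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for y in range(h):
        for x in range(w):
            if labels[y][x] != UNKNOWN:
                continue
            score = {FOREGROUND: 0.0, BACKGROUND: 0.0}
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    lab = labels[ny][nx]
                    if lab == UNKNOWN:
                        continue
                    # Euclidean distance in color space
                    dist = sum((a - b) ** 2 for a, b in
                               zip(colors[y][x], colors[ny][nx])) ** 0.5
                    score[lab] += 1.0 / (1.0 + dist)
            out[y][x] = (FOREGROUND
                         if score[FOREGROUND] > score[BACKGROUND]
                         else BACKGROUND)
    return out

labels = [[FOREGROUND, UNKNOWN, BACKGROUND]]
reddish = [[(200, 0, 0), (190, 0, 0), (0, 0, 200)]]
bluish = [[(200, 0, 0), (0, 0, 190), (0, 0, 200)]]
```

With `reddish` the middle pixel resembles the foreground neighbour, and with `bluish` it resembles the background neighbour, so the same label map resolves differently depending on color.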

As a result, a more precise background difference mask image can be generated, and the extraction accuracy of the subject can be further improved.

<Modification 2>
A modification of the mask combining unit 13 will be described. As in Modification 1, when the accuracy of the background difference mask image obtained by background difference is poor, the mask combining unit 13 generates the composite mask image by the following method.

First, the background difference mask image and the distance threshold mask image are combined to generate a composite mask image. An unclassified region with a fixed pixel width is then created along the boundary between the foreground region and the background region of the composite mask image, so that, as in Modification 1, the image is labeled with three regions: absolute foreground, absolute background, and unclassified.
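Creating the unclassified band along the boundary can be sketched as follows. The sketch assumes a binary composite mask stored as a list of rows and marks as unclassified every pixel that has a pixel of the opposite label within `width` pixels (measured per side, in a square window); the names and the window shape are assumptions.

```python
UNKNOWN = -1   # hypothetical label for the unclassified region

def add_boundary_band(mask, width=1):
    """Return a copy of a 0/1 mask with an unclassified band on the boundary."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            window = (
                mask[ny][nx]
                for ny in range(max(0, y - width), min(h, y + width + 1))
                for nx in range(max(0, x - width), min(w, x + width + 1))
            )
            # A differing label nearby means this pixel lies on the boundary.
            if any(v != mask[y][x] for v in window):
                out[y][x] = UNKNOWN
    return out

banded = add_boundary_band([[1, 1, 1, 0, 0, 0]], width=1)
```

The remaining 1 and 0 pixels then play the roles of absolute foreground and absolute background for the subsequent classification step.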

Next, a segmentation method using nearest neighbor search is applied to the region labeled as unclassified. For each unclassified pixel of interest, a weighting calculation is performed with reference to the color information of the absolute foreground or absolute background around that pixel, and the unclassified region is classified as absolute foreground or absolute background based on the result.

As a result, a more precise composite mask image can be generated, and the extraction accuracy of the subject can be further improved.

<Modification 3>
When the weighting calculation of Modification 1 or Modification 2 is performed, the distance information of the depth map (a depth map whose angle of view and resolution match those of the color image) may be referred to in addition to the color information, so that the closeness of distance values in physical space is taken into account together with the closeness of color values in color space. As a result, a more precise background difference mask image or composite mask image can be generated, and the extraction accuracy of the subject can be further improved.
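A weight that takes both color closeness and distance closeness into account could look like the following sketch. The Gaussian form and the two scale parameters `sigma_color` and `sigma_depth` are assumptions made for illustration; the text does not specify the weighting function.

```python
import math

def similarity_weight(color_a, color_b, depth_a, depth_b,
                      sigma_color=25.0, sigma_depth=100.0):
    """Weight in (0, 1]: large when two pixels are close in color AND depth.

    sigma_color and sigma_depth are assumed tuning parameters that set how
    quickly the weight falls off in color space and in physical distance.
    """
    color_sq = sum((a - b) ** 2 for a, b in zip(color_a, color_b))
    depth_sq = (depth_a - depth_b) ** 2
    return math.exp(-color_sq / (2 * sigma_color ** 2)
                    - depth_sq / (2 * sigma_depth ** 2))

same = similarity_weight((10, 10, 10), (10, 10, 10), 1000, 1000)
near = similarity_weight((10, 10, 10), (10, 10, 10), 1000, 1010)
far = similarity_weight((10, 10, 10), (10, 10, 10), 1000, 3000)
```

Here `near` and `far` compare pixels of identical color, so the difference between them comes only from the depth term: a same-colored pixel far behind the subject contributes much less than one at a similar distance.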

<Modification 4>
A modification of the distance threshold mask generation unit 12 will be described. When the angle of view of the depth map differs from that of the color image, or when the resolution of the depth map is lower than that of the color image, the distance threshold mask generation unit 12 takes the angle of view and resolution of the color image as the reference and, using the center of the two distance thresholds ((minimum distance D_min − maximum distance D_max) / 2) as the distance to the subject, applies geometric correction, resizing, and the like to the acquired depth map so that its resolution and angle of view become the same as those of the color image. As a result, the distance threshold mask image generated by the distance threshold mask generation unit 12 can be made finer, and the extraction accuracy of the subject can be further improved.
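The resizing part of this step can be sketched as follows. The sketch covers only a resolution change by nearest-neighbour sampling; geometric correction for a differing angle of view is not shown, and the choice of nearest-neighbour sampling and the function name are assumptions.

```python
def resize_nearest(depth, out_h, out_w):
    """Resize a depth map (list of rows) to out_h x out_w.

    Nearest-neighbour sampling copies existing depth values instead of
    blending them, so no intermediate distances are invented at edges.
    """
    in_h, in_w = len(depth), len(depth[0])
    return [[depth[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)]
            for y in range(out_h)]

# A 2x2 depth map brought up to a 4x4 color-image resolution.
upscaled = resize_nearest([[1, 2], [3, 4]], 4, 4)
```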

<Modification 5>
A modification of the mask combining unit 13 will be described. When the resolution of the depth map is coarse and, for example, only distance information near the boundary of the subject can be obtained as shown in FIG. 8, the mask combining unit 13 generates the composite mask image by the following method.

First, the closed foreground regions contained in the background difference mask image are searched for. Next, for each closed region, the proportion of its pixels whose coordinates correspond to foreground pixels in the distance threshold mask image is calculated, and a composite mask image is generated in which the pixels of every closed region whose proportion is equal to or greater than a fixed value are foreground and all other pixels are background. As a result, an appropriate composite mask image can be generated, and the extraction accuracy of the subject can be further improved.
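The region-wise combination described above can be sketched as follows. The sketch treats each 4-connected foreground component of the background difference mask as one closed region; the connectivity, the `min_ratio` parameter and the function name are assumptions.

```python
from collections import deque

def combine_by_region_ratio(diff_mask, range_mask, min_ratio=0.1):
    """Keep each connected foreground region of diff_mask whose share of
    range_mask foreground pixels is at least min_ratio."""
    h, w = len(diff_mask), len(diff_mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if not diff_mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first search collects one closed region.
            region, queue = [], deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and diff_mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            hits = sum(range_mask[y][x] for y, x in region)
            if hits / len(region) >= min_ratio:
                for y, x in region:
                    out[y][x] = 1
    return out

diff_mask = [[1, 1, 0, 1], [1, 1, 0, 1]]
range_mask = [[1, 0, 0, 0], [0, 0, 0, 0]]   # sparse depth hits only
composite = combine_by_region_ratio(diff_mask, range_mask, min_ratio=0.25)
```

In this example the left region has one depth hit out of four pixels and is kept whole, while the right region has none and is discarded, which is how a sparse depth map can still select complete subject regions.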

The present embodiment and its modifications have been described above. The image processing apparatus 1 described in the present embodiment can be realized by a computer. It is also possible to create an image processing program for causing a computer to function as the image processing apparatus 1, and a storage medium storing that image processing program.

1 ... image processing apparatus
11 ... background difference mask generation unit (first generation unit)
12 ... distance threshold mask generation unit (second generation unit)
13 ... mask combining unit (combining unit)
14 ... distance estimation unit (estimation unit)
15 ... distance parameter update unit (update unit)
16 ... mask application unit (application unit)
3 ... imaging device
5 ... measuring device

Claims

An image processing apparatus comprising:
a first generation unit that generates, from time-series color images, a first mask image classified into a change area in which the image has changed and a non-change area;
a second generation unit that generates, from a depth map corresponding to the color image, a second mask image classified into an inter-threshold area lying between two distance thresholds and a non-inter-threshold area;
a combining unit that combines the first mask image and the second mask image to generate a composite mask image having a target area, which belongs to both the change area and the inter-threshold area, and a non-target area;
an estimation unit that obtains, from the depth map, a mode of the distances corresponding to the target area of the composite mask image;
an updating unit that updates the two distance thresholds based on the mode of the distances; and
an application unit that extracts, from the color image, an area corresponding to the target area of the composite mask image.

The image processing apparatus according to claim 1, wherein the updating unit updates the two distance thresholds such that the mode of the distances lies at the center between the two distance thresholds.

The image processing apparatus according to claim 1 or 2, wherein the first mask image includes an unclassified area whose degree of image change is smaller than that of the change area and larger than that of the non-change area, and
the first generation unit classifies the unclassified area into the change area or the non-change area by using color information or distance information of a peripheral area within a certain range from the unclassified area.

The image processing apparatus according to any one of claims 1 to 3, wherein the combining unit forms an unclassified area of a certain width at the boundary between the target area and the non-target area, and classifies the unclassified area into the target area or the non-target area by using color information or distance information of a peripheral area within a certain range from the unclassified area.

The image processing apparatus according to any one of claims 1 to 4, wherein the second generation unit changes the depth map so that it has the same angle of view or resolution as the color image.

The image processing apparatus according to any one of claims 1 to 5, wherein the combining unit sets, as the target area, a change area for which the ratio of the inter-threshold area to the change area is equal to or greater than a threshold.

An image processing method performed by an image processing apparatus, the method comprising the steps of:
generating, from time-series color images, a first mask image classified into a change area in which the image has changed and a non-change area;
generating, from a depth map corresponding to the color image, a second mask image classified into an inter-threshold area lying between two distance thresholds and a non-inter-threshold area;
combining the first mask image and the second mask image to generate a composite mask image having a target area, which belongs to both the change area and the inter-threshold area, and a non-target area;
obtaining, from the depth map, a mode of the distances corresponding to the target area of the composite mask image;
updating the two distance thresholds based on the mode of the distances; and
extracting, from the color image, an area corresponding to the target area of the composite mask image.

An image processing program that causes a computer to function as the image processing apparatus according to any one of claims 1 to 6.