JP7382479B1 - Image processing device, program, and image processing method

Info

Publication number: JP7382479B1
Application number: JP2022196586A
Authority: JP
Inventors: 淳郎岡澤; 哲太佐藤; 昇平大村
Original assignee: SoftBank Corp
Current assignee: SoftBank Corp
Priority date: 2022-12-08
Filing date: 2022-12-08
Publication date: 2023-11-16
Anticipated expiration: 2042-12-08

Abstract

[Problem] To provide an image processing device, a program, and an image processing method that estimate, for each of a plurality of pixels of a target image, the object corresponding to that pixel. [Solution] An image processing device 100 includes an image acquisition unit 110 that acquires a target image, an estimation unit 120 that estimates, for each of a plurality of pixels of the target image, the object corresponding to the pixel, and a detection unit 130 that detects an erroneously estimated portion by applying, to the estimation result of the estimation unit 120, filter processing that responds to portions erroneously estimated in reaction to fine pattern changes. [Selected drawing] FIG. 1

Description

The present invention relates to an image processing device, a program, and an image processing method.

Non-Patent Document 1 describes FSS (few-shot segmentation), which segments images using few-shot learning. Patent Document 1 describes a pixel-level object detection system that detects objects at the pixel level. The system includes an imaging unit that collects inference images, which are images of a detection target; a region detection unit that detects, from an inference image, a region containing the detection target; a detailed detection unit that detects the detection target from the inference image using only local information; and a result integration processing unit that integrates the output of the region detection unit with the output of the detailed detection unit and outputs a segmentation map indicating, as a probability map, which pixels in the image correspond to the detection target.
[Prior art documents]
[Non-patent literature]
[Non-Patent Document 1] Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, Yang Gao: Mining Latent Classes for Few-shot Segmentation (2021), The IEEE International Conference on Computer Vision (ICCV)
[Patent literature]
[Patent Document 1] Japanese Patent Application Publication No. 2021-174182

According to one embodiment of the present invention, an image processing device is provided. The image processing device may include an image acquisition unit that acquires a target image. The image processing device may include an estimation unit that estimates, for each of a plurality of pixels of the target image, the object corresponding to the pixel. The image processing device may include a detection unit that detects an erroneously estimated portion, that is, a portion erroneously estimated in reaction to fine pattern changes, by applying to the estimation result of the estimation unit filter processing that responds to such portions.

In the image processing device, the detection unit may apply a filter of 5×5 pixels or larger to the target image for which the estimation unit has performed estimation, thereby determining a reference pixel and a plurality of comparison target pixels to be compared with the reference pixel, and may detect the erroneously estimated portion based on the comparison result. The detection unit may take the pixel located at the center of the filter as the reference pixel and the plurality of pixels located on the outer periphery of the filter as the comparison target pixels, and may detect the reference pixel as the erroneously estimated portion when the number of matches between the object estimated by the estimation unit for the reference pixel and the objects estimated for the respective comparison target pixels is 0, or is less than or equal to a predetermined threshold.

Any of the image processing devices described above may further include a determination unit that identifies, object by object, how many of a plurality of pixels surrounding a pixel of the erroneously estimated portion in the target image correspond to the same object, and determines, based on the identification result, the object to be associated with the pixel of the erroneously estimated portion. The image processing device may further include a replacement processing unit that replaces the object that the estimation unit estimated for the pixel of the erroneously estimated portion in the target image with the object determined by the determination unit.

In any of the image processing devices described above, the detection unit may change the size of the target image, execute the process of detecting the erroneously estimated portion on each of a plurality of differently sized versions of the target image, and treat a portion detected as the erroneously estimated portion in at least one of those versions as the erroneously estimated portion of the target image.

Any of the image processing devices described above may include a support data acquisition unit that acquires, for each of a plurality of objects, support data including a support image in which the object is captured and annotation data indicating the position of the object in the support image. The estimation unit may estimate, for each of the plurality of pixels of the target image, the object corresponding to the pixel by executing matching inference processing using the plurality of pieces of support data acquired by the support data acquisition unit.

According to one embodiment of the present invention, a program for causing a computer to function as the image processing device is provided.

According to one embodiment of the present invention, an image processing method executed by a computer is provided. The image processing method may include an image acquisition step of acquiring a target image. The image processing method may include an estimation step of estimating, for each of a plurality of pixels of the target image, the object corresponding to the pixel. The image processing method may include a detection step of detecting an erroneously estimated portion, that is, a portion erroneously estimated in reaction to fine pattern changes, by applying to the estimation result of the estimation step filter processing that responds to such portions.

Note that the above summary of the invention does not enumerate all of the necessary features of the present invention. Subcombinations of these groups of features may also constitute inventions.

An example of the functional configuration of the image processing device 100 is schematically shown.
An example of the filter 200 is schematically shown.
An example of the filter 200 is schematically shown.
An example of the flow of processing by the image processing device 100 is schematically shown.
An explanatory diagram for explaining multi-scale processing by the image processing device 100.
An example of the functional configuration of the estimation unit 120 is schematically shown.
An explanatory diagram for schematically explaining FSS by the estimation unit 120.
An explanatory diagram for schematically explaining FSS by the estimation unit 120.
An explanatory diagram for schematically explaining FSS by the estimation unit 120.
An example of the hardware configuration of a computer 1200 functioning as the image processing device 100 is schematically shown.

In FSS and similar techniques, it is necessary to prepare annotation data indicating the position of the recognition target in a support image in which the recognition target is captured. Techniques are known that estimate the position of a recognition target in an image and present the estimation result to a user, but the estimation result may contain so-called dot dust. Dot dust refers to a part of the region occupied by one object in an image that has been erroneously estimated to be another object. In particular, when an image contains fine patterns, the estimation result contains a large amount of dot dust. When the estimation result contains a large amount of dot dust, the user's workload for removing it increases and work efficiency drops significantly. A high-quality technique for removing dot dust is therefore needed. The invention of Patent Document 1 performs edge enhancement using pixel differences, based on the idea of emphasizing parts where the pattern changes greatly, but under that idea dot dust is detected along with edges, which makes dot dust difficult to deal with. In contrast, the image processing device 100 according to the present embodiment, for example, detects specific patterns in the image that correspond to dot dust and corrects those regions.

Hereinafter, the present invention will be described through embodiments of the invention, but the following embodiments do not limit the invention according to the claims. Furthermore, not all combinations of features described in the embodiments are essential to the solution provided by the invention.

FIG. 1 schematically shows an example of the functional configuration of the image processing device 100. The image processing device 100 includes a storage unit 102, an image acquisition unit 110, an estimation unit 120, a detection unit 130, a determination unit 140, a replacement processing unit 150, and an output control unit 160. Note that the image processing device 100 does not necessarily have to include all of these.

The image acquisition unit 110 acquires a target image to be processed. The image acquisition unit 110 may acquire the target image from outside the image processing device 100. For example, the image acquisition unit 110 receives the target image from another device. For example, the image acquisition unit 110 receives, from a camera, a target image captured by that camera. The image acquisition unit 110 stores the acquired target image in the storage unit 102.

The estimation unit 120 estimates, for each of the plurality of pixels of the target image acquired by the image acquisition unit 110, the object corresponding to the pixel.

The estimation unit 120 estimates, for each of the plurality of pixels of the target image, the object corresponding to the pixel by, for example, executing image segmentation using existing AI (Artificial Intelligence) technology such as a CNN (Convolutional Neural Network). The estimation unit 120 executes, for example, semantic segmentation, instance segmentation, or panoptic segmentation.

The estimation unit 120 estimates, for each of the plurality of pixels of the target image, the object corresponding to the pixel by, for example, executing FSS. The estimation unit 120 acquires in advance, for each of a plurality of objects, support data including a support image in which the object is captured and annotation data indicating the position of the object in the support image, and uses that support data to execute FSS.

The estimation result of the estimation unit 120 may be data in which every pixel of the target image is associated with an object ID indicating the object to which the pixel corresponds. As a specific example, in the target image, pixels estimated to be an airplane are associated with object ID 0 indicating an airplane, pixels estimated to be a road with object ID 1 indicating a road, pixels estimated to be sky with object ID 2 indicating sky, and pixels estimated to be a building with object ID 3 indicating a building.
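As a rough illustration of the data described above, the estimation result can be pictured as a two-dimensional array of object IDs with the same shape as the target image. The sketch below is only an assumption for illustration: the names `OBJECT_NAMES`, `label_map`, and `pixels_per_object` are hypothetical, and the small 4×6 map is invented sample data, not output of the estimation unit 120.

```python
from collections import Counter

# Object IDs as given in the example in the text.
OBJECT_NAMES = {0: "airplane", 1: "road", 2: "sky", 3: "building"}

# Hypothetical 4x6 estimation result: one object ID per pixel.
label_map = [
    [2, 2, 2, 2, 3, 3],
    [2, 0, 0, 2, 3, 3],
    [1, 1, 1, 1, 3, 3],
    [1, 1, 1, 1, 1, 1],
]


def pixels_per_object(label_map):
    """Count how many pixels are associated with each object ID."""
    return Counter(obj_id for row in label_map for obj_id in row)


counts = pixels_per_object(label_map)
for obj_id, n in sorted(counts.items()):
    print(f"object ID {obj_id} ({OBJECT_NAMES[obj_id]}): {n} pixels")
```

Storing only integer IDs per pixel keeps the later comparison steps simple, since two pixels correspond to the same object exactly when their IDs are equal.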

The detection unit 130 detects an erroneously estimated portion, that is, a portion erroneously estimated in reaction to fine pattern changes, by applying to the estimation result of the estimation unit 120 filter processing that responds to such portions. This allows the detection unit 130 to detect the parts of the estimation result of the estimation unit 120 that correspond to dot dust.

The detection unit 130 may apply a filter of 5×5 pixels or larger to the target image for which the estimation unit 120 has performed estimation, thereby determining a reference pixel and a plurality of comparison target pixels to be compared with the reference pixel, and may detect the erroneously estimated portion based on the comparison result. By applying the filter to every pixel of the target image, the detection unit 130 may determine, for each pixel, whether it corresponds to an erroneously estimated portion.

For example, the detection unit 130 takes the pixel located at the center of the filter as the reference pixel and the plurality of pixels located on the outer periphery of the filter as the comparison target pixels, and detects the reference pixel as an erroneously estimated portion when the number of matches between the object estimated by the estimation unit 120 for the reference pixel and the objects estimated for the respective comparison target pixels is 0. When the object IDs of all pixels on the outer periphery differ from the object ID of the reference pixel at the center, the estimation result for the reference pixel is very likely to be wrong. Dot dust can therefore be detected with very high accuracy in this way.

For example, the detection unit 130 takes the pixel located at the center of the filter as the reference pixel and the plurality of pixels located on the outer periphery of the filter as the comparison target pixels, and detects the reference pixel as an erroneously estimated portion when the number of matches between the object estimated by the estimation unit 120 for the reference pixel and the objects estimated for the respective comparison target pixels is less than or equal to a predetermined threshold. When the object IDs of most pixels on the outer periphery differ from the object ID of the reference pixel at the center, the estimation result for the reference pixel is likely to be wrong. Dot dust can therefore be detected with high accuracy in this way. The threshold may be set arbitrarily and may be changeable. For example, a user of the image processing device 100 may set the threshold empirically. Alternatively, a user of the image processing device 100 may run experiments with each of several threshold values, determine the dot dust detection accuracy for each, and set the threshold according to the results.
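The center-versus-outer-periphery comparison described above can be sketched as follows. This is a minimal sketch, not the patented implementation: the function names `ring_offsets` and `detect_misestimated` are hypothetical, the restriction to odd filter sizes (so that a single center pixel exists) is an assumption, and so is the choice to skip outer-periphery pixels that fall outside the image, since the text does not say how image borders are handled.

```python
def ring_offsets(size):
    """Offsets (dy, dx) of the outer-periphery pixels of a size x size filter."""
    r = size // 2
    return [(dy, dx)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if max(abs(dy), abs(dx)) == r]


def detect_misestimated(label_map, size=5, threshold=0):
    """Return the set of (row, col) pixels flagged as erroneously estimated.

    A pixel is flagged when its object ID matches at most `threshold`
    of the pixels on the outer periphery of the filter centered on it.
    """
    if size < 5 or size % 2 == 0:
        # Assumption: odd sizes only, so the filter has one center pixel.
        raise ValueError("size must be an odd number >= 5")
    height, width = len(label_map), len(label_map[0])
    offsets = ring_offsets(size)
    flagged = set()
    for y in range(height):
        for x in range(width):
            matches = 0
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                # Assumption: periphery pixels outside the image are skipped.
                if 0 <= ny < height and 0 <= nx < width:
                    if label_map[ny][nx] == label_map[y][x]:
                        matches += 1
            if matches <= threshold:
                flagged.add((y, x))
    return flagged


# A 7x7 map of road pixels (ID 1) with one isolated airplane pixel (ID 0).
label_map = [[1] * 7 for _ in range(7)]
label_map[3][3] = 0
print(detect_misestimated(label_map, size=5, threshold=0))
```

With `threshold=0` this corresponds to the strict zero-match case; raising `threshold` corresponds to the more permissive variant in which a few matching periphery pixels are tolerated.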

The determination unit 140 determines the object to be newly associated with a pixel of the erroneously estimated portion in the target image. The determination unit 140 may identify, object by object, how many of a plurality of pixels surrounding a pixel of the erroneously estimated portion in the target image correspond to the same object, and may determine, based on the identification result, the object to be associated with the pixel of the erroneously estimated portion.

The determination unit 140 may determine a complementary ID to be used in place of the object ID associated with a pixel of the erroneously estimated portion in the target image. For example, the determination unit 140 determines, as the complementary ID, the object ID that is the mode of the object IDs corresponding to the pixels surrounding the pixel of the erroneously estimated portion. For example, the determination unit 140 determines, as the complementary ID, the object ID that is the mode of the object IDs corresponding to the 3×3 pixels around the pixel of the erroneously estimated portion. The neighborhood is not limited to 3×3 and may be n×n (n is 2 or more) or n×m (n is 2 or more, m is 3 or more). The determination unit 140 may also determine, as the complementary ID, the object ID that is the mode of the object IDs corresponding to the eight pixels adjacent to the pixel of the erroneously estimated portion. Note that the determination unit 140 may use the median instead of the mode of the object IDs corresponding to the surrounding pixels.
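The mode-based choice of a complementary ID might look like the sketch below. The function name `complementary_id` and its parameters `radius` and `include_center` are hypothetical; clipping the window at the image border and breaking ties in favor of the ID encountered first in row-major order are assumptions, because the text does not specify either. The median alternative mentioned above is not shown.

```python
from collections import Counter


def complementary_id(label_map, row, col, radius=1, include_center=True):
    """Most frequent object ID in the window centered on (row, col).

    radius=1 gives a 3x3 window. With include_center=False only the
    adjacent pixels are counted (eight of them away from the border).
    """
    height, width = len(label_map), len(label_map[0])
    window = []
    for y in range(max(0, row - radius), min(height, row + radius + 1)):
        for x in range(max(0, col - radius), min(width, col + radius + 1)):
            if include_center or (y, x) != (row, col):
                window.append(label_map[y][x])
    # Assumption: on a tie, the ID seen first in row-major order wins,
    # which is how Counter.most_common orders equal counts.
    return Counter(window).most_common(1)[0][0]


# 3x3 neighborhood: six road pixels (1), two sky pixels (2), one airplane pixel (0).
grid = [
    [1, 1, 2],
    [1, 0, 2],
    [1, 1, 1],
]
print(complementary_id(grid, 1, 1))
```

In this sample neighborhood the road ID occurs most often, so the flagged center pixel would be reassigned to the road.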

The replacement processing unit 150 replaces the object that the estimation unit 120 estimated for a pixel of the erroneously estimated portion in the target image with the object determined by the determination unit 140. The replacement processing unit 150 may replace the object ID that the estimation unit 120 associated with the erroneously estimated portion in the target image with the complementary ID determined by the determination unit 140.
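The replacement step itself is small; one way to picture it is shown below. The function name `replace_misestimated` and the dictionary `complementary_ids`, which maps each flagged pixel position to its new object ID, are hypothetical. Returning a corrected copy instead of modifying the original map is a design choice made for this sketch, not something the text requires.

```python
def replace_misestimated(label_map, complementary_ids):
    """Return a copy of label_map with the flagged pixels replaced.

    complementary_ids maps (row, col) of each flagged pixel to the
    object ID that should replace the originally estimated one.
    """
    corrected = [list(row) for row in label_map]
    for (row, col), new_id in complementary_ids.items():
        corrected[row][col] = new_id
    return corrected


original = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
corrected = replace_misestimated(original, {(1, 1): 1})
print(corrected)
```

Keeping the original map untouched makes it possible to show the result before and after correction side by side.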

The output control unit 160 controls various outputs. For example, the output control unit 160 causes various information to be displayed. The output control unit 160 may display various information on a display included in the image processing device 100. The output control unit 160 may also transmit various information to another device so that it is displayed on a display included in that device.

For example, the output control unit 160 causes the estimation result of the estimation unit 120 to be displayed. For example, the output control unit 160 causes the detection result of the detection unit 130 to be displayed. For example, the output control unit 160 causes the determination result of the determination unit 140 to be displayed.

For example, the output control unit 160 superimposes the estimation result of the estimation unit 120 and the detection result of the detection unit 130 on the target image. For example, the output control unit 160 displays the target image with each of its pixels given the color corresponding to the object ID estimated by the estimation unit 120, and with a display mode that identifies erroneously estimated portions applied to the pixels detected as such. The output control unit 160 may further superimpose the determination result of the determination unit 140. For example, the output control unit 160 displays, for an erroneously estimated portion, the object ID of the object determined by the determination unit 140 as a correction candidate. The replacement processing unit 150 may replace the object ID in response to an instruction to replace it with the object ID displayed as the correction candidate. After the replacement processing unit 150 has performed the replacement, the output control unit 160 may display the target image with each of its pixels given the color corresponding to its object ID.
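A simplified picture of the color-coded display is sketched below. Everything here is assumed for illustration: the `PALETTE` colors, the `HIGHLIGHT` color used as the display mode for flagged pixels, and the function name `render_overlay` are hypothetical, and the sketch only produces RGB tuples per pixel, leaving out any blending with the target image itself.

```python
# Hypothetical colors per object ID (airplane, road, sky, building).
PALETTE = {0: (255, 0, 0), 1: (128, 128, 128), 2: (0, 128, 255), 3: (160, 82, 45)}
# Hypothetical color that marks pixels detected as erroneously estimated.
HIGHLIGHT = (255, 255, 0)


def render_overlay(label_map, flagged):
    """Return rows of RGB tuples, with flagged pixels drawn in HIGHLIGHT."""
    return [[HIGHLIGHT if (y, x) in flagged else PALETTE[obj_id]
             for x, obj_id in enumerate(row)]
            for y, row in enumerate(label_map)]


overlay = render_overlay([[1, 0], [1, 1]], flagged={(0, 1)})
print(overlay[0])
```

After replacement, calling the same function with an empty `flagged` set would give the plain color-coded view of the corrected result.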

FIG. 2 schematically shows an example of the filter 200 used by the detection unit 130. Here, the case where the size of the filter 200 is 5×5 is illustrated.

The detection unit 130 may take the pixel 210 located at the center of the filter 200 as the reference pixel and the plurality of pixels 230 located on the outer periphery of the filter 200 as the comparison target pixels.

For example, the detection unit 130 detects the reference pixel as an erroneously estimated portion when the number of matches between the object estimated by the estimation unit 120 for the reference pixel and the objects estimated for the respective comparison target pixels is 0. In the example shown in FIG. 2, the detection unit 130 compares the object ID corresponding to the pixel 210 with the object ID corresponding to each of the 16 pixels 230. The detection unit 130 then detects the pixel 210 as an erroneously estimated portion when all of the object IDs corresponding to the 16 pixels 230 differ from the object ID corresponding to the pixel 210.

For example, the detection unit 130 may detect the reference pixel as an erroneously estimated portion when the number of matches between the object estimated by the estimation unit 120 for the reference pixel and the objects estimated for the respective comparison target pixels is less than or equal to a predetermined threshold (assumed here to be 2). In the example shown in FIG. 2, the detection unit 130 compares the object ID corresponding to the pixel 210 with the object ID corresponding to each of the 16 pixels 230. The detection unit 130 then detects the pixel 210 as an erroneously estimated portion when 14 or more of the object IDs corresponding to the 16 pixels 230 differ from the object ID corresponding to the pixel 210.

When the pixel 210 is detected as an erroneously estimated portion, the determination unit 140 may determine the object to be associated with the pixel 210 based on the objects corresponding to a plurality of pixels around the pixel 210. The determination unit 140 may identify, object by object, how many of the surrounding pixels correspond to the same object, and may determine, based on the identification result, the object to be associated with the pixel 210 of the erroneously estimated portion. In the example shown in FIG. 2, the determination unit 140 treats the pixel 210 and the eight pixels 220 as the surrounding pixels. The determination unit 140 then identifies, for each object ID, how many of the pixel 210 and the eight pixels 220 have that object ID, and determines the object ID to be associated with the pixel 210 based on the identification result. For example, when six pixels have the road object ID, two have the sky object ID, and one has the airplane object ID, the determination unit 140 associates the pixel 210 with the road object ID, which is the most frequent.

The detection unit 130 may also execute comparison processing in which the pixel 210 located at the center of the filter 200 is the reference pixel and the plurality of pixels 220 located on the inner periphery of the filter 200 are the comparison target pixels. In the example shown in FIG. 2, the detection unit 130 may calculate the number of matches between the object ID estimated by the estimation unit 120 for the pixel 210 and the object IDs corresponding to the eight pixels 220.

FIG. 3 schematically shows an example of the filter 200 used by the detection unit 130. Here, the case where the size of the filter 200 is 7×7 is illustrated.

For example, the detection unit 130 detects the reference pixel as an erroneously estimated portion when the number of matches between the object estimated by the estimation unit 120 for the reference pixel and the objects estimated for the respective comparison target pixels is 0. In the example shown in FIG. 3, the detection unit 130 compares the object ID corresponding to the pixel 210 with the object ID corresponding to each of the 24 pixels 230. The detection unit 130 then detects the pixel 210 as an erroneously estimated portion when all of the object IDs corresponding to the 24 pixels 230 differ from the object ID corresponding to the pixel 210.

For example, the detection unit 130 may detect the reference pixel as an incorrectly estimated portion when the number of matches between the object that the estimation unit 120 estimated for the reference pixel and the objects corresponding to each of the plurality of comparison target pixels is less than or equal to a predetermined threshold (assumed here to be 4). In the example shown in FIG. 3, for example, the detection unit 130 compares the object ID corresponding to the pixel 210 with the object ID corresponding to each of the 24 pixels 230. The detection unit 130 then detects the pixel 210 as an incorrectly estimated portion when 20 or more of the object IDs corresponding to the 24 pixels 230 differ from the object ID corresponding to the pixel 210.
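The comparison against the outer ring of the filter can be sketched as follows. This is an illustrative sketch only, not the implementation of the embodiment: the function name `detect_misestimated`, the use of NumPy, and the padding of the image border with `-1` (assumed to be an unused object ID) are all assumptions, since the specification does not state how border pixels are handled.

```python
import numpy as np

def detect_misestimated(labels, filter_size=7, threshold=0):
    """Flag pixels whose object ID matches at most `threshold` pixels
    on the outer ring of a filter_size x filter_size window."""
    r = filter_size // 2
    h, w = labels.shape
    # Assumption: pad with -1 (an unused ID) so border pixels can be processed.
    padded = np.pad(labels, r, mode="constant", constant_values=-1)
    matches = np.zeros((h, w), dtype=int)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if max(abs(dy), abs(dx)) != r:
                continue  # keep only the outer ring (24 pixels for 7x7)
            shifted = padded[r + dy : r + dy + h, r + dx : r + dx + w]
            matches += shifted == labels  # count ring pixels sharing the ID
    return matches <= threshold

# A single stray ID in an otherwise uniform 9x9 label map
labels = np.zeros((9, 9), dtype=int)
labels[4, 4] = 1
mask = detect_misestimated(labels, filter_size=7, threshold=0)
```

With `threshold=0` the sketch corresponds to the zero-match condition, and with `threshold=4` to the condition that 20 or more of the 24 ring pixels differ. Passing `filter_size=3` compares against the eight pixels adjacent to the reference pixel instead.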

When the pixel 210 is detected as an incorrectly estimated portion, the determining unit 140 may determine the object to be associated with the pixel 210 based on the objects corresponding to a plurality of pixels around the pixel 210. For the plurality of surrounding pixels, the determining unit 140 may count, for each object, the number of pixels corresponding to that object, and may determine the object to be associated with the pixel 210 of the incorrectly estimated portion based on the counts. In the example shown in FIG. 3, for example, the determining unit 140 treats the pixel 210 and the eight pixels 220 as the surrounding pixels. The determining unit 140 then counts, for each object ID, the number of pixels among the pixel 210 and the eight pixels 220 that have that object ID, and determines the object ID to be associated with the pixel 210 based on the counts. Alternatively, for example, the determining unit 140 may treat the pixel 210, the eight pixels 220, and the 16 pixels surrounding them as the surrounding pixels.

The detection unit 130 may also perform comparison processing in which the pixel 210 located at the center of the filter 200 is the reference pixel and the plurality of pixels 220 located on the inner periphery of the filter 200 are the comparison target pixels. In the example shown in FIG. 3, the detection unit 130 may calculate the number of matches between the object ID that the estimation unit 120 estimated for the pixel 210 and the object IDs corresponding to each of the eight pixels 220.

図４は、画像処理装置１００による処理の流れの一例を概略的に示す。ここでは、ドットごみを検出する対象となる対象画像を取得して、当該対象画像のドットごみを検出し、ドットごみを除去する流れを説明する。 FIG. 4 schematically shows an example of the flow of processing by the image processing apparatus 100. Here, a flow of acquiring a target image from which dot dust is to be detected, detecting dot dust in the target image, and removing the dot dust will be described.

In step 102 (hereinafter, "step" may be abbreviated as "S"), the image acquisition unit 110 acquires a target image. In S104, the estimation unit 120 estimates the object corresponding to each pixel of the target image acquired by the image acquisition unit 110 in S102. The estimation unit 120 associates each pixel with the object ID of the estimated object.

Ｓ１０６では、検出部１３０が、ドットごみ検出処理を実行する。検出部１３０は、Ｓ１０４における推定結果に対して、フィルタ２００を用いたフィルタ処理を施すことによって、誤推定部分を検出する。検出部１３０は、対象画像の全画素に対して、フィルタ２００を用いたフィルタ処理を施してよい。 In S106, the detection unit 130 executes dot dust detection processing. The detection unit 130 detects an erroneously estimated portion by performing filter processing using the filter 200 on the estimation result in S104. The detection unit 130 may perform filter processing using the filter 200 on all pixels of the target image.

ドットごみが１つでも検出された場合（Ｓ１０８でＹＥＳ）、Ｓ１１０に進み、ドットごみが検出されなかった場合（Ｓ１０８でＮＯ）、処理を終了する。Ｓ１１０では、決定部１４０が、Ｓ１０６において検出された誤推定部分の画素の物体ＩＤに対して、補完すべき補完ＩＤを算出する。決定部１４０は、誤推定部分の画素の周辺の複数の画素について、対応する物体ＩＤが同一である画素の数を、物体ＩＤ毎に特定し、特定結果に基づいて、補完ＩＤを決定する。 If even one dot dust is detected (YES in S108), the process advances to S110, and if no dot dust is detected (NO in S108), the process ends. In S110, the determining unit 140 calculates a complementary ID to be complemented for the object ID of the pixel in the incorrectly estimated portion detected in S106. The determining unit 140 identifies, for each object ID, the number of pixels with the same corresponding object ID for a plurality of pixels surrounding the pixel in the incorrectly estimated portion, and determines a complementary ID based on the identification result.

Ｓ１１２では、置換処理部１５０が、Ｓ１０４において推定された、対象画像における誤推定部分の画素に対応する物体ＩＤを、Ｓ１１０において決定された補完ＩＤに置き換える。そして、処理を終了する。 In S112, the replacement processing unit 150 replaces the object ID corresponding to the pixel in the incorrectly estimated portion of the target image estimated in S104 with the complementary ID determined in S110. Then, the process ends.
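The flow from S106 to S112 can be sketched on an already estimated label map as follows. This is a minimal sketch under stated assumptions: S102 and S104 are taken as already done, a 3×3 filter with a zero-match condition is used, the border is padded with `-1` (assumed to be an unused object ID), and the name `remove_dot_dust` is hypothetical.

```python
from collections import Counter
import numpy as np

def remove_dot_dust(labels):
    """Run S106 to S112 on a per-pixel object ID map and return
    (corrected_map, detected_pixel_coordinates)."""
    h, w = labels.shape
    # Assumption: -1 is an unused ID used only for border padding.
    padded = np.pad(labels, 1, mode="constant", constant_values=-1)
    result = labels.copy()
    detected = []
    # S106: a pixel is dot dust if none of its 8 neighbors shares its ID.
    for y in range(h):
        for x in range(w):
            window = padded[y : y + 3, x : x + 3]
            if np.count_nonzero(window == labels[y, x]) == 1:  # only itself
                detected.append((y, x))
    # S108: if nothing was detected, the loop below does nothing.
    # S110: compute the complementary ID; S112: replace the estimated ID.
    for y, x in detected:
        window = padded[y : y + 3, x : x + 3].ravel()
        ids = [int(v) for v in window if v != -1]  # ignore padding
        result[y, x] = Counter(ids).most_common(1)[0][0]
    return result, detected
```

Detection is completed on the original estimate before any replacement is made, so a replaced pixel never influences the detection of another pixel in this sketch.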

図５は、検出部１３０によるマルチスケール処理について説明するための説明図である。検出部１３０は、対象画像３００のサイズを変更し、複数の異なるサイズの対象画像３００のそれぞれに対して、誤推定部分を検出する検出処理を実行する。図５では、等倍の対象画像３００と、１／４のサイズの対象画像３００と、１／８のサイズの対象画像３００のそれぞれに対して、検出処理を実行する場合を例示している。 FIG. 5 is an explanatory diagram for explaining multiscale processing by the detection unit 130. The detection unit 130 changes the size of the target image 300 and performs a detection process to detect an incorrectly estimated portion for each of the target images 300 of a plurality of different sizes. FIG. 5 illustrates a case where the detection process is performed on each of a target image 300 of equal size, a target image 300 of 1/4 size, and a target image 300 of 1/8 size.

Performing the detection process on the target image 300 at its original size detects dot dust that is one pixel in size at that scale. However, dot dust of various sizes may exist in the target image 300. The detection unit 130 therefore also reduces the target image 300 and performs the detection process on the reduced image. For example, as shown in FIG. 5, when the detection process is performed on the target image 300 reduced to 1/4 size, dot dust of 2×2 pixels, which cannot be detected as dot dust in the original-size target image 300, becomes one pixel in size and can therefore be detected as dot dust.

The detection unit 130 performs the detection process on the 1/4-size target image 300 and detects an incorrectly estimated portion. The detection unit 130 then restores the target image 300 to its original size and uses the result as the detection result 314. As a result, an incorrectly estimated portion that is one pixel in size in the 1/4-size target image 300 becomes an incorrectly estimated portion of 2×2 pixels.

Similarly, the detection unit 130 performs the detection process on the 1/8-size target image 300 and detects an incorrectly estimated portion. The detection unit 130 then restores the target image 300, including the incorrectly estimated portion, to its original size and uses the result as the detection result 316.

The detection unit 130 may integrate the detection results obtained by performing the detection process on each of the target images 300 of a plurality of different sizes, and may treat a portion detected as an incorrectly estimated portion in at least one of those target images 300 as an incorrectly estimated portion of the target image 300. In the example shown in FIG. 5, the detection unit 130 integrates the detection result 312, the detection result 314, and the detection result 316, and treats any portion detected as an incorrectly estimated portion in at least one of them as an incorrectly estimated portion.
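The multiscale detection and integration can be sketched as below. Several points are assumptions made only for this sketch: the reduction is done by nearest-neighbor subsampling of the object ID map, the scales are expressed as hypothetical per-axis integer factors (a factor of 2 turns a 2×2 block into one pixel), the detector is a 3×3 zero-match filter, and integration is a logical OR of the masks restored to the original size. The function names are hypothetical.

```python
import numpy as np

def detect_isolated(labels):
    """Flag pixels whose ID differs from all 8 neighbors (3x3 filter)."""
    h, w = labels.shape
    padded = np.pad(labels, 1, mode="constant", constant_values=-1)
    matches = np.zeros((h, w), dtype=int)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            matches += padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w] == labels
    return matches == 0

def detect_multiscale(labels, factors=(1, 2, 4)):
    """Union of detections made at several per-axis reduction factors."""
    h, w = labels.shape
    combined = np.zeros((h, w), dtype=bool)
    for f in factors:
        small = labels[::f, ::f]          # nearest-neighbor reduction
        mask = detect_isolated(small)     # detection at the reduced size
        # Restore the mask to the original size: 1 pixel becomes f x f pixels.
        full = np.repeat(np.repeat(mask, f, axis=0), f, axis=1)[:h, :w]
        combined |= full                  # detected at any scale
    return combined
```

Plain subsampling only restores the mask onto the correct pixels when a block is aligned with the sampling grid; a practical implementation would likely need a reduction method that is robust to block position, which this sketch does not attempt.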

This makes it possible to detect dot dust of various sizes. For incorrectly estimated portions of various sizes, the determining unit 140 may use the pixels surrounding the pixels of an incorrectly estimated portion to identify the object to be associated with those pixels. For example, for an incorrectly estimated portion of 2×2 pixels, the determining unit 140 counts, for each object, the number of pixels corresponding to that object among the 12 surrounding pixels, or among those 12 pixels and the 4 pixels of the incorrectly estimated portion, and determines the object to be associated with the pixels of the incorrectly estimated portion based on the counts. The determining unit 140 may determine the most frequent object as the object to be associated with the pixels of the incorrectly estimated portion. The replacement processing unit 150 may replace the object corresponding to the pixels of the incorrectly estimated portion with the object determined by the determining unit 140.
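For a block-shaped incorrectly estimated portion, the count over the surrounding pixels can be sketched as follows. The sketch uses only the surrounding ring (12 pixels for a 2×2 block) and not the variant that also includes the 4 pixels of the block; the function name `fill_block` and the clipping of the ring at the image border are assumptions.

```python
from collections import Counter
import numpy as np

def fill_block(labels, top, left, size=2):
    """Replace a size x size block with the most frequent ID on the
    one-pixel-wide ring that surrounds it."""
    h, w = labels.shape
    ring = []
    for y in range(top - 1, top + size + 1):
        for x in range(left - 1, left + size + 1):
            inside_block = top <= y < top + size and left <= x < left + size
            # Assumption: ring positions outside the image are skipped.
            if not inside_block and 0 <= y < h and 0 <= x < w:
                ring.append(int(labels[y, x]))
    fill = Counter(ring).most_common(1)[0][0]
    out = labels.copy()
    out[top : top + size, left : left + size] = fill
    return out
```

For `size=2` away from the border, the ring contains 4×4 − 2×2 = 12 pixels, which is the count given in the text.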

Although FIG. 5 shows 1/4 and 1/8 as example sizes, the sizes are not limited to these, and other reduced sizes may be used. Enlarged sizes may also be used. The sizes to be used when performing multiscale processing on the target image 300 may, for example, be set in advance. For example, a user of the image processing device 100 sets the plurality of sizes to be used in multiscale processing according to the size of the target image 300 and the like. For example, the user may set original size, 1/4, and 1/8 when the target image 300 has a first size, and may set original size, 1/4, 1/8, and 1/16 when the target image 300 has a second size larger than the first size. The plurality of sizes used for multiscale processing may also be set automatically by the detection unit 130. In that case, the detection unit 130 sets the plurality of sizes to be used in multiscale processing according to the size of the target image 300.

FIG. 6 schematically shows an example of the functional configuration of the estimation unit 120. The example shown here is the functional configuration of the estimation unit 120 in a case where the estimation unit 120 executes FSS to estimate, for each of a plurality of pixels of the target image, the object corresponding to the pixel. When the estimation unit 120 executes FSS, the image acquisition unit 110 acquires support data for each of a plurality of objects and stores it in the storage unit 102 in advance. The estimation unit 120 includes a support data acquisition unit 121 and a matching inference processing unit 122.

サポートデータ取得部１２１は、記憶部１０２からサポートデータを取得する。サポートデータ取得部１２１は、それぞれが複数の物体のそれぞれに対応する複数のサポートデータを取得してよい。１つのサポートデータには、１以上のサポート画像及びアノテーションデータの組み合わせが含まれる。 The support data acquisition unit 121 acquires support data from the storage unit 102. The support data acquisition unit 121 may acquire a plurality of pieces of support data, each of which corresponds to a plurality of objects. One piece of support data includes a combination of one or more support images and annotation data.

照合推論処理部１２２は、サポートデータ取得部１２１が取得した複数のサポートデータを用いて、対象画像に含まれる各種物体の位置を推定する。照合推論処理部１２２は、特徴抽出処理部１２３、プロトタイプ算出処理部１２４、及びプロトタイプ照合処理部１２５を有する。 The matching and inference processing unit 122 uses the plurality of pieces of support data acquired by the support data acquisition unit 121 to estimate the positions of various objects included in the target image. The matching and inference processing unit 122 includes a feature extraction processing unit 123, a prototype calculation processing unit 124, and a prototype matching processing unit 125.

特徴抽出処理部１２３は、複数のサポートデータのそれぞれについて、サポートデータに含まれる１以上のサポート画像のそれぞれの特徴量を抽出する。特徴抽出処理部１２３は、サポート画像の複数の領域毎の特徴量を示すサポートフィーチャを生成してよい。 The feature extraction processing unit 123 extracts, for each of the plurality of support data, the feature amount of each of one or more support images included in the support data. The feature extraction processing unit 123 may generate support features indicating feature amounts for each of a plurality of regions of the support image.

特徴抽出処理部１２３は、画像取得部１１０が取得した対象画像の特徴量を抽出する。特徴抽出処理部１２３は、対象画像の複数の領域毎の特徴量を示す対象画像フィーチャを生成してよい。 The feature extraction processing unit 123 extracts the feature amount of the target image acquired by the image acquisition unit 110. The feature extraction processing unit 123 may generate target image features indicating feature amounts for each of a plurality of regions of the target image.

プロトタイプ算出処理部１２４は、サポートデータに含まれるサポート画像及びアノテーションデータに基づいて、サポート画像の物体の特徴を表すプロトタイプを算出する。プロトタイプ算出処理部１２４は、特徴抽出処理部１２３によって生成されたサポートフィーチャと、アノテーションデータに基づいて、プロトタイプを算出してよい。 The prototype calculation processing unit 124 calculates a prototype representing the features of the object in the support image based on the support image and annotation data included in the support data. The prototype calculation processing unit 124 may calculate a prototype based on the support features generated by the feature extraction processing unit 123 and the annotation data.

例えば、プロトタイプ算出処理部１２４は、アノテーションデータを用いて、サポートフィーチャから、物体の位置に対応する複数の領域の特徴量を抽出する。プロトタイプ算出処理部１２４は、複数のサポートフィーチャのそれぞれから、物体の位置に対応する複数の領域の特徴量を抽出してよい。プロトタイプ算出処理部１２４は、抽出した複数の特徴量を特徴空間に配置して、複数の特徴量の重心を、物体のプロトタイプとして算出してよい。このように、プロトタイプは、特徴空間における特徴ベクトルであってよい。 For example, the prototype calculation processing unit 124 uses the annotation data to extract feature amounts of a plurality of regions corresponding to the position of the object from the support feature. The prototype calculation processing unit 124 may extract feature amounts of a plurality of regions corresponding to the position of the object from each of the plurality of support features. The prototype calculation processing unit 124 may arrange the extracted feature quantities in a feature space and calculate the center of gravity of the plurality of feature quantities as a prototype of the object. Thus, a prototype may be a feature vector in feature space.
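The centroid calculation can be sketched as follows. The array layout is an assumption made for illustration: each support feature is taken to be an `(H, W, C)` array with one `C`-dimensional feature vector per region, and the annotation data is taken to be a boolean `(H, W)` mask at the same resolution. The function name `compute_prototype` is hypothetical.

```python
import numpy as np

def compute_prototype(support_features, masks):
    """Centroid of the feature vectors at object regions, pooled over
    all support images of one object.

    support_features: list of arrays, each of shape (H, W, C)
    masks:            list of boolean arrays, each of shape (H, W),
                      True where the annotation marks the object
    """
    # Boolean indexing keeps only the object regions: each item is (N_i, C).
    selected = [feat[mask] for feat, mask in zip(support_features, masks)]
    # Place all selected vectors in one feature space and take their centroid.
    return np.concatenate(selected, axis=0).mean(axis=0)  # shape (C,)
```

In a 4-shot setting, both lists would contain four entries, one per support image, and the resulting prototype is a single feature vector for the object.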

プロトタイプ照合処理部１２５は、対象画像の複数の領域毎に、プロトタイプ算出処理部１２４によって算出されたプロトタイプとの類似度に基づいて、対応する物体を判定する。プロトタイプ照合処理部１２５は、例えば、対象画像フィーチャを用いて、対象画像の複数の領域のそれぞれについて、領域の特徴量と物体のプロトタイプとの類似度を算出する。プロトタイプ照合処理部１２５は、例えば、領域の特徴量と物体のプロトタイプとのコサイン類似度を算出する。そして、プロトタイプ照合処理部１２５は、対象画像の複数の領域のそれぞれについて、類似度が最も高いプロトタイプに対応する物体を、領域に対応する物体として決定する。 The prototype matching processing unit 125 determines the corresponding object for each of the plurality of regions of the target image based on the degree of similarity with the prototype calculated by the prototype calculation processing unit 124. The prototype matching processing unit 125 uses, for example, the target image features to calculate the degree of similarity between the feature amount of the region and the prototype of the object for each of the plurality of regions of the target image. The prototype matching processing unit 125 calculates, for example, the cosine similarity between the feature amount of the region and the prototype of the object. Then, for each of the plurality of regions of the target image, the prototype matching processing unit 125 determines the object corresponding to the prototype with the highest degree of similarity as the object corresponding to the region.
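The matching by cosine similarity can be sketched as follows. The shapes are assumptions for illustration: the target image features are an `(H, W, C)` array, the prototypes are stacked into a `(K, C)` array whose row index serves as a stand-in for the object ID, and a small `eps` is added to avoid division by zero. The function name `match_prototypes` is hypothetical.

```python
import numpy as np

def match_prototypes(target_features, prototypes, eps=1e-8):
    """Assign each region the index of the prototype with the highest
    cosine similarity.

    target_features: array of shape (H, W, C)
    prototypes:      array of shape (K, C), one row per object
    returns:         integer array of shape (H, W)
    """
    # Normalize both sides so the dot product equals the cosine similarity.
    f = target_features / (np.linalg.norm(target_features, axis=-1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    similarity = f @ p.T  # shape (H, W, K)
    return similarity.argmax(axis=-1)
```

Because each region is decided independently of its neighbors, isolated regions can end up with an ID that differs from everything around them, which is the kind of output the dot dust detection is applied to.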

FIGS. 7, 8, and 9 are explanatory diagrams that schematically explain FSS by the estimation unit 120. Here, a 4-shot case is illustrated, that is, a case in which the support data 40 includes four pairs of a support image 42 and annotation data 44.

図７では、複数のサポートデータ４０のうちの１つのサポートデータ４０を例示している。当該サポートデータは、飛行機に対応する。４つのサポート画像４２のそれぞれは、飛行機を含む。４つのアノテーションデータ４４のそれぞれは、対応するサポート画像４２における飛行機の位置を示す。 In FIG. 7, one support data 40 among the plurality of support data 40 is illustrated. The support data corresponds to an airplane. Each of the four support images 42 includes an airplane. Each of the four annotation data 44 indicates the position of the airplane in the corresponding support image 42.

特徴抽出処理部１２３は、４つのサポート画像４２のそれぞれについて、サポート画像４２の複数の領域４０１毎の特徴量を示すサポートフィーチャ４００を生成する。領域４０１のサイズは、任意のサイズであってよく、設定によって変更可能であってよい。 The feature extraction processing unit 123 generates, for each of the four support images 42, a support feature 400 indicating the feature amount for each of the plurality of regions 401 of the support image 42. The size of the area 401 may be any size and may be changeable according to settings.

プロトタイプ算出処理部１２４は、アノテーションデータ４４及びサポートフィーチャ４００を用いて、飛行機のプロトタイプを生成する。本例において、プロトタイプ算出処理部１２４は、１つ目のサポートフィーチャ４００のうちの、複数の飛行機領域４０２の特徴量を抽出する。同様に、プロトタイプ算出処理部１２４は、他の３つのサポートフィーチャ４００のうちの、複数の飛行機領域４０２の特徴量を抽出する。プロトタイプ算出処理部１２４は、抽出した複数の特徴量を特徴空間４１２に配置して、複数の特徴量の重心を、飛行機プロトタイプ４２２として算出する。 The prototype calculation processing unit 124 uses the annotation data 44 and the support features 400 to generate a prototype of the airplane. In this example, the prototype calculation processing unit 124 extracts feature amounts of a plurality of airplane regions 402 of the first support feature 400. Similarly, the prototype calculation processing unit 124 extracts feature amounts of a plurality of airplane regions 402 from among the other three support features 400. The prototype calculation processing unit 124 arranges the plurality of extracted feature quantities in the feature space 412 and calculates the center of gravity of the plurality of feature quantities as the airplane prototype 422.

特徴抽出処理部１２３は、図９に例示するように、対象画像３００の複数の領域３０１毎の特徴量を示す対象画像フィーチャ３０２を生成する。プロトタイプ照合処理部１２５は、複数の領域３０１のそれぞれについて、複数のプロトタイプとの類似度を算出して、最も類似度が高いプロトタイプを決定し、プロトタイプに対応する物体ＩＤを対応付ける。これにより、対象画像３００の全画素のそれぞれに物体ＩＤが対応付けられる。 The feature extraction processing unit 123 generates a target image feature 302 that indicates the feature amount for each of a plurality of regions 301 of the target image 300, as illustrated in FIG. The prototype matching processing unit 125 calculates the degree of similarity with a plurality of prototypes for each of the plurality of regions 301, determines the prototype with the highest degree of similarity, and associates the object ID corresponding to the prototype. As a result, each of all pixels of the target image 300 is associated with an object ID.
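One way the per-region object IDs could be carried over to every pixel is sketched below. The specification does not state how regions map to pixels, so this sketch assumes square, non-overlapping regions of a hypothetical side length `region_size`; the function name `regions_to_pixels` is also hypothetical.

```python
import numpy as np

def regions_to_pixels(region_ids, region_size, image_shape):
    """Expand an (H_r, W_r) region ID map into an (H, W) per-pixel ID map."""
    h, w = image_shape
    # Each region ID is repeated region_size times along both axes.
    pixels = np.repeat(np.repeat(region_ids, region_size, axis=0), region_size, axis=1)
    # Crop in case the image size is not a multiple of region_size.
    return pixels[:h, :w]
```

Under this assumption, every pixel inherits the object ID of the region that contains it.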

FIG. 10 schematically shows an example of the hardware configuration of a computer 1200 that functions as the image processing device 100. A program installed on the computer 1200 can cause the computer 1200 to function as one or more "units" of the device according to the present embodiment, can cause the computer 1200 to execute operations associated with the device according to the present embodiment or with the one or more "units", and/or can cause the computer 1200 to execute a process according to the present embodiment or a stage of that process. Such a program may be executed by the CPU 1212 to cause the computer 1200 to perform specific operations associated with some or all of the blocks in the flowcharts and block diagrams described herein.

本実施形態によるコンピュータ１２００は、ＣＰＵ１２１２、ＲＡＭ１２１４、及びグラフィックコントローラ１２１６を含み、それらはホストコントローラ１２１０によって相互に接続されている。コンピュータ１２００はまた、通信インタフェース１２２２、記憶装置１２２４、ＤＶＤドライブ１２２６、及びＩＣカードドライブのような入出力ユニットを含み、それらは入出力コントローラ１２２０を介してホストコントローラ１２１０に接続されている。ＤＶＤドライブ１２２６は、ＤＶＤ－ＲＯＭドライブ及びＤＶＤ－ＲＡＭドライブ等であってよい。記憶装置１２２４は、ハードディスクドライブ及びソリッドステートドライブ等であってよい。コンピュータ１２００はまた、ＲＯＭ１２３０及びキーボードのようなレガシの入出力ユニットを含み、それらは入出力チップ１２４０を介して入出力コントローラ１２２０に接続されている。 The computer 1200 according to this embodiment includes a CPU 1212, a RAM 1214, and a graphics controller 1216, which are interconnected by a host controller 1210. The computer 1200 also includes input/output units such as a communication interface 1222, a storage device 1224, a DVD drive 1226, and an IC card drive, which are connected to the host controller 1210 via an input/output controller 1220. DVD drive 1226 may be a DVD-ROM drive, a DVD-RAM drive, or the like. Storage device 1224 may be a hard disk drive, solid state drive, or the like. Computer 1200 also includes legacy input/output units, such as ROM 1230 and a keyboard, which are connected to input/output controller 1220 via input/output chips 1240.

The CPU 1212 operates according to programs stored in the ROM 1230 and the RAM 1214, thereby controlling each unit. The graphics controller 1216 acquires image data generated by the CPU 1212 in a frame buffer or the like provided in the RAM 1214 or in the graphics controller 1216 itself, and causes the image data to be displayed on the display device 1218.

通信インタフェース１２２２は、ネットワークを介して他の電子デバイスと通信する。記憶装置１２２４は、コンピュータ１２００内のＣＰＵ１２１２によって使用されるプログラム及びデータを格納する。ＤＶＤドライブ１２２６は、プログラム又はデータをＤＶＤ－ＲＯＭ１２２７等から読み取り、記憶装置１２２４に提供する。ＩＣカードドライブは、プログラム及びデータをＩＣカードから読み取り、及び／又はプログラム及びデータをＩＣカードに書き込む。 Communication interface 1222 communicates with other electronic devices via a network. Storage device 1224 stores programs and data used by CPU 1212 within computer 1200. The DVD drive 1226 reads a program or data from a DVD-ROM 1227 or the like and provides it to the storage device 1224. The IC card drive reads programs and data from and/or writes programs and data to the IC card.

The ROM 1230 stores a boot program or the like that is executed by the computer 1200 at activation, and/or a program that depends on the hardware of the computer 1200. The input/output chip 1240 may also connect various input/output units to the input/output controller 1220 via a USB port, a parallel port, a serial port, a keyboard port, a mouse port, or the like.

プログラムは、ＤＶＤ－ＲＯＭ１２２７又はＩＣカードのようなコンピュータ可読記憶媒体によって提供される。プログラムは、コンピュータ可読記憶媒体から読み取られ、コンピュータ可読記憶媒体の例でもある記憶装置１２２４、ＲＡＭ１２１４、又はＲＯＭ１２３０にインストールされ、ＣＰＵ１２１２によって実行される。これらのプログラム内に記述される情報処理は、コンピュータ１２００に読み取られ、プログラムと、上記様々なタイプのハードウェアリソースとの間の連携をもたらす。装置又は方法が、コンピュータ１２００の使用に従い情報のオペレーション又は処理を実現することによって構成されてよい。 The program is provided by a computer readable storage medium such as a DVD-ROM 1227 or an IC card. The program is read from a computer-readable storage medium, installed in storage device 1224, RAM 1214, or ROM 1230, which are also examples of computer-readable storage media, and executed by CPU 1212. The information processing described in these programs is read by the computer 1200 and provides coordination between the programs and the various types of hardware resources described above. An apparatus or method may be configured to implement the operation or processing of information according to the use of computer 1200.

For example, when communication is performed between the computer 1200 and an external device, the CPU 1212 may execute a communication program loaded into the RAM 1214 and instruct the communication interface 1222 to perform communication processing based on the processing described in the communication program. Under the control of the CPU 1212, the communication interface 1222 reads transmission data stored in a transmission buffer area provided in a recording medium such as the RAM 1214, the storage device 1224, the DVD-ROM 1227, or an IC card, and transmits the read transmission data to the network, or writes reception data received from the network to a reception buffer area or the like provided on the recording medium.

The CPU 1212 may also cause all or a necessary portion of a file or database stored in an external recording medium, such as the storage device 1224, the DVD drive 1226 (DVD-ROM 1227), or an IC card, to be read into the RAM 1214, and may perform various types of processing on the data in the RAM 1214. The CPU 1212 may then write the processed data back to the external recording medium.

Various types of information, such as various types of programs, data, tables, and databases, may be stored in the recording medium and subjected to information processing. On data read from the RAM 1214, the CPU 1212 may perform various types of processing described throughout the present disclosure and specified by the instruction sequence of a program, including various types of operations, information processing, condition determination, conditional branching, unconditional branching, and information search/replacement, and writes the results back to the RAM 1214. The CPU 1212 may also search for information in a file, a database, or the like in the recording medium. For example, when a plurality of entries, each having an attribute value of a first attribute associated with an attribute value of a second attribute, are stored in the recording medium, the CPU 1212 may search the plurality of entries for an entry whose attribute value of the first attribute matches a specified condition, read the attribute value of the second attribute stored in that entry, and thereby obtain the attribute value of the second attribute associated with the first attribute that satisfies the predetermined condition.

The programs or software modules described above may be stored in a computer-readable storage medium on or near the computer 1200. A recording medium such as a hard disk or RAM provided in a server system connected to a dedicated communication network or the Internet can also be used as the computer-readable storage medium, whereby the program is provided to the computer 1200 via the network.

Blocks in the flowcharts and block diagrams of the present embodiment may represent stages of a process in which operations are performed, or "units" of a device responsible for performing the operations. Specific stages and "units" may be implemented by dedicated circuitry, by programmable circuitry supplied with computer-readable instructions stored on a computer-readable storage medium, and/or by a processor supplied with computer-readable instructions stored on a computer-readable storage medium. The dedicated circuitry may include digital and/or analog hardware circuits, and may include integrated circuits (ICs) and/or discrete circuits. The programmable circuitry may include reconfigurable hardware circuits, such as field-programmable gate arrays (FPGAs) and programmable logic arrays (PLAs), that include logical AND, logical OR, exclusive OR, NAND, NOR, and other logical operations, flip-flops, registers, and memory elements.

A computer-readable storage medium may include any tangible device capable of storing instructions to be executed by a suitable device. As a result, a computer-readable storage medium having instructions stored therein constitutes a product that includes instructions which can be executed to create means for performing the operations specified in the flowcharts or block diagrams. Examples of computer-readable storage media may include electronic storage media, magnetic storage media, optical storage media, electromagnetic storage media, and semiconductor storage media. More specific examples of computer-readable storage media may include a floppy (registered trademark) disk, a diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), electrically erasable programmable read-only memory (EEPROM), static random access memory (SRAM), compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a Blu-ray (registered trademark) disc, a memory stick, and an integrated circuit card.

Computer-readable instructions may include assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk (registered trademark), JAVA (registered trademark), and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages.

Computer-readable instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, or to a programmable circuit, either locally or via a local area network (LAN) or a wide area network (WAN) such as the Internet, so that the processor or programmable circuit executes the instructions to create means for performing the operations specified in the flowcharts or block diagrams. Examples of processors include computer processors, processing units, microprocessors, digital signal processors, controllers, and microcontrollers.

The present invention has been described above using embodiments, but the technical scope of the present invention is not limited to the scope described in those embodiments. It is apparent to those skilled in the art that various changes or improvements can be made to the above embodiments. It is clear from the claims that embodiments incorporating such changes or improvements can also be included in the technical scope of the present invention.

It should be noted that the operations, procedures, steps, stages, and other processes of the devices, systems, programs, and methods shown in the claims, specification, and drawings can be executed in any order, unless the order is explicitly indicated by expressions such as "before" or "prior to", or unless the output of an earlier process is used in a later process. Even where the operation flows in the claims, specification, and drawings are described using "first", "next", and the like for convenience, this does not mean that the operations must be performed in that order.

40 support data, 42 support image, 44 annotation data, 100 image processing device, 102 storage unit, 110 image acquisition unit, 120 estimation unit, 121 support data acquisition unit, 122 matching inference processing unit, 123 feature extraction processing unit, 124 prototype calculation processing unit, 125 prototype matching processing unit, 130 detection unit, 140 determination unit, 150 replacement processing unit, 160 output control unit, 200 filter, 210, 220, 230 pixel, 300 target image, 301 region, 302 target image feature, 312, 314, 316 detection result, 400 support feature, 401 region, 402 airplane region, 412 feature space, 422 airplane prototype, 1200 computer, 1210 host controller, 1212 CPU, 1214 RAM, 1216 graphics controller, 1218 display device, 1220 input/output controller, 1222 communication interface, 1224 storage device, 1226 DVD drive, 1227 DVD-ROM, 1230 ROM, 1240 input/output chip

Claims

An image processing device comprising:
an image acquisition unit that acquires a target image;
an estimation unit that estimates, for each of a plurality of pixels of the target image, an object corresponding to the pixel; and
a detection unit that detects an erroneously estimated portion, which has been erroneously estimated in response to fine pattern changes, by applying, to the estimation result of the estimation unit, filter processing that responds to the erroneously estimated portion.

The image processing device according to claim 1, wherein the detection unit determines, using a filter with a size of 5×5 pixels or more, a reference pixel and a plurality of comparison target pixels to be compared with the reference pixel in the target image on which the estimation unit has performed estimation, and detects the erroneously estimated portion based on a result of the comparison.

The image processing device according to claim 2, wherein the detection unit sets a pixel located at the center of the filter as the reference pixel and a plurality of pixels located on the outer periphery of the filter as the plurality of comparison target pixels, and detects the reference pixel as the erroneously estimated portion when the number of matches between the object estimated by the estimation unit for the reference pixel and the objects estimated for the respective comparison target pixels is 0 or is equal to or less than a predetermined threshold.
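The center-versus-periphery comparison in the claim above can be illustrated with a minimal sketch. It is not the patented implementation: the function name `detect_misestimated`, the representation of the estimation result as a 2D list of labels, and the choice to skip pixels where the filter does not fit inside the image are all assumptions made for illustration.

```python
def detect_misestimated(labels, size=5, threshold=0):
    """Flag pixels whose label matches at most `threshold` labels on the filter periphery.

    labels: 2D list holding one estimated object label per pixel (assumed format).
    size: odd filter size, 5 or more, as in the claims.
    threshold: a pixel is flagged when the match count is <= threshold
               (threshold=0 means "no matches at all").
    """
    if size < 5 or size % 2 == 0:
        raise ValueError("size must be an odd integer >= 5")
    h, w = len(labels), len(labels[0])
    r = size // 2
    # Offsets of the pixels on the outer ring of the size x size window.
    ring = [(dy, dx)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if max(abs(dy), abs(dx)) == r]
    flagged = [[False] * w for _ in range(h)]
    # Assumption: pixels closer than r to the image border are left unflagged.
    for y in range(r, h - r):
        for x in range(r, w - r):
            center = labels[y][x]
            matches = sum(labels[y + dy][x + dx] == center for dy, dx in ring)
            flagged[y][x] = matches <= threshold
    return flagged


# Usage: a single stray label in a uniform 7x7 map.
grid = [[0] * 7 for _ in range(7)]
grid[3][3] = 1
flags = detect_misestimated(grid)
```

Comparing only against the periphery, rather than against all neighbours, means that a small blob narrower than the filter can still be flagged even when its pixels agree with each other.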

The image processing device according to claim 1, further comprising a determination unit that identifies, for each object, the number of pixels corresponding to that same object among a plurality of pixels surrounding a pixel of the erroneously estimated portion in the target image, and determines, based on the identification result, the object to be associated with the pixel of the erroneously estimated portion.

The image processing device according to claim 4, further comprising a replacement processing unit that replaces the object estimated by the estimation unit for the pixel of the erroneously estimated portion in the target image with the object determined by the determination unit.
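The determination and replacement described in the two claims above can be sketched as a per-object count over surrounding pixels followed by a label swap. The claims only say the object is determined "based on the identification result"; picking the most frequent label, ignoring neighbours that are themselves flagged, and the names `replace_misestimated` and `radius` are assumptions of this sketch.

```python
from collections import Counter


def replace_misestimated(labels, flagged, radius=2):
    """Return a copy of `labels` where each flagged pixel takes the most
    frequent label among its surrounding pixels.

    labels: 2D list of estimated object labels (assumed format).
    flagged: 2D list of booleans marking erroneously estimated pixels.
    radius: half-width of the square neighbourhood (hypothetical parameter).
    """
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for y in range(h):
        for x in range(w):
            if not flagged[y][x]:
                continue
            # Count, per object, the surrounding pixels carrying that object.
            # Assumption: flagged neighbours are excluded from the count.
            counts = Counter(
                labels[ny][nx]
                for ny in range(max(0, y - radius), min(h, y + radius + 1))
                for nx in range(max(0, x - radius), min(w, x + radius + 1))
                if (ny, nx) != (y, x) and not flagged[ny][nx]
            )
            if counts:
                out[y][x] = counts.most_common(1)[0][0]
    return out


# Usage: one stray "bird" pixel inside a "sky" region.
labels = [["sky"] * 5 for _ in range(5)]
labels[2][2] = "bird"
flagged = [[False] * 5 for _ in range(5)]
flagged[2][2] = True
fixed = replace_misestimated(labels, flagged)
```

The input map is left unmodified, so the counts for one flagged pixel are not affected by replacements already made for another.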

The image processing device according to claim 1, wherein the detection unit changes the size of the target image, executes the process of detecting the erroneously estimated portion on each of a plurality of differently sized versions of the target image, and treats a portion detected as the erroneously estimated portion in at least one of the differently sized versions as the erroneously estimated portion of the target image.
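The multi-size detection in the claim above amounts to running the same detector at several scales and taking the union of the results. The sketch below assumes the resized data is the per-pixel label map, uses nearest-neighbour resizing, and bundles a simple center-versus-periphery detector so that it is self-contained. The names `resize_nearest`, `ring_detect`, and `detect_multiscale`, and the default scales, are hypothetical.

```python
def resize_nearest(grid, new_h, new_w):
    """Nearest-neighbour resize of a 2D list (assumed resizing method)."""
    h, w = len(grid), len(grid[0])
    return [[grid[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


def ring_detect(labels, size=5, threshold=0):
    """Flag pixels whose label matches <= threshold labels on the filter periphery."""
    h, w = len(labels), len(labels[0])
    r = size // 2
    ring = [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            if max(abs(dy), abs(dx)) == r]
    flagged = [[False] * w for _ in range(h)]
    for y in range(r, h - r):
        for x in range(r, w - r):
            matches = sum(labels[y + dy][x + dx] == labels[y][x] for dy, dx in ring)
            flagged[y][x] = matches <= threshold
    return flagged


def detect_multiscale(labels, scales=(1.0, 0.5), size=5, threshold=0):
    """Union of detections over several sizes of the same label map."""
    h, w = len(labels), len(labels[0])
    merged = [[False] * w for _ in range(h)]
    for s in scales:
        sh, sw = max(1, round(h * s)), max(1, round(w * s))
        flagged = ring_detect(resize_nearest(labels, sh, sw), size, threshold)
        # Map the detections back to the original size and OR them in.
        back = resize_nearest(flagged, h, w)
        for y in range(h):
            for x in range(w):
                merged[y][x] = merged[y][x] or back[y][x]
    return merged


# Usage: a 4x4 blob is too large for a 5x5 filter at full size,
# but shrinks to 2x2 at half size and is then detected.
grid = [[0] * 12 for _ in range(12)]
for y in range(4, 8):
    for x in range(4, 8):
        grid[y][x] = 1
flags = detect_multiscale(grid)
```

Shrinking the map lets a fixed-size filter respond to larger erroneous regions without enlarging the filter itself.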

The image processing device according to claim 1, further comprising a support data acquisition unit that acquires, for each of a plurality of objects, support data including a support image in which the object is captured and annotation data indicating the position of the object in the support image,
wherein the estimation unit estimates, for each of the plurality of pixels of the target image, the object corresponding to the pixel by executing matching inference processing using the plurality of pieces of support data acquired by the support data acquisition unit.
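As a rough illustration of matching inference with support data, the sketch below builds one prototype per object by averaging the support features inside the annotated region and then assigns each target pixel to the most similar prototype. The claims do not specify this procedure: masked averaging, cosine similarity, precomputed per-pixel feature vectors, and the names `masked_mean`, `cosine`, and `match_pixels` are all assumptions.

```python
import math


def masked_mean(features, mask):
    """Average the feature vectors at positions where the annotation mask is set."""
    picked = [f for frow, mrow in zip(features, mask)
              for f, m in zip(frow, mrow) if m]
    if not picked:
        raise ValueError("annotation mask selects no pixels")
    dim = len(picked[0])
    return [sum(f[i] for f in picked) / len(picked) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def match_pixels(target_features, prototypes):
    """Assign each target pixel the object whose prototype is most similar."""
    return [[max(prototypes, key=lambda name: cosine(f, prototypes[name]))
             for f in row]
            for row in target_features]


# Usage with toy 2-dimensional features (hypothetical values).
support_features = [[[1.0, 0.0], [1.0, 0.0]],
                    [[0.0, 1.0], [0.0, 1.0]]]
prototypes = {
    "airplane": masked_mean(support_features, [[1, 1], [0, 0]]),
    "sky": masked_mean(support_features, [[0, 0], [1, 1]]),
}
result = match_pixels([[[0.9, 0.1], [0.2, 0.8]]], prototypes)
```

In practice the feature vectors would come from a feature extractor applied to the support and target images, which this sketch leaves out.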

A program for causing a computer to function as the image processing device according to any one of claims 1 to 7.

An image processing method executed by a computer, the method comprising:
an image acquisition step of acquiring a target image;
an estimation step of estimating, for each of a plurality of pixels of the target image, an object corresponding to the pixel; and
a detection step of detecting an erroneously estimated portion, which has been erroneously estimated in response to fine pattern changes, by applying, to the estimation result of the estimation step, filter processing that responds to the erroneously estimated portion.