
JP7350515B2 - Information processing device, information processing method and program

Info

Publication number: JP7350515B2
Application number: JP2019095889A
Authority: JP
Inventors: 友則矢澤
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-05-22
Filing date: 2019-05-22
Publication date: 2023-09-26
Anticipated expiration: 2039-05-22
Also published as: JP2020190926A

Description

The present invention relates to an information processing device, an information processing method, and a program.

Machine learning is used to calculate the region of a target object in an image. Non-Patent Document 1 discloses a method of calculating the region of a target object in an image by filter processing with a neural network. Non-Patent Document 2 discloses backpropagation filter processing.

Non-Patent Document 1: S. Varadarajan and M. M. Srivastava. Weakly Supervised Object Localization on grocery shelves using simple FCN and Synthetic Dataset. arXiv preprint, arXiv:1803.06813, 2018.
Non-Patent Document 2: D. E. Rumelhart, G. E. Hinton and R. J. Williams. Learning representations by back-propagating errors. Nature 323, 533-536, 1986.

In Non-Patent Document 1, filter processing that enlarges the input data causes the output data to vary periodically. Unless learning is performed with images large enough to cover every phase of that period, the accuracy of the region calculation decreases.

An object of the present invention is to enable learning with small two-dimensional images even when enlarging filter processing is performed.

According to one aspect of the present invention, an information processing device includes: filter processing means for performing filter processing on a two-dimensional image; coordinate information storage means for storing, when the filter processing means performs filter processing that reduces the output data size relative to the input data size, coordinate information of the input data of the reducing filter processing; and coordinate information restoring means for restoring, when the filter processing means performs filter processing that enlarges the output data size relative to the input data size, coordinate information of the output data of the filter processing by using the coordinate information stored by the coordinate information storage means. When the filter processing means performs the reducing filter processing, the coordinate information storage means outputs to the filter processing means a plurality of shift data obtained by shifting the input data of the filter processing in each coordinate axis direction of the two-dimensional image, and the filter processing means performs the reducing filter processing on the plurality of shift data and outputs a plurality of output data.

According to the present invention, learning can be performed with small two-dimensional images even when enlarging filter processing is performed.

FIG. 1 is a block diagram showing a configuration example of an information processing device.
FIG. 2 is a flowchart showing an information processing method in a learning phase.
FIG. 3 is a flowchart showing an information processing method in an estimation phase.

(First embodiment)
FIG. 1 is a diagram showing a configuration example of an information processing device 1 according to a first embodiment of the present invention. The information processing device 1 receives a two-dimensional image 11 as input and is connected to an output device 13 and a filter update device 12. For learning, the information processing device 1 receives a two-dimensional image 11 showing an object (target) together with a label sequence indicating whether the center of that two-dimensional image 11 is the center of the object, and uses the filter update device 12 to learn a function that calculates, for each pixel of the two-dimensional image 11, how likely that pixel is to be the center of the object. After learning, the information processing device 1 receives a two-dimensional image 11 showing an object, estimates the position of the center of the object in that two-dimensional image 11, and outputs it to the output device 13.

The information processing device 1 includes a data acquisition unit 101, a filter holding unit 102, a coordinate information storage unit 103, a filter processing unit 104, a coordinate information restoration unit 105, and an output data generation unit 106.

The data acquisition unit 101 acquires the two-dimensional image 11. In the learning phase, the data acquisition unit 101 acquires two-dimensional images 11 to be used as teacher data. The two-dimensional image 11 is an image that mainly shows an object. In the estimation phase after the learning phase, the data acquisition unit 101 acquires an arbitrary two-dimensional image 11.

The filter holding unit 102 holds two or more filter parameters that are output sequentially to the filter processing unit 104. A filter parameter is a parameter used when spatial filter processing is performed on input data. For example, when a linear filter is applied as the spatial filter processing, the filter holding unit 102 holds the parameters (coefficients) of that linear filter. When a nonlinear filter such as a maximum value filter is applied, the filter holding unit 102 holds parameters such as the type of filter processing, the kernel size, and the size change rate from input to output. The filter holding unit 102 also holds the order in which the two or more filter parameters are output.

The size change rate is defined as follows. Two input data whose sizes differ sufficiently are given to the filter of the filter processing unit 104. The horizontal sizes of the two input data are sx1 and sx2, respectively. The horizontal sizes of the two corresponding output data are tx1 and tx2, respectively. When the sizes sx1, sx2, tx1, and tx2 can be expressed using constants α and β as in [Equation 1], α is the horizontal size change rate.
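[Equation 1] itself is not reproduced in this text. A form consistent with the surrounding definitions, offered here as an assumption rather than as the original equation, is an affine relation with a common slope α and offset β:

```latex
\[
  tx_1 = \alpha \, sx_1 + \beta, \qquad tx_2 = \alpha \, sx_2 + \beta
\]
```

Under this reading, α follows from the two measurements as (tx2 - tx1) / (sx2 - sx1), and β absorbs boundary effects such as the kernel size.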

The vertical size change rate is defined in the same way. Hereinafter, filter processing in which the filter processing unit 104 reduces the output data size relative to the input data size is called a reducing filter, and filter processing in which the filter processing unit 104 enlarges the output data size relative to the input data size is called an enlarging filter. A reducing filter has, for example, a size change rate (reduction rate) of 0.5 in each of the vertical and horizontal directions. An enlarging filter has, for example, a size change rate (enlargement rate) of 2 in each of the vertical and horizontal directions.
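As a worked illustration of the size change rate, the following minimal sketch measures it for a hypothetical 2x2, stride-2 filter with no padding. The helper names `output_size` and `size_change_rate`, the chosen input widths, and the assumption that output size depends affinely on input size (t = α·s + β) are illustrative and do not come from the source.

```python
def output_size(input_size: int, kernel: int, stride: int) -> int:
    """Output length of a sliding-window filter applied without padding."""
    return (input_size - kernel) // stride + 1


def size_change_rate(s1: int, s2: int, t1: int, t2: int) -> float:
    """Slope alpha of the assumed affine relation t = alpha * s + beta."""
    return (t2 - t1) / (s2 - s1)


# Two sufficiently different input widths (hypothetical values).
sx1, sx2 = 64, 256
tx1 = output_size(sx1, kernel=2, stride=2)
tx2 = output_size(sx2, kernel=2, stride=2)

alpha = size_change_rate(sx1, sx2, tx1, tx2)
print(tx1, tx2, alpha)
```

Measuring the rate from two sizes, rather than from a single input and output pair, keeps the constant offset caused by the kernel from distorting the result.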

The coordinate information storage unit 103 determines the size change rate of the filter processing to be performed next, based on the order in which the filter parameters held by the filter holding unit 102 are output. When the size change rate of the next filter processing is 0.5, the coordinate information storage unit 103 generates shift data by shifting the two-dimensional image 11 acquired by the data acquisition unit 101 in each coordinate axis direction. The amount of shift for each coordinate axis is the integer s given by [Equation 2], where r is the size change rate.

When the size change rate r is 0.5, the integer s is either 0 or 1. When the size change rate r is smaller than 1, the output data corresponding to the filter's input data has lower resolution, and information is lost. To avoid losing information, the coordinate information storage unit 103 generates a plurality of shift data before the filter processing unit 104 performs the filter processing. Because the filter processing unit 104 performs the filter processing on each shift data, the amount of information present before the resolution was lowered is preserved.
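[Equation 2] is not reproduced in this text. One range that is consistent with the statement that s is 0 or 1 when r is 0.5, given here as an assumption and not as the original equation, is:

```latex
\[
  s \in \mathbb{Z}, \qquad 0 \le s < \frac{1}{r}
\]
```

With r = 0.5 this gives s ∈ {0, 1}, that is, one shift per phase of the stride along each coordinate axis.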

The filter processing unit 104 performs filter processing on the two-dimensional image 11 or the shift data input from the coordinate information storage unit 103, based on the filter parameters input from the filter holding unit 102, and generates filtered data.

When the size change rate r of the filter processing performed by the filter processing unit 104 is 2, the coordinate information restoration unit 105 shifts the filtered data generated by the filter processing unit 104 in each coordinate axis direction of the two-dimensional image 11 and takes the sum. The amount of shift for each coordinate axis is the integer s given by [Equation 3], where the size change rate is written e.

When the size change rate e is 2, the integer s is either 0 or 1. When the size change rate e is greater than 1, the filter's output data is larger than the amount of information in its input data, so the interpolated portions depend on other portions and periodicity appears. To cancel the periodicity, the coordinate information storage unit 103 generates a plurality of input data as shift data, and the coordinate information restoration unit 105 adds together the filtered data produced by the filter processing unit 104 at different phases and with equal weights. Adding them in this way makes the way the data is used uniform, and the periodicity cancels.
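The cancellation can be illustrated with deliberately simple stand-in filters. In the sketch below, the reducing filter keeps every second sample and the enlarging filter inserts zeros. Both are assumptions chosen for clarity, not the learned filters of the embodiment, and all function names are hypothetical. A single branch leaves a period-2 pattern of zeros, while the equal-weight sum of the four shifted branches uses every input position once.

```python
from typing import List, Sequence, Tuple

Grid = List[List[int]]


def shift(data: Grid, sx: int, sy: int) -> Grid:
    """Crop so that element (sx, sy) becomes the corner (sx vertical, sy horizontal)."""
    return [row[sy:] for row in data[sx:]]


def reduce_by_2(data: Grid) -> Grid:
    """Toy reducing filter (rate 0.5): keep every second row and column."""
    return [row[::2] for row in data[::2]]


def enlarge_by_2(data: Grid) -> Grid:
    """Toy enlarging filter (rate 2): zero-insertion upsampling."""
    out: Grid = []
    for row in data:
        wide: List[int] = []
        for value in row:
            wide.extend([value, 0])
        out.append(wide)
        out.append([0] * len(wide))
    return out


def restore_and_sum(parts: Sequence[Grid], shifts: Sequence[Tuple[int, int]],
                    height: int, width: int) -> Grid:
    """Place each part at its stored shift and add all parts with equal weight."""
    total = [[0] * width for _ in range(height)]
    for part, (sx, sy) in zip(parts, shifts):
        for i, row in enumerate(part):
            for j, value in enumerate(row):
                if i + sx < height and j + sy < width:
                    total[i + sx][j + sy] += value
    return total


image = [[r * 4 + c + 1 for c in range(4)] for r in range(4)]
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
enlarged = [enlarge_by_2(reduce_by_2(shift(image, sx, sy))) for sx, sy in shifts]

single = restore_and_sum(enlarged[:1], shifts[:1], 4, 4)  # periodic zeros remain
combined = restore_and_sum(enlarged, shifts, 4, 4)        # every position covered
```

With these toy filters the combined result equals the original image, which is the simplest case of the data being used uniformly across all phases.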

The output data generation unit 106 is an estimation unit. Based on the filtered data obtained after all the filter parameters held by the filter holding unit 102 have been applied, it detects the region of the object, estimates the center of gravity of that region, and outputs that center of gravity to the output device 13 as the position of the center of the object in the two-dimensional image 11.
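A minimal sketch of this estimation step follows. It assumes the final filtered data is a per-pixel score map and that the object region is obtained by thresholding. The threshold value, the function name `region_centroid`, and the sample scores are hypothetical, since the source does not state how the region is detected.

```python
from typing import List, Optional, Tuple


def region_centroid(score_map: List[List[float]],
                    threshold: float = 0.5) -> Optional[Tuple[float, float]]:
    """Return the (row, column) centroid of pixels scoring at least `threshold`."""
    row_sum = col_sum = count = 0
    for i, row in enumerate(score_map):
        for j, score in enumerate(row):
            if score >= threshold:
                row_sum += i
                col_sum += j
                count += 1
    if count == 0:
        return None  # no region detected
    return row_sum / count, col_sum / count


score_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.9, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]
center = region_centroid(score_map)
print(center)
```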

As described above, in the learning phase, the information processing device 1 learns from a two-dimensional image 11 showing an object and a label indicating whether the center of that two-dimensional image 11 is near the image center of gravity of the object region. In the estimation phase after the learning phase, the information processing device 1 receives an arbitrary two-dimensional image 11, estimates the region at or near the center of the object that appeared in the learning phase, and outputs it to the output device 13.

In the information processing device 1, when the output data size of the filter of the filter processing unit 104 is smaller than the input data size, the coordinate information storage unit 103 generates a plurality of shift data according to the size change rate r, which prevents information loss in the data usage history.

When the output data size of the filter of the filter processing unit 104 is larger than the input data size, periodicity arises from the data interpolation that accompanies the enlargement. After the filter processing unit 104 performs processing that enlarges the input data, the coordinate information restoration unit 105 integrates the plurality of filtered data held since the reducing filter processing so that the data usage history becomes uniform, which keeps periodicity from arising. Therefore, the images that cover all phases of that period may be images of the minimum size that can be input to the filter of the filter processing unit 104, or of any larger size. Even when learning with such images, the information processing device 1 can calculate the region of the object with high accuracy.

FIG. 2 is a flowchart showing the information processing method of the information processing device 1 in the learning phase. In the learning phase, the information processing device 1 uses the filter update device 12 to perform learning processing that updates the filter parameters held by the filter holding unit 102.

In step S1101, the data acquisition unit 101 acquires a plurality of two-dimensional images 11 to be used for learning as teacher data, and outputs the two-dimensional images 11 to the coordinate information storage unit 103. The two-dimensional image 11 is a color image of a predetermined size, and is two-dimensional array data in which the color at each position is held as three integer values. The three integer values represent the luminance of the three primary colors red, blue, and green.

Although the information processing device 1 processes a plurality of (for example, 32) two-dimensional images 11 in parallel, the processing of a single two-dimensional image 11 is described below for simplicity.

The two-dimensional image 11 is used as the initial value of the input data of the filter processing unit 104. The coordinate information storage unit 103 prepares a shift stack, which is a stack-type array for storing shift amounts, and leaves the shift stack empty.

The filter update device 12 also receives the labels of the two-dimensional images 11 of the teacher data used for learning. The label is 1 when the center position of the two-dimensional image 11 is near the center of an object detected in the image, and 0 when the center position of the two-dimensional image 11 is not near the center of the object. Being near the center of the object means being within a predetermined distance of the image center of gravity of the object region.
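The labeling rule can be sketched as follows. The use of Euclidean distance, the pixel-center convention for the image center, and the name `center_label` are assumptions. The source states only that the label is 1 within a predetermined distance of the object region's center of gravity.

```python
import math
from typing import Tuple


def center_label(image_height: int, image_width: int,
                 object_centroid: Tuple[float, float],
                 max_distance: float) -> int:
    """Return 1 if the image center lies within max_distance of the centroid, else 0."""
    center_row = (image_height - 1) / 2
    center_col = (image_width - 1) / 2
    distance = math.hypot(center_row - object_centroid[0],
                          center_col - object_centroid[1])
    return 1 if distance <= max_distance else 0


# Hypothetical 32x32 training images and a 3-pixel tolerance.
positive = center_label(32, 32, object_centroid=(15.5, 15.5), max_distance=3.0)
negative = center_label(32, 32, object_centroid=(2.0, 2.0), max_distance=3.0)
print(positive, negative)
```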

In step S1102, the filter holding unit 102 outputs the plurality of filter parameters, and the order in which they are to be output, to the coordinate information storage unit 103 and the filter processing unit 104. The plurality of filter parameters include at least the parameters of a reducing filter and the parameters of an enlarging filter. For example, the size change rate of the reducing filter is 0.5, and the size change rate of the enlarging filter is 2.

In step S1103, the coordinate information storage unit 103 refers to the first of the plurality of filter parameters and determines whether the filter that uses the first filter parameter is a reducing filter. If it is a reducing filter, the coordinate information storage unit 103 generates shift data by shifting the two-dimensional image 11. As described above, when the size change rate of the reducing filter is 0.5, the shift amount is 0 or 1.

The coordinate information storage unit 103 shifts the two-dimensional image 11 by the shift amount SHIFTx in the vertical direction and by the shift amount SHIFTy in the horizontal direction, and generates first to fourth shift data. The first shift data has a vertical shift amount SHIFTx of 0 and a horizontal shift amount SHIFTy of 0. The second shift data has a vertical shift amount SHIFTx of 0 and a horizontal shift amount SHIFTy of 1. The third shift data has a vertical shift amount SHIFTx of 1 and a horizontal shift amount SHIFTy of 0. The fourth shift data has a vertical shift amount SHIFTx of 1 and a horizontal shift amount SHIFTy of 1. Each of the first to fourth shift data is a two-dimensional image whose corner is the location in the two-dimensional image 11 whose coordinates are the shift amounts SHIFTx and SHIFTy.

The coordinate information storage unit 103 also stores, in the stack, the four pairs of shift amounts SHIFTx and SHIFTy corresponding to the first to fourth shift data, as the coordinate information of the first to fourth shift data. The coordinate information storage unit 103 outputs the first to fourth shift data to the filter processing unit 104.
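The generation of the four shift data and the storing of their shift amounts can be sketched as below. The function name `make_shift_data`, the representation of the shift stack as a Python list, and the rule that the shift amounts range over 0 to 1/rate - 1 are assumptions made for illustration. Following the text, SHIFTx is the vertical shift and SHIFTy the horizontal shift.

```python
from itertools import product
from typing import List, Tuple

Grid = List[List[int]]


def make_shift_data(data: Grid, rate: float,
                    shift_stack: List[List[Tuple[int, int]]]) -> List[Grid]:
    """Create one cropped copy per (SHIFTx, SHIFTy) pair and push the pairs on the stack."""
    n = round(1 / rate)  # assumption: shift amounts 0 .. 1/rate - 1 per axis
    shifts = list(product(range(n), repeat=2))  # (0,0), (0,1), (1,0), (1,1) for rate 0.5
    shift_stack.append(shifts)
    # Each shift data has its corner at position (SHIFTx, SHIFTy) of the original.
    return [[row[sy:] for row in data[sx:]] for sx, sy in shifts]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
shift_stack: List[List[Tuple[int, int]]] = []  # starts empty
shift_data = make_shift_data(image, rate=0.5, shift_stack=shift_stack)
print(shift_data[3], shift_stack[-1])
```

Keeping the shift amounts on a stack lets a later enlarging filter retrieve the amounts stored by the most recent reducing filter first.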

If the filter that uses the first filter parameter is not a reducing filter, the coordinate information storage unit 103 outputs the two-dimensional image 11 to the filter processing unit 104.

In step S1104, if the filter is a reducing filter, the filter processing unit 104 performs reducing filter processing using the first filter parameter on each of the first to fourth shift data, and generates first to fourth filtered data. The filter processing unit 104 then outputs the first to fourth filtered data to the coordinate information restoration unit 105.

If the filter is not a reducing filter, the filter processing unit 104 performs filter processing using the first filter parameter on the two-dimensional image 11 and outputs filtered data. The filter processing unit 104 outputs either the first to fourth filtered data or the single filtered data to the coordinate information restoration unit 105.

When performing linear filter processing, the filter processing unit 104 performs the filter processing using the filter parameters (filter coefficients) input from the filter holding unit 102 and outputs filtered data. When performing filter processing such as a maximum value filter, an activation function such as the ReLU function, batch normalization, or deconvolution, the filter processing unit 104 performs the filter processing based on parameters input from the filter holding unit 102, such as the kernel size and the image size change rate. So that the gradient direction for updating the filter parameters can be determined in the filter update processing of step S1107, the filter processing unit 104 saves the filtered data each time it performs the processing of step S1104.

In step S1105, if the filter executed in step S1104 is not an enlarging filter, the coordinate information restoration unit 105 outputs the first to fourth filtered data or the single filtered data output by the filter processing unit 104 to the coordinate information storage unit 103.

In step S1106, if the filter processing unit 104 has not yet performed the filter processing for all of the plurality of filter parameters, the coordinate information storage unit 103 returns to step S1103. At this stage only the first of the plurality of filter parameters has been processed, so the processing returns to step S1103 to process the second filter parameter.

In step S1103, the coordinate information storage unit 103 refers to the second of the plurality of filter parameters and determines whether the filter that uses the second filter parameter is a reducing filter. If it is a reducing filter, the coordinate information storage unit 103 generates, in the same manner as above, a plurality of shift data by shifting the filtered data input from the coordinate information restoration unit 105. The coordinate information storage unit 103 then stores, in the same manner as above, the plurality of shift amounts corresponding to the plurality of shift data in the stack as the coordinate information of the plurality of shift data, and outputs the shift data to the filter processing unit 104.

First, the case where the first filter is not a reducing filter and the second filter is a reducing filter is described. In that case, in the same manner as above, the coordinate information storage unit 103 shifts the single filtered data by the shift amount SHIFTx in the vertical direction and by the shift amount SHIFTy in the horizontal direction, and generates first to fourth shift data. The coordinate information storage unit 103 then stores the four pairs of shift amounts SHIFTx and SHIFTy corresponding to the first to fourth shift data in the stack as the coordinate information of the first to fourth shift data, and outputs the first to fourth shift data to the filter processing unit 104.

Next, the case where the first filter is a reducing filter and the second filter is also a reducing filter is described. In that case, the coordinate information storage unit 103 shifts each of the first to fourth filtered data by the shift amount SHIFTx in the vertical direction and by the shift amount SHIFTy in the horizontal direction, where SHIFTx and SHIFTy are each 0 or 1. Applying the four shifts to each of the first to fourth filtered data yields first to sixteenth shift data, which the coordinate information storage unit 103 outputs to the filter processing unit 104. The coordinate information storage unit 103 also stores the sixteen pairs of shift amounts SHIFTx and SHIFTy corresponding to the first to sixteenth shift data in the stack as the coordinate information of the first to sixteenth shift data.

Here, the first to fourth shift data are obtained by shifting the first filtered data. The fifth to eighth shift data are obtained by shifting the second filtered data. The ninth to twelfth shift data are obtained by shifting the third filtered data. The thirteenth to sixteenth shift data are obtained by shifting the fourth filtered data.
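The growth from four to sixteen shift data, and the ordering described above, can be sketched with two cascaded reducing stages. The subsampling filter `reduce_by_2` is a toy stand-in for the reducing filter, and `reduce_stage` is a hypothetical helper that combines the shifting of step S1103 with the filtering of step S1104.

```python
from itertools import product
from typing import List, Tuple

Grid = List[List[int]]
SHIFTS: List[Tuple[int, int]] = list(product(range(2), repeat=2))


def shift(data: Grid, sx: int, sy: int) -> Grid:
    """Crop so that element (sx, sy) becomes the corner."""
    return [row[sy:] for row in data[sx:]]


def reduce_by_2(data: Grid) -> Grid:
    """Toy reducing filter (rate 0.5): keep every second row and column."""
    return [row[::2] for row in data[::2]]


def reduce_stage(inputs: List[Grid],
                 shift_stack: List[List[Tuple[int, int]]]) -> List[Grid]:
    """Shift every input four ways, store the shift amounts, then filter each shift data."""
    # Outer loop over inputs, inner loop over shifts: shift data 1-4 come from
    # the first input, 5-8 from the second, and so on.
    shift_stack.append([s for _ in inputs for s in SHIFTS])
    return [reduce_by_2(shift(d, sx, sy)) for d in inputs for sx, sy in SHIFTS]


image = [[r * 8 + c for c in range(8)] for r in range(8)]
stack: List[List[Tuple[int, int]]] = []
first = reduce_stage([image], stack)   # 4 filtered data
second = reduce_stage(first, stack)    # 16 filtered data
print(len(first), len(second), len(stack[-1]))
```

Each additional reducing stage with rate 0.5 multiplies the number of branches by four, so the stack entry for the second stage holds sixteen pairs.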

If the filter that uses the second filter parameter is not a reducing filter (including the case where it is an enlarging filter), the coordinate information storage unit 103 outputs the filtered data input from the coordinate information restoration unit 105 to the filter processing unit 104, in the same manner as above.

In step S1104, if the filter is a reducing filter, the filter processing unit 104 performs filter processing using the second filter parameter on each piece of input data and generates filtered data.

First, the case where neither the first nor the second filter is a reducing filter is described. In that case, the filter processing unit 104 performs filter processing using the second filter parameter on the single filtered data and generates filtered data. The filter processing unit 104 then outputs the filtered data to the coordinate information restoration unit 105.

Next, the case where the first filter is a reducing filter and the second filter is not a reducing filter is described. In that case, the filter processing unit 104 performs filter processing using the second filter parameter on each of the first to fourth filtered data and generates new first to fourth filtered data. The filter processing unit 104 then outputs the first to fourth filtered data to the coordinate information restoration unit 105.

Next, the case where the first filter is not a reducing filter and the second filter is a reducing filter is described. In that case, the filter processing unit 104 performs filter processing using the second filter parameter on each of the first to fourth shift data and generates first to fourth filtered data. The filter processing unit 104 then outputs the first to fourth filtered data to the coordinate information restoration unit 105.

Next, the case where both the first and the second filters are reducing filters is described. In that case, the filter processing unit 104 performs filter processing using the second filter parameter on the first to sixteenth shift data and generates first to sixteenth filtered data. The filter processing unit 104 then outputs the first to sixteenth filtered data to the coordinate information restoration unit 105.

In step S1105, if the filter executed in step S1104 is not an enlarging filter, the coordinate information restoration unit 105 outputs the filtered data received from the filter processing unit 104 to the coordinate information storage unit 103 and the filter update device 12.

If the filter executed in step S1104 is an enlarging filter, the coordinate information restoration unit 105 retrieves the shift amounts SHIFTx and SHIFTy that the coordinate information storage unit 103 most recently saved on the stack. As described above, when the size change rate of the enlarging filter is 2, each shift amount is 0 or 1. Likewise, when the size change rate of the reducing filter is 0.5, each shift amount is 0 or 1. When the size change rate of the enlarging filter (= 2) is the reciprocal of the size change rate of the reducing filter (= 0.5), the coordinate information restoration unit 105 can use the shift amounts SHIFTx and SHIFTy saved on the stack.

Consider the case where the first filter is a reducing filter and the second filter is an enlarging filter. In this case, the coordinate information restoration unit 105 retrieves the four sets of shift amounts SHIFTx and SHIFTy that the coordinate information storage unit 103 most recently saved on the stack. The coordinate information restoration unit 105 then shifts each of the first to fourth filtered data by SHIFTx in the vertical direction and by SHIFTy in the horizontal direction to generate first to fourth shift data, thereby restoring the coordinate information of the first to fourth filtered data. The first shift data is the first filtered data shifted with a vertical shift amount SHIFTx of 0 and a horizontal shift amount SHIFTy of 0. The second shift data is the second filtered data shifted with SHIFTx of 0 and SHIFTy of 1. The third shift data is the third filtered data shifted with SHIFTx of 1 and SHIFTy of 0. The fourth shift data is the fourth filtered data shifted with SHIFTx of 1 and SHIFTy of 1.
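The four shift combinations listed above can be enumerated programmatically. The following is a minimal sketch; the name `SHIFT_PAIRS` is hypothetical and is introduced only for illustration.

```python
from itertools import product

# (SHIFTx, SHIFTy) pairs in the order used for the first to fourth data:
# (0, 0), (0, 1), (1, 0), (1, 1). SHIFT_PAIRS is a hypothetical name.
SHIFT_PAIRS = list(product((0, 1), repeat=2))

for index, (shift_x, shift_y) in enumerate(SHIFT_PAIRS, start=1):
    # Following the text, SHIFTx is the vertical shift and SHIFTy the horizontal one.
    print(f"data {index}: SHIFTx={shift_x} (vertical), SHIFTy={shift_y} (horizontal)")
```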

The coordinate information restoration unit 105 then aligns the coordinate positions of the first to fourth shift data based on the shift amounts SHIFTx and SHIFTy, adds (integrates) the aligned first to fourth shift data coordinate by coordinate, and generates integrated data. Instead of addition, the coordinate information restoration unit 105 may perform the integration by averaging or multiplication.

The coordinate information restoration unit 105 then deletes the outermost one pixel of the integrated data, where periodicity remains. Subsequent filters perform filter processing that takes into account the change in data size caused by this deletion. The coordinate information restoration unit 105 outputs the integrated data after deletion, as filtered data, to the coordinate information storage unit 103 and the filter update device 12. By generating the integrated data, the coordinate information restoration unit 105 can remove the periodicity from the filtered data.
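The shift, add, and border-deletion steps above can be sketched as follows. This is only an illustration under stated assumptions: the text does not define the exact shift operation, so the sketch assumes that "shifting" means placing each map on a canvas one pixel larger at an offset of (SHIFTx, SHIFTy), and the function name `integrate` is hypothetical.

```python
def integrate(filtered, shifts):
    """Sum equally sized 2-D maps after offsetting each by its shift pair,
    then drop the outermost one-pixel border.

    Assumption: a shift places the map on a canvas that is one pixel
    larger in each direction, at offset (shift_x, shift_y).
    """
    height, width = len(filtered[0]), len(filtered[0][0])
    canvas = [[0.0] * (width + 1) for _ in range(height + 1)]
    for data, (shift_x, shift_y) in zip(filtered, shifts):
        for row in range(height):
            for col in range(width):
                # SHIFTx is the vertical offset, SHIFTy the horizontal one.
                canvas[row + shift_x][col + shift_y] += data[row][col]
    # Delete the outermost one pixel on every side.
    return [line[1:-1] for line in canvas[1:-1]]


shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
ones = [[1.0] * 3 for _ in range(3)]
print(integrate([ones, ones, ones, ones], shifts))
```

In this toy model only the border cells of the canvas receive fewer than four contributions, so every cell that survives the deletion is the sum of all four maps. Averaging, which the text allows as an alternative, would divide each surviving cell by four.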

In step S1106, if the filter processing unit 104 has not yet executed filter processing for all of the plurality of filter parameters, the coordinate information storage unit 103 returns to step S1103, and the processing is repeated for the third filter parameter.

Next, the processing of the coordinate information restoration unit 105 in step S1105 is described for the case where the first and second filters are reducing filters and the third filter is an enlarging filter. In this case, the coordinate information restoration unit 105 retrieves the sixteen sets of shift amounts SHIFTx and SHIFTy that the coordinate information storage unit 103 most recently saved on the stack. The coordinate information restoration unit 105 shifts each of the first to sixteenth filtered data by the corresponding one of the sixteen sets of shift amounts SHIFTx and SHIFTy to generate first to sixteenth shift data, thereby restoring the coordinate information of the first to sixteenth filtered data. Here, these first to sixteenth shift data are generated from the first to sixteenth shift data generated by the coordinate information storage unit 103, respectively.

The coordinate information restoration unit 105 aligns the coordinate positions of the first to fourth shift data based on the shift amounts SHIFTx and SHIFTy corresponding to them, adds the aligned first to fourth shift data coordinate by coordinate, and generates first integrated data.

The coordinate information restoration unit 105 aligns the coordinate positions of the fifth to eighth shift data based on the shift amounts SHIFTx and SHIFTy corresponding to them, adds the aligned fifth to eighth shift data coordinate by coordinate, and generates second integrated data.

The coordinate information restoration unit 105 aligns the coordinate positions of the ninth to twelfth shift data based on the shift amounts SHIFTx and SHIFTy corresponding to them, adds the aligned ninth to twelfth shift data coordinate by coordinate, and generates third integrated data.

The coordinate information restoration unit 105 aligns the coordinate positions of the thirteenth to sixteenth shift data based on the shift amounts SHIFTx and SHIFTy corresponding to them, adds the aligned thirteenth to sixteenth shift data coordinate by coordinate, and generates fourth integrated data.

The coordinate information restoration unit 105 then deletes, from each of the first to fourth integrated data, the outermost one pixel where periodicity remains. The coordinate information restoration unit 105 outputs the first to fourth integrated data after deletion, as first to fourth filtered data, to the coordinate information storage unit 103.
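The grouping of sixteen shift data into four integrated data can be sketched as a simple chunking step. The helper name `groups_of_four` is hypothetical, and the sketch assumes the data are already ordered so that the first to fourth, fifth to eighth, and so on belong together, as in the description above.

```python
def groups_of_four(items):
    """Split a list into consecutive groups of four (1-4, 5-8, 9-12, ...)."""
    if len(items) % 4 != 0:
        raise ValueError("the number of items must be a multiple of four")
    return [items[start:start + 4] for start in range(0, len(items), 4)]


# Labels 1..16 stand in for the first to sixteenth shift data.
labels = list(range(1, 17))
for number, group in enumerate(groups_of_four(labels), start=1):
    print(f"integrated data {number} is built from shift data {group}")
```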

In step S1106, if the filter processing unit 104 has not yet executed filter processing for all of the plurality of filter parameters, the coordinate information storage unit 103 returns to step S1103, and the processing is repeated for the fourth filter parameter.

Next, the processing of the coordinate information restoration unit 105 in step S1105 is described for the case where the first and second filters are reducing filters and the third and fourth filters are enlarging filters. In this case, the coordinate information restoration unit 105 retrieves the four sets of shift amounts SHIFTx and SHIFTy remaining on the stack of the coordinate information storage unit 103. The coordinate information restoration unit 105 then shifts each of the first to fourth filtered data received from the filter processing unit 104 by the corresponding one of the four sets of shift amounts SHIFTx and SHIFTy to generate first to fourth shift data. The coordinate information restoration unit 105 thereby restores the coordinate information of the first to fourth filtered data.

The coordinate information restoration unit 105 aligns the coordinate positions of the first to fourth shift data based on the four sets of shift amounts SHIFTx and SHIFTy, adds the aligned first to fourth shift data coordinate by coordinate, and generates integrated data.

The coordinate information restoration unit 105 then deletes, from the integrated data, the outermost one pixel where periodicity remains. The coordinate information restoration unit 105 outputs the integrated data after deletion, as filtered data, to the coordinate information storage unit 103.

In step S1106, if the filter processing unit 104 has executed filter processing for all of the plurality of filter parameters, the coordinate information storage unit 103 proceeds to step S1107. At this point, the coordinate information restoration unit 105 has output, to the coordinate information storage unit 103, filtered data with a size of one coordinate (containing two real numbers). The size of the two-dimensional image 11 and the filters of the filter processing unit 104 are adjusted so that the output has this size. Across the filters of all the filter parameters, the number of enlarging filters is less than or equal to the number of reducing filters.
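The way the number of data variants grows and shrinks over a filter sequence can be traced with a small stack model. This is a sketch under assumptions: the labels "reduce", "enlarge", and "other" and the function `trace_variants` are hypothetical, and the stack here records only how many sets of shift amounts each reducing filter saved.

```python
def trace_variants(filter_kinds):
    """Return the number of data variants after each filter.

    A reducing filter multiplies the count by four and pushes the number
    of saved shift-amount sets; an enlarging filter pops the most recent
    entry and divides the count by four.
    """
    stack = []  # number of shift-amount sets saved by each reducing filter
    count = 1
    trace = []
    for kind in filter_kinds:
        if kind == "reduce":
            count *= 4
            stack.append(count)
        elif kind == "enlarge":
            if not stack:
                raise ValueError("enlarging filter without a matching reducing filter")
            stack.pop()
            count //= 4
        trace.append(count)
    return trace


print(trace_variants(["reduce", "reduce", "enlarge", "enlarge"]))
```

For two reducing filters followed by two enlarging filters, the trace matches the walk-through above: four, then sixteen, then four, then one set of filtered data. The `ValueError` reflects the stated condition that enlarging filters never outnumber reducing filters.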

If the output size is not one coordinate, the filter processing unit 104 can append an averaging filter or the like as the last filter so that the output size becomes one coordinate.

The two real numbers mentioned above represent the probability that the center of the two-dimensional image 11 is near the center of the object and the probability that the center of the two-dimensional image 11 is not near the center of the object. The information processing device 1 performs the above processing on a plurality of two-dimensional images 11 in parallel.

In step S1107, the filter update device 12 updates the filter parameters based on the labels of the plurality of two-dimensional images 11 and on the filtered data for those images that the coordinate information restoration unit 105 outputs last. When a plurality of filtered data exist for one two-dimensional image 11, the filter update device 12 updates the filter parameters based on that plurality of filtered data.

The plurality of two-dimensional images 11 are training data. The label of a two-dimensional image 11 is 1 when the center position of the image is near the center of the object and 0 when the center position of the image is not near the center of the object.

The filtered data contains a real number representing the probability that the center of the two-dimensional image 11 is near the center of the object and a real number representing the probability that it is not. The filter update device 12 updates the filter parameters so that the filtered data approaches the label. Specifically, when the label is 1, indicating that the image center is near the center of the object, the filter update device 12 updates the filter parameters so that, in the filtered data, the probability of being near the center of the object becomes higher than the probability of not being near it. When the label is 0, indicating that the image center is not near the center of the object, the filter update device 12 updates the filter parameters so that the probability of not being near the center of the object becomes higher than the probability of being near it.

For example, the filter update device 12 updates the filter parameters using the cross-entropy error computed on the result of applying a softmax function to the filtered data, with respect to the probability of being near the center of the object. In that case, the filter update device 12 can perform the update by a gradient method using the method of Non-Patent Document 2. For the filter parameters applied during the filter processing by the filter processing unit 104, the filter update device 12 can update the parameters quickly by using information on the output results of the other layers.
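The loss described above can be illustrated with a two-class softmax followed by cross-entropy. This is a minimal sketch, not the patented update procedure: the function names are hypothetical, and the ordering of the two outputs (near-center score first) is an assumption.

```python
import math


def softmax(values):
    """Numerically stable softmax over a short list of real numbers."""
    largest = max(values)
    exps = [math.exp(value - largest) for value in values]
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy(outputs, label):
    """Cross-entropy error for one image.

    outputs: (near_center_score, not_near_center_score), an assumed ordering.
    label:   1 if the image center is near the object center, else 0.
    """
    probabilities = softmax(outputs)
    target_index = 0 if label == 1 else 1
    return -math.log(probabilities[target_index])


print(cross_entropy((2.0, -1.0), 1))  # small error: output agrees with label 1
print(cross_entropy((2.0, -1.0), 0))  # large error: output disagrees with label 0
```

A gradient method would then adjust the filter parameters in the direction that lowers this error, which is the role the text assigns to the method of Non-Patent Document 2.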

The filter update device 12 outputs the updated filter parameters to the filter holding unit 102. The filter holding unit 102 holds the updated filter parameters.

In step S1108, the information processing device 1 determines whether to end the learning processing. Some of the plurality of two-dimensional images 11 described above are test two-dimensional images. In step S1107, the filter update device 12 does not update the filter parameters for the test two-dimensional images; it updates the filter parameters only for the two-dimensional images 11 other than the test two-dimensional images.

The two-dimensional images 11 other than the test two-dimensional images are used as training data in the filter parameter update processing, so their filtered data approaches the labels as described above. The filtered data for the test two-dimensional images is expected to approach the labels in the same way, but because the test images are not used in the filter parameter update processing, this is not guaranteed. In the first embodiment, the degree of agreement between the results for the test two-dimensional images and the corresponding labels is treated as the accuracy and is used to decide when to end the update processing.

The information processing device 1 takes, from the filtered data finally obtained with a test two-dimensional image as input, whichever is larger of the probability of being near the center of the object and the probability of not being near it, and evaluates how well that result agrees with the label. If, for the plurality of test two-dimensional images, the degree of agreement has improved between the past several executions of step S1107 and the most recent one, the information processing device 1 judges that further improvement is still possible and returns to step S1102. In step S1102, the filter holding unit 102 outputs the updated filter parameters and the order in which they are to be output to the coordinate information storage unit 103 and the filter processing unit 104, and the above processing is repeated.

If, for the plurality of test two-dimensional images, the degree of agreement has not improved between the past several executions of step S1107 and the most recent one, the information processing device 1 judges that no further improvement can be expected and ends the learning processing.
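The end-of-learning check above can be sketched as a match-rate computation followed by a patience-based stopping rule. The text only says "the past several" executions, so the `patience` parameter, the function names, and the exact definition of "improvement" used here are assumptions made for illustration.

```python
def match_rate(outputs, labels):
    """Fraction of test images whose larger output agrees with the label.

    outputs: list of (near_center, not_near_center) pairs (assumed ordering).
    labels:  list of 1 (near center) or 0 (not near center).
    """
    hits = 0
    for (near, not_near), label in zip(outputs, labels):
        predicted = 1 if near > not_near else 0
        hits += predicted == label
    return hits / len(labels)


def should_stop(history, patience=3):
    """Stop when the last `patience` match rates never beat the earlier best.

    history: match rates after each update, oldest first.
    patience: hypothetical stand-in for "the past several" updates.
    """
    if len(history) <= patience:
        return False
    best_before = max(history[:-patience])
    return max(history[-patience:]) <= best_before


rate = match_rate([(0.9, 0.1), (0.2, 0.8), (0.6, 0.4)], [1, 0, 0])
print(rate)
print(should_stop([0.5, 0.6, 0.7, 0.7, 0.69, 0.7]))
```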

FIG. 3 is a flowchart showing the information processing method of the information processing device 1 in the estimation phase. In the estimation phase, the information processing device 1 outputs the center position of an object in an arbitrary two-dimensional image 11 to the output device 13. The filter holding unit 102 holds the filter parameters updated in the learning phase.

In step S1201, the data acquisition unit 101 acquires a two-dimensional image 11 showing an object and outputs the two-dimensional image 11 to the coordinate information storage unit 103. This two-dimensional image 11 is a color image of arbitrary size, equal to or larger than the two-dimensional image 11 used in the learning processing of FIG. 2.

In step S1202, the information processing device 1 performs the same processing as in step S1102 of FIG. 2.

In step S1203, the information processing device 1 performs the same processing as in step S1103 of FIG. 2.

In step S1204, the information processing device 1 performs the same processing as in step S1104 of FIG. 2. Because the filter update processing of step S1107 in FIG. 2 is not performed, the filter processing unit 104 does not need to save the filtered data.

In step S1205, the information processing device 1 performs the same processing as in step S1105 of FIG. 2.

In step S1206, the information processing device 1 performs the same processing as in step S1106 of FIG. 2. If the filter processing unit 104 has executed filter processing for all of the plurality of filter parameters, the information processing device 1 proceeds to step S1209.

In step S1209, the output data generation unit 106 receives the final filtered data from the coordinate information restoration unit 105 and estimates the coordinates of the center position of the object in the two-dimensional image 11. The size of the two-dimensional image 11 in step S1201 is equal to or larger than the size of the two-dimensional image 11 in step S1101 of FIG. 2. Therefore, the final filtered data that the coordinate information restoration unit 105 outputs to the output data generation unit 106 is one coordinate or more in size and contains, for each coordinate, a real number representing the probability of being near the center of the object and a real number representing the probability of not being near the center of the object.

First, consider the case where, among the filters of the plurality of filter parameters, the number of reducing filters equals the number of enlarging filters. In this case, the coordinate information restoration unit 105 outputs a single set of final filtered data to the output data generation unit 106. Based on the filtered data, the output data generation unit 106 calculates the region of coordinates at which the probability of being near the center of the object is higher than the probability of not being near it. "Near the center of the object" means within a predetermined distance from the image centroid of the object region. If the calculated region is larger than the area of a circle whose radius is the predetermined distance, the output data generation unit 106 calculates the centroid of the calculated region and outputs that centroid as the center coordinates of the object. When the two-dimensional image 11 contains a plurality of objects, the output data generation unit 106 can output the center coordinates of the plurality of objects. If the calculated region is smaller than the area of a circle whose radius is the predetermined distance, the output data generation unit 106 judges that the calculated region is noise and that no object exists in the two-dimensional image 11, and does not output center coordinates. The output data generation unit 106 can estimate the position of the center of the object or the position of the whole object and output the estimated position to the output device 13.
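The region and centroid computation above can be sketched for the single-object case. This is an illustration under assumptions: the function name `estimate_center` is hypothetical, the whole thresholded region is treated as one object (separating several objects would need connected-component labeling, which is omitted), and a region exactly equal to the circle area is treated as noise because the text leaves that case open.

```python
import math


def estimate_center(near, not_near, radius):
    """Return the (x, y) centroid of the near-center region, or None for noise.

    near, not_near: 2-D lists with one score per coordinate.
    radius: the predetermined distance that defines "near the center".
    """
    region = [
        (x, y)
        for y, row in enumerate(near)
        for x, value in enumerate(row)
        if value > not_near[y][x]
    ]
    # A region no larger than the circle of the given radius is treated as noise.
    if len(region) <= math.pi * radius ** 2:
        return None
    center_x = sum(x for x, _ in region) / len(region)
    center_y = sum(y for _, y in region) / len(region)
    return (center_x, center_y)


# Toy 5x5 maps with a 3x3 block of high near-center scores in the middle.
near = [[1.0 if 1 <= x <= 3 and 1 <= y <= 3 else 0.0 for x in range(5)] for y in range(5)]
not_near = [[0.5] * 5 for _ in range(5)]
print(estimate_center(near, not_near, radius=1))
print(estimate_center(near, not_near, radius=2))
```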

Next, consider the case where, among the filters of the plurality of filter parameters, the number of reducing filters is greater than the number of enlarging filters. For example, the coordinate information restoration unit 105 outputs first to fourth filtered data to the output data generation unit 106. Using the four sets of shift amounts SHIFTx and SHIFTy, the output data generation unit 106 generates data in which the value of the element at coordinates (x, y) of each of the first to fourth filtered data is copied to coordinates (2x+SHIFTx, 2y+SHIFTy). For example, the output data generation unit 106 generates data in which the value of the element at coordinates (x, y) of the first filtered data is copied to coordinates (2x+0, 2y+0), and data in which the value of the element at coordinates (x, y) of the second filtered data is copied to coordinates (2x+0, 2y+1). This yields integrated data, enlarged by a factor of two, in which the first to fourth filtered data are combined.
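The copy rule (x, y) to (2x+SHIFTx, 2y+SHIFTy) amounts to interleaving four maps into one map of twice the size. The following is a minimal sketch; the function name `interleave` is hypothetical, and x is treated as the vertical (row) index to follow the convention, used earlier in the description, that SHIFTx is the vertical shift.

```python
def interleave(filtered, shifts):
    """Copy element (x, y) of each map to (2x + SHIFTx, 2y + SHIFTy).

    filtered: four equally sized 2-D lists.
    shifts:   the matching (SHIFTx, SHIFTy) pairs.
    x indexes rows (vertical) and y indexes columns (horizontal).
    """
    height, width = len(filtered[0]), len(filtered[0][0])
    merged = [[None] * (2 * width) for _ in range(2 * height)]
    for data, (shift_x, shift_y) in zip(filtered, shifts):
        for x in range(height):
            for y in range(width):
                merged[2 * x + shift_x][2 * y + shift_y] = data[x][y]
    return merged


shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
maps = [[[1]], [[2]], [[3]], [[4]]]  # four 1x1 maps standing in for filtered data
print(interleave(maps, shifts))
```

When more reducing filters remain unmatched, the same step would be applied repeatedly, four maps at a time, until a single map remains.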

Next, consider the case where the coordinate information restoration unit 105 outputs sixteen or more filtered data to the output data generation unit 106. In this case, as in step S1105, the output data generation unit 106 repeats the above integration in units of four filtered data and finally obtains a single set of integrated data.

In other words, the output data generation unit 106 repeats the processing of generating data in which the value of the element at coordinates (x, y) of the filtered data is copied to coordinates (2x+SHIFTx, 2y+SHIFTy) as many times as the number by which the reducing filters exceed the enlarging filters.

The output data generation unit 106 then uses the integrated data to output the center coordinates of the object in the same manner as described above.

Next, the effects of the first embodiment are described. For reducing filter processing, the coordinate information storage unit 103 generates a plurality of shift data and can thereby prevent loss of coordinate information. For an enlarging filter, periodicity arises in the interpolated portion of the data that is added as the filtered data grows in size; the coordinate information restoration unit 105 can cancel this periodicity by generating integrated data.

(Second Embodiment)
In the first embodiment, the filter holding unit 102 can hold random filter parameters as the initial values for the learning phase. In the second embodiment of the present invention, the filter holding unit 102 uses, as the initial values for the learning phase, filter parameters that have been updated in advance with another data set or the like. Filter parameters updated in advance with another data set or the like classify the two-dimensional image 11 better than random filter parameters. In the learning phase, the information processing device 1 can obtain highly accurate filter parameters through fine-tuning while reducing the time spent on learning.

However, filter parameters updated in advance with another data set or the like do not presuppose the use of the coordinate information storage unit 103 and the coordinate information restoration unit 105, and therefore need to be modified. The points in which the second embodiment differs from the first embodiment are described below.

As in the first embodiment, the two-dimensional image 11 used for fine-tuning may be of the minimum size that can be input to the filter processing or of any larger size.

In step S1102, the filter holding unit 102 holds the plurality of filter parameters updated in advance and the order in which the filter parameters are to be output. When the filters of the plurality of filter parameters include an enlarging filter, the coordinate information restoration unit 105 deletes the outermost one pixel of data as described above. The filter parameters updated in advance do not presuppose such deletion. The filter holding unit 102 therefore modifies the filter parameters updated in advance so that the filter processing unit 104 adjusts the data size after the enlarging filter processing.

The filter processing of the filter parameters updated in advance may also include processing that adjusts the size for the filter processing by filling the periphery of the two-dimensional image 11 with a constant value (hereinafter referred to as padding). In that case, the filter holding unit 102 modifies the filter parameters updated in advance so that the filter processing unit 104 does not perform padding and the size of the two-dimensional image 11 is adjusted afterward.

In the first embodiment, the plurality of filter parameters were adjusted so that the filtered data finally output by the filter processing unit 104 is one coordinate in size. When filter parameters updated in advance are used, however, the filtered data finally output by the filter processing unit 104 is not necessarily one coordinate in size. The information processing device 1 therefore determines the size of the two-dimensional image 11 so that the filtered data finally output by the filter processing unit 104 is one coordinate in size.

For example, the information processing device 1 changes the size of the two-dimensional image 11 little by little and searches for a size of the two-dimensional image 11 at which the filtered data finally output by the filter processing unit 104 has the size of a single coordinate. If no size of the two-dimensional image 11 yields final filtered data with the size of a single coordinate, the filter holding unit 102 modifies the plurality of filter parameters so that a linear filter, whose coefficients are newly initialized with random values, is appended as the last filter. The information processing device 1 then determines the size of the two-dimensional image 11 at which the final filtered data has the size of a single coordinate by an exhaustive search over sizes of this linear filter that are integers of 2 or more in both height and width.
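The size search described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of the embodiment: the layer description (`"conv"`, `"up"`, `"crop"`), the unpadded size formula, the search limits `max_size` and `max_kernel`, and all function names are hypothetical, and only one image axis is modeled (the same search would be applied to the other axis).

```python
from typing import List, Optional, Tuple

# Hypothetical layer description (not from the source):
#   ("conv", kernel, stride)  unpadded filter
#   ("up", factor, 0)         enlargement by an integer factor
#   ("crop", border, 0)       deletion of `border` pixels on each side
Layer = Tuple[str, int, int]


def output_size(n: int, layers: List[Layer]) -> Optional[int]:
    """Propagate a 1-D data size through the chain; None if a layer cannot accept it."""
    for kind, a, b in layers:
        if kind == "conv":
            if n < a:
                return None
            n = (n - a) // b + 1
        elif kind == "up":
            n = n * a
        elif kind == "crop":
            n = n - 2 * a
            if n < 1:
                return None
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return n


def find_input_size(layers: List[Layer], max_size: int = 512) -> Optional[int]:
    """Smallest input size whose final output has size 1, or None if none exists."""
    for n in range(1, max_size + 1):
        if output_size(n, layers) == 1:
            return n
    return None


def find_size_with_extra_filter(
    layers: List[Layer], max_size: int = 512, max_kernel: int = 16
) -> Optional[Tuple[int, int]]:
    """Append a linear filter of size k >= 2 and search (input size, k) exhaustively."""
    for k in range(2, max_kernel + 1):
        n = find_input_size(layers + [("conv", k, 1)], max_size)
        if n is not None:
            return n, k
    return None


# Example chain: reduce by 2, enlarge by 2, then delete the outermost pixel.
chain: List[Layer] = [("conv", 3, 1), ("conv", 2, 2), ("up", 2, 0), ("crop", 1, 0)]
size = find_input_size(chain)
if size is None:
    # No input size gives a size-1 output, so fall back to appending a filter.
    result = find_size_with_extra_filter(chain)
```

In this example chain the output size is always even, so no input size produces a single-coordinate output and the fallback with an appended filter is needed.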

In the processing of steps S1105 and S1205, for an enlarging filter, the coordinate information restoration unit 105 shifts and averages the filtered data instead of shifting and adding it. This is because the filter parameters updated in advance assume averaging, which does not raise the data level, rather than addition, which does.
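The difference between shift-and-add and shift-and-average can be illustrated with a minimal sketch. The canvas layout, the zero fill outside each shifted block, the division by the number of data, and the helper names `integrate` and `drop_border` are assumptions made for illustration; the source does not specify these details in this passage.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def integrate(
    outputs: Sequence[Grid], shifts: Sequence[Tuple[int, int]], average: bool
) -> Grid:
    """Place each filtered data block at its (shift_x, shift_y) offset and combine.

    average=False adds the shifted data; average=True divides the sum by the
    number of data blocks so the data level does not increase.
    """
    h, w = len(outputs[0]), len(outputs[0][0])
    max_sx = max(sx for sx, _ in shifts)
    max_sy = max(sy for _, sy in shifts)
    canvas = [[0.0] * (w + max_sx) for _ in range(h + max_sy)]
    for data, (sx, sy) in zip(outputs, shifts):
        for y in range(h):
            for x in range(w):
                canvas[y + sy][x + sx] += data[y][x]
    if average:
        n = len(outputs)
        canvas = [[v / n for v in row] for row in canvas]
    return canvas


def drop_border(grid: Grid, border: int = 1) -> Grid:
    """Delete the outermost `border` pixels, where not every shifted block contributes."""
    return [row[border:len(row) - border] for row in grid[border:len(grid) - border]]


# Four constant 3x3 blocks with the four shift amounts of a factor-2 filter.
blocks = [[[1.0] * 3 for _ in range(3)] for _ in range(4)]
shift_amounts = [(0, 0), (1, 0), (0, 1), (1, 1)]
added = drop_border(integrate(blocks, shift_amounts, average=False))
averaged = drop_border(integrate(blocks, shift_amounts, average=True))
```

With constant input of level 1, the added result has level 4 in the retained region while the averaged result stays at level 1, which is why averaging suits parameters that were updated without this integration step.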

By performing the above processing, the information processing device 1 can use a plurality of filter parameters that have been updated in advance.

According to the first and second embodiments, in the case of a reducing filter, the coordinate information storage unit 103 generates first to fourth shifted data and stores four kinds of shift amounts SHIFTx and SHIFTy in order to prevent loss of coordinate information. When the filter processing unit 104 performs filter processing that enlarges input data, the data interpolation accompanying the enlargement introduces periodicity into the processing, so the first to fourth filtered data are periodic. After the filter processing that enlarges the input data, the coordinate information restoration unit 105 cancels this periodicity by integrating the first to fourth filtered data into integrated data on the basis of the four stored kinds of shift amounts SHIFTx and SHIFTy.
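Why four shift amounts prevent loss of coordinate information can be shown with a small sketch. It assumes a reduction factor of 2 modeled as plain subsampling with stride 2, and it assumes that a shift of (SHIFTx, SHIFTy) moves the sampling origin by that many pixels; the function names are hypothetical and the actual reducing filter of the embodiments may differ.

```python
from itertools import product
from typing import List, Set, Tuple


def make_shifts(reduction: int) -> List[Tuple[int, int]]:
    """All (SHIFTx, SHIFTy) pairs for a given reduction factor; 4 pairs for factor 2."""
    return [(sx, sy) for sy, sx in product(range(reduction), repeat=2)]


def sampled_coordinates(
    height: int, width: int, reduction: int, shift: Tuple[int, int]
) -> Set[Tuple[int, int]]:
    """Input coordinates (x, y) that survive stride-`reduction` sampling after the shift."""
    sx, sy = shift
    return {
        (x, y)
        for y in range(sy, height, reduction)
        for x in range(sx, width, reduction)
    }


shifts = make_shifts(2)
per_shift = [sampled_coordinates(4, 4, 2, s) for s in shifts]
covered = set().union(*per_shift)
all_coords = {(x, y) for y in range(4) for x in range(4)}
```

Each single shift keeps only a quarter of the input coordinates of a 4x4 input, but the four shifts together cover every coordinate, so the stored shift amounts are sufficient to place each filtered result back at its original position.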

Through the above processing, the information processing device 1 can, in the learning phase, learn from small two-dimensional images 11 even when enlarging filter processing is performed, and can, in the estimation phase, estimate the center position of the target object with high accuracy. The two-dimensional image 11 used in the learning phase may be of any size equal to or larger than the minimum size that can be input to the filter processing of the filter processing unit 104.

In the learning phase, when the output data size of the last filter processing of the filter processing unit 104 is to be adjusted, the input data size can be changed or a filter can be added to the filter processing unit 104. Likewise, in the learning phase, when the filter processing unit 104 includes filter processing that enlarges the output data size relative to the input data size, the input data size can be changed or a filter can be added to the filter processing unit 104.

(Other embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Note that the above embodiments are merely concrete examples of implementing the present invention, and the technical scope of the present invention must not be interpreted restrictively on their basis. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

1 information processing device, 103 coordinate information storage unit, 104 filter processing unit, 105 coordinate information restoration unit

Claims

1. An information processing device comprising:
filter processing means for performing filter processing on a two-dimensional image;
coordinate information storage means for storing, when the filter processing means performs filter processing that reduces an output data size relative to an input data size, coordinate information of input data of the reducing filter processing; and
coordinate information restoration means for restoring, when the filter processing means performs filter processing that enlarges the output data size relative to the input data size, coordinate information of output data of the filter processing of the filter processing means by using the coordinate information stored by the coordinate information storage means,
wherein, when the filter processing means performs the filter processing that reduces the output data size relative to the input data size, the coordinate information storage means outputs, to the filter processing means, a plurality of shifted data obtained by shifting the input data of the filter processing of the filter processing means in each coordinate axis direction of the two-dimensional image, and
wherein the filter processing means performs the reducing filter processing on the plurality of shifted data and outputs a plurality of output data.

2. The information processing device according to claim 1, wherein, when the filter processing means performs the filter processing that reduces the output data size relative to the input data size, the coordinate information storage means outputs, to the filter processing means, the plurality of shifted data obtained by shifting the input data of the filter processing of the filter processing means in accordance with the reduction ratio of the reduction.

3. The information processing device according to claim 2, wherein the coordinate information storage means stores, as the coordinate information, a plurality of shift amounts corresponding to the plurality of shifted data.

4. The information processing device according to claim 2 or 3, wherein, when the filter processing means performs the filter processing that enlarges the output data size relative to the input data size, the coordinate information restoration means integrates a plurality of shifted data obtained by respectively shifting a plurality of output data of the filter processing of the filter processing means in accordance with the enlargement ratio of the enlargement.

5. The information processing device according to claim 4, wherein the coordinate information restoration means performs the integration by addition, averaging, or multiplication.

6. The information processing device according to claim 4 or 5, wherein the coordinate information restoration means deletes outer peripheral data from the data obtained after the integration.

7. The information processing device according to any one of claims 1 to 6, wherein the filter processing means performs the enlarging filter processing after performing the reducing filter processing.

8. The information processing device according to any one of claims 1 to 7, wherein the filter processing means performs a plurality of filter processes.

9. The information processing device according to claim 8, wherein, in a learning phase, filter parameters used for the filter processing of the filter processing means are updated based on output data of the last filter processing of the filter processing means.

10. The information processing device according to claim 9, wherein, in the learning phase, when the output data size of the last filter processing of the filter processing means is adjusted, the input data size is changed or a filter is added to the filter processing means.

11. The information processing device according to claim 10, wherein, in the learning phase, when the filter processing means includes filter processing that enlarges the output data size relative to the input data size, the input data size is changed or a filter is added to the filter processing means.

12. The information processing device according to any one of claims 8 to 11, further comprising estimation means for estimating, in an estimation phase, a position of an object in the two-dimensional image based on output data of the last filter processing of the filter processing means.

13. An information processing method for an information processing device having filter processing means for performing filter processing on a two-dimensional image, the method comprising:
a coordinate information storage step of storing, when the filter processing means performs filter processing that reduces an output data size relative to an input data size, coordinate information of input data of the reducing filter processing; and
a coordinate information restoration step of restoring, when the filter processing means performs filter processing that enlarges the output data size relative to the input data size, coordinate information of output data of the filter processing of the filter processing means by using the coordinate information stored in the coordinate information storage step,
wherein, in the coordinate information storage step, when the filter processing means performs the filter processing that reduces the output data size relative to the input data size, a plurality of shifted data obtained by shifting the input data of the filter processing of the filter processing means in each coordinate axis direction of the two-dimensional image are output to the filter processing means, and
wherein the filter processing means performs the reducing filter processing on the plurality of shifted data and outputs a plurality of output data.

14. A program for causing a computer to function as each means of the information processing device according to any one of claims 1 to 12.