JP6427038B2

JP6427038B2 - Camera parameter estimation apparatus and camera parameter estimation program

Info

Publication number: JP6427038B2
Application number: JP2015038511A
Authority: JP
Inventors: 秀樹三ツ峰; 英彦大久保; 寛史盛岡
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2015-02-27
Filing date: 2015-02-27
Publication date: 2018-11-21
Anticipated expiration: 2035-02-27
Also published as: JP2016163130A

Description

The present invention relates to a camera parameter estimation device and a camera parameter estimation program that estimate the camera parameters required for, among other things, compositing live-action video with CG (computer graphics) rendered images.

Camera parameters describe the position and orientation of a camera and the state of its lens. In video production, these parameters make it possible to composite live-action camera footage with CG rendered images so that the result looks natural.

Conventionally, one known method of estimating camera parameters is to attach rotary encoders or similar sensors to each degree of freedom, such as a camera tripod or the joints of a crane, and measure the amount of rotation (see Patent Document 1). Another known approach estimates camera parameters by analyzing the captured video, for example by bundle adjustment (see Patent Document 2).

In the bundle adjustment method of Patent Document 2, the positions and feature vectors of feature points in the captured video are extracted by video analysis, and the feature point positions are tracked through the video by exploiting the similarity of the feature vectors. The camera parameters are then estimated from the tracking results by an optimization method.
Of these two methods, the one based on video analysis has the advantages that it requires no measuring equipment and that it can also be applied to footage shot in the past.

Patent Document 1: JP 2007-142993 A
Patent Document 2: JP 2009-237845 A

However, the conventional video analysis method cannot determine whether a video was shot with a camera mounted on a tripod or shot handheld with a handy camera or the like. Tripod footage and handheld footage must be analyzed with different algorithms, so an incorrect determination can cause the estimation process to fail.
In addition, analyzing video shot while the camera position changes, such as with a handy camera, is computationally expensive. Applying that analysis to video shot on a tripod wastes computational resources.

Furthermore, although processing with a high computational cost is involved, an initial value needed for camera parameter estimation (such as whether the video was shot on a tripod) normally has to be set. It has therefore been difficult to eliminate manual setting by a video administrator or the like, that is, to automate the process. In practice, moreover, camera parameter estimation starts only when work such as VFX (visual effects) is actually carried out on the required footage, which lengthens the working time.

The present invention has been made in view of the above problems. Its object is to provide a camera parameter estimation device and a camera parameter estimation program that can estimate camera parameters efficiently without an initial setting indicating whether the video was shot on a tripod.

To solve the above problems, the camera parameter estimation device according to the first invention of the present application estimates the camera parameters of captured video shot by a camera, and comprises video acquisition means, global motion estimation means, edge extraction means, tripod-use determination means, tripod camera parameter calculation means, handheld camera parameter calculation means, and camera parameter output means.

With this configuration, the camera parameter estimation device uses the video acquisition means to acquire the captured video from the storage means in which it is stored. The global motion estimation means extracts feature points from each of the frame images that make up the acquired video, and searches for corresponding points, that is, pairs in which the same feature point is matched between the feature points extracted in a reference frame image and those extracted in another frame image. From these it estimates the global motion, which indicates the amount of movement of the entire screen between the reference frame image and the other frame image.
The camera parameter estimation device can thus acquire the captured video from the storage means, extract feature points from its frame images, and estimate the global motion between the reference frame image and the other frame images.

The camera parameter estimation device also uses the edge extraction means to extract edges from each frame image. For those extracted edges that are corresponding points between frame images, the tripod-use determination means calculates the similarity of the images in a predetermined region around the edge. If the calculated similarity exceeds a predetermined first threshold, it determines that the captured video was shot on a tripod. If the calculated similarity is less than or equal to the first threshold, it determines that the captured video was not shot on a tripod.
The camera parameter estimation device can thus extract edges from each frame image, calculate the similarity of the images in the predetermined region around each edge, and determine that the video is tripod footage when the similarity exceeds the first threshold and that it is not tripod footage when the similarity is less than or equal to the first threshold.
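The threshold test described above can be illustrated with a minimal sketch. The description does not specify the similarity measure or how per-edge similarities are combined, so the normalized cross-correlation and the averaging used here are assumptions, and the names `ncc` and `is_tripod_shot` are hypothetical.

```python
from math import sqrt


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized, flattened patches.

    Assumption: the source only says "similarity"; NCC is one common choice.
    """
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [v - mean_a for v in patch_a]
    db = [v - mean_b for v in patch_b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # a flat patch carries no structure to compare
    return sum(x * y for x, y in zip(da, db)) / denom


def is_tripod_shot(patch_pairs, first_threshold):
    """Return True when the mean patch similarity exceeds the first threshold.

    patch_pairs holds (reference_patch, other_patch) tuples taken around
    corresponding edge points. Averaging is an assumption of this sketch.
    """
    if not patch_pairs:
        raise ValueError("no corresponding edge patches")
    mean_sim = sum(ncc(a, b) for a, b in patch_pairs) / len(patch_pairs)
    return mean_sim > first_threshold
```

A similarity equal to the threshold falls on the "not tripod" side, matching the "less than or equal to" wording of the description.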

When the captured video is determined to be tripod footage, the tripod camera parameter calculation means calculates the camera parameters using the movement amount indicated by the estimated global motion. When the captured video is determined not to be tripod footage, the handheld camera parameter calculation means calculates the camera parameters by analyzing the corresponding feature points contained in the frame images. The camera parameter output means then outputs the camera parameters calculated by either the tripod camera parameter calculation means or the handheld camera parameter calculation means to the storage means.
Once it has been determined whether the captured video is tripod footage, the camera parameter estimation device can thus calculate the camera parameters with the calculation means best suited to that determination and output them to the storage means.

In this way, the camera parameter estimation device according to the first invention of the present application can execute camera parameter estimation without an initial value indicating whether the video was shot on a tripod. Because it determines whether the video is tripod footage, it can run the camera parameter calculation suited to tripod footage and so avoid an unnecessary increase in computational cost. Furthermore, the information used for the tripod determination can be reused to calculate the camera parameters of the captured video. Overall, the camera parameter estimation device of the present invention therefore enables efficient camera parameter estimation.

The camera parameter estimation device according to the second invention of the present application estimates the camera parameters of captured video shot by a camera, and comprises video acquisition means, global motion estimation means, edge extraction means, neighboring-edge fitting means, lens distortion coefficient calculation means, non-rigid region determination means, tripod-use determination means, tripod camera parameter calculation means, handheld camera parameter calculation means, and camera parameter output means.

With this configuration, the camera parameter estimation device uses the video acquisition means to acquire the captured video from the storage means in which it is stored. The global motion estimation means extracts feature points from each of the frame images that make up the acquired video, and searches for corresponding points, that is, pairs in which the same feature point is matched between the feature points extracted in a reference frame image and those extracted in another frame image. From these it estimates a first global motion, which indicates the amount of movement of the entire screen between the reference frame image and the other frame image.
The camera parameter estimation device can thus acquire the captured video from the storage means, extract feature points from its frame images, and estimate the first global motion between the reference frame image and the other frame images.

The camera parameter estimation device also uses the edge extraction means to extract edges from each frame image. For each extracted edge, the neighboring-edge fitting means obtains the normal direction from information on the adjacent edges, and determines the position of the nearest edge of the other frame image on the normal line set in that direction. It then calculates a second global motion using the edge movement amounts obtained from the edge positions in the reference frame image and the edge positions determined in the other frame image.
The camera parameter estimation device can thus extract edges from each frame image and calculate the second global motion from the edge movement amounts between the reference frame image and the other frame image.
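As a rough sketch of the neighboring-edge search, the code below walks along the normal of an edge point until it meets an edge pixel of the other frame, then fits a motion to the resulting normal distances. The description does not give the motion model or the fitting method, so the translation-only model, the least-squares fit, and the names `signed_distance_along_normal` and `translation_from_normal_distances` are all assumptions made for illustration.

```python
def signed_distance_along_normal(point, normal, other_edges, max_steps):
    """Search along +/- normal for the nearest edge pixel of the other frame.

    point: (x, y) edge pixel in the reference frame.
    normal: unit normal (nx, ny) at that pixel.
    other_edges: set of (x, y) edge pixels in the other frame.
    Returns the signed step count along the normal, or None if nothing
    is found within max_steps.
    """
    x0, y0 = point
    nx, ny = normal
    for step in range(max_steps + 1):
        for sign in (1, -1):
            x = round(x0 + sign * step * nx)
            y = round(y0 + sign * step * ny)
            if (x, y) in other_edges:
                return sign * step
    return None


def translation_from_normal_distances(samples):
    """Least-squares translation t such that n . t is close to d for all samples.

    samples: iterable of ((nx, ny), d). An edge only constrains motion along
    its normal, so edges with differing normals are needed; returns None when
    the 2x2 system is degenerate.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (nx, ny), d in samples:
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * d
        b2 += ny * d
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        return None
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

A fuller model would also account for rotation, which the description mentions when discussing the positional relationship of edges between frames.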

The camera parameter estimation device also uses the lens distortion coefficient calculation means to calculate a lens distortion coefficient. After using the movement amount indicated by the second global motion to remove erroneous corresponding points from the first global motion, it runs an optimization that drives toward zero the distance between two sets of edge positions: the positions of edges detected in the reference frame image, corrected for lens distortion, and the positions of edges detected in the other frame image, corrected for lens distortion and for the movement amount of the second global motion. In addition, the non-rigid region determination means divides each frame image into blocks of a predetermined size, calculates the similarity between each block of the reference frame image and the corresponding block of the other frame image, and determines that the block of the other frame image is a non-rigid region when the calculated similarity is less than or equal to a predetermined second threshold.
The camera parameter estimation device can thus calculate the lens distortion coefficient, and can determine that the blocks of a frame image whose similarity is less than or equal to the second threshold are non-rigid regions.
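The block-wise non-rigid test can be sketched as follows. The sketch assumes that the two frames are 8-bit grayscale images stored as lists of rows and that they have already been aligned by the global motion. The similarity measure (one minus the normalized mean absolute difference) and the names `block_similarity` and `non_rigid_blocks` are assumptions, since the description only refers to a "similarity" and a second threshold.

```python
def block_similarity(block_a, block_b):
    """Similarity in [0, 1]: 1 minus the mean absolute difference of 8-bit pixels."""
    total = sum(abs(a - b) for a, b in zip(block_a, block_b))
    return 1.0 - total / (255.0 * len(block_a))


def non_rigid_blocks(ref, other, block, second_threshold):
    """Return the set of (block_row, block_col) indices judged non-rigid.

    ref, other: aligned grayscale frames of identical size.
    block: side length of the square blocks in pixels.
    A block is flagged when its similarity is <= second_threshold.
    """
    height, width = len(ref), len(ref[0])
    flagged = set()
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            a = [ref[y][x] for y in range(by, by + block)
                 for x in range(bx, bx + block)]
            b = [other[y][x] for y in range(by, by + block)
                 for x in range(bx, bx + block)]
            if block_similarity(a, b) <= second_threshold:
                flagged.add((by // block, bx // block))
    return flagged
```

Feature points that fall inside flagged blocks would then be excluded from later processing, as the description of the tripod-use determination means states.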

The tripod-use determination means corrects each frame image based on the lens distortion coefficient, excludes the feature points contained in blocks of non-rigid regions, and updates the second global motion to calculate a third global motion. It also calculates the similarity of the images in a predetermined region around each edge that is a corresponding point between frame images. If the calculated similarity exceeds a predetermined first threshold, it determines that the captured video was shot on a tripod. If the calculated similarity is less than or equal to the first threshold, it determines that the captured video was not shot on a tripod.
The camera parameter estimation device can thus improve accuracy by correcting lens distortion based on the lens distortion coefficient and removing blocks determined to be non-rigid regions from processing, and then determine whether the captured video is tripod footage.

When the captured video is determined to be tripod footage, the tripod camera parameter calculation means calculates the camera parameters using the movement amount indicated by the third global motion. When the captured video is determined not to be tripod footage, the handheld camera parameter calculation means calculates the camera parameters by analyzing the corresponding feature points contained in the frame images. The camera parameter output means then outputs the camera parameters calculated by either the tripod camera parameter calculation means or the handheld camera parameter calculation means to the storage means.
Once it has been determined whether the captured video is tripod footage, the camera parameter estimation device can thus calculate the camera parameters with the calculation means best suited to that determination and output them to the storage means.

In this way, the camera parameter estimation device according to the second invention of the present application can execute camera parameter estimation without an initial value indicating whether the video was shot on a tripod. By including the neighboring-edge fitting means, the lens distortion coefficient calculation means, and the non-rigid region determination means, it can determine more accurately whether the video is tripod footage. Based on that determination it can run the camera parameter calculation suited to tripod footage and so avoid an unnecessary increase in computational cost. Furthermore, the information used for the tripod determination can be reused to calculate the camera parameters of the captured video. Overall, the camera parameter estimation device of the present invention therefore enables efficient camera parameter estimation.

The camera parameter estimation device according to the first invention and the camera parameter estimation device according to the second invention can each be operated by a camera parameter estimation program that causes a computer to function as the respective means described above.

According to the present invention, camera parameters can be estimated efficiently without an initial setting indicating whether the video was shot on a tripod.

Brief description of the drawings:
A diagram showing the overall configuration of a camera parameter estimation system including the camera parameter estimation device according to the present embodiment.
A diagram for explaining the neighboring-edge fitting process performed by the neighboring-edge fitting means according to the present embodiment.
A diagram showing the positional relationship of edges between frames, taking into account movement and rotation due to global motion.
A diagram showing the positional relationship of edges between frames, taking into account movement and rotation due to global motion.
A diagram for explaining the method of evaluating the amount of occlusion performed by the tripod-use determination means of the camera parameter estimation device according to the present embodiment.
A flowchart showing the camera parameter estimation process (first processing example) performed by the camera parameter estimation device according to the present embodiment.
A flowchart showing the camera parameter estimation process (second processing example) performed by the camera parameter estimation device according to the present embodiment.
A flowchart showing the camera parameter estimation process (third processing example) performed by the camera parameter estimation device according to the present embodiment.
A diagram showing the overall configuration (configuration of the third processing example) of a camera parameter estimation system including the camera parameter estimation device according to the present embodiment.

Hereinafter, a mode for carrying out the present invention (hereinafter referred to as the "embodiment") is described with reference to the drawings.
First, an outline of the processing performed by the camera parameter estimation device 1 according to the present embodiment is given.

<Overview>
To estimate camera parameters efficiently, the camera parameter estimation device 1 according to the present embodiment first determines whether the captured video was shot with a camera mounted on a tripod or with something else, such as a handy camera or a crane (hereinafter referred to as "handheld or the like"). It then applies a separate estimation method to video shot on a tripod and to video shot handheld or the like. In this way, the camera parameter estimation device 1 keeps the computational cost down while enabling robust and efficient estimation of camera parameters by video analysis.

Video shot on a tripod and video shot handheld or the like differ in the amount of parallax. Suppose the subject is rigid (a still object). If the camera position moves, that is, if the video is shot handheld or the like, at least a certain amount of occlusion (regions whose visibility changes as the camera moves) appears in the captured video, depending on the camera motion and the arrangement of the subject. When a tripod is used, on the other hand, some occlusion arises from the offset between the center of rotation and the principal point of the lens, but it is slight.

The camera parameter estimation device 1 obtains the amount of this occlusion by video analysis and uses it to determine whether the video was shot on a tripod. To improve accuracy, it performs the edge fitting, lens distortion correction, and non-rigid region determination described later. Because the movement amount of the subject in the captured video is obtained during the occlusion evaluation, that amount is used to calculate the camera parameters for the tripod case. When the camera parameter estimation device 1 determines that the video was not shot on a tripod, that is, that it was shot handheld or the like, it uses a method suited to a moving camera position, such as bundle adjustment.

<Camera parameter estimation system>
Next, the camera parameter estimation system S according to the present embodiment is described.
FIG. 1 is a diagram showing the overall configuration of the camera parameter estimation system S including the camera parameter estimation device 1 according to the present embodiment.
As shown in FIG. 1, the camera parameter estimation system S includes a video archive 1000 that stores captured video input from a camera Ca, and the camera parameter estimation device 1, which is communicably connected to the video archive 1000.

The video archive 1000 is a computer equipped with storage means for storing captured video. The captured video stored in the video archive 1000 carries metadata such as the shooting date and time, camera setting information (shutter speed and lens state), angle of view, and information on the subject matter (the name of the person being shot, the shooting location). In the description of the present embodiment, however, the metadata of captured video stored in the video archive 1000 in its initial state is assumed to contain no camera parameter information.
The video archive 1000 may consist of a single computer or of several computers working together. The video archive 1000 may also be incorporated into the camera parameter estimation device 1 and provided in the storage means 30 described later. In the following description, however, the video archive 1000 and the camera parameter estimation device 1 are assumed to be externally connected, as shown in FIG. 1.
The video archive 1000 outputs, from among the stored captured videos, those to which no camera parameters have been attached to the camera parameter estimation device 1. The output is triggered, for example, each time a new captured video from the camera Ca is stored, at predetermined time intervals, or when output instruction information for the stored video (instruction information for attaching camera parameters) is received from outside.

«Camera parameter estimation device»
Next, the functional configuration of the camera parameter estimation device 1 is described with reference to FIG. 1.
The camera parameter estimation device 1 acquires captured video with metadata from the video archive 1000 and determines by video analysis whether the video was shot on a tripod. It then estimates the camera parameters with the method suited to each case: video determined to have been shot on a tripod, and video determined to have been shot handheld or the like. The camera parameter estimation device 1 adds the estimated camera parameters to the metadata of the captured video and outputs the result to the video archive 1000.
As shown in FIG. 1, the camera parameter estimation device 1 includes control means 10, input/output means 20, and storage means 30.

The input/output means 20 inputs and outputs information to and from the video archive 1000 and other devices. It consists of a communication interface that sends and receives information over a communication line connected to a network, a dedicated line, or the like, and an input/output interface that handles input and output with input means such as a keyboard and output means such as a monitor (not shown).

The control means 10 governs all processing performed by the camera parameter estimation device 1. It includes video acquisition means 101, global motion estimation means 102, edge extraction means 103, neighboring-edge fitting means 104, lens distortion coefficient calculation means 105, non-rigid region determination means 106, tripod-use determination means 107, tripod camera parameter calculation means 108, handheld camera parameter calculation means 109, and camera parameter output means 110. The neighboring-edge fitting means 104, the lens distortion coefficient calculation means 105, and the non-rigid region determination means 106 serve to further improve the accuracy of the tripod determination and of the camera parameter estimation, so the control means 10 may omit any or all of them (details are given later). The control means 10 is realized, for example, by a CPU (Central Processing Unit, not shown) loading a program (the camera parameter estimation program) stored in the storage means 30 into RAM (Random Access Memory, not shown) and executing it.

The video acquisition means 101 acquires captured video from the video archive 1000 via the input/output means 20 and stores it in the video storage means 300 within the storage means 30. As noted above, the captured video may be received from an externally provided video archive 1000 or, when the video archive 1000 resides in the storage means 30, read from there.
The captured video acquired from the video archive 1000 carries metadata. This metadata includes at least information identifying each shot section, that is, a sequence of frames (frame images) captured continuously, which the tripod use determination means 107 uses, and the lens zoom amount, which the tripod camera parameter calculation means 108 uses. Details are given later.

The global motion estimation means 102 extracts feature points from the captured video and searches for corresponding points between frames to compute an estimate of the global motion, which represents the movement (displacement) of the entire screen.
Specifically, it extracts the captured video frame by frame as still images, computes feature points using, for example, SURF (Speeded-Up Robust Features), searches for corresponding points between frames, and removes false correspondences. Here, the correspondence search is run against the next frame in the time series. To remove false correspondences, the search is run in both directions (forward and backward in time), and a match is judged false when the two directions do not arrive at the same feature point position.
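The bidirectional check can be sketched as a mutual nearest-neighbor test. The following is a minimal illustration, assuming descriptors are plain lists of floats compared by brute-force Euclidean distance; the function names are hypothetical, and a real system would obtain the descriptors from a SURF implementation.

```python
import math

def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query` (Euclidean distance)."""
    return min(range(len(candidates)),
               key=lambda j: math.dist(query, candidates[j]))

def cross_check_matches(desc_a, desc_b):
    """Keep a pair (i, j) only if the forward and backward searches agree."""
    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)                # forward search: A frame -> B frame
        if nearest(desc_b[j], desc_a) == i:   # backward search: B frame -> A frame
            matches.append((i, j))
    return matches
```

A pair is discarded whenever the backward search from the B frame lands on a different A-frame feature point, which is the rejection rule described above.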

Next, the global motion estimation means 102 derives a translation amount and a rotation amount from the displacements of the corresponding feature points. The translation amount is the mean of the feature point movement vectors between corresponding feature points, that is, the displacement of the centroid of the feature points. The rotation amount θ is obtained by first compensating for the translation offset and then averaging the angles between corresponding feature points, measured with the image center as the vertex. The translation and rotation amounts calculated here constitute the global motion estimate, referred to below as "GM1" (first global motion). GM1 can be regarded as a coarse estimate compared with the more accurate global motion estimates "GM2" and "GM3" described later.
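As a rough sketch of this step, the function below averages the movement vectors for the translation and averages the per-point angle change about the image center for the rotation. It assumes matched points are given as (x, y) tuples and that no point coincides with the image center; the function name and argument layout are illustrative only.

```python
import math

def estimate_global_motion(pts_a, pts_b, center):
    """Return (tx, ty, theta) from matched points pts_a[i] <-> pts_b[i]."""
    n = len(pts_a)
    # Translation: mean of the feature point movement vectors.
    tx = sum(b[0] - a[0] for a, b in zip(pts_a, pts_b)) / n
    ty = sum(b[1] - a[1] for a, b in zip(pts_a, pts_b)) / n
    cx, cy = center
    angles = []
    for a, b in zip(pts_a, pts_b):
        # Remove the translation offset, then measure the angle at the image center.
        ang_a = math.atan2(a[1] - cy, a[0] - cx)
        ang_b = math.atan2(b[1] - ty - cy, b[0] - tx - cx)
        d = ang_b - ang_a
        angles.append(math.atan2(math.sin(d), math.cos(d)))  # wrap into (-pi, pi]
    theta = sum(angles) / n
    return tx, ty, theta
```

The mean translation is unbiased only when the matched points are spread roughly evenly around the image center, which is one reason to treat GM1 as a coarse estimate.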

In the following description, the frame that serves as the reference for calculating the global motion estimate and related values is called the "A frame", and the frame whose displacement is calculated is called the "B frame". The position of the n-th feature point in the A frame (hereinafter also called the "feature point position" or "edge position") is written (xa_n, ya_n). The point obtained by rotating the n-th feature point position (xb_n, yb_n) of the B frame by θ about the image center and then adding the translation amount is written (xb'_n, yb'_n).

When the global motion estimation means 102 repeats the calculation of the global motion estimate (translation and rotation amounts), that is, from the second pass onward, it corrects the image using the lens distortion coefficients most recently calculated by the lens distortion coefficient calculation means 105 and excludes feature points inside blocks that the non-rigid region determination means 106 has judged to be non-rigid regions before calculating the estimate. Details are given later.

This embodiment is described using SURF for feature point extraction, but the feature point extraction method, the descriptor, the method of selecting the target frame for the correspondence search, and the false-correspondence removal method are not limited to those described. For example, SIFT (Scale-Invariant Feature Transform) or KAZE may be used for the feature points, and the target frames may be the preceding and following frames or every pair of frames (exhaustive matching). Restricting the correspondence search range can also serve as a false-correspondence removal method. The translation and rotation amounts may also be calculated by estimating a homography (projective transformation).

For details, see Reference 1 for SURF, Reference 2 for SIFT, Reference 3 for KAZE, and Reference 4 for homography.
(Reference 1) H. Bay, A. Ess, T. Tuytelaars, L. Van Gool: "Speeded-Up Robust Features (SURF)," Computer Vision and Image Understanding, Vol. 110, No. 3, pp. 346-359, 2008.
(Reference 2) Fujiyoshi et al.: "Gradient-based feature extraction: SIFT and HOG," IPSJ SIG Technical Report CVIM 160, pp. 211-224, 2007.
(Reference 3) P. F. Alcantarilla, A. Bartoli, A. J. Davison: "KAZE Features," European Conference on Computer Vision (ECCV), Florence, Italy, October 2012.
http://www.robesafe.com/personal/pablo.alcantarilla/papers/Alcantarilla12eccv.pdf
(Reference 4) Japanese Patent Application Laid-Open No. 2014-134856

The edge extraction means 103 extracts edges from each frame of the captured video to be processed. For example, it creates a luminance gradient image with a Sobel filter and then applies non-maximum suppression to isolate the edges.
The edge extraction is not limited to the Sobel-based method; a Canny filter may be used to create the luminance gradient image. Edges may also be isolated by taking the absolute value of the luminance and thresholding it, or by thinning (narrowing each line in a binarized image until only the one-pixel-wide center remains). Edge extraction using Sobel and Canny filters is described in detail in, for example, Japanese Patent Application Laid-Open No. 2006-170995.
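The gradient step can be illustrated with a plain 3 × 3 Sobel operator. This is a minimal sketch that assumes a grayscale image stored as a list of rows and leaves the border pixels at zero; non-maximum suppression is not shown, and the function name is hypothetical.

```python
def sobel_magnitude(img):
    """Gradient magnitude of a 2-D grayscale image (list of rows); borders stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal derivative: right column minus left column, weights 1-2-1.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical derivative: bottom row minus top row, weights 1-2-1.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out
```

Pixels with a large magnitude are edge candidates; the suppression or thinning step described above then reduces each edge to a one-pixel-wide line.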

The neighboring edge fitting means 104 applies the edge fitting process described below to the edge image extracted by the edge extraction means 103 and thereby updates the global motion estimate calculated by the global motion estimation means 102. This process further improves the accuracy of the global motion estimate, the tripod use determination, and related processing.

FIG. 2 illustrates the neighboring edge fitting process performed by the neighboring edge fitting means 104 of this embodiment.
FIG. 2(a) shows the captured image of the reference A frame, and FIG. 2(b) shows the edge image of the A frame produced by the edge extraction means 103.
The neighboring edge fitting means 104 performs the edge fitting process as follows. First, it line-scans the A-frame edge image shown in FIG. 2(b) to find edges in the frame. For each edge found, as shown in FIG. 2(c), it determines the normal direction from the adjacent edge information and finds the relative position of the nearest B-frame edge along that normal. In FIG. 2(c), A-frame edges are drawn as solid lines and B-frame edges as broken lines. The B-frame search starts from a position offset by "GM1" as calculated by the global motion estimation means 102.

The neighboring edge fitting means 104 performs this process for every edge in the A frame. Using the relative B-frame position found for each A-frame edge, it then derives the translation and rotation amounts from the edge displacements in the same way as the global motion estimation means 102. This yields a global motion estimate whose translation and rotation amounts are more accurate than those of "GM1". The global motion estimate given by the translation and rotation amounts calculated by the neighboring edge fitting means 104 is referred to below as "GM2" (second global motion).
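The per-edge search along the normal might look like the sketch below. It assumes the B-frame edge image is a binary grid (list of rows), that (x, y) is the A-frame edge position already offset by GM1, and that (nx, ny) is a unit normal; the search range `max_steps` and the function name are hypothetical.

```python
def nearest_edge_along_normal(edge_b, x, y, nx, ny, max_steps=10):
    """Walk outward from (x, y) along +/- the unit normal (nx, ny).

    Returns the signed offset (in steps) of the first B-frame edge pixel
    encountered, or None if no edge lies within max_steps.
    """
    h, w = len(edge_b), len(edge_b[0])
    for step in range(max_steps + 1):
        for sign in (1, -1):
            px = round(x + sign * step * nx)
            py = round(y + sign * step * ny)
            if 0 <= py < h and 0 <= px < w and edge_b[py][px]:
                return sign * step
    return None
```

Collecting these signed offsets for every A-frame edge gives the displacements from which the refined translation and rotation (GM2) are derived.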

Returning to FIG. 1, the lens distortion coefficient calculation means 105 calculates the lens distortion coefficients based on the most recently calculated global motion estimate.
Like the neighboring edge fitting, this calculation is performed to further improve the accuracy of the global motion estimate, the tripod use determination, and related processing.

As preprocessing, the lens distortion coefficient calculation means 105 first takes the most recently calculated global motion estimate (that is, "GM2" when the neighboring edge fitting means 104 has calculated it) as a reference and removes false correspondences remaining among the corresponding points used for "GM1" by the global motion estimation means 102. Specifically, using the Euclidean distance as the criterion, it excludes from evaluation any edge point whose correspondence spans a distance greater than the displacement given by "GM2".
The lens distortion coefficient calculation means 105 then executes the lens distortion coefficient calculation described below.

Here, lens distortion is expressed by the following equation (1).

Equation (1) maps a two-dimensional coordinate (x', y') free of lens distortion to the distorted two-dimensional coordinate (x", y"). Here, κ denotes the radial distortion coefficients, p the tangential (circumferential) distortion coefficients, and r the distance from the image center.
In this embodiment, the lens distortion coefficient calculation means 105 determines only κ_1 and κ_2; κ_3, p_1, and p_2 are omitted, giving an approximation. The following equations (2) and (3) can therefore be derived from this approximation of equation (1).
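Equations (1) to (3) are not reproduced in this text. Assuming equation (1) is the widely used radial-tangential distortion model, which matches the symbols described above, it would take the following form; the arrangement of the tangential terms and the second line (the approximation with κ_3, p_1, and p_2 dropped) are assumptions, not the patent's own typesetting.

```latex
\begin{aligned}
x'' &= x'\,(1 + \kappa_1 r^2 + \kappa_2 r^4 + \kappa_3 r^6) + 2 p_1 x' y' + p_2\,(r^2 + 2 x'^2),\\
y'' &= y'\,(1 + \kappa_1 r^2 + \kappa_2 r^4 + \kappa_3 r^6) + p_1\,(r^2 + 2 y'^2) + 2 p_2 x' y',\\[4pt]
x'' &\approx x'\,(1 + \kappa_1 r^2 + \kappa_2 r^4), \qquad
y'' \approx y'\,(1 + \kappa_1 r^2 + \kappa_2 r^4).
\end{aligned}
```

Under this assumption, the undistorted coordinates follow from the measured ones by dividing by the radial factor, which is presumably what equations (2) and (3) express.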

Here, x" and y" are known from the edge positions (or, when the iterative processing described later has been executed, from the positions that take GM3 into account). r is therefore also known, as the distance from the respective image center, so everything except κ_1 and κ_2 is known. In practice, however, the equations are difficult to solve directly because, among other reasons, the data contain noise. In this embodiment, the lens distortion coefficient calculation means 105 therefore calculates the lens distortion coefficients κ_1 and κ_2 by optimization using the Levenberg-Marquardt method.

This optimization is described below with reference to FIGS. 3 and 4.
FIGS. 3 and 4 show the positional relationship of edges between frames, taking into account the translation and rotation due to global motion.

The edge positions in FIG. 3 are as follows.
(xa_n", ya_n") is the edge position detected in the A frame.
(xb_n", yb_n") is the edge position detected in the B frame.
(xa_n', ya_n') is the A-frame edge position after distortion correction.
(xb_n', yb_n') is the B-frame edge position after distortion correction.
ra_n and rb_n are the distances from the image center to the edge in the A frame and the B frame, respectively.
All of these points are expressed in the two-dimensional image coordinate system of their own frame, that is, with the image center as the origin.

Here, the n-th edge position of the A frame after lens distortion correction, (Pa1_n, Pa2_n), and the n-th edge position of the B frame after correction for both lens distortion and global motion, (Pb1_n, Pb2_n), are given by the following equations (4) to (7), which follow from equations (2) and (3) above.

The n-th edge positions obtained from the A frame and the B frame originally belong to the same part of the subject, so in the absence of distortion both coincide with (x_n', y_n') in FIG. 4. Consequently, the distance between (Pa1_n, Pa2_n) and (Pb1_n, Pb2_n), that is, the distance between the edges, should converge to 0. On this basis, the evaluation function for the optimization is given by equation (8), where the evaluation value C_d is the mean distance between edges and k is the number of corresponding edges in the A and B frames.
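Equation (8) is not reproduced in this text. From the description (a mean, over the k corresponding edges, of the distance between the two corrected positions), it can be written as follows; the use of the Euclidean norm is an assumption, chosen to agree with the Euclidean distance used in the preprocessing step.

```latex
C_d = \frac{1}{k} \sum_{n=1}^{k}
      \sqrt{\left(Pa1_n - Pb1_n\right)^2 + \left(Pa2_n - Pb2_n\right)^2}
```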

In FIG. 4, (xa_n", ya_n") is the A-frame edge position before distortion correction.
(xb_n''', yb_n''') is the corresponding B-frame edge position before distortion correction, obtained by applying the inverse of the global motion rotation and translation to (xb_n", yb_n").
The point (Pa1_n, Pa2_n) is (xa_n", ya_n") with the lens distortion corrected, that is, (xa_n', ya_n'); the point (Pb1_n, Pb2_n) is (xb_n''', yb_n''') with the lens distortion and global motion corrected, that is, (xb_n', yb_n').
By optimizing with the Levenberg-Marquardt method in this way, the lens distortion coefficient calculation means 105 calculates the lens distortion coefficients κ_1 and κ_2.
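The objective that the optimizer minimizes can be sketched as below. This is an illustration only: it assumes the correction divides each measured coordinate by the radial factor 1 + κ_1 r² + κ_2 r⁴ (an assumed form of equations (2) to (7)), with coordinates measured from the image center, and it shows only the cost evaluation, not the Levenberg-Marquardt iteration itself. The function names are hypothetical.

```python
import math

def undistort(x, y, k1, k2):
    """Assumed correction: divide by the radial factor (image center at origin)."""
    r2 = x * x + y * y
    f = 1.0 + k1 * r2 + k2 * r2 * r2
    return x / f, y / f

def edge_distance_cost(k1, k2, pts_a, pts_b):
    """C_d: mean distance between corrected A-frame and B-frame edge positions.

    pts_a[n] is the detected A-frame edge position; pts_b[n] is the matching
    B-frame edge position with the global motion already removed.
    """
    total = 0.0
    for (xa, ya), (xb, yb) in zip(pts_a, pts_b):
        pa = undistort(xa, ya, k1, k2)   # (Pa1_n, Pa2_n)
        pb = undistort(xb, yb, k1, k2)   # (Pb1_n, Pb2_n)
        total += math.dist(pa, pb)
    return total / len(pts_a)
```

A nonlinear least-squares solver would vary k1 and k2 to drive this value toward zero, using the per-edge coordinate differences as residuals.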

Returning to FIG. 1, the non-rigid region determination means 106 determines the regions of each frame in which a non-rigid body (for example, a person) appears. A non-rigid body moves independently of the camera motion, so including it in the processing lowers the accuracy of both the tripod use determination and the camera parameter calculation. The camera parameter estimation device 1 therefore improves accuracy by removing non-rigid regions from the processing target.

Specifically, the non-rigid region determination means 106 divides each frame into N × M blocks, compares the color histograms of the A frame and the B frame for each block, and judges blocks with low similarity to be non-rigid regions.
For example, it divides each frame into 16 × 9 blocks, compares the histograms using the color histogram intersection, and judges a block to be a non-rigid region when the value is at or below a predetermined threshold (the predetermined second threshold; for example, 0.5).
The color histogram intersection is also explained below for the tripod use determination means 107 and is described in detail in Reference 5.
(Reference 5) M. J. Swain, D. H. Ballard: "Color indexing," International Journal of Computer Vision, Vol. 7, No. 1, pp. 11-32, Nov. 1991.
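The block partition and the threshold test can be sketched as follows. The example assumes the per-block similarity values (color histogram intersection between the A and B frames) have already been computed and are passed in as a dictionary; the function names and the pixel-bound layout are hypothetical.

```python
def block_bounds(width, height, cols=16, rows=9):
    """Yield (row, col, x0, y0, x1, y1) for a cols x rows grid covering the frame."""
    for r in range(rows):
        for c in range(cols):
            yield (r, c,
                   c * width // cols, r * height // rows,
                   (c + 1) * width // cols, (r + 1) * height // rows)

def non_rigid_blocks(similarity, threshold=0.5):
    """Blocks whose similarity is at or below the threshold are judged non-rigid.

    similarity: dict mapping (row, col) -> histogram intersection in [0, 1].
    """
    return {idx for idx, d in similarity.items() if d <= threshold}
```

Feature points falling inside the returned blocks would then be skipped in later global motion passes.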

The tripod use determination means 107 uses the information obtained so far to determine whether the captured video was shot with a tripod. In doing so, it uses that information (the lens distortion coefficients, the non-rigid region information, and so on) to update the global motion estimated by the global motion estimation means 102.
Specifically, the tripod use determination means 107 corrects the lens distortion of the A frame and the B frame using the lens distortion coefficients calculated by the lens distortion coefficient calculation means 105. It then reruns the global motion estimation through the global motion estimation means 102, excluding blocks that the non-rigid region determination means 106 judged to be non-rigid regions from feature point extraction and correspondence search. Based on the global motion calculated there, it further updates the global motion by running the edge fitting process through the neighboring edge fitting means 104. The global motion estimate (translation and rotation amounts) obtained in this way is referred to as "GM3" (third global motion).

Next, the tripod use determination means 107 calculates the occlusion amount between the A frame and the B frame, based on the images corrected for lens distortion with the coefficients calculated by the lens distortion coefficient calculation means 105 and on the most recent global motion (here, "GM3").
The following first describes the tripod use determination for two frames (the A frame and the B frame) and then the determination for a shot section made up of multiple frames.

(Tripod use determination between two frames)
The tripod use determination means 107 determines whether the captured video was shot with a tripod by evaluating the occlusion amount around the edges in the edge image extracted by the edge extraction means 103. It uses the color histogram intersection to evaluate this occlusion amount.

FIG. 5 illustrates the method by which the tripod use determination means 107 of the camera parameter estimation device 1 of this embodiment evaluates the occlusion amount.
As shown in FIG. 5(a), the L × L pixels surrounding each edge location in the edge image form the target edge neighborhood block for the A frame and for the B frame. The left panel of FIG. 5(a) is a lens-distortion-corrected image showing the position of the target edge neighborhood block α_A. The right panel is a lens-distortion-corrected image showing the position of the B-frame target edge neighborhood block α_B that corresponds to the A-frame block α_A. The position of α_B has been corrected using the translation and rotation amounts of the most recent global motion (here, "GM3").

The tripod use determination means 107 then computes the color histogram intersection D, which indicates the similarity (evaluation value) between the color histograms of the target edge neighborhood blocks α_A and α_B of the A and B frames, using the following equation (9). Here, ha_i denotes the color histogram of the A frame and hb_i that of the B frame.

In this occlusion evaluation, the intensity of each RGB channel within the target edge neighborhood block is quantized into four levels (bins), giving n = 12 bins in total. A histogram is built from the colors of the pixels in the block, and the color histogram intersection is computed by the method of Swain et al. (Reference 5 above). The method of evaluating similarity around edges in this embodiment is not, however, limited to that of Swain et al.
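The quantization and the intersection can be sketched as follows. Equation (9) is not reproduced in this text, so the normalization used here (dividing by the total count of the A-frame histogram, as in Swain and Ballard's definition) is an assumption; pixel values are assumed to be 8-bit RGB tuples, and the function names are hypothetical.

```python
def color_histogram(pixels, levels=4):
    """12-bin histogram: each of R, G, B (0-255) is quantized into `levels` bins."""
    hist = [0] * (3 * levels)
    step = 256 // levels
    for rgb in pixels:
        for channel, value in enumerate(rgb):
            hist[channel * levels + min(value // step, levels - 1)] += 1
    return hist

def histogram_intersection(ha, hb):
    """Sum of bin-wise minima, normalized by the total count of `ha` (assumed)."""
    total = sum(ha)
    return sum(min(a, b) for a, b in zip(ha, hb)) / total if total else 0.0
```

Identical blocks give D = 1, and blocks with no overlapping bins give D = 0, so a larger D indicates less occlusion around the edge.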

Using the method of Swain et al., the tripod use determination means 107 builds color histograms such as those in FIG. 5(b) for the A frame (left) and the B frame (right). It then calculates the color histogram intersection D with the expression shown in FIG. 5(c), which is equivalent to equation (9).
Next, from the values of D calculated for the target edge neighborhood blocks of the edge image, the tripod use determination means 107 extracts the blocks whose D exceeds 0.5 and obtains their count and the sum of their D values. It then computes the mean of D over those blocks and determines that a tripod was used when the mean exceeds a predetermined threshold (the predetermined first threshold; for example, 0.8). Blocks whose D is 0.5 or less are excluded in order to discard extremely dissimilar image pairs, which arise, for example, when a wrong block was mistakenly taken as the corresponding block, or when a non-rigid body missed by the non-rigid determination moved on its own and made the color histograms of the A and B frames differ sharply.
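The two-frame decision can be sketched as a filter followed by a mean and a threshold test. The handling of the case where no block survives the filter is an assumption (the text does not specify it), and the function name is hypothetical.

```python
def frame_uses_tripod(d_values, reject_at=0.5, tripod_threshold=0.8):
    """Return (mean_D, is_tripod) for one A/B frame pair.

    d_values: color histogram intersection D of each edge neighborhood block.
    """
    kept = [d for d in d_values if d > reject_at]  # drop extremely dissimilar blocks
    if not kept:
        return 0.0, False  # assumption: no usable block is treated as "not tripod"
    mean_d = sum(kept) / len(kept)
    return mean_d, mean_d > tripod_threshold
```

The returned mean is also the per-frame D value that the shot-section determination reuses.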

(Tripod use determination for a shot section)
Next, the tripod use determination performed by the tripod use determination means 107 for a shot section made up of multiple frames is described. This determination may use the result of the two-frame determination described above (the mean of the color histogram intersection D). Alternatively, a predetermined number of edges may be selected from each frame's edge image in descending order of edge strength (for example, luminance gradient magnitude), the color histogram intersection D calculated for them, and its mean taken.

Using the shot section information (information on the sequence of continuously captured frames) contained in the metadata attached to the acquired video, the tripod use determination means 107 takes the number of frames in the shot section as n and determines whether the video of that shot section was shot with a tripod using equations (10) and (11) below.

For each frame in the shot section, the tripod use determination means 107 determines whether the frame's color histogram intersection D (here, the mean of D within the frame, as described above) exceeds a threshold Th_D (for example, 0.8). Specifically, based on equation (10), the frame is assigned 1 when D exceeds Th_D and 0 when D is equal to or less than Th_D.

次に、三脚利用判定手段１０７は、式（１１）に示すように、閾値Ｔｈ_Ｄを超えるフレーム数、つまり、三脚利用と推定されるフレーム数を集計し、ショット区間の総フレーム数ｎで除算して評価値「Ｄ_ｒ」を求める。そして、三脚利用判定手段１０７は、評価値「Ｄ_ｒ」が、閾値Ｔｈ_Ｖ（例えば、「０．９」）を超えた場合に、そのショット区間が三脚を利用して撮影されたものと判定する。
このようにすることにより、三脚利用判定手段１０７は、ショット区間単位で、そのショット区間内の撮影映像が三脚を利用したものか否かを判定することができる。 Next, as shown in Expression (11), the tripod usage determination unit 107 counts the number of frames that exceed the threshold Th_D, that is, the number of frames estimated to have been shot on a tripod, and divides it by the total number of frames n in the shot section to obtain the evaluation value “D_r”. When the evaluation value “D_r” exceeds the threshold Th_V (for example, “0.9”), the tripod usage determination unit 107 determines that the shot section was captured using a tripod.
By doing this, the tripod utilization determination unit 107 can determine, on a shot interval basis, whether or not the photographed image in the shot interval uses a tripod.
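
The shot-level decision described above can be sketched as follows. This is a minimal illustration based only on the prose description of Expressions (10) and (11), whose exact notation is not reproduced here; the function name `is_tripod_shot` and its parameter names are hypothetical, and the default thresholds are the example values “0.8” and “0.9” from the text.

```python
def is_tripod_shot(frame_d_values, th_d=0.8, th_v=0.9):
    """Return (d_r, is_tripod) for one shot section.

    frame_d_values: per-frame D values (average color histogram
    intersection of each frame), assumed to lie in [0, 1].
    """
    n = len(frame_d_values)
    if n == 0:
        raise ValueError("shot section has no frames")
    # Expression (10) as described: 1 if the frame's D exceeds Th_D, else 0.
    flags = [1 if d > th_d else 0 for d in frame_d_values]
    # Expression (11) as described: fraction of frames flagged as tripod frames.
    d_r = sum(flags) / n
    # The shot is judged to be tripod footage only if D_r exceeds Th_V.
    return d_r, d_r > th_v
```

Both comparisons are strict, following the wording “exceeds” in the description, so a frame whose D equals Th_D exactly is not counted.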

なお、三脚利用判定手段１０７は、上記の手法以外にも、例えば、次に示す手法で、ショット区間での三脚利用を判定してもよい。
三脚利用判定手段１０７は、該当するショット区間内で、各フレームで得られた色ヒストグラムインターセクションＤの値を用いて、その色ヒストグラムインターセクションＤについてのヒストグラムを作成する。なお、ここでは、前記した手法と同様の手法を用いて、フレーム内における色ヒストグラムインターセクションＤの平均値を求め、その「Ｄ」の平均値を、当該フレームの「Ｄ」の値とする。また、三脚利用判定手段１０７は、ヒストグラムのＢＩＮとなる「Ｄ」の値（「０」〜「１」）を、例えば１０等分し、各フレームの「Ｄ」の値が「０」以上「０．１」未満の「Ｄ」値をＢＩＮ「０」とし、「０．１」以上「０．２」未満の「Ｄ」値をＢＩＮ「０．１」とし、「０．２」以上「０．３」未満の「Ｄ」値をＢＩＮ「０．２」とし、・・・、「０．８」以上「０．９」未満の「Ｄ」値をＢＩＮ「０．８」とし、「０．９」以上「１．０」以下の「Ｄ」値をＢＩＮ「０．９」として、ヒストグラムを作成する。 In addition to the above method, the tripod use determination unit 107 may determine tripod use in the shot section by, for example, the following method.
Within the shot section in question, the tripod usage determination unit 107 creates a histogram of the color histogram intersection D using the value of D obtained for each frame. Here, as in the method described above, the average value of the color histogram intersection D within a frame is taken as the “D” value of that frame. The tripod usage determination unit 107 divides the range of “D” (“0” to “1”) into, for example, 10 equal bins and creates the histogram as follows: “D” values of “0” or more and less than “0.1” go to BIN “0”, values of “0.1” or more and less than “0.2” to BIN “0.1”, values of “0.2” or more and less than “0.3” to BIN “0.2”, and so on, with values of “0.8” or more and less than “0.9” in BIN “0.8” and values of “0.9” or more and “1.0” or less in BIN “0.9”.

そして、三脚利用判定手段１０７は、作成した各フレームの「Ｄ」値についてのヒストグラムにおいてピークとなるＢＩＮの値を、そのショット区間のオクルージョンの状態を表わすものと仮定し、そのピークとなるＢＩＮの値が、閾値Ｔｈ_{ＨｉｓｔＤ}を超えた場合に、三脚利用と判定する。この閾値Ｔｈ_{ＨｉｓｔＤ}の値は、実験結果では「０．８」とすることにより、安定的な判定を行うことが可能であった。 Then, the tripod usage determination unit 107 assumes that the value of the peak BIN in the histogram of per-frame “D” values represents the occlusion state of the shot section, and determines that a tripod was used when the value of that peak BIN exceeds the threshold Th_HistD. In experiments, setting the threshold Th_HistD to “0.8” allowed stable determination.
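
The alternative histogram-based decision can be sketched in the same way. The function name `peak_bin_is_tripod`, the tie-breaking rule, and the input validation are assumptions made for illustration; the bin layout (10 equal bins labeled by their lower edge, with “1.0” placed in the last bin) and the example threshold “0.8” follow the description above.

```python
def peak_bin_is_tripod(frame_d_values, th_hist_d=0.8, n_bins=10):
    """Return (peak_bin_value, is_tripod) from per-frame D values in [0, 1]."""
    if not frame_d_values:
        raise ValueError("shot section has no frames")
    counts = [0] * n_bins
    for d in frame_d_values:
        if not 0.0 <= d <= 1.0:
            raise ValueError("D must lie in [0, 1]")
        # D = 1.0 is folded into the last bin, matching "0.9 or more and 1.0 or less".
        idx = min(int(d * n_bins), n_bins - 1)
        counts[idx] += 1
    # On ties, the lowest bin wins (an assumption; the text does not say).
    peak_idx = max(range(n_bins), key=counts.__getitem__)
    # A bin is labeled by its lower edge, e.g. index 9 -> 0.9.
    peak_bin = peak_idx / n_bins
    return peak_bin, peak_bin > th_hist_d
```

With ten bins and a strict comparison against “0.8”, only a peak in BIN “0.9” yields a tripod decision under this reading.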

図１に戻り、三脚用カメラパラメータ算出手段１０８は、三脚利用判定手段１０７が、三脚利用と判定した撮影映像について、直近で算出されたグローバルモーション（ここでは、「ＧＭ３」）を基準に、カメラパラメータを算出する。
ここでは、三脚用カメラパラメータ算出手段１０８が、取得した撮影映像に付与されたメタデータに含まれるレンズズーム量と、算出された直近のグローバルモーション値（並進量と回転量）とを用いて、カメラの仰角（俯角）、方位角を算出し、カメラパラメータとして出力する。
なお、三脚用カメラパラメータ算出手段１０８は、仰角（俯角）、方位角の出力形式ではなく、回転行列の形式で、カメラパラメータを出力するようにしてもよい。 Returning to FIG. 1, the tripod camera parameter calculation unit 108 calculates camera parameters for the captured video that the tripod usage determination unit 107 has determined to be tripod footage, based on the most recently calculated global motion (here, “GM3”).
Here, the tripod camera parameter calculation unit 108 uses the lens zoom amount included in the metadata added to the acquired captured image and the calculated latest global motion value (translation amount and rotation amount). The elevation angle (the depression angle) and azimuth angle of the camera are calculated and output as camera parameters.
Note that the tripod camera parameter calculation unit 108 may output camera parameters in the form of a rotation matrix instead of the output format of elevation angle (depression angle) and azimuth angle.
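
The text does not give the formula that converts the global motion and the lens zoom amount into angles. As one possible reading, the sketch below assumes a pinhole camera that only pans and tilts about its optical center, so that an image shift of `dx_px` pixels at a focal length of `focal_px` pixels corresponds to a rotation of atan(dx/f). The function name, the parameters, the conversion of zoom amount to a focal length in pixels, and the sign conventions are all assumptions, not the method of the patent.

```python
import math


def pan_tilt_from_translation(dx_px, dy_px, focal_px):
    """Return (azimuth_deg, elevation_deg) under a pinhole pan/tilt assumption.

    dx_px, dy_px: accumulated global-motion translation in pixels
    relative to a reference frame (hypothetical input).
    focal_px: focal length in pixels, assumed to be derivable from the
    lens zoom amount in the metadata.
    """
    if focal_px <= 0:
        raise ValueError("focal length must be positive")
    # A horizontal shift maps to a pan angle, a vertical shift to a tilt angle.
    azimuth_deg = math.degrees(math.atan2(dx_px, focal_px))
    elevation_deg = math.degrees(math.atan2(dy_px, focal_px))
    return azimuth_deg, elevation_deg
```

For example, a horizontal shift equal to the focal length corresponds to a 45 degree pan under this model.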

手持ち用カメラパラメータ算出手段１０９は、三脚利用判定手段１０７が、三脚利用でないと判定した撮影映像について、バンドルアジャストメント（前記した特許文献２参照）等によるカメラパラメータ推定処理を行う。なお、このバンドルアジャストメントでは、異なる位置から撮影した複数のフレームに含まれる対応する特徴点を解析して、その特徴点の位置を１つの収束させる処理を行い、各フレームのカメラパラメータを求める。 The handheld camera parameter calculation unit 109 performs camera parameter estimation processing by bundle adjustment (see the above-mentioned Patent Document 2) or the like on the photographed image that the tripod usage determination unit 107 has determined not to use a tripod. In this bundle adjustment, corresponding feature points included in a plurality of frames captured from different positions are analyzed, and processing for converging the positions of the feature points is performed to obtain camera parameters of each frame.

カメラパラメータ出力手段１１０は、三脚用カメラパラメータ算出手段１０８、または、手持ち用カメラパラメータ算出手段１０９により算出されたカメラパラメータの情報を、取得した撮影映像のメタデータに付し、入出力手段２０を介して、映像アーカイブス１０００に出力する。 The camera parameter output unit 110 attaches the camera parameter information calculated by the tripod camera parameter calculation unit 108 or the handheld camera parameter calculation unit 109 to the metadata of the acquired captured video, and outputs it to the video archive 1000 via the input/output unit 20.

＜処理の流れ＞
次に、カメラパラメータ推定装置１の動作について説明する。
本実施形態に係るカメラパラメータ推定装置１のカメラパラメータ推定処理について、以下３つの処理例について説明する。
「第１の処理例」は、制御手段１０（図１参照）に備わる、近傍エッジフィッティング手段１０４、レンズ歪係数算出手段１０５および非剛体領域判定手段１０６が行う精度向上のための処理をすべて含んだカメラパラメータ推定処理である。「第２の処理例」は、算出したレンズ歪の情報を用いて映像を補正し、非剛体領域を処理対象から除外した上で、第１の処理を繰り返すことにより、カメラパラメータ推定処理の精度をさらに向上させる例である。「第３の処理例」は、制御手段１０に、近傍エッジフィッティング手段１０４、レンズ歪係数算出手段１０５および非剛体領域判定手段１０６を備えない構成とすることにより、処理負荷を軽減し、計算速度を向上させる例である。以下、３つの処理例について具体的に説明する。 <Flow of processing>
Next, the operation of the camera parameter estimation device 1 will be described.
Three processing examples of the camera parameter estimation processing of the camera parameter estimation device 1 according to the present embodiment are described below.
The “first processing example” is camera parameter estimation processing that includes all of the accuracy-improving processing performed by the neighboring edge fitting unit 104, the lens distortion coefficient calculation unit 105, and the non-rigid region determination unit 106 provided in the control unit 10 (see FIG. 1). The “second processing example” further improves the accuracy of the camera parameter estimation processing by correcting the video using the calculated lens distortion information, excluding non-rigid regions from processing, and then repeating the first processing. The “third processing example” reduces the processing load and improves the calculation speed by configuring the control unit 10 without the neighboring edge fitting unit 104, the lens distortion coefficient calculation unit 105, and the non-rigid region determination unit 106. The three processing examples are described in detail below.

≪カメラパラメータ推定の第１の処理例≫
カメラパラメータ推定の第１の処理例は、図１に示したカメラパラメータ推定装置１の制御手段１０内の各手段がすべて備わる場合の処理である。
図６は、本実施形態に係るカメラパラメータ推定装置１が行うカメラパラメータ推定処理（第１の処理例）を示すフローチャートである。 << First Processing Example of Camera Parameter Estimation >>
The first processing example of camera parameter estimation is processing in the case where all the units in the control unit 10 of the camera parameter estimation device 1 shown in FIG. 1 are provided.
FIG. 6 is a flowchart showing camera parameter estimation processing (first processing example) performed by the camera parameter estimation device 1 according to the present embodiment.

まず、カメラパラメータ推定装置１の映像取得手段１０１は、映像アーカイブス１０００から、メタデータが付与された撮影映像を取得する（ステップＳ１０）。
続いて、カメラパラメータ推定装置１のグローバルモーション推定手段１０２は、例えば、ＳＵＲＦを用いて撮影映像の各フレームにおいて特徴点を抽出することにより、グローバルモーション推定値を算出する（ステップＳ１１：特徴点を利用したグローバルモーション推定処理）。このとき、グローバルモーション推定手段１０２は、各フレームに対して対応点探索を行い、対応誤り除去を行う。
なお、このとき、グローバルモーション推定手段１０２がグローバルモーション推定値として算出した並進量と回転量が「ＧＭ１」である。 First, the video acquisition unit 101 of the camera parameter estimation device 1 acquires a photographed video to which metadata is added from the video archive 1000 (step S10).
Subsequently, the global motion estimation unit 102 of the camera parameter estimation device 1 calculates a global motion estimate by extracting feature points in each frame of the captured video using, for example, SURF (step S11: global motion estimation processing using feature points). At this time, the global motion estimation unit 102 performs a corresponding point search on each frame and removes erroneous correspondences.
At this time, the translation amount and the rotation amount calculated by the global motion estimation means 102 as the global motion estimated value are “GM1”.
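
As an illustration of how a translation amount and a rotation amount can be obtained from matched feature points, the sketch below fits a 2D rigid motion by least squares (the closed-form 2D Procrustes solution). It assumes that feature extraction, matching, and removal of erroneous correspondences have already been done; the function name and the point format are hypothetical, and the patent does not state that this particular estimator is used.

```python
import math


def estimate_rigid_motion(points_a, points_b):
    """Fit b = R(theta) * a + t to matched 2D points; return (theta, tx, ty)."""
    n = len(points_a)
    if n < 2 or n != len(points_b):
        raise ValueError("need at least two matched point pairs")
    # Centroids of both point sets.
    cax = sum(x for x, _ in points_a) / n
    cay = sum(y for _, y in points_a) / n
    cbx = sum(x for x, _ in points_b) / n
    cby = sum(y for _, y in points_b) / n
    # Accumulate dot and cross products of the centered coordinates.
    s_dot = 0.0
    s_cross = 0.0
    for (ax, ay), (bx, by) in zip(points_a, points_b):
        ux, uy = ax - cax, ay - cay
        vx, vy = bx - cbx, by - cby
        s_dot += ux * vx + uy * vy
        s_cross += ux * vy - uy * vx
    # Least-squares rotation angle, then the translation that aligns centroids.
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = cbx - (c * cax - s * cay)
    ty = cby - (s * cax + c * cay)
    return theta, tx, ty
```

In practice such a fit would typically be wrapped in a robust scheme so that remaining mismatches do not bias the estimate.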

次に、カメラパラメータ推定装置１のエッジ抽出手段１０３は、例えば、Ｓｏｂｅｌフィルタを用いて、各フレームに対し、エッジ抽出を行う（ステップＳ１２）。
そして、カメラパラメータ推定装置１の近傍エッジフィッティング手段１０４は、エッジ抽出手段１０３により抽出されたエッジ画像に対し、図２において説明したエッジフィッティング処理を実行することにより、グローバルモーション推定手段１０２が算出した「ＧＭ１」について、さらに精度を向上させたグローバルモーション推定値を算出する（ステップＳ１３：近傍エッジフィッティングによるグローバルモーション更新処理）。ここで、近傍エッジフィッティング手段１０４により算出される更新されたグローバルモーション推定値（並進量と回転量）が「ＧＭ２」である。 Next, the edge extraction unit 103 of the camera parameter estimation device 1 performs edge extraction on each frame using, for example, a Sobel filter (step S12).
Then, the neighboring edge fitting unit 104 of the camera parameter estimation device 1 performs the edge fitting processing described with reference to FIG. 2 on the edge image extracted by the edge extraction unit 103, and thereby calculates a global motion estimate that is more accurate than the “GM1” calculated by the global motion estimation unit 102 (step S13: global motion update processing by neighboring edge fitting). Here, the updated global motion estimate (translation amount and rotation amount) calculated by the neighboring edge fitting unit 104 is “GM2”.
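
The edge extraction of step S12 names the Sobel filter as an example. The following is a minimal pure-Python sketch of the standard 3x3 Sobel gradient magnitude on a grayscale image given as a list of rows; the function name and the choice to leave border pixels at zero are assumptions made for illustration.

```python
import math


def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a 2D grayscale image.

    image: list of equal-length rows of numbers. Border pixels are
    left at 0.0 because the 3x3 kernel does not fit there.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]) \
                - (image[y - 1][x - 1] + 2 * image[y][x - 1] + image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]) \
                - (image[y - 1][x - 1] + 2 * image[y - 1][x] + image[y - 1][x + 1])
            out[y][x] = math.hypot(gx, gy)
    return out
```

The magnitude can also serve as the edge strength (luminance gradient) used when selecting the strongest edges.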

続いて、カメラパラメータ推定装置１のレンズ歪係数算出手段１０５は、「ＧＭ２」を基準に、グローバルモーション推定手段１０２が「ＧＭ１」を算出する際に求めた対応点の誤りを除去した上で、レンズ歪係数を算出する（ステップＳ１４）。
ここで、レンズ歪係数算出手段１０５は、レンズ歪を補正したＡフレームのエッジ位置と、レンズ歪およびグローバルモーション（ＧＭ２）を補正したＢフレームのエッジ位置との距離が「０」に収束することに基づく最適化処理を行うことにより、レンズ歪係数を算出する。なお、ここで算出される評価値Ｃ_ｄ（式（８）参照）は、前記したように、ＡフレームとＢフレームの対応するエッジ間の距離の平均値を表わす。 Subsequently, the lens distortion coefficient calculation unit 105 of the camera parameter estimation device 1 uses “GM2” as a reference to remove errors in the corresponding points obtained when the global motion estimation unit 102 calculated “GM1”, and then calculates the lens distortion coefficient (step S14).
Here, the lens distortion coefficient calculation unit 105 calculates the lens distortion coefficient by performing optimization processing based on the premise that the distance between the edge position of the A frame corrected for lens distortion and the edge position of the B frame corrected for lens distortion and global motion (GM2) converges to “0”. As described above, the evaluation value C_d (see Expression (8)) calculated here represents the average value of the distances between corresponding edges of the A frame and the B frame.

そして、カメラパラメータ推定装置１の非剛体領域判定手段１０６は、各フレーム内の領域を複数のブロックに分割し、色ヒストグラムインターセクションを利用することにより、ＡフレームとＢフレームとの類似度を評価し、所定の閾値以下であれば、そのブロックを非剛体の領域であると判定する（ステップＳ１５）。 Then, the non-rigid region determination unit 106 of the camera parameter estimation device 1 divides the region in each frame into a plurality of blocks, evaluates the similarity between the A frame and the B frame for each block using the color histogram intersection, and determines that a block is a non-rigid region if its similarity is equal to or less than a predetermined threshold (step S15).
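
The block-wise test of step S15 can be sketched as follows. The sketch uses the common normalized form of histogram intersection (the sum of bin-wise minima of two histograms that each sum to 1), which is assumed here rather than taken from the expression defined earlier in the document; the function names are hypothetical and the threshold value is a placeholder, since the text only says “a predetermined threshold”.

```python
def histogram_intersection(hist_a, hist_b):
    """Normalized histogram intersection in [0, 1]; 1 means identical shape."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a == 0 or total_b == 0:
        return 0.0  # an empty block carries no color evidence
    return sum(min(a / total_a, b / total_b) for a, b in zip(hist_a, hist_b))


def non_rigid_blocks(block_hists_a, block_hists_b, threshold=0.8):
    """Return one flag per block: True if the block is judged non-rigid.

    threshold is a placeholder value, not taken from the source.
    """
    return [
        histogram_intersection(a, b) <= threshold
        for a, b in zip(block_hists_a, block_hists_b)
    ]
```

Blocks flagged `True` would then be excluded from feature point extraction and corresponding point search when the global motion is re-estimated.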

続いて、三脚利用判定手段１０７は、レンズ歪と非剛体領域に基づくグローバルモーションの更新処理を行う（ステップＳ１６）。
具体的には、三脚利用判定手段１０７は、ステップＳ１４においてレンズ歪係数算出手段１０５が算出したレンズ歪係数を用いて、ＡフレームおよびＢフレームに対し、レンズ歪の補正処理を行う。そして、三脚利用判定手段１０７は、ステップＳ１５において非剛体領域と判定されたブロック内に関しては、特徴点抽出および対応点探索の対象とせず、再度、グローバルモーション推定手段１０２を介して、グローバルモーション推定処理を行う。さらに、三脚利用判定手段１０７は、そこで算出されたグローバルモーションに基づき、近傍エッジフィッティング手段１０４を介して、エッジフィッティング処理を行うことにより、グローバルモーションを更新する。なお、ここで、三脚利用判定手段１０７により算出されたグローバルモーション推定値（並進量と回転量）が「ＧＭ３」である。 Subsequently, the tripod usage determination unit 107 performs global motion update processing based on lens distortion and a non-rigid region (step S16).
Specifically, the tripod usage determination unit 107 performs lens distortion correction on the A frame and the B frame using the lens distortion coefficient calculated by the lens distortion coefficient calculation unit 105 in step S14. The tripod usage determination unit 107 then excludes the blocks determined to be non-rigid regions in step S15 from feature point extraction and corresponding point search, and performs the global motion estimation processing again via the global motion estimation unit 102. Further, based on the global motion calculated there, the tripod usage determination unit 107 updates the global motion by performing edge fitting processing via the neighboring edge fitting unit 104. The global motion estimate (translation amount and rotation amount) calculated here by the tripod usage determination unit 107 is “GM3”.

次に、三脚利用判定手段１０７は、エッジ画像から得られるエッジ部周辺のオクルージョン量を、色ヒストグラムインターセクションＤを用いて評価することにより、撮影映像が三脚を利用して撮影された映像か否かを判定する（ステップＳ１７）。このとき、三脚利用判定手段１０７は、撮影映像に付されたメタデータに含まれるショット区間の情報を用いて、ショット区間毎に撮影映像が三脚を利用して撮影したか否かを判定する。
そして、三脚利用判定手段１０７が、三脚利用と判定した場合（ステップＳ１７→Ｙｅｓ）、次のステップＳ１８に進み、三脚利用でないと判定した場合（ステップＳ１７→Ｎｏ）、次のステップＳ１９に進む。 Next, the tripod usage determination unit 107 evaluates, using the color histogram intersection D, the amount of occlusion around the edge portions obtained from the edge image, and thereby determines whether the captured video was shot using a tripod (step S17). At this time, the tripod usage determination unit 107 uses the shot section information included in the metadata attached to the captured video to make this determination for each shot section.
Then, when the tripod use determining unit 107 determines that the tripod is used (Step S17: Yes), the process proceeds to the next Step S18, and when it is determined that the tripod is not used (Step S17: No), the process proceeds to the next Step S19.

ステップＳ１８において、三脚用カメラパラメータ算出手段１０８は、撮影映像に付与されたメタデータに含まれるレンズズーム量と、「ＧＭ３」で示されるグローバルモーション値とを用いて、カメラパラメータを算出する。
一方、ステップＳ１９において、手持ち用カメラパラメータ算出手段１０９は、バンドルアジャストメント（前記した特許文献２参照）等の手法を用いて、カメラパラメータを算出する。 In step S18, the tripod camera parameter calculation unit 108 calculates a camera parameter using the lens zoom amount included in the metadata added to the photographed image and the global motion value indicated by "GM3".
On the other hand, in step S19, the hand-held camera parameter calculation means 109 calculates camera parameters using a method such as bundle adjustment (refer to Patent Document 2 described above).

続いて、カメラパラメータ出力手段１１０は、三脚用カメラパラメータ算出手段１０８または手持ち用カメラパラメータ算出手段１０９により算出されたカメラパラメータの情報を、撮影映像のメタデータに付し、映像アーカイブス１０００に出力する（ステップＳ２０）。 Subsequently, the camera parameter output unit 110 attaches the information of the camera parameter calculated by the tripod camera parameter calculation unit 108 or the hand-held camera parameter calculation unit 109 to the metadata of the photographed image, and outputs it to the image archive 1000 (Step S20).

≪カメラパラメータ推定の第２の処理例≫
次に、カメラパラメータ推定の第２の処理例について説明する。
図７は、本実施形態に係るカメラパラメータ推定装置１が行うカメラパラメータ推定処理（第２の処理例）を示すフローチャートである。
図６に示した第１の処理例と、図７で示す第２の処理例との違いは、ステップＳ１７の撮影映像が三脚を利用して撮影された映像か否かの判定の前に、ステップＳ１１〜Ｓ１６を繰り返すか否かの判定処理を設け、エッジ間の距離の平均値が所定の閾値以下になるまで、グローバルモーション等の更新処理を繰り返すことである。これにより三脚利用か否かの判定およびカメラパラメータ推定値の精度をさらに向上させることができる。
なお、図７においては、図６において説明した同一の処理については、同一のステップ番号を付し、説明を省略する。 << Second Processing Example of Camera Parameter Estimation >>
Next, a second processing example of camera parameter estimation will be described.
FIG. 7 is a flowchart showing camera parameter estimation processing (second processing example) performed by the camera parameter estimation device 1 according to the present embodiment.
The difference between the first processing example shown in FIG. 6 and the second processing example shown in FIG. 7 is that before the determination of whether or not the captured image in step S17 is an image captured using a tripod, A determination process as to whether or not to repeat steps S11 to S16 is provided, and the update process such as global motion is repeated until the average value of the distances between the edges becomes equal to or less than a predetermined threshold. This makes it possible to further improve the determination of whether or not to use a tripod and the accuracy of the camera parameter estimation value.
In FIG. 7, the same steps as those described in FIG. 6 carry the same step numbers, and the description thereof will be omitted.

まず、カメラパラメータ推定装置１は、図６と同様に、ステップＳ１０〜Ｓ１６の処理を実行することにより、三脚利用判定手段１０７が、グローバルモーション推定値（並進量と回転量）として「ＧＭ３」を算出する。
続いて、三脚利用判定手段１０７は、ステップＳ３０において、直近で算出されたグローバルモーション（ここでは、「ＧＭ３」）での対応点（エッジ位置）に基づき、前記した式（８）で示されるエッジ間の距離の平均値（評価値「Ｃ_ｄ」）を算出する。そして、三脚利用判定手段１０７は、そのエッジ間の距離の平均値（評価値Ｃ_ｄ）が所定の閾値Ｔｈ_ｉ（所定の第３の閾値）（例えば、「０．８」）を超えるか否かを判定する。
ここで、所定の閾値Ｔｈ_ｉを超える場合には（ステップＳ３０→Ｙｅｓ）、ステップＳ１１に戻って処理を続ける。なお、２回目以降の繰り返し処理のステップＳ１１において、グローバルモーション推定手段１０２は、レンズ歪係数算出手段１０５により直近で算出されたレンズ歪係数に基づく画像の補正を行うとともに、非剛体領域判定手段１０６が非剛体領域と判定したブロック内の特徴点を除外して、グローバルモーションの推定値の算出を行う。また、ステップＳ１２のエッジ抽出処理は、１回目に行っているため、２回目以降は実行しないようにしてもよい。それ以降の処理は、図６に示したステップＳ１３〜Ｓ１６の処理と同様である。 First, as in FIG. 6, the camera parameter estimation device 1 executes the processing of steps S10 to S16, whereby the tripod usage determination unit 107 calculates “GM3” as the global motion estimate (translation amount and rotation amount).
Subsequently, in step S30, the tripod usage determination unit 107 calculates the average value of the distances between edges (evaluation value “C_d”) given by Expression (8) above, based on the corresponding points (edge positions) under the most recently calculated global motion (here, “GM3”). The tripod usage determination unit 107 then determines whether this average value (evaluation value C_d) exceeds a predetermined threshold Th_i (predetermined third threshold) (for example, “0.8”).
If the value exceeds the predetermined threshold Th_i (step S30 → Yes), the processing returns to step S11 and continues. In step S11 of the second and subsequent iterations, the global motion estimation unit 102 corrects the image based on the lens distortion coefficient most recently calculated by the lens distortion coefficient calculation unit 105, excludes the feature points in the blocks that the non-rigid region determination unit 106 determined to be non-rigid regions, and then calculates the global motion estimate. Since the edge extraction processing in step S12 has already been performed in the first iteration, it may be skipped in the second and subsequent iterations. The subsequent processing is the same as steps S13 to S16 shown in FIG. 6.

一方、三脚利用判定手段１０７は、ステップＳ３０において、そのエッジ間の距離の平均値（評価値Ｃ_ｄ）が所定の閾値Ｔｈ_ｉ（例えば、「０．８」）以下である場合（ステップＳ３０→Ｎｏ）、撮影映像が三脚を利用して撮影された映像か否かを判定するステップＳ１７に進む。それ以降の処理は、図６に示したステップＳ１８〜Ｓ２０の処理と同様である。 On the other hand, if the average value of the distances between edges (evaluation value C_d) is equal to or less than the predetermined threshold Th_i (for example, “0.8”) in step S30 (step S30 → No), the tripod usage determination unit 107 proceeds to step S17, where it determines whether the captured video was shot using a tripod. The subsequent processing is the same as steps S18 to S20 shown in FIG. 6.

このようにすることにより、カメラパラメータ推定装置１は、エッジ間の距離の平均値で示される評価値「Ｃ_ｄ」を所定の閾値Ｔｈ_ｉ以下まで収束させることができる。よって、カメラパラメータ推定の第２の処理例では、第１の処理例よりもさらに精度を向上させて、三脚利用か否かの判定と、カメラパラメータ推定値の算出とを実行することができる。 In this way, the camera parameter estimation device 1 can make the evaluation value “C_d”, given by the average value of the distances between edges, converge to the predetermined threshold Th_i or below. Therefore, the second processing example of camera parameter estimation can determine whether a tripod was used and calculate the camera parameter estimates with higher accuracy than the first processing example.
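
The control flow of the second processing example (repeat steps S11 to S16 until the evaluation value C_d is at or below Th_i) can be sketched as a simple loop. The callback `update_step`, which stands in for one pass of steps S11 to S16 and returns the resulting C_d, is hypothetical, and the iteration cap `max_iterations` is an added safeguard that does not appear in the text.

```python
def refine_until_converged(update_step, th_i=0.8, max_iterations=10):
    """Repeat the update pass until C_d <= Th_i.

    update_step(iteration) performs one pass of steps S11 to S16 and
    returns the evaluation value C_d (mean distance between edges).
    Returns (c_d, iterations_used, converged).
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    c_d = float("inf")
    for iteration in range(1, max_iterations + 1):
        c_d = update_step(iteration)
        # Step S30: continue only while C_d still exceeds Th_i.
        if c_d <= th_i:
            return c_d, iteration, True
    # The cap was reached without convergence (not covered by the source).
    return c_d, max_iterations, False
```

A caller would proceed to the tripod-use determination of step S17 once the loop returns.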

≪カメラパラメータ推定の第３の処理例≫
次に、カメラパラメータ推定の第３処理例について説明する。
図８は、本実施形態に係るカメラパラメータ推定装置１が行うカメラパラメータ推定処理（第３の処理例）を示すフローチャートである。
第３の処理例を実行するカメラパラメータ推定装置１の制御手段１０は、図１に示した構成と比べると、図９に示すように、近傍エッジフィッティング手段１０４、レンズ歪係数算出手段１０５および非剛体領域判定手段１０６を備えていない。この構成の相違に伴う、図６に示した第１の処理例と、図８に示すこの第３の処理例との違いは、近傍エッジフィッティング手段１０４が実行するステップＳ１３、レンズ歪係数算出手段１０５が実行するステップＳ１４、非剛体領域判定手段１０６が実行するステップＳ１５、および、三脚利用判定手段１０７が実行する、レンズ歪と非剛体領域に基づくグローバルモーションの更新処理（ステップＳ１６）の各処理を含まない点である。 << Third Processing Example of Camera Parameter Estimation >>
Next, a third processing example of camera parameter estimation will be described.
FIG. 8 is a flowchart showing camera parameter estimation processing (third processing example) performed by the camera parameter estimation device 1 according to the present embodiment.
Compared with the configuration shown in FIG. 1, the control unit 10 of the camera parameter estimation device 1 that executes the third processing example does not include the neighboring edge fitting unit 104, the lens distortion coefficient calculation unit 105, or the non-rigid region determination unit 106, as shown in FIG. 9. Owing to this difference in configuration, the third processing example shown in FIG. 8 differs from the first processing example shown in FIG. 6 in that it does not include step S13 executed by the neighboring edge fitting unit 104, step S14 executed by the lens distortion coefficient calculation unit 105, step S15 executed by the non-rigid region determination unit 106, or the global motion update processing based on lens distortion and non-rigid regions executed by the tripod usage determination unit 107 (step S16).

よって、図８に示すように、ステップＳ１１においてグローバルモーション推定手段１０２が算出したグローバルモーション推定値（「ＧＭ１」の並進量と回転量）、および、エッジ抽出手段１０３が抽出したエッジ画像に基づき、三脚利用判定手段１０７が、そのエッジ画像から得られるエッジ部周辺のオクルージョン量に基づき、撮影映像が三脚を利用して撮影された映像か否かを判定する（ステップＳ１７）。それ以降の処理は、図６に示したステップＳ１８〜Ｓ２０の処理と同様である。 Therefore, as shown in FIG. 8, based on the global motion estimated value (translation amount and rotation amount of “GM1”) calculated by the global motion estimation means 102 in step S11, and the edge image extracted by the edge extraction means 103, Based on the amount of occlusion around the edge obtained from the edge image, the tripod usage determining unit 107 determines whether the captured image is an image captured using a tripod (step S17). The subsequent processing is the same as the processing in steps S18 to S20 shown in FIG.

このようにすることにより、第３の処理例を実行するカメラパラメータ推定装置１は、第１の処理例よりもさらに処理負荷を軽減し、計算速度を向上させた上で、三脚利用か否かの判定と、カメラパラメータ推定値の算出とを実行することができる。 In this way, the camera parameter estimation device 1 executing the third processing example can determine whether a tripod was used and calculate the camera parameter estimates with a lower processing load and a higher calculation speed than the first processing example.

以上説明したように、本実施形態に係るカメラパラメータ推定装置１およびカメラパラメータ推定プログラムによれば、三脚を利用した撮影映像か否かの初期設定をすることなく、効率的にカメラパラメータの推定を可能とすることができる。
つまり、三脚を利用した撮影映像か否かの初期値の設定をすることなくカメラパラメータの推定処理を自動化することができる。また、三脚を利用した映像か否かの判定を行うことにより、三脚利用の映像に適したカメラパラメータ算出処理を実行できるため、不必要な計算コストの増大を抑制することができる。さらに、三脚利用か否かの判定処理とともに、その判定に用いる情報を利用して、撮影映像のカメラパラメータを算出することができる。よって、トータルとして効率的なカメラパラメータ推定が可能となる。 As described above, the camera parameter estimation device 1 and the camera parameter estimation program according to the present embodiment make it possible to estimate camera parameters efficiently without an initial setting that indicates whether the video was shot using a tripod.
That is, the estimation process of the camera parameter can be automated without setting the initial value as to whether or not the image is taken using a tripod. In addition, since it is possible to execute the camera parameter calculation process suitable for the image using the tripod by determining whether the image is the image using the tripod, it is possible to suppress an increase in unnecessary calculation cost. Furthermore, together with the determination processing as to whether or not the tripod is used, it is possible to calculate the camera parameter of the photographed image using the information used for the determination. Therefore, efficient camera parameter estimation can be performed as a whole.

なお、本発明は、ここで説明した実施形態に限定されるものではない。例えば、第３の処理例を実行する構成に加えて、カメラパラメータ推定装置１は、近傍エッジフィッティング手段１０４、レンズ歪係数算出手段１０５、非剛体領域判定手段１０６のいずれか１つ、または、その組み合わせを追加して制御手段１０に備えるようにし、精度を向上させるようにしてもよい。また、その際に、三脚利用判定手段１０７が、図７のステップＳ３０で示したように、エッジ間の距離の平均値（評価値「Ｃ_ｄ」）を算出し、その値が所定の閾値Ｔｈ_ｉを超える場合に、グローバルモーション値を算出する処理等を繰り返し、精度を向上させるようにしてもよい。 The present invention is not limited to the embodiment described here. For example, in addition to the configuration that executes the third processing example, the control unit 10 of the camera parameter estimation device 1 may additionally include any one of, or a combination of, the neighboring edge fitting unit 104, the lens distortion coefficient calculation unit 105, and the non-rigid region determination unit 106 to improve accuracy. In that case, as shown in step S30 of FIG. 7, the tripod usage determination unit 107 may calculate the average value of the distances between edges (evaluation value “C_d”) and, when that value exceeds the predetermined threshold Th_i, repeat the processing of calculating the global motion value and so on to improve accuracy.

１カメラパラメータ推定装置
１０制御手段
２０入出力手段
３０記憶手段
１０１映像取得手段
１０２グローバルモーション推定手段
１０３エッジ抽出手段
１０４近傍エッジフィッティング手段
１０５レンズ歪係数算出手段
１０６非剛体領域判定手段
１０７三脚利用判定手段
１０８三脚用カメラパラメータ算出手段
１０９手持ち用カメラパラメータ算出手段
１１０カメラパラメータ出力手段
３００映像記憶手段
１０００映像アーカイブス
Ｓカメラパラメータ推定システム
DESCRIPTION OF SYMBOLS
1 Camera parameter estimation apparatus
10 Control means
20 Input/output means
30 Storage means
101 Image acquisition means
102 Global motion estimation means
103 Edge extraction means
104 Neighboring edge fitting means
105 Lens distortion coefficient calculation means
106 Non-rigid region determination means
107 Tripod usage determination means
108 Tripod camera parameter calculation means
109 Handheld camera parameter calculation means
110 Camera parameter output means
300 Image storage means
1000 Image archives
S Camera parameter estimation system

Claims

撮影カメラで撮影された撮影映像のカメラパラメータを推定するカメラパラメータ推定装置であって、
前記撮影映像が記憶されている記憶手段から、前記撮影映像を取得する映像取得手段と、
前記取得した撮影映像を構成するフレーム画像それぞれの特徴点を抽出し、基準となるフレーム画像において抽出された特徴点と、前記撮影カメラの動きの評価対象となる他のフレーム画像において抽出された特徴点との間で、同一の前記特徴点が対応付けられた対応点の探索を行うことにより、前記基準となるフレーム画像と前記他のフレーム画像との間の画面全体の移動量を示すグローバルモーションを推定するグローバルモーション推定手段と、
前記フレーム画像それぞれについて、エッジの抽出を行うエッジ抽出手段と、
前記抽出されたエッジのうち、前記フレーム画像それぞれの間において前記対応点となる当該エッジの周辺を示す所定領域の画像の類似度を算出し、当該算出した類似度が所定の第１の閾値を超えた場合に、前記撮影映像が三脚を利用した映像であると判定し、当該算出した類似度が前記所定の第１の閾値以下の場合に、前記撮影映像が三脚を利用した映像でないと判定する三脚利用判定手段と、
前記撮影映像が三脚を利用した映像であると判定された場合に、前記推定されたグローバルモーションで示される移動量を用いて、前記カメラパラメータを算出する三脚用カメラパラメータ算出手段と、
前記撮影映像が三脚を利用した映像でないと判定された場合に、前記フレーム画像それぞれに含まれる対応する特徴点を解析して前記カメラパラメータを算出する手持ち用カメラパラメータ算出手段と、
前記三脚用カメラパラメータ算出手段により算出されたカメラパラメータ、または、前記手持ち用カメラパラメータ算出手段により算出されたカメラパラメータを、前記記憶手段に出力するカメラパラメータ出力手段と、
を備えることを特徴とするカメラパラメータ推定装置。 A camera parameter estimation device for estimating camera parameters of a captured image captured by a capturing camera, comprising:
A video acquisition unit that acquires the captured video from storage means in which the captured video is stored;
Global motion estimation means for extracting feature points from each of the frame images constituting the acquired captured video, and for estimating a global motion indicating the amount of movement of the entire screen between a reference frame image and another frame image, which is the target for evaluating the movement of the capturing camera, by searching for corresponding points at which the same feature point is associated between the feature points extracted in the reference frame image and the feature points extracted in the other frame image;
Edge extraction means for extracting an edge for each of the frame images;
Tripod usage determination means for calculating, among the extracted edges, the similarity of images of a predetermined area around each edge that serves as a corresponding point between the frame images, determining that the captured video is a video shot using a tripod if the calculated similarity exceeds a predetermined first threshold, and determining that the captured video is not a video shot using a tripod if the calculated similarity is equal to or less than the predetermined first threshold;
A tripod camera parameter calculation unit that calculates the camera parameter using the movement amount indicated by the estimated global motion when it is determined that the captured video is a video using a tripod;
A handheld camera parameter calculation unit that analyzes the corresponding feature points included in each of the frame images and determines the camera parameter when it is determined that the captured video is not a video using a tripod;
Camera parameter output means for outputting the camera parameters calculated by the tripod camera parameter calculation means or the camera parameters calculated by the hand-held camera parameter calculation means to the storage means;
A camera parameter estimation apparatus comprising:

前記基準となるフレーム画像において、前記エッジ抽出手段が抽出したエッジについて、当該エッジに隣接するエッジの情報に基づき法線方向を求め、当該法線方向に設定した法線上で最近傍の前記他のフレーム画像のエッジの位置を決定し、前記基準となるフレーム画像のエッジの位置と、前記決定した他のフレーム画像のエッジの位置とから得たエッジの移動量を用いて、前記グローバルモーションを更新する近傍エッジフィッティング手段を、さらに備え、
前記三脚利用判定手段は、前記近傍エッジフィッティング手段により更新されたグローバルモーションで示される移動量を用いて、前記グローバルモーション推定手段が推定したグローバルモーションでの対応点の誤りを除去した上で、前記対応点となる当該エッジの周辺を示す所定領域の画像の類似度を算出し、
前記三脚用カメラパラメータ算出手段は、前記グローバルモーション推定手段により推定されたグローバルモーションの代わりに、前記近傍エッジフィッティング手段が更新したグローバルモーションに基づき、前記カメラパラメータを算出すること
を特徴とする請求項１に記載のカメラパラメータ推定装置。 The device further comprises neighboring edge fitting means that, for an edge extracted by the edge extraction means in the reference frame image, obtains a normal direction based on information on the edges adjacent to that edge, determines the position of the nearest edge of the other frame image on the normal line set in that normal direction, and updates the global motion using the edge movement amount obtained from the position of the edge in the reference frame image and the determined position of the edge in the other frame image;
The tripod usage determination means uses the movement amount indicated by the global motion updated by the neighboring edge fitting means to remove errors in the corresponding points under the global motion estimated by the global motion estimation means, and then calculates the similarity of images of the predetermined area around each edge serving as a corresponding point; and
The tripod camera parameter calculation means calculates the camera parameter based on the global motion updated by the neighboring edge fitting means instead of the global motion estimated by the global motion estimation means. The camera parameter estimation device according to claim 1.

前記エッジ抽出手段により抽出されたエッジのうち、前記基準となるフレーム画像で検出されたエッジの位置についてレンズ歪を補正したエッジの位置と、前記他のフレーム画像で検出されたエッジの位置についてレンズ歪および前記推定されたグローバルモーションの移動量を補正したエッジの位置との、距離が０に収束するように解析する最適化処理を行うことにより、レンズ歪係数を算出するレンズ歪係数算出手段を、さらに備え、
前記三脚利用判定手段は、
前記フレーム画像それぞれについて、前記レンズ歪係数算出手段が算出したレンズ歪係数に基づく補正を行った上で、前記グローバルモーションを更新するとともに、前記フレーム画像それぞれの間において前記対応点となる当該エッジの周辺を示す所定領域の画像の類似度を算出し、
前記三脚用カメラパラメータ算出手段は、前記グローバルモーション推定手段により推定されたグローバルモーションの代わりに、前記三脚利用判定手段により更新された前記グローバルモーションに基づき、前記カメラパラメータを算出すること
を特徴とする請求項１に記載のカメラパラメータ推定装置。 The camera parameter estimation device according to claim 1, further comprising lens distortion coefficient calculation means that calculates a lens distortion coefficient by performing, on the edges extracted by the edge extraction means, optimization processing that makes the distance converge to 0 between the lens-distortion-corrected position of an edge detected in the reference frame image and the position of an edge detected in the other frame image corrected for both lens distortion and the movement amount of the estimated global motion, wherein
the tripod usage determination means,
after correcting each of the frame images based on the lens distortion coefficient calculated by the lens distortion coefficient calculation means, updates the global motion and calculates, between the frame images, the similarity of the images of a predetermined region surrounding the edge serving as the corresponding point, and
the tripod camera parameter calculation means calculates the camera parameter based on the global motion updated by the tripod usage determination means instead of the global motion estimated by the global motion estimation means.
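As a rough illustration of the optimization in this claim, the sketch below fits a single radial coefficient so that corrected, motion-compensated edge positions coincide. The claim does not fix a distortion model or a solver; the one-coefficient model `u = d * (1 + k * |d|^2)` (coordinates relative to the image center), the pure-translation global motion, and the closed-form least-squares solution are all assumptions, and every function name is hypothetical.

```python
def correct(point, k):
    """Undo radial distortion with the assumed model u = d * (1 + k * |d|^2)."""
    x, y = point
    s = 1.0 + k * (x * x + y * y)
    return (x * s, y * s)

def fit_distortion_coefficient(ref_edges, other_edges, motion):
    """Least-squares k minimizing |correct(other) - motion - correct(ref)|^2.

    The residual is linear in k (a + k * b), so the minimum is closed-form.
    """
    gx, gy = motion
    num = den = 0.0
    for (rx, ry), (ox, oy) in zip(ref_edges, other_edges):
        r2, o2 = rx * rx + ry * ry, ox * ox + oy * oy
        ax, ay = ox - gx - rx, oy - gy - ry            # residual at k = 0
        bx, by = ox * o2 - rx * r2, oy * o2 - ry * r2  # derivative of residual w.r.t. k
        num += ax * bx + ay * by
        den += bx * bx + by * by
    return -num / den if den else 0.0

def distort(point, k, iterations=50):
    """Invert `correct` by fixed-point iteration (only used to build synthetic data)."""
    ux, uy = point
    dx, dy = ux, uy
    for _ in range(iterations):
        s = 1.0 + k * (dx * dx + dy * dy)
        dx, dy = ux / s, uy / s
    return (dx, dy)

# Synthetic example: four scene points seen in two frames related by a translation.
true_k, motion = 1e-3, (2.0, -1.0)
scene = [(-4.0, -3.0), (-1.0, 4.0), (3.0, 2.0), (5.0, -4.0)]
ref_edges = [distort(p, true_k) for p in scene]
other_edges = [distort((x + motion[0], y + motion[1]), true_k) for x, y in scene]
estimated_k = fit_distortion_coefficient(ref_edges, other_edges, motion)
```

With more coefficients or a non-translational global motion the residual is no longer linear in the unknowns, and an iterative solver would take the place of the closed form.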

前記フレーム画像それぞれを所定領域のブロックに分割し、前記基準となるフレーム画像のブロックと、それに対応する前記他のフレーム画像のブロックとの類似度を算出し、当該算出した類似度が所定の第２の閾値以下である場合に、前記他のフレーム画像のブロックを非剛体領域であると判定する非剛体領域判定手段を、さらに備え、
前記三脚利用判定手段は、前記非剛体領域判定手段が判定した非剛体領域のブロックに含まれる特徴点を対象とせず、前記グローバルモーション推定手段を介して前記グローバルモーションを更新するとともに、前記非剛体領域のブロックに含まれる対応点を処理対象とせずに、前記フレーム画像それぞれの間において前記対応点となる当該エッジの周辺を示す所定領域の画像の類似度を算出し、
前記三脚用カメラパラメータ算出手段は、前記グローバルモーション推定手段により推定されたグローバルモーションの代わりに、前記三脚利用判定手段により更新された前記グローバルモーションに基づき、前記カメラパラメータを算出すること
を特徴とする請求項１に記載のカメラパラメータ推定装置。 The camera parameter estimation device according to claim 1, further comprising non-rigid region determination means that divides each of the frame images into blocks of a predetermined region, calculates the similarity between a block of the reference frame image and the corresponding block of the other frame image, and determines that the block of the other frame image is a non-rigid region when the calculated similarity is less than or equal to a predetermined second threshold, wherein
the tripod usage determination means updates the global motion via the global motion estimation means while excluding feature points contained in the blocks of the non-rigid region determined by the non-rigid region determination means, and calculates, between the frame images, the similarity of the images of a predetermined region surrounding the edge serving as the corresponding point while excluding corresponding points contained in the blocks of the non-rigid region, and
the tripod camera parameter calculation means calculates the camera parameter based on the global motion updated by the tripod usage determination means instead of the global motion estimated by the global motion estimation means.
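The block-wise non-rigid region test can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: the frames are grayscale lists of rows that have already been aligned by the global motion, the similarity measure is normalized cross-correlation (the claim does not name one), and `non_rigid_blocks` and its parameters are hypothetical names.

```python
import math

def block_pixels(image, top, left, size):
    """Flatten one size x size block of a grayscale image given as a list of rows."""
    return [image[y][x] for y in range(top, top + size)
            for x in range(left, left + size)]

def ncc(a, b):
    """Normalized cross-correlation of two equally sized pixel lists, in [-1, 1]."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a)
                    * sum((q - mean_b) ** 2 for q in b))
    if den == 0:
        # Flat blocks carry no texture; treat identical ones as fully similar.
        return 1.0 if a == b else 0.0
    return num / den

def non_rigid_blocks(reference, other, size, second_threshold):
    """Return (block_row, block_col) indices whose similarity is <= the second threshold."""
    flagged = set()
    for top in range(0, len(reference) - size + 1, size):
        for left in range(0, len(reference[0]) - size + 1, size):
            similarity = ncc(block_pixels(reference, top, left, size),
                             block_pixels(other, top, left, size))
            if similarity <= second_threshold:
                flagged.add((top // size, left // size))
    return flagged

# Synthetic example: only the bottom-right 2x2 block changes between the frames.
reference = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
other = [row[:] for row in reference]
other[2][2:4] = [15, 14]
other[3][2:4] = [11, 10]
flagged = non_rigid_blocks(reference, other, size=2, second_threshold=0.5)
```

Feature points and corresponding points falling inside the flagged blocks would then be skipped when the global motion is re-estimated and when the patch similarity is computed.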

撮影カメラで撮影された撮影映像のカメラパラメータを推定するカメラパラメータ推定装置であって、
前記撮影映像が記憶されている記憶手段から、前記撮影映像を取得する映像取得手段と、
前記取得した撮影映像を構成するフレーム画像それぞれの特徴点を抽出し、基準となるフレーム画像において抽出された特徴点と、前記撮影カメラの動きの評価対象となる他のフレーム画像において抽出された特徴点との間で、同一の前記特徴点が対応付けられた対応点の探索を行うことにより、前記基準となるフレーム画像と前記他のフレーム画像との間の画面全体の移動量を示す第１のグローバルモーションを推定するグローバルモーション推定処理を行うグローバルモーション推定手段と、
前記フレーム画像それぞれについて、エッジの抽出を行うエッジ抽出手段と、
前記基準となるフレーム画像において、前記エッジ抽出手段が抽出したエッジについて、当該エッジに隣接するエッジの情報に基づき法線方向を求め、当該法線方向に設定した法線上で最近傍の前記他のフレーム画像のエッジの位置を決定し、前記基準となるフレーム画像のエッジの位置と、前記決定した他のフレーム画像のエッジの位置とから得たエッジの移動量を用いて、第２のグローバルモーションを算出する近傍エッジフィッティング手段と、
前記第２のグローバルモーションで示される移動量を用いて、前記グローバルモーション推定手段が推定した前記第１のグローバルモーションでの対応点の誤りを除去した上で、前記エッジ抽出手段により抽出されたエッジのうち、前記基準となるフレーム画像で検出されたエッジの位置についてレンズ歪を補正したエッジの位置と、前記他のフレーム画像で検出されたエッジの位置についてレンズ歪および前記第２のグローバルモーションの移動量を補正したエッジの位置との、距離が０に収束するように解析する最適化処理を行うことにより、レンズ歪係数を算出するレンズ歪係数算出手段と、
前記フレーム画像それぞれを所定領域のブロックに分割し、前記基準となるフレーム画像のブロックと、それに対応する前記他のフレーム画像のブロックとの類似度を算出し、当該算出した類似度が所定の第２の閾値以下である場合に、前記他のフレーム画像のブロックを非剛体領域であると判定する非剛体領域判定手段と、
前記フレーム画像それぞれについて、前記レンズ歪係数算出手段が算出したレンズ歪係数に基づく補正を行った上で、前記非剛体領域判定手段が判定した非剛体領域のブロックに含まれる特徴点を対象とせず、前記グローバルモーション推定手段を介して前記第２のグローバルモーションを更新し第３のグローバルモーションを算出するとともに、前記フレーム画像それぞれの間において前記対応点となる前記エッジの周辺を示す所定領域の画像の類似度を算出し、当該算出した類似度が所定の第１の閾値を超えた場合に、前記撮影映像が三脚を利用した映像であると判定し、当該算出した類似度が前記所定の第１の閾値以下の場合に、前記撮影映像が三脚を利用した映像でないと判定する三脚利用判定処理を行う三脚利用判定手段と、
前記撮影映像が三脚を利用した映像であると判定された場合に、前記第３のグローバルモーションで示される移動量を用いて、前記カメラパラメータを算出する三脚用カメラパラメータ算出手段と、
前記撮影映像が三脚を利用した映像でないと判定された場合に、前記フレーム画像それぞれに含まれる対応する特徴点を解析して前記カメラパラメータを算出する手持ち用カメラパラメータ算出手段と、
前記三脚用カメラパラメータ算出手段により算出されたカメラパラメータ、または、前記手持ち用カメラパラメータ算出手段により算出されたカメラパラメータを、前記記憶手段に出力するカメラパラメータ出力手段と、
を備えることを特徴とするカメラパラメータ推定装置。 A camera parameter estimation device for estimating camera parameters of captured video shot by a shooting camera, the device comprising:
video acquisition means that acquires the captured video from storage means in which the captured video is stored;
global motion estimation means that performs global motion estimation processing in which feature points are extracted from each of the frame images constituting the acquired captured video, and corresponding points, each associating the same feature point, are searched for between the feature points extracted in a reference frame image and the feature points extracted in another frame image for which the motion of the shooting camera is to be evaluated, thereby estimating a first global motion indicating the movement amount of the entire screen between the reference frame image and the other frame image;
edge extraction means that extracts edges from each of the frame images;
neighboring edge fitting means that, in the reference frame image and for each edge extracted by the edge extraction means, determines a normal direction from information on the edges adjacent to that edge, determines the position of the nearest edge of the other frame image on the normal set in that direction, and calculates a second global motion using the edge movement amount obtained from the edge position in the reference frame image and the determined edge position in the other frame image;
lens distortion coefficient calculation means that, after removing erroneous corresponding points from the first global motion estimated by the global motion estimation means using the movement amount indicated by the second global motion, calculates a lens distortion coefficient by performing, on the edges extracted by the edge extraction means, optimization processing that makes the distance converge to 0 between the lens-distortion-corrected position of an edge detected in the reference frame image and the position of an edge detected in the other frame image corrected for both lens distortion and the movement amount of the second global motion;
non-rigid region determination means that divides each of the frame images into blocks of a predetermined region, calculates the similarity between a block of the reference frame image and the corresponding block of the other frame image, and determines that the block of the other frame image is a non-rigid region when the calculated similarity is less than or equal to a predetermined second threshold;
tripod usage determination means that performs tripod usage determination processing in which, after each of the frame images is corrected based on the lens distortion coefficient calculated by the lens distortion coefficient calculation means, the second global motion is updated via the global motion estimation means, excluding feature points contained in the blocks of the non-rigid region determined by the non-rigid region determination means, to calculate a third global motion, the similarity of the images of a predetermined region surrounding the edge serving as the corresponding point is calculated between the frame images, the captured video is determined to be video shot using a tripod when the calculated similarity exceeds a predetermined first threshold, and the captured video is determined not to be video shot using a tripod when the calculated similarity is less than or equal to the predetermined first threshold;
tripod camera parameter calculation means that calculates the camera parameter using the movement amount indicated by the third global motion when the captured video is determined to be video shot using a tripod;
handheld camera parameter calculation means that calculates the camera parameter by analyzing the corresponding feature points contained in each of the frame images when the captured video is determined not to be video shot using a tripod; and
camera parameter output means that outputs, to the storage means, the camera parameter calculated by the tripod camera parameter calculation means or the camera parameter calculated by the handheld camera parameter calculation means.
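The final branch of this claim, choosing between the tripod and handheld solvers, can be sketched in a few lines of Python. The claim compares "the calculated similarity" with the first threshold without saying how per-patch similarities are combined; averaging them is an assumption here, and `is_tripod_shot`, `estimate_camera_parameters`, and the two solver callables are hypothetical placeholders.

```python
from statistics import mean

def is_tripod_shot(patch_similarities, first_threshold):
    """Tripod if the (assumed: mean) patch similarity strictly exceeds the first threshold."""
    return mean(patch_similarities) > first_threshold

def estimate_camera_parameters(patch_similarities, first_threshold,
                               tripod_solver, handheld_solver):
    """Dispatch to the solver that matches the tripod usage determination."""
    if is_tripod_shot(patch_similarities, first_threshold):
        # Tripod branch: parameters come from the (third) global motion.
        return "tripod", tripod_solver()
    # Handheld branch: parameters come from analyzing corresponding feature points.
    return "handheld", handheld_solver()

# Placeholder solvers standing in for the two parameter calculation means.
mode, params = estimate_camera_parameters(
    [0.90, 0.95, 0.92], first_threshold=0.8,
    tripod_solver=lambda: {"pan": 1.5, "tilt": 0.0},
    handheld_solver=lambda: {"pose": "from feature point analysis"})
```

A similarity exactly equal to the threshold falls on the handheld side, matching the "less than or equal to" wording of the claim.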

前記三脚利用判定手段は、
前記フレーム画像それぞれについて、前記レンズ歪係数算出手段が算出したレンズ歪係数に基づく補正を行った上で、前記非剛体領域判定手段が判定した非剛体領域のブロックに含まれる特徴点を対象とせずに、前記エッジの位置の間の前記距離が０に収束するように解析する最適化処理を再度行い、当該最適化処理により求まる前記エッジ間の前記距離が所定の第３の閾値を超えるか否かを判定し、
前記所定の第３の閾値を超えた場合に、前記レンズ歪係数算出手段により直近で算出されたレンズ歪で前記フレーム画像を補正するとともに、前記非剛体領域判定手段が直近で判定した非剛体領域のブロックに含まれる特徴点を対象とせずに、前記グローバルモーション推定手段による前記グローバルモーション推定処理に戻り、前記三脚利用判定手段が、前記エッジ間の前記距離が前記所定の第３の閾値以下になるまで、前記グローバルモーション推定処理に戻る処理を繰り返し、前記エッジ間の前記距離が前記所定の第３の閾値以下になった場合に、前記三脚利用判定処理を行うこと
を特徴とする請求項５に記載のカメラパラメータ推定装置。 The camera parameter estimation device according to claim 5, wherein the tripod usage determination means,
after correcting each of the frame images based on the lens distortion coefficient calculated by the lens distortion coefficient calculation means, performs again, while excluding feature points contained in the blocks of the non-rigid region determined by the non-rigid region determination means, the optimization processing that makes the distance between the edge positions converge to 0, and determines whether the distance between the edges obtained by that optimization processing exceeds a predetermined third threshold, and,
when the predetermined third threshold is exceeded, corrects the frame images with the lens distortion most recently calculated by the lens distortion coefficient calculation means and returns to the global motion estimation processing by the global motion estimation means while excluding feature points contained in the blocks most recently determined to be a non-rigid region by the non-rigid region determination means, the tripod usage determination means repeating the return to the global motion estimation processing until the distance between the edges is less than or equal to the predetermined third threshold, and performing the tripod usage determination processing once the distance between the edges is less than or equal to the predetermined third threshold.
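The repeat-until-converged control flow of this claim can be sketched as a small loop. Everything here is illustrative: `run_pass` is a hypothetical callable standing in for one round of global motion estimation, edge fitting, lens distortion fitting, and non-rigid region detection, and the `max_passes` guard is an added assumption, since the claim itself places no limit on the number of repetitions.

```python
def refine_until_converged(run_pass, third_threshold, max_passes=20):
    """Repeat the estimation pass until the edge distance is <= the third threshold.

    run_pass(state) must return (new_state, edge_distance); state is None on the
    first pass and otherwise carries the latest distortion and non-rigid blocks.
    """
    state = None
    for pass_count in range(1, max_passes + 1):
        state, edge_distance = run_pass(state)
        if edge_distance <= third_threshold:
            # Converged: the tripod usage determination would run from here.
            return state, edge_distance, pass_count
    raise RuntimeError("edge distance did not fall below the third threshold")

def toy_pass(state):
    """Stand-in pass whose residual edge distance halves on every repetition."""
    distance = 8.0 if state is None else state / 2
    return distance, distance

final_state, final_distance, passes = refine_until_converged(toy_pass, third_threshold=1.0)
```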

コンピュータを、請求項１乃至６のいずれか一項に記載のカメラパラメータ推定装置として機能させるためのカメラパラメータ推定プログラム。 A camera parameter estimation program for causing a computer to function as the camera parameter estimation device according to any one of claims 1 to 6.