JP2004062757A

JP2004062757A - Image processing method and method for estimating imaging part position and attitude

Info

Publication number: JP2004062757A
Application number: JP2002223281A
Authority: JP
Inventors: Kazuki Takemoto; 武本　和樹; Shinji Uchiyama; 内山　晋二
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2002-07-31
Filing date: 2002-07-31
Publication date: 2004-02-26
Anticipated expiration: 2022-07-31
Also published as: JP4194316B2

Abstract

PROBLEM TO BE SOLVED: To improve the accuracy with which the position and attitude of an imaging means are estimated when conditions change, by automatically adjusting a detection parameter for an imaged landmark in accordance with changes in the brightness and color of the captured image.

SOLUTION: An information processing method for estimating the position and attitude of an imaging means located in a three-dimensional space comprises: an acquisition step of acquiring position and attitude information of the imaging means obtained by a position and attitude measuring part; a characteristic point detection step of using a detection condition to detect a characteristic point from an image obtained by imaging, with the imaging means, a real space containing characteristic points whose three-dimensional positions are already known; a correction step of correcting the position and attitude information on the basis of the position of the characteristic point; and an optimization step of automatically optimizing the detection condition from the image.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、撮像画像における特徴部の検出条件を、撮影画像から自動的に調整するものに関する。
【０００２】
【従来の技術】
現実空間を撮像する撮像部の外部パラメータ（位置姿勢）を計測、決定する方法において、現実空間に３次元位置姿勢が既知である特徴点（ランドマーク）を配置しておき、撮像部によって撮像されたいくつかのランドマークの撮像面上における画像情報を基準として、撮像部の位置姿勢を求める方法がある。例えば、現実空間を撮像する撮像部（ビデオカメラ等）の外部パラメータ（位置姿勢）を計測、決定する方法において、
（１）画像情報のみによる推定方法
（例えば、加藤，Ｍａｒｋ，浅野，橘：「マーカ追跡に基づく拡張現実感システムとそのキャリブレーション」，日本バーチャルリアリティ学会論文集　Ｖｏｌ．４　Ｎｏ．４，　ｐｐ．６０７−６１６，１９９９に記載。）、
（２）６自由度位置姿勢センサと画像とのハイブリッドな推定方法
（例えば特開平１１−１３６７０６号公報、特開２０００−３４７１２８号公報に記載）
（３）画像と加速度センサのハイブリッドな推定方法
（例えば、横小路，菅原，吉川：「画像と加速度計を用いたＨＭＤ上での映像の正確な重ね合わせ」，日本バーチャルリアリティ学会論文集　Ｖｏｌ．４　Ｎｏ．４，ｐｐ．５８９−５９８，１９９９）
（４）ジャイロセンサと画像とのハイブリッドな位置合わせなどの推定方法
（例えば、藤井，神原，岩佐，竹村，横矢：「拡張現実のためのジャイロセンサを併用したステレオカメラによる位置合わせ」，信学技報　ＰＲＭＵ９９−１９２，　Ｊａｎｕａｒｙ　２０００）
等が知られている。また、画像による位置合わせや、６自由度センサと画像のハイブリッドな方法では、さらに手法の異なる様々な方法がある。
【０００３】
【発明が解決しようとする課題】
しかし、従来のような撮像部の位置姿勢推定装置は、ランドマークの検出精度に大きく依存しており、状況によっては、ランドマークの検出精度が著しく低下し、現実空間の撮像部の位置姿勢と、位置姿勢推定装置により出力された撮像部の位置姿勢とのずれの大きさが問題になる場合があった。ランドマークの検出精度が落ちる要因は、例えば、照明条件の変化、撮像部のシャッタースピード、ホワイトバランス、ゲインなどの設定変化、もしくはこれらの自動調整機能などが挙げられる。従来の位置姿勢推定装置においては、ランドマークの検出処理における検出パラメータが固定されていることから、前述した画像の変化に対応できないことがランドマークの検出精度を低下させている主な原因である。
【０００４】
この検出精度の低下に対して、従来の位置姿勢推定装置においては、なるべく撮像する空間の照明条件が変わらないように工夫し、さらに時間や場所ごとに最適な検出パラメータを手動で設定して対処する必要があった。
【０００５】
本発明は、このような課題に鑑みて発明されたものであり、撮像画像における特徴部の検出条件を自動的に調整することにより、常に高精度に特徴部を検出可能にすることを目的とする。
【０００６】
【課題を解決するための手段】
上記目的を達成するために、本発明は以下の構成を有することを特徴とする。
【０００７】
本願請求項１の発明は、３次元空間内に位置する撮像手段の位置姿勢を推定する情報処理方法であって、位置姿勢測定部によって得られた前記撮像手段の位置姿勢情報を取得する取得ステップと、３次元位置が既知である特徴点の存在する現実空間を前記撮像手段によって撮像することにより得られた映像から、検出条件を用いてに前記特徴点を検出する特徴点検出ステップと、前記特徴点の位置に基づいて、前記位置姿勢情報を補正する補正ステップと、前記映像から前記検出条件を自動的に最適化する最適化ステップを有することを特徴とする。
【０００８】
本願請求項１１の発明は、撮影手段によって撮影された映像内の特徴部を検出し、前記検出結果を用いて前記撮影手段の位置姿勢情報を求める際に用いる、該特徴部の検出条件を調整する情報処理方法であって、３次元位置および形状が既知である特徴部が存在する現実空間を前記撮像手段によって撮像することにより得られた映像から、検出条件を用いて該特徴点の領域を検出し、前記特徴点の３次元位置および形状から、前記映像における特徴領域を算出し、前記検出された特徴領域と前記算出された特徴領域を比較し、前記検出条件を調整することを特徴とする。
【０００９】
【発明の実施の形態】
＜第１の実施形態＞
以下、添付図面を参照して、本発明を適用した好適な実施形態に従って詳細に説明する。
本実施形態では、従来例の（２）６自由度位置姿勢センサと画像とのハイブリッドな推定方法に本発明を適用した例を示す。
【００１０】
＜６自由度位置姿勢センサの誤差＞
ここで、６自由度位置姿勢センサとは、Ｐｏｌｈｅｍｕｓ社の３ＳＰＣＡＣＥＦＡＳＴＲＡＫやＡｓｓｅｎｓｉｏｎ　Ｔｅｃｈｎｏｌｏｇｙ社のＦｌｏｃｋｏｆ　Ｂｉｒｄｓなどの磁気センサ、または、Ｎｏｒｔｈｅｒｎ　Ｄｉｇｉｔａｌ社のＯＰＴＯＴＲＡＫなどの光学式センサなど、計測対象の位置と姿勢を計測する機器を指す。この６自由度位置姿勢センサ１４０は、撮像部１１０に固定されていることから、６自由度センサ１４０の計測値により撮像部の位置姿勢が取得できる。しかしながら、センサには誤差があり、状況によっては、撮像部１１０の位置姿勢を精度良く求めることができない。例えば、磁気センサを用いた６自由度位置姿勢センサの測定値は周囲の磁場に影響されるので、計測部近辺に金属物質があると誤差が大きくなり、結果として撮像部１１０の位置姿勢と、６自由度センサによる出力のずれ量（誤差）が増える。
【００１１】
＜特開平１１−１３６７０６号公報における解決法＞
このような状況に鑑みて特開平１１−１３６７０６号公報では、画像情報を用いて、６自由度センサ誤差を補正する方法を述べている。図７は上述の従来の方法を説明する模式図である。この方法では、現実空間にマーカと呼ばれる、画像処理によって検出しやすい特定の色や形状を有する物体、例えばシールを、現実空間中の物体に貼り付けてランドマーク（絶対位置基準）として利用する。また、ランドマーク１００としては特別に配置したもの以外にも、現実空間中の特徴的な物体や点を利用することも可能である。点Ａは撮像部１１０の位置姿勢に基づいて予測されるランドマーク１００の位置、点Ｂはこのランドマーク１００の実際の位置、点Ｃは撮像部１１０の視点位置を示す。なお、点Ａ、点Ｂが示す位置はカメラ座標系における位置であって、点Ｃはカメラ座標系の原点とする。また、点Ｐは撮像面上における点Ａの位置、点Ｑは撮像面上における点Ｂの位置を示す。
【００１２】
このとき、撮像面においてカメラの位置姿勢により予測されるランドマークの位置Ａと実際の位置Ｂとでは、前述のようにずれ（誤差）がある。このずれを、撮像部によって撮像されたランドマーク１００の撮像面上での位置Ｑを現実空間の基準とし、撮像面上でのランドマーク予測位置Ｐに重ね合わせるような変換行列ΔＭ_Ｃを求める。この変換行列ΔＭ_Ｃを６自由度センサ１４０から得られる撮像部のビューイング変換行列Ｍ_Ｃに積算することにより、誤差補正を行った撮像部のビューイング行列Ｍ_Ｃ´が得られる。ここで、Ｍ_Ｃ´の逆行列が撮像部の位置姿勢成分を含む。ここでＭ_Ｃ´を
【外１】

とすると、位置ｔは（Ｍ_１４，Ｍ_２４，Ｍ_３４）のベクトルで得られ、回転Ｒは
【外２】

の行列によって得られる。つまり、撮像部１１０の位置姿勢を求めるためには、撮像部のビューイング変換行列Ｍ_Ｃ´が求められればよい。
【００１３】
このように、６自由度センサ１４０の出力を基本とし、さらに現実空間の絶対位置基準により位置合わせされた位置姿勢は、６自由度センサ１４０のみで計測する位置姿勢よりも、実際の撮像部１１０の位置姿勢に対するずれ量は安定して減少する。
【００１４】
しかし、ランドマーク１００の位置が後述する位置姿勢補正部において絶対位置基準として利用されるため、この基準が正確でないと、撮像部の位置姿勢は正確に推定できない。このことから、いかにして撮像画像に映ったランドマーク１００を最適な検出パラメータによって検出するかが解決するべき課題である。
【００１５】
＜特開平１１−１３６７０６号公報の手法に本発明を適応した例＞
前述の特開平１１−１３６７０６号公報の手法に本発明を適応することにより、撮像部で得られる撮像画像の明度、色変化に伴い、検出パラメータを自動的に最適化することで、ランドマークの検出性能を向上させることが可能となる。ここで、最適化とは、「最適な状態に変化させること」と定義する。よって、１回の最適化処理を行った検出パラメータは、必ずしも撮像画像において、ランドマークを検出するのに最適な状態になるのではない。複数回の最適化処理によって、最適な状態に成り得る。
【００１６】
図１は、特開平１１−１３６７０６号で述べられているような、６自由度センサと画像のハイブリッドな位置合わせ手法に、本発明を適用する場合における撮像部位置姿勢推定装置の構成図を表す。図１に従って撮像部位置姿勢の推定処理手順を示す。
【００１７】
図５は装置全体の処理の流れを示すフローチャートを表している。まず、ステップＳ５００撮像部により、ランドマークを配置している現実空間を撮像する。次にステップＳ５１０において、後述するランドマーク検出処理を行う。次にステップＳ５２０に移り、６自由度センサによる計測を行う。さらに、ステップＳ５３０で、計測値と撮像画像から検出されたランドマーク位置を用いて撮像部の位置姿勢を推定する。次に、ステップＳ５４０において、検出パラメータの最適化を行う。検出パラメータの最適化が終了した時点で、ステップＳ５５０に移り、終了命令があれば終了し、なければステップＳ５００に戻り、更新された撮像画像を基に再度処理を行う。
【００１８】
以下にステップＳ５１０からステップＳ５４０までの各処理の詳細を述べる。
【００１９】
＜１．ランドマーク検出処理＞
まず、現実空間の既知の位置に配置されているランドマーク１００を撮像部１１０で撮像し、撮像画像の取得部１２０に格納する。その撮像画像からランドマーク１００の検出を行い、観測座標での２次元位置Ｑ（ｘ_ｑ，ｙ_ｑ）を得る。
【００２０】
ここで、撮像部１１０から得られた画像におけるランドマーク１００の検出方法については特に限らないが、例えば以下のような例が挙げられる。現実空間に図２中で示すような赤色ランドマーク１００を配置している場合は、撮像画像中の注目画素値（Ｒ，Ｇ，Ｂ）の特徴量Ｉ_ｓを
Ｉ_ｓ＝Ｒ／（（Ｇ＋Ｂ）／２）　　（式３）
の計算式を用いて算出する。また、緑色のランドマーク１００の場合は、
Ｉ_ｓ＝Ｇ／（（Ｒ＋Ｂ）／２）　　（式４）
であり、青色のランドマークの場合は、
Ｉ_ｓ＝Ｂ／（（Ｒ＋Ｇ）／２）　　（式５）
の計算式で算出する。本発明では、ランドマークの色を前述の３色に限るものではなく、任意の色を検出する場合においても適応可能である。例えば、任意の色の検出において、画素値Ｒ，Ｇ，ＢをＹＣｂＣｒフォーマットの画素に変換し、Ｃｂ、Ｃｒの色領域に対応するランドマーク検出を行い、Ｃｂ、Ｃｒの指定領域に関するパラメータに対して最適化を行う処理にも適応可能である。この場合、領域を指定するパラメータは、ＣｂＣｒ空間中における、楕円の領域を指し、中心位置を表すθ、ｒと、各軸を表すＲａ、Ｒｂから構成される。
【００２１】
このランドマーク検出処理においては、検出パラメータ調整部１９５により調整された検出パラメータである閾値Ｔ_ｃを入力し、特徴量Ｉ_ｓがＴ_ｃを超えた場合は、注目画素がマーカ領域内であると判定する。これを、撮像画像の全ての画素に対して適応し、検出された特徴領域は、個々にラベル付けされる。さらに、ラベル付けされた領域のうち、ラベルを構成する画素が、ランドマーク画素数閾値Ｎを越えたものをランドマーク領域の候補とする。ランドマーク画素数閾値Ｎは、検出時のノイズがラベル付けされていた場合、ほとんどのノイズの画素数が微小であるを利用して、ある閾値以下のものをランドマークとして認識しないようにしておく。この処理により、検出ノイズを減少させ、位置姿勢補正部１６０の処理において検出ノイズをランドマークと誤認識することをある程度防ぐことが可能である。このランドマーク領域の重心位置をランドマーク１００の２次元位置とし、観測座標のランドマーク位置Ｑ（ｘ_ｑ，ｙ_ｑ）とする。ここで、ランドマーク１００が撮像画像に複数含まれる場合は、個々のランドマークに対して上述の処理を１回ずつ行う。このときの特徴量Ｉ_ｓとランドマーク画素数閾値Ｎも個々のランドマーク毎に記憶しておく。例えばランドマークが３つ撮像されている場合は、Ｔ_ｃ＝（Ｔ_ｃ１，Ｔ_ｃ２，Ｔ_ｃ３）Ｎ＝（Ｎ_１，Ｎ_２，Ｎ_３）のようにランドマーク毎の値を格納する。ここで撮像画像からランドマークを検出した検出画像Ｕ_ｄ（例えば、本実施形態においては、対応画素の特徴量Ｉ_ｓを画素値として格納した画像）と、検出パラメータ（本実施形態においては閾値Ｔ_ｃ、ランドマーク画素数閾値Ｎから構成される）を状態記憶部１９０に保存する。
【００２２】
＜２．６自由度センサによる計測処理＞
一方で、６自由度センサ１４０により、撮像部１１０の位置姿勢を検出する。本実施形態では、６自由度センサとしてＰｏｌｈｅｍｕｓ社の３ＳＰＣＡＣＥ　ＦＡＳＴＲＡＫ（以下、ＦＡＳＴＲＡＫ）を用いている。ＦＡＳＴＲＡＫセンサのレシーバ１４０Ａが撮像部１１０の位置姿勢の動きに追従するように固定する。例えば、図２は本実施形態の処理を実行中時の状態を表す図であるが、図２中の撮像部１１０とＦＡＳＴＲＡＫレシーバ１４０Ａのように、金属以外の棒状のもので２つを固定し、ＦＡＳＴＲＡＫレシーバ１４０付近のの交流磁界になるべく影響を与えない方法で固定する。ＦＡＳＴＲＡＫセンサのトランスミッタ１４０Ｂから発生される交流磁界をレシーバ１４０Ａが受ける。位置姿勢計測部１５０によりレシーバ１４０Ａが受けた交流磁界の変化から、撮像部１１０の位置姿勢を計測する。
【００２３】
＜３．撮像部位置姿勢の推定処理＞
このようにして得られた撮像部１１０の位置姿勢計測値と、ランドマーク１００の撮像部１１０で撮像した画像上の２次元位置とを位置姿勢補正部１６０に入力する。図３は位置姿勢補正部１６０の処理を示す図である。まず、位置姿勢計測部１５０より入力された計測値から撮像部のビューイング変換行列Ｍ_Ｃを生成する（ステップＳ３００）。さらに、ビューイング変換座標Ｍ_Ｃとランドマーク配置・形状記憶部１７０に保存されている世界座標系上での各ランドマークの３次元位置と、既知である撮像部１１０の理想的透視変換行列から、各ランドマークの観測座標予測値Ｐ_ｉ（ｘ_ｐｉ，ｙ_ｐｉ）を算出する（ステップＳ３１０）。次に、ステップＳ３２０において、渡されたランドマーク観測予測座標値Ｐに基づいて、現在観測しているランドマーク、すなわち補正の基準となるランドマークを判別する。本実施形態における判別方法は、注目観測予測座標値Ｐ_ｊ（ｘ_ｐｊ，ｙ_ｐｊ）から、観測しているランドマークの観測座標値Ｑ_ｉ（ｘ_ｑｉ，ｙ_ｑｉ）との距離が近いものを対応付けする方法を採用している。すなわち、各ランドマークの観測予測座標Ｐと観測座標Ｑの組み合わせのうち、（ｘ_ｑｉ−ｘ_ｐｊ）^２＋（ｙ_ｑｉ−ｙ_ｐｊ）^２が最小になるランドマーク同士を対応付けする。ただし、本発明においては、この判別方法を限定するものではなく、６自由度センサから得られる観測予測座標のランドマークを撮像画像中のランドマークと正しく対応付けが可能な方法であれば適応可能である。ランドマークの対応付け処理において１点でも対応付けが成功していれば、ステップＳ３３０では、ステップＳ３１０で演算されたランドマークの観測予測座標値Ｐ（ｘ_ｐ，ｙ_ｐ）とランドマーク検出部１３０が検出したランドマークの観測座標値Ｑ（ｘ_ｑ，ｙ_ｑ）との差異に基づいて、位置姿勢計測部１５０によって得られた撮像部１１０の位置姿勢を表すビューイング変換行列Ｍ_Ｃを補正するためのΔＭ_Ｃを求める。さらに、ステップＳ３４０において、ステップＳ３３０で求めたΔＭ_ＣとステップＳ３００で求めた計測値からのカメラのビューイング変換行列Ｍ_Ｃを積算することにより補正後の撮像部視点のビューイング変換行列Ｍ_Ｃ´を得ることができる。
【００２４】
Ｍ_Ｃ´＝ΔＭ_Ｃ・Ｍ_Ｃ　　（式６）
ステップＳ３２０において、もし、ランドマークの対応付けができない場合は、例えば、ΔＭ_Ｃを単位行列に設定することにより、ＦＡＳＴＲＡＫセンサ出力から得たＭ_Ｃを最終的に出力する視点のビューイング変換行列Ｍ_Ｃ´としてもよい。
【００２５】
＜４．ランドマーク領域生成処理＞
位置姿勢補正部１６０を通して、ランドマーク配置・形状記憶部１７０から各ランドマークの配置情報と形状情報をランドマーク領域生成部１８０に入力する。
【００２６】
まず、形状情報とは、例えば，図２で示すような円形の赤色ランドマーク１００（図８Ａ）を用いている場合は，図８Ｂのように、あらかじめ３Ｄモデル化しておき、複数の頂点８００から構成され擬似的にランドマークの形状を表現するものである。本発明は、円形のランドマークに制限するものではなく、画像処理によって検出しやすい特定の色や形状を有する物体であれば適応可能である。また、ランドマーク１００としては特別に配置したもの以外にも、現実空間中の特徴的な物体や点を利用することも可能である。さらに、複数のランドマークが個々に違う形状をしていても適応可能である。ここで、３次元空間中のモデル座標系での頂点８００の座標値（ｘ_ｍ，ｙ_ｍ，ｚ_ｍ）とする。ここで、この頂点情報は各頂点の結合情報を持つ。本発明においては、３次元モデルが結合情報を持つことに限定するものではなく、後述するランドマーク領域生成部１８０において、適切なランドマーク領域情報を再構築できる方法であれば適応可能である。
【００２７】
さらに、配置情報とは、現実空間に配置されたランドマークの重心位置を表す頂点８１０の世界座標系における位置姿勢を表す行列Ｌを格納している。この頂点８００とランドマークの重心位置８１０における位置姿勢情報Ｌは、それぞれランドマーク配置・形状記憶部１７０に記憶されている。ランドマーク領域生成部１８０では、このランドマーク形状情報８００を、配置情報Ｌと補正された撮像部視点のビューイング変換行列Ｍ_Ｃ´によって変換することにより、各頂点８００の視点座標系上での位置（ｘ_ｍ’，ｙ_ｍ’，ｚ_ｍ’）を得る。
【００２８】
【外３】

【００２９】
さらに、既知である視点座標系から観測座標系へのプロジェクション行列Ｃによって、撮像面上での２次元位置Ｖ（ｘ_ｖ，ｙ_ｖ）を得る。
【００３０】
【外４】

【００３１】
【外５】

【００３２】
次に、各頂点８００の２次元位置Ｖから、ランドマーク形状情報が持つ頂点の結合情報を基にランドマーク形状の各辺に対して２次元直線の式を算出し、アウトライン９００を生成する。このアウトライン９００は、観測座標において各頂点を結ぶ直線式群によって構成される。ランドマーク領域９１０はこのアウトライン９００とランドマーク領域内の点Ｇから構成される。領域内の点Ｇは、例えば、円形のランドマーク１００であれば、ランドマークの重心位置８１０を領域内の点とすればよい。生成されたランドマーク領域９１０は状態記憶部１９０に記憶される。図９は、図２の撮像部位置姿勢において、ランドマーク１００のうちの一つを撮像したときの撮像画像に、アウトライン９００と各頂点８００を重ねて表示した画像を表す。この図中におけるアウトライン９００は、位置姿勢補正部１６０により推定された撮像部１１０の位置姿勢を基に位置と形状が決定されるので、位置姿勢補正部１６０において、ランドマークの対応が取れている状態であれば、撮像画像におけるランドマーク１００の位置に表示される。
【００３３】
＜５．検出パラメータの調整処理＞
図４は、検出パラメータ調整部１９５の処理の詳細手順を表す図である。検出パラメータ調整部１９５は、状態記憶部１９０から、ランドマーク領域９１０、検出画像Ｕ_ｄ、検出パラメータである閾値Ｔ_ｃと、ランドマーク画素数閾値Ｎを受ける。
【００３４】
まず、ステップＳ４１０において、ランドマーク領域９１０を完全に含む処理領域１０００を生成する。この処理領域１０００は、例えば、ランドマーク領域内の点８１０を重心とし、全てのランドマーク領域を含む矩形領域でもよい。図１０はこの処理領域を表す。この処理領域１０００は、撮像されているランドマーク毎に生成する。例えば、図１４に示すように、３つのランドマークが撮像されている場合の処理領域１０００は、個々のランドマーク周辺にそれぞれ生成される。本発明は、この処理領域１０００の生成方法を制限するものではなく、ランドマーク領域９１０と検出画像上の検出画素の形状を比較できるものであれば適応可能である。
【００３５】
次に、ステップＳ４２０において、検出画像Ｕ_ｄ内のランドマーク検出画像とランドマーク領域を比較するための理想検出画素パターン１１１０を作成する。まず、処理領域と同画素で、全ての画素が０で初期化されている画像バッファ領域１１００を作成する。次に、このバッファ領域１１００でランドマーク領域内に含まれる画素にマーキングを行う。このマーキングは例えば、領域内の全ての画素値１に設定する方法でもよい。この様子を図１１中のＡ，Ｂに表す。図１１Ａに表されるようなランドマーク領域９１０がバッファ領域１１００上にあったとすると、マーキング処理によって図１１Ｂのように、ランドマーク領域内の画素がマーキングされる。ここで、図１１Ｂの図では黒色の画素がマーキングされていることを表す。本実施形態では、画素上に領域を含む場合であっても、この領域が画素面積の半分以上を占めない限りマーキングしない。ただし、本発明は、このマーキング処理に制限するものではなく、検出画素パターンと比較して適切な検出パラメータを設定できる方法であれば適応可能である。ここで、マーキングされている領域を理想検出画素パターン１１１０と呼ぶ。
【００３６】
次に、ステップＳ４３０において、後述するランドマークの検出画素パターン１２００と理想検出画素パターン１１１０を比較する。
【００３７】
まず、検出画像Ｕ_ｄ内の処理領域に含まれるランドマークの検出画素パターン１２００を生成する。ここで、画像バッファ領域１１００と同じ画像サイズで、画素値が０で初期化されているバッファ領域１２２０を別に用意する。このバッファ領域１２２０の画素値には、対応する検出画像Ｕ_ｄの画素値である特徴量Ｉ_ｓが検出パラメータの閾値Ｔ_ｄよりも大きい場合に１が入力される。このようにして、生成したパターンを検出画素パターン１２００とする。図１２は、ある時点において同一ランドマークを注視している場合の、検出画像Ｕ_ｄに含まれるランドマークの検出画素パターン例を表す。さらに、図１２Ａ、Ｂは検出パラメータが最適でないために、ランドマークの検出画素パターン１２００が理想検出画素パターン１１１０と異なっている状態を表す。図１２Ａに関しては、ランドマークの領域上の画素特徴量Ｉ_ｓに対して、閾値Ｔ_ｃが大きすぎるため、本来はランドマーク領域内であるはずの画素が検出できていない状態を表す。逆に、図１２Ｂでは、閾値Ｔ_ｃがランドマークの領域上の画素特徴量Ｉ_ｓに対して小さすぎるため、ランドマーク領域外の画素までランドマークとして検出している状態を表している。ここで、図１２Ｂにおいて理想検出画素パターン１１１０の内側に含まれない検出画素で、かつ、検出画素パターン１２００以外の画素を検出ノイズ１２１０と呼ぶ。この検出ノイズ１２１０は、ランドマーク検出処理において、ランドマーク画素数閾値Ｎよりもラベル画素数が多く、ランドマーク領域の候補だと判断された領域である。このステップＳ４３０における比較方法は、理想検出画素パターン１１１０と検出画素パターン１２００の検出画素数を利用する。まず、理想検出画素パターン内に含まれる検出画素パターン１２００の画素数ｇを認識し、理想検出画素パターンが内包する画素数ｉとの差分ｄを計算する。
【００３８】
ｄ＝ｇ−ｉ　　（式１０）
さらに、処理領域１０００内の理想検出画素パターン１１１０に含まれない検出画素パターン１２００の画素（はみ出し画素）の合計数をｅ，検出ノイズ１２１０の画素の合計数をｎとする。
【００３９】
図１２の例であれば、図１２Ａはｄ＝１２、ｅ＝０、ｎ＝０、図１２Ｂはｄ＝０、ｅ＝１０、ｎ＝７となる。
【００４０】
次に、ステップＳ４４０において、検出パラメータの最適化を行う。この処理の詳細を図６のフローチャートを用いて説明する。まず、ステップＳ６００において、ｄとｅが０であるかを検査する。両方とも０であれば、理想検出画素パターンと検出画素パターンが一致していることになるので、検出パラメータは最適な状態であると判断し、この処理を終了してランドマーク検出処理に戻る。ランドマーク検出処理においては、現在の検出パラメータによって新たな撮像画像のランドマーク検出を行う。ｄまたはｅのどちらかが正の場合は、ステップＳ６１０に移る。
【００４１】
ステップＳ６１０においては、理想検出画素パターン１１１０と検出画素パターン１２００の画素数の差分であるｄが正の場合は、ステップＳ６２０へ移る。もし、ｄが０である場合は、ステップＳ６５０に移る。
【００４２】
ステップ６２０では、現在のランドマーク画素数閾値Ｎが現在の理想検出画素数ｉよりも小さいことを判定する。この判定は、本来ランドマーク上にある画素をランドマーク画素数閾値Ｎによって除去しないように、Ｎの値を理想検出画素数ｉよりも必ず小さくする処理である。本実施形態においては、常にＮとｉの間に３画素分の閾を設けており、Ｎが必ずｉよりも３画素分小さくなるように設定される。もし、Ｎがｉ＋３よりも小さい場合は、ステップ６３０に移る。もし、Ｎがｉ＋３よりも大きい場合は、ステップＳ６４０に移る。
【００４３】
ステップ６３０では、ｄが正であることから、理想検出画素パターン内のランドマーク領域に未検出の画素があるので、未検出であった画素が検出できるように検出閾値Ｔ_ｃを１減少させる。
【００４４】
ステップＳ６４０では、ランドマーク画素数閾値Ｎがｉ＋３よりも大きいので、Ｎにｉ−４を代入し、ランドマークを的確に検出できるように設定する。また、同時に未検出であった理想検出画素パターン上の画素が検出できるように検出閾値Ｔ_ｃを１減少させる。
【００４５】
ステップＳ６５０ではｄが０であるので、理想検出画素パターン内の検出画素パターンは全て検出さていることになるが、ｅが正であることから、理想検出画素画素１１１０外の画素がランドマーク１００内の画素として検出されている状態である。ここでは、まず検出ノイズ１２１０の画素数ｎが正であるかどうかを判定し、検出ノイズ１２１０の有無を調べる。検出ノイズ１２１０がない場合は、ステップＳ６６０に移る。もし検出ノイズがある場合は、ステップＳ６７０に移り、検出ノイズを減少させる処理を行う。
【００４６】
ステップＳ６６０では、はみ出し画素が検出されないように検出閾値Ｔ_ｃを１増加させる。
【００４７】
ステップＳ６７０においては、ステップＳ６２０と同様に、ランドマーク画素数閾値Ｎが現在の理想検出画素数ｉよりも小さいことを判定する。もし、Ｎがｉ＋３よりも小さい場合は、ステップＳ６８０に移り、はみ出し画素が検出されないように検出閾値Ｔ_ｃを１増加させる。また、Ｎがｉ＋３よりも大きい場合は、ステップＳ６７０に移り、はみ出し画素が検出されないように検出閾値Ｔ_ｃを１増加させると共に、ランドマーク画素数閾値Ｎを１増加させ、検出ノイズ画素数ｎを減少させる。これらの処理が終了した場合は、個々の検出パラメータを保存し、ランドマーク検出処理部に戻り、更新された検出パラメータによって新たな撮像画像からランドマーク検出を行う。
【００４８】
本発明は、図６で示した検出パラメータの最適化処理に制限するものではなく、ランドマークが的確に検出できるような最適化処理であれば適応可能である。
【００４９】
＜変形例１＞
本発明は、上述の実施形態にのみ適用されるものではない。上述の実施形態においては、検出パラメータの最適化処理は、１つの撮像画像に対して、撮像されているランドマークの数と同じ回数だけ行われるが、図１３が示す処理のように、検出パラメータをランドマークの数だけ最適化処理を行ったあと、さらに、同一の撮像画像を用いて同様の最適化処理を行うことで、前述した実施形態の最適化処理に比べて、より最適な状態に早く到達する。
【００５０】
この例の処理の流れを図１３を用いて説明する。
【００５１】
まず、ステップＳ５００において、ランドマークが配置されている現実空間を撮像する。次にステップＳ５１０において、配置されたランドマークを撮像画像中から抽出する。次にステップＳ５２０において、撮像部１１０の位置姿勢を６自由度センサによって計測する。次にステップＳ５３０では、６自由度センサによる計測値と撮像画像から検出されたランドマークの画像上での位置を用いて、撮像部１１０の位置姿勢を推定する。次にステップＳ５４０においては、ステップＳ５１０において使用した検出パラメータを、画像上にあるランドマーク領域９１０毎に、それぞれ最適化処理を行う。次にステップＳ１３００に移るが、同一の撮像画像に対して、１回の最適化処理を終了した時点では必ずステップＳ５１０に戻り、前回と同じ撮像画像に対して、最適化処理を施された検出パラメータでランドマーク検出を行う。さらに、前回と同様にＳ５２０、Ｓ５３０、Ｓ５４０を実行する。次に、ステップＳ１３００において、同一の撮像画像に対して、２回以上の最適化処理が終了していることを確認した場合はステップＳ１３１０に移る。ステップＳ１３１０において、最適化処理中で算出したｄ、ｅ、ｎの減少率Ｃを判定する。例えば、減少率Ｃが予め定めた閾値Ｃ_Ｔを越えた時点でステップＳ５５０に移ってもよいし、減少率の微分値が予め定めた閾値Ｄ_Ｔを下回った場合にステップＳ５５０へ移ってもよい。ステップＳ５５０において、終了命令が発せられなければ、ステップＳ５００に移り、撮像画像の更新を行う。終了命令が発せられれば一連の処理を終了する。
【００５２】
＜変形例２＞
上述の実施形態においては、Ｒ、Ｇ、Ｂの３色に対して、マーカの検出を行っている。しかし、本発明は、上述の実施形態にのみ適用されるものではない。例えば、輝度を特徴とするランドマークに対しても本発明を適応可能である。以下に、この輝度を特徴とするランドマークに対して、本発明を適応した例を挙げる。上述の実施形態における＜２．ランドマーク検出処理＞において、ＲＧＢの画素値から輝度Ｙを検出し、この輝度Ｙに対してマーカ検出を行い、この輝度Ｙに対する閾値を検出パラメータとして最適化を行う。
【００５３】
ここで、輝度Ｙの算出方法を以下に示す。
【００５４】
Ｙ＝０．２９９×Ｒ＋０．５８７×Ｇ＋０．１１４×Ｂ　　　（式１０）
このＹから特徴量Ｉ_ｓを算出し、閾値Ｔ_ｃが特徴量Ｉ_ｓを越えた場合は、注目画素がマーカ領域内であると判定する。例えば、ランドマークが黒い物体の場合は、
Ｉ_ｓ＝Ｙ_ｍａｘ−Ｙ　　　（式１１）
として、特徴量Ｉ_ｓを算出する。ここでＹ_ｍａｘは輝度Ｙの最大値とする。さらに、閾値Ｔ_ｃは輝度の特徴量に関する閾値である。
【００５５】
この処理により、輝度を特徴としたランドマークに対して、検出パラメータの最適化処理を行うことが可能である。
【００５６】
（他の実施形態）
前述した実施形態の機能を実現する様に各種のデバイスを動作させる様に該各種デバイスと接続された装置あるいはシステム内のコンピュータに、前記実施の形態の機能を実現するためのソフトウエアのプログラムコードを供給し、そのシステムあるいは装置のコンピュータ（ＣＰＵあるいはＭＰＵ）を格納されたプログラムに従って前記各種デバイスを動作させることによって実施したものも本発明の範疇に含まれる。
【００５７】
この場合、前記ソフトウエアのプログラムコード自体が前述した実施の形態の機能を実現することになり、そのプログラムコード自体、及びそのプログラムコードをコンピュータに供給するための手段、例えばかかるプログラムコードを格納した記憶媒体は本発明を構成する。
【００５８】
かかるプログラムコードを格納する記憶媒体としては例えばフロッピー（Ｒ）ディスク、ハードディスク、光ディスク、光磁気ディスク、ＣＤ−ＲＯＭ、磁気テープ、不揮発性のメモリカード、ＲＯＭ等を用いることが出来る。
【００５９】
またコンピュータが供給されたプログラムコードを実行することにより、前述の実施形態の機能が実現されるだけではなく、そのプログラムコードがコンピュータにおいて稼働しているＯＳ（オペレーティングシステム）、あるいは他のアプリケーションソフト等と共同して前述の実施形態の機能が実現される場合にもかかるプログラムコードは本発明の実施形態に含まれることは言うまでもない。
【００６０】
更に供給されたプログラムコードが、コンピュータの機能拡張ボードやコンピュータに接続された機能拡張ユニットに備わるメモリに格納された後そのプログラムコードの指示に基づいてその機能拡張ボードや機能格納ユニットに備わるＣＰＵ等が実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も本発明に含まれることは言うまでもない。
【００６１】
【発明の効果】
以上説明したように、本発明によれば、特徴部を検出するための検出条件を、撮影画像から自動的に最適化することができ、撮像画像の変化にかかわらず撮像手段の位置姿勢を高精度に求めることができるようにすることができる。
【図面の簡単な説明】
【図１】本発明の第１の実施形態を適用した撮像部位置姿勢推定装置の構成例を示すブロック図である。
【図２】図１の撮像部位置姿勢推定装置の使用時の状態を説明する図である。
【図３】第１の実施形態における、位置姿勢補正部１６０の構成例を示すブロック図である。
【図４】第１の実施形態における、検出パラメータ調整部１９５の構成例を示すブロック図である。
【図５】第１の実施形態における、撮像部位置姿勢推定装置の処理を説明するフローチャートである。
【図６】図５における、検出パラメータの最適化の処理を説明するフローチャートである。
【図７】従来の方法における、撮像部位置姿勢の補正方法を説明する模式図である。
【図８】第１の実施形態における、ランドマークの形状を示す図である。
【図９】図２の状態において、ランドマーク１点を撮像した時の撮像画像に、ランドマークの各頂点８００とアウトライン９００を重畳した模式図である。
【図１０】第１の実施形態における、処理領域１０００を示す模式図である。
【図１１】第１の実施形態における、理想検出画素パターン生成の処理の過程を示す模式図である。
【図１２】第１の実施形態における、ランドマークの検出画素パターン例を示す模式図である。
【図１３】変形例を適用した撮像部位置姿勢推定装置の構成例を示すブロック図である。
【図１４】第１の実施形態における、複数のランドマークを撮像したときの、処理領域を示す模式図である。
[0001]
[Technical field of the invention]
The present invention relates to automatically adjusting, from a captured image, the detection condition used to detect a feature in that image.
[0002]
[Prior art]
One known way to measure and determine the external parameters (position and orientation) of an imaging unit that images a real space is to place feature points (landmarks) whose three-dimensional position and orientation are known in the real space, and to obtain the position and orientation of the imaging unit from the image information, on the imaging plane, of several landmarks captured by the imaging unit. For example, the following methods of measuring and determining the external parameters (position and orientation) of an imaging unit (such as a video camera) are known:
(1) Estimation method using only image information
(for example, Kato, Mark, Asano, Tachibana: "Augmented reality system based on marker tracking and its calibration", Transactions of the Virtual Reality Society of Japan, Vol. 4, No. 4, pp. 607-616, 1999);
(2) Hybrid estimation method using a six-degree-of-freedom position and orientation sensor and images
(for example, JP-A-11-136706 and JP-A-2000-347128);
(3) Hybrid estimation method using images and an acceleration sensor
(for example, Yokokoji, Sugawara, Yoshikawa: "Precise superimposition of video on HMD using image and accelerometer", Transactions of the Virtual Reality Society of Japan, Vol. 4, No. 4, pp. 589-598, 1999);
(4) Estimation methods such as hybrid alignment using a gyro sensor and images
(for example, Fujii, Kamihara, Iwasa, Takemura, Yokoya: "Positioning using a stereo camera combined with a gyro sensor for augmented reality", IEICE Technical Report PRMU 99-192, January 2000).
These and other methods are known. In addition, image-based alignment and the hybrid approach combining a six-degree-of-freedom sensor with images each come in many further variants.
[0003]
[Problems to be solved by the invention]
However, conventional position and orientation estimation devices for an imaging unit depend heavily on landmark detection accuracy. Depending on the situation, that accuracy drops markedly, and the deviation between the actual position and orientation of the imaging unit in the real space and the position and orientation output by the estimation device can become a problem. Factors that reduce landmark detection accuracy include changes in illumination conditions, changes in imaging unit settings such as shutter speed, white balance, and gain, and the automatic adjustment of these settings. In conventional devices the detection parameters used in landmark detection are fixed, so the devices cannot follow such changes in the image; this is the main cause of reduced landmark detection accuracy.
[0004]
To cope with this loss of detection accuracy, conventional position and orientation estimation devices required that the lighting conditions of the imaged space be kept as constant as possible and, in addition, that optimal detection parameters be set manually for each time and place.
[0005]
The present invention has been made in view of this problem, and its object is to make it possible to always detect a feature with high accuracy by automatically adjusting the detection condition for the feature in a captured image.
[0006]
[Means for Solving the Problems]
In order to achieve the above object, the present invention is characterized by having the following configuration.
[0007]
The invention according to claim 1 of the present application is an information processing method for estimating the position and orientation of an imaging unit located in a three-dimensional space, comprising: an acquisition step of acquiring position and orientation information of the imaging unit obtained by a position and orientation measurement unit; a feature point detection step of detecting, using a detection condition, a feature point from video obtained by imaging, with the imaging unit, a real space containing feature points whose three-dimensional positions are known; a correction step of correcting the position and orientation information based on the position of the feature point; and an optimization step of automatically optimizing the detection condition from the video.
[0008]
The invention according to claim 11 of the present application is an information processing method for adjusting the detection condition for a feature, the condition being used when a feature is detected in video captured by an imaging unit and position and orientation information of the imaging unit is obtained from the detection result. The method comprises: detecting, using a detection condition, the region of the feature from video obtained by imaging, with the imaging unit, a real space containing a feature whose three-dimensional position and shape are known; calculating the feature region in the video from the three-dimensional position and shape of the feature; comparing the detected feature region with the calculated feature region; and adjusting the detection condition.
[0009]
[Best mode for carrying out the invention]
<First embodiment>
A preferred embodiment to which the present invention is applied is described in detail below with reference to the accompanying drawings.
In the present embodiment, the present invention is applied to conventional method (2), the hybrid estimation method using a six-degree-of-freedom position and orientation sensor and images.
[0010]
<Error of the six-degree-of-freedom position and orientation sensor>
Here, a six-degree-of-freedom position and orientation sensor is a device that measures the position and orientation of a measurement target, such as a magnetic sensor (for example, 3SPACE FASTRAK from Polhemus or Flock of Birds from Ascension Technology) or an optical sensor (for example, OPTOTRAK from Northern Digital). Because the six-degree-of-freedom position and orientation sensor 140 is fixed to the imaging unit 110, the position and orientation of the imaging unit can be obtained from its measurement values. However, the sensor has errors, and in some situations the position and orientation of the imaging unit 110 cannot be obtained accurately. For example, the measurement value of a magnetic six-degree-of-freedom sensor is affected by the surrounding magnetic field, so the error grows when there is metal near the measurement unit, and the deviation (error) between the position and orientation of the imaging unit 110 and the sensor output increases as a result.
[0011]
<Solution in JP-A-11-136706>
In view of this situation, JP-A-11-136706 describes a method of correcting the error of a six-degree-of-freedom sensor using image information. FIG. 7 is a schematic diagram illustrating this conventional method. In this method, an object with a specific color or shape that is easy to detect by image processing, called a marker (for example, a sticker), is attached to an object in the real space and used as a landmark (absolute position reference). Besides specially placed markers, characteristic objects or points in the real space can also be used as landmarks 100. Point A is the position of the landmark 100 predicted from the position and orientation of the imaging unit 110, point B is the actual position of the landmark 100, and point C is the viewpoint position of the imaging unit 110. The positions of points A and B are expressed in the camera coordinate system, and point C is the origin of the camera coordinate system. Point P is the position of point A on the imaging plane, and point Q is the position of point B on the imaging plane.
[0012]
As described above, there is a deviation (error) between the landmark position A predicted from the position and orientation of the camera and the actual position B. To correct this deviation, the position Q on the imaging plane of the landmark 100 captured by the imaging unit is taken as the real-space reference, and a transformation matrix ΔM_C is obtained that makes the predicted landmark position P on the imaging plane coincide with Q. Multiplying the viewing transformation matrix M_C of the imaging unit, obtained from the six-degree-of-freedom sensor 140, by this transformation matrix ΔM_C gives the error-corrected viewing transformation matrix M_C' of the imaging unit. The inverse matrix of M_C' contains the position and orientation components of the imaging unit. When M_C' is written as
[Outside 1]

the position t is given by the vector (M_14, M_24, M_34), and the rotation R is given by the matrix
[Outside 2]

That is, to obtain the position and orientation of the imaging unit 110, it suffices to obtain the viewing transformation matrix M_C' of the imaging unit.
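The relation between the corrected viewing matrix and the camera pose can be sketched in Python as follows. This is only an illustration under stated assumptions: M_C' is treated as a rigid transform (a rotation block plus a translation column), the pose is read from its inverse as the text states, and the multiplication order ΔM_C · M_C follows Equation 6 of the description. The helper names `matmul4`, `rigid_inverse`, and `camera_pose` are hypothetical and do not come from the specification.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(m):
    """Invert a 4x4 rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    r_t = [[m[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    t = [m[i][3] for i in range(3)]
    inv_t = [-sum(r_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [inv_t[i]] for i in range(3)] + [[0, 0, 0, 1]]


def camera_pose(m_c, delta_m_c):
    """Return (rotation, position) of the camera from the corrected view matrix."""
    m_corrected = matmul4(delta_m_c, m_c)   # M_C' = ΔM_C · M_C
    inv = rigid_inverse(m_corrected)        # assumed to hold the camera pose
    rotation = [row[:3] for row in inv[:3]]
    position = [row[3] for row in inv[:3]]
    return rotation, position
```

With an identity correction and a viewing matrix that translates the world by (1, 2, 3), the sketch places the camera at (-1, -2, -3) with an identity rotation.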
[0013]
In this way, a position and orientation that is based on the output of the six-degree-of-freedom sensor 140 and further aligned using the absolute position reference in the real space deviates from the actual position and orientation of the imaging unit 110 by a stably smaller amount than one measured by the six-degree-of-freedom sensor 140 alone.
[0014]
However, because the position of the landmark 100 is used as the absolute position reference in the position and orientation correction unit described later, the position and orientation of the imaging unit cannot be estimated accurately unless this reference is accurate. The problem to be solved is therefore how to detect the landmark 100 appearing in the captured image with optimal detection parameters.
[0015]
<Example in which the present invention is applied to the technique disclosed in JP-A-11-136706>
By applying the present invention to the method of JP-A-11-136706 described above, the detection parameters are automatically optimized as the brightness and color of the captured image obtained by the imaging unit change, which improves landmark detection performance. Here, optimization is defined as "changing toward the optimal state". A detection parameter that has undergone a single optimization pass is therefore not necessarily optimal for detecting landmarks in the captured image; the optimal state may be reached through multiple optimization passes.
[0016]
FIG. 1 is a configuration diagram of an imaging unit position and orientation estimation apparatus in which the present invention is applied to a hybrid alignment method using a six-degree-of-freedom sensor and images, as described in JP-A-11-136706. The procedure for estimating the position and orientation of the imaging unit is described with reference to FIG. 1.
[0017]
FIG. 5 is a flowchart showing the processing flow of the entire apparatus. First, in step S500, the imaging unit images the real space in which the landmarks are placed. Next, in step S510, the landmark detection process described later is performed. The process then proceeds to step S520, in which measurement by the six-degree-of-freedom sensor is performed. In step S530, the position and orientation of the imaging unit are estimated using the measured values and the landmark positions detected from the captured image. Next, in step S540, the detection parameters are optimized. When the optimization is complete, the process proceeds to step S550: if there is an end command, processing ends; otherwise the process returns to step S500 and is repeated on an updated captured image.
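The flow of FIG. 5 can be sketched as a simple loop. This is a minimal sketch, not the apparatus itself: the function name `run_estimation_loop` and all of its callable parameters are hypothetical stand-ins for the units described in the text, and the arguments passed between steps are an assumption about what each step needs.

```python
def run_estimation_loop(capture, detect, measure, estimate_pose,
                        optimize, should_stop, params):
    """Repeat steps S500-S550 until should_stop() returns True."""
    poses = []
    while True:
        image = capture()                              # S500: image the real space
        landmarks = detect(image, params)              # S510: landmark detection
        measurement = measure()                        # S520: 6-DOF sensor reading
        pose = estimate_pose(measurement, landmarks)   # S530: pose estimation
        params = optimize(image, pose, params)         # S540: update detection parameters
        poses.append(pose)
        if should_stop():                              # S550: end command?
            return poses, params
```

Because the updated parameters are fed into the next iteration, each new captured image is processed with the result of the previous optimization pass, which matches the description of optimization as a gradual approach to the optimal state.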
[0018]
The details of each process from step S510 to step S540 will be described below.
[0019]
<1. Landmark detection processing>
First, the landmark 100 placed at a known position in the real space is imaged by the imaging unit 110, and the image is stored in the captured image acquisition unit 120. The landmark 100 is then detected in the captured image to obtain its two-dimensional position Q(x_q, y_q) in observation coordinates.
[0020]
The method of detecting the landmark 100 in the image obtained from the imaging unit 110 is not particularly limited; one example is as follows. When red landmarks 100 such as those shown in FIG. 2 are placed in the real space, the feature amount I_s of the pixel of interest with value (R, G, B) in the captured image is calculated as
I_s = R / ((G + B) / 2)   (Equation 3)
For a green landmark 100, it is
I_s = G / ((R + B) / 2)   (Equation 4)
and for a blue landmark, it is
I_s = B / ((R + G) / 2)   (Equation 5)
In the present invention, the landmark color is not limited to these three colors, and the invention can also be applied to detecting an arbitrary color. For example, to detect an arbitrary color, the pixel values R, G, and B can be converted to the YCbCr format, landmark detection can be performed for a region of the Cb, Cr color space, and the parameters that specify that region can be optimized. In this case, the parameters specify an elliptical region in CbCr space and consist of θ and r, which represent the center position, and Ra and Rb, which represent the axes.
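Equations 3 to 5 can be illustrated with a short Python sketch. The function name `feature_amount` and its `color` argument are hypothetical, and the handling of a zero denominator is an assumption, since the description does not say what happens when the other two channels are both zero.

```python
def feature_amount(r, g, b, color="red"):
    """Ratio of the landmark's channel to the mean of the other two (Equations 3-5)."""
    channels = {
        "red": (r, g, b),     # Equation 3
        "green": (g, r, b),   # Equation 4
        "blue": (b, r, g),    # Equation 5
    }
    main, other1, other2 = channels[color]
    denom = (other1 + other2) / 2
    if denom == 0:
        # Assumption of this sketch: treat a pure-color pixel as an
        # infinitely strong response and a black pixel as no response.
        return float("inf") if main > 0 else 0.0
    return main / denom
```

For a strongly red pixel such as (200, 50, 50) the ratio is 4.0, while a gray pixel such as (100, 100, 100) gives 1.0, so a threshold somewhat above 1 separates the two.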
[0021]
In this landmark detection process, the threshold T_c, a detection parameter adjusted by the detection parameter adjustment unit 195, is input, and when the feature amount I_s exceeds T_c, the pixel of interest is judged to be inside a marker region. This test is applied to every pixel of the captured image, and each detected feature region is labeled individually. Among the labeled regions, those whose pixel count exceeds the landmark pixel number threshold N become landmark region candidates. The threshold N exploits the fact that, when detection noise has been labeled, most noise regions contain very few pixels, so regions at or below the threshold are not recognized as landmarks. This reduces detection noise and goes some way toward preventing the position and orientation correction unit 160 from mistaking noise for a landmark. The centroid of the landmark region is taken as the two-dimensional position of the landmark 100, that is, the landmark position Q(x_q, y_q) in observation coordinates. When the captured image contains several landmarks 100, the above processing is performed once for each landmark. The feature amount I_s and the landmark pixel number threshold N at this time are also stored for each landmark. For example, when three landmarks are imaged, per-landmark values are stored as T_c = (T_c1, T_c2, T_c3) and N = (N_1, N_2, N_3). The detected image U_d, in which landmarks have been detected from the captured image (in the present embodiment, for example, an image that stores the feature amount I_s of each corresponding pixel as its pixel value), and the detection parameters (in the present embodiment, the threshold T_c and the landmark pixel number threshold N) are saved in the state storage unit 190.
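The thresholding, labeling, size filtering, and centroid steps described above can be sketched as follows. This is a minimal sketch under assumptions: the feature image is a list of rows of I_s values, regions are grown with 4-connectivity (the description does not specify the connectivity), and the names `detect_landmarks`, `t_c`, and `n_min` are hypothetical.

```python
from collections import deque


def detect_landmarks(feature_image, t_c, n_min):
    """Return centroids (x, y) of regions with I_s > t_c and more than n_min pixels."""
    height = len(feature_image)
    width = len(feature_image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or feature_image[y0][x0] <= t_c:
                continue
            # Flood-fill one labeled region starting from this pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and feature_image[ny][nx] > t_c):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Regions at or below the pixel-count threshold are treated as noise.
            if len(pixels) > n_min:
                cx = sum(p[0] for p in pixels) / len(pixels)
                cy = sum(p[1] for p in pixels) / len(pixels)
                centroids.append((cx, cy))
    return centroids
```

Raising `n_min` suppresses small noise regions at the risk of discarding a distant, small landmark, which is one reason the description treats N as a parameter to be adjusted along with T_c.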
[0022]
<2. Measurement processing by the 6-degree-of-freedom sensor>
Meanwhile, the 6-degree-of-freedom sensor 140 detects the position and orientation of the imaging unit 110. In the present embodiment, a 3SPACE FASTRAK (hereinafter, FASTRAK) manufactured by Polhemus is used as the 6-degree-of-freedom sensor. The receiver 140A of the FASTRAK sensor is fixed so that it follows the position and orientation of the imaging unit 110. For example, as with the imaging unit 110 and the FASTRAK receiver 140A in FIG. 2, which illustrates the state while the processing of the present embodiment is being executed, the two are fixed to each other with a rod-shaped non-metallic material, in a way that disturbs the AC magnetic field near the FASTRAK receiver 140A as little as possible. The receiver 140A receives the AC magnetic field generated by the transmitter 140B of the FASTRAK sensor. The position and orientation measurement unit 150 measures the position and orientation of the imaging unit 110 from the change in the AC magnetic field received by the receiver 140A.
[0023]
<3. Estimation process of imaging unit position and orientation>
The position and orientation measurement values of the imaging unit 110 obtained in this way and the two-dimensional positions of the landmarks 100 on the image captured by the imaging unit 110 are input to the position and orientation correction unit 160. FIG. 3 is a diagram illustrating the processing of the position and orientation correction unit 160. First, the viewing transformation matrix M_C of the imaging unit is generated from the measurement values input from the position and orientation measurement unit 150 (step S300). Next, from the viewing transformation matrix M_C, the known three-dimensional position of each landmark in the world coordinate system stored in the landmark arrangement / shape storage unit 170, and the known ideal perspective transformation matrix of the imaging unit 110, the predicted observation coordinates P_i(x_pi, y_pi) of each landmark are calculated (step S310). Then, in step S320, the landmark currently being observed, that is, the landmark used as the reference for correction, is determined based on the predicted observation coordinates P. The discrimination method of the present embodiment associates the predicted observation coordinates P_j(x_pj, y_pj) with the observed coordinates Q_i(x_qi, y_qi) of a detected landmark when the two are close to each other. That is, among the combinations of predicted coordinates P and observed coordinates Q, those for which (x_qi − x_pj)² + (y_qi − y_pj)² is smallest are associated with each other. The present invention is not limited to this discrimination method, however; any method can be applied as long as the landmark at the predicted observation coordinates obtained from the six-degree-of-freedom sensor can be correctly associated with the landmark in the captured image.
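The association in step S320 can be read as a nearest-neighbor search on squared image distance. The sketch below assumes that each observed landmark Q_i is simply paired with the closest predicted position P_j, with no uniqueness constraint or distance cutoff, since the text mentions neither. The name `associate` is hypothetical.

```python
def associate(predicted, observed):
    """Pair each observed landmark Q_i with the nearest predicted position P_j.

    predicted: list of (x_p, y_p); observed: list of (x_q, y_q).
    Returns a list of (i, j) index pairs minimizing
    (x_qi - x_pj)^2 + (y_qi - y_pj)^2 for each i.
    """
    pairs = []
    for i, (xq, yq) in enumerate(observed):
        best_j, best_d2 = None, float("inf")
        for j, (xp, yp) in enumerate(predicted):
            d2 = (xq - xp) ** 2 + (yq - yp) ** 2
            if d2 < best_d2:
                best_j, best_d2 = j, d2
        if best_j is not None:
            pairs.append((i, best_j))
    return pairs
```

An empty result corresponds to the case in which no landmark could be associated.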
If at least one point is successfully associated in the landmark association process, then in step S330 a correction matrix ΔM_C is obtained from the predicted observation coordinates P(x_p, y_p) and the observed coordinates Q(x_q, y_q) of the landmark detected by the landmark detection unit 130. ΔM_C corrects the viewing transformation matrix M_C, which represents the position and orientation of the imaging unit 110 obtained by the position and orientation measurement unit 150. Further, in step S340, the ΔM_C obtained in step S330 is multiplied by the viewing transformation matrix M_C of the camera generated from the measurement values in step S300, which yields the corrected viewing transformation matrix M_C' of the imaging unit viewpoint.
[0024]
M_C' = ΔM_C · M_C (Equation 6)
If no landmark can be associated in step S320, ΔM_C is set, for example, to the identity matrix, so that the M_C obtained from the FASTRAK sensor output becomes the viewing transformation matrix M_C' of the viewpoint that is finally output.
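Equation 6 and its fallback can be illustrated with a small sketch. It assumes 4x4 matrices stored as nested lists; how ΔM_C itself is computed in step S330 is outside this sketch, and the function names are hypothetical.

```python
IDENTITY4 = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def corrected_view_matrix(m_c, delta_m_c=None):
    """Return M_C' = dM_C * M_C (Equation 6).

    When no landmark could be associated, delta_m_c is None and the
    identity matrix is used, so the sensor-derived M_C is returned unchanged.
    """
    delta = IDENTITY4 if delta_m_c is None else delta_m_c
    return matmul4(delta, m_c)
```

Using the identity as the fallback means the estimate degrades gracefully to the raw sensor measurement whenever image-based correction is unavailable.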
[0025]
<4. Landmark area generation processing>
The arrangement information and the shape information of each landmark are input from the landmark arrangement / shape storage unit 170 to the landmark area generation unit 180 via the position and orientation correction unit 160.
[0026]
First, the shape information is, for example, when a circular red landmark 100 (FIG. 8A) as shown in FIG. 2 is used, a 3D model constructed in advance, as shown in the figure, that approximates the shape of the landmark. The present invention is not limited to circular landmarks; it can be applied to any object with a specific color or shape that is easily detected by image processing. Characteristic objects and points in the real space other than specially arranged landmarks can also be used as the landmarks 100. It is also applicable when a plurality of landmarks each have a different shape. The shape information holds the coordinate values (x_m, y_m, z_m) of each vertex 800 in the model coordinate system in three-dimensional space. The vertex information also includes connection information between the vertices. The present invention is not limited to three-dimensional models with connection information; any representation can be used as long as the landmark area generation unit 180 described later can reconstruct appropriate landmark area information from it.
[0027]
The arrangement information stores a matrix L that represents the position and orientation, in the world coordinate system, of the vertex 810, which is the centroid of the landmark arranged in the real space. The vertices 800 and the position and orientation information L of the landmark centroid 810 are each stored in the landmark arrangement / shape storage unit 170. The landmark area generation unit 180 transforms the landmark shape information 800 by the arrangement information L and the corrected viewing transformation matrix M_C' of the imaging unit viewpoint to obtain the position (x_m', y_m', z_m') of each vertex 800 in the viewpoint coordinate system.
[0028]
[Equation image not reproduced]

[0029]
Further, by using a known projection matrix C from the viewpoint coordinate system to the observation coordinate system, the two-dimensional position V(x_v, y_v) of each vertex in the observation coordinate system is obtained.
[0030]
[Equation image not reproduced]

[0031]
[Equation image not reproduced]
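The equations at this point are not reproduced in the text. One plausible reading of the surrounding description, assuming homogeneous coordinates, 4x4 matrices, and a perspective division after applying C, is sketched below. The exact form of the original equations, the sign conventions of C, and the function names are assumptions.

```python
def mat_vec4(m, v):
    """Multiply a 4x4 matrix by a 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def project_vertex(vertex_model, l_mat, m_c_prime, c_mat):
    """Map a model-space vertex (x_m, y_m, z_m) to image coordinates (x_v, y_v).

    Assumed chain: model -> world by L, world -> viewpoint by M_C',
    viewpoint -> observation by C, then division by the homogeneous term.
    """
    v = [vertex_model[0], vertex_model[1], vertex_model[2], 1.0]
    world = mat_vec4(l_mat, v)
    view = mat_vec4(m_c_prime, world)       # (x_m', y_m', z_m', 1)
    clip = mat_vec4(c_mat, view)
    return clip[0] / clip[3], clip[1] / clip[3]
```

Applying `project_vertex` to every vertex 800 gives the 2D positions V from which the outline 900 is built.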

[0032]
Next, from the two-dimensional position V of each vertex 800, a two-dimensional line equation is calculated for each side of the landmark shape based on the vertex connection information included in the landmark shape information, and an outline 900 is generated. The outline 900 consists of the set of line equations that connect the vertices in the observation coordinates. The landmark area 910 consists of the outline 900 and a point G inside the landmark area. For a circular landmark 100, for example, the centroid 810 of the landmark may be used as the point G. The generated landmark area 910 is stored in the state storage unit 190. FIG. 9 shows an image in which the outline 900 and the vertices 800 are superimposed on the captured image when one of the landmarks 100 is captured with the imaging unit in the position and orientation shown in FIG. 2. The position and shape of the outline 900 in this figure are determined from the position and orientation of the imaging unit 110 estimated by the position and orientation correction unit 160. Therefore, as long as the position and orientation correction unit 160 has established the landmark correspondence, the outline is displayed at the position of the landmark 100 in the captured image.
[0033]
<5. Adjustment of detection parameters>
FIG. 4 is a diagram illustrating the detailed procedure of the processing of the detection parameter adjustment unit 195. The detection parameter adjustment unit 195 reads the landmark area 910, the detected image U_d, the threshold T_c, which is a detection parameter, and the landmark pixel number threshold N.
[0034]
First, in step S410, a processing area 1000 that completely contains the landmark area 910 is generated. The processing area 1000 may be, for example, a rectangular area whose centroid is the point 810 in the landmark area and which contains the entire landmark area. FIG. 10 shows this processing area. A processing area 1000 is generated for each landmark being imaged. For example, as shown in FIG. 14, when three landmarks are imaged, a processing area 1000 is generated around each of them. The present invention does not limit the method of generating the processing area 1000; any method is applicable as long as the shape of the landmark area 910 can be compared with the shape of the detected pixels on the detected image.
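One way to build the rectangular processing area of step S410 is sketched below: a pixel-aligned rectangle centered on the point G that is just large enough to contain every projected vertex. The centering rule follows the example in the text; the integer rounding and the name `processing_region` are assumptions.

```python
import math

def processing_region(vertices_2d, g):
    """Return (x_min, y_min, x_max, y_max) of a rectangle centered on G.

    vertices_2d: projected vertices (x_v, y_v) of the landmark outline.
    g: the point (x, y) inside the landmark area used as the center.
    The half-extents are the largest horizontal and vertical distances
    from G to any vertex, so the whole outline fits inside.
    """
    gx, gy = g
    half_w = max(abs(x - gx) for x, _ in vertices_2d)
    half_h = max(abs(y - gy) for _, y in vertices_2d)
    return (math.floor(gx - half_w), math.floor(gy - half_h),
            math.ceil(gx + half_w), math.ceil(gy + half_h))
```

Calling this once per visible landmark yields one processing area per landmark, as in FIG. 14.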
[0035]
Next, in step S420, an ideal detection pixel pattern 1110 is created so that the landmark detection result within the processing area of the detected image U_d can be compared with the landmark area. First, an image buffer area 1100 with the same number of pixels as the processing area is created, with all pixels initialized to 0. Next, the pixels of the buffer area 1100 that are included in the landmark area are marked. The marking may, for example, set the value of every pixel in the area to 1. This is illustrated in parts A and B of the figure. Assuming that a landmark area 910 as shown in FIG. 11A exists on the buffer area 1100, the marking process marks the pixels in the landmark area as shown in FIG. 11B, where black pixels are the marked ones. In this embodiment, a pixel that the area only partly covers is not marked unless the area occupies more than half of the pixel. However, the present invention is not limited to this marking process; any method can be applied as long as an appropriate detection parameter can be set through comparison with the detection pixel pattern. The marked area is called the ideal detection pixel pattern 1110.
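The marking rule (mark a pixel only when the landmark area covers more than half of it) could be approximated as in the sketch below. The coverage is estimated by supersampling each pixel, which is an assumed approximation and not the method stated in the text; the polygon is assumed to be given in buffer coordinates, and the function names are hypothetical.

```python
def point_in_polygon(px, py, poly):
    """Even-odd ray casting test for a point against a closed polygon."""
    inside = False
    n = len(poly)
    for k in range(n):
        x1, y1 = poly[k]
        x2, y2 = poly[(k + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def ideal_pattern(poly, width, height, samples=4):
    """Build the ideal detection pixel pattern as a 0/1 grid.

    A pixel is marked when more than half of its samples x samples
    sub-pixel sample points fall inside the outline polygon.
    """
    pattern = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            hits = 0
            for sy in range(samples):
                for sx in range(samples):
                    sample_x = x + (sx + 0.5) / samples
                    sample_y = y + (sy + 0.5) / samples
                    if point_in_polygon(sample_x, sample_y, poly):
                        hits += 1
            if hits * 2 > samples * samples:
                pattern[y][x] = 1
    return pattern
```

A higher `samples` value gives a closer estimate of the true covered area at a higher cost per pixel.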
[0036]
Next, in step S430, a landmark detection pixel pattern 1200 described later and an ideal detection pixel pattern 1110 are compared.
[0037]
First, a detection pixel pattern 1200 of the landmark contained in the processing area of the detected image U_d is generated. A buffer area 1220 with the same image size as the image buffer area 1100, initialized to a pixel value of 0, is prepared separately. A pixel of the buffer area 1220 is set to 1 if the feature amount I_s, which is the pixel value of the corresponding pixel in the detected image U_d, is larger than the detection parameter threshold T_c. The pattern generated in this way is called the detection pixel pattern 1200. FIG. 12 shows examples of the detection pixel pattern of a landmark contained in the detected image U_d while the same landmark is being observed at a certain point in time. FIGS. 12A and 12B show states in which the detection pixel pattern 1200 of the landmark differs from the ideal detection pixel pattern 1110 because the detection parameters are not optimal. In FIG. 12A, the threshold T_c is too large relative to the pixel feature amount I_s, so pixels that should belong to the landmark area are not detected. Conversely, in the other case, T_c is too small relative to the pixel feature amount I_s in the landmark area, so pixels outside the landmark area are detected as part of the landmark. Here, in FIG. 12B, pixels that are not included in the ideal detection pixel pattern 1110 and that are separate from the detection pixel pattern 1200 are called detection noise 1210. The detection noise 1210 is a region whose labeled pixel count is larger than the landmark pixel number threshold N in the landmark detection processing and which is therefore judged to be a landmark area candidate. The comparison in step S430 uses the pixel counts of the ideal detection pixel pattern 1110 and the detection pixel pattern 1200.
First, the number g of pixels of the detection pixel pattern 1200 that fall inside the ideal detection pixel pattern is counted, and its difference d from the number i of pixels in the ideal detection pixel pattern is calculated.
[0038]
d = i − g (Equation 10)
Further, it is assumed that the total number of pixels (protruding pixels) of the detection pixel pattern 1200 not included in the ideal detection pixel pattern 1110 in the processing region 1000 is e, and the total number of pixels of the detection noise 1210 is n.
[0039]
In the example of FIG. 12, FIG. 12A has d = 12, e = 0, n = 0, and FIG. 12B has d = 0, e = 10, n = 7.
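The counts used in step S430 can be computed directly from the two 0/1 patterns, as in the sketch below. It assumes that d is the number of ideal-pattern pixels that were not detected (the ideal count i minus the detected count g inside the ideal pattern), which matches the non-negative values quoted for FIG. 12. The noise count n is left out because it depends on the labeling result. The name `compare_patterns` is hypothetical.

```python
def compare_patterns(ideal, detected):
    """Return (d, e) for two equally sized 0/1 grids.

    d: pixels of the ideal pattern that were not detected (i - g).
    e: detected pixels lying outside the ideal pattern (protruding pixels).
    """
    i = g = e = 0
    for ideal_row, detected_row in zip(ideal, detected):
        for a, b in zip(ideal_row, detected_row):
            if a:
                i += 1
                if b:
                    g += 1
            elif b:
                e += 1
    return i - g, e
```

Both values are zero exactly when the detected pattern coincides with the ideal one.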
[0040]
Next, in step S440, the detection parameters are optimized. The details of this processing are described with reference to the flowchart. First, in step S600, it is checked whether d and e are both 0. If both are 0, the ideal detection pixel pattern and the detection pixel pattern coincide, so the detection parameters are judged to be optimal, this processing ends, and control returns to the landmark detection processing. There, landmarks are detected in a new captured image using the current detection parameters. If either d or e is positive, the process moves to step S610.
[0041]
In step S610, if the difference d between the number of pixels of the ideal detection pixel pattern 1110 and the number of pixels of the detection pixel pattern 1200 is positive, the process proceeds to step S620. If d is 0, the process moves to step S650.
[0042]
In step S620, it is determined whether the current landmark pixel number threshold N is smaller than the current ideal detection pixel number i. The purpose of this check is to keep N smaller than the ideal detection pixel number i so that pixels that actually lie on the landmark are not removed by the landmark pixel number threshold N. In the present embodiment, a margin of three pixels is always kept between N and i, so that N is always smaller than i by three pixels. If N is smaller than i + 3, the process proceeds to step S630. If N is larger than i + 3, the process moves to step S640.
[0043]
In step S630, since d is positive, there are undetected pixels inside the landmark area of the ideal detection pixel pattern, so the detection threshold T_c is reduced by one so that the undetected pixels can be detected.
[0044]
In step S640, since the landmark pixel number threshold N is larger than i + 3, i − 4 is substituted for N so that landmarks can be detected accurately. At the same time, the detection threshold T_c is reduced by one so that the pixels of the ideal detection pixel pattern that have not been detected can be detected.
[0045]
In step S650, since d is 0, every pixel of the ideal detection pixel pattern has been detected. However, since e is positive, pixels outside the ideal detection pixel pattern 1110 are also being detected. Here, it is first determined whether the number n of pixels of the detection noise 1210 is positive, which checks for the presence of detection noise 1210. If there is no detection noise 1210, the process moves to step S660. If there is detection noise, the process moves to step S670, where processing to reduce the detection noise is performed.
[0046]
In step S660, the detection threshold T_c is increased by one so that the protruding pixels are not detected.
[0047]
In step S670, as in step S620, it is determined whether the landmark pixel number threshold N is smaller than the current ideal detection pixel number i. If N is smaller than i + 3, the process moves to step S680, and the detection threshold T_c is increased by one so that the protruding pixels are not detected. If N is larger than i + 3, the detection threshold T_c is increased by one so that the protruding pixels are not detected, and the landmark pixel number threshold N is increased by one so that the number n of detected noise pixels decreases. When these processes have been completed, the detection parameters are stored for each landmark, and control returns to the landmark detection unit, where landmarks are detected in a new captured image using the updated detection parameters.
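The branching of steps S600 through S680 can be summarized as one update function. The sketch below follows the comparisons against i + 3 literally as written in the text. The case N = i + 3 is not specified in the text and is treated here as "not smaller", which is an assumption, as is the function name.

```python
def update_parameters(t_c, n_thresh, d, e, n_noise, i, margin=3):
    """One pass of the detection parameter update; returns (T_c, N).

    d, e: undetected and protruding pixel counts; n_noise: detection noise
    pixel count n; i: pixel count of the ideal detection pixel pattern.
    """
    if d == 0 and e == 0:              # S600: patterns coincide, keep parameters
        return t_c, n_thresh
    if d > 0:                          # S610 -> S620: landmark pixels are missing
        if n_thresh < i + margin:      # S630: lower the threshold only
            return t_c - 1, n_thresh
        return t_c - 1, i - (margin + 1)   # S640: N becomes i - 4 when margin is 3
    # d == 0 and e > 0 (S650): pixels outside the ideal pattern are detected
    if n_noise == 0:                   # S660: no noise regions, raise the threshold
        return t_c + 1, n_thresh
    if n_thresh < i + margin:          # S670 -> S680
        return t_c + 1, n_thresh
    return t_c + 1, n_thresh + 1       # raise both T_c and N to suppress noise
```

Because each pass changes T_c by at most one, convergence after a lighting change takes several frames unless the pass is repeated on the same image.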
[0048]
The present invention is not limited to the detection parameter optimization process shown in FIG. 6; any optimization process that allows landmarks to be detected accurately can be applied.
[0049]
<Modification 1>
The present invention is not limited to the above embodiment. In the above-described embodiment, the detection parameter optimization process is performed on one captured image as many times as there are landmarks being captured. However, as shown in the figure, after the optimization process has been performed once per landmark, the same optimization process can be performed again on the same captured image, so that a more optimal state is reached sooner than with the optimization process of the above-described embodiment.
[0050]
The processing flow of this example will be described with reference to FIG.
[0051]
First, in step S500, an image of the real space in which the landmarks are arranged is captured. Next, in step S510, the arranged landmarks are extracted from the captured image. Next, in step S520, the position and orientation of the imaging unit 110 are measured by the six-degree-of-freedom sensor. Next, in step S530, the position and orientation of the imaging unit 110 are estimated using the values measured by the six-degree-of-freedom sensor and the image positions of the landmarks detected in the captured image. Next, in step S540, the detection parameters used in step S510 are optimized for each landmark area 910 on the image. The process then proceeds to step S1300. When only one optimization pass has been completed for the same captured image, the process always returns to step S510, and landmark detection is performed on the same captured image as before using the detection parameters obtained by the optimization. Steps S520, S530, and S540 are then executed as before. If it is confirmed in step S1300 that the optimization process has been completed two or more times for the same captured image, the process moves to step S1310. In step S1310, the reduction rate C of d, e, and n calculated during the optimization processing is evaluated. For example, the process may move to step S550 once the reduction rate C reaches a predetermined threshold C_T, or once the difference exceeds a predetermined threshold D_T. If no end command has been issued in step S550, the process moves to step S500 and the captured image is updated. When an end command has been issued, the series of processing ends.
[0052]
<Modification 2>
In the above-described embodiment, markers are detected for the three colors R, G, and B. However, the present invention is not limited to the above embodiment. For example, it is also applicable to landmarks characterized by luminance, as in the following example. In the landmark detection process, the luminance Y is computed from the RGB pixel values, markers are detected based on the luminance Y, and optimization is performed with a threshold on the luminance Y as the detection parameter.
[0053]
Here, a method of calculating the luminance Y will be described below.
[0054]
Y = 0.299 × R + 0.587 × G + 0.114 × B (Equation 10)
From this Y, the feature amount I_s is calculated, and if I_s exceeds the threshold T_c, the target pixel is determined to be within the marker area. For example, if the landmark is a black object, the feature amount I_s is calculated as
I_s = Y_max − Y (Equation 11)
where Y_max is the maximum value of the luminance Y. The threshold T_c is a threshold on the luminance feature amount.
[0055]
With this processing, it is possible to perform the optimization processing of the detection parameter for the landmark characterized by the luminance.
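The luminance-based detection for a black landmark can be sketched as follows. It uses the luminance formula and the feature I_s = Y_max − Y given above; the default Y_max of 255 assumes 8-bit pixel values, and the function names are hypothetical.

```python
def luminance(r, g, b):
    """Luminance Y from RGB pixel values, using the weights given in the text."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def dark_feature(r, g, b, y_max=255.0):
    """Feature amount I_s = Y_max - Y, which is large for dark pixels."""
    return y_max - luminance(r, g, b)

def is_marker_pixel(r, g, b, t_c, y_max=255.0):
    """A pixel belongs to the marker area when I_s exceeds the threshold T_c."""
    return dark_feature(r, g, b, y_max) > t_c
```

The same parameter adjustment as in the color case then applies, with T_c acting on this luminance feature.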
[0056]
(Other embodiments)
The present invention also covers the case in which program code of software that realizes the functions of the above-described embodiment is supplied to a computer in an apparatus or system connected to various devices, so as to operate those devices to realize the functions of the embodiment, and the computer (CPU or MPU) of the system or apparatus operates the various devices according to the stored program.
[0057]
In this case, the program code of the software itself realizes the functions of the above-described embodiment, and the program code itself and the means for supplying it to the computer, for example a storage medium storing the program code, constitute the present invention.
[0058]
As a storage medium for storing such a program code, for example, a floppy (R) disk, hard disk, optical disk, magneto-optical disk, CD-ROM, magnetic tape, nonvolatile memory card, ROM or the like can be used.
[0059]
Needless to say, such program code is included in the embodiments of the present invention not only when the functions of the above-described embodiment are realized by the computer executing the supplied program code, but also when they are realized by the program code in cooperation with the OS (operating system) running on the computer, or with other application software.
[0060]
Needless to say, the present invention further includes the case in which the supplied program code is stored in a memory provided on a function expansion board of the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on the function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiment are realized by that processing.
[0061]
【The invention's effect】
As described above, according to the present invention, the detection condition for detecting the characteristic portion can be automatically optimized from the captured image, and the position and orientation of the imaging unit can be determined with high accuracy regardless of changes in the captured image.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration example of an imaging unit position and orientation estimation apparatus to which a first embodiment of the present invention has been applied.
FIG. 2 is a diagram illustrating a state when the imaging unit position and orientation estimation device of FIG. 1 is used.
FIG. 3 is a block diagram illustrating a configuration example of a position and orientation correction unit 160 according to the first embodiment.
FIG. 4 is a block diagram illustrating a configuration example of a detection parameter adjustment unit 195 according to the first embodiment.
FIG. 5 is a flowchart illustrating a process performed by the imaging unit position and orientation estimation device according to the first embodiment.
FIG. 6 is a flowchart illustrating a process of optimizing a detection parameter in FIG. 5.
FIG. 7 is a schematic diagram illustrating a method of correcting the position and orientation of an imaging unit in a conventional method.
FIG. 8 is a diagram illustrating a shape of a landmark according to the first embodiment.
FIG. 9 is a schematic diagram in which each vertex 800 of a landmark and an outline 900 are superimposed on a captured image of one landmark in the state of FIG.
FIG. 10 is a schematic diagram showing a processing area 1000 according to the first embodiment.
FIG. 11 is a schematic diagram showing a process of generating an ideal detection pixel pattern in the first embodiment.
FIG. 12 is a schematic diagram illustrating an example of a landmark detection pixel pattern according to the first embodiment.
FIG. 13 is a block diagram illustrating a configuration example of an imaging unit position and orientation estimation apparatus to which a modification is applied.
FIG. 14 is a schematic diagram showing a processing area when a plurality of landmarks are imaged in the first embodiment.

Claims

An information processing method for estimating the position and orientation of an imaging unit located in a three-dimensional space, comprising:
an acquisition step of acquiring position and orientation information of the imaging unit obtained by a position and orientation measurement unit;
a feature point detection step of detecting, using a detection condition, a feature point from video obtained by imaging, with the imaging unit, a real space in which the feature point whose three-dimensional position is known exists;
a correction step of correcting the position and orientation information based on the position of the feature point; and
an optimization step of automatically optimizing the detection condition from the video.

The information processing method according to claim 1, wherein
a projected shape of the feature point on the video is estimated based on the imaging unit position and orientation obtained by the correction step and on known three-dimensional position and shape information of the feature point,
the projected shape is compared with the detection result obtained by the feature point detection step, and
the detection condition is optimized based on the comparison result.

The information processing method according to claim 2, wherein a region in which the interior region of the projected shape overlaps the pixels constituting the video is set as an ideal detection pixel region.

The information processing method according to claim 3, wherein the optimization step optimizes the detection condition based on a feature amount of the detection pixel region detected as a feature point in the feature point detection step and a feature amount of the ideal detection pixel region.

The information processing method according to any one of claims 1 to 4, wherein the feature point detection step detects feature points using the detection condition updated by the optimization step.

The information processing method according to claim 1, wherein the detection condition is optimized by performing the acquisition step, the feature point detection step, and the optimization step a plurality of times.

The information processing method according to any one of claims 1 to 6, wherein different detection conditions are set for a plurality of feature points.

A computer program for causing a computer device to execute the information processing method according to any one of claims 1 to 7.

A computer-readable storage medium storing the computer program according to claim 8.

An information processing method for adjusting a detection condition of a characteristic portion, the detection condition being used when the characteristic portion is detected in video captured by an imaging unit and position and orientation information of the imaging unit is obtained using the detection result, the method comprising:
detecting, using a detection condition, a region of the feature point from video obtained by imaging, with the imaging unit, a real space in which a characteristic portion whose three-dimensional position and shape are known exists;
calculating a characteristic region in the video from the three-dimensional position and shape of the feature point; and
comparing the detected characteristic region with the calculated characteristic region and adjusting the detection condition.

An imaging unit position and orientation estimation device that estimates the position and orientation of an imaging unit located in a three-dimensional space and outputs the result as position and orientation information, comprising:
a measurement unit that measures the position and orientation of the imaging unit by a method other than using captured video and acquires external parameters indicating the position and orientation of the imaging unit;
a feature point detection unit that detects, using a detection parameter, feature points from video in which the imaging unit has captured a real space containing a plurality of feature points whose three-dimensional positions are known; and
a correction unit that corrects the external parameters obtained by the measurement unit based on the positions of the feature points,
wherein the feature point detection unit includes a detection parameter optimization unit that automatically optimizes the detection parameter used to detect feature points in accordance with changes in the luminance and color of the video.