JPH0698206A

JPH0698206A - Image photographing state detecting method

Info

Publication number: JPH0698206A
Application number: JP4244674A
Authority: JP
Inventors: Akito Akutsu; 明人阿久津; Yoshinobu Tonomura; 佳伸外村
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1992-09-14
Filing date: 1992-09-14
Publication date: 1994-04-08
Anticipated expiration: 2016-12-04
Also published as: JP3234300B2

Abstract

PURPOSE: To provide a robust, fast video shooting state detection method by applying filter processing and statistical analysis processing to a plurality of cross-sectional images that include the time axis of the video.

CONSTITUTION: In step 2-1, the input video, that is, images captured continuously in time, is arranged along the time axis to produce a spatiotemporal image. In step 2-2, spatiotemporal cross-sectional images are produced; the three-dimensional position of the photographed object can be obtained from such an image. In step 2-3, filter processing is applied to each x-t spatiotemporal image of the x-t spatiotemporal image sequence obtained in step 2-2, and the intensity and direction of each edge and line are detected. The trajectories of the motion of the object's feature points are then detected from the resulting edge image. In step 2-4, statistical analysis is applied independently to the intensities and to the directions of the edges and lines calculated in step 2-3, and each video shooting state parameter is thereby calculated.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a video shooting state detection method that detects the shooting state of a video from a video that is composed of frames and has temporal continuity.

[0002]

[Prior Art] When video captured by a video camera is edited afterwards, or when edited video is re-edited, the huge volume of information makes it difficult to handle the video in a sophisticated way on a computer as it is. For this reason, techniques for dividing video into units and structuring it are being developed [for example, Tonomura and Abe, "Study on moving image database handling", IEICE Technical Report, IE89-33 (1989)].

[0003] When video is divided into units, the video shooting state, which carries much of the intention of the photographer, editor, and so on, is effective as a video unit and as one kind of video index [Akutsu, Tonomura, Ohba, "Proposal of video unit using time-series optical flow", Image Coding Symposium (PCSJ91), pp. 29-32, 1991], and it needs to be defined in a form that reflects the intention of the video creator (photographer or editor). In this case, the index to be assigned is preferably one from which the creator's intention can later be extracted from the video.

[0004] Conventionally, as a method of detecting the shooting state from captured video after shooting by means of image processing or the like, a technique has been reported [Akutsu, Tonomura, Ohba, "Camera operation prescription method for moving image indexing", IEICE Transactions, D-II, Vol. J75-D-II, No. 2, pp. 226-235, 1992] that uses motion vectors calculated between video frames [for example, B.K.P. Horn and B.G. Schunck: "Determining optical flow", Artificial Intelligence, Vol. 17, pp. 185-203 (1981)]. This technique performs statistical analysis on motion vectors calculated block by block for each frame of the video data, thereby calculating the temporal characteristics of the camera operation at the time of shooting as quantitative variables and detecting the shooting state of the video.

[0005]

[Problems to Be Solved by the Invention] However, with the above video shooting state detection method, the shooting state of the video cannot be detected accurately unless the motion vectors are calculated accurately. Meanwhile, no technique for calculating motion vectors with high accuracy has yet been established. The inability to calculate them accurately is largely attributable to the signal-to-noise ratio, that is, to noise and the like. In this respect the reported technique has a problem with robustness.

[0006] The enormous processing time of motion vector calculation is also a problem from the standpoint of processing speed. The video shooting state detection methods reported so far thus suffer from the above problems in motion vector calculation. An object of the present invention is to provide a video shooting state detection method that is robust and fast.

[0007]

[Means for Solving the Problems] To achieve the above object, claim 1 of the present invention is characterized in that pixel values are read, in units of the same row or column, from the pixel matrix of each successively input image frame and arranged in time-series order to create a spatiotemporal cross-sectional image frame; filter processing is applied to this spatiotemporal cross-sectional image frame to calculate the edge directions; and the shooting state of the video is determined from parameters obtained by statistically analyzing the calculated edge directions and intensities.

[0008] Claim 2 of the present invention is characterized in that the creation of spatiotemporal cross-sectional image frames, performed by reading pixel values in units of the same row or column from the pixel matrix of each successively input image frame and arranging them in time-series order, is carried out for all of the rows or all of the columns; filter processing is applied to each created spatiotemporal cross-sectional image frame to calculate the edge intensity at each pixel; the calculated edge intensity at each pixel is integrated over all of the spatiotemporal cross-sectional image frames; a spatiotemporal projection image whose pixel values are the resulting integrated edge intensities is created; and the shooting state of the video is determined from parameters obtained by statistically analyzing this spatiotemporal projection image.

[0009]

[Operation] As described above, because filter processing and statistical analysis processing are applied to a plurality of cross-sectional images that include the time axis of the video, fast and robust processing becomes possible. It also becomes possible to detect the shooting state in a way that provides both an intuitive visualization of the temporal characteristics of the camera operation during shooting and the calculation of quantitative variables.

[0010]

[Embodiments] Embodiments of the present invention are described in detail below with reference to the drawings. The elements that determine the shooting state include camera operation, camera on/off, camera shake, and the like, and the shooting state changes as these elements change. Hereinafter, the movement of the feature points of the photographed object caused by a change in the shooting state (for example, when the camera is swung to the left, the feature points of the object move to the right) is called global motion.

[0011] Camera operation is described with reference to the drawings. FIG. 2 is a diagram for explaining the basic camera operations. There are seven basic camera operations, and shooting consists of combinations of them. The basic operations are fix, pan (1-1), tilt (1-2), zoom (1-3), track (1-4), dolly (1-5), and boom (1-6).

[0012] Fix is shooting with the camera held stationary. Pan and tilt are changes in the direction of the optical axis with the camera projection center fixed, that is, swinging the camera head. Zoom is a change in the angle of view. Track, boom, and dolly are operations that involve a change in the position of the camera projection center: track is lateral camera movement, boom is vertical camera movement, and dolly is a shooting operation in which the camera itself moves toward or away from the subject. Because track, boom, and dolly change the position of the camera projection center, they are operations that involve three-dimensional information. Shooting with a camera includes at least one of these operations.

[0013] Next, an embodiment in which the present invention is used to detect the shooting state from video data shot with these camera operations is described. FIG. 1 shows the algorithm of the embodiment of the present invention, which is described in detail below with reference to the drawings. First, in step 2-1, the input video, that is, the images captured continuously in time, is arranged along the time axis to create a spatiotemporal image.
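Step 2-1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes grayscale frames held as NumPy arrays, and the function name `build_spatiotemporal_volume` and the (T, H, W) axis order are choices made here for illustration.

```python
import numpy as np

def build_spatiotemporal_volume(frames):
    """Stack equally sized grayscale frames along a new leading time axis."""
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    if not frames:
        raise ValueError("at least one frame is required")
    if any(f.shape != frames[0].shape for f in frames):
        raise ValueError("all frames must have the same shape")
    return np.stack(frames, axis=0)  # shape (T, H, W): time, rows, columns

# Synthetic stand-in for a video: 5 frames of 4 x 6 pixels (assumed data)
frames = [np.full((4, 6), t, dtype=np.uint8) for t in range(5)]
volume = build_spatiotemporal_volume(frames)
```

With this layout, `volume[t]` is frame t, and the time axis 3-2 of FIG. 3 corresponds to axis 0 of the array.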

[0014] FIG. 3 shows an example of a spatiotemporal image. 3-1 denotes an image frame and 3-2 denotes the time axis. A collection of video frames arranged in this way is called a spatiotemporal image. Next, in step 2-2, spatiotemporal cross-sectional images are created. A spatiotemporal cross-sectional image here is an image obtained by cutting the video along the time axis; one example is the epipolar plane image used in the field of computer vision. An epipolar plane image is the image of the cut surface obtained when the spatiotemporal image is cut by a plane containing the camera's direction of travel and the normal to the screen. The three-dimensional position of the subject can be estimated from this spatiotemporal cross-sectional image. This is because, on the epipolar plane image, the trajectory of a feature point of an object becomes a straight line, and the slope of this line corresponds to the magnitude of the motion of that feature point [R.C. Bolles, H. Baker, and D.H. Marimont, "Epipolar-plane image analysis: An approach to determining structure from motion", IJCV, 1, 1, pp. 7-55, June 1989].

[0015] Specifically, pixel values are read in units of the same row from each image frame, and the values read are arranged in time series. FIG. 4 shows one method of cutting the spatiotemporal image. 4-1 is an x-t spatiotemporal image and 4-2 is an x-t spatiotemporal image sequence; 4-3 is a y-t spatiotemporal image and 4-4 is a y-t spatiotemporal image sequence. Here, x-t spatiotemporal images and y-t spatiotemporal images are calculated as the spatiotemporal cross-sectional images. The x-t and y-t spatiotemporal images referred to here are each the image of a cut surface obtained by cutting the spatiotemporal image by a plane containing the normal to the screen. An x-t spatiotemporal image sequence is a plurality of x-t spatiotemporal images, that is, a sequence of spatiotemporal cross-sectional images. The same applies to the y-t spatiotemporal image sequence.
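The cutting described above amounts to array slicing once the frames are held as a (T, H, W) array. The sketch below assumes that layout; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def xt_slice(volume, y):
    """x-t image: row y of every frame, stacked in time order. Shape (T, W)."""
    return volume[:, y, :]

def yt_slice(volume, x):
    """y-t image: column x of every frame, stacked in time order. Shape (T, H)."""
    return volume[:, :, x]

def xt_sequence(volume):
    """x-t image sequence: one x-t image per row y. Shape (H, T, W)."""
    return np.transpose(volume, (1, 0, 2))

def yt_sequence(volume):
    """y-t image sequence: one y-t image per column x. Shape (W, T, H)."""
    return np.transpose(volume, (2, 0, 1))

# Small assumed volume with T=2 frames of H=3 rows and W=4 columns
volume = np.arange(2 * 3 * 4, dtype=np.float64).reshape(2, 3, 4)
```

Because `np.transpose` returns a view, building the full sequences does not copy pixel data.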

[0016] FIG. 5 shows an example of an x-t spatiotemporal image and a y-t spatiotemporal image calculated from the spatiotemporal image of FIG. 3. 5-1 is the x-t spatiotemporal image and 5-2 is the y-t spatiotemporal image. Next, in step 2-3, filter processing (first derivative, second derivative, and so on) is applied to each x-t spatiotemporal image of the x-t spatiotemporal image sequence obtained in step 2-2, and the intensity and direction of edges and lines are detected. From the edge image obtained in this way, the trajectories of the motion of the object's feature points can be detected.
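One possible form of the first-derivative filtering in step 2-3 is sketched below. The text does not fix a particular filter, so central differences via `numpy.gradient` are an assumption, as is expressing the edge direction as a displacement in pixels per frame; the function name is hypothetical.

```python
import numpy as np

def edge_strength_and_direction(xt_image):
    """First-derivative filtering of one x-t image (rows = t, columns = x).

    Returns the gradient magnitude and, as a direction measure, the local
    displacement dx/dt of the intensity pattern in pixels per frame.
    """
    xt_image = np.asarray(xt_image, dtype=np.float64)
    d_t, d_x = np.gradient(xt_image)   # derivatives along t (axis 0) and x (axis 1)
    strength = np.hypot(d_x, d_t)
    # Constant brightness along a trajectory: d_x * velocity + d_t = 0
    with np.errstate(divide="ignore", invalid="ignore"):
        velocity = np.where(np.abs(d_x) > 1e-9, -d_t / d_x, np.nan)
    return strength, velocity

# Assumed test pattern: an intensity ramp drifting 2 pixels per frame to the right
t, x = np.mgrid[0:8, 0:16]
ramp = (x - 2 * t).astype(float)
strength, velocity = edge_strength_and_direction(ramp)
```

Pixels with no spatial gradient carry no direction information, so they are marked NaN here and can be skipped in the later statistical analysis.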

[0017] As an example of trajectory detection from this processing result, FIG. 6 shows edge images for the x-t and y-t spatiotemporal images shown in FIG. 5. In the figure, 6-1 is the edge image of the x-t spatiotemporal image and 6-2 is the edge image of the y-t spatiotemporal image. In step 2-4, statistical analysis is performed independently on the edge and line intensities and on the edge and line directions calculated in step 2-3. The video shooting state parameters are calculated directly from the edge and line directions, and, from the edge and line intensities, by first calculating a spatiotemporal projection (integral) image and then applying further statistical analysis to it.

[0018] First, the method of determining the video shooting state parameters from the edge directions of each spatiotemporal cross-sectional image, which are calculated in the course of deriving the spatiotemporal projection image, is described. Because, on the epipolar plane image described above, the slope of the straight-line trajectory of an object's feature point represents the magnitude of that feature point's motion, this estimation method can be used to estimate the parameters of global motion. This corresponds to the flow on the right side of FIG. 1.

[0019] Next, the method of determining the video shooting state parameters from the spatiotemporal projection image, which corresponds to the flow on the left side of FIG. 1, is described. There are two approaches: (a) analyzing the flow in the spatiotemporal projection image, and (b) analyzing the edge directions of each spatiotemporal cross-sectional image calculated in the course of deriving the spatiotemporal projection image. Each is described below.

[0020] In the calculated spatiotemporal projection image, when the motion of the object's feature points is global motion, integration emphasizes that motion; conversely, when the motion is local, integration attenuates it. As a result, global motion is visualized as a flow on the spatiotemporal projection image, which makes the image effective as an intuitive visual representation of global motion.

[0021] FIG. 7 shows how the spatiotemporal projection image is created. For convenience, the following description deals with x-t spatiotemporal images. In the figure, 7-1 is the edge image of an x-t spatiotemporal image and 7-2 is the integration direction. The image obtained by integrating the sequence of edge images of the x-t spatiotemporal images along the integration direction 7-2 is the spatiotemporal projection image. In other words, it is the column-direction integral (an x-t two-dimensional histogram) calculated from the result of processing the spatiotemporal cross-sectional image sequence in step 2-3.
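The integration can be sketched as a sum of edge strengths over all x-t images, that is, over the row index y. The example assumes a (T, H, W) array and central-difference edge strength; both are illustrative assumptions, and `spatiotemporal_projection` is a hypothetical name.

```python
import numpy as np

def spatiotemporal_projection(volume):
    """Sum the edge strength of every x-t image over y. (T, H, W) -> (T, W)."""
    volume = np.asarray(volume, dtype=np.float64)
    d_t = np.gradient(volume, axis=0)   # derivative along time
    d_x = np.gradient(volume, axis=2)   # derivative along x
    strength = np.hypot(d_x, d_t)       # edge strength within each x-t image
    return strength.sum(axis=1)         # integrate over all rows y

# Assumed test scene: a vertical bar spanning every row, moving 1 pixel per frame
volume = np.zeros((6, 5, 12))
for t in range(6):
    volume[t, :, 2 + t] = 1.0
projection = spatiotemporal_projection(volume)
```

Because the bar moves identically in every row, its contributions add up along a diagonal band of `projection`, which is the emphasis of global motion that the description attributes to the integration.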

[0022] FIG. 8 shows the spatiotemporal projection image calculated from the spatiotemporal image shown in FIG. 3. In the figure, 8-1 denotes the x-t spatiotemporal projection image. First, approach (a), analyzing the flow in the spatiotemporal projection image, is described. This approach corresponds to step 2-5 in FIG. 1. Possible analysis techniques include calculating the edges of the spatiotemporal projection image and analyzing their directions, and assuming temporal continuity of the flow on the spatiotemporal projection image and calculating correlations over short time intervals along the time axis. The flow could also be analyzed as an energy minimization problem.
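As one reading of the correlation-based technique, the sketch below estimates, for each pair of consecutive time rows of a projection image, the integer shift that maximizes their correlation. The search range, the mean removal, and the function names are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def row_shift(prev_row, next_row, max_shift):
    """Integer shift s (pixels) for which prev_row[x] best matches next_row[x + s]."""
    prev_row = prev_row - prev_row.mean()
    next_row = next_row - next_row.mean()
    best_shift, best_score = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        # Restrict both rows to the part where they overlap after shifting
        if s >= 0:
            a, b = prev_row[: len(prev_row) - s], next_row[s:]
        else:
            a, b = prev_row[-s:], next_row[: len(next_row) + s]
        score = float(np.dot(a, b)) / len(a)
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift

def flow_along_time(projection, max_shift=5):
    """Shift between each pair of consecutive rows of a (T, W) projection image."""
    return [row_shift(projection[t], projection[t + 1], max_shift)
            for t in range(len(projection) - 1)]

# Assumed projection image: a narrow bump drifting 2 pixels per frame
x = np.arange(40)
projection = np.array([np.exp(-((x - (10 + 2 * t)) ** 2) / 2.0) for t in range(5)])
shifts = flow_along_time(projection)
```

A run of shifts with constant sign would suggest a pan in one direction, and a run of zeros a stationary camera.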

[0023] Next, in approach (b), the video shooting state parameters are determined from the edge directions calculated in the course of deriving the spatiotemporal projection image. Because, on the epipolar plane image described above, the edge direction, that is, the slope of the straight-line trajectory of an object's feature point, represents the magnitude of that feature point's motion, it is used to estimate the parameters of global motion. FIG. 9 shows the distribution of edge directions, obtained by each of the above methods, that results from camera operations involving no three-dimensional information, namely zoom, pan, and tilt, and combinations of them. The left-hand plot in the figure is for x-t spatiotemporal images and the right-hand plot is for y-t spatiotemporal images. The straight line in each plot is the distribution of edge directions; the slope 9-1 of the line is the zoom parameter, the intercept 9-2 is the pan parameter, and 9-3 is the tilt parameter. Each parameter can be estimated by fitting this line, for example by the least-squares method.
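The least-squares estimate can be sketched with `numpy.polyfit`. The inputs are assumed to be edge positions along x (or y) and the corresponding edge directions expressed as displacements per frame; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def fit_slope_and_intercept(positions, directions):
    """Least-squares fit of direction = slope * position + intercept.

    Non-finite samples (for example pixels without a usable edge) are ignored.
    """
    positions = np.asarray(positions, dtype=np.float64)
    directions = np.asarray(directions, dtype=np.float64)
    valid = np.isfinite(positions) & np.isfinite(directions)
    slope, intercept = np.polyfit(positions[valid], directions[valid], 1)
    return slope, intercept

# Assumed x-t data: zoom parameter 0.02 combined with pan parameter -1.5
x = np.linspace(-50.0, 50.0, 21)
u = 0.02 * x - 1.5
u[3] = np.nan                      # one pixel without a usable edge
zoom, pan = fit_slope_and_intercept(x, u)
```

Applied to y-t data in the same way, the slope again gives the zoom parameter and the intercept the tilt parameter.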

[0024] The features seen in the spatiotemporal projection image for operations that involve three-dimensional information are, at a local (microscopic) scale, the same as those of operations that involve none, so such operations can be detected by applying the above processing locally. As for causes of global motion other than camera operation, namely camera on/off and camera shake: camera on/off can be detected from the discontinuities seen in the spatiotemporal projection image, and camera shake can be regarded as periodic pan and tilt operations with a relatively short period (10 to 20 Hz), so it can be detected by processing such as a Fourier transform after the camera operations have been detected.
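The Fourier-based shake check can be sketched as the fraction of signal power that falls in the 10 to 20 Hz band of an estimated pan (or tilt) parameter series. The sampling rate of 60 samples per second, the function name, and any decision threshold are assumptions; note that a band is observable only up to half the sampling rate.

```python
import numpy as np

def band_power_fraction(signal, sample_rate_hz, band_hz=(10.0, 20.0)):
    """Fraction of non-DC spectral power of `signal` inside `band_hz`."""
    signal = np.asarray(signal, dtype=np.float64)
    signal = signal - signal.mean()                  # remove the DC component
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate_hz)
    total = power.sum()
    if total == 0.0:
        return 0.0
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    return float(power[in_band].sum() / total)

# Assumed pan-parameter series, 2 seconds sampled at 60 Hz
time_s = np.arange(120) / 60.0
shaky = np.sin(2 * np.pi * 12.0 * time_s)    # 12 Hz oscillation, shake-like
slow_pan = np.sin(2 * np.pi * 1.0 * time_s)  # 1 Hz oscillation, deliberate motion
```

A value of `band_power_fraction(shaky, 60.0)` close to 1 and of `band_power_fraction(slow_pan, 60.0)` close to 0 would separate the two cases in this synthetic example.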

[0025] FIG. 10 shows the results of an experiment in which the present invention was applied to a test video of 300 image frames input from a commercially available video camera. The vertical axis represents the presence or absence of each operation, and the horizontal axis represents the frame number. The results are the coefficients, calculated by statistical analysis of the edge directions, obtained when the global motion is interpreted in terms of the camera operations given by the following equation.

[0026]

[Equation 1]
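The equation itself is not reproduced in this text. A form consistent with the coefficient definitions in [0027] and with the slope and intercepts described for FIG. 9 would be the following; it is a reconstruction under those assumptions, not the published formula, and it assumes that (x, y) is measured from the point about which the zoom scales.

```latex
\begin{aligned}
u &= a\,x + b,\\
v &= a\,y + c
\end{aligned}
```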

[0027] Here, a, b, and c are coefficients, and (u, v) is the global motion at the coordinates (x, y) of a feature point of a particular photographed object. The coefficient a arises from zooming, and the coefficients b and c arise from panning and tilting, respectively. The video used in this experiment consists of camera operations, the cause of the global motion, that begin with a pan to the right, continue with a stationary period, and end with a zoom-in while panning to the left.

[0028] As FIG. 10 shows, the camera operations are reproduced: a pan to the right (the negative direction in the figure), followed by a stationary period, and finally a zoom-in performed simultaneously with a pan to the left. The present invention has been described concretely above on the basis of an embodiment, but it goes without saying that the present invention is not limited to this embodiment and that various modifications are possible without departing from its gist.

[0029]

[Effects of the Invention] As described above, the present invention applies filter processing to a plurality of cross-sectional images that include the time axis of the video, detects edges and lines, and statistically analyzes those edges and lines, thereby calculating quantitative variables for the temporal characteristics of the camera operation during shooting and detecting the shooting state of the video. Because it does not use motion vectors to detect the shooting state and because its processing is statistical, the processing is fast and robust, and the shooting state can be detected in a way that provides both an intuitive visualization of the temporal characteristics of the camera operation during shooting and the calculation of quantitative variables.

[Brief Description of the Drawings]

FIG. 1 is a diagram for explaining the algorithm of a video shooting state detection method according to an embodiment of the present invention.

FIG. 2 is a diagram explaining the basic camera operations used in video shooting.

FIG. 3 is a diagram showing an example of a spatiotemporal image.

FIG. 4 is a diagram for explaining a method of cutting a spatiotemporal image to calculate spatiotemporal cross-sectional images.

FIG. 5 is a diagram showing an example of spatiotemporal cross-sectional images.

FIG. 6 is a diagram showing an example of edge images of spatiotemporal cross-sectional images.

FIG. 7 is a diagram showing a method of calculating a spatiotemporal projection image from spatiotemporal cross-sectional images.

FIG. 8 is a diagram showing an example of a spatiotemporal projection image.

FIG. 9 is a diagram showing the characteristics of the camera operation parameters seen in a spatiotemporal projection image.

FIG. 10 is a diagram showing the results of an experiment conducted using the present invention.

[Explanation of Reference Numerals]

2-1: Process of creating a spatiotemporal image
2-2: Process of creating spatiotemporal cross-sectional images from the spatiotemporal image
2-3: Process of applying filter processing to the spatiotemporal cross-sectional images to detect the intensity and direction of edges and lines
2-4: Process of statistically analyzing the intensity and direction of edges and lines
2-5: Process of statistically analyzing the spatiotemporal projection image
4-1: x-t spatiotemporal image
4-2: x-t spatiotemporal image sequence
4-3: y-t spatiotemporal image
4-4: y-t spatiotemporal image sequence
7-1: x-t spatiotemporal image
7-2: Integration direction
9-1: Zoom parameter
9-2: Pan parameter
9-3: Tilt parameter

Claims

[Claims]

1. A video shooting state detection method for detecting the shooting state of a video using video information obtained by continuously shooting an object, characterized in that: the video information is read out into image frames, each consisting of pixels arranged two-dimensionally; the pixel values of the pixels are read in units of the same row or column from each image frame and arranged in time-series order to create a spatiotemporal cross-sectional image frame for each such row or column; filter processing is applied to the spatiotemporal cross-sectional image frame to calculate edge directions; and quantitative variables of the temporal characteristics at the time of shooting are calculated from the calculated edge directions to detect the shooting state of the video.

2. A video shooting state detection method for detecting the shooting state of a video using video information obtained by continuously shooting an object, characterized in that: the video information is read out into image frames, each consisting of pixels arranged two-dimensionally; spatiotemporal cross-sectional image frames, in which the pixel values of the pixels are read in units of the same row or column from each image frame and arranged in time-series order, are created for all of the rows or all of the columns; filter processing is applied to each created spatiotemporal cross-sectional image frame to calculate the edge intensity at each pixel; for each pixel, the calculated edge intensity is integrated over all of the spatiotemporal cross-sectional image frames; a spatiotemporal projection image whose pixel values are the resulting edge intensities is created; and quantitative variables of the temporal characteristics at the time of shooting are calculated from parameters obtained by statistically analyzing the spatiotemporal projection image to detect the shooting state of the video.