JP4883530B2

JP4883530B2 - Device control method based on image recognition, and content creation method and apparatus using the same

Info

Publication number: JP4883530B2
Application number: JP2007162278A
Authority: JP
Inventors: 剛治江藤; 一基田中; 洋司越智; 俊一根津; 由布江藤
Original assignee: Kinki University
Current assignee: Kinki University
Priority date: 2007-06-20
Filing date: 2007-06-20
Publication date: 2012-02-22
Anticipated expiration: 2027-06-20
Also published as: JP2009003606A

Description

The present invention relates to a method and an apparatus that recognize the actions of an instructor from images and control a device based on those actions. In particular, it relates to a device control method and apparatus based on image recognition that are well suited to a lecturer creating and editing educational content while giving a lecture. In the present invention, the device is controlled by determining the number of fingers the instructor holds up.

Conventionally, teaching materials such as books and compact discs (CDs) have been used in education and training, but these media are bulky and are difficult to revise once they have been produced in large quantities.

In recent years, a form of education and training that uses the Internet, called e-learning, has become widespread. Because learners use teaching materials delivered over the Internet, they can study anytime and anywhere as long as they have a terminal, and the problems of bulk and revision of teaching materials are easily solved.

For e-learning to spread further, it is essential to enrich the content that serves as teaching material. Content that is as easy to understand as possible and of high visual quality is desirable.

However, creating content requires expensive equipment, careful preparation by the person in charge, recording sessions with ample time, and subsequent editing work. On the other hand, an environment in which content can be created easily as needed, even if its quality is not especially high, would help e-learning spread readily.

For example, what a lecturer provides to learners in a lecture is video of the lecturer, video of the materials, video of the writing on the board, and the lecturer's voice; if these can be recorded in an appropriate order, the recording is useful as e-learning material. Because the lecturer presents this information to the learners as the lecture progresses, e-learning material can be created at the same time as the lecture if the lecturer records this information as appropriate.

Because the lecturer is concentrating on the lecture, the lecturer must be able to select among these sources as easily as possible and without awkwardness. In particular, the lecturer moves around the platform to write on the blackboard and holds materials, chalk, or a pointer. It is therefore not easy to use a remote control, the usual means of operating equipment, while giving a lecture.

A promising way to meet these requirements is to recognize the lecturer's actions and switch the video information and the like being recorded accordingly. In particular, if the equipment can be selected and controlled by the shape of the lecturer's hand, no complicated operation is needed and there is no need to carry a remote control around the platform.

Methods of controlling a device by recognizing the shape of a human hand have long been proposed. Patent Document 1 discloses a palm shape recognition method that, in order to recognize the hand shape itself, extracts features using both the geometric features of the hand and the features of its contour line, and compares them with candidate palm shapes prepared in advance.

Patent Document 2 discloses an interface device that recognizes the shape of a hand in place of an input device such as a keyboard or mouse. As methods of recognizing the number of fingers, it mentions template matching, matching with a shape model, and recognition using a neural network.

Patent Document 3 discloses a method of raising the recognition rate by dividing the human body into the required partial regions and performing separate recognition processing for each region. It attempts to recognize the image mainly by calculating cumulative pixel counts.

Patent Document 4 discloses a method that captures an image including a person, separates the skin-colored parts as the face and hands, extracts relatively small, roughly rectangular regions, and determines the number of fingers by comparing their position and size with the palm region.

Patent Document 5 likewise discloses a method that finds the hand region by detecting skin-colored parts and, from an image taken so that the fingers do not overlap, recognizes the number of extended fingers by approximating each finger with a line segment.

JP-A-62-32581; JP-A-9-185456; JP-A-11-191158; JP-A-2001-28046; JP-A-2003-346162

Several constraints apply when the lecturer personally records and edits the lecture itself as e-learning material. Specifically, the number of fingers must be determined from the actions of a lecturer who moves around the platform. The camera that recognizes the lecturer's actions therefore needs a wide-angle lens that takes in the whole platform, so a very high-resolution image cannot be expected.

In addition, because the equipment is selected while the lecture is in progress, the system must respond to the lecturer's actions in real time. A method that needs complicated and extensive computation to recognize the number of fingers cannot be used.

Under these constraints, the methods of Patent Documents 1, 3, 4, and 5, which recognize the number of fingers, approximate the finger contours with straight lines or the like, so their recognition accuracy may fall when low-resolution image data is used.

The present invention provides a method of determining the number of fingers or the shape of the hand under the conditions described above, in which a lecturer edits e-learning material while conducting the lecture.

To solve the above problems, the present invention obtains, from the video data, the width of the hand held out by the instructor (palm shape width) and the width of the fingertips (fingertip width), and determines the number of fingers from their ratio (palm shape ratio). It then outputs a control signal corresponding to the determined number of fingers. Alternatively, the corresponding control signal is obtained directly from the palm shape ratio and output.

Because the present invention determines the number of fingers from the ratio of the width of the hand to the width of the extended fingers, it can do so even from a low-resolution image. The processing requires little computation and uses no reference palm shape candidates, so the system can respond to the lecturer's actions in a short time. Moreover, even if the hand the lecturer holds out is tilted somewhat relative to the camera, the number of fingers is rarely misrecognized.

As a result, the lecturer giving instructions can perform editing operations with simple gestures, and can edit the lecture and create e-learning material unaided while lecturing.

Embodiments of the present invention are described below. The invention is not limited to the following description, and modifications or known techniques may be added as appropriate within the scope of the invention. The description of Embodiment 1 also applies to Embodiment 2 where appropriate, and vice versa.

(Embodiment 1)
FIG. 1 shows the overall configuration of a content creation system using the device control apparatus based on image recognition (hereinafter the "image processing apparatus") according to this embodiment. The system includes a sub camera 12 that photographs the lecturer in order to obtain the number of the lecturer's fingers, an image processing apparatus 10 that acquires the video from the sub camera 12 and determines the number of the lecturer's fingers, a video selector 14 controlled by the image processing apparatus 10, and a recorder 18 that records the signal from the video selector. In this specification and the claims, the number of fingers is also referred to as the "finger count".

A camera 20 that captures the lecturer's face, a camera 22 that captures the blackboard, and a projector 24 that projects lecture materials are connected to the video selector 14.

The output of the video selector 14 and the output of the microphone 26 are connected to the synthesizer 16. The output of the synthesizer 16 is connected to the recorder 18.

The output of the synthesizer 16 is also connected to a distributor 32, which feeds a video monitor 34 that the lecturer can watch and video monitors 40 and 42 watched by the students.
A control indicator 30, which shows the state of the control signal, is connected to the image processing apparatus 10.

Next, each component is described in detail.
The sub camera 12 photographs the lecturer (hereinafter the "instructor") at all times. Its video is used to determine the number of fingers the instructor shows as a cue for issuing an instruction for device control (hereinafter a "control instruction"), so a camera with as high a resolution as possible is desirable. At the same time, the instructor is expected to move left and right across the platform. When the sub camera 12 is a fixed camera, it should therefore have a wide-angle lens with a field of view wide enough to capture the platform from end to end.

The image data Vm of the instructor captured by the sub camera 12 is sent to the image processing apparatus 10, which extracts the number of the instructor's fingers from Vm. How the number of fingers is extracted is described in detail later.

The image processing apparatus 10 then outputs control signals Scv and Scp, corresponding to the number of the instructor's fingers, to the video selector 14 and the projector controller 25. The control signal Scv controls which of the video signals input to the video selector 14 is output: the video signal Vc1 from the camera 20, the video signal Vc2 from the camera 22, or the video signal Vpj from the projector 24.

The control signal Scp controls operations of the projector controller 25 such as outputting an image or changing the image being output.

The camera 20 follows the instructor, and the camera 22 captures the writing on the board. The camera 20 is for showing the instructor while speaking, and its role differs from that of the sub camera 12. It may therefore be a telephoto-type camera with a narrow angle of view, as long as the resulting picture is not hard to watch.

The camera 22 captures the writing on the board. It should be able to capture chalk writing on a blackboard or black marker writing on a whiteboard legibly, and preferably can capture the entire board at once.

The projector 24 projects explanatory materials and the like onto a screen 45 installed in front of the platform. It projects the output of the projector controller 25.

The projector controller 25 is, for example, a small computer on which drawing software is installed, and outputs an image or changes the output image in response to the control signal Scp.

The video output signals of the camera 20, the camera 22, and the projector 24 are Vc1, Vc2, and Vpj, respectively, and are sent to the video selector 14.

The video selector 14 receives Vc1, Vc2, and Vpj and outputs a signal Vsig that is either one of them or several of them shaped into a single signal. In other words, it performs video selection. A signal in which several signals are shaped into one means a signal in which the several video signals are arranged to fit on one screen. Thus Vsig may be one of Vc1, Vc2, and Vpj, or a single video signal in which several of them, such as Vc1 and Vc2, are adjusted to fit on one screen. Vsig is input to the synthesizer 16.
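The selection behaviour of the video selector can be illustrated with a minimal sketch. It is only an assumption about how "shaping several signals to fit one screen" might be done: frames are modelled as lists of pixel rows, and the function name `select_video` and the side-by-side layout with simple column subsampling are hypothetical choices, not taken from the patent.

```python
from typing import Dict, List, Sequence

Frame = List[List[int]]  # one video frame as rows of pixel values


def select_video(sources: Dict[str, Frame], chosen: Sequence[str]) -> Frame:
    """Return one source unchanged, or several sources placed side by side.

    Hypothetical layout: with k chosen sources, every k-th column of each
    source is kept so that the combined frame is about one screen wide.
    """
    if not chosen:
        raise ValueError("at least one source must be chosen")
    if len(chosen) == 1:
        return sources[chosen[0]]
    k = len(chosen)
    height = min(len(sources[name]) for name in chosen)
    combined: Frame = []
    for y in range(height):
        row: List[int] = []
        for name in chosen:
            row.extend(sources[name][y][::k])  # subsample columns
        combined.append(row)
    return combined
```

For example, choosing only `"Vc2"` passes that frame through unchanged, while choosing `"Vc1"` and `"Vc2"` yields a frame whose left half comes from Vc1 and right half from Vc2.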

The microphone 26 picks up the instructor's voice. Its output Asig is input to the synthesizer 16. Although not shown in FIG. 1, the output of the microphone 26 may be split before it reaches the synthesizer 16, and a suitable amplifier or effects unit may be connected.

The synthesizer 16 combines the output Vsig of the video selector 14 and the output Asig of the microphone 26 and outputs the result as a video/audio signal AVsig. The configuration of the synthesizer 16 may be changed as appropriate for the signal formats of Vsig and Asig. However, because AVsig is intended to be played back, it should follow the format and protocol of an existing standard. AVsig is input to the recorder 18 and the distributor 32.

The recorder 18 receives and records the video/audio signal AVsig output by the synthesizer 16. It records the content of the instructor's lecture as audio and video. A known analog or digital video/audio recording device can be used.

The distributor 32 distributes the video/audio signal AVsig from the synthesizer 16. A known distributor can be used. One or more monitors are connected to the distributor 32; the figure shows three: monitor 34, monitor 40, and monitor 42. Monitor 34 is for the instructor, who can watch it to check the content being recorded as teaching material. Monitors 40 and 42 are for students attending the lecture in person. Monitors for students may also be installed at remote locations in Japan or abroad via the Internet.

The control indicator 30 shows the instructor how the image processing apparatus 10 has interpreted the instructor's instruction. The instructor issues a control instruction to the image processing apparatus 10 by striking a particular pose on the platform, but the result cannot always be confirmed just by looking at the instructor's monitor 34. The control indicator 30 therefore displays how the image processing apparatus 10 interpreted the pose.

The control indicator 30 receives a control display signal Sid from the image processing apparatus 10 and presents it in a form visible to the instructor. Specifically, it may be shown on a monitor, or a predetermined lamp may be lit. The control display signal Sid also includes a signal for the case where an instruction was recognized but the number of fingers could not be identified.

For example, this signal is output as Sid when the number of fingers could not be identified because the instructor's gesture was too quick. Such a display prompts the instructor to give the control instruction again.
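The three situations that the control display signal Sid has to distinguish can be sketched as follows. The function name `control_display_signal` and the string values are hypothetical; the patent does not specify how Sid is encoded.

```python
from typing import Optional


def control_display_signal(pose_detected: bool, fingers: Optional[int]) -> str:
    """Map the recognition state to a value for the control indicator.

    The string encodings are illustrative assumptions only.
    """
    if not pose_detected:
        return "idle"               # no control instruction pose seen
    if fingers is None:
        return "retry"              # pose seen, finger count not identified
    return f"fingers={fingers}"     # pose seen and finger count identified
```

The `"retry"` value corresponds to the case described above, in which the display prompts the instructor to repeat the control instruction.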

Because the control indicator 30 serves to convey the recognition state of the image processing apparatus 10 to the instructor, the system may be configured to output it only to the instructor's monitor 34. In that case, the Sid display may be superimposed on the picture produced from the video/audio signal from the synthesizer 16.

The operation of the content creation system described above is now explained in detail. First, the overall operation is described with reference to the flow of FIG. 2.

Referring to FIG. 2, when the content creation system starts (S1000), it determines whether to terminate (S1002). This determination may be made by a hardware interrupt, such as the power being switched off. If the system is to terminate, processing stops (S1005). Otherwise, the sub camera 12 sends the instructor's video Vm to the image processing apparatus 10 (S1100). The image processing apparatus 10 analyzes the instructor's video image by image (S1200) and determines whether it shows a control instruction pose (S1300). If it does, the apparatus extracts the number of the instructor's fingers (S1400) and outputs a control signal corresponding to that number (S1500).
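The FIG. 2 flow can be summarized as a small processing loop. This is a sketch under assumptions: the function `run_control_loop` and its callback parameters are hypothetical names, and the actual pose test and finger counting are passed in as placeholders rather than implemented here.

```python
from typing import Callable, Iterable, Optional

_END = object()  # sentinel marking the end of the frame stream


def run_control_loop(
    frames: Iterable[object],
    is_control_pose: Callable[[object], bool],
    count_fingers: Callable[[object], Optional[int]],
    send_control: Callable[[int], None],
    should_stop: Callable[[], bool] = lambda: False,
) -> int:
    """Run a FIG. 2 style loop; return how many control signals were sent."""
    sent = 0
    stream = iter(frames)
    while not should_stop():              # S1002: terminate? (S1005 if so)
        frame = next(stream, _END)        # S1100: next image from the sub camera
        if frame is _END:
            break
        if not is_control_pose(frame):    # S1200/S1300: analyze, test for a pose
            continue
        fingers = count_fingers(frame)    # S1400: extract the finger count
        if fingers is None:               # pose seen but count not identified
            continue
        send_control(fingers)             # S1500: output the control signal
        sent += 1
    return sent
```

Passing the pose test and the finger counter as callbacks keeps the loop independent of how those steps are realized, which matches the text's remark that the processing may be done in software or in hardware.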

The instructor selects the camera 20 when speaking directly to the students, the camera 22 when writing on the board, and the projector when explaining from the materials. These selections are switched by the instructor's control instruction poses.

Next, the part that recognizes the instructor's pose and outputs a control signal corresponding to the number of fingers is described in detail.

FIG. 3 shows the configuration of the image processing apparatus 10. The image processing apparatus 10 is a computer, and the image processing is in most cases implemented almost entirely in software. Concretely, it therefore often consists of only an MPU (Micro Processor Unit), memory, and an input/output unit. Here the configuration is shown in units of processing so that the processing flow is easy to follow. Each process may instead be implemented in dedicated hardware such as semiconductor circuits.

The image processing apparatus 10 includes a control unit 100, a memory 200, a memory 202, a memory 204, a counter 250, a signal converter 300, and an output port unit 400. The three memories may be separate regions of a single memory. The memory 200 stores a correspondence table TRw-fg used to determine the number of fingers. The memory 202 stores a control signal correspondence table Tfg-Sc, which maps the determined number of fingers to the control signal to be output. The memory 204 stores a background image Dbk captured with no instructor present.
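The two tables can be pictured as simple lookups, as in the sketch below. Everything concrete in it is an assumption made for illustration: the threshold values, the direction of the ratio (taken here as fingertip width divided by palm shape width), the assignment of finger counts to signals, and the names `finger_count` and `control_signal` do not come from the patent.

```python
import bisect
from typing import Optional, Tuple

# Hypothetical stand-in for table TRw-fg: upper bounds of the palm shape
# ratio (assumed: fingertip width / palm shape width) for 1, 2, 3, 4 fingers.
TRW_FG_BOUNDS = [0.3, 0.5, 0.7, 0.9]

# Hypothetical stand-in for table Tfg-Sc: finger count -> (signal, action).
TFG_SC = {
    1: ("Scv", "Vc1"),   # select camera 20
    2: ("Scv", "Vc2"),   # select camera 22
    3: ("Scv", "Vpj"),   # select the projector picture
    4: ("Scp", "next"),  # change the projected image
}


def finger_count(fingertip_width: float, palm_width: float) -> Optional[int]:
    """Look up the finger count from the two measured widths."""
    if palm_width <= 0 or fingertip_width <= 0:
        return None
    ratio = fingertip_width / palm_width
    return bisect.bisect_left(TRW_FG_BOUNDS, ratio) + 1


def control_signal(fingers: Optional[int]) -> Optional[Tuple[str, str]]:
    """Look up the control signal for a finger count, if one is assigned."""
    return TFG_SC.get(fingers)
```

Because only a ratio of two widths and two table lookups are involved, the computation per frame stays small, which is consistent with the real-time requirement stated earlier.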

The background image is assumed to be stored before the content creation system is used. Although not shown, a work area used by the control unit 100 and a memory area storing the program that the control unit 100 executes are provided separately. The counter 250 is used to confirm that the instructor's pose continues for a certain time; measuring the time with a timer may be substituted. Although FIG. 3 shows the memories inside the control unit 100, they may be outside it.
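The role of the counter 250 can be sketched as a frame counter that resets whenever the pose disappears. The class name `PoseHoldCounter` and the choice to report the hold exactly once, on the frame where the required count is reached, are assumptions for illustration.

```python
class PoseHoldCounter:
    """Count consecutive frames in which the pose is seen (hypothetical sketch)."""

    def __init__(self, required_frames: int) -> None:
        self.required_frames = required_frames
        self.count = 0

    def update(self, pose_detected: bool) -> bool:
        """Feed one frame; return True on the frame where the hold is reached."""
        if pose_detected:
            self.count += 1
        else:
            self.count = 0  # the pose was interrupted, start over
        return self.count == self.required_frames
```

Requiring the pose to persist for several frames is one way to keep a hand that merely passes through the picture from triggering a control signal.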

The signal converter 300 converts the input video Vm of the instructor into image data Dz for image processing. For example, when Vm is an analog signal, it converts Vm into digital data by suitable A/D conversion. When the input signal Vm is already a digital signal that can be used directly for image processing, the signal converter 300 may be omitted.

The output port unit 400 delivers the control signal Sc from the control unit 100 to the connected devices. Because the image processing apparatus 10 outputs the control signal Scv to the video selector 14 and the control signal Scp to the projector controller 25, the control signal Sc comprises Scv and Scp. The output port unit 400 also outputs the display signal Sid to the control indicator 30.

The control unit 100 includes a person data detection unit 130, a centroid line detection unit 132, an arm tip data detection unit 110, a palm shape width detection unit 112, a fingertip width detection unit 114, a finger count determination unit 116, and a control signal output unit 118.

Their operation is described below with reference to the configuration diagram of FIG. 3 and the flow of FIG. 4. The input to the image processing apparatus 10 is the instructor video signal Vm from the sub camera 12. The signal converter 300 converts it into input image data Dz, which is input to the control unit 100 (S1100 in FIG. 2). When Dz is input, the person data detection unit 130 subtracts the background data Dbk stored in the memory 204 from Dz and extracts person data Dd (S1211).

At this point the person data Dd is binarized. Binarization is done by comparing the value obtained by subtracting the background data Dbk from the input image data Dz against a suitable threshold, chosen so that background pixels are judged to be zero. In the person data Dd, pixels belonging to the person therefore have the value 1 and background pixels have the value 0.
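Background subtraction followed by binarization can be sketched in a few lines. The sketch assumes grayscale images stored as lists of rows; the function name `extract_person_data`, the use of the absolute difference, and the default threshold of 30 are illustrative assumptions, since the text only says that a suitable threshold is chosen.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def extract_person_data(frame: Image, background: Image, threshold: int = 30) -> Image:
    """Return a binary mask: 1 where the frame differs from the background."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]
```

Small differences caused by noise fall below the threshold and are treated as background, so only the silhouette of the person remains set to 1.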

Next, the centroid line detection unit 132 obtains the centroid line of the person data Dd (S1212). The centroid line is a vertical line, parallel to the Y axis, that passes through the centroid of the person's silhouette in the image.

The centroid line is explained with reference to FIG. 6, which shows an example of person data Ddc with the centroid line added. Ddc is the person data Dd for which the centroid has been obtained. The frame 53 indicates one unit of the person data Dd, which may be one field or one frame. The person data Dd is two-dimensional, and each pixel is identified by its coordinates. Here the horizontal direction of the image is the X axis 50 and the vertical direction is the Y axis 51.

Because the background data Dbk has been subtracted, the background portion 54 has the value zero; it is shown in white in FIG. 6. As a result of binarization, every pixel of the instructor's image 55 has the value 1; it is shown in black in FIG. 6. White and black may of course be reversed. The person data Dd is thus obtained as a silhouette of the person. Only the upper body appears because everything below the waist is hidden by objects in the background data, such as a desk.

The centroid can be obtained from the on-screen coordinates of the binarized person data Dd by averaging the x coordinates and the y coordinates separately. It may also be obtained by a method such as least squares. FIG. 5 shows the centroid 56. The straight line through the centroid parallel to the Y axis is the centroid line 57.
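Averaging the coordinates of the silhouette pixels can be written directly. The sketch below assumes the binarized person data is a list of rows of 0 and 1 values; the function name `centroid` is hypothetical. The x component of the result gives the position of the centroid (center of gravity) line.

```python
from typing import List, Optional, Tuple


def centroid(mask: List[List[int]]) -> Optional[Tuple[float, float]]:
    """Return (x, y) averaged over all pixels with value 1, or None if empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # no person in the image
    return sum_x / count, sum_y / count
```

A single pass over the pixels suffices, so the cost grows only linearly with the image size.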

Returning to FIGS. 3 and 4, the arm tip data detection unit 110 next extracts arm tip data Darm, the data for the arm and beyond (S1213). It does so by cutting out the part of the person data Ddc that lies a predetermined distance away from the centroid line. In other words, the arm tip data is extracted from the person data, or equivalently the person data is divided based on the centroid. It follows that, to issue a control instruction to the image processing apparatus 10, the instructor strikes a pose in which the arm is stretched out into the space a certain distance from the body and the required number of fingers is extended there.

There is no particular restriction on how this distance is determined. For example, a fixed value may be set in advance, or the distance may be derived each time from the width of the silhouette in the person data. A fixed value of 40 to 60 cm is suitable, because at that distance the instructor can extend an arm and signal a control instruction without discomfort.

If the instructor's arm is lowered or held close to the body, so that no control instruction is being signaled, the arm-tip data Darm contains only zeros, that is, binarized background. Whether the instructor has struck the predetermined pose can therefore be judged by whether the arm-tip data Darm has a value (S1311).

The value of the arm-tip data Darm is the total number of pixels with a value of 1 inside the fixed region cut out at the predetermined distance from the centroid line of the person data Dd. Because part of the instructor's arm may enter the cut-out region by chance, the judgment may instead be based on whether Darm exceeds a predetermined value. In FIG. 4, step S1311 compares the value of Darm with La, where La may be zero or a predetermined value.

FIG. 7 shows a state in which the arm-tip data Darm has been detected. Partial image data 62 is cut out at a predetermined distance 60 from the centroid line 57 of the person data Ddc. The partial image data 62 corresponds to the arm-tip data Darm. In FIG. 7, the instructor's arm portion 64 lies inside the region of the partial image data 62. The pixels of this portion have a value of 1, so summing the pixel data over the whole of the partial image data 62 gives a nonzero total. This total is the value of the arm-tip data Darm.
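
The cut-out and the comparison with La can be illustrated with the sketch below. Several things here are assumptions made for the example: the distance is expressed in pixels rather than centimeters, the region is taken only on the right-hand side of the image, and the names `arm_tip_region`, `arm_tip_value`, and `is_command_pose` are hypothetical.

```python
def arm_tip_region(person, centroid_x, distance, width):
    """Cut out `width` columns starting `distance` pixels right of the centroid line."""
    start = int(centroid_x) + distance
    return [row[start:start + width] for row in person]


def arm_tip_value(region):
    """Value of Darm: the number of 1-pixels inside the cut-out region."""
    return sum(sum(row) for row in region)


def is_command_pose(region, la=0):
    """Step S1311 as described: the pose is present when Darm exceeds La."""
    return arm_tip_value(region) > la


# Toy silhouette: the body occupies columns 1-3, an extended arm runs along row 1.
person = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0, 0, 0, 0],
]
region = arm_tip_region(person, centroid_x=2, distance=3, width=3)
```

With La set to zero any pixel in the region counts as a pose; raising La filters out an arm that only grazes the region.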

When arm-tip data Darm is present, it is further determined whether it has persisted for a certain time (S1312). In the configuration of FIG. 3, the arm-tip data detection unit 110 increments the counter 250 each time it detects the arm-tip data Darm and checks whether the counter has reached a certain value. Alternatively, a timer may be used to detect that the predetermined time has elapsed.

The reason is that, in the course of the instructor's ordinary movements, an arm may happen to move to a position a certain distance from the centroid line of the body, and such poses must be excluded from the control-instruction poses. The instructor must therefore hold the control-instruction pose for a certain time. If this time is too long, holding the pose disrupts the rhythm of the lecture, so 0.3 to 1.5 seconds is suitable. Below 0.3 seconds, a movement not intended as a control instruction may be mistaken for one.

Conversely, above 1.5 seconds the instructor loses the rhythm of the lecture. A hold time of 0.4 to 0.6 seconds is more suitable. When the instructor's pose has continued for the required time (Y branch of S1312), the arm-tip data Darm is output (S1313).
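
The counter-based persistence check might look like the following sketch. The class name `PoseHoldDetector`, the frame rate, and the choice to reset the counter whenever the pose disappears are assumptions; the text only states that a counter is incremented and compared with a fixed value.

```python
class PoseHoldDetector:
    """Counts consecutive frames in which arm-tip data is present."""

    def __init__(self, frame_rate=30, hold_seconds=0.5):
        # Number of consecutive frames the pose must persist.
        self.required = max(1, round(frame_rate * hold_seconds))
        self.count = 0

    def update(self, pose_present):
        """Feed one frame; return True once the pose has been held long enough."""
        self.count = self.count + 1 if pose_present else 0
        return self.count >= self.required


detector = PoseHoldDetector(frame_rate=10, hold_seconds=0.5)  # needs 5 frames
frames = [True, True, False, True, True, True, True, True]
results = [detector.update(p) for p in frames]
```

In this run the brief interruption in the third frame restarts the count, so the instruction is accepted only on the last frame.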

A feature of the present invention is that the palm width and the fingertip width are extracted from this arm-tip data, the ratio of the two is compared with thresholds stored in advance to determine the number of extended fingers, and a control signal corresponding to the ratio or to the determined number of fingers is output. The arm-tip data Darm is sent to the palm width detection unit 112 and the fingertip width detection unit 114.

The palm width detection unit 112 extracts the length of the widest part of the arm-tip data Darm as the palm width data Dwh (S1411). In FIG. 8, the portion 70 is the palm width data Dwh. The extraction method is not particularly limited. One usable method is to find the X position in the arm-tip data Darm at which the number of Y coordinates with a value of 1 is largest, and to take that number as the palm width data Dwh.

The fingertip width detection unit 114, for its part, detects the width of the end of the arm-tip data Darm farther from the centroid line as the fingertip width data Dwf (S1411). In FIG. 8, the portion 71 is the fingertip width data Dwf. The X coordinate at which Dwf is extracted is not particularly limited. For example, it may be extracted at the X coordinate located at 95% of the distance to the point in the arm-tip data Darm that is farthest from the Y axis.

In the configuration of FIG. 3, the palm width detection unit 112 and the fingertip width detection unit 114 are drawn in parallel, but in a software implementation either may be processed first. Likewise, the flow of FIG. 4 shows the extraction of Dwh and Dwf as a single step (S1411), but either may be performed first.
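
The two width measurements can be sketched with column counts, following the example methods given in the text. The sketch assumes the arm-tip data is a 2D list indexed `arm_tip[y][x]` with the hand pointing toward larger x, and it measures the 95% position from the left edge of the cut-out region rather than from the Y axis; the function names are hypothetical.

```python
def column_counts(arm_tip):
    """Number of 1-pixels in each column (X position) of the arm-tip data."""
    return [sum(col) for col in zip(*arm_tip)]


def palm_width(arm_tip):
    """Dwh: the largest per-column count of 1-pixels."""
    return max(column_counts(arm_tip))


def fingertip_width(arm_tip, fraction=0.95):
    """Dwf: the per-column count at `fraction` of the farthest occupied column."""
    counts = column_counts(arm_tip)
    occupied = [x for x, c in enumerate(counts) if c > 0]
    if not occupied:
        return 0
    return counts[round(max(occupied) * fraction)]


# Toy hand: a 4-pixel-high palm in columns 0-9, two 1-pixel fingers in columns 10-20.
arm_tip = [[0] * 21 for _ in range(6)]
for x in range(10):
    for y in range(1, 5):
        arm_tip[y][x] = 1
for x in range(10, 21):
    arm_tip[1][x] = 1
    arm_tip[3][x] = 1

dwh = palm_width(arm_tip)
dwf = fingertip_width(arm_tip)
```

Because only 1-pixels are counted, the gap between the two fingers does not contribute to Dwf, which is consistent with counting "Y coordinates with a value of 1".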

Referring again to FIGS. 3 and 4, the palm width data Dwh and the fingertip width data Dwf are sent to the finger-count determination unit 116. The finger-count determination unit 116 computes the palm-shape ratio Rw, the ratio of Dwf to Dwh, compares it with the thresholds of the finger-count determination table TRw-fg stored in the memory 200, and determines the finger count Nf (S1412).

FIG. 9 shows an example of the finger-count determination table TRw-fg. The palm-shape ratio Rw (= Dwf/Dwh), the ratio of the fingertip width to the palm width, takes a value between 0 and 1, and this range is divided into four intervals. Each interval corresponds to a finger count Nf. For example, when Rw is 0.41, the finger count Nf is judged to be two.

The thresholds that bound the intervals of the finger-count determination table TRw-fg vary with the individual and with sex, so they are preferably set by measuring several sets of palm data and taking a statistical average. The thresholds in the table of FIG. 9 are examples, and the invention is not limited to them.
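
A table lookup of this kind can be sketched as below. The interval boundaries in `FINGER_COUNT_TABLE` are hypothetical placeholders, since the values of FIG. 9 are not reproduced in the text; they were chosen only so that Rw = 0.41 maps to two fingers, as in the example above.

```python
# (upper bound of Rw, finger count Nf); hypothetical thresholds, not those of FIG. 9
FINGER_COUNT_TABLE = [
    (0.30, 1),
    (0.50, 2),
    (0.75, 3),
    (1.00, 4),
]


def finger_count(dwf, dwh, table=FINGER_COUNT_TABLE):
    """Return Nf for the palm-shape ratio Rw = Dwf / Dwh, or None if undefined."""
    if dwh <= 0:
        return None
    rw = dwf / dwh
    for upper, nf in table:
        if rw <= upper:
            return nf
    return None  # Rw above 1 falls outside every interval


nf = finger_count(41, 100)  # Rw = 0.41
```

In practice the boundaries would be calibrated from measured palm data, as the paragraph above recommends.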

Thus, in the present invention, the number of extended fingers is not counted by recognizing the individual fingers; it is judged from the ratio of the width of the silhouette of the extended fingers to the width of the hand. The judgment therefore requires little computation, and the finger count can be determined even from low-resolution palm data.

Because the method uses the ratio Rw of the fingertip width to the palm width, the finger count can be judged without additional computation even when the fingers are slightly spread or the extended hand is slightly tilted. And because individual fingers are not recognized, misrecognition is rare. This means that the number of fingers the instructor holds up naturally can be determined. The instructor therefore does not need to concentrate on making the image processing apparatus recognize the fingers when issuing a control instruction, and can focus on the lecture.

Once the finger count Nf has been determined, its value is output (S1412). The finger count Nf is also output as the notification signal Sid to the control display 30 shown in FIG. 1. FIG. 3 shows the finger count Nf from the finger-count determination unit 116 being output through the output port unit 400 as the signal Sid to the control display 30, but this is only one example. By looking at this display, the instructor can tell whether the pose was recognized as the intended instruction.

Steps S1211 to S1213 above belong to the pose detection process S1200 of FIG. 2. Steps S1311 to S1313 belong to the judgment S1300 of FIG. 2 as to whether a control instruction has been given. Steps S1411 and S1412 belong to the process S1400 of FIG. 2 that extracts the number of fingers.

The subsequent processing is described next with reference to FIGS. 3 and 5. The finger-count determination unit 116 of the control unit 100 outputs the finger count Nf. The control signal output unit 118 receives the finger count Nf, looks it up in the control signal correspondence table Tfg-Sc stored in the memory 202, determines the control signal to be output, and outputs it.

The description here assumes that one to four fingers can be distinguished. For one finger, the video selector 14 selects the output of camera 1 (20); for two fingers, the output of camera 2 (22); and for three fingers, the output of the projector 24. For four fingers, the projector controller 25 is made to output the next image. These assignments must be decided in advance.

When the instructor holds up one finger (S1510), the image processing apparatus 10 uses the control signal Scv to make the video selector select and output the output Vc1 of camera 1 (20) (S1511).

When the instructor holds up two fingers (S1515), the image processing apparatus 10 uses the control signal Scv to make the video selector select and output the output Vc2 of camera 2 (22) (S1516).

When the instructor holds up three fingers (S1520), the image processing apparatus 10 uses the control signal Scv to make the video selector select and output the output Vpj of the projector 24 (S1521).

When the instructor holds up four fingers (S1525), the image processing apparatus 10 uses the control signal Scp to make the projector controller 25 output the next image (S1526).

If none of these applies, a re-instruction prompt is output to the control display 30 (S1530), and processing returns to the end determination (S1002) of FIG. 2.

The control signal output unit 118 outputs the control signal Sc (Scv or Scp) to the output port unit 400. The output port unit 400 has a plurality of output ports and routes each control signal to the correct destination according to its type. Here, the video selector control signal Scv is sent to the video selector 14, and the projector controller control signal Scp is sent to the projector controller 25.

The control signal Sc may be output separately for each controlled device, or an address may be assigned to each controlled device and the control signal sent out encoded together with the address.

FIG. 10 shows an example of the control signal correspondence table Tfg-Sc stored in the memory 202. It records the control signal Sc for each finger count Nf. For example, for a finger count Nf of two, the recorded control signal Sc is the one that selects Vc2. In practice the content of Sc is written as a predetermined code. The contents of this table are stored in advance.
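
The mapping from finger count to control signal described for Embodiment 1 can be sketched as a dictionary lookup. The destination names and command strings below are hypothetical stand-ins for the "predetermined codes" mentioned in the text; only the four assignments and the re-instruction fallback come from the description.

```python
# Finger count Nf -> (destination, command); the string codes are illustrative only.
CONTROL_SIGNAL_TABLE = {
    1: ("video_selector", "select Vc1"),       # camera 1 (20)
    2: ("video_selector", "select Vc2"),       # camera 2 (22)
    3: ("video_selector", "select Vpj"),       # projector 24
    4: ("projector_controller", "next image"), # projector controller 25
}


def control_signal(nf):
    """Return the control signal for Nf, or a re-instruction prompt (S1530)."""
    return CONTROL_SIGNAL_TABLE.get(nf, ("control_display", "re-instruct"))


signal = control_signal(2)
```

Keeping the assignments in a table rather than in branching code makes them easy to change in advance of a lecture.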

Referring to FIGS. 9 and 10, the finger count Nf is simply a variable that links a range of the palm-shape ratio Rw to a control signal Sc. The palm-shape ratio Rw may therefore be associated with the control signal Sc directly. In that case, the finger-count correspondence table 200 of FIG. 3 holds instead the correspondence between palm-shape ratio and control signal, and the output of the finger-count determination unit 116 is the control signal Sc. The control signal output unit 118 and the control signal correspondence table 202 can then be omitted. FIG. 11 shows a specific example of such a correspondence table TRw-Sc.

In that case, in the processing flow of FIG. 4, the step that determines and outputs the finger count Nf (S1412) outputs the control signal Sc itself. Accordingly, the decision steps S1510 to S1525 of FIG. 5 are replaced by steps that test the control signal Sc itself. For example, S1510, which judges whether the number of fingers is one, is replaced by a step that judges whether the control signal Sc is the signal that selects Vc1.

With a content creation system that operates as described above, the instructor, as lecturer, can switch among the video of the lecturer, the blackboard, and the projector output simply by striking control-instruction poses as needed during the lecture, and can record the video together with the spoken explanation. Content created in this way can be used directly as e-learning material. In other words, the lecturer can create teaching material single-handedly while lecturing, without an assistant to operate the equipment and without later editing.

Although three devices are connected to the video selector here, there may be more. The image processing apparatus 10 may also control more devices. The sub camera 12 that captures the instructor's control instructions may be an infrared camera. In particular, a camera sensitive to near-infrared light lowers the contrast of the background image and thereby raises the SNR when the person data Dd is obtained by subtracting the background data Dbk. If the contrast of the background image can be made sufficiently low, the background data need not be subtracted at all, and the person data Dd can be obtained by binarization alone.

The sub camera 12 has been described as a fixed camera, but it may automatically track the instructor. In that case, background data is acquired in advance for each rotation angle of the sub camera 12, and the person data is obtained using the background data for the angle at which the instructor was captured.

When use of the infrared camera described above makes the background data unnecessary, the sub camera 12 may be an infrared camera that automatically tracks the instructor. Furthermore, although the description above used an image containing the instructor's upper body as the person data, the apparatus may instead recognize the shape of a hand held out in front of the sub camera 12 to give a control instruction.

The processing described above can be implemented by writing it in a suitable programming language, storing the program in the image processing apparatus 10, and executing it. Alternatively, a recording medium holding such a program may be provided in an external storage means or the like, and the program read from it and executed.

(Embodiment 2)
This embodiment illustrates a method of increasing the number of types of control instruction the instructor can give. In Embodiment 1, a centroid line was obtained for the person data Dd, and it was judged whether the instructor's arm (arm-tip data Darm) was present in a region a certain distance from the centroid line. By subdividing the area in which the arm-tip data Darm may appear, more instructions can be given.

FIG. 12 shows an example in which the number of detection positions for the arm-tip data Darm is increased for the person data Ddc, here with four regions defined for the person data Dd. To define these regions, an overhead line 87 is obtained in addition to the centroid line 57 obtained in Embodiment 1. The overhead line is found by locating the highest position in the Y-axis direction in the person data Dd.

The areas delimited by the centroid line 57 and the overhead line 87 serve as the regions for identification. Region 80 lies in the same direction as the region used in Embodiment 1, whereas region 81 is newly defined.

Region 83, like region 80, is cut out at a predetermined distance from the centroid line 57. Regions 81 and 82 lie to the left and right of the silhouette, above the overhead line 87.

With these regions defined, the processing of FIGS. 4 and 5 is modified. FIGS. 13 and 14 show the modified processing flow. The components of the control unit 100 shown in FIG. 3 are also changed to match the processing of this embodiment, so supplementary explanation is given here for each component of FIG. 3. The centroid line detection unit 132 obtains not only the centroid line but also the overhead line, and divides the person data into regions by these lines (S1222).

At this point, identification data is assigned to each region (hereinafter "region data Dare"). Here the region data Dare matches the reference numerals of regions 80 to 83, taking the values 80 to 83 respectively; for example, Dare for region 80 is "80". The arm-tip data detection unit 110 checks each of the four regions for arm-tip data Darm (S1321). When Darm is present in a region, the unit confirms that the predetermined time has elapsed (S1322) and outputs the arm-tip data Darm together with the region data Dare of the region in which it is present (S1323). This can also be described as a step of detecting the region data Dare in which the arm-tip data Darm is present.
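
One way to picture the four-region division is the sketch below. It rests on several assumptions that the text does not settle: image rows grow downward, so "above the overhead line" means a smaller row index; regions 80 and 81 are placed on the image-left side and regions 82 and 83 on the image-right side; and any pixel above the overhead line counts as raised regardless of its horizontal distance. The function names are hypothetical.

```python
def overhead_line(person):
    """Row index of the topmost silhouette pixel (rows grow downward)."""
    for y, row in enumerate(person):
        if any(row):
            return y
    return None


def region_of(x, y, centroid_x, overhead_y, distance):
    """Return the region data Dare (80-83) for a pixel at (x, y), or None."""
    left_side = x < centroid_x          # assumed side for regions 80 and 81
    if y < overhead_y:                  # above the overhead line: raised hand
        return 81 if left_side else 82
    if abs(x - centroid_x) >= distance: # beside the body, beyond the set distance
        return 80 if left_side else 83
    return None                         # too close to the body: no instruction


dare = region_of(10, 40, centroid_x=50, overhead_y=20, distance=30)
```

A real implementation would classify the whole arm-tip blob rather than a single pixel, for example by summing the 1-pixels that fall in each region.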

The palm width detection unit 112 and the fingertip width detection unit 114 obtain the palm width and the fingertip width with reference to the region data (S1421). The palm-shape ratio Rw is then computed, and the finger count Nf, determined from the finger-count correspondence table, is output together with the region data Dare (S1422). The finger-count correspondence table may be the same as in Embodiment 1, or a separate table may be prepared for each region. This is because the effective resolution of the camera may differ between a hand held out to the side of the body and a hand raised above the head, so the finger-count thresholds on the palm-shape ratio may differ with the direction of the hand.

FIG. 15 shows an example of per-region finger-count correspondence tables. These tables are stored in memory and used in place of the finger-count correspondence table 200 of FIG. 3. Regions 80 and 83 differ only in being on the left or right; both are to the side of the body, so the same palm-shape ratio thresholds can be used for them. Regions 81 and 82 correspond to a raised hand, so their thresholds differ slightly because of the difference between the camera's horizontal and vertical resolution. Depending on the system and camera used, a separate finger-count correspondence table may be provided for every region.

Referring to FIG. 14, the control signal output unit 118 first checks whether the region data Dare indicates region 80, region 81, region 82, or region 83 (S1550, S1560, S1570, S1580), then selects the corresponding control signal from the control signal correspondence table and outputs it (S1555, S1565, S1575, S1585). As in the processing of FIG. 5, steps S1555, S1565, S1575, and S1585 select a control signal according to the number of fingers.

FIG. 16 shows an example of a control signal table keyed by region data and finger count. This table is stored in memory and used in place of the control signal correspondence table 202 of FIG. 3. The control signal output unit 118 outputs control signals according to this table when performing steps S1555, S1565, S1575, and S1585.

Finger counts of one to four can be identified in each of regions 80 to 83. For a right-handed instructor the left hand is comparatively free, so control signals for video signal selection, such as selecting Vc1, are assigned to region data 82, allowing the instructor to select a video signal by raising the left hand. For undefined entries, the image processing apparatus 10 outputs no control signal even when it identifies the instructor's gesture. It does, however, output the control display signal Sid, in order to notify the instructor that the control instruction has been recognized.

Camera 2 is the camera that captures the blackboard. Control signals that pan camera 2 left and right are assigned to the gesture of extending an arm to the left or right with two fingers held up, specifically to region data 80 with finger count Nf = 2 and to region data 83 with finger count Nf = 2.

In addition, zooming out camera 2 is assigned to region data 80 with one finger, and zooming in camera 2 to region data 81 with one finger. Thus, when the instructor raises the right hand with one finger extended, camera 2 zooms in, and when the right arm is then lowered to the side of the body, camera 2 zooms out. Needless to say, in this case signal lines for controlling pan and zoom are provided from the image processing apparatus 10 to camera 2 in FIG. 3.
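
The assignments named in the text can be collected into a lookup keyed by region data and finger count, as sketched below. Only the entries explicitly mentioned above are included; the full table of FIG. 16 is not reproduced in the text, the pan direction for each region is not specified, and the command strings and the function name `dispatch` are hypothetical.

```python
# (region data Dare, finger count Nf) -> control signal; partial, illustrative table
REGION_SIGNAL_TABLE = {
    (80, 1): "camera2 zoom out",
    (81, 1): "camera2 zoom in",
    (80, 2): "camera2 pan",  # direction for region 80 is not stated in the text
    (83, 2): "camera2 pan",  # direction for region 83 is not stated in the text
    (82, 1): "select Vc1",
}


def dispatch(region, nf):
    """Return (control signal or None, notification for the control display).

    Undefined combinations produce no control signal, but the notification
    (the role of Sid) is still returned so the instructor sees the recognition.
    """
    signal = REGION_SIGNAL_TABLE.get((region, nf))
    notice = f"recognized: region {region}, {nf} finger(s)"
    return signal, notice


signal, notice = dispatch(81, 1)
```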

As described above, dividing the instructor's person data into a plurality of regions and identifying the finger count in each region increases the number of instruction types and allows finer control by a single person.

As in Embodiment 1, the palm-shape ratio may be associated directly with the control signal. FIG. 17 shows an example of a table that directly associates region data and palm-shape ratio with control signals. The overhead line need not be at the top of the head; it may be set at the height of the center of the head, the neck, or the shoulders.

Although the present invention obtains the palm width and fingertip width from the video, these widths may be judged in finer detail. For example, the palm width changes with how the thumb is held, so the position of the thumb may be identified. The step between the line along the little-finger side of the palm and the extended fingers may also be identified. Such finer identification allows more accurate recognition.

Other kinds of image recognition may be combined with the device control method based on the palm-shape ratio. Examples include recognizing whether the instructor's arm is bent or extended, recognizing the orientation of the instructor's palm, and dynamically recognizing movements of the hand or arm.

Control signals of two or more digits may also be defined by giving control instructions several times within a predetermined time. The division into regions is not limited to division by the centroid line and the overhead line, and a larger number of regions may be used. Needless to say, the device control method of the present invention can also be used to control household devices such as televisions.

The present invention can be suitably used in a system with which a single person creates content.

FIG. 1 is a diagram showing the configuration of the content creation system of the present invention.
FIG. 2 is a diagram showing the flow of the content creation method of the present invention.
FIG. 3 is a diagram showing the configuration of the image processing apparatus of the present invention.
FIG. 4 is a diagram showing the flow of operation of the image processing apparatus of the present invention.
FIG. 5 is a diagram showing the flow of operation of the image processing apparatus of the present invention.
FIG. 6 is a diagram illustrating person data.
FIG. 7 is a diagram illustrating arm-tip data.
FIG. 8 is a diagram illustrating how the palm width and the fingertip width are obtained.
FIG. 9 is a diagram illustrating a finger-count correspondence table.
FIG. 10 is a diagram illustrating a control signal correspondence table.
FIG. 11 is a diagram illustrating a table that associates the palm-shape ratio with control signals.
FIG. 12 is a diagram illustrating person data divided into four regions.
FIG. 13 is a diagram showing the processing flow when the area is divided into a plurality of regions.
FIG. 14 is a diagram showing the processing flow when the area is divided into a plurality of regions.
FIG. 15 is a diagram illustrating per-region finger-count correspondence tables.
FIG. 16 is a diagram illustrating a control signal table keyed by region data and finger count.
FIG. 17 is a diagram illustrating a table that directly associates region data and palm-shape ratio with control signals.

Explanation of symbols

10 Image processing apparatus
12 Sub camera
14 Video selector
16 Synthesizer
18 Recorder
20 Camera
30 Control display
32 Distributor
34 Monitor
45 Screen

Claims

1. A device control method based on image recognition, comprising:
a step of detecting arm tip data of an instructor;
a step of detecting a palm shape width from the arm tip data;
a step of detecting a fingertip width from the arm tip data;
a step of obtaining the ratio between the fingertip width and the palm shape width as a palm shape ratio; and
a step of outputting a control signal corresponding to the palm shape ratio.

2. A device control method based on image recognition, comprising:
a step of detecting arm tip data of an instructor;
a step of detecting a palm shape width from the arm tip data;
a step of detecting a fingertip width from the arm tip data;
a step of obtaining the ratio between the fingertip width and the palm shape width as a palm shape ratio;
a step of determining a finger count from the palm shape ratio; and
a step of outputting a control signal corresponding to the finger count.
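The chain from the two measured widths to a control signal could look roughly like the sketch below. It assumes the palm shape ratio is computed as fingertip width divided by palm shape width, and it uses made-up threshold ranges and signal names. `FINGER_COUNT_TABLE`, `CONTROL_SIGNAL_TABLE`, and every value in them are hypothetical placeholders, not the tables of the patent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical finger count table: (lower bound, upper bound, finger count).
FINGER_COUNT_TABLE: List[Tuple[float, float, int]] = [
    (0.00, 0.25, 1),
    (0.25, 0.45, 2),
    (0.45, 0.65, 3),
    (0.65, 0.85, 4),
    (0.85, 1.50, 5),
]

# Hypothetical control signal table: finger count -> signal name.
CONTROL_SIGNAL_TABLE: Dict[int, str] = {
    1: "SELECT_CAMERA",
    2: "SELECT_SUB_CAMERA",
    3: "SELECT_SCREEN",
}


def palm_shape_ratio(fingertip_width: float, palm_width: float) -> float:
    """Ratio of fingertip width to palm shape width (direction assumed)."""
    if palm_width <= 0:
        raise ValueError("palm width must be positive")
    return fingertip_width / palm_width


def finger_count(ratio: float) -> Optional[int]:
    """Look up the finger count whose range contains the ratio."""
    for low, high, count in FINGER_COUNT_TABLE:
        if low <= ratio < high:
            return count
    return None  # ratio outside every range: no gesture recognized


def control_signal(fingertip_width: float, palm_width: float) -> Optional[str]:
    """Map measured widths to a control signal, or None if unrecognized."""
    count = finger_count(palm_shape_ratio(fingertip_width, palm_width))
    if count is None:
        return None
    return CONTROL_SIGNAL_TABLE.get(count)
```

Using a ratio rather than an absolute width presumably makes the lookup independent of how far the instructor stands from the camera, since both widths scale together.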

3. A device control method based on image recognition, comprising:
a step of obtaining image data including the upper body of an instructor;
a step of subtracting image data of the instructor's background from the image data including the upper body to obtain person data;
a step of obtaining arm tip data from the person data;
a step of detecting a palm shape width from the arm tip data;
a step of detecting a fingertip width from the arm tip data;
a step of obtaining the ratio between the fingertip width and the palm shape width as a palm shape ratio; and
a step of outputting a control signal corresponding to the palm shape ratio.
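The background subtraction step can be illustrated with a minimal sketch. It assumes grayscale images stored as nested lists of integers and treats a pixel as part of the person when it differs from the stored background by more than a threshold. The thresholded absolute difference, the function name `extract_person`, and the default threshold are assumptions made for illustration; the claim only states that the background is subtracted.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel values, row by row


def extract_person(frame: Image, background: Image,
                   threshold: int = 30) -> Image:
    """Return a binary mask: 1 where the frame differs from the background.

    Assumption: a pixel belongs to the person when the absolute difference
    from the background exceeds `threshold`.
    """
    return [
        [1 if abs(f - b) > threshold else 0
         for f, b in zip(frame_row, background_row)]
        for frame_row, background_row in zip(frame, background)
    ]
```

The resulting mask plays the role of the person data from which the arm tip data would then be extracted.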

4. The device control method based on image recognition according to claim 3, further comprising:
a step of dividing the person data into a plurality of regions; and
a step of detecting region data indicating in which of the plurality of regions the arm tip data is located,
wherein the step of outputting the control signal is a step of outputting a control signal corresponding to the palm shape ratio and the region data.

5. A device control method based on image recognition, comprising:
a step of obtaining image data including the upper body of an instructor;
a step of subtracting image data of the instructor's background from the image data including the upper body to obtain person data;
a step of obtaining arm tip data from the person data;
a step of detecting a palm shape width from the arm tip data;
a step of detecting a fingertip width from the arm tip data;
a step of obtaining the ratio between the fingertip width and the palm shape width as a palm shape ratio;
a step of determining a finger count from the palm shape ratio; and
a step of outputting a control signal corresponding to the finger count.

6. The device control method based on image recognition according to claim 5, further comprising:
a step of dividing the person data into a plurality of regions; and
a step of detecting region data indicating in which of the plurality of regions the arm tip data is located,
wherein the step of outputting the control signal is a step of outputting a control signal corresponding to the finger count and the region data.

7. The device control method based on image recognition according to claim 4 or 6, wherein the step of dividing into regions includes:
a step of obtaining the center of gravity of the person data; and
a step of dividing the person data into a plurality of regions based on the center of gravity.
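As a rough illustration of dividing the person data around its center of gravity, the sketch below computes the centroid of a binary person mask and assigns a point to one of four quadrants around it. Splitting into quadrants with a horizontal and a vertical line through the centroid is an assumption made here for simplicity, and the actual dividing lines of the patent may differ. The names `centroid` and `region_of` and the region numbering are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary person data: 1 = person pixel


def centroid(mask: Mask) -> Tuple[float, float]:
    """Return the (x, y) center of gravity of the person pixels."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        raise ValueError("person mask is empty")
    return sum_x / count, sum_y / count


def region_of(point: Tuple[float, float],
              center: Tuple[float, float]) -> int:
    """Return 0..3 for upper-left, upper-right, lower-left, lower-right.

    Image coordinates are assumed: x grows rightward, y grows downward.
    """
    x, y = point
    cx, cy = center
    right = 1 if x >= cx else 0
    lower = 2 if y >= cy else 0
    return lower + right
```

The region index returned for the arm tip position would serve as the region data that is combined with the palm shape ratio or the finger count when a control signal is looked up.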

8. The device control method based on image recognition according to any one of claims 1 to 7, further comprising a step of displaying the control signal in a visible form.

9. A program that executes the steps described in any one of claims 1 to 8.

10. A recording medium on which the program according to claim 9 is recorded.

11. A content creation method comprising:
a video selection step of selecting at least one video signal from among a plurality of video signals in accordance with a control signal; and
a step of obtaining the control signal by the device control method based on image recognition according to any one of claims 1 to 7.

12. The content creation method according to claim 11, further comprising a step of recording the video signal selected in the video selection step.

13. The content creation method according to claim 12, wherein the recording step also includes a step of recording audio.
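The video selection and recording steps of the content creation method can be pictured with the following sketch. It models the video selector as a lookup from control signal to a named source and the recorder as a log of which source was selected at which time. The classes `VideoSelector` and `Recorder`, the integer signal values, and the source names are hypothetical stand-ins for the hardware described in the patent.

```python
from typing import Dict, List, Tuple


class VideoSelector:
    """Switches between named video sources on receipt of a control signal."""

    def __init__(self, routing: Dict[int, str], initial: str) -> None:
        self.routing = routing    # control signal -> source name (assumed)
        self.selected = initial   # currently selected source

    def apply(self, control_signal: int) -> str:
        # An unknown signal leaves the current selection unchanged.
        self.selected = self.routing.get(control_signal, self.selected)
        return self.selected


class Recorder:
    """Keeps a log of (timestamp, source) pairs for the selected video."""

    def __init__(self) -> None:
        self.log: List[Tuple[float, str]] = []

    def record(self, timestamp: float, source: str) -> None:
        self.log.append((timestamp, source))
```

A short usage example: create `VideoSelector({1: "camera", 2: "sub_camera", 3: "screen"}, initial="camera")`, call `apply(2)` when a two-finger gesture is recognized, and pass the returned source name to `Recorder.record` together with the current time.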

14. A device control apparatus based on image recognition, comprising:
a memory storing a control signal table that lists palm shape ratios, each being the ratio between an instructor's palm shape width and fingertip width, and the control signal corresponding to each palm shape ratio; and
a control unit that
receives arm tip data of the instructor,
obtains a palm shape width and a fingertip width from the arm image data,
obtains a palm shape ratio, which is the ratio between the palm shape width and the fingertip width,
obtains a control signal from the palm shape ratio and the control signal table, and
outputs the control signal.

15. A device control apparatus based on image recognition, comprising:
a memory storing a finger count determination table that lists palm shape ratios, each being the ratio between an instructor's palm shape width and fingertip width, and the finger count corresponding to each palm shape ratio, and a control signal table that lists the control signal corresponding to each finger count; and
a control unit that
receives arm tip data of the part of the instructor from the arm onward,
obtains a palm shape width and a fingertip width from the arm tip data,
obtains a finger count from the finger count determination table and the palm shape ratio obtained from the palm shape width and the fingertip width,
obtains the corresponding control signal from the finger count and the control signal table, and
outputs the control signal.

16. A device control apparatus based on image recognition, comprising:
a memory storing a control signal table that lists palm shape ratios, each being the ratio between an instructor's palm shape width and fingertip width, and the control signal corresponding to each palm shape ratio, and background data consisting of an image of the instructor's background; and
a control unit that
receives image data including the upper body of the instructor,
extracts person data by subtracting the background data from the image data,
extracts arm tip data from the person data,
obtains a palm shape width and a fingertip width from the arm tip data,
obtains a palm shape ratio, which is the ratio between the palm shape width and the fingertip width,
obtains a control signal from the palm shape ratio and the control signal table, and
outputs the control signal.

17. The device control apparatus based on image recognition according to claim 16, wherein
the control signal table lists control signals corresponding to the palm shape ratio and to region data indicating the region of the person data from which the arm tip data was extracted, and
the control apparatus further
divides the person data into a plurality of regions,
detects region data indicating in which of the plurality of regions the arm tip data is located, and
obtains the control signal from the control signal table based on the palm shape ratio and the region data and outputs it.

18. A device control apparatus based on image recognition, comprising:
a memory storing a finger count determination table that lists palm shape ratios, each being the ratio between an instructor's palm shape width and fingertip width, and the finger count corresponding to each palm shape ratio, a control signal table that lists the control signal corresponding to each finger count, and background data consisting of an image of the instructor's background; and
a control unit that
receives image data including the upper body of the instructor,
extracts person data by subtracting the background data from the image data,
extracts arm tip data from the person data,
obtains a palm shape width and a fingertip width from the arm tip data,
obtains a finger count from the finger count determination table and the palm shape ratio obtained from the palm shape width and the fingertip width,
obtains the corresponding control signal from the finger count and the control signal table, and
outputs the control signal.

19. The device control apparatus based on image recognition according to claim 18, wherein
the control signal table lists control signals corresponding to the finger count and to region data indicating the region of the person data from which the arm tip data was extracted, and
the control apparatus further
divides the person data into a plurality of regions,
detects region data indicating in which of the plurality of regions the arm tip data is located, and
obtains the control signal from the control signal table based on the finger count and the region data and outputs it.

20. The device control apparatus based on image recognition according to any one of claims 14 to 19, further comprising a control signal indicator that displays the control signal in a visible form.

21. A content creation system comprising:
the device control apparatus based on image recognition according to any one of claims 13 to 18; and
a video selector that selects at least one video signal from among a plurality of video signals in accordance with a control signal from the device control apparatus based on image recognition.

22. The content creation system according to claim 21, further comprising a recorder that records the output of the video selector.

23. The content creation system according to claim 22, wherein the recorder also records audio.