JP2024057447A

JP2024057447A - Image processing device, control method for image processing device, program, and storage medium

Info

Publication number: JP2024057447A
Application number: JP2022164194A
Authority: JP
Inventors: 真宏会見; 孝明東海林; 良典林
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2022-10-12
Filing date: 2022-10-12
Publication date: 2024-04-24

Abstract

[Problem] To enable an optimal focus target to be set. [Solution] The method has a recognition step of recognizing a person from an image signal obtained by an image sensor; a detection step of detecting a first or second motion, or a first or second sound; a marker detection step of detecting a marker from the image signal obtained by the image sensor; and a selection step of selecting either the person recognized in the recognition step or the marker detected in the marker detection step as the target for focus adjustment. In the selection step, the recognized person is selected when the first motion or the first sound is detected, and the detected marker is selected when the second motion or the second sound is detected. [Selected Figure] FIG. 2

Description

The present invention relates to an image processing device, a control method for an image processing device, a program, and a storage medium.

In recent years, video distribution by individuals has become common on video distribution sites. In such distribution, the distributor often serves both as the performer who appears on screen to introduce a product or the like and as the camera operator who films it. In that situation it is difficult for the distributor to operate the camera while filming, and therefore difficult to focus on the subject the distributor intends. One conceivable approach is to place a marker near the intended subject in advance and, when filming, to adjust focus based on detection information for the marker in the image.

In Patent Document 1, the actual distance between a marker and the imaging unit is calculated by comparing the marker's actual shape and size with its shape and size in a real-time captured image, and the focus position for focusing the imaging unit on an object is determined from that distance.

Patent Document 1: WO2015/141185

However, in video distribution by individuals, for example in product introduction videos, the distributor often wants to focus not only on the marker position but also on a person such as the performer (distributor).

The conventional technology disclosed in the above patent document does not describe a priority order for the case where both a marker and a person appear in the video. Patent Document 1 does not consider situations in which both a marker and a person are present in the image. In videos in which an individual introduces products, it is often desirable to focus not only on the marker position but also on the performer.

Accordingly, an object of the present invention is to enable an optimal focus target to be set.

A technical feature of the present invention is that it comprises: a recognition step of detecting a person from an image signal obtained by an image sensor; a motion detection step of detecting first and second motions of the person recognized in the recognition step; a marker detection step of detecting a marker from the image signal obtained by the image sensor; and a selection step of selecting, as the target for focus adjustment, either the person recognized in the recognition step or the marker detected in the marker detection step, wherein, in the selection step, the person recognized in the recognition step is selected when the first motion is detected in the motion detection step, and the marker detected in the marker detection step is selected when the second motion is detected in the motion detection step.

Another technical feature is that it comprises: a recognition step of recognizing a person from an image signal obtained by an image sensor; a sound detection step of detecting first and second sounds; a marker detection step of detecting a marker from the image signal obtained by the image sensor; and a selection step of selecting, as the target for focus adjustment, either the person recognized in the recognition step or the marker detected in the marker detection step, wherein, in the selection step, the person recognized in the recognition step is selected when the first sound is detected in the sound detection step, and the marker detected in the marker detection step is selected when the second sound is detected in the sound detection step.

According to the present invention, an optimal focus target can be set.

FIG. 1 is a diagram showing the configuration of the imaging device.
FIG. 2 is a diagram showing the processing configuration of the imaging device in the first embodiment.
FIG. 3 is a flowchart showing the AF operation in the first embodiment.
A flowchart showing the AF operation in the second embodiment.
A flowchart showing the focus mode setting operation in the second embodiment.
A flowchart showing the focus mode processing in the second embodiment.
A flowchart showing the subject detection processing in the second embodiment.
A diagram showing the processing configuration of the imaging device in the third embodiment.
A flowchart showing the AF operation in the third embodiment.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

(First Embodiment)
● Configuration of the Imaging Device
FIG. 1 shows the configuration of an imaging device 1 as an example of an image processing device to which the present invention can be applied. The imaging device 1 is either a digital camera that has a lens unit 100 and a camera body 200, with the lens unit 100 detachable from the camera body 200, or a digital camera in which the lens unit 100 and the camera body 200 are integrated. The camera body 200 can be connected to a server device 300 on a network by wireless or wired communication. The server device 300 is a video distribution server having a video distribution function.

The lens unit 100 constitutes the imaging optical system of the imaging device 1. It includes an aperture 11, an image stabilization lens group 12, a focus/zoom lens group 13, and the like, and guides an optical image of a subject to the camera body 200.

The camera body 200 includes an image sensor 21 that photoelectrically converts the optical image formed by the lens unit 100 to generate an image signal, and a mechanical shutter 22 that adjusts the exposure time of the image sensor 21. Based on the setting values of a plurality of setting items, the camera body 200 controls the aperture 11 and the lens groups 12 and 13 of the lens unit 100, and also controls the drive timing of the image sensor 21 and the shutter speed of the mechanical shutter 22, so that images are captured with proper exposure.

The camera body 200 includes a rear display unit 23 that can display images captured by the image sensor 21, the various setting values used when shooting, and the like. The rear display unit 23 is a display device such as a liquid crystal panel or an organic EL panel, and is provided on the back of the camera body 200, on the side opposite the lens unit 100.

Note that the mechanical shutter 22 is unnecessary if the image sensor 21 has an electronic shutter function that can adjust the exposure time by controlling the signal accumulation time and the signal readout time. When both the mechanical shutter 22 and the electronic shutter function are provided and the exposure time is adjusted with the electronic shutter, the mechanical shutter 22 is kept fully open.

The camera body 200 includes an electric circuit 20. The electric circuit 20 includes an arithmetic processing circuit 20a, a memory circuit 20b, an image processing circuit 20c, an image compression circuit 20d, a drive control circuit 20g, and the like.

The arithmetic processing circuit 20a includes a processor, such as a CPU or MPU, that performs the various arithmetic operations needed to control the lens unit 100 and the camera body 200. The arithmetic processing circuit 20a controls each part of the lens unit 100 and the camera body 200 by executing programs stored in the storage unit 29. These programs include the program that performs the control processing of this embodiment.

The memory circuit 20b is used as a work memory into which programs read from the storage unit 29 are loaded, as a buffer memory that temporarily holds image data captured by the image sensor 21, and as an image display memory for the rear display unit 23.

The image processing circuit 20c converts the image signal generated by the image sensor 21 into digital data and performs various kinds of image processing. The image data output from the image processing circuit 20c is either output to the rear display unit 23, or compressed into a predetermined data format by the image compression circuit 20d and output to the storage unit 29 for recording.

The image compression circuit 20d compresses and encodes the image data output from the image processing circuit 20c into a predetermined data format to generate an image file.

Based on the results computed by the arithmetic processing circuit 20a, the drive control circuit 20g controls drive circuits, actuators, and the like (not shown) to control the aperture 11 and the lens groups 12 and 13 of the lens unit 100 and the mechanical shutter 22 of the camera body 200.

The camera body 200 includes an operation input unit 28, such as switches, buttons, and a touch panel, that accepts user operations. In this embodiment, the operation input unit 28 includes a shutter switch for instructing shooting preparation or the start of shooting. Pressing the shutter switch lightly to its first stage, a so-called "half press," starts operations such as autofocus, automatic exposure, and auto white balance processing. Pressing the shutter switch further from the half press to its second stage, a so-called "full press," activates the mechanical shutter 22 or the electronic shutter function of the image sensor 21, and starts the series of shooting operations from reading the signal out of the image sensor 21 to writing the image data to the storage unit 29. The operation input unit 28 may also include a switch that lets the user turn on or off the video distribution function of the server device 300, described later.

The camera body 200 includes a communication unit 25. The communication unit 25 includes an interface circuit for connecting the camera body 200 to external devices via a network such as the Internet. Through the communication unit 25, the camera body 200 can exchange data with external devices connected to a wired or wireless network. The camera body 200 can control the communication unit 25 to output image data processed by the image processing circuit 20c to the server device 300 on the network.

The camera body 200 includes an audio input unit 27. The audio input unit 27 includes a microphone or the like, converts input sound into an electrical signal, and outputs it to the electric circuit 20 as audio data. The audio data output to the electric circuit 20 is, for example, added to image data and output to the storage unit 29 for recording. In this embodiment, the audio input unit 27 receives the voice uttered by the user and outputs the audio data to the electric circuit 20. The audio input unit 27 may be built into the camera body 200 or connected to an external terminal (not shown).

The camera body 200 includes a storage unit 29 such as a memory card or a hard disk. The storage unit 29 stores the programs executed by the arithmetic processing circuit 20a. Image files compressed into a predetermined format by the image compression circuit 20d are recorded in the storage unit 29, and image files already recorded there can be read out. The storage unit 29 may be detachable from the camera body 200 or built into it.

Next, the configuration and functions of the video distribution server 300 of this embodiment are described.

The video distribution server 300 includes a control unit 30, a communication unit 31, and a streaming processing unit 32.

The control unit 30 includes a processor, such as a CPU or MPU, that performs the various arithmetic operations needed to control the video distribution server 300. The control unit 30 controls each part of the video distribution server 300 by executing predetermined programs. These programs include the program that performs the video distribution processing of this embodiment.

The communication unit 31 connects to the communication unit 25 of the camera body 200 via the network and can exchange data with the camera body 200 and external devices. The communication unit 31 outputs the image data sent from the communication unit 25 of the camera body 200 to the streaming processing unit 32.

The streaming processing unit 32 generates images for distribution from the image data sent from the communication unit 25 and passes them to the communication unit 31. The streaming-processed image data is sent via the communication unit 31 to viewer-side devices (not shown in FIG. 1).

● Processing Configuration of the Imaging Device in the First Embodiment
FIG. 2 conceptually shows an example of the processing configuration of the electric circuit 20 in the first embodiment.

As shown in FIG. 2, the electric circuit 20 has an image acquisition unit 101, a marker detection unit 102, a first distance detection unit 103, a person detection unit 104, a second distance detection unit 105, a priority determination unit 106, a parameter determination unit 107, and a focus setting unit 108.

Each of these processing units is realized, for example, by being stored as a program in the memory circuit 20b and executed by the arithmetic processing circuit 20a.

The image acquisition unit 101 sequentially acquires images from the image sensor 21 in real time.

The marker detection unit 102 detects markers in the real-time captured images acquired by the image acquisition unit 101. Here, a marker is, for example, a sticker on which a predetermined pattern, shape, and color are printed. The marker detection unit 102 holds the marker's shape, color, and other information in advance, and detects the marker in the captured image based on that information. A publicly known image recognition method, such as the technique disclosed in JP 2021-27544 A, is used for this detection. The marker in this embodiment may be of any shape and color that can be detected in the captured image; its specific form is not limited. If the captured image contains multiple markers, multiple markers may be detected.

The first distance detection unit 103 detects the distance between a marker and the image sensor 21 based on image information about the marker detected by the marker detection unit 102. The image information about the marker indicates, for example, its shape and size in the real-time image. By comparing the shape and size information held in advance for the marker with the detected image information, the first distance detection unit 103 can calculate the distance between the marker and the image sensor 21. The calculation method is not particularly limited as long as it uses at least one of the marker's shape and size information and the image information.
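As one illustration of such a size comparison, the distance can be estimated with a pinhole camera model. The following is a minimal sketch only: the pinhole model, the function name `marker_distance_mm`, and its parameter names are assumptions made for illustration and do not appear in the source, which leaves the calculation method open.

```python
def marker_distance_mm(real_width_mm: float,
                       image_width_px: float,
                       focal_length_px: float) -> float:
    """Estimate the marker-to-camera distance with a pinhole camera model.

    distance = focal_length * real_width / width_in_image
    (focal length expressed in pixels, an assumption of this sketch).
    """
    if image_width_px <= 0:
        raise ValueError("marker width in the image must be positive")
    return focal_length_px * real_width_mm / image_width_px


# A 50 mm wide marker that appears 100 px wide, seen through a lens whose
# focal length corresponds to 2000 px, is estimated to be 1000 mm away.
print(marker_distance_mm(50.0, 100.0, 2000.0))  # 1000.0
```

The smaller the marker appears in the image, the larger the estimated distance, which matches the comparison of stored and observed size described above.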

The person detection unit 104 recognizes a person in the real-time images acquired by the image acquisition unit 101. The person detection unit 104 holds shape information, color information, and the like for human faces and bodies in advance, and recognizes a person in the captured image based on this stored information. A publicly known image recognition method is used for this person detection.

The second distance detection unit 105 detects the distance between a person and the image sensor 21 based on real-world information and image information about the person recognized by the person detection unit 104. A publicly known technique, such as the one disclosed in JP 2007-329784 A, is used to detect this distance.

The priority determination unit 106 determines which of the person and the marker the imaging unit is to focus on. How the target is prioritized is described later.

The parameter determination unit 107 uses the calculated distance information (from the first distance detection unit 103 or the second distance detection unit 105) to determine the focus parameters for adjusting the focus state to the target. The focus parameters are the focus position, an aperture value for adjusting the depth of field, and the like. Depending on how the imaging unit adjusts its focus state, the focus position can be expressed, for example, as the distance between the lens and the image sensor or as a focal length. A publicly known method can be used by the parameter determination unit 107 to determine the parameters.
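As an illustration of how a focus position could follow from a subject distance, the sketch below solves the thin-lens equation 1/f = 1/s + 1/s' for the lens-to-sensor distance s'. The thin-lens model, the function name `lens_to_sensor_distance_mm`, and the use of the calculated distance as the lens-to-subject distance s are all assumptions for illustration; the source only states that a publicly known method may be used.

```python
def lens_to_sensor_distance_mm(focal_length_mm: float,
                               subject_distance_mm: float) -> float:
    """Solve the thin-lens equation 1/f = 1/s + 1/s' for s'.

    s  : lens-to-subject distance (assumed here to be the calculated distance)
    s' : lens-to-sensor distance at which the subject is in focus
    """
    if subject_distance_mm <= focal_length_mm:
        raise ValueError("subject must be farther away than the focal length")
    return (focal_length_mm * subject_distance_mm
            / (subject_distance_mm - focal_length_mm))


# With a 50 mm lens, a subject at twice the focal length is imaged
# at twice the focal length behind the lens.
print(lens_to_sensor_distance_mm(50.0, 100.0))  # 100.0
```

As the subject moves farther away, the result approaches the focal length itself, which is why distant subjects need only small focus adjustments.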

The focus setting unit 108 sets the focus parameters determined by the parameter determination unit 107 in the drive control circuit 20g.

The voice detection unit 109 detects a specific voice in the audio input through the audio input unit 27.

The motion detection unit 110 detects a specific gesture (motion) in the real-time captured images acquired by the image acquisition unit 101.

Note that the first distance detection unit 103 and the second distance detection unit 105 may instead obtain defocus information using imaging-plane phase-difference AF and detect the distance between the focus target and the image sensor 21 from that information. The focus parameters may also be set using the defocus information itself rather than a distance.

● AF Operation in the First Embodiment
The imaging control method in the first embodiment is described below with reference to FIG. 3.

As shown in FIG. 3, the imaging control method in the first embodiment includes a plurality of steps, from step S301 to step S315.

FIG. 3 is a flowchart showing an example of the operation of the electric circuit 20 in the first embodiment.

In the following description, each step is executed by the arithmetic processing circuit 20a, which is part of the electric circuit 20.

When the product introduction mode is started by an input operation on the operation input unit 28, first, in step S301, the image acquisition unit 101 of the electric circuit 20 sequentially acquires real-time images from the image sensor 21. Then, in step S302, the person detection unit 104 recognizes a person in the acquired real-time captured images.

Next, in step S303, the arithmetic processing circuit 20a determines whether the marker detection mode has been specified in advance by an input operation on the operation input unit 28. If it has been specified, the process proceeds to step S304; if not, the process proceeds to step S312.

In step S304, the marker detection unit 102 of the electric circuit 20 detects a marker in the real-time image acquired in step S301. Step S304 may be executed every time a real-time image is acquired, or at a predetermined interval.

In step S305, it is determined whether a marker has been detected in the real-time image. If a marker has been detected, the process proceeds to step S306; if not, the process proceeds to step S312.

In step S306, the arithmetic processing circuit 20a determines whether the marker detection mode is the marker priority mode, the voice mode, or the gesture mode. The marker detection mode is specified in advance through the operation input unit 28. In the marker priority mode, the process proceeds to step S311. In the voice mode, the process proceeds to step S307. In the gesture mode, the process proceeds to step S309.

In step S307, the priority determination unit 106 determines whether the voice detection unit 109 has detected a first voice in the audio input through the audio input unit 27. The first voice is, for example, a voice naming the product to be introduced, or the distributor's voice announcing the start of the product introduction. If the first voice is detected, the process proceeds to step S311; otherwise, it proceeds to step S308.

In step S308, the priority determination unit 106 determines whether the voice detection unit 109 has detected a second voice in the audio input through the audio input unit 27. The second voice is, for example, the distributor's voice announcing the end of the product introduction. If the second voice is detected, the process proceeds to step S312; otherwise, it proceeds to step S311.

In step S309, the priority determination unit 106 determines whether the motion detection unit 110 has detected a first gesture (the second motion) in the image. The first gesture is, for example, a pointing gesture or a gesture of holding a hand out in front, made at the start of the product introduction. If the first gesture is detected, the process proceeds to step S311; otherwise, it proceeds to step S310.

In step S310, the priority determination unit 106 determines whether the motion detection unit 110 has detected a second gesture (the first motion) in the image. The second gesture is, for example, a gesture of ceasing to point or of drawing the hand back, made at the end of the product introduction. If the second gesture is detected, the process proceeds to step S312; otherwise, it proceeds to step S311.
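The gesture-mode branch (steps S309 and S310) can be sketched as a small decision function. This is an illustrative reading of the flow, assuming the two detection results are already available as booleans; the names `Target` and `select_target_in_gesture_mode` are hypothetical and not taken from the source.

```python
from enum import Enum


class Target(Enum):
    MARKER = "marker"  # the flow continues at step S311
    PERSON = "person"  # the flow continues at step S312


def select_target_in_gesture_mode(first_gesture: bool,
                                  second_gesture: bool) -> Target:
    """Choose the focus target from the two gesture detection results."""
    if first_gesture:        # S309: e.g. pointing at the product
        return Target.MARKER
    if second_gesture:       # S310: e.g. drawing the hand back
        return Target.PERSON
    return Target.MARKER     # S310, no second gesture: go to S311


print(select_target_in_gesture_mode(False, True))  # Target.PERSON
```

Returning `Target.PERSON` only means that the flow continues at step S312, where it is still checked whether a person was actually recognized.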

In step S311, the first distance detection unit 103 calculates the distance between the marker and the image sensor 21 based on the image information of the marker detected in step S305. If there are multiple markers, the priority determination unit 106 decides which marker's distance from the image sensor 21 is to be calculated according to the content of the gesture detected in step S309. For example, a marker near the location pointed to in the gesture is given priority over the other markers.
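Choosing the marker nearest to the pointed-at location might look like the following sketch. It assumes that marker centers and the pointed-at location are available as (x, y) pixel coordinates and uses plain Euclidean distance; both the representation and the function name `pick_marker_near_point` are assumptions for illustration, since the source does not specify how "near" is measured.

```python
import math


def pick_marker_near_point(marker_centers, pointed_xy):
    """Return the marker center closest to the pointed-at location.

    marker_centers: list of (x, y) pixel coordinates of detected markers
    pointed_xy:     (x, y) pixel coordinate the gesture points to
    Returns None when no marker was detected.
    """
    if not marker_centers:
        return None
    px, py = pointed_xy
    return min(marker_centers,
               key=lambda c: math.hypot(c[0] - px, c[1] - py))


# Two markers; the finger points close to the second one.
print(pick_marker_near_point([(100, 200), (400, 220)], (380, 240)))  # (400, 220)
```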

In step S312, it is determined whether the person detection unit 104 has recognized a person in the image. If a person has been detected, the process proceeds to step S313; if not, the process proceeds to step S314.

In step S313, the second distance detection unit 105 calculates the distance between the person and the image sensor 21 based on image information about the person detected in step S312.

In step S314, the arithmetic processing circuit 20a detects a subject other than the marker and the person and treats it as the main subject. It then calculates the distance between the main subject and the image sensor 21 based on image information about the main subject. The main subject is a subject other than the marker and the face, for example a subject located near the center of the image.
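The fallback from a person to a main subject (steps S312 to S314) can be sketched as follows. The data representation, the function name `choose_fallback_target`, and the behavior when nothing at all is detected are assumptions for illustration; "closest to the image center" is used because the source gives it as one example of a main subject.

```python
import math


def choose_fallback_target(person_box, other_subject_centers, image_size):
    """Prefer a recognized person; otherwise pick the subject nearest the center.

    person_box:            any object describing the person, or None
    other_subject_centers: list of (x, y) centers of non-marker, non-person subjects
    image_size:            (width, height) of the image in pixels
    """
    if person_box is not None:                 # S312: a person was recognized
        return ("person", person_box)          # S313
    if not other_subject_centers:              # assumption: nothing to focus on
        return (None, None)
    cx, cy = image_size[0] / 2, image_size[1] / 2
    main = min(other_subject_centers,
               key=lambda c: math.hypot(c[0] - cx, c[1] - cy))
    return ("main_subject", main)              # S314


print(choose_fallback_target(None, [(100, 100), (900, 500)], (1920, 1080)))
# ('main_subject', (900, 500))
```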

ステップＳ３１５において、パラメータ決定部１０７は、一連の処理のいずれかで算出された距離を用いて、焦点状態を調整するためのフォーカスパラメータを決定する。そして、電気回路２０の演算処理回路２０ａは、決定されたフォーカスパラメータを駆動制御回路２０ｇに設定する。 In step S315, the parameter determination unit 107 determines a focus parameter for adjusting the focus state using the distance calculated in one of the series of processes. Then, the arithmetic processing circuit 20a of the electric circuit 20 sets the determined focus parameter in the drive control circuit 20g.
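The branch structure of steps S310 to S315 can be summarized in a short Python sketch. This is only an illustration under stated assumptions, not the patented implementation: the names `Detections` and `select_focus_distance` are hypothetical, the distances (in meters) stand in for the outputs of the distance detection units, and the flow is assumed to reach S310 with a marker already detected in S305.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Detections:
    """Hypothetical per-frame detection results (distances in meters)."""
    second_gesture: bool              # S310: gesture that ends the product introduction
    marker_distance: float            # distance to the marker detected in S305
    person_distance: Optional[float]  # None when no person is recognized (S312)
    main_subject_distance: float      # fallback subject, e.g. near the image center


def select_focus_distance(d: Detections) -> Tuple[str, float]:
    """Return (target, distance) following the S310-S314 branches.

    The returned distance is what S315 would turn into a focus parameter.
    """
    if not d.second_gesture:
        return "marker", d.marker_distance          # S310 -> S311
    if d.person_distance is not None:
        return "person", d.person_distance          # S312 -> S313
    return "main_subject", d.main_subject_distance  # S312 -> S314
```

For example, `select_focus_distance(Detections(True, 1.2, None, 3.0))` falls through to the main subject because the second gesture was detected but no person was recognized.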

このように処理することで、第１の音声が検出された場合に、マーカにフォーカスを合わせ、第２の音声が検出された場合は人物にフォーカスを合わせることができ、配信者の意図によりフォーカスを合わせる被写体を自由に切り替えることができる。 By processing in this way, if a first sound is detected, the focus is on the marker, and if a second sound is detected, the focus is on the person, allowing the broadcaster to freely switch the subject on which they want to focus.

マーカにフォーカスを合わせた後、周囲の画像データを用いてマーカの存在する領域を補間し、マーカが存在しないように見える画像にレタッチ処理する。レタッチ処理後の画像は、通信部を介して映像配信サーバに送信する。一方で表示部２３にはレタッチ処理前の画像を表示するか、レタッチ処理後の画像で、レタッチ処理した箇所に焦点状態を調整する対象を示す指標としてのフォーカスの枠を挿入した画像を表示する。このようにすることで、映像配信先の映像視聴者側ではステッカーのない違和感のない画像を見ることができ、配信者側はフォーカスを合わせているポイント、つまりマーカのある場所をわかりやすく見ることができる。また、切り替えたタイミングの情報を記録部に記録しておき、後で動画編集の際にどのタイミングで商品を紹介しているのかの判定に用いることもできる。 After focusing on the marker, the area where the marker is present is interpolated using surrounding image data, and the image is retouched to make it look like the marker is not present. The retouched image is sent to the video distribution server via the communication unit. Meanwhile, the display unit 23 displays either the image before retouching, or the image after retouching with a focus frame inserted in the retouched area as an indicator of the target for adjusting the focus state. In this way, the video viewer at the destination of the video distribution can see a natural image without a sticker, and the distributor can easily see the point where the focus is set, that is, where the marker is. Information on the timing of the switch can also be recorded in the recording unit and used later to determine the timing of product introduction when editing the video.
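The description does not specify how the marker region is interpolated from the surrounding image data. As a deliberately crude stand-in, the following Python sketch fills a rectangular marker region of a grayscale image with the mean of the pixels that border it. The function name `fill_marker_region`, the list-of-rows image representation, and the rectangular region are all assumptions made for illustration; a real device would likely use a more elaborate inpainting method.

```python
def fill_marker_region(image, top, left, height, width):
    """Return a copy of a grayscale image (list of rows) in which the marker
    rectangle is replaced by the mean of the pixels bordering the rectangle."""
    rows, cols = len(image), len(image[0])
    border = []
    # Collect the one-pixel ring around the rectangle, clipped to the image.
    for r in range(top - 1, top + height + 1):
        for c in range(left - 1, left + width + 1):
            inside = top <= r < top + height and left <= c < left + width
            if not inside and 0 <= r < rows and 0 <= c < cols:
                border.append(image[r][c])
    fill = sum(border) / len(border) if border else 0
    out = [row[:] for row in image]  # the input frame is left untouched
    for r in range(top, top + height):
        for c in range(left, left + width):
            out[r][c] = fill
    return out
```

Because the input frame is not modified, the unretouched frame remains available for the display unit 23 while the returned copy can be sent for distribution.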

（第２の実施形態） Second Embodiment
以下、本発明の第２の実施形態について詳述する。第１の実施形態と同様の箇所については同じ記号で示し、それらの説明は省略する。 The second embodiment of the present invention will be described in detail below. The same parts as those in the first embodiment are indicated by the same reference numerals, and the description thereof will be omitted.

●第２の実施形態におけるＡＦ動作 AF Operation in the Second Embodiment
図４は、第２の実施形態における電気回路２０の動作例を示すフローチャートである。 FIG. 4 is a flowchart showing an example of the operation of the electric circuit 20 in the second embodiment.

ステップＳ２１０１において、演算処理回路２０ａはフォーカスモード設定処理を行う。フォーカスモード設定処理の詳細については、図５にて説明する。 In step S2101, the arithmetic processing circuit 20a performs focus mode setting processing. Details of the focus mode setting processing are described with reference to FIG. 5.

ステップＳ２１０２において、演算処理回路２０ａはフォーカスモード処理を行う。フォーカスモード処理の詳細については、図６にて説明する。 In step S2102, the arithmetic processing circuit 20a performs focus mode processing. Details of the focus mode processing are described with reference to FIG. 6.

ステップＳ２１０３において、演算処理回路２０ａは被写体検出処理を行う。被写体検出処理の詳細については、図７にて説明する。 In step S2103, the arithmetic processing circuit 20a performs subject detection processing. Details of the subject detection processing are described with reference to FIG. 7.

●第２の実施形態におけるフォーカスモード設定動作 Focus Mode Setting Operation in the Second Embodiment
図５は、第２の実施形態におけるフォーカスモード設定処理の動作例を示すフローチャートである。 FIG. 5 is a flowchart showing an example of the operation of focus mode setting processing in the second embodiment.

ステップＳ２２０１において、演算処理回路２０ａはユーザからフォーカスモードを指示されているかを判定する。指示されていた場合はステップＳ２２０２に進み、指示されていなかった場合はステップＳ２２０７に進む。 In step S2201, the arithmetic processing circuit 20a determines whether the user has instructed focus mode. If so, the process proceeds to step S2202; if not, the process proceeds to step S2207.

ステップＳ２２０２において、演算処理回路２０ａはユーザからマーカ形状識別処理を指示されているかを判定する。指示されていた場合はステップＳ２２０３に進み、指示されていなかった場合はステップＳ２２０４に進む。 In step S2202, the arithmetic processing circuit 20a determines whether the user has instructed marker shape identification processing. If so, the process proceeds to step S2203; if not, the process proceeds to step S2204.

ステップＳ２２０３において、演算処理回路２０ａはマーカ検出部１０２が検出したマーカの形状や色を判定する。マーカの形状が前位置モードを示す形状であった場合は、ステップＳ２２０５に進む。マーカの形状が後位置モードを示す形状であった場合は、ステップＳ２２０６に進む。マーカの形状が中間位置モードを示す形状であった場合は、ステップＳ２２０７に進む。前位置モードとは、検出したマーカより撮像装置１に近い距離にフォーカスを合わせる（前ピンとなる）第１のモードである。後位置モードは、検出したマーカより撮像装置１から遠い距離にフォーカスを合わせる（後ピンとなる）第２のモードである。中間位置モードは、検出したマーカの距離にフォーカスを合わせる（合焦となる）第３のモードである。 In step S2203, the arithmetic processing circuit 20a determines the shape and color of the marker detected by the marker detection unit 102. If the shape of the marker indicates the front position mode, proceed to step S2205. If the shape of the marker indicates the rear position mode, proceed to step S2206. If the shape of the marker indicates the intermediate position mode, proceed to step S2207. The front position mode is a first mode in which the focus is adjusted to a distance closer to the imaging device 1 than the detected marker (front focus). The rear position mode is a second mode in which the focus is adjusted to a distance farther from the imaging device 1 than the detected marker (rear focus). The intermediate position mode is a third mode in which the focus is adjusted to the distance of the detected marker (in-focus).

ステップＳ２２０４において、第１のモード、第２のモード、第３のモードはユーザからの指示により設定可能であり、演算処理回路２０ａはユーザから指示されたフォーカスモード設定を判別する。ユーザからの指示が前位置モードであった場合は、ステップＳ２２０５に進む。ユーザからの指示が後位置モードであった場合は、ステップＳ２２０６に進む。ユーザからの指示が中間位置モードであった場合は、ステップＳ２２０７に進む。 In step S2204, the first mode, second mode, and third mode can be set by user instruction, and the arithmetic processing circuit 20a determines the focus mode setting instructed by the user. If the user instruction is the front position mode, the process proceeds to step S2205. If the user instruction is the rear position mode, the process proceeds to step S2206. If the user instruction is the intermediate position mode, the process proceeds to step S2207.

ステップＳ２２０５において、演算処理回路２０ａはフォーカスモードを前位置モードに設定する。 In step S2205, the arithmetic processing circuit 20a sets the focus mode to the front position mode.

ステップＳ２２０６において、演算処理回路２０ａはフォーカスモードを後位置モードに設定する。 In step S2206, the arithmetic processing circuit 20a sets the focus mode to the rear position mode.

ステップＳ２２０７において、演算処理回路２０ａはフォーカスモードを中間位置モードに設定する。 In step S2207, the arithmetic processing circuit 20a sets the focus mode to the intermediate position mode.
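The decision flow of steps S2201 to S2207 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `FocusMode`, `SHAPE_TO_MODE`, and `set_focus_mode` are hypothetical names, the concrete marker shapes are invented for the example (the description does not name any), and the fallback to the intermediate position mode for an unknown shape is an assumption.

```python
from enum import Enum
from typing import Optional


class FocusMode(Enum):
    FRONT = 1         # first mode: focus nearer than the marker (front focus)
    REAR = 2          # second mode: focus farther than the marker (rear focus)
    INTERMEDIATE = 3  # third mode: focus at the marker distance


# Hypothetical shape table; the description does not name concrete shapes.
SHAPE_TO_MODE = {
    "triangle": FocusMode.FRONT,
    "square": FocusMode.REAR,
    "circle": FocusMode.INTERMEDIATE,
}


def set_focus_mode(mode_instructed: bool,
                   shape_identification: bool,
                   user_mode: Optional[FocusMode] = None,
                   marker_shape: Optional[str] = None) -> FocusMode:
    """Return the focus mode chosen by the S2201-S2207 flow."""
    if not mode_instructed:                 # S2201 -> S2207
        return FocusMode.INTERMEDIATE
    if shape_identification:                # S2202 -> S2203
        # Assumption: an unknown shape falls back to the intermediate mode.
        return SHAPE_TO_MODE.get(marker_shape, FocusMode.INTERMEDIATE)
    # S2204: use the mode the user instructed (assumed default: intermediate).
    return user_mode if user_mode is not None else FocusMode.INTERMEDIATE
```

Keeping the shape lookup in a table makes it easy to extend the sketch to marker color, which the description also mentions as a possible cue.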

●第２の実施形態におけるフォーカスモード処理 Focus Mode Processing in the Second Embodiment
図６は、第２の実施形態におけるフォーカスモード処理の動作例を示すフローチャートである。 FIG. 6 is a flowchart showing an example of the operation of focus mode processing in the second embodiment.

ステップＳ２３０１において、演算処理回路２０ａはＳ２１０１で設定されたフォーカスモードを判別する。フォーカスモードが前位置モードであった場合は、ステップＳ２３０２に進む。フォーカスモードが後位置モードであった場合は、ステップＳ２３０３に進む。フォーカスモードが中間位置モードであった場合は、フォーカスモード処理を終了する。 In step S2301, the arithmetic processing circuit 20a determines the focus mode set in S2101. If the focus mode is the front position mode, the process proceeds to step S2302. If the focus mode is the rear position mode, the process proceeds to step S2303. If the focus mode is the intermediate position mode, the focus mode process ends.

ステップＳ２３０２において、演算処理回路２０ａはステップＳ３１１で算出された距離を変更前距離としてメモリ回路２０ｂに記憶した後、撮像装置１に近い方向に変更する。変更する量はユーザが指定した数値に基づき算出してもよい。 In step S2302, the arithmetic processing circuit 20a stores the distance calculated in step S311 in the memory circuit 20b as the pre-change distance, and then shifts the distance toward the imaging device 1. The amount of the shift may be calculated based on a value specified by the user.

ステップＳ２３０３において、演算処理回路２０ａはステップＳ３１１で算出された距離を変更前距離としてメモリ回路２０ｂに記憶した後、撮像装置１から遠い方向に変更する。変更する量はユーザが指定した数値に基づき算出してもよい。 In step S2303, the arithmetic processing circuit 20a stores the distance calculated in step S311 in the memory circuit 20b as the pre-change distance, and then shifts the distance away from the imaging device 1. The amount of the shift may be calculated based on a value specified by the user.
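Steps S2301 to S2303 amount to shifting the marker distance by an offset while remembering the original value. The Python sketch below illustrates this under stated assumptions: the function name `apply_focus_mode`, the string mode labels, the dictionary standing in for the memory circuit 20b, and the clamp at zero are all hypothetical choices made for the example.

```python
def apply_focus_mode(marker_distance, mode, offset, memory):
    """Return the focus distance after the S2301-S2303 adjustment.

    mode is "front", "rear", or "intermediate"; distances share one unit.
    memory is a dict standing in for the memory circuit 20b.
    """
    if mode == "intermediate":
        return marker_distance                       # S2301: nothing to change
    memory["pre_change_distance"] = marker_distance  # remember the original
    if mode == "front":
        # S2302: shift toward the camera; the clamp at 0 is an assumption.
        return max(marker_distance - offset, 0.0)
    if mode == "rear":
        return marker_distance + offset              # S2303: shift away
    raise ValueError(f"unknown focus mode: {mode}")
```

For instance, with a marker at 2.0 and an offset of 0.5, the front position mode yields 1.5 and the rear position mode yields 2.5, and in both cases 2.0 is kept as the pre-change distance.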

●第２の実施形態における被写体検出処理 Subject Detection Processing in the Second Embodiment
図７は、第２の実施形態における被写体検出処理の動作例を示すフローチャートである。 FIG. 7 is a flowchart showing an example of the operation of subject detection processing in the second embodiment.

ステップＳ２４０１において、演算処理回路２０ａは画像取得部１０１から取得した画像から主被写体を検索する。 In step S2401, the arithmetic processing circuit 20a searches for the main subject in the image acquired from the image acquisition unit 101.

ステップＳ２４０２において、演算処理回路２０ａは主被写体が存在したかどうかを判定する。存在した場合は、被写体検出処理を終了する。存在しなかった場合は、ステップＳ２４０３に進む。 In step S2402, the arithmetic processing circuit 20a determines whether a main subject is present. If a main subject is present, the subject detection process ends. If a main subject is not present, the process proceeds to step S2403.

ステップＳ２４０３において、パラメータ決定部１０７はステップＳ２３０２ないしステップＳ２３０３でメモリ回路２０ｂに記憶した変更前距離を基にフォーカスパラメータを決定する。その後、演算処理回路２０ａは、決定されたフォーカスパラメータを駆動制御回路２０ｇに設定する。 In step S2403, the parameter determination unit 107 determines the focus parameter based on the pre-change distance stored in the memory circuit 20b in step S2302 or step S2303. The arithmetic processing circuit 20a then sets the determined focus parameter in the drive control circuit 20g.
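The fallback in steps S2401 to S2403 can be illustrated with a few lines of Python. This is a sketch under assumptions, not the device's actual logic: `resolve_focus_distance` is a hypothetical name, `subject_found` stands in for the result of the main subject search, and the dictionary stands in for the memory circuit 20b.

```python
def resolve_focus_distance(shifted_distance, subject_found, memory):
    """Return the distance to focus at after the S2401-S2403 check.

    shifted_distance: distance after the front/rear shift
    subject_found:    True if a main subject was found (S2402)
    memory:           dict standing in for the memory circuit 20b
    """
    if subject_found:
        return shifted_distance  # keep the shifted focus position
    # S2403: no subject found, so fall back to the stored marker distance.
    # Assumption: if nothing was stored, the shifted value is kept.
    return memory.get("pre_change_distance", shifted_distance)
```

With `{"pre_change_distance": 2.0}` stored and no subject found, a shifted distance of 1.5 is replaced by 2.0, so the focus returns to the marker itself.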

以上述べたように、本実施形態の撮像装置１は、マーカが商品に直接貼れない場合でもマーカの前後の所定の距離にフォーカスを合わせることができ、確実に商品にフォーカスを合わせることが可能になる。 As described above, the imaging device 1 of this embodiment can focus at a specified distance in front of or behind the marker even if the marker cannot be directly attached to the product, making it possible to reliably focus on the product.

（第３の実施形態） Third Embodiment
以下、本発明の第３の実施形態について詳述する。第１の実施形態と同様の箇所については同じ記号で示し、それらの説明は省略する。 The third embodiment of the present invention will be described in detail below. The same parts as those in the first embodiment are indicated by the same reference numerals, and the description thereof will be omitted.

●第３の実施形態における撮像装置の処理構成 Processing Configuration of the Imaging Apparatus in the Third Embodiment
図８は、第３の実施形態における電気回路３１０１の処理構成例を概念的に示す図である。 FIG. 8 is a diagram conceptually showing an example of the processing configuration of the electric circuit 3101 in the third embodiment.

電気回路３１０１は、第１の実施形態における電気回路２０に対して、マーカ検出部３１０２と優先順位決定部３１０３の構成が異なる。マーカ検出部３１０２は、画像取得部１０１により取得されたリアルタイム撮影画像からマーカを検出する。本実施形態では、さらにマーカの検出個数を算出する。優先順位決定部３１０３は、前記人物またはマーカの内、前記撮像部がフォーカスを合わせる対象を決定する。どの対象を優先するかについては後述する。 The electric circuit 3101 differs from the electric circuit 20 in the first embodiment in the configuration of the marker detection unit 3102 and the priority determination unit 3103. The marker detection unit 3102 detects markers from real-time captured images acquired by the image acquisition unit 101. In this embodiment, it also calculates the number of detected markers. The priority determination unit 3103 determines which of the people or markers the imaging unit will focus on. Which object is given priority will be described later.

●第３の実施形態におけるＡＦ動作 AF Operation in the Third Embodiment
以下、第３の実施形態における撮像制御方法について図９を用いて説明する。 An image capture control method in the third embodiment will now be described with reference to FIG. 9.

なお、第１の実施形態と同一の構成、動作及び処理については図中に同一の符号を付し、その説明は省略する。 The same configurations, operations, and processes as those in the first embodiment are denoted by the same reference numerals in the figures, and their explanations are omitted.

ステップＳ３２０１において、マーカの検出個数と比較する所定の個数を設定する。ユーザが操作入力部２８を操作して所定の個数を設定する。ステップＳ３２０２において、マーカ検出部３１０２がリアルタイム画像内で検出されたマーカの個数を算出する。検出したマーカの個数が所定の個数以上の場合は、ステップＳ３０６に移行し、所定の個数よりも小さい場合は、ステップＳ３１２に移行する。 In step S3201, a predetermined number to be compared with the number of detected markers is set. The user operates the operation input unit 28 to set the predetermined number. In step S3202, the marker detection unit 3102 calculates the number of markers detected in the real-time image. If the number of detected markers is equal to or greater than the predetermined number, the process proceeds to step S306, and if it is less than the predetermined number, the process proceeds to step S312.
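The comparison in step S3202 reduces to a single threshold test. The Python sketch below shows it with a hypothetical function name, `next_step_after_marker_count`, and returns the step labels used in the description.

```python
def next_step_after_marker_count(detected_markers: int,
                                 predetermined_number: int) -> str:
    """S3202: continue with the marker-based flow (S306) only when enough
    markers are visible; otherwise go to person detection (S312)."""
    if detected_markers >= predetermined_number:
        return "S306"
    return "S312"
```

With a predetermined number of 3, detecting 3 or more markers leads to S306, while detecting 2 or fewer leads to S312.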

本実施形態では、ステップＳ３２０１においてユーザの指示に基づいて所定の個数を設定する例を示した。しかしながら、他にマーカの大きさや個数、フォーカスを合わせる被写体の大きさや撮影シーンに基づいて不図示のマーカ所定個数設定部が自動で所定の個数を設定してもよい。 In this embodiment, an example has been shown in which the predetermined number is set based on a user instruction in step S3201. Alternatively, a marker predetermined-number setting unit (not shown) may set the predetermined number automatically based on the size and number of markers, the size of the subject to be focused on, and the shooting scene.

所定の個数は配置されたマーカの個数よりも小さい値を設定する。マーカの個数に合わせて所定の個数を設定する場合は、例えばマーカの個数が多い場合は所定の個数を大きくする。マーカの個数が少ない場合は所定の個数を小さくする。またマーカの検出しやすさに応じて所定の個数を設定する。例えばマーカやフォーカスを合わせる被写体が大きくマーカを多く貼れる場合は所定の個数を大きくする。マーカやフォーカスを合わせる被写体が小さい場合は所定の個数を小さくする。また人物が複数いるような撮影シーンにおいてはマーカが隠れる可能性があるので所定の個数を小さくする。 The predetermined number is set to a value smaller than the number of markers placed. When setting the predetermined number according to the number of markers, for example, if there are a large number of markers, the predetermined number is made larger. If there are a small number of markers, the predetermined number is made smaller. The predetermined number is also set according to how easy it is to detect the markers. For example, if the markers or the subject to be focused on are large and many markers can be placed, the predetermined number is made larger. If the markers or the subject to be focused on are small, the predetermined number is made smaller. Also, in scenes with multiple people, the markers may be hidden, so the predetermined number is made smaller.

本実施形態では、所定の個数を１種類設定する例を示したが、他に所定の個数を２種類以上設定して優先順位決定部３１０３はマーカの検出個数に応じて判定してもよい。マーカの検出個数に応じて、動作検出部１１０の検出結果に基づいた優先順位の決定をするかを判定する。もしくは、音声検出部１０９の検出結果に基づいた優先順位の決定をするかを判定する。 In this embodiment, an example is shown in which one type of predetermined number is set, but two or more types of predetermined numbers may be set and the priority determination unit 3103 may make a decision depending on the number of detected markers. Depending on the number of detected markers, it is determined whether to determine the priority based on the detection results of the motion detection unit 110. Alternatively, it is determined whether to determine the priority based on the detection results of the voice detection unit 109.

例えば、第１の所定の個数と第２の所定の個数と第３の所定の個数を設定する。マーカの検出個数が第１の所定の個数と第２の所定の個数と第３の所定の個数よりも多い場合はマーカ検出部３１０２の検出結果に基づいた優先順位の決定を行う。マーカの検出個数が第１の所定の個数と第２の所定の個数よりも多く第３所定の個数よりも少ない場合は動作検出部１１０の検出結果に基づいた優先順位の決定を行う。マーカの検出個数が第１の所定の個数よりも多く第２の所定の個数と第３所定の個数よりも少ない場合は音声検出部１０９の検出結果に基づいた優先順位の決定を行う。またマーカの検出個数が第１の所定の個数と第２の所定の個数と第３所定の個数よりも少ない場合は人物検出部１０４の検出結果に基づいた優先順位の決定を行う。上記のように複数の所定の個数を設定して検出したマーカの検出個数に応じて優先順位の決定を行うことで、マーカをユーザの手で隠したり出したりして検出モードの選択ができる。 For example, a first predetermined number, a second predetermined number, and a third predetermined number are set. If the number of detected markers is greater than the first predetermined number, the second predetermined number, and the third predetermined number, the priority is determined based on the detection result of the marker detection unit 3102. If the number of detected markers is greater than the first predetermined number and the second predetermined number and less than the third predetermined number, the priority is determined based on the detection result of the motion detection unit 110. If the number of detected markers is greater than the first predetermined number and less than the second predetermined number and the third predetermined number, the priority is determined based on the detection result of the voice detection unit 109. If the number of detected markers is less than the first predetermined number, the second predetermined number, and the third predetermined number, the priority is determined based on the detection result of the person detection unit 104. By setting multiple predetermined numbers as described above and determining the priority according to the number of detected markers, the detection mode can be selected by hiding or revealing the markers with the user's hand.
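The three-threshold example above can be sketched in Python as a tiered lookup. This is an illustration under stated assumptions: the function name `choose_priority_source` and the returned labels are hypothetical, the thresholds are assumed to satisfy first < second < third (which the example implies), and a count exactly equal to a threshold is assigned to the lower tier because the description leaves that case open.

```python
def choose_priority_source(count: int, first: int, second: int, third: int) -> str:
    """Pick which detector drives the priority decision from the marker count."""
    if not first < second < third:
        raise ValueError("thresholds must satisfy first < second < third")
    if count > third:
        return "marker"   # marker detection unit 3102
    if count > second:
        return "motion"   # motion detection unit 110
    if count > first:
        return "voice"    # voice detection unit 109
    return "person"       # person detection unit 104
```

With thresholds 1, 3, and 5, covering markers by hand so that the visible count drops from 6 to 4 to 2 to 0 would step through the marker, motion, voice, and person tiers in turn.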

また、本発明をその好適な実施形態に基づいて詳述してきたが、本発明はこれら特定の実施形態に限られるものではなく、この発明の要旨を逸脱しない範囲の様々な形態も本発明に含まれる。さらに、上述した各実施形態は本発明の一実施形態を示すものにすぎず、各実施形態を適宜組み合わせることも可能である。 Although the present invention has been described in detail based on preferred embodiments thereof, the present invention is not limited to these specific embodiments, and various forms within the scope of the gist of the present invention are also included in the present invention. Furthermore, each of the above-described embodiments merely represents one embodiment of the present invention, and each embodiment can be combined as appropriate.

また、上述した実施形態においては、本発明をデジタルカメラ１００に適用した場合を例にして説明したが、これはこの例に限定されず、画像処理に関する制御を行うことができるような表示制御装置であれば適用可能である。すなわち、本発明は携帯電話端末や携帯型の画像ビューワ、ＰＣ、ファインダーを備えるプリンタ装置、表示部を有する家電、デジタルフォトフレーム、プロジェクター、タブレットＰＣ、音楽プレーヤー、ゲーム機、電子ブックリーダーなどに適用可能である。 In the above-mentioned embodiment, the present invention has been described as being applied to a digital camera 100, but the present invention is not limited to this example and can be applied to any display control device that can control image processing. In other words, the present invention can be applied to mobile phone terminals, portable image viewers, PCs, printer devices with viewfinders, home appliances with displays, digital photo frames, projectors, tablet PCs, music players, game consoles, e-book readers, and the like.

（その他の実施形態） Other Embodiments
本発明は、以下の処理を実行することによっても実現される。即ち、上述した実施形態の機能を実現するソフトウェア（プログラム）をネットワーク又は各種記憶媒体を介してシステム或いは装置に供給し、そのシステム或いは装置のコンピュータ（又はＣＰＵやＭＰＵ等）がプログラムコードを読み取り実行する処理である。この場合、そのプログラム、及び該プログラムを記憶した記録媒体は本発明を構成することになる。 The present invention can also be realized by executing the following process. That is, software (programs) that realize the functions of the above-described embodiments are supplied to a system or device via a network or various storage media, and the computer (or CPU, MPU, etc.) of the system or device reads and executes the program codes. In this case, the program and the storage medium on which the program is stored constitute the present invention.

REFERENCE SIGNS LIST
１００レンズ部 100 Lens section
１１絞り 11 Aperture
１２手振れ補正レンズ群 12 Image stabilization lens group
１３フォーカス・ズームレンズ群 13 Focus/zoom lens group
２００カメラ本体 200 Camera body
２０電気回路 20 Electric circuit
２１撮像素子 21 Image sensor
２２メカニカルシャッター 22 Mechanical shutter
２３背面表示部 23 Rear display section
２５カメラ本体側通信部 25 Camera body side communication section
２７音声入力部 27 Audio input section
２８操作入力部 28 Operation input section
２９記憶部 29 Storage section
３００サーバ装置 300 Server device
３０制御部 30 Control section
３１サーバ装置側通信部 31 Server device side communication section
３２ストリーミング処理部 32 Streaming processing section

Claims

撮像素子により得られた画像信号から人物を認識する認識手段と、
前記認識手段により認識された人物の第１、第２の動作を検出する動作検出手段と、
前記撮像素子により得られた画像信号からマーカを検出するマーカ検出手段と、
焦点状態を調整する対象として、前記認識手段により認識された人物と前記マーカ検出手段により検出されたマーカのいずれか一方を選択する選択手段と、を備え、
前記選択手段は、前記動作検出手段により第１の動作が検出された場合に、前記認識手段により認識された人物を選択し、前記動作検出手段により第２の動作が検出された場合に前記マーカ検出手段により検出されたマーカを選択することを特徴とする画像処理装置。
An image processing device comprising:
a recognition means for recognizing a person from an image signal obtained by an imaging element;
a motion detection means for detecting first and second motions of the person recognized by the recognition means;
a marker detection means for detecting a marker from the image signal obtained by the imaging element; and
a selection means for selecting, as a target for adjusting a focus state, either the person recognized by the recognition means or the marker detected by the marker detection means,
wherein the selection means selects the person recognized by the recognition means when the first motion is detected by the motion detection means, and selects the marker detected by the marker detection means when the second motion is detected by the motion detection means.

撮像素子により得られた画像信号から人物を認識する認識手段と、
第１、第２の音声を検出する音声検出手段と、
前記撮像素子により得られた画像信号からマーカを検出するマーカ検出手段と、
焦点状態を調整する対象として、前記認識手段により認識された人物と前記マーカ検出手段により検出されたマーカのいずれか一方を選択する選択手段と、を備え、
前記選択手段は、前記音声検出手段により第１の音声が検出された場合に、前記認識手段により認識された人物を選択し、前記音声検出手段により第２の音声が検出された場合に前記マーカ検出手段により検出されたマーカを選択することを特徴とする画像処理装置。
An image processing device comprising:
a recognition means for recognizing a person from an image signal obtained by an imaging element;
a voice detection means for detecting a first voice and a second voice;
a marker detection means for detecting a marker from the image signal obtained by the imaging element; and
a selection means for selecting, as a target for adjusting a focus state, either the person recognized by the recognition means or the marker detected by the marker detection means,
wherein the selection means selects the person recognized by the recognition means when the first voice is detected by the voice detection means, and selects the marker detected by the marker detection means when the second voice is detected by the voice detection means.

前記選択手段が対象を選択したタイミングを情報として記録する記録手段をさらに有することを特徴とする請求項１または２に記載の画像処理装置。 The image processing device according to claim 1 or 2, further comprising a recording means for recording information on the timing at which the selection means selects the target.

前記マーカ検出手段が検出したマーカが存在しないように見える画像にレタッチ処理する処理手段と、
前記処理手段が出力したレタッチ処理後の画像信号を映像配信サーバに送信する送信手段をさらに有することを特徴とする請求項１または２に記載の画像処理装置。
The image processing device according to claim 1 or 2, further comprising:
a processing means for retouching the image so that the marker detected by the marker detection means appears not to exist; and
a transmitting means for transmitting the retouched image signal output by the processing means to a video distribution server.

前記処理手段が出力したレタッチ処理後の画像信号のマーカが存在した位置に、焦点状態を調整する対象を示す指標を表示するように制御する表示制御手段をさらに有することを特徴とする請求項４に記載の画像処理装置。 The image processing device according to claim 4, further comprising a display control means for controlling the display of an indicator indicating a target for adjusting the focus state at a position where a marker was present in the retouched image signal output by the processing means.

マーカに対して前ピンとなるように焦点状態を調整する第１のモードと、マーカに対して後ピンとなるように焦点状態を調整する第２のモードと、マーカに対して合焦となるように焦点状態を調整する第３のモードと、を設定可能なモード設定手段をさらに有することを特徴とする請求項１に記載の画像処理装置。 The image processing device according to claim 1, further comprising a mode setting means capable of setting a first mode for adjusting the focus state so that the marker is in front focus, a second mode for adjusting the focus state so that the marker is in back focus, and a third mode for adjusting the focus state so that the marker is in focus.

ユーザ操作に応じて前記第１、第２、第３のモードから特定のモードに設定可能であることを特徴とする請求項６に記載の画像処理装置。 The image processing device according to claim 6, characterized in that a specific mode can be set from the first, second, and third modes in response to a user operation.

前記マーカ検出手段はマーカの形状とマーカの色の少なくともいずれか一方を検出し、
前記モード設定手段は、マーカの形状とマーカの色の少なくともいずれか一方に応じて前記第１、第２、第３のモードを設定することを特徴とする請求項６に記載の画像処理装置。
The image processing device according to claim 6, wherein the marker detection means detects at least one of the shape and the color of the marker, and the mode setting means sets the first, second, and third modes in accordance with at least one of the shape and the color of the marker.

前記第１のモードにおいて、マーカに対して前ピンとなる所定の距離にピントを合わせる対象が存在しない場合は、当該マーカに対して合焦となるように焦点状態を調整することを特徴とする請求項６に記載の画像処理装置。 The image processing device according to claim 6, characterized in that in the first mode, if there is no object to be focused on at a predetermined distance from the marker that is in front focus, the focus state is adjusted so that the marker is in focus.

前記第２のモードにおいて、マーカに対して後ピンとなる所定の距離にピントを合わせる対象が存在しない場合は、当該マーカに対して合焦となるように焦点状態を調整することを特徴とする請求項６に記載の画像処理装置。 The image processing device according to claim 6, characterized in that in the second mode, if there is no object to be focused on at a predetermined distance from the marker that is in back focus, the focus state is adjusted so that the marker is in focus.

前記選択手段は、前記マーカ検出手段により所定の個数以上のマーカが検出されない場合、前記認識手段により認識された人物を選択することを特徴とする請求項１または２に記載の画像処理装置。 The image processing device according to claim 1 or 2, characterized in that the selection means selects a person recognized by the recognition means when the marker detection means does not detect a predetermined number of markers or more.

前記所定の個数は、被写体の大きさ、撮影シーンに応じて設定されることを特徴とする請求項１１に記載の画像処理装置。 The image processing device according to claim 11, characterized in that the predetermined number is set according to the size of the subject and the shooting scene.

撮像素子により得られた画像信号から人物を検出する認識工程と、
前記認識工程により認識された人物の第１、第２の動作を検出する動作検出工程と、
前記撮像素子により得られた画像信号からマーカを検出するマーカ検出工程と、
焦点状態を調整する対象として、前記認識工程により認識された人物と前記マーカ検出工程により検出されたマーカのいずれか一方を選択する選択工程と、を有し、
前記選択工程では、前記動作検出工程により第１の動作が検出された場合に、前記認識工程により認識された人物を選択し、前記動作検出工程により第２の動作が検出された場合に前記マーカ検出工程により検出されたマーカを選択することを特徴とする画像処理方法。
An image processing method comprising:
a recognition step of detecting a person from an image signal obtained by an imaging element;
a motion detection step of detecting first and second motions of the person recognized in the recognition step;
a marker detection step of detecting a marker from the image signal obtained by the imaging element; and
a selection step of selecting, as a target for adjusting a focus state, either the person recognized in the recognition step or the marker detected in the marker detection step,
wherein, in the selection step, the person recognized in the recognition step is selected when the first motion is detected in the motion detection step, and the marker detected in the marker detection step is selected when the second motion is detected in the motion detection step.

撮像素子により得られた画像信号から人物を認識する認識工程と、
第１、第２の音声を検出する音声検出工程と、
前記撮像素子により得られた画像信号からマーカを検出するマーカ検出工程と、
焦点状態を調整する対象として、前記認識工程により認識された人物と前記マーカ検出工程により検出されたマーカのいずれか一方を選択する選択工程と、を有し、
前記選択工程では、前記音声検出工程により第１の音声が検出された場合に、前記認識工程により認識された人物を選択し、前記音声検出工程により第２の音声が検出された場合に前記マーカ検出工程により検出されたマーカを選択することを特徴とする画像処理方法。
An image processing method comprising:
a recognition step of recognizing a person from an image signal obtained by an imaging element;
a voice detection step of detecting a first voice and a second voice;
a marker detection step of detecting a marker from the image signal obtained by the imaging element; and
a selection step of selecting, as a target for adjusting a focus state, either the person recognized in the recognition step or the marker detected in the marker detection step,
wherein, in the selection step, the person recognized in the recognition step is selected when the first voice is detected in the voice detection step, and the marker detected in the marker detection step is selected when the second voice is detected in the voice detection step.

コンピュータを、請求項１または２に記載された画像処理装置の各手段として機能させるためのプログラム。 A program for causing a computer to function as each of the means of the image processing device described in claim 1 or 2.

コンピュータを、請求項１または２に記載された画像処理装置の各手段として機能させるためのプログラムを格納したコンピュータが読み取り可能な記憶媒体。 A computer-readable storage medium storing a program for causing a computer to function as each of the means of the image processing device described in claim 1 or 2.