JP2006203860A

JP2006203860A - Imaging apparatus, imaging method, reproducing apparatus, reproducing method and program

Info

Publication number: JP2006203860A
Application number: JP2005341031A
Authority: JP
Inventors: Ichigaku Mino; 一学三野
Original assignee: Fuji Photo Film Co Ltd
Current assignee: Fujifilm Holdings Corp
Priority date: 2004-12-24
Filing date: 2005-11-25
Publication date: 2006-08-03

Abstract

PROBLEM TO BE SOLVED: To provide an imaging apparatus or reproducing apparatus which allows a user to easily obtain a desirable sound and image.

SOLUTION: The imaging apparatus or reproducing apparatus includes: an imaging section for capturing an image of a subject; a recording section for recording a sound surrounding the imaging section; a threshold sound volume storage section for storing a specified threshold sound volume; a sound extraction section for extracting the sound within a part of period including the sound having the volume larger than the threshold sound volume stored in the threshold sound volume storage section among the sound recorded by the recording section; a data storage section for associating each of a plurality of imaging images imaged by the imaging section with each of a plurality of sounds extracted by the sound extraction section in the order of imaging and recording, and for storing the same; and a data output section for synchronizing the imaged image with the sound which are associated and stored in the data storage section and outputting the same.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to an imaging apparatus, an imaging method, a reproducing apparatus, a reproducing method, and a program. In particular, the present invention relates to an imaging apparatus and an imaging method for capturing images, a reproducing apparatus and a reproducing method for reproducing images, and a program for the imaging apparatus and the reproducing apparatus.

Conventionally, there are digital still cameras that can record moving images as well as still images on a memory card, and that can record sound detected by a microphone during shooting in association with the still or moving image (see, for example, Patent Document 1).
Patent Document 1: JP-A-7-154734

However, when such a camera is used to photograph a bird in the mountains, for example, ambient noise is recorded together with the bird's song, and the sound played back with the image may turn out to be uninteresting. In such a case, it is desirable to make viewing more enjoyable by cutting sounds other than the bird's song, or by playing back sound from a time when the ambient noise was low. It is also desirable that the user can easily enjoy images and sound without cumbersome work such as editing the images and sound after shooting.

Accordingly, an object of the present invention is to provide an imaging apparatus, an imaging method, a reproducing apparatus, a reproducing method, and a program that can solve the above problems. This object is achieved by the combinations of features recited in the independent claims. The dependent claims define further advantageous specific examples of the present invention.

An imaging apparatus according to a first aspect of the present invention includes: an imaging unit that images a subject; a recording unit that records sound around the imaging unit; a threshold volume storage unit that stores a set threshold volume; a sound extraction unit that extracts, from the sound recorded by the recording unit, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage unit; a data storage unit that stores a captured image captured by the imaging unit in association with the sound extracted by the sound extraction unit; and a data output unit that outputs, in synchronization, the captured image and the sound stored in association with each other in the data storage unit.

The data storage unit may store each of a plurality of captured images captured by the imaging unit in association with each of a plurality of sounds extracted by the sound extraction unit, in the order in which they were captured and recorded.

The imaging apparatus may further include a sound storage unit that stores the sound recorded by the recording unit, and a threshold volume setting unit that sets the threshold volume stored in the threshold volume storage unit based on the volume distribution of the sound stored in the sound storage unit. The threshold volume setting unit may set the threshold volume higher when the average volume of the sound stored in the sound storage unit is higher.

The imaging apparatus may further include a threshold volume setting unit that sets the threshold volume stored in the threshold volume storage unit so that the total duration of the plurality of sounds extracted by the sound extraction unit equals the number of captured images captured by the imaging unit multiplied by a predetermined playback time per captured image. The threshold volume storage unit may store a band-specific threshold volume in association with each of a plurality of frequency bands, and the sound extraction unit may compare the volume of the sound recorded by the recording unit, band by band, with the band-specific threshold volumes stored in the threshold volume storage unit, and extract the sound of partial periods in which the volume in a specific frequency band exceeds the band-specific threshold volume.

The imaging apparatus may further include an environment specifying unit that specifies the environment around the imaging apparatus, a variable filter unit that passes sound in a set frequency band, and a band control unit that sets the frequency band passed by the variable filter unit according to the environment specified by the environment specifying unit. The recording unit may record the sound passed by the variable filter unit.

The imaging apparatus may further include a position detection unit that detects the position of the imaging apparatus, and an environment information storage unit that stores information indicating an environment in association with information indicating a position. The environment specifying unit may search the environment information storage unit based on the position detected by the position detection unit to specify the environment around the imaging apparatus.

The imaging apparatus may further include a time detection unit that detects the time, and an environment information storage unit that stores information indicating an environment in association with information indicating a time. The environment specifying unit may search the environment information storage unit based on the time detected by the time detection unit to specify the environment around the imaging apparatus.

An imaging method according to a second aspect of the present invention includes: a stage of imaging a subject using an imaging unit; a recording stage of recording sound around the imaging unit; a threshold volume storage stage of storing a set threshold volume; a sound extraction stage of extracting, from the sound recorded in the recording stage, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage stage; a data storage stage of storing a captured image captured by the imaging unit in association with the sound extracted in the sound extraction stage; and a data output stage of outputting, in synchronization, the captured image and the sound stored in association with each other in the data storage stage.

According to a third aspect of the present invention, a program for an imaging apparatus that captures images causes the imaging apparatus to function as: an imaging unit that images a subject; a recording unit that records sound around the imaging unit; a threshold volume storage unit that stores a set threshold volume; a sound extraction unit that extracts, from the sound recorded by the recording unit, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage unit; a data storage unit that stores a captured image captured by the imaging unit in association with the sound extracted by the sound extraction unit; and a data output unit that outputs, in synchronization, the captured image and the sound stored in association with each other in the data storage unit.

A reproducing apparatus according to a fourth aspect of the present invention includes: a captured image storage unit that stores a captured image captured by an imaging apparatus; a sound storage unit that stores sound recorded by the imaging apparatus; a threshold volume storage unit that stores a threshold volume; a sound extraction unit that extracts, from the sound stored in the sound storage unit, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage unit; a data storage unit that stores the captured image stored in the captured image storage unit in association with the sound extracted by the sound extraction unit; and a data output unit that outputs, in synchronization, the captured image and the sound stored in association with each other in the data storage unit.

The reproducing apparatus may further include an allowable time storage unit that stores a set allowable time. The captured image storage unit may store each captured image in association with the time at which it was captured by the imaging apparatus, and the sound storage unit may store the sound in association with the time at which it was recorded by the imaging apparatus. The sound extraction unit may then extract, from the sound recorded within the allowable time stored in the allowable time storage unit of the time at which a captured image stored in the captured image storage unit was captured, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage unit.

The reproducing apparatus may further include: an instruction receiving unit that receives an instruction to reproduce a captured image stored in the captured image storage unit; a time detection unit that detects the time at which the instruction receiving unit received the instruction; and an allowable time control unit that sets the allowable time stored in the allowable time storage unit longer as the difference between the time at which the captured image was captured and the time detected by the time detection unit becomes larger.

The reproducing apparatus may further include a threshold volume setting unit that sets the threshold volume stored in the threshold volume storage unit based on the volume distribution of the sound stored in the sound storage unit. The threshold volume setting unit may set the threshold volume higher when the average volume of the sound stored in the sound storage unit is higher.

A reproducing method according to a fifth aspect of the present invention includes: a captured image storage stage of storing a captured image captured by an imaging apparatus; a sound storage stage of storing sound recorded by the imaging apparatus; a threshold volume storage stage of storing a threshold volume; a sound extraction stage of extracting, from the sound stored in the sound storage stage, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage stage; a data storage stage of storing the captured image stored in the captured image storage stage in association with the sound extracted in the sound extraction stage; and a data output stage of outputting, in synchronization, the captured image and the sound stored in association with each other in the data storage stage.

According to a sixth aspect of the present invention, a program for a reproducing apparatus that reproduces images causes the reproducing apparatus to function as: a captured image storage unit that stores a captured image captured by an imaging apparatus; a sound storage unit that stores sound recorded by the imaging apparatus; a threshold volume storage unit that stores a threshold volume; a sound extraction unit that extracts, from the sound stored in the sound storage unit, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage unit; a data storage unit that stores the captured image stored in the captured image storage unit in association with the sound extracted by the sound extraction unit; and a data output unit that outputs, in synchronization, the captured image and the sound stored in association with each other in the data storage unit.

Note that the above summary of the invention does not enumerate all of the necessary features of the present invention, and sub-combinations of these feature groups can also constitute inventions.

According to the present invention, it is possible to provide an imaging apparatus or a reproducing apparatus with which a user can easily obtain desirable sound and images.

Hereinafter, the present invention will be described through embodiments of the invention. The following embodiments do not limit the invention according to the claims, and not all of the combinations of features described in the embodiments are necessarily essential to the solving means of the invention.

FIG. 1 shows an example of the usage environment of the imaging apparatus 100 and the reproducing apparatus 140 according to the first embodiment of the present invention. In this example, the imaging apparatus 100 captures images of people visiting a beach. The imaging apparatus 100 also records the sound around it with the microphone 102. The imaging apparatus 100 outputs the captured images and the recorded sound to the reproducing apparatus 140 through a communication line 150 such as the Internet. The reproducing apparatus 140 reproduces the images received from the imaging apparatus 100 while playing back the sound received from the imaging apparatus 100.

At this time, the imaging apparatus 100 outputs to the reproducing apparatus 140, together with the captured image, the characteristic sound recorded at the place and time at which the captured image was captured. When reproducing a captured image from the imaging apparatus 100, the reproducing apparatus 140 plays back, together with the captured image, the characteristic portion of the recorded sound from the time the image was captured. The user 180 can therefore easily obtain desirable sound and images.

The imaging apparatus 100 may be, for example, a digital still camera or a camera-equipped mobile phone carried by the user 180. The reproducing apparatus 140 may be, for example, an HDTV or a photo stand capable of reproducing images and sound, or a computer that reproduces images and sound. The imaging apparatus 100 may itself have the image and sound reproduction functions of the reproducing apparatus 140. The imaging apparatus 100 may record the image and sound data on a recording medium, and the reproducing apparatus 140 may read the data from that recording medium and reproduce the images and sound. The imaging apparatus 100 may also store the image and sound data on a server connected to the communication line 150, in a directory provided for each user 180, for example a directory associated with the imaging apparatus 100. The reproducing apparatus 140 may then receive the image and sound data stored on the server for each user 180.

FIG. 2 shows an example of the block configuration of the imaging apparatus 100. The imaging apparatus 100 includes an imaging unit 212, a recording unit 214, a sound storage unit 216, a sound extraction unit 218, a threshold volume setting unit 220, a threshold volume storage unit 222, a data storage unit 232, a data output unit 234, a variable filter unit 242, a band control unit 244, an environment specifying unit 252, an environment information storage unit 246, a position detection unit 248, and a time detection unit 250.

The imaging unit 212 images a subject. Specifically, the imaging unit 212 receives light from the subject with an imaging device such as a CCD and captures an image of the subject. The imaging unit 212 may capture the subject continuously at predetermined time intervals and hold a predetermined number of the resulting images. From the held images, the imaging unit 212 may then select the image captured at the timing closest to the time at which imaging was instructed, and treat it as the captured image for that time.
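The continuous capture and selection described above can be pictured as a small ring buffer. The following is only a minimal sketch under stated assumptions: the class name `FrameBuffer`, its methods, and the use of plain numeric timestamps are hypothetical and do not appear in the specification.

```python
from collections import deque


class FrameBuffer:
    """Holds the most recent `capacity` frames as (timestamp, image) pairs.

    Hypothetical helper: the specification only says that a predetermined
    number of continuously captured images is held.
    """

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest frame once full.
        self.frames = deque(maxlen=capacity)

    def push(self, timestamp, image):
        self.frames.append((timestamp, image))

    def closest_to(self, instructed_time):
        # Pick the held frame whose timestamp is nearest to the moment
        # at which imaging was instructed.
        if not self.frames:
            raise ValueError("no frames held")
        return min(self.frames, key=lambda f: abs(f[0] - instructed_time))


buf = FrameBuffer(capacity=3)
for t in (0.0, 0.1, 0.2, 0.3):
    buf.push(t, f"frame@{t}")

# Only the frames at 0.1, 0.2 and 0.3 are still held at this point.
selected = buf.closest_to(0.24)
```

Holding only a fixed number of frames keeps memory bounded while still letting the apparatus pick a frame taken slightly before the shutter instruction.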

The recording unit 214 records the sound around the imaging unit 212. For example, the recording unit 214 records the sound picked up by the microphone 102. The sound storage unit 216 stores the sound recorded by the recording unit 214.

The threshold volume storage unit 222 stores a set threshold volume. The sound extraction unit 218 extracts, from the sound recorded by the recording unit 214, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage unit 222.
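As an illustration of this extraction step, the sketch below works on a list of per-frame volume values and returns the index ranges of the periods that contain above-threshold sound. It is a sketch under assumptions, not the method of the specification: the function name `extract_loud_periods`, the per-frame volume representation, and the optional `margin` of context frames are all hypothetical.

```python
def extract_loud_periods(volumes, threshold, margin=0):
    """Return (start, end) index pairs, end exclusive, of periods that
    contain volume above `threshold`.

    `margin` (an assumption, not in the specification) keeps that many
    frames of context on each side of a loud run.
    """
    runs = []
    start = None
    for i, v in enumerate(volumes):
        if v > threshold and start is None:
            start = i                      # a loud run begins
        elif v <= threshold and start is not None:
            runs.append((start, i))        # the loud run ended at i
            start = None
    if start is not None:
        runs.append((start, len(volumes)))  # recording ended while loud

    # Widen each run by the margin and merge runs that now overlap.
    merged = []
    for s, e in runs:
        s, e = max(0, s - margin), min(len(volumes), e + margin)
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


volumes = [1, 2, 9, 8, 1, 1, 1, 7, 1]
periods = extract_loud_periods(volumes, threshold=5, margin=1)
# periods == [(1, 5), (6, 9)]
```

With `margin=0` the same input yields only the strictly loud runs, `[(2, 4), (7, 8)]`.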

The data storage unit 232 stores the captured image captured by the imaging unit 212 in association with the sound extracted by the sound extraction unit 218. Specifically, the data storage unit 232 stores each of the plurality of captured images captured by the imaging unit 212 in association with each of the plurality of sounds extracted by the sound extraction unit 218, in the order in which they were captured and recorded. The data output unit 234 outputs, in synchronization, the captured image and the sound stored in association with each other in the data storage unit 232. The user 180 can therefore easily enjoy the captured images and the sound.
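The in-order association can be sketched as sorting both lists by time and pairing them position by position. The dictionary keys `captured_at` and `recorded_at` and the function name are hypothetical, and dropping leftover items when the counts differ is an assumption; the specification does not say how unequal counts are handled.

```python
def associate_in_order(images, sounds):
    """Pair the i-th captured image with the i-th extracted sound.

    Both lists are first put in chronological order. zip() stops at the
    shorter list, so surplus images or sounds are simply left unpaired
    (an assumption made for this sketch).
    """
    images = sorted(images, key=lambda x: x["captured_at"])
    sounds = sorted(sounds, key=lambda x: x["recorded_at"])
    return list(zip(images, sounds))


images = [
    {"id": "img2", "captured_at": 20},
    {"id": "img1", "captured_at": 10},
]
sounds = [
    {"id": "s1", "recorded_at": 12},
    {"id": "s2", "recorded_at": 25},
    {"id": "s3", "recorded_at": 40},
]
pairs = associate_in_order(images, sounds)
pair_ids = [(img["id"], snd["id"]) for img, snd in pairs]
```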

The threshold volume storage unit 222 may store a threshold volume set by the user 180 of the imaging apparatus 100, or a threshold volume set by the threshold volume setting unit 220. The threshold volume setting unit 220 sets the threshold volume stored in the threshold volume storage unit 222 so that the total duration of the plurality of sounds extracted by the sound extraction unit 218 equals the number of captured images captured by the imaging unit 212 multiplied by a predetermined playback time per captured image.
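One way to picture this setting rule: the extracted duration can only shrink as the threshold rises, so a threshold can be searched for that brings the total as close as possible to (number of images) × (playback time per image). The sketch below is an assumption-laden illustration; the function names, the per-frame volume list, the `frame_sec` parameter, and the choice to take the closest achievable total when exact equality is impossible are all hypothetical.

```python
def extracted_duration(volumes, threshold, frame_sec):
    """Total duration of the frames louder than `threshold`."""
    return sum(1 for v in volumes if v > threshold) * frame_sec


def choose_threshold(volumes, num_images, playback_sec, frame_sec):
    """Pick the threshold whose extracted duration is closest to
    num_images * playback_sec.

    Only thresholds at observed volume levels (plus one below the
    minimum) can change the result, so those are the candidates.
    """
    target = num_images * playback_sec
    candidates = [min(volumes) - 1] + sorted(set(volumes))
    return min(
        candidates,
        key=lambda th: abs(extracted_duration(volumes, th, frame_sec) - target),
    )


volumes = [1, 2, 9, 8, 1, 1, 1, 7, 1, 6]
# Two images shown for 1.5 s each need 3.0 s of sound in total.
threshold = choose_threshold(volumes, num_images=2, playback_sec=1.5, frame_sec=1.0)
```

In this example a threshold of 6 leaves exactly the three frames with volumes 9, 8 and 7, which matches the 3.0 s target.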

The threshold volume setting unit 220 may also set the threshold volume stored in the threshold volume storage unit 222 based on the volume distribution of the sound stored in the sound storage unit 216. Specifically, the threshold volume setting unit 220 may set the threshold volume higher when the volume distribution of the stored sound is skewed toward louder values. For example, the threshold volume setting unit 220 may set the threshold volume higher when the average volume of the sound stored in the sound storage unit 216 is higher.
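A very simple reading of the average-based rule is to scale the mean volume by a constant. This is only a sketch: the function name and the `factor` of 1.5 are hypothetical, and the specification states only that a higher average leads to a higher threshold.

```python
from statistics import mean


def threshold_from_mean(volumes, factor=1.5):
    """Set the threshold proportional to the average recorded volume.

    `factor` is an illustrative assumption; any rule that increases
    with the mean would satisfy the description.
    """
    return factor * mean(volumes)


quiet_recording = [2, 4, 6]
noisy_recording = [20, 40, 60]
quiet_threshold = threshold_from_mean(quiet_recording)
noisy_threshold = threshold_from_mean(noisy_recording)
```

A recording made in a noisy place thus gets a higher threshold, so that ordinary background noise is not mistaken for a characteristic sound.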

The threshold volume storage unit 222 may also store a band-specific threshold volume in association with each of a plurality of frequency bands. The sound extraction unit 218 may then compare the volume of the sound recorded by the recording unit 214, band by band, with the band-specific threshold volumes stored in the threshold volume storage unit 222, and extract the sound of partial periods in which the volume in a specific frequency band exceeds the band-specific threshold volume. The user 180 can therefore enjoy, together with the captured image, sound in a frequency band that suits the image. For example, by setting a low threshold volume for the frequency band from 100 Hz to 4000 Hz, an image of people playing in an amusement park can easily be viewed together with their voices.
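The band-by-band comparison can be sketched as follows, assuming the per-band volume of each frame has already been measured. The band boundaries other than the 100 Hz to 4000 Hz voice band, the threshold values, and all names are hypothetical.

```python
VOICE_BAND = (100, 4000)    # Hz; the band named in the text above
HIGH_BAND = (4000, 20000)   # Hz; an illustrative second band

# A low threshold for the voice band makes voices easy to select.
band_thresholds = {VOICE_BAND: 3.0, HIGH_BAND: 8.0}


def exceeds_band_threshold(band_volumes, band_thresholds):
    """True if the frame is louder than the threshold in any band."""
    return any(
        band_volumes.get(band, 0.0) > limit
        for band, limit in band_thresholds.items()
    )


# Each frame maps a band to its measured volume (made-up values).
frames = [
    {VOICE_BAND: 1.0, HIGH_BAND: 2.0},
    {VOICE_BAND: 4.5, HIGH_BAND: 2.0},   # voice above its low threshold
    {VOICE_BAND: 1.0, HIGH_BAND: 7.0},   # loud, but below the high-band limit
    {VOICE_BAND: 1.0, HIGH_BAND: 9.0},
]
selected_frames = [
    i for i, f in enumerate(frames) if exceeds_band_threshold(f, band_thresholds)
]
```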

The environment information storage unit 246 stores information indicating an environment in association with information indicating a position. Specifically, the environment information storage unit 246 stores latitude and longitude information in association with environment information for that latitude and longitude. The environment information may indicate a natural environment such as sea, mountain, or river. It may also indicate an environment used by people, such as an amusement park, a ball game ground, or a music hall.

The position detection unit 248 detects the position of the imaging apparatus 100. For example, the position detection unit 248 determines the latitude and longitude at which the imaging apparatus 100 is located by receiving latitude and longitude information from GPS satellites.

The environment specifying unit 252 searches the environment information storage unit 246 based on the position detected by the position detection unit 248 and specifies the environment around the imaging apparatus 100. For example, the environment specifying unit 252 searches the environment information storage unit 246 for environment information that matches the latitude and longitude detected by the position detection unit 248.
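The position-based lookup could be sketched as a search over a small table of latitude and longitude ranges. The table layout, the coordinate values, and the function name are hypothetical; the specification only says that environment information is stored in association with latitude and longitude.

```python
# Each row: (lat_min, lat_max, lon_min, lon_max, environment).
# The coordinates are made-up illustration values.
ENVIRONMENT_TABLE = [
    (35.30, 35.32, 139.47, 139.50, "sea"),
    (35.35, 35.37, 138.72, 138.74, "mountain"),
]


def identify_environment(lat, lon, table=ENVIRONMENT_TABLE):
    """Return the environment whose range contains (lat, lon), or None."""
    for lat_min, lat_max, lon_min, lon_max, environment in table:
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
            return environment
    return None  # no stored environment matches this position


environment = identify_environment(35.31, 139.48)
```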

The environment information storage unit 246 may also store information indicating an environment in association with information indicating a time. For example, the environment information storage unit 246 stores times, including dates, in association with seasons.

The time detection unit 250 detects the time. The environment specifying unit 252 searches the environment information storage unit 246 based on the time detected by the time detection unit 250 and specifies the environment around the imaging apparatus 100. For example, the environment specifying unit 252 searches the environment information storage unit 246 to specify the season that corresponds to the time detected by the time detection unit 250.
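The time-based lookup can be illustrated with a month-to-season table. The mapping below assumes northern-hemisphere meteorological seasons, which is an assumption of this sketch and not something the specification states; the names are hypothetical as well.

```python
from datetime import datetime

# Assumed mapping (northern hemisphere, meteorological seasons).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def identify_season(detected_time):
    """Look up the season for a detected date and time."""
    return SEASON_BY_MONTH[detected_time.month]


season = identify_season(datetime(2005, 8, 3, 14, 30))
```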

The band control unit 244 sets the frequency band passed by the variable filter unit 242 according to the environment around the imaging apparatus 100 specified by the environment specifying unit 252. The variable filter unit 242 passes sound in the frequency band set by the band control unit 244. The recording unit 214 then records the sound passed by the variable filter unit 242.
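The interplay of band control and variable filter can be sketched as a table from environment to passband, followed by a filter that keeps only components inside that band. Everything here is an illustrative assumption: the passband values, the default band, the representation of sound as (frequency, amplitude) components, and the function names are not taken from the specification.

```python
# Hypothetical passbands in Hz for each environment.
PASSBANDS = {
    "sea": (50, 2000),
    "mountain": (1000, 8000),
    "amusement park": (100, 4000),
}
DEFAULT_PASSBAND = (20, 20000)  # used when the environment is unknown


def set_passband(environment):
    """Band control: choose the passband for the specified environment."""
    return PASSBANDS.get(environment, DEFAULT_PASSBAND)


def variable_filter(components, passband):
    """Variable filter: keep only components inside the passband.

    `components` is a list of (frequency_hz, amplitude) pairs, a
    simplification standing in for a real audio filter.
    """
    low, high = passband
    return [(f, a) for f, a in components if low <= f <= high]


components = [(60, 0.8), (440, 0.5), (3000, 0.4), (12000, 0.2)]
passband = set_passband("mountain")
recorded = variable_filter(components, passband)
```

In a real device the filter would act on the audio signal itself; the component list is used here only to keep the example self-contained.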

The user 180 can therefore record sound in a frequency band suited to the environment and time at which images are captured with the imaging apparatus 100.

FIG. 3 shows an example of the block configuration of the reproducing apparatus 140. The reproducing apparatus 140 includes a sound storage unit 316, a sound extraction unit 318, a captured image storage unit 320, a data storage unit 332, a data output unit 334, an instruction receiving unit 312, an allowable time control unit 362, an allowable time storage unit 364, a time detection unit 360, a threshold volume storage unit 322, and a threshold volume setting unit 324.

The captured image storage unit 320 stores captured images captured by the imaging apparatus 100, and the sound storage unit 316 stores sound recorded by the imaging apparatus 100. Specifically, the captured image storage unit 320 stores each captured image in association with the time at which it was captured by the imaging apparatus 100, and the sound storage unit 316 stores the sound in association with the time at which it was recorded by the imaging apparatus 100.

The threshold volume storage unit 322 stores a threshold volume. The sound extraction unit 318 extracts, from the sound stored in the sound storage unit 316, the sound of partial periods that contain sound louder than the threshold volume stored in the threshold volume storage unit 322.

The data storage unit 332 stores the captured image stored in the captured image storage unit 320 in association with the sound extracted by the sound extraction unit 318. The instruction receiving unit 312 receives an instruction to reproduce a captured image stored in the captured image storage unit 320, for example an instruction from the user 180. When the instruction receiving unit 312 receives the instruction, the data output unit 334 outputs, in synchronization, the captured image and the sound stored in association with each other in the data storage unit 332.

The allowable time storage unit 364 stores the set allowable time. From the sound recorded at times that fall within the allowable time stored in the allowable time storage unit 364, measured from the time at which a captured image stored in the captured image storage unit 320 was captured, the sound extraction unit 318 extracts the sound of a partial period that includes sound louder than the threshold volume stored in the threshold volume storage unit 322.

Specifically, the sound extraction unit 318 considers the sound recorded within the time range that extends before and/or after the capture time of a captured image stored in the captured image storage unit 320 by the allowable time stored in the allowable time storage unit 364. From that sound, it extracts the sound of a partial period that includes sound louder than the threshold volume stored in the threshold volume storage unit 322.

The threshold volume setting unit 324 sets the threshold volume stored in the threshold volume storage unit 322 so that the total length of the periods of the plurality of sounds extracted by the sound extraction unit 318 equals a predetermined playback time of the captured images.

The time detection unit 360 detects the time at which the instruction receiving unit 312 receives the instruction. The allowable time control unit 362 then sets the allowable time stored in the allowable time storage unit 364 to a longer value as the difference between the capture time of the captured image stored in the captured image storage unit 320 and the time detected by the time detection unit 360 becomes larger. Consequently, the playback device 140 can play back a captured image from the distant past together with more characteristic sound selected from sound recorded over a wider time range. For a captured image from the more recent past, the playback device 140 selects the sound to play back from sound recorded close to the capture time, which prevents the sound played back with the captured image from seeming unnatural.
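The following is merely an illustrative sketch of how an allowable time could be lengthened as the gap between capture time and playback instruction grows. The description only requires that a larger gap gives a longer allowable time; the linear rule, the function name `allowable_time`, and the values of `ratio` and `minimum` are assumptions introduced here for illustration.

```python
from datetime import datetime, timedelta


def allowable_time(captured_at, requested_at, ratio=0.3,
                   minimum=timedelta(hours=1)):
    """Return an allowable time that grows with the age of the image.

    `ratio` and `minimum` are hypothetical tuning values, not taken
    from the description.
    """
    # Age of the image at the moment the playback instruction is received.
    age = max(requested_at - captured_at, timedelta(0))
    # Linear growth with a floor: older images get a wider search window.
    return max(minimum, age * ratio)
```

Any other non-decreasing function of the age would satisfy the same requirement; the linear form is used here only because it is the simplest to read.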

The threshold volume setting unit 324 may set the threshold volume stored in the threshold volume storage unit 322 based on the volume distribution of the sound stored in the sound storage unit 316. Specifically, the threshold volume setting unit 324 may set the threshold volume higher when the volume distribution of the stored sound is skewed toward higher volumes. More specifically, the threshold volume setting unit 324 may set the threshold volume higher when the average volume of the sound stored in the sound storage unit 316 is higher.

FIG. 4 shows an example of the correspondence between captured images and sound. The imaging unit 212 holds six images captured in the order of times t3, t4, t7, t9, t13, and t14. The sound storage unit 216 stores the sound recorded by the recording unit 214 in association with time; the volume of this sound is shown by the volume waveform 402.

A threshold volume L412 is set in the threshold volume storage unit 222. The sound extraction unit 218 extracts, from the sound of the volume waveform 402, the sound that is louder than the threshold volume L412 stored in the threshold volume storage unit 222. Here, the threshold volume setting unit 220 sets the threshold volume L412 so that the total length of the periods in which the volume waveform 402 exceeds the threshold volume L412 (t1 to t2, t5 to t10, and t12 to t16) equals the total playback time of the six images captured by the imaging unit 212. The data storage unit 232 then divides the sound of the periods extracted by the sound extraction unit 218 into segments of one playback time each and stores the segments, in order, in association with the captured images in the order in which the imaging unit 212 captured them.

Specifically, the data storage unit 232 stores the image captured at time t3 in association with the sound recorded in the period (t1 to t2). The data storage unit 232 also stores the images captured at times t4, t7, and t9 in association with the sound of the periods (t5 to t6), (t6 to t8), and (t8 to t10), respectively, which are obtained by dividing the sound recorded in the period (t5 to t10) by the playback time. Similarly, the data storage unit 232 associates the images captured at times t13 and t14 with the sound recorded in the periods (t12 to t15) and (t15 to t16), respectively. To simplify the explanation, each of the periods (t1 to t2), (t5 to t6), (t6 to t8), (t8 to t10), (t12 to t15), and (t15 to t16) is assumed to be equal in length to the preset playback time. As a result, the user 180 can easily associate captured images with sound simply by recording the ambient sound while capturing images with the imaging apparatus 100.
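The threshold setting and the in-order association described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the embodiment: the volume waveform is modeled as a list of equally spaced samples, the playback time of one image is expressed as a number of samples, and the function names `find_threshold` and `assign_segments` are hypothetical.

```python
def find_threshold(volume, loud_samples_needed):
    """Pick a threshold so that about `loud_samples_needed` samples exceed it.

    With tied sample values, fewer samples than requested may exceed
    the returned threshold.
    """
    if loud_samples_needed <= 0:
        return max(volume)
    ranked = sorted(volume, reverse=True)
    if loud_samples_needed >= len(ranked):
        return min(volume) - 1.0
    # Everything strictly louder than the next-ranked sample is kept.
    return ranked[loud_samples_needed]


def assign_segments(volume, threshold, image_count, samples_per_image):
    """Split the loud samples, in time order, into one segment per image."""
    loud = [i for i, v in enumerate(volume) if v > threshold]
    return [
        loud[k * samples_per_image:(k + 1) * samples_per_image]
        for k in range(image_count)
    ]


# Three images, two samples of playback time each.
volume = [1, 6, 7, 1, 1, 7, 8, 9, 7, 1, 1, 1]
threshold = find_threshold(volume, loud_samples_needed=3 * 2)
segments = assign_segments(volume, threshold, image_count=3,
                           samples_per_image=2)
```

In this toy input the first loud period supplies the segment for the first image and the second loud period is divided between the second and third images, which mirrors how the period (t5 to t10) is divided among the images captured at t4, t7, and t9.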

Note that the volume and the threshold volume may refer to sound pressure. Alternatively, the volume and the threshold volume may refer to loudness as perceived by human hearing.

FIG. 5 shows another example of the correspondence between captured images and sound. The imaging unit 212 holds six images captured in the order of times t3, t4, t7, t9, t13, and t14. The sound storage unit 216 stores the sound recorded by the recording unit 214, whose volume is shown by the volume waveform 402. In this case, the threshold volume setting unit 220 calculates the time-averaged volume Lav from the volume waveform stored in the sound storage unit 216. The threshold volume setting unit 220 then causes the threshold volume storage unit 222 to store a threshold volume L412 obtained by multiplying the calculated average volume Lav by a predetermined coefficient (for example, a coefficient of 1 or more). The sound extraction unit 218 extracts, from the sound of the volume waveform 402, the sound that is louder than the threshold volume L412 stored in the threshold volume storage unit 222, for example the sound of the periods (t1 to t2, t5 to t10, and t12 to t16). The data storage unit 232 stores the sound of the periods extracted by the sound extraction unit 218 in association with the images in the order of their capture times, in the same manner as the association between sound and images described with reference to FIG. 4.

In this way, the threshold volume setting unit 220 can set an appropriate threshold according to the average volume of the sound stored in the sound storage unit 216. If the total length of the periods in which the sound is louder than the threshold volume L412 set by the threshold volume setting unit 220 is shorter than the playback time of one image multiplied by the number of images, the data storage unit 232 may divide that total length by the number of images and store each resulting segment of sound in association with an image, in the order of the capture times of the images. Conversely, if the total length of those periods is longer than the playback time of one image multiplied by the number of images, the data storage unit 232 may select, for each of the plurality of images, the sound of the period that is closest to the time at which the image was captured and that differs from the periods of sound associated with the other images, and store the selected sound in association with the image.
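As a non-limiting illustration, the average-based threshold and the two fallback cases described above might be sketched as follows. The function names, the default coefficient, the use of sample counts for durations, and the choice to measure closeness from the midpoint of each sound period are all assumptions made for this sketch.

```python
def average_threshold(volume, coefficient=1.2):
    """Threshold = time-averaged volume times a predetermined coefficient."""
    return coefficient * sum(volume) / len(volume)


def samples_per_image(loud_count, image_count, playback_samples):
    """Shorten each image's segment when too little loud sound exists."""
    if loud_count < image_count * playback_samples:
        # Share the available loud sound evenly among the images.
        return loud_count // image_count
    return playback_samples


def pick_nearest_periods(periods, capture_times):
    """Give each image the nearest period not already used by another image.

    `periods` is a list of (start, end) times. There must be at least as
    many periods as images. Images are processed in capture order.
    """
    unused = list(periods)
    chosen = []
    for t in capture_times:
        best = min(unused, key=lambda p: abs((p[0] + p[1]) / 2 - t))
        unused.remove(best)
        chosen.append(best)
    return chosen
```

The greedy, capture-order selection in `pick_nearest_periods` is only one way to keep the selected periods distinct; the description does not prescribe a particular selection order.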

FIG. 6 shows an example of threshold volumes set for individual frequency bands. For example, the threshold volume storage unit 222 stores a threshold volume L501, a threshold volume L502, and a threshold volume L503 as the threshold volumes for sound in the frequency bands f551 to f552, f552 to f553, and f553 to f554, respectively. The volume waveforms 520, 530, and 540 in FIG. 6 show how the volume of the components in the frequency bands f551 to f552, f552 to f553, and f553 to f554, respectively, changes over time.

The sound extraction unit 218 detects the periods (t50 to t51) and (t55 to t57), which include volume greater than the threshold volume L501 in the frequency band f551 to f552, and the period (t56 to t58), which includes volume greater than the threshold volume L502 in the frequency band f552 to f553. The sound extraction unit 218 then extracts the sound of the periods (t50 to t51) and (t55 to t58), in which the volume exceeds the threshold volume in at least one of the frequency bands.

In this case, setting the threshold volume L503 for the frequency band f553 to f554 higher than the threshold volumes L502 and L501 prevents the sound extraction unit 218 from extracting, for example, the sound of the period around time t53, in which the volume is high in the frequency band f553 to f554. This makes it possible to prevent, for example, the whispering of people around the imaging apparatus 100 from being associated with a captured image in which the imaging apparatus 100 captured a bird as the subject. The imaging apparatus 100 can therefore provide the user 180 with sound from periods in which the volume in a desired frequency band is high, in association with the captured image.
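The band-by-band comparison described above can be illustrated with the following sketch. It assumes, purely for illustration, that the volume of each frequency band is available as a list of per-frame values; the band names, the numeric values, and the function names `loud_mask` and `mask_to_periods` are hypothetical.

```python
def loud_mask(band_volumes, band_thresholds):
    """Mark each frame in which any band exceeds its own threshold."""
    frame_count = len(next(iter(band_volumes.values())))
    return [
        any(band_volumes[b][i] > band_thresholds[b] for b in band_volumes)
        for i in range(frame_count)
    ]


def mask_to_periods(mask):
    """Convert a per-frame mask into (start, end) frame ranges, end exclusive."""
    periods, start = [], None
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            periods.append((start, i))
            start = None
    if start is not None:
        periods.append((start, len(mask)))
    return periods


band_volumes = {
    "low":  [5, 1, 1, 1, 4, 4, 1, 1],
    "mid":  [1, 1, 1, 1, 1, 4, 4, 1],
    "high": [1, 1, 6, 1, 1, 1, 1, 1],
}
# The high band gets a deliberately high threshold so its burst is ignored.
band_thresholds = {"low": 3, "mid": 3, "high": 9}
periods = mask_to_periods(loud_mask(band_volumes, band_thresholds))
```

Here the burst in the "high" band at frame 2 is not extracted because its threshold is set above the burst, while the overlapping loud periods in the "low" and "mid" bands are merged into a single extracted period.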

FIG. 7 shows an example of the frequency bands of sound passed by the variable filter unit 242. The band control unit 244 sets the frequency band of sound that the variable filter unit 242 passes according to the environment around the imaging apparatus 100 identified by the environment identification unit 252. Specifically, the environment information storage unit 246 stores latitude and longitude information in association with environment information for that latitude and longitude. The environment information may be, for example, information indicating a natural environment such as the sea, a mountain, or a river. The environment information may also be information indicating an environment used by people, such as an amusement park, a ball game stadium, or a music hall.

The position detection unit 248 identifies the latitude and longitude at which the imaging apparatus 100 is located, for example by receiving latitude and longitude information from GPS satellites. The environment identification unit 252 then identifies the environment that matches the latitude and longitude information detected by the position detection unit 248 by searching the environment information storage unit 246. The band control unit 244 determines the frequency band that the variable filter unit 242 passes so that sound in the frequency band corresponding to the environment information identified by the environment identification unit 252 is recorded.

For example, when the environment identified by the environment identification unit 252 is a mountain, the band control unit 244 causes the variable filter unit 242 to pass sound in the frequency band from 3000 Hz to 20000 Hz so that the recording unit 214 mainly records, for example, the calls of insects and birds that live in the mountains. When the environment identified by the environment identification unit 252 is an amusement park, the band control unit 244 causes the variable filter unit 242 to pass sound that includes the frequency band from 100 Hz to 4000 Hz so that the recording unit 214 records, for example, people's cheers.

The band control unit 244 may also control the frequency band of sound that the variable filter unit 242 passes according to the time detected by the time detection unit 250. Specifically, the environment information storage unit 246 stores times, including dates, in association with seasons. The environment identification unit 252 identifies the season corresponding to the time detected by the time detection unit 250 by searching the environment information storage unit 246. When the environment identification unit 252 identifies the season as summer, for example, the band control unit 244 causes the variable filter unit 242 to pass a frequency band that includes, for example, the frequency band of cicada calls (4000 Hz to 5000 Hz).
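As an illustrative sketch only, the selection of a pass band from the identified environment and season could look like the following. The frequency values for the mountain, the amusement park, and cicada calls are those given above; the default band, the mapping of months to summer, the environment keys, and the function name `select_passband` are assumptions added for illustration.

```python
# Pass bands in Hz for the environments mentioned in the description.
PASSBANDS_HZ = {
    "mountain": (3000, 20000),
    "amusement_park": (100, 4000),
}
CICADA_BAND_HZ = (4000, 5000)
DEFAULT_BAND_HZ = (20, 20000)   # assumption: fall back to a wide band
SUMMER_MONTHS = {6, 7, 8}       # assumption: summer as in Japan


def select_passband(environment, month):
    """Return (low, high) in Hz for the variable filter to pass."""
    low, high = PASSBANDS_HZ.get(environment, DEFAULT_BAND_HZ)
    if month in SUMMER_MONTHS:
        # Widen the band, if needed, so that cicada calls are included.
        low = min(low, CICADA_BAND_HZ[0])
        high = max(high, CICADA_BAND_HZ[1])
    return low, high
```

In this sketch, an amusement park in July yields a band widened at the top to 5000 Hz, whereas the mountain band already contains the cicada band and is left unchanged.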

As a result, the user 180 can easily record desirable sound suited to the environment or the time at which images are captured with the imaging apparatus 100.

The imaging apparatus 100 of the present embodiment can easily provide the user 180 with captured images with which desirable sound is associated. The user 180 can therefore enjoy capturing images without having to think about which sound should be recorded for each captured image.

FIG. 8 shows an example of the correspondence between captured images and sound. The captured image storage unit 320 stores six images captured by the imaging apparatus 100 in the order of times t3, t4, t7, t9, t13, and t14. The sound storage unit 316 stores the sound recorded by the imaging apparatus 100 in association with time; the volume of this sound is shown by the volume waveform 402.

A threshold volume L412 is set in the threshold volume storage unit 322. The sound extraction unit 318 extracts, from the sound of the volume waveform 402, the sound that is louder than the threshold volume L412 stored in the threshold volume storage unit 322. Here, the threshold volume setting unit 324 sets the threshold volume L412 so that the total length of the periods in which the volume waveform 402 exceeds the threshold volume L412 (t1 to t2, t5 to t10, and t12 to t16) equals the total playback time of the six images captured by the imaging apparatus 100. The data storage unit 332 then divides the sound of the periods extracted by the sound extraction unit 318 into segments of one playback time each and stores the segments, in order, in association with the captured images in the order in which the imaging apparatus 100 captured them.

Specifically, the data storage unit 332 stores the image captured at time t3 in association with the sound recorded in the period (t1 to t2). The data storage unit 332 also stores the images captured at times t4, t7, and t9 in association with the sound of the periods (t5 to t6), (t6 to t8), and (t8 to t10), respectively, each one playback time long, taken from the sound recorded in the period (t5 to t10). Similarly, the data storage unit 332 associates the images captured at times t13 and t14 with the sound recorded in the periods (t12 to t15) and (t15 to t16), respectively. To simplify the explanation, each of the periods (t1 to t2), (t5 to t6), (t6 to t8), (t8 to t10), (t12 to t15), and (t15 to t16) is assumed to be equal in length to the preset playback time. As a result, as long as the ambient sound is recorded while images are captured with the imaging apparatus 100, the playback device 140 can easily associate the captured images with the sound and play them back together.

The threshold volume storage unit 322 may also store a band-specific threshold volume in association with each of a plurality of frequency bands. The sound extraction unit 318 may then compare, for each frequency band, the volume of the sound stored in the sound storage unit 316 with the band-specific threshold volume stored in the threshold volume storage unit 322, and extract the sound of a partial period that includes volume greater than the band-specific threshold volume in a particular frequency band.

In this case, it is possible to prevent, for example, the whispering of nearby people recorded by the imaging apparatus 100 from being associated with and played back together with a captured image in which the imaging apparatus 100 captured a bird as the subject. The playback device 140 can therefore play back sound from periods in which the sound in a desired frequency band is loud, in association with the captured image.

FIG. 9 shows an example of the correspondence between images to be played back and sound. The instruction receiving unit 312 receives an instruction to play back an image from the user 180. For example, when the instruction receiving unit 312 receives an instruction to play back the image captured at time t83, the sound extraction unit 318 extracts part of the sound of the volume waveform 838 by extracting the sound of the period that includes sound louder than the threshold volume L824, within the time range from time t83 to time t84, which is later than time t83 by the allowable time Δt803 stored in the allowable time storage unit 364.

The allowable time control unit 362 sets the allowable time stored in the allowable time storage unit 364 to a longer value as the difference between the capture time of the captured image stored in the captured image storage unit 320 and the time at which the playback instruction is received becomes larger. For example, when the captured image captured at time t81, which is earlier than the capture time t83, is to be played back, the allowable time control unit 362 stores in the allowable time storage unit 364 an allowable time Δt802 that is longer than the allowable time Δt803. The sound extraction unit 318 then extracts part of the sound of the volume waveform 834 by extracting the sound of the period that includes sound louder than the threshold volume L834, within the time range from time t81 to time t82, which is later than time t81 by the allowable time Δt802.

For example, when playing back an image captured one week ago, the playback device 140 plays back sound extracted from the sound recorded on the day the image was captured. Suppose instead that images and sound of an elementary school entrance ceremony, sports day, graduation ceremony, and so on from 20 years ago have been recorded. When the playback device 140 plays back an image of the sports day, it extracts the sound to play back from sound recorded within a range of, for example, six years before and after the day of the sports day. In this case, the playback device 140 plays back not only the sound from the sports day but also, for example, sound recorded at the entrance ceremony and the graduation ceremony. The user 180 can thus recall many memories of the entrance ceremony, the graduation ceremony, and other events while viewing the sports day from their elementary school years, and can enjoy viewing the images all the more.

Note that the threshold volume setting unit 324 may set the threshold volume stored in the threshold volume storage unit 322 so that, within the range given by the allowable time stored in the allowable time storage unit 364, the length of the sound to be played back matches a predetermined playback time of the captured image. For example, the threshold volume setting unit 324 sets the threshold volume stored in the threshold volume storage unit 322 to L822 so that, within the time range from time t81 to time t82, the period that includes sound louder than the stored threshold volume matches the predetermined playback time of the captured image.

When an instruction to play back the captured image captured at time t81 is received, the sound extraction unit 318 may instead extract part of the sound of the sound waveform 832 by extracting the sound of the range that includes sound louder than the threshold volume, within the time range from time t80, which is earlier than time t81 by the allowable time Δt802, to time t81. When playing back the captured image captured at time t81, the sound extraction unit 318 may also extract the sound of the range that includes sound louder than the threshold volume within the time range extending both before and after time t81 by the allowable time Δt802 (time t80 to time t82).
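The time-window variants described above (after the capture time, before it, or both) can be sketched as follows. This is an illustrative simplification: the recorded sound is modeled as a list of (time, volume) pairs with numeric times, and the function name `extract_in_window` and its `before` and `after` flags are hypothetical.

```python
def extract_in_window(samples, capture_time, allowable, threshold,
                      before=False, after=True):
    """Keep loud samples recorded within the allowable time of the capture.

    `samples` is a list of (time, volume) pairs. `before` and `after`
    choose whether the window extends before and/or after the capture time.
    """
    start = capture_time - allowable if before else capture_time
    end = capture_time + allowable if after else capture_time
    return [(t, v) for t, v in samples
            if start <= t <= end and v > threshold]


samples = [(0, 9), (1, 2), (2, 8), (3, 1), (4, 7)]
after_only = extract_in_window(samples, capture_time=2, allowable=2,
                               threshold=5)
before_only = extract_in_window(samples, capture_time=2, allowable=2,
                                threshold=5, before=True, after=False)
both_sides = extract_in_window(samples, capture_time=2, allowable=2,
                               threshold=5, before=True, after=True)
```

Passing a longer `allowable` value for older images reproduces the behavior in which a wider range of recorded sound becomes a candidate for playback.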

FIG. 10 shows an example of the block configuration of an imaging apparatus 900 according to a second embodiment. An example of the usage environment of the imaging apparatus 900 of the second embodiment is the same as the usage environment of the imaging apparatus 100 described with reference to FIG. 1 except for the following points, so its description is omitted. Namely, the imaging apparatus 900 according to the second embodiment adjusts its recording operation according to its usage state, such as an imaging state or a standby state. For example, while capturing an image of a subject, the imaging apparatus 900 records more characteristic sound from the subject.

The imaging apparatus 900 of the second embodiment includes an imaging unit 912, a recording unit 914, a sound storage unit 916, a mode setting unit 962, a recording volume setting unit 910, a recording control unit 922, a distance measurement unit 970, a sound collection direction control unit 964, and a sound collection unit 980. The recording volume setting unit 910 has a threshold volume setting unit 920. The sound collection unit 980 has a first sound collection unit 966 and a second sound collection unit 968. The operations and functions of the imaging apparatus 900 according to the second embodiment are the same as those of the imaging apparatus 100 according to the first embodiment except for the parts described below, so their description is omitted. For example, the operations and functions of the imaging unit 912 and the recording unit 914 may be the same as those of the imaging unit 212 and the recording unit 214. An imaging apparatus that combines the operations and functions of the imaging apparatus 100 of the first embodiment and the imaging apparatus 900 of the second embodiment may also constitute an invention.

The sound collection unit 980 collects the sound around the imaging unit 912 and causes the recording unit 914 to record it. The recording control unit 922 causes the recording unit 914 to record, from the sound around the imaging unit 912, sound that is louder than a preset set volume. Specifically, the recording control unit 922 causes the recording unit 914 to record, from the sound around the imaging unit 912, sound that is louder than a preset threshold volume. The sound storage unit 916 stores the sound recorded by the recording unit 914.

The mode setting unit 962 sets an operation mode that indicates the type of operating state of the imaging unit 912 and the recording unit 914. The recording volume setting unit 910 sets the set volume based on the operation mode set by the mode setting unit 962. Specifically, the recording volume setting unit 910 sets the set volume by changing the threshold volume of the sound to be recorded by the recording unit 914 based on the operation mode set by the mode setting unit 962. More specifically, the threshold volume setting unit 920 sets the threshold volume based on the operation mode set by the mode setting unit 962. Alternatively, the recording volume setting unit 910 sets the set volume by changing the sensitivity of the sound collection unit 980 based on the operation mode set by the mode setting unit 962.

Specifically, the mode setting unit 962 selectively sets either a standby mode, in which the imaging unit 912 is ready to accept a user operation, or an imaging mode, in which the imaging unit 912 is operating in response to a user operation. The recording volume setting unit 910 sets a first set volume as the volume to be recorded by the recording unit 914 when the mode setting unit 962 has set the standby mode, and sets a second set volume, lower than the first set volume, when the mode setting unit 962 has set the imaging mode. Specifically, the threshold volume setting unit 920 sets a first threshold volume when the mode setting unit 962 has set the standby mode, and sets a second threshold volume, lower than the first threshold volume, when the mode setting unit 962 has set the imaging mode. The recording volume setting unit 910 may also set a first sensitivity for sound collection by the sound collection unit 980 when the mode setting unit 962 has set the standby mode, and set a second sensitivity, higher than the first sensitivity, when the mode setting unit 962 has set the imaging mode.

The distance measurement unit 970 measures the distance between the imaging unit 912 and the subject. Specifically, the distance measurement unit 970 may be a ranging sensor that irradiates the subject with laser light, infrared light, or the like and measures the distance from the light reflected by the subject. Alternatively, the distance measurement unit 970 may detect the contrast of images captured by the imaging unit 912 and measure the distance between the imaging unit 912 and the subject based on the control value of the imaging unit 912 at the time the image with the highest contrast was captured.

The recording volume setting unit 910 sets the set volume based on the distance measured by the distance measurement unit 970. Specifically, the recording volume setting unit 910 sets a smaller set volume when the measured distance is larger. For example, the threshold volume setting unit 920 sets the threshold volume based on the measured distance: the larger the distance, the smaller the threshold volume. Note that the recording volume setting unit 910 may also set a higher sensitivity for the sound collection unit 980 when the measured distance is larger.
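As a rough sketch of the distance rule, the threshold can be modeled as any non-increasing function of the measured distance. The linear form, the function name `threshold_for_distance`, and every parameter value below are assumptions chosen for illustration; the text specifies only that a larger distance yields a smaller threshold.

```python
def threshold_for_distance(
    distance_m: float,
    near_threshold: float = 60.0,  # hypothetical threshold at zero distance
    drop_per_meter: float = 2.0,   # hypothetical slope
    floor: float = 30.0,           # hypothetical lower bound
) -> float:
    """Return a threshold volume that decreases as the subject gets farther away."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return max(floor, near_threshold - drop_per_meter * distance_m)
```

The lower bound is a design choice of this sketch: without it, a very distant subject would drive the threshold to zero and every sound would be recorded.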

The first sound collection unit 966 has sound collection directivity in substantially the same direction as the imaging direction of the imaging unit 912. The second sound collection unit 968 has wider sound collection directivity than the first sound collection unit 966.

When the mode setting unit 962 has set the imaging mode, the sound collection direction control unit 964 collects sound from substantially the same direction as the imaging direction of the imaging unit 912 and causes the recording unit 914 to record it. When the mode setting unit 962 has set the standby mode, the sound collection direction control unit 964 collects sound from a wider range of directions than in the imaging mode and causes the recording unit 914 to record it. Specifically, the sound collection direction control unit 964 causes the recording unit 914 to record the sound collected by the first sound collection unit 966 in the imaging mode, and the sound collected by the second sound collection unit 968 in the standby mode.

Alternatively, the imaging apparatus 900 may include a single sound collection unit whose sound collection directivity can be changed. In that case, when the mode setting unit 962 has set the imaging mode, the sound collection direction control unit 964 may control the directivity of that sound collection unit so that sound from substantially the same direction as the imaging direction of the imaging unit 912 is collected and recorded by the recording unit 914. When the mode setting unit 962 has set the standby mode, the sound collection direction control unit 964 may collect sound from a wider range of directions than in the imaging mode and cause the recording unit 914 to record it.

FIG. 11 shows an example of how the threshold volume changes over time for each operation mode. The imaging apparatus 900 has a standby mode, an imaging mode, and a playback mode as operation modes. The imaging mode may be, for example, an operation mode in which the imaging apparatus 900 can capture images and/or record sound. The playback mode may be, for example, an operation mode in which the imaging apparatus 900 can play back images and/or sound. Immediately after start-up, the imaging apparatus 900 is set to the standby mode.

In the example of FIG. 11, the imaging unit 912 and the recording unit 914 are set to the standby mode during the period from t1001 to t1002. While the imaging unit 912 and the recording unit 914 are in the standby mode, the threshold volume setting unit 920 sets the threshold volume to L1026. During the period in which the imaging unit 912 and the recording unit 914 are in the imaging mode (t1002 to t1005), the threshold volume setting unit 920 sets the threshold volume to L1022, which is lower than L1026. In this period, when sound with the volume waveform 1016 is input, the recording unit 914 records the sound in the period (t1003 to t1004) during which the input volume exceeds the threshold volume L1022.

During the period in which the imaging unit 912 and the recording unit 914 are in the playback mode (t1005 to t1008), the threshold volume setting unit 920 may set a threshold volume different from the threshold volume L1022 and the threshold volume L1026. For example, in the playback mode the threshold volume setting unit 920 sets the threshold volume to L1024, which is larger than L1022 and smaller than L1026. In this period, when sound with the volume waveform 1016 is input, the recording unit 914 records the sound in the period (t1006 to t1007) during which the input volume exceeds the threshold volume L1024.
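The behavior in FIG. 11 amounts to gating the input against a threshold that changes with the mode schedule. The following is a minimal sketch under assumed data shapes: `samples` is a time-ordered list of `(time, volume)` pairs and `schedule` is a list of `(start, end, threshold)` triples. The function name and both shapes are hypothetical and do not come from the patent.

```python
def recorded_intervals(samples, schedule):
    """Return (start, end) intervals in which volume exceeds the active threshold.

    samples:  time-ordered list of (time, volume) pairs.
    schedule: list of (start, end, threshold); a threshold applies for start <= t < end.
    """
    def threshold_at(t):
        for start, end, threshold in schedule:
            if start <= t < end:
                return threshold
        return float("inf")  # outside every scheduled period nothing is recorded

    intervals = []
    open_start = None
    last_time = None
    for t, volume in samples:
        loud = volume > threshold_at(t)
        if loud and open_start is None:
            open_start = t                      # recording starts
        elif not loud and open_start is not None:
            intervals.append((open_start, t))   # recording stops
            open_start = None
        last_time = t
    if open_start is not None:
        intervals.append((open_start, last_time))
    return intervals
```

With a standby threshold above the imaging and playback thresholds, the same loud sound is ignored in standby but recorded in the other two modes, which mirrors the periods t1003 to t1004 and t1006 to t1007 in the figure.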

Because a low threshold volume is set while the imaging apparatus 900 is in the imaging mode, the user 180 can easily record the sound present while an image is being captured. Because a high threshold volume is set while the imaging apparatus 900 is in the standby mode, sounds such as a car engine are kept from being recorded in the standby mode when, for example, the user 180 goes to the mountains to photograph birds.

The sound collection direction control unit 964 also controls the direction from which the sound recorded by the recording unit 914 is collected, according to the operation mode of the imaging unit 912 and the recording unit 914. Specifically, when the imaging unit 912 and the recording unit 914 are in the imaging mode, sound from substantially the same direction as the imaging direction of the imaging unit 912 is collected with the first sound collection unit 966 and recorded by the recording unit 914. When the imaging unit 912 and the recording unit 914 are in the standby mode, sound from a wider range of directions than in the imaging mode is collected with the second sound collection unit 968 and recorded by the recording unit 914.

Therefore, when the imaging apparatus 900 is in the imaging mode, sound from the direction of the subject being imaged can be recorded at a higher volume. When the imaging apparatus 900 is in the standby mode, sound from a wide range of directions is collected and recorded, so that, for example, while the user 180 is enjoying an amusement park without capturing images, the natural ambient sound around the imaging apparatus 900 can be recorded.

The threshold volume setting unit 920 also sets a smaller threshold volume as the distance between the imaging unit 912 and the subject, measured by the distance measurement unit 970, becomes larger. The recording unit 914 can therefore record sound from the direction of the subject more easily even when a distant subject is being imaged.

Note that when the imaging unit 912 and the recording unit 914 are in the playback mode, sound from substantially the same direction as the direction in which the image is displayed may be collected and recorded by the recording unit 914. Thus, for example, when the user 180 records narration about a captured image while viewing that image on the imaging apparatus 900, the narration of the user 180 can be recorded more appropriately.

Note that when the operation mode is the standby mode or the playback mode and the user 180 performs an operation related to imaging or recording, the imaging apparatus 900 transitions to the imaging mode. Operations related to imaging include, for example, an operation to capture an image and operations to adjust imaging conditions such as shutter speed and focal length. Operations related to recording include, for example, an operation to record sound and operations to adjust recording conditions such as recording sensitivity. When the operation mode is the standby mode or the imaging mode and the user 180 performs an operation related to playback on the imaging apparatus 900, the imaging apparatus 900 transitions to the playback mode. Operations related to playback include, for example, an operation to play back an image, an operation to select an image to be played back, and operations to adjust playback conditions such as playback speed. When the imaging apparatus 900 is in the imaging mode or the playback mode, it may transition to the standby mode on condition that the user has not operated the imaging apparatus 900 for a predetermined period.
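The transitions described above form a small state machine, sketched here under assumptions. The `Mode` and `Event` enumerations and the function `next_mode` are hypothetical names; the three event kinds simply group the operations listed in the text (imaging or recording operations, playback operations, and an idle timeout).

```python
from enum import Enum, auto


class Mode(Enum):
    STANDBY = auto()
    IMAGING = auto()
    PLAYBACK = auto()


class Event(Enum):
    IMAGING_OR_RECORDING_OPERATION = auto()  # e.g. shutter, focal length, recording sensitivity
    PLAYBACK_OPERATION = auto()              # e.g. play, select image, playback speed
    IDLE_TIMEOUT = auto()                    # no user operation for a predetermined period


def next_mode(mode: Mode, event: Event) -> Mode:
    """Return the operation mode after handling one event."""
    if event is Event.IMAGING_OR_RECORDING_OPERATION and mode in (Mode.STANDBY, Mode.PLAYBACK):
        return Mode.IMAGING
    if event is Event.PLAYBACK_OPERATION and mode in (Mode.STANDBY, Mode.IMAGING):
        return Mode.PLAYBACK
    if event is Event.IDLE_TIMEOUT and mode in (Mode.IMAGING, Mode.PLAYBACK):
        return Mode.STANDBY
    return mode  # all other combinations leave the mode unchanged
```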

FIG. 12 shows an example of the threshold volume set according to the imaging mode. The imaging apparatus 900 has a close-up mode, a mid-range imaging mode, and a distant-view imaging mode as imaging modes. In the example of this figure, when the imaging unit 912 and the recording unit 914 are set to the close-up mode during the period from t1002 to t1005, the threshold volume setting unit 920 sets a threshold volume L1222 that is smaller than the threshold volume L1022 set for the mid-range imaging mode. In this case, the threshold volume setting unit 920 may use, as the threshold volume L1222, the volume obtained by multiplying the threshold volume L1022 by a predetermined coefficient smaller than 1.

When the imaging unit 912 and the recording unit 914 are set to the distant-view imaging mode during the period from t1002 to t1005, the threshold volume setting unit 920 sets a threshold volume L1223 that is larger than the threshold volume L1022 set for the mid-range imaging mode. In this case, the threshold volume setting unit 920 may use, as the threshold volume L1223, the volume obtained by multiplying the threshold volume L1022 by a predetermined coefficient larger than 1. The threshold volume setting unit 920 may set the threshold volume L1223 to be smaller than the threshold volumes L1024 and L1026. Needless to say, in addition to imaging modes such as the close-up mode and the distant-view imaging mode, the threshold volume setting unit 920 may set the threshold volume according to various other imaging modes, such as a night scene mode and a daytime shooting mode.

The threshold volume setting unit 920 may store threshold volumes in association with imaging modes. In this case, the recording control unit 922 may cause the recording unit 914 to record sound whose volume is larger than the threshold volume that the threshold volume setting unit 920 stores in association with the current imaging mode of the imaging unit 912. As described above, the threshold volume setting unit 920 can set a desirable threshold volume for each imaging mode. Thus, for example, when the user photographs a small insect in the close-up mode, the faint sound of the insect is more easily recorded by the recording unit 914. When the user photographs a landscape, noisy ambient sound is less likely to be recorded.
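The coefficient scheme of FIG. 12 can be sketched as a table of multipliers applied to the mid-range threshold. The numeric levels and coefficients below are invented for illustration and are assumed to be in linear units, since multiplying by a coefficient only makes sense on such a scale. The constant names mirror the reference symbols in the figures, and `threshold_for_imaging_mode` is a hypothetical name.

```python
# Hypothetical linear volume levels standing in for the symbols in FIG. 11.
L1022 = 50.0  # imaging mode (mid-range) threshold
L1024 = 60.0  # playback mode threshold
L1026 = 70.0  # standby mode threshold

# Hypothetical coefficients: below 1 for close-up, above 1 for distant view.
IMAGING_MODE_COEFFICIENT = {
    "close_up": 0.8,
    "mid_range": 1.0,
    "distant_view": 1.1,
}


def threshold_for_imaging_mode(imaging_mode: str) -> float:
    """Scale the mid-range threshold by the coefficient stored for the imaging mode."""
    return L1022 * IMAGING_MODE_COEFFICIENT[imaging_mode]
```

The distant-view coefficient is chosen so that the result stays below L1024 and L1026, matching the optional constraint stated for L1223.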

FIG. 13 shows an example of the threshold volume set according to the imaging conditions. In the example of this figure, when imaging with a flash is set as the imaging condition of the imaging unit 912 for the period from t1002 to t1005, the threshold volume setting unit 920 sets a threshold volume L1322 that is smaller than the threshold volume L1022. In this case, the threshold volume setting unit 920 may use, as the threshold volume L1322, the volume obtained by multiplying the threshold volume L1022 by a predetermined coefficient smaller than 1.

When imaging with a large aperture value is set as the imaging condition of the imaging unit 912 for the period from t1002 to t1005, the threshold volume setting unit 920 sets a threshold volume L1323 that is larger than the threshold volume L1022. In this case, the threshold volume setting unit 920 may use, as the threshold volume L1323, the volume obtained by multiplying the threshold volume L1022 by a predetermined coefficient that is larger than 1 and depends on the aperture value. The threshold volume setting unit 920 may set the threshold volume L1323 to be smaller than the threshold volumes L1024 and L1026. Needless to say, the threshold volume setting unit 920 may set the threshold volume according to various imaging conditions other than the flash and the aperture value.

The threshold volume setting unit 920 may store threshold volumes in association with control values of the imaging conditions. In this case, the recording control unit 922 may cause the recording unit 914 to record sound whose volume is larger than the threshold volume that the threshold volume setting unit 920 stores in association with the control value of the imaging condition of the imaging unit 912. The threshold volume setting unit 920 may also store, in association with the control values of the imaging conditions, the coefficients by which the threshold volume L1022 is to be multiplied. As described above, the threshold volume setting unit 920 can set a desirable threshold volume for each imaging condition. For example, nights are often quieter than daytime. Images are often captured with a flash at night, and often captured with a larger aperture value in the daytime than at night. Therefore, by having the threshold volume setting unit 920 set a small threshold volume for flash photography, faint sounds around the imaging apparatus 900 on a quiet night can be more easily recorded by the recording unit 914. The threshold volume setting unit 920 may also set the threshold volume according to the time at which the imaging unit 912 captures an image. For example, when the imaging time is judged to be daytime, the threshold volume setting unit 920 may set a larger threshold volume than when the imaging time is judged to be night. In addition, the threshold volume setting unit 920 may judge the brightness around the imaging apparatus 900 and, when that brightness is higher than a predetermined brightness, set a larger threshold volume than when it is lower than the predetermined brightness.
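A possible way to picture the condition-dependent coefficients is a function that maps the flash setting and the aperture value to a multiplier. Everything specific here is assumed: the base level, the 0.8 flash coefficient, the linear aperture rule, and the decision to let the flash setting take precedence when both conditions apply. The text does not say how the two conditions combine.

```python
BASE_THRESHOLD = 50.0  # hypothetical linear level standing in for L1022


def condition_coefficient(flash: bool, f_number: float) -> float:
    """Return the multiplier for the base threshold under the given imaging conditions."""
    if flash:
        return 0.8  # hypothetical coefficient smaller than 1 for flash photography
    # Hypothetical rule: the coefficient grows with the aperture value above f/4.
    return 1.0 + 0.02 * max(0.0, f_number - 4.0)


def threshold_for_conditions(flash: bool, f_number: float) -> float:
    """Threshold volume for the current imaging conditions."""
    return BASE_THRESHOLD * condition_coefficient(flash, f_number)
```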

FIGS. 11 to 13 have been used to describe examples in which the sound to be recorded by the recording unit 914 is determined by setting the threshold volume according to the operation mode, imaging mode, imaging conditions, imaging environment, and the like. Besides this threshold-based method, the sound to be recorded by the recording unit 914 can also be determined by setting the sound collection sensitivity of the sound collection unit 980 according to the operation mode, imaging mode, imaging conditions, imaging environment, and the like. For example, in the cases in the description of FIGS. 11 to 13 where the threshold volume is set high, the sound collection sensitivity of the sound collection unit 980 is lowered, and in the cases where the threshold volume is set low, the sound collection sensitivity is raised. In this way, the recording unit 914 can be made to record sound suited to the operation mode, imaging mode, imaging conditions, imaging environment, and the like.
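The equivalence between lowering the threshold and raising the sensitivity can be illustrated by treating the sensitivity as a simple gain applied before a fixed threshold. That model, the linear volume scale, and the names `gain_for` and `is_recorded` are assumptions of this sketch rather than details given in the text.

```python
FIXED_THRESHOLD = 60.0  # hypothetical fixed threshold applied after the gain


def gain_for(desired_threshold: float) -> float:
    """Gain that makes the fixed threshold act like the desired threshold."""
    return FIXED_THRESHOLD / desired_threshold


def is_recorded(volume: float, gain: float) -> bool:
    """True when the amplified volume exceeds the fixed threshold."""
    return volume * gain > FIXED_THRESHOLD
```

Under this model, `volume * gain > FIXED_THRESHOLD` holds exactly when `volume > desired_threshold`, so a low desired threshold corresponds to a high gain and a high desired threshold to a low gain.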

FIG. 14 shows an example of the hardware configuration of a computer 1500 associated with the imaging apparatus 100 and the playback apparatus 140 of the first embodiment and the imaging apparatus 900 of the second embodiment. The computer 1500 includes a CPU peripheral section having a CPU 1505, a RAM 1520, a graphic controller 1575, and a display device 1580 that are interconnected by a host controller 1582; an input/output section having a communication interface 1530, a hard disk drive 1540, and a CD-ROM drive 1560 that are connected to the host controller 1582 by an input/output controller 1584; and a legacy input/output section having a ROM 1510, a flexible disk drive 1550, and an input/output chip 1570 that are connected to the input/output controller 1584.

The host controller 1582 connects the RAM 1520 to the CPU 1505 and the graphic controller 1575, which access the RAM 1520 at a high transfer rate. The CPU 1505 operates based on programs stored in the ROM 1510 and the RAM 1520 and controls each part. The graphic controller 1575 acquires image data that the CPU 1505 or the like generates in a frame buffer provided in the RAM 1520 and displays it on the display device 1580. Alternatively, the graphic controller 1575 may itself contain a frame buffer that stores the image data generated by the CPU 1505 or the like.

The input/output controller 1584 connects the host controller 1582 to the hard disk drive 1540, the communication interface 1530, and the CD-ROM drive 1560, which are relatively high-speed input/output devices. The hard disk drive 1540 stores programs and data used by the CPU 1505 in the computer 1500. The communication interface 1530 communicates with the imaging apparatus 100, the playback apparatus 140, or the imaging apparatus 900 via a network and provides programs and data to them. The CD-ROM drive 1560 reads programs or data from a CD-ROM 1595 and provides them to the hard disk drive 1540 and the communication interface 1530 via the RAM 1520.

The input/output controller 1584 is also connected to the ROM 1510 and to relatively low-speed input/output devices, namely the flexible disk drive 1550 and the input/output chip 1570. The ROM 1510 stores a boot program that the computer 1500 executes at start-up, programs that depend on the hardware of the computer 1500, and the like. The flexible disk drive 1550 reads programs or data from a flexible disk 1590 and provides them to the hard disk drive 1540 and the communication interface 1530 via the RAM 1520. The input/output chip 1570 connects the flexible disk drive 1550 and various other input/output devices via, for example, a parallel port, a serial port, a keyboard port, and a mouse port.

The program provided to the communication interface 1530 via the RAM 1520 is stored on a recording medium such as the flexible disk 1590, the CD-ROM 1595, or an IC card and is supplied by the user. The program is read from the recording medium, provided to the communication interface 1530 via the RAM 1520, and transmitted over the network to the imaging apparatus 100, the playback apparatus 140, or the imaging apparatus 900. The transmitted program is installed and executed on the imaging apparatus 100, the playback apparatus 140, or the imaging apparatus 900.

The program installed and executed on the imaging apparatus 100 causes it to function as the imaging apparatus 100 described with reference to FIGS. 1, 2, and 4 to 7. The program installed and executed on the playback apparatus 140 causes it to function as the playback apparatus 140 described with reference to FIGS. 1, 3, 8, and 9. The program installed and executed on the imaging apparatus 900 causes it to function as the imaging apparatus 900 described with reference to FIGS. 10 to 13.

The programs described above may be stored on an external storage medium. Besides the flexible disk 1590 and the CD-ROM 1595, usable storage media include optical recording media such as DVD and PD, magneto-optical recording media such as MD, tape media, and semiconductor memories such as IC cards. A storage device such as a hard disk or RAM provided in a server system connected to a dedicated communication network or the Internet may also be used as the recording medium, and the program may be provided to the computer 1500 via the network.

The present invention has been described above using embodiments, but the technical scope of the present invention is not limited to the scope described in those embodiments. Various modifications or improvements can be made to the above embodiments. It is apparent from the claims that forms incorporating such modifications or improvements can also fall within the technical scope of the present invention.

FIG. 1 is a diagram showing an example of the usage environment of the imaging apparatus 100 and the playback apparatus 140.
FIG. 2 is a diagram showing an example of the block configuration of the imaging apparatus 100.
FIG. 3 is a diagram showing an example of the block configuration of the playback apparatus 140.
FIG. 4 is a diagram showing an example of the correspondence between captured images and sound.
FIG. 5 is a diagram showing another example of the correspondence between captured images and sound.
FIG. 6 is a diagram showing an example of the threshold volume set for each frequency band.
FIG. 7 is a diagram showing an example of the frequency band of the sound passed by the variable filter unit 242.
FIG. 8 is a diagram showing an example of the correspondence between captured images and sound.
FIG. 9 is a diagram showing an example of the correspondence between played-back images and sound.
FIG. 10 is a diagram showing an example of the block configuration of the imaging apparatus 900.
FIG. 11 is a diagram showing an example of the change over time of the threshold volume for each operation mode.
FIG. 12 is a diagram showing an example of the change over time of the threshold volume for each imaging mode.
FIG. 13 is a diagram showing an example of the change over time of the threshold volume for each imaging condition.
FIG. 14 is a diagram showing an example of the hardware configuration of the computer 1500.

Explanation of Reference Numerals

100 Imaging apparatus
102 Microphone
140 Playback apparatus
150 Communication line
180 User
212 Imaging unit
214 Recording unit
216 Sound storage unit
218 Sound extraction unit
220 Threshold volume setting unit
222 Threshold volume storage unit
232 Data storage unit
234 Data output unit
242 Variable filter unit
244 Band control unit
246 Environment information storage unit
248 Position detection unit
250 Time detection unit
252 Environment identification unit
312 Instruction reception unit
316 Sound storage unit
318 Sound extraction unit
320 Captured image storage unit
322 Threshold volume storage unit
324 Threshold volume setting unit
332 Data storage unit
334 Data output unit
360 Time detection unit
362 Allowable time control unit
364 Allowable time storage unit
900 Imaging apparatus
910 Recording volume setting unit
912 Imaging unit
914 Recording unit
916 Sound storage unit
920 Threshold volume setting unit
922 Recording control unit
962 Mode setting unit
964 Sound collection direction control unit
966 First sound collection unit
968 Second sound collection unit
970 Distance measurement unit
980 Sound collection unit

Claims

An imaging device comprising:
an imaging unit that captures an image of a subject;
a recording unit that records audio around the imaging unit;
a threshold volume storage unit that stores a set threshold volume;
an audio extraction unit that extracts, from the audio recorded by the recording unit, audio of a partial period that includes audio having a volume greater than the threshold volume stored in the threshold volume storage unit;
a data storage unit that stores the captured image captured by the imaging unit in association with the audio extracted by the audio extraction unit; and
a data output unit that outputs, in synchronization with each other, the captured image and the audio stored in association with each other in the data storage unit.
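As an illustration only, and not as part of the claim, the extraction of a partial period containing audio louder than the threshold volume can be sketched as follows. The function name `extract_loud_periods`, the per-frame volume list, and the `margin` parameter (how many neighbouring frames are kept around each loud frame) are assumptions introduced for this sketch; the claim does not specify how the period boundaries are chosen.

```python
from typing import List, Tuple


def extract_loud_periods(
    volumes: List[float], threshold: float, margin: int = 1
) -> List[Tuple[int, int]]:
    """Return half-open [start, end) frame ranges that contain loud frames.

    A frame is "loud" when its volume is strictly greater than the threshold.
    Each loud frame is widened by `margin` frames on both sides (hypothetical
    choice), and ranges that overlap or touch are merged.
    """
    periods: List[Tuple[int, int]] = []
    for i, volume in enumerate(volumes):
        if volume <= threshold:
            continue
        start = max(0, i - margin)
        end = min(len(volumes), i + margin + 1)
        if periods and start <= periods[-1][1]:
            # Overlaps or touches the previous period: extend it.
            periods[-1] = (periods[-1][0], max(periods[-1][1], end))
        else:
            periods.append((start, end))
    return periods
```

With per-frame volumes `[0.1, 0.2, 0.9, 0.8, 0.1, 0.1, 0.1, 0.7, 0.1]`, a threshold of `0.5`, and a margin of one frame, the sketch yields two periods, one around each burst of loud audio.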

The imaging device according to claim 1, wherein the data storage unit stores each of a plurality of captured images captured by the imaging unit in association with each of a plurality of pieces of audio extracted by the audio extraction unit, in the order in which they were captured and recorded.
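A minimal sketch of this order-based association, under stated assumptions: images and extracted audio clips are each represented as hypothetical `(time, identifier)` tuples, and any surplus images or clips are simply left unpaired. The claim itself does not say how a surplus is handled.

```python
from typing import List, Tuple

Item = Tuple[float, str]  # (capture or recording time, identifier)


def associate_in_order(images: List[Item], clips: List[Item]) -> List[Tuple[str, str]]:
    """Pair the k-th captured image with the k-th recorded clip."""
    images_sorted = sorted(images, key=lambda item: item[0])
    clips_sorted = sorted(clips, key=lambda item: item[0])
    # zip stops at the shorter list, so surplus items stay unpaired.
    return [(img[1], clip[1]) for img, clip in zip(images_sorted, clips_sorted)]
```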

The imaging device according to claim 1, further comprising a threshold volume setting unit that sets the threshold volume stored in the threshold volume storage unit so that the total of the periods of the plurality of pieces of audio extracted by the audio extraction unit equals the period obtained by multiplying the number of captured images captured by the imaging unit by a predetermined playback time per captured image.
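One way such a threshold could be chosen is sketched below, assuming the recording is split into frames of equal length and that only frames strictly louder than the threshold are extracted. The function name, the frame representation, and the rank-based selection are assumptions for illustration; when several frames share the same volume, the resulting total may only approximate the target.

```python
import math
from typing import List


def threshold_for_total_duration(
    volumes: List[float],
    frame_seconds: float,
    n_images: int,
    playback_seconds: float,
) -> float:
    """Pick a threshold so that loud frames total n_images * playback_seconds."""
    target_frames = round(n_images * playback_seconds / frame_seconds)
    if target_frames <= 0:
        return math.inf  # no frame is louder than infinity: extract nothing
    if target_frames >= len(volumes):
        return -math.inf  # every frame is louder: extract everything
    ranked = sorted(volumes, reverse=True)
    # Frames strictly louder than the (target_frames + 1)-th loudest frame
    # number exactly target_frames when all volumes are distinct.
    return ranked[target_frames]
```

For example, with eight one-second frames, two captured images, and a playback time of two seconds per image, the target is four seconds of audio, so the threshold is set to the volume of the fifth-loudest frame.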

The imaging device according to claim 1, further comprising:
an audio storage unit that stores the audio recorded by the recording unit; and
a threshold volume setting unit that sets the threshold volume stored in the threshold volume storage unit based on the volume distribution of the audio stored in the audio storage unit.

The imaging device according to claim 4, wherein the threshold volume setting unit sets the threshold volume stored in the threshold volume storage unit to a larger value when the average volume of the audio stored in the audio storage unit is larger.
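The relationship between average volume and threshold can be illustrated with a simple proportional rule. The multiplier `factor` and the function name are hypothetical; the claim only requires that a larger average volume lead to a larger threshold, not any particular formula.

```python
from statistics import mean
from typing import List


def threshold_from_average(volumes: List[float], factor: float = 1.5) -> float:
    """Set the threshold to a fixed multiple of the average stored volume."""
    if not volumes:
        raise ValueError("no stored audio to analyse")
    # With a positive factor, a louder recording gives a higher threshold.
    return factor * mean(volumes)
```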

The imaging device according to claim 1, wherein
the threshold volume storage unit stores a band-specific threshold volume in association with each of a plurality of frequency bands, and
the audio extraction unit compares, for each frequency band, the volume of the audio recorded by the recording unit with the band-specific threshold volume stored in the threshold volume storage unit, and extracts audio of a partial period that includes a volume greater than the band-specific threshold volume in a specific frequency band.
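A rough sketch of the band-specific comparison, assuming the per-band volumes have already been computed for each frame by some upstream analysis that is not shown. The band names, the dictionary layout, and the function name are illustrative assumptions.

```python
from typing import Dict, List


def frames_exceeding_band_threshold(
    band_volumes: Dict[str, List[float]],
    band_thresholds: Dict[str, float],
    band: str,
) -> List[int]:
    """Return indices of frames whose volume in `band` exceeds that band's threshold."""
    threshold = band_thresholds[band]
    return [i for i, volume in enumerate(band_volumes[band]) if volume > threshold]
```

In this form a steady low-frequency noise can stay below its own high threshold while quieter audio in another band still triggers extraction.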

The imaging device according to claim 1, further comprising:
an environment identification unit that identifies the environment around the imaging device;
a variable filter unit that passes audio in a set frequency band; and
a band control unit that sets the frequency band of the audio passed by the variable filter unit according to the environment identified by the environment identification unit,
wherein the recording unit records the audio passed by the filter unit.

The imaging device according to claim 7, further comprising:
a position detection unit that detects the position of the imaging device; and
an environment information storage unit that stores information indicating an environment in association with information indicating a position,
wherein the environment identification unit searches the environment information storage unit based on the position detected by the position detection unit and identifies the environment around the imaging device.
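The position-based lookup and the resulting band setting might look like the following sketch. Everything concrete here is hypothetical: the rectangular regions, the environment labels, the passband values in hertz, and the fallback band used when no region matches are placeholders chosen for illustration, not values from the specification.

```python
from typing import List, Optional, Tuple

Region = Tuple[float, float, float, float]  # (lat_min, lat_max, lon_min, lon_max)

# Hypothetical environment information: position range -> environment label.
ENVIRONMENT_INFO: List[Tuple[Region, str]] = [
    ((35.0, 36.0, 139.0, 140.0), "city"),
    ((34.0, 35.0, 138.0, 139.0), "seaside"),
]

# Hypothetical passbands (low Hz, high Hz) chosen per environment.
PASSBAND_HZ = {"city": (300.0, 3400.0), "seaside": (500.0, 4000.0)}
DEFAULT_PASSBAND_HZ = (20.0, 20000.0)


def identify_environment(lat: float, lon: float) -> Optional[str]:
    """Search the stored environment information for the detected position."""
    for (lat_min, lat_max, lon_min, lon_max), environment in ENVIRONMENT_INFO:
        if lat_min <= lat < lat_max and lon_min <= lon < lon_max:
            return environment
    return None


def passband_for_position(lat: float, lon: float) -> Tuple[float, float]:
    """Choose the filter passband for the environment at the given position."""
    return PASSBAND_HZ.get(identify_environment(lat, lon), DEFAULT_PASSBAND_HZ)
```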

The imaging device according to claim 7, further comprising:
a time detection unit that detects a time; and
an environment information storage unit that stores information indicating an environment in association with information indicating a time,
wherein the environment identification unit searches the environment information storage unit based on the time detected by the time detection unit and identifies the environment around the imaging device.

An imaging method comprising:
a step of capturing an image of a subject using an imaging unit;
a recording step of recording audio around the imaging unit;
a threshold volume storage step of storing a set threshold volume;
an audio extraction step of extracting, from the audio recorded in the recording step, audio of a partial period that includes audio having a volume greater than the threshold volume stored in the threshold volume storage step;
a data storage step of storing the captured image captured by the imaging unit in association with the audio extracted in the audio extraction step; and
a data output step of outputting, in synchronization with each other, the captured image and the audio stored in association with each other in the data storage step.

A program for an imaging device that captures images, the program causing the imaging device to function as:
an imaging unit that captures an image of a subject;
a recording unit that records audio around the imaging unit;
a threshold volume storage unit that stores a set threshold volume;
an audio extraction unit that extracts, from the audio recorded by the recording unit, audio of a partial period that includes audio having a volume greater than the threshold volume stored in the threshold volume storage unit;
a data storage unit that stores the captured image captured by the imaging unit in association with the audio extracted by the audio extraction unit; and
a data output unit that outputs, in synchronization with each other, the captured image and the audio stored in association with each other in the data storage unit.

A playback device comprising:
a captured image storage unit that stores a captured image captured by an imaging device;
an audio storage unit that stores audio recorded by the imaging device;
a threshold volume storage unit that stores a threshold volume;
an audio extraction unit that extracts, from the audio stored in the audio storage unit, audio of a partial period that includes audio having a volume greater than the threshold volume stored in the threshold volume storage unit;
a data storage unit that stores the captured image stored in the captured image storage unit in association with the audio extracted by the audio extraction unit; and
a data output unit that outputs, in synchronization with each other, the captured image and the audio stored in association with each other in the data storage unit.

The playback device according to claim 12, further comprising an allowable time storage unit that stores a set allowable time, wherein
the captured image storage unit stores the captured image in association with the time at which it was captured by the imaging device,
the audio storage unit stores the audio in association with the time at which it was recorded by the imaging device, and
the audio extraction unit extracts, from the audio recorded at a time within the allowable time stored in the allowable time storage unit from the time at which the captured image stored in the captured image storage unit was captured, audio of a partial period that includes audio having a volume greater than the threshold volume stored in the threshold volume storage unit.
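The combination of the allowable-time window and the threshold test can be sketched as below. Clips are represented as hypothetical `(recorded_time, volume, identifier)` tuples, and the window is assumed to extend both before and after the capture time; the claim does not state whether the window is one-sided or two-sided.

```python
from typing import List, Tuple

Clip = Tuple[float, float, str]  # (recorded time in seconds, volume, identifier)


def clips_for_image(
    clips: List[Clip],
    capture_time: float,
    allowable_seconds: float,
    threshold: float,
) -> List[str]:
    """Select loud clips recorded within the allowable time of the capture time."""
    selected = []
    for recorded_time, volume, clip_id in clips:
        within_window = abs(recorded_time - capture_time) <= allowable_seconds
        if within_window and volume > threshold:
            selected.append(clip_id)
    return selected
```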

The playback device according to claim 13, further comprising:
an instruction receiving unit that receives an instruction to play back the captured image stored in the captured image storage unit;
a time detection unit that detects the time at which the instruction receiving unit received the instruction; and
an allowable time control unit that sets the allowable time stored in the allowable time storage unit to a longer value as the difference between the time at which the captured image stored in the captured image storage unit was captured and the time detected by the time detection unit becomes larger.
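One possible form of this control is a linear rule in which the allowable time grows with the age of the image at playback. The base value, the growth rate per day, and the linear shape are assumptions made for this sketch; the claim only requires that a larger time difference give a longer allowable time.

```python
SECONDS_PER_DAY = 86400.0


def allowable_time(
    capture_time: float,
    playback_time: float,
    base_seconds: float = 10.0,
    growth_seconds_per_day: float = 1.0,
) -> float:
    """Return an allowable time that lengthens as the image gets older."""
    # Times are in seconds; a playback time before capture is treated as age 0.
    elapsed_days = max(0.0, playback_time - capture_time) / SECONDS_PER_DAY
    return base_seconds + growth_seconds_per_day * elapsed_days
```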

The playback device according to claim 12, further comprising a threshold volume setting unit that sets the threshold volume stored in the threshold volume storage unit based on the volume distribution of the audio stored in the audio storage unit.

The playback device according to claim 15, wherein the threshold volume setting unit sets the threshold volume stored in the threshold volume storage unit to a larger value when the average volume of the audio stored in the audio storage unit is larger.

A playback method comprising:
a captured image storage step of storing a captured image captured by an imaging device;
an audio storage step of storing audio recorded by the imaging device;
a threshold volume storage step of storing a threshold volume;
an audio extraction step of extracting, from the audio stored in the audio storage step, audio of a partial period that includes audio having a volume greater than the threshold volume stored in the threshold volume storage step;
a data storage step of storing the captured image stored in the captured image storage step in association with the audio extracted in the audio extraction step; and
a data output unit that outputs, in synchronization with each other, the captured image and the audio stored in association with each other in the data storage step.

A program for a playback device that plays back images, the program causing the playback device to function as:
a captured image storage unit that stores a captured image captured by an imaging device;
an audio storage unit that stores audio recorded by the imaging device;
a threshold volume storage unit that stores a threshold volume;
an audio extraction unit that extracts, from the audio stored in the audio storage unit, audio of a partial period that includes audio having a volume greater than the threshold volume stored in the threshold volume storage unit;
a data storage unit that stores the captured image stored in the captured image storage unit in association with the audio extracted by the audio extraction unit; and
a data output unit that outputs, in synchronization with each other, the captured image and the audio stored in association with each other in the data storage unit.