JP2008245254A

JP2008245254A - Audio processing apparatus

Info

Publication number: JP2008245254A
Application number: JP2008025832A
Authority: JP
Inventors: Shingo Ikeda; 信吾池田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2007-03-01
Filing date: 2008-02-06
Publication date: 2008-10-09
Anticipated expiration: 2028-02-06
Also published as: JP5020845B2

Abstract

PROBLEM TO BE SOLVED: To remove wind noise contained in a low frequency component in an input sound.

SOLUTION: A wind noise detector (309) detects a level L of wind noise from audio signals outputted from microphones (301 to 304). A wind noise removing unit (306) removes wind noise from the audio signals outputted from the microphones (301 to 304) in accordance with the wind noise level L. A sound field converter (307) converts the audio signals outputted from the wind noise removing unit (306) to 5.1 channel audio signals. A sound volume adjusting unit (310) adjusts a sound volume level of a low frequency channel (LF) in accordance with the wind noise level L. An automatic level controller (308) controls the levels of sound signals from the sound field converter (307) and those from the sound volume adjusting unit (310) in accordance with one of the sound signals from the sound field converter (307).

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to an audio processing apparatus, and more particularly to an audio processing apparatus that reduces noise in input audio.

A video camera records the surrounding sound with a stereo microphone while shooting a moving image of a subject.

As a method for removing the noise (wind noise) generated in an audio signal when wind strikes a microphone, a method of removing anti-phase low-frequency components from 2-channel audio data is known (see, for example, Patent Document 1). A technique is also known for changing the frequency range over which the anti-phase low-frequency components are removed in accordance with the wind noise level.

In DVD video, a 5.1ch surround audio signal, composed of a plurality of audio channels and an audio channel centered on the low-frequency range, is recorded as audio data. In recent years, video cameras that record such 5.1ch surround audio signals have also appeared. Specifically, audio signals obtained by a plurality of omnidirectional microphones are subjected to matrix calculation, converted into a 5.1ch audio signal, and recorded.
Patent Document 1: JP-A-4-270599

With the noise removal technique described in Patent Document 1, when multi-channel audio signals are converted to 5.1ch audio after wind noise removal, the low-frequency components of wind noise that could not be removed may be emphasized. If wind noise that could not be completely removed remains in the low-frequency channel after the 5.1ch conversion, harsh, unpleasant sound is recorded.

In view of the above problem, an object of the present invention is to provide an audio processing apparatus that removes noise contained in the low-frequency components of input audio.

An audio processing apparatus according to the present invention comprises: a plurality of audio input means; noise detection means for detecting the magnitude of noise contained in a low-frequency band of the plurality of audio signals output from the plurality of audio input means; noise removing means for removing the noise from the plurality of audio signals output from the plurality of audio input means, based on the output of the noise detection means; conversion means for converting the plurality of audio signals output from the noise removing means into audio data of a plurality of channels including a low-frequency channel and other channels; adjustment means for controlling the level of the audio data of the low-frequency channel in accordance with the magnitude of the noise detected by the noise detection means; and level control means for adjusting the levels of the audio data of the other channels output from the conversion means and of the audio data of the low-frequency channel output from the adjustment means.

An audio processing apparatus according to the present invention comprises: a plurality of audio input means; noise detection means for detecting the magnitude of noise contained in a low-frequency band of the plurality of audio signals output from the plurality of audio input means; noise removing means for removing the noise from the plurality of audio signals output from the plurality of audio input means, based on the output of the noise detection means; conversion means for generating, from the plurality of audio signals output from the noise removing means, audio data of a plurality of channels having mutually different directivities, the conversion means including a calculation unit that generates the audio data of the plurality of channels having mutually different directivities by performing calculations on the plurality of audio signals output from the noise removing means, a synthesis unit that extracts and combines the low-frequency components of the plurality of audio signals output from the noise removing means, and an adjustment unit that adjusts the level of the output signal of the synthesis unit in accordance with the magnitude of the noise detected by the noise detection means and outputs the result as audio data of a low-frequency channel; and level control means for adjusting the levels of the audio data of the plurality of channels and of the audio data of the low-frequency channel output from the conversion means.

According to the present invention, noise contained in the low-frequency components of an input audio signal, such as wind noise, can be removed.

FIG. 1 is a block diagram showing the schematic configuration of a video camera 100 in which an embodiment of the audio processing apparatus according to the present invention is mounted. FIG. 2 is an external perspective view of the video camera 100.

First, the external appearance will be described with reference to FIG. 2. The microphone unit 201 consists of four microphones, each of which converts surrounding sound into an electric signal. The photographing lens 202 forms an optical image of the subject on the image sensor. The display panel 203 displays captured images, reproduced images, and various other information. The display panel 203 is attached to the main body of the video camera 100 by a hinge mechanism so that it can be opened and closed. In this specification, the direction in which the photographing lens faces is referred to as the front of the video camera 100.

The basic configuration and operation of this embodiment will be described with reference to FIG. 1.

The operation during shooting will be described. When the power is turned on with the power switch of the operation unit 110, the video camera 100 enters the recording pause state. It also enters the recording pause state when the recording mode is selected with the mode dial.

In the recording pause state, the control unit 109 controls the imaging unit 101 to start capturing the subject image. The imaging unit 101 consists of the photographing lens 202, an image sensor that converts the optical image formed by the photographing lens 202 into an image signal, and a camera signal processing unit that converts the image signal output from the image sensor into a predetermined video signal format. The moving image signal from the imaging unit 101 is sent to the display control unit 104. The control unit 109 controls the display control unit 104 so that the image represented by the moving image signal obtained by the imaging unit 101 is displayed on the display unit 105. The display unit 105 consists of the display panel 203 and an electronic viewfinder on the back of the video camera 100. The display control unit 104 can control which of these display means shows the image.

The audio input unit 102 consists of the microphone unit 201 and an audio processing unit that generates 5.1-channel audio data from the audio signals output by the microphone unit 201. In the recording pause state, the audio processing unit of the audio input unit 102 is idle. Details of the audio input unit 102 will be described later.

When the user operates the recording trigger switch of the operation unit 110 in this recording pause state, the control unit 109 controls each unit to start recording the captured images and audio. That is, in response to a recording instruction signal from the control unit 109, the moving image data output from the imaging unit 101 and the audio data output from the audio input unit 102 are first written to the memory 103.

The encoding processing unit 106 reads out the moving image data and audio data stored in the memory 103, compresses and encodes them according to a known scheme such as MPEG, and outputs the encoded moving image data and audio data to the recording/reproducing unit 107. The recording/reproducing unit 107 multiplexes the encoded moving image data and audio data according to the recording format and records the result on the recording medium 108. When the user gives an instruction to stop recording, the recording/reproducing unit 107 stops recording data on the recording medium 108.

In this embodiment, the moving image data and audio data recorded between the start and the stop of recording are managed as one scene.

Next, the reproduction operation will be described. When the user selects the reproduction mode with the mode dial of the operation unit 110, the control unit 109 switches the video camera 100 to the reproduction mode. Using the operation unit 110, the user can select a scene to be reproduced from the plurality of scenes recorded on the recording medium 108 and instruct reproduction of the selected scene. In response to this instruction, the control unit 109 causes the recording/reproducing unit 107 to reproduce the encoded moving image and audio data of the selected scene from the recording medium 108. The recording/reproducing unit 107 sends the reproduced encoded data to the encoding processing unit 106. The encoding processing unit 106 decodes the encoded moving image data and the encoded audio data from the recording/reproducing unit 107, and stores the reproduced moving image data and reproduced audio data in the memory 103.

The display control unit 104 reads the reproduced moving image data from the memory 103 and displays the reproduced image on the display unit 105.

The audio output unit 111 reads the reproduced audio data from the memory 103 and outputs it to the speaker 112. In this embodiment, the speaker 112 is a speaker for two-channel (left and right) stereo audio. It therefore cannot output the 5.1-channel audio data, described later, as it is. For this reason, the audio output unit 111 converts the reproduced 5.1-channel audio data into 2-channel audio data and outputs the result to the speaker 112.

The reproduced moving image data and reproduced audio data in the memory 103 can also be read out sequentially and output from the output unit 113 to an external device. The output unit 113 consists of, for example, a USB or IEEE 1394 digital interface.

FIG. 3 is a block diagram showing the schematic configuration of the audio input unit 102. The microphone unit 201 consists of four closely spaced omnidirectional microphones 301 to 304 serving as audio input means. FIG. 4 shows the arrangement of the microphones 301 to 304 as viewed from above the video camera 100. Relative to one another, the microphone 301 is located on the front side of the video camera 100, the microphone 304 on the rear side, the microphone 302 on the right side, and the microphone 303 on the left side.

The AD converter 305 includes A/D converters 305A to 305D corresponding to the microphones 301 to 304. The A/D converters 305A to 305D convert the analog audio outputs of the microphones 301 to 304, respectively, into digital signals. Each of the A/D converters 305A to 305D has an amplifier at its input stage. The audio data D1 to D4 output from the A/D converters 305A to 305D are input to the wind noise removing unit 306 and the wind noise detection unit 309.

The wind noise detection unit 309 detects the wind noise in the digital audio signals D1 to D4 output from the A/D converters 305A to 305D, and outputs a signal L indicating the level (magnitude) of the detected wind noise. The wind noise removing unit 306 removes wind noise from the digital audio signals D1 to D4 output from the A/D converters 305A to 305D in accordance with the wind noise level signal L from the wind noise detection unit 309. The wind noise removing unit 306 outputs the digital audio signals D11 to D41, from which the wind noise has been removed, to the sound field conversion unit 307. The detailed operations of the wind noise detection unit 309 and the wind noise removing unit 306 will be described later.

The sound field conversion unit 307 performs arithmetic processing, by a known method, on the 4-channel digital audio signals D11 to D41 from the wind noise removing unit 306 to generate a 5.1-channel digital audio signal. The 5.1-channel audio signal consists of a front right channel (R), a front left channel (L), a front center channel (C), a rear right channel (RS), a rear left channel (LS), and a low-frequency channel (LF).

Specifically, the sound field conversion unit 307 generates the center channel (C) audio signal from the audio signals D11 and D41, the front left channel (L) audio signal from the audio signals D11 and D21, the front right channel (R) audio signal from the audio signals D11 and D31, the rear left channel (LS) audio signal from the audio signals D21 and D41, and the rear right channel (RS) audio signal from the audio signals D31 and D41. It generates the low-frequency channel (LF) audio signal using the low-frequency band components of the audio signals D11 to D41.

The 5.1-channel audio may conform to a specification such as Dolby Surround (trademark), but this embodiment is not limited to that scheme.

The audio data of the channels other than the low-frequency channel LF is output to the automatic level control (ALC) unit 308. The audio signal of the low-frequency channel LF, as the low-frequency component of the input audio signal, is supplied to the ALC unit 308 via the volume adjustment unit 310.

The volume adjustment unit 310 adjusts the audio level (volume) of the low-frequency channel LF in accordance with the wind noise level signal L from the wind noise detection unit 309, and outputs the result to the ALC unit 308. The detailed operation of the volume adjustment unit 310 will be described later.

The ALC unit 308 adjusts the levels of the audio signals of the channels C, L, R, LS, and RS from the sound field conversion unit 307 and of the audio signal of the low-frequency channel LF from the volume adjustment unit 310 so that the overall level is constant. The audio data whose level has been adjusted by the ALC unit 308 is stored in the memory 103.

Specifically, the ALC unit 308 determines a level adjustment amount such that the level of whichever channel has the highest level, among the audio signals of the channels from the sound field conversion unit 307, becomes a predetermined level. It then adjusts the levels of the audio signals of all channels by that same amount. In 5.1ch audio the balance between the channels is important, and the audio level of each channel must be adjusted so that this balance is optimal. In this embodiment, because the ALC unit 308 adjusts the audio levels of all channels uniformly, the level can be adjusted while this balance is preserved.
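The common-gain behaviour of the ALC unit 308 can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how "level" is measured, so peak absolute amplitude is used here, and the function names and the `target_level` parameter are hypothetical.

```python
def alc_common_gain(main_channels, target_level):
    """Return one gain that brings the loudest main channel to target_level.

    main_channels: dict mapping a channel name (C, L, R, LS, RS) to a list
    of samples from the sound field conversion unit.
    Assumption: "level" is taken to be the peak absolute sample value.
    """
    loudest = max(
        max((abs(s) for s in samples), default=0.0)
        for samples in main_channels.values()
    )
    if loudest == 0.0:
        return 1.0  # silence: leave the signal unchanged
    return target_level / loudest


def apply_alc(main_channels, lf_channel, target_level):
    """Scale every channel, including LF, by the same gain."""
    gain = alc_common_gain(main_channels, target_level)
    scaled_main = {
        name: [gain * s for s in samples]
        for name, samples in main_channels.items()
    }
    scaled_lf = [gain * s for s in lf_channel]
    return scaled_main, scaled_lf, gain
```

Because a single gain multiplies every channel, the ratios between channel levels are unchanged, which is the balance-preserving property the paragraph above describes.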

FIG. 6 is a block diagram showing the schematic configuration of the wind noise detection unit 309. The wind noise detection unit 309 has one system that detects a wind noise level L1 using the audio signals D1 and D2, and another that detects a wind noise level L2 using the audio signals D3 and D4. The wind noise detection unit 309 then compares the wind noise levels L1 and L2 of the two systems and finally calculates an average wind noise level L. Although the audio signals D1 and D2 form one pair and the audio signals D3 and D4 form the other, this pairing is only for convenience and does not depend on the positions of the microphones 301 to 304.

For ordinary sound, the low-frequency range has low directivity, so closely spaced microphones pick up signals of the same phase. The low-frequency sound generated by wind striking a microphone, however, is uncorrelated between microphones and does not have the same phase. The wind noise detection unit 309 uses this property to detect the wind noise level L.

The adder 501 adds the audio signals D1 and D2 and outputs the sum signal D1 + D2. The subtracter 502 subtracts the audio signal D2 from the audio signal D1 and outputs the difference signal D1 - D2. The absolute value conversion unit 503 converts the output signal D1 + D2 of the adder 501 into its absolute value and outputs the absolute value signal |D1 + D2| to an LPF (low-pass filter) 505. The absolute value conversion unit 504 converts the output signal D1 - D2 of the subtracter 502 into its absolute value and outputs the absolute value signal |D1 - D2| to an LPF 506, which has substantially the same transfer characteristic as the LPF 505. The LPFs 505 and 506 are digital filters that remove the high-frequency components of their input signals. Within the low-frequency range of the audio signal, the output signal of the LPF 505 is approximately equal to its input |D1 + D2|, and the output signal of the LPF 506 is approximately equal to its input |D1 - D2|.

The subtracter 507 subtracts the output signal |D1 - D2| of the LPF 506 from the output signal |D1 + D2| of the LPF 505. The output of the subtracter 507 therefore corresponds approximately to |D1 + D2| - |D1 - D2|. The envelope detection unit 508 detects the envelope of the output signal of the subtracter 507 and outputs the detected envelope level L1.

When the audio signals input from the microphones 301 to 304 are sound from the subject rather than wind noise, the low-frequency components of the audio signals D1 to D4 are in phase. In this case the output signal of the LPF 505 is |D1 + D2| ≈ |2 × D1| ≈ |2 × D2|, and the output signal of the LPF 506 is |D1 - D2| ≈ 0. As a result, the output of the subtracter 507 is |2 × D1| or |2 × D2|.

In the case of wind noise, on the other hand, the low-frequency components of the audio signals D1 to D4 are uncorrelated. The output |D1 - D2| of the LPF 506 is therefore larger than the output |D1 + D2| of the LPF 505. In particular, when the wind noise components contained in the audio signals D1 and D2 differ in phase by 180°, the output of the LPF 505 is |D1 + D2| ≈ 0 and the output of the LPF 506 is |D1 - D2| ≈ |2 × D1| ≈ |2 × D2|.

As a result, when wind noise is present, the output of the subtracter 507 is negative. In particular, when the wind noise components contained in the audio signals D1 and D2 differ in phase by 180°, the output of the subtracter 507 is -|2 × D1| or -|2 × D2|.
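The two limiting cases above can be summarized in one place. The symbol S for the output of the subtracter 507 is introduced here only for illustration and does not appear in the original text.

```latex
\begin{aligned}
S &= \mathrm{LPF}\bigl(|D_1 + D_2|\bigr) - \mathrm{LPF}\bigl(|D_1 - D_2|\bigr) \\
D_1 \approx D_2 \ (\text{in phase}) &\;\Rightarrow\; S \approx |2 D_1| - 0 = |2 D_1| \ge 0 \\
D_1 \approx -D_2 \ (\text{phase difference of } 180^\circ) &\;\Rightarrow\; S \approx 0 - |2 D_1| = -|2 D_1| \le 0
\end{aligned}
```

The sign of S therefore separates ordinary in-phase sound from uncorrelated wind noise, and its magnitude when negative tracks the wind noise level.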

Thus, when the output value of the subtracter 507 is negative, the output signal of the subtracter 507, that is, the low-frequency component of the difference between the audio signals D1 and D2, reflects the level of the wind noise.

When the output value of the subtracter 507 is negative, the envelope detection unit 508 outputs the envelope level L1 of the output signal of the subtracter 507. When the output value of the subtracter 507 is positive, it outputs the value 0 as the output L1.
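The detection chain for one microphone pair (adder 501 through envelope detection unit 508) can be sketched as below. This is a minimal illustration, not the implementation in the patent: the one-pole low-pass filter, the peak-hold envelope follower, and the parameters `alpha` and `release` are all assumptions, since the text specifies neither the filter design nor the envelope detector.

```python
import math


def one_pole_lowpass(samples, alpha):
    """Assumed LPF: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def wind_level_for_pair(d1, d2, alpha=0.1, release=0.99):
    """Return the per-sample wind noise level for one microphone pair."""
    sum_abs = [abs(a + b) for a, b in zip(d1, d2)]    # |D1 + D2|
    diff_abs = [abs(a - b) for a, b in zip(d1, d2)]   # |D1 - D2|
    lp_sum = one_pole_lowpass(sum_abs, alpha)
    lp_diff = one_pole_lowpass(diff_abs, alpha)

    levels, env = [], 0.0
    for s, d in zip(lp_sum, lp_diff):
        value = s - d  # subtracter output
        if value < 0.0:
            # Negative output: follow its magnitude (assumed peak hold with decay).
            env = max(-value, release * env)
        else:
            # Non-negative output: no wind noise detected, report 0.
            env = 0.0
        levels.append(env)
    return levels


# Hypothetical test signals: a 50 Hz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 50 * n / 8000) for n in range(400)]
in_phase = wind_level_for_pair(tone, tone)
anti_phase = wind_level_for_pair(tone, [-x for x in tone])
```

With identical inputs the level stays at 0, while inputs that are 180° apart produce a positive level, matching the two cases described above.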

The part that calculates the noise level L2 from the audio signals D3 and D4 operates in basically the same way as the part for the audio signals D1 and D2.

That is, the adder 509 adds the audio signals D3 and D4 and outputs the sum signal D3 + D4. The subtracter 510 subtracts the audio signal D4 from the audio signal D3 and outputs the difference signal D3 - D4. The absolute value conversion unit 511 converts the output signal D3 + D4 of the adder 509 into its absolute value and outputs the absolute value signal |D3 + D4| to an LPF 513. The absolute value conversion unit 512 converts the output signal D3 - D4 of the subtracter 510 into its absolute value and outputs the absolute value signal |D3 - D4| to an LPF 514, which has substantially the same transfer characteristic as the LPF 513. Within the low-frequency range of the audio signal, the output signal of the LPF 513 is approximately equal to its input |D3 + D4|, and the output signal of the LPF 514 is approximately equal to its input |D3 - D4|.

The subtracter 515 subtracts the output signal |D3 - D4| of the LPF 514 from the output signal |D3 + D4| of the LPF 513. The output of the subtracter 515 therefore corresponds approximately to |D3 + D4| - |D3 - D4|. The envelope detection unit 516 detects the envelope of the output signal of the subtracter 515 and outputs the detected envelope level L2.

When the audio signals input from the microphones 301 to 304 are sound from the subject rather than wind noise, the output signal of the LPF 513 is |D3 + D4| ≈ |2 × D3| ≈ |2 × D4|, and the output signal of the LPF 514 is |D3 - D4| ≈ 0. As a result, the output of the subtracter 515 is |2 × D3| or |2 × D4|.

In the case of wind noise, on the other hand, the low-frequency components of the audio signals D1 to D4 are uncorrelated, so the output |D3 - D4| of the LPF 514 is larger than the output |D3 + D4| of the LPF 513. In particular, when the wind noise components contained in the audio signals D3 and D4 differ in phase by 180°, the output of the LPF 513 is |D3 + D4| ≈ 0 and the output of the LPF 514 is |D3 - D4| ≈ |2 × D3| ≈ |2 × D4|.

As a result, when wind noise is present, the output of the subtracter 515 is negative. In particular, when the wind noise components contained in the audio signals D3 and D4 differ in phase by 180°, the output of the subtracter 515 is -|2 × D3| or -|2 × D4|.

Thus, when the output value of the subtracter 515 is negative, the output signal of the subtracter 515, that is, the low-frequency component of the difference between the audio signals D3 and D4, reflects the level of the wind noise.

Like the envelope detection unit 508, the envelope detection unit 516 outputs the envelope level L2 of the output signal of the subtracter 515 when the output value of the subtracter 515 is negative. When the output value of the subtracter 515 is positive, it outputs the value 0 as the output L2.

The determination unit 517 calculates the average of the level L1 from the envelope detection unit 508 and the level L2 from the envelope detection unit 516, and outputs it as the overall average level L. Instead of the average of the wind noise levels L1 and L2, the larger of the two may be output as the detection level L.
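The two options for the determination unit 517 can be written as a small function. The function name and the `mode` parameter are hypothetical and serve only to show the average and larger-value alternatives side by side.

```python
def combine_wind_levels(l1, l2, mode="average"):
    """Combine the two per-pair wind noise levels into one level L.

    mode="average" returns the mean of L1 and L2;
    mode="max" returns the larger of the two.
    """
    if mode == "average":
        return (l1 + l2) / 2.0
    if mode == "max":
        return max(l1, l2)
    raise ValueError("mode must be 'average' or 'max'")
```

Taking the larger value makes the downstream processing react when even one microphone pair is hit by wind, whereas the average responds more gently.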

FIG. 7 is a block diagram showing the schematic configuration of the wind noise removing unit 306. The operation of the wind noise removing unit 306 will be described with reference to FIG. 7.

The wind noise removing unit 306 has one processing system that removes the wind noise contained in the audio signals D1 and D2, and another that removes the wind noise contained in the audio signals D3 and D4. In this embodiment, each system reduces wind noise by removing the low-frequency component of the difference signal: the low-frequency component of D1 - D2 is removed from the audio signals D1 and D2, and the low-frequency component of D3 - D4 is removed from the audio signals D3 and D4. The higher the wind noise level L, the higher the low-frequency cutoff frequency applied to the difference signals D1 - D2 and D3 - D4 is set.

加算器６０１は、音声信号Ｄ１と音声信号Ｄ２を加算し、和信号Ｄ１＋Ｄ２を出力する。減算器６０２は、音声信号Ｄ１から音声信号Ｄ２を減算し、差信号Ｄ１−Ｄ２を出力する。ＨＰＦ（ハイパスフィルタ）６０３は、差信号Ｄ１−Ｄ２の遮断周波数以下の低域成分を除去し、残る高域成分を通過する。遮断周波数制御部６１５は、風雑音検出部３０９からの風雑音レベルＬに応じて、ＨＰＦ６０３の遮断周波数を切り替える。 The adder 601 adds the audio signal D1 and the audio signal D2, and outputs a sum signal D1 + D2. The subtracter 602 subtracts the audio signal D2 from the audio signal D1, and outputs a difference signal D1-D2. An HPF (high pass filter) 603 removes a low frequency component equal to or lower than the cutoff frequency of the difference signal D1-D2, and passes the remaining high frequency component. The cutoff frequency control unit 615 switches the cutoff frequency of the HPF 603 according to the wind noise level L from the wind noise detection unit 309.

図８は、ＨＰＦ６０３の３通りの周波数特性例を示す。図８（Ａ）は、風雑音レベルＬが第１の閾値よりも小さい場合の周波数特性を示す。図８（Ｂ）は、風雑音レベルＬが第１の閾値以上で、第１の閾値より大きい第２の閾値よりも小さい場合の周波数特性を示す。図８（Ｃ）は、風雑音レベルＬが第２の閾値以上の場合の周波数特性を示す。図８（Ａ）〜図８（Ｃ）で、横軸は周波数を示し、縦軸は、振幅（又は透過率）を示す。 FIG. 8 shows three examples of frequency characteristics of the HPF 603. FIG. 8A shows the frequency characteristic when the wind noise level L is smaller than a first threshold. FIG. 8B shows the frequency characteristic when the wind noise level L is equal to or greater than the first threshold and smaller than a second threshold, which is greater than the first threshold. FIG. 8C shows the frequency characteristic when the wind noise level L is equal to or greater than the second threshold. In FIGS. 8A to 8C, the horizontal axis represents frequency and the vertical axis represents amplitude (or transmittance).

風雑音レベルＬが第１の閾値よりも小さい場合、遮断周波数制御部６１５は、図８（Ａ）に示すように、ＨＰＦ６０３が低域から高域まで、全帯域において信号レベルを減衰させることなく出力するようＨＰＦ６０３を制御する。即ち、ＨＰＦ６０３は、いわばスルー状態になる。 When the wind noise level L is smaller than the first threshold, the cutoff frequency control unit 615 controls the HPF 603 so that, as shown in FIG. 8A, the HPF 603 outputs the signal without attenuating its level anywhere in the band, from low to high frequencies. In other words, the HPF 603 is effectively in a pass-through state.

風雑音レベルＬが第１の閾値以上で、且つ、第１の閾値よりも大きい第２の閾値よりも小さい場合、遮断周波数制御部６１５は、図８（Ｂ）に示すように、ＨＰＦ６０３の低域遮断周波数を周波数ｆ１に設定する。これにより、周波数ｆ１以下の帯域成分が減衰する。風雑音レベルＬが閾値２以上の場合、遮断周波数制御部６１５は、図８（Ｃ）に示すように、ＨＰＦ６０３の遮断周波数を、周波数ｆ１よりも高いｆ２に設定する。 When the wind noise level L is equal to or greater than the first threshold and smaller than the second threshold, which is greater than the first threshold, the cutoff frequency control unit 615 sets the low-frequency cutoff frequency of the HPF 603 to the frequency f1, as shown in FIG. 8B. As a result, band components at or below the frequency f1 are attenuated. When the wind noise level L is equal to or greater than the second threshold, the cutoff frequency control unit 615 sets the cutoff frequency of the HPF 603 to f2, which is higher than f1, as shown in FIG. 8C.
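The three-way switching described for the cutoff frequency control unit 615 amounts to a simple threshold comparison. The sketch below illustrates it under stated assumptions: the function name and the use of `None` to represent the pass-through state of FIG. 8A are choices made for this example, and the numeric thresholds and frequencies a caller passes in are placeholders, since the description gives no concrete values.

```python
from typing import Optional


def select_cutoff_hz(level: float,
                     threshold_1: float,
                     threshold_2: float,
                     f1_hz: float,
                     f2_hz: float) -> Optional[float]:
    """Choose the high-pass cutoff from the wind noise level L.

    Returns None for the pass-through state (FIG. 8A), f1_hz for the
    middle range (FIG. 8B), and f2_hz for the highest range (FIG. 8C).
    """
    if not (threshold_1 < threshold_2 and f1_hz < f2_hz):
        raise ValueError("expected threshold_1 < threshold_2 and f1_hz < f2_hz")
    if level < threshold_1:
        return None      # no attenuation over the whole band
    if level < threshold_2:
        return f1_hz     # attenuate components at or below f1
    return f2_hz         # attenuate components at or below f2 (> f1)
```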

２つの近接したマイクからの音声信号の差を算出することで、風雑音の成分を抽出できる。風雑音レベルが高いときには、風雑音が高い周波数にまで延びていると考えられる。風雑音レベルＬに応じて、ＨＰＦ６０３の遮断周波数を制御することで、差信号から風雑音成分を効果的に抑圧できる。 By calculating the difference between audio signals from two adjacent microphones, a wind noise component can be extracted. When the wind noise level is high, it is considered that the wind noise extends to a high frequency. By controlling the cutoff frequency of the HPF 603 according to the wind noise level L, the wind noise component can be effectively suppressed from the difference signal.

加算器６０４は、加算器６０１の出力にＨＰＦ６０３の出力を加算する。加算器６０４の出力は、ＨＰＦ６０３の影響部分を無視すると、およそ２Ｄ１（≒（Ｄ１＋Ｄ２）＋（Ｄ１−Ｄ２））となる。減算器６０５は、加算器６０１の出力からＨＰＦ６０３の出力を減算する。減算器６０５の出力は、ＨＰＦ６０３の影響部分を無視すると、およそ２Ｄ２（≒（Ｄ１＋Ｄ２）−（Ｄ１−Ｄ２））となる。 The adder 604 adds the output of the HPF 603 to the output of the adder 601. The output of the adder 604 is approximately 2D1 (≈ (D1 + D2) + (D1−D2)) when the affected part of the HPF 603 is ignored. The subtracter 605 subtracts the output of the HPF 603 from the output of the adder 601. The output of the subtractor 605 is approximately 2D2 (≈ (D1 + D2) − (D1−D2)) when the affected part of the HPF 603 is ignored.

先に説明したように、音声信号Ｄ１，Ｄ２に含まれる風雑音成分は相関がない。従って、風雑音を含む場合の差信号Ｄ１−Ｄ２（の低域成分）は、通常の音声の場合の差信号Ｄ１−Ｄ２に比べ大きくなる。これをＨＰＦ６０３で削除又は抑圧することで、風雑音を低減できる。 As described above, the wind noise components included in the audio signals D1 and D2 have no correlation. Therefore, the difference signal D1-D2 (the low frequency component thereof) in the case of including wind noise is larger than the difference signal D1-D2 in the case of normal speech. By removing or suppressing this with the HPF 603, wind noise can be reduced.

アンプ６０６は、加算器６０４の出力信号の音声レベルを例えば１／２に調整する。同様に、アンプ６０７は、減算器６０５の出力信号の音声レベルを例えば１／２に調整する。この結果、アンプ６０６は、風雑音が除去又は抑圧された音声信号Ｄ１１を出力し、アンプ６０７は、風雑音が除去又は抑圧された音声信号Ｄ２１を出力する。 The amplifier 606 adjusts the audio level of the output signal of the adder 604 to, for example, ½. Similarly, the amplifier 607 adjusts the audio level of the output signal of the subtractor 605 to, for example, ½. As a result, the amplifier 606 outputs the audio signal D11 from which wind noise has been removed or suppressed, and the amplifier 607 outputs the audio signal D21 from which wind noise has been removed or suppressed.
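The sum/difference path through the adder 601, subtractor 602, HPF 603, adder 604, subtractor 605 and amplifiers 606 and 607 can be sketched in a few lines. This is an illustration under assumptions, not the implementation in the patent: the first-order filter is a stand-in because the description does not specify the filter order or design, and the function names, the sampling-rate parameter `fs_hz` and the use of `None` for the pass-through state are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def high_pass(x: List[float], cutoff_hz: Optional[float], fs_hz: float) -> List[float]:
    """First-order high-pass filter; cutoff_hz=None means pass-through."""
    if cutoff_hz is None:
        return list(x)
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs_hz
    a = rc / (rc + dt)
    y: List[float] = []
    prev_x = 0.0
    prev_y = 0.0
    for sample in x:
        cur = a * (prev_y + sample - prev_x)
        y.append(cur)
        prev_x, prev_y = sample, cur
    return y


def remove_wind_noise(d1: List[float], d2: List[float],
                      cutoff_hz: Optional[float],
                      fs_hz: float) -> Tuple[List[float], List[float]]:
    """Return (D11, D21) for one microphone pair."""
    total = [a + b for a, b in zip(d1, d2)]          # adder 601: D1 + D2
    diff = [a - b for a, b in zip(d1, d2)]           # subtractor 602: D1 - D2
    diff_hp = high_pass(diff, cutoff_hz, fs_hz)      # HPF 603 on the difference
    d11 = [0.5 * (s + h) for s, h in zip(total, diff_hp)]  # adder 604, amp 606
    d21 = [0.5 * (s - h) for s, h in zip(total, diff_hp)]  # subtractor 605, amp 607
    return d11, d21
```

With the filter in the pass-through state the outputs equal the inputs, which matches the 2D1 and 2D2 approximations followed by the 1/2 gain. When the filter is active, a slowly varying component that differs between the two microphones is removed from both outputs, while the component common to both is kept.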

音声信号Ｄ３，Ｄ４の風雑音を除去する他方の系統も、同様に動作する。すなわち、加算器６０８は音声信号Ｄ３と音声信号Ｄ４を加算し、和信号Ｄ３＋Ｄ４を出力する。減算器６０９は音声信号Ｄ３から音声信号Ｄ４を減算し、差信号Ｄ３−Ｄ４を出力する。 The other system for removing wind noise from the audio signals D3 and D4 operates in the same manner. That is, the adder 608 adds the audio signal D3 and the audio signal D4, and outputs a sum signal D3 + D4. The subtractor 609 subtracts the audio signal D4 from the audio signal D3, and outputs a difference signal D3-D4.

ＨＰＦ６１０は、差信号Ｄ３−Ｄ４から低域遮断周波数以下の低域成分を減衰させ、残る高域成分を通過する。遮断周波数制御部６１５が、風雑音レベルＬに応じて、ＨＰＦ６１０の低域遮断周波数をＨＰＦ６０３と同じ低域遮断周波数に制御する。 The HPF 610 attenuates the low frequency component below the low frequency cutoff frequency from the difference signal D3-D4 and passes the remaining high frequency component. The cutoff frequency control unit 615 controls the low frequency cutoff frequency of the HPF 610 to the same low frequency cutoff frequency as that of the HPF 603 according to the wind noise level L.

加算器６１１は、加算器６０８の出力とＨＰＦ６１０の出力を加算する。加算器６１１の出力は、ＨＰＦ６１０の影響部分を無視すると、およそ２Ｄ３（≒（Ｄ３＋Ｄ４）＋（Ｄ３−Ｄ４））となる。減算器６１２は、加算器６０８の出力からＨＰＦ６１０の出力を減算する。減算器６１２の出力は、ＨＰＦ６１０の影響部分を無視すると、およそ２Ｄ４（≒（Ｄ３＋Ｄ４）−（Ｄ３−Ｄ４））となる。 The adder 611 adds the output of the adder 608 and the output of the HPF 610. The output of the adder 611 is approximately 2D3 (≈ (D3 + D4) + (D3−D4)) when the affected part of the HPF 610 is ignored. The subtractor 612 subtracts the output of the HPF 610 from the output of the adder 608. The output of the subtractor 612 is approximately 2D4 (≈ (D3 + D4) − (D3−D4)) when the affected part of the HPF 610 is ignored.

アンプ６１３は、加算器６１１の出力信号の音声レベルを例えば１／２に調整する。同様に、アンプ６１４は、減算器６１２の出力信号の音声レベルを例えば１／２に調整する。この結果、アンプ６１３は風雑音が除去された音声信号Ｄ３１を出力し、アンプ６１４は風雑音が除去された音声信号Ｄ４１を出力する。 The amplifier 613 adjusts the audio level of the output signal of the adder 611 to 1/2, for example. Similarly, the amplifier 614 adjusts the audio level of the output signal of the subtractor 612 to, for example, 1/2. As a result, the amplifier 613 outputs the audio signal D31 from which the wind noise has been removed, and the amplifier 614 outputs the audio signal D41 from which the wind noise has been removed.

前述の様に、風雑音が検出されない場合、つまり風雑音レベルＬが極めて低い場合には、ＨＰＦ６０３，６１０は低域から高域まで、全帯域の入力信号を減衰させずに出力する。また、風雑音レベルＬが大きくなると、ＨＰＦ６０３，６１０の遮断周波数が高くなり、より高い周波数成分までの低域成分が除去される。 As described above, when the wind noise is not detected, that is, when the wind noise level L is very low, the HPFs 603 and 610 output the input signals of the entire band from the low range to the high range without being attenuated. Further, when the wind noise level L increases, the cutoff frequency of the HPFs 603 and 610 increases, and low frequency components up to higher frequency components are removed.

本実施例では、遮断周波数制御部６１５が、ＨＰＦ６０３，６１０の周波数特性の例として図８（Ａ）〜（Ｃ）を例示したが、勿論、遮断周波数制御部６１５は、ＨＰＦ６０３，６１０の低域遮断周波数を連続的又は不連続に制御することができる。 In this embodiment, FIGS. 8A to 8C are given as examples of the frequency characteristics that the cutoff frequency control unit 615 sets for the HPFs 603 and 610. The cutoff frequency control unit 615 can, of course, control the low-frequency cutoff frequency of the HPFs 603 and 610 either continuously or in discrete steps.

図７に示す部構成では、図６に示す風雑音検出部３０９において風雑音検出のために用いた音声信号Ｄ１，Ｄ２のペア、及び音声信号Ｄ３，Ｄ４のペアと同じ組み合わせを用いている。この様に、風雑音検出と同じ組み合わせを用いることにより、各マイク３０１〜３０４の特性のばらつきの影響を抑えることができる。勿論、音声信号Ｄ１と同Ｄ３の差分を算出し、音声信号Ｄ１と同Ｄ４の差分を算出し、これらの差信号の低域成分を除去するように構成しても良い。 In the unit configuration shown in FIG. 7, the same combination as the pair of audio signals D1 and D2 and the pair of audio signals D3 and D4 used for wind noise detection in the wind noise detection unit 309 shown in FIG. 6 is used. In this way, by using the same combination as the wind noise detection, it is possible to suppress the influence of variations in the characteristics of the microphones 301 to 304. Of course, the difference between the audio signals D1 and D3 may be calculated, the difference between the audio signals D1 and D4 may be calculated, and the low frequency components of these difference signals may be removed.

音量調整部３１０は、風雑音レベルＬが高いほど、減衰量が大きくなるゲイン可変減衰器からなる。音量調整部３１０は、風雑音検出部３０９からの風雑音レベルＬに応じて低域チャンネルＬＦの振幅を調整してＡＬＣ部３０８に出力する。 The volume adjustment unit 310 is composed of a variable gain attenuator in which the amount of attenuation increases as the wind noise level L increases. The volume adjustment unit 310 adjusts the amplitude of the low frequency channel LF according to the wind noise level L from the wind noise detection unit 309 and outputs the adjusted signal to the ALC unit 308.

図５（Ａ）〜（Ｃ）は、音量調整部３１０の風雑音レベルＬに対するゲインの特性例を示す。横軸は風雑音レベルＬを示し、縦軸は音量調整部３１０のゲインを示す。図５（Ａ）では、風雑音レベルＬが０から所定値Ｌａの範囲ではゲインを一定とし、Ｌａ以上では、レベルＬが高くなるほど、ゲインを小さくする。図５（Ｂ）では、風雑音レベルＬが高くなるほど、単純にゲインを小さくする。図５（Ｃ）では、風雑音レベルＬが０から第１の閾値Ｌａまでの範囲では、ゲインを一定とし、ＬａからＬａより高いＬｂの範囲では、レベルＬが高くなるほどゲインを小さくし、Ｌｂ以上ではゲインを再び一定にする。 FIGS. 5A to 5C show examples of the gain characteristic of the volume adjustment unit 310 with respect to the wind noise level L. The horizontal axis indicates the wind noise level L, and the vertical axis indicates the gain of the volume adjustment unit 310. In FIG. 5A, the gain is constant while the wind noise level L is between 0 and a predetermined value La, and above La the gain decreases as the level L increases. In FIG. 5B, the gain simply decreases as the wind noise level L increases. In FIG. 5C, the gain is constant while the wind noise level L is between 0 and the first threshold La, decreases as the level L increases in the range from La to Lb (where Lb is higher than La), and is constant again at and above Lb.
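The three gain characteristics of FIGS. 5A to 5C can be written as small piecewise functions. The sketch assumes that the decreasing segments are linear and that the gain never drops below zero; the description only says the gain becomes smaller, so the slope, the `g_high` and `g_low` values and the function names are hypothetical.

```python
def gain_a(level: float, la: float, slope: float, g_high: float = 1.0) -> float:
    """FIG. 5A: constant up to La, then decreasing (assumed linear, floored at 0)."""
    if level <= la:
        return g_high
    return max(0.0, g_high - slope * (level - la))


def gain_b(level: float, slope: float, g_high: float = 1.0) -> float:
    """FIG. 5B: decreasing from level 0 onward (the FIG. 5A shape with La = 0)."""
    return gain_a(level, 0.0, slope, g_high)


def gain_c(level: float, la: float, lb: float,
           g_high: float = 1.0, g_low: float = 0.25) -> float:
    """FIG. 5C: constant up to La, decreasing between La and Lb, constant above Lb."""
    if level <= la:
        return g_high
    if level >= lb:
        return g_low
    t = (level - la) / (lb - la)
    return g_high + t * (g_low - g_high)
```

The FIG. 5C shape keeps a fixed minimum gain for the low-frequency channel under strong wind, whereas the FIG. 5A and 5B shapes keep reducing it as the wind noise level rises.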

音量調整部３１０は、図５（Ａ）〜（Ｃ）の何れかに示す特性に従って、低域チャンネルＬＦのレベルを調整する。この様に、低域チャンネルＬＦのレベルを風雑音レベルＬに応じて調整することで、ＡＬＣ部３０８は、低域チャンネルＬＦのレベルを他のチャンネルのレベルと同様に、一括して調整できる。ＡＬＣ部３０８の調整によっても、風雑音が強調されずに済む。 The volume adjustment unit 310 adjusts the level of the low-frequency channel LF according to one of the characteristics shown in FIGS. 5A to 5C. Because the level of the low-frequency channel LF is adjusted according to the wind noise level L in this way, the ALC unit 308 can adjust the level of the low-frequency channel LF together with the levels of the other channels in a single operation. The wind noise is not emphasized even by the adjustment in the ALC unit 308.

図９は、音声入力部１０２の別の構成例を示す。音場変換部３０７ａに、音場変換部３０７と音量調整部３１０の機能を装備してある。図３と同じ構成要素には、同じ符号を付してある。 FIG. 9 shows another configuration example of the voice input unit 102. The sound field conversion unit 307a is equipped with the functions of the sound field conversion unit 307 and the volume adjustment unit 310. The same components as those in FIG. 3 are denoted by the same reference numerals.

図９に示す構成では、風雑音検出部３０９からの風雑音レベルＬに従って、音場変換部３０７ａにおける低域チャンネルＬＦの生成処理を制御する点が、図３と異なる。 The configuration shown in FIG. 9 differs from FIG. 3 in that the low-frequency channel LF generation process in the sound field conversion unit 307a is controlled according to the wind noise level L from the wind noise detection unit 309.

図１０は、音場変換部３０７ａの概略構成ブロック図を示す。風雑音除去部３０６から出力される音声信号Ｄ１１〜Ｄ４１は、演算部９０１と低域チャンネル生成部９０２に入力される。また、風雑音検出部３０９から出力される風雑音レベル信号Ｌは、低域チャンネル生成部９０２に供給される。 FIG. 10 shows a schematic block diagram of the sound field converter 307a. The audio signals D11 to D41 output from the wind noise removal unit 306 are input to the calculation unit 901 and the low frequency channel generation unit 902. The wind noise level signal L output from the wind noise detection unit 309 is supplied to the low frequency channel generation unit 902.

演算部９０１は、入力音声信号Ｄ１１〜Ｄ４１から公知の演算によりチャンネルＣ，Ｌ，Ｒ，ＬＳ，ＲＳの音声データを生成する。一方、低域チャンネル生成部９０２は、入力音声信号Ｄ１１〜Ｄ４１からそれぞれ決められた帯域の音声データを抽出し、低域チャンネルＬＦの音声データを生成する。 The calculation unit 901 generates audio data for the channels C, L, R, LS, and RS from the input audio signals D11 to D41 by a known calculation. The low-frequency channel generation unit 902, on the other hand, extracts audio data in a predetermined band from each of the input audio signals D11 to D41 and generates the audio data of the low-frequency channel LF.

図１１は、低域チャンネル生成部９０２の概略構成ブロック図を示す。入力音声信号Ｄ１１〜Ｄ４１はそれぞれ、バンドパスフィルタ（ＢＰＦ）１００１〜１００４に供給される。各ＢＰＦ１００１〜１００４は、入力された音声信号Ｄ１１〜Ｄ４１のうち、所定の周波数帯域、例えば、１００ｋＨｚと２００ｋＨｚの間の成分を抽出して、合成部１００５に出力する。合成部１００５は、各ＢＰＦ１００１〜１００４の出力を合成し、レベル調整部１００６に出力する。 FIG. 11 shows a schematic block diagram of the low-frequency channel generation unit 902. Input audio signals D11 to D41 are supplied to band pass filters (BPF) 1001 to 1004, respectively. Each of the BPFs 1001 to 1004 extracts a predetermined frequency band, for example, a component between 100 kHz and 200 kHz, from the input audio signals D11 to D41, and outputs the extracted component to the synthesis unit 1005. The combining unit 1005 combines the outputs of the BPFs 1001 to 1004 and outputs the combined outputs to the level adjusting unit 1006.

レベル調整部１００６は、風雑音検出部３０９からの風雑音レベル信号Ｌに基づいて、合成部１００５から出力される低域チャンネルの音声データのレベルを調整する。具体的には、レベル調整部１００６は、例えば図５（Ａ）、（Ｂ）又は（Ｃ）に示すような特性で、風雑音レベルＬに応じて合成部１００５からの出力信号のレベルを調整する。レベル調整部１００６の出力信号が、低域チャンネルＬＦの音声信号となる。 Based on the wind noise level signal L from the wind noise detection unit 309, the level adjustment unit 1006 adjusts the level of the low-frequency channel audio data output from the synthesis unit 1005. Specifically, the level adjustment unit 1006 adjusts the level of the output signal from the synthesis unit 1005 according to the wind noise level L, using a characteristic such as that shown in FIG. 5A, 5B, or 5C. The output signal of the level adjustment unit 1006 becomes the audio signal of the low-frequency channel LF.

図９〜図１１に示す装置構成では、音場変換部３０７ａが、低域チャンネルＬＦのレベルを風雑音レベルＬに応じて調整する。これにより、図３に示す構成の場合と同様に、ＡＬＣ部３０８は、低域チャンネルＬＦのレベルを他のチャンネルのレベルと同様に、一括して調整できる。ＡＬＣ部３０８の調整によっても、風雑音が強調されずに済む。 In the apparatus configuration shown in FIGS. 9 to 11, the sound field conversion unit 307a adjusts the level of the low-frequency channel LF according to the wind noise level L. As with the configuration shown in FIG. 3, the ALC unit 308 can therefore adjust the level of the low-frequency channel LF together with the levels of the other channels in a single operation. The wind noise is not emphasized even by the adjustment in the ALC unit 308.

以上の説明では、４つのマイク３０１〜３０４で取り込んだ音声信号から５．１ｃｈの音声信号を生成したが、本発明は、５．１ｃｈに限らず、これ以上のチャンネル数の音声信号に変換する場合にも適用可能である。また、マイクの数も４つに限らず、これ以外の個数でもよい。 In the above description, 5.1-channel audio signals are generated from the audio signals captured by the four microphones 301 to 304. The present invention is not limited to 5.1 channels, however, and is also applicable to conversion into audio signals with a larger number of channels. Likewise, the number of microphones is not limited to four and may be any other number.

本発明に係る一実施例におけるビデオカメラの概略構成ブロック図である。 FIG. 1 is a schematic block diagram of a video camera according to an embodiment of the present invention.
本実施例のビデオカメラの外観斜視図である。 FIG. 2 is an external perspective view of the video camera of this embodiment.
音声入力部１０２の概略構成ブロック図である。 FIG. 3 is a schematic block diagram of the audio input unit 102.
マイクユニット２０１を構成する４つのマイクの配置図である。 FIG. 4 is a layout diagram of the four microphones constituting the microphone unit 201.
音量調整部３１０の特性例である。 FIG. 5 shows examples of the characteristics of the volume adjustment unit 310.
風雑音検出部の概略構成ブロック図である。 FIG. 6 is a schematic block diagram of the wind noise detection unit.
風雑音除去部の概略構成ブロック図である。 FIG. 7 is a schematic block diagram of the wind noise removing unit.
風雑音除去部における３つの周波数特性例を示す図である。 FIG. 8 shows three examples of frequency characteristics in the wind noise removing unit.
音声入力部の別の構成例の概略構成ブロック図である。 FIG. 9 is a schematic block diagram of another configuration example of the audio input unit.
図９に示す音場変換部の概略構成ブロック図である。 FIG. 10 is a schematic block diagram of the sound field conversion unit shown in FIG. 9.
図１０に示す低域チャンネル生成部の概略構成ブロック図である。 FIG. 11 is a schematic block diagram of the low-frequency channel generation unit shown in FIG. 10.

符号の説明 Explanation of symbols

１００：ビデオカメラ
１０１：撮像部
１０２：音声入力部
１０３：メモリ
１０４：表示制御部
１０５：表示部
１０６：符号化処理部
１０７：記録再生部
１０８：記録媒体
１０９：制御部
１１０：操作部
１１１：音声出力部
１１２：スピーカ
１１３：出力部
２０１：マイクユニット
２０２：撮影レンズ
２０３：表示パネル
３０１〜３０４：無指向性マイク
３０５：ＡＤコンバータ
３０５Ａ〜３０５Ｄ：Ａ／Ｄ変換器
３０６：風雑音除去部
３０７，３０７ａ：音場変換部
３０８：自動レベル制御（ＡＬＣ）部
３０９：風雑音検出部
３１０：音量調整部
５０１：加算器
５０２：減算器
５０３：絶対値変換部
５０４：絶対値変換部
５０５：ＬＰＦ
５０６：ＬＰＦ
５０７：減算器
５０８：エンベロープ検出部
５０９：加算器
５１０：減算器
５１１：絶対値変換部
５１２：絶対値変換部
５１３：ＬＰＦ
５１４：ＬＰＦ
５１５：減算器
５１６：エンベロープ検出部
５１７：判定部
６０１：加算器
６０２：減算器
６０３：ＨＰＦ（ハイパスフィルタ）
６０４：加算器
６０５：減算器
６０６：アンプ
６０７：アンプ
６０８：加算器
６０９：減算器
６１０：ＨＰＦ
６１１：加算器
６１２：減算器
６１３：アンプ
６１４：アンプ
６１５：遮断周波数制御部
９０１：演算部
９０２：低域チャンネル生成部
１００１〜１００４：バンドパスフィルタ（ＢＰＦ）
１００５：合成部
１００６：レベル調整部

100: video camera
101: imaging unit
102: audio input unit
103: memory
104: display control unit
105: display unit
106: encoding processing unit
107: recording/playback unit
108: recording medium
109: control unit
110: operation unit
111: audio output unit
112: speaker
113: output unit
201: microphone unit
202: photographing lens
203: display panel
301 to 304: omnidirectional microphones
305: AD converter
305A to 305D: A/D converters
306: wind noise removing unit
307, 307a: sound field conversion unit
308: automatic level control (ALC) unit
309: wind noise detection unit
310: volume adjustment unit
501: adder
502: subtractor
503: absolute value conversion unit
504: absolute value conversion unit
505: LPF
506: LPF
507: subtractor
508: envelope detection unit
509: adder
510: subtractor
511: absolute value conversion unit
512: absolute value conversion unit
513: LPF
514: LPF
515: subtractor
516: envelope detection unit
517: determination unit
601: adder
602: subtractor
603: HPF (high-pass filter)
604: adder
605: subtractor
606: amplifier
607: amplifier
608: adder
609: subtractor
610: HPF
611: adder
612: subtractor
613: amplifier
614: amplifier
615: cutoff frequency control unit
901: calculation unit
902: low-frequency channel generation unit
1001 to 1004: band-pass filters (BPF)
1005: synthesis unit
1006: level adjustment unit

Claims

複数の音声入力手段と、
前記複数の音声入力手段から出力された複数の音声信号の低周波数帯域に含まれる雑音の大きさを検出する雑音検出手段と、
前記雑音検出手段の出力に基づいて前記複数の音声入力手段から出力された複数の音声信号の前記雑音を除去する雑音除去手段と、
前記雑音除去手段から出力された前記複数の音声信号を、低周波数チャンネルとその他のチャンネルとを含む複数のチャンネルの音声データに変換する変換手段と、
前記雑音検出手段により検出された雑音の大きさに応じて、前記低周波数チャンネルの音声データのレベルを制御する調整手段と、
前記変換手段から出力された前記他のチャンネルの音声データと前記調整手段から出力された低周波数チャンネルの音声データのレベルを調整するレベル制御手段
とを備えることを特徴とする音声処理装置。
An audio processing apparatus comprising:
a plurality of audio input means;
noise detection means for detecting the magnitude of noise contained in a low frequency band of a plurality of audio signals output from the plurality of audio input means;
noise removing means for removing the noise from the plurality of audio signals output from the plurality of audio input means, based on the output of the noise detection means;
conversion means for converting the plurality of audio signals output from the noise removing means into audio data of a plurality of channels including a low frequency channel and other channels;
adjusting means for controlling the level of the audio data of the low frequency channel according to the magnitude of the noise detected by the noise detection means; and
level control means for adjusting the levels of the audio data of the other channels output from the conversion means and of the audio data of the low frequency channel output from the adjusting means.

前記調整手段は、前記雑音の大きさが大きいほど、前記低周波数チャンネルの音声データの減衰量を大きくすることを特徴とする請求項１に記載の音声処理装置。 The audio processing apparatus according to claim 1, wherein the adjustment unit increases the attenuation amount of audio data of the low frequency channel as the magnitude of the noise increases.

前記雑音除去手段は、前記複数の音声入力手段から出力された複数の音声信号の何れか２つの音声信号の差分が入力されるハイパスフィルタと、前記雑音検出手段により検出された雑音が大きいほど、前記ハイパスフィルタの遮断周波数を高くする遮断周波数制御装置とを有することを特徴とする請求項１に記載の音声処理装置。 The audio processing apparatus according to claim 1, wherein the noise removing means includes a high-pass filter that receives the difference between any two of the plurality of audio signals output from the plurality of audio input means, and a cutoff frequency control device that raises the cutoff frequency of the high-pass filter as the noise detected by the noise detection means becomes larger.

前記変換手段は、前記雑音除去手段から出力された前記複数の音声信号を互いに異なる指向性の複数のチャンネルの音声データに変換し、前記互いに異なる指向性の複数のチャンネルの音声データを前記他のチャンネルの音声データとして出力すると共に、前記雑音除去手段から出力された前記複数の音声信号の低周波数成分をそれぞれ抽出し、前記抽出した低周波数成分の音声信号を合成することにより前記低周波数チャンネルの音声データを生成することを特徴とする請求項１に記載の音声処理装置。 The audio processing apparatus according to claim 1, wherein the conversion means converts the plurality of audio signals output from the noise removing means into audio data of a plurality of channels having mutually different directivities and outputs that audio data as the audio data of the other channels, and also extracts a low frequency component from each of the plurality of audio signals output from the noise removing means and synthesizes the extracted low frequency components to generate the audio data of the low frequency channel.

前記雑音検出手段は、前記複数の音声入力手段のうちの何れか２つの音声入力手段からの音声信号の差分を用いて前記雑音の大きさを検出することを特徴とする請求項１に記載の音声処理装置。 The audio processing apparatus according to claim 1, wherein the noise detection means detects the magnitude of the noise using the difference between the audio signals from any two of the plurality of audio input means.

前記レベル制御手段は、前記他のチャンネルの音声データと前記低周波数チャンネルの音声データのうちの何れかの音声データのレベルに応じて、前記他のチャンネルの音声データと前記低周波数チャンネルの音声データのレベルを共通に制御することを特徴とする請求項１記載の音声処理装置。 The audio processing apparatus according to claim 1, wherein the level control means controls the levels of the audio data of the other channels and of the audio data of the low frequency channel in common, according to the level of any one of the audio data of the other channels and the audio data of the low frequency channel.

複数の音声入力手段と、
前記複数の音声入力手段から出力された複数の音声信号の低周波数帯域に含まれる雑音の大きさを検出する雑音検出手段と、
前記雑音検出手段の出力に基づいて前記複数の音声入力手段から出力された複数の音声信号の前記雑音を除去する雑音除去手段と、
前記雑音除去手段から出力された複数の音声信号から互いに異なる指向性の複数のチャンネルの音声データを生成する変換手段であって、前記雑音除去手段から出力された複数の音声信号を演算することにより互いに異なる指向性の複数のチャンネルの音声データを生成する演算部と、前記雑音除去手段から出力された複数の音声信号の低周波数成分を抽出して合成する合成部と、前記雑音検出手段により検出された雑音の大きさに応じて前記合成部の出力信号のレベルを調整して低周波数チャンネルの音声データとして出力する調整部とを有する変換手段と、
前記変換手段から出力された前記複数チャンネルの音声データと前記低周波数チャンネルの音声データのレベルを調整するレベル制御手段
とを備えることを特徴とする音声処理装置。
An audio processing apparatus comprising:
a plurality of audio input means;
noise detection means for detecting the magnitude of noise contained in a low frequency band of a plurality of audio signals output from the plurality of audio input means;
noise removing means for removing the noise from the plurality of audio signals output from the plurality of audio input means, based on the output of the noise detection means;
conversion means for generating audio data of a plurality of channels having mutually different directivities from the plurality of audio signals output from the noise removing means, the conversion means including a calculation unit that generates the audio data of the plurality of channels having mutually different directivities by performing calculations on the plurality of audio signals output from the noise removing means, a synthesis unit that extracts and synthesizes low frequency components of the plurality of audio signals output from the noise removing means, and an adjustment unit that adjusts the level of the output signal of the synthesis unit according to the magnitude of the noise detected by the noise detection means and outputs the result as audio data of a low frequency channel; and
level control means for adjusting the levels of the audio data of the plurality of channels and of the audio data of the low frequency channel output from the conversion means.

前記調整部は、前記雑音の大きさが大きいほど、前記低周波数成分のチャンネルの音声データの減衰量を大きくすることを特徴とする請求項７に記載の音声処理装置。 The audio processing apparatus according to claim 7, wherein the adjustment unit increases an attenuation amount of audio data of the channel of the low frequency component as the magnitude of the noise increases.

前記雑音除去手段は、前記複数の音声入力手段から出力された複数の音声信号の何れか２つの音声信号の差分が入力されるハイパスフィルタと、前記雑音検出手段により検出された雑音が大きいほど、前記ハイパスフィルタの遮断周波数を高くする遮断周波数制御装置とを有することを特徴とする請求項７に記載の音声処理装置。 The audio processing apparatus according to claim 7, wherein the noise removing means includes a high-pass filter that receives the difference between any two of the plurality of audio signals output from the plurality of audio input means, and a cutoff frequency control device that raises the cutoff frequency of the high-pass filter as the noise detected by the noise detection means becomes larger.

前記雑音検出手段は、前記複数の音声入力手段のうちの何れか２つの音声入力手段からの音声信号の差分を用いて前記雑音の大きさを検出することを特徴とする請求項７に記載の音声処理装置。 The audio processing apparatus according to claim 7, wherein the noise detection means detects the magnitude of the noise using the difference between the audio signals from any two of the plurality of audio input means.

前記レベル制御手段は、前記他のチャンネルの音声データと前記低周波数成分の音声データのうちの何れかの音声データのレベルに応じて、前記他のチャンネルの音声データと前記低周波数成分の音声データのレベルを共通に制御することを特徴とする請求項７に記載の音声処理装置。 The audio processing apparatus according to claim 7, wherein the level control means controls the levels of the audio data of the other channels and of the audio data of the low frequency component in common, according to the level of any one of the audio data of the other channels and the audio data of the low frequency component.

前記複数の音声入力手段は、それぞれ無指向性のマイクロフォンであることを特徴とする請求項１から１１の何れか１項に記載の音声処理装置。 The audio processing apparatus according to any one of claims 1 to 11, wherein each of the plurality of audio input means is an omnidirectional microphone.