JPH10308815A

JPH10308815A - Voice switch for talking equipment

Info

Publication number: JPH10308815A
Application number: JP11572597A
Authority: JP
Inventors: Yasushi Yamazaki; 泰山崎; Tomonori Sato; 知紀佐藤; Hitoshi Matsuzawa; 均松澤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1997-05-06
Filing date: 1997-05-06
Publication date: 1998-11-17
Anticipated expiration: 2017-05-06
Also published as: JP3466049B2

Abstract

PROBLEM TO BE SOLVED: To detect sound accurately even when the background noise level fluctuates, in a voice switch that detects sound/silence by comparing the power of the input voice with a threshold value. SOLUTION: A comparison unit 51 of a sound detection unit 5 compares the input power p_i with a threshold th and outputs a voice state s_i as the result. A background noise learning unit 52 sets the threshold th based on the input voice power and supplies it to the comparison unit 51; it also receives the voice state s_i from the comparison unit 51. When the voice state indicates silence (s_i = 0), the background noise learning unit 52 computes the threshold th and supplies it to the comparison unit 51. When the voice state indicates sound (s_i = 1), th keeps the value it had during the last silent period. Providing this learning unit for the threshold th used by the sound detection unit allows communication to continue even when background noise fluctuates because of environmental changes.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a voice switch used in hands-free telephones and similar devices. A hands-free telephone that uses the voice switch method must be able to make a sound determination that accurately extracts the sound portions from the background noise, even when the background noise level fluctuates.

[0002]

2. Description of the Related Art

Realizing a hands-free function requires raising the speaker volume and the microphone sensitivity. Doing so, however, produces an acoustic echo, as shown in FIG. 2: the received voice output from a voice output unit such as a speaker leaks back into a voice input unit such as a microphone. The other party hears their own voice like an echo, which makes the device very hard to use. There are two methods for removing this acoustic echo: (1) the echo canceller method and (2) the voice switch method.

[0003] The echo canceller method removes the acoustic echo using adaptive signal processing. For example, as shown in FIG. 3, a replica of the acoustic echo r, which arises when the output received voice leaks into the microphone, is generated inside the telephone and subtracted from the microphone input signal. The pseudo echo r' is generated by an FIR filter that models the transfer function from the speaker to the microphone. Because this transfer function changes with the surroundings of the telephone, the filter is adapted so that the error between the pseudo echo r' and the acoustic echo r is minimized.

[0004] The voice switch method, on the other hand, removes the acoustic echo by comparing the power of the speaker output voice with that of the microphone input voice and suppressing one of the two, as shown in FIG. 4. While the speaker is outputting, the signal picked up by the microphone is acoustic echo, so suppressing the microphone input signal during that time prevents the echo from being sent to the other party.

[0005] Thus there are two methods, the echo canceller and the voice switch, for removing the acoustic echo that is a problem in realizing a hands-free function. FIG. 5 compares their advantages and disadvantages; the choice is a trade-off between processing load and performance. When cost takes priority, the voice switch method is adopted. The present invention concerns this voice switch.

[0006] FIG. 6 shows the detailed conventional configuration of a hands-free telephone equipped with this voice switch. In FIG. 6, reference numeral 1 denotes a receiving unit, comprising a demodulator and the like, that receives the voice signal from the other party; 2 denotes a power suppression unit that can suppress the power of the received signal by changing a reception gain gain-r; and 3 denotes a voice output unit, comprising an amplifier, a speaker, and the like, that emits the received voice (R). Reference numeral 6 denotes a voice input unit, comprising a microphone and an amplifier, that inputs the transmitted voice; 7 denotes a power suppression unit that can suppress the power of the transmitted signal by changing a transmission gain gain-s; and 8 denotes a transmitting unit, comprising a modulator and the like, that sends the transmitted voice signal to the other party.

[0007] Reference numeral 4 denotes a power calculation unit that calculates the power of the signal received by the receiving unit 1; 5' denotes a sound detection unit that detects, from the power calculated by the power calculation unit 4, whether the current received voice state s-r is silence or sound; 11 denotes a power calculation unit that calculates the power of the voice signal input to the voice input unit 6; 9' denotes a sound detection unit that detects, from the power calculated by the power calculation unit 11, whether the current transmitted voice state s-s is silence or sound; and 10 denotes a determination unit that decides, from the detection results of the sound detection units 5' and 9', which of the power suppression units 2 and 7 is put into the suppression state.

[0008] The power calculation units 4 and 11 calculate the power of the input voice data as follows. With the input voice data denoted x_i, the output power p_i is

$$p_i = 10 \log \left[ \sum_{j=0}^{J} x_{i-j} \times x_{i-j} \right]$$

where the sum runs from j = 0 to J.
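The power formula above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the source writes only "log", so a base-10 logarithm (a decibel-style value) is assumed; samples before the start of the signal are treated as absent; and the `floor` parameter, which avoids taking the logarithm of zero, is a hypothetical addition. The name `frame_power` is also hypothetical.

```python
import math


def frame_power(samples, i, J, floor=1e-12):
    """Return p_i = 10 * log10(sum_{j=0..J} x[i-j] * x[i-j]).

    Assumptions (not stated in the source): base-10 logarithm,
    samples before index 0 are skipped, and `floor` guards
    against log(0) when the window is all zeros.
    """
    start = max(0, i - J)  # clip the window at the start of the signal
    energy = sum(x * x for x in samples[start:i + 1])
    return 10.0 * math.log10(max(energy, floor))
```

For example, `frame_power([1.0] * 10, 9, 9)` sums ten unit-energy samples and evaluates 10 × log10(10).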

[0009] As shown in FIG. 7, each of the sound detection units 5' and 9' consists of a comparison unit that compares the input power p_i with a fixed threshold th and determines whether the current voice state s_i is sound or silence. Here s_i = 0 means silence and s_i = 1 means sound. The decision rule is

if (p_i < th) s_i = 0
if (p_i > th) s_i = 1

That is, the voice state s_i is set to 0 when the input power p_i is smaller than the threshold th, and to 1 when it is larger than th. This prevents background noise at or below the threshold th from being erroneously judged as sound.
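The fixed-threshold decision rule can be written as a small Python function. This is only a sketch: the source defines the result for p_i < th and p_i > th but not for p_i = th, so keeping the previous state in that case is an assumption, and the names `detect_sound` and `previous_state` are hypothetical.

```python
def detect_sound(p, th, previous_state=0):
    """Return the voice state s: 0 for silence, 1 for sound."""
    if p < th:
        return 0
    if p > th:
        return 1
    # p == th is not covered by the source; holding the previous
    # state here is an assumption made for this sketch.
    return previous_state
```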

[0010] The determination unit 10 controls the reception gain gain-r of the reception power suppression unit 2 and the transmission gain gain-s of the transmission power suppression unit 7 according to the decision logic table shown in FIG. 8. Both gains lie in the range 0.0 ≤ gain ≤ 1.0. The decision logic table of FIG. 8 specifies the following control:

When s-s = 0 and s-r = 0: gain-s = 0.0 and gain-r = 0.0.
When s-s = 1 and s-r = 0: gain-s = 1.0 and gain-r = 0.0.
When s-s = 0 and s-r = 1: gain-s = 0.0 and gain-r = 1.0.
When s-s = 1 and s-r = 1: reception takes priority, so gain-s = 0.0 and gain-r = 1.0.
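The four cases of the decision logic can be expressed as a lookup table. The Python sketch below mirrors the values given in the text for FIG. 8; the function name `decide_gains` and the tuple ordering are hypothetical choices for illustration.

```python
# Keys are (s_s, s_r): transmitted and received voice states.
# Values are (gain_s, gain_r): transmission and reception gains.
GAIN_TABLE = {
    (0, 0): (0.0, 0.0),  # both silent: suppress both paths
    (1, 0): (1.0, 0.0),  # only the near end talks: pass transmission
    (0, 1): (0.0, 1.0),  # only the far end talks: pass reception
    (1, 1): (0.0, 1.0),  # both talk: reception takes priority
}


def decide_gains(s_s, s_r):
    """Return (gain_s, gain_r) for the given voice states."""
    return GAIN_TABLE[(s_s, s_r)]
```

A table keeps the priority rule for the double-talk case in one visible place, which makes it easy to swap in a different criterion.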

[0011] According to the decision of the determination unit 10, the power suppression units 2 and 7 apply the following operation to the input voice data x_i and output the result as the output voice data x_i:

x_i = x_i × gain

[0012] In this way, the voice switch method suppresses either the received voice or the transmitted voice depending on their states; the one that is not suppressed is output from the speaker if it is the received voice, or transmitted if it is the transmitted voice. When both are sound, various criteria are possible, such as giving priority to the received voice or to whichever has the higher voice power.

[0013]

[Problems to be Solved by the Invention] In the sound detection units 5' and 9' of the conventional voice switch, sound is determined by comparing the threshold th with the input voice power p_i. Across different usage environments, however, the background noise level varies, and a fixed threshold th may not give a working sound determination. For example, if the threshold th is set low, the background noise is always judged to be sound once its power rises; conversely, if the threshold th is set high, low-level sound goes undetected.

[0014] The present invention was made in view of this problem, and its object is to enable sound to be detected accurately even when the background noise level fluctuates.

[0015]

[Means for Solving the Problems] To solve the above problem, the voice switch of a telephone according to the present invention comprises: receiving-side power calculation means that calculates the power of the received voice; transmitting-side power calculation means that calculates the power of the transmitted voice; receiving-side sound detection means that determines the sound/silence state of the received voice from the received voice power supplied by the receiving-side power calculation means; transmitting-side sound detection means that determines the sound/silence state of the transmitted voice from the transmitted voice power supplied by the transmitting-side power calculation means; receiving-side suppression means that suppresses the power of the received voice; transmitting-side suppression means that suppresses the power of the transmitted voice; and determination means that decides, based on the voice state of the received voice from the receiving-side sound detection means and the voice state of the transmitted voice from the transmitting-side sound detection means, which of the received voice at the receiving-side suppression means and the transmitted voice at the transmitting-side suppression means is to be suppressed. At least one of the receiving-side and transmitting-side sound detection means is configured to learn a threshold, while it judges its input voice to be silent, based on the time average of the input voice power or an equivalent value, and to detect sound/silence based on the comparison of this threshold with the input voice power.

If the threshold that the sound detection means compares with the input voice power is fixed, it cannot cope with variations in background noise caused by environmental changes. The background noise is therefore learned: while the input voice is judged to be silent, the current background noise level is measured by computing the time average of the voice power over a defined time range, or an equivalent value, and the threshold is reset accordingly. Sound/silence can then be detected with an appropriate threshold that follows changes in the background noise level.

[0016] The time range of the input voice power used for learning the threshold of the sound detection means may be changed according to changes in the voice state, for example by narrowing it when a sound section switches to a silent section. This improves how quickly the background noise learning tracks the noise.

[0017]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 shows a hands-free telephone equipped with a voice switch as one embodiment of the present invention. In the figure, the receiving unit 1, power suppression units 2 and 7, voice output unit 3, power calculation units 4 and 11, voice input unit 6, transmitting unit 8, and determination unit 10 are the same circuit elements as those described for the conventional device of FIG. 6, so a detailed description is omitted here.

[0018] The sound detection units 5 and 9, on the other hand, differ from those of the conventional circuit. The sound detection unit 5 consists of a comparison unit 51 and a background noise learning unit 52, and the sound detection unit 9 consists of a comparison unit 91 and a background noise learning unit 92. Because the sound detection units 5 and 9 have the same configuration, only the function and operation of the sound detection unit 5 are described below.

[0019] The comparison unit 51 is a circuit that compares the input power signal p_i with a predetermined threshold th and outputs the voice state s_i as the result, according to the following decision rule. Here s_i = 0 means silence and s_i = 1 means sound.

if (p_i < th) s_i = 0
if (p_i > th) s_i = 1

That is, the voice state s_i is set to 0 when the input power p_i is smaller than the threshold th, and to 1 when it is larger than th.

[0020] The background noise learning unit 52 is a circuit that sets the threshold th based on the input voice power and supplies it to the comparison unit 51; it also receives the voice state s_i from the comparison unit 51. When the voice state from the comparison unit 51 is s_i = 0, that is, silence, the background noise learning unit 52 computes the threshold th according to

$$th = P_{ave} + \alpha \quad \text{(when silent)} \qquad (1)$$

$$P_{ave} = \frac{1}{N} \sum_{i=1}^{N} p_i \qquad (2)$$

where the sum runs from i = 1 to N, and supplies it to the comparison unit 51. Equation (2) gives the average power P_ave, which is the time average of the input voice power p_i, and equation (1) sets the threshold th to this average power plus a predetermined coefficient α. When the voice state from the comparison unit 51 is s_i = 1, that is, sound, th holds the value from the preceding silent period.
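The learning behavior of equations (1) and (2) can be sketched as a small Python class. This is an illustrative sketch under several assumptions that the source leaves open: the average is taken over the most recent N power values observed while the state is silent, fewer than N values are averaged until the window fills, p_i = th is treated as silence, and an initial threshold must be supplied. The class name `NoiseLearningDetector` and its parameters are hypothetical.

```python
from collections import deque


class NoiseLearningDetector:
    """Comparison unit plus background noise learning unit (sketch)."""

    def __init__(self, n, alpha, initial_th):
        self.window = deque(maxlen=n)  # last N silent-frame powers
        self.alpha = alpha             # offset added to the average
        self.th = initial_th           # assumed starting threshold

    def step(self, p):
        """Classify one power value and update th when silent."""
        s = 1 if p > self.th else 0    # p == th counts as silence here
        if s == 0:
            # Learn only during silence: th = P_ave + alpha.
            self.window.append(p)
            self.th = sum(self.window) / len(self.window) + self.alpha
        # During sound, th keeps its value from the last silence.
        return s
```

With `NoiseLearningDetector(n=4, alpha=6.0, initial_th=-40.0)`, feeding a silent frame of power -50.0 moves the threshold to -44.0, and a following loud frame of -20.0 is classified as sound without changing the threshold.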

[0021] Making the coefficient α smaller raises the sensitivity of sound detection but also raises the probability of misjudging background noise as sound; conversely, making α larger lowers the detection sensitivity but also lowers the probability of misjudging background noise as sound. A suitable value can be set empirically.

[0022] With this configuration, as the background noise level rises, the threshold th rises with it, lowering the probability that background noise is falsely detected as sound; conversely, as the background noise level falls, the threshold th falls with it, so that even low-level sound can be detected accurately.

[0023] Thus, this embodiment copes with variations in background noise caused by environmental changes by providing a learning unit for the threshold th used by the sound detection unit. Learning of the background noise is stopped during sound sections and performed only during silent sections.

[0024] Various modifications are possible in carrying out the present invention; one is described below. In this embodiment, to improve how quickly the background noise learning tracks the noise, the range of power values used for learning is narrowed at the moment the state changes from a sound section to a silent section (the moment the speaker stops talking). In silent sections the threshold th is learned as

$$th = P_{ave} + \alpha \qquad (3)$$

$$P_{ave} = \frac{1}{M} \sum_{i=1}^{M} p_i \qquad (4)$$

where the sum runs from i = 1 to M. When the state has changed from sound to silence, the range M used for the average in equation (4) is made smaller than N in equation (2) above. After a certain time has elapsed, the range is returned to the normal range, that is, M = N.

[0025] In the embodiments above, the average power P_ave is the time average of the input voice power p_i, but the present invention is not limited to this. A value equivalent to the time average may be used instead, for example the square root of the sum of the squared voice power p_i over a predetermined number of samples, divided by the number of samples.
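The alternative quantity described above can be computed as in the following Python sketch, which is a literal reading of the sentence rather than a confirmed implementation. The function name `quasi_average` is hypothetical. Note that this value is never negative, so it only stands in for the plain average when the power values themselves are non-negative.

```python
import math


def quasi_average(powers):
    """Return sqrt(sum of p_i squared) divided by the sample count."""
    n = len(powers)
    if n == 0:
        raise ValueError("at least one power value is required")
    return math.sqrt(sum(p * p for p in powers)) / n
```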

[0026]

[Effects of the Invention] As described above, according to the present invention, the background noise learning unit provided in the sound detection unit makes it possible to follow changes in background noise in various environments, so that sound can be detected accurately even when the background noise level fluctuates.

[Brief Description of the Drawings]

FIG. 1 is a diagram showing a hands-free telephone equipped with a voice switch as one embodiment of the present invention.

FIG. 2 is a diagram illustrating acoustic echo in a hands-free telephone or the like.

FIG. 3 is a diagram illustrating the echo canceller method.

FIG. 4 is a diagram illustrating the voice switch method.

FIG. 5 is a diagram comparing the echo canceller method and the voice switch method.

FIG. 6 is a diagram showing a conventional hands-free telephone equipped with a voice switch.

FIG. 7 is a diagram showing the configuration of the sound detection unit in the conventional device.

FIG. 8 is a diagram showing an example of a sound/silence determination table.

[Explanation of Symbols]

1: receiving unit
2, 7: power suppression units
3: voice output unit
4, 11: power calculation units
5, 5', 9, 9': sound detection units
6: voice input unit
8: transmitting unit
51, 91: comparison units
52, 92: background noise learning units

Claims

[Claims]

[Claim 1] A voice switch of a telephone, comprising: receiving-side power calculation means that calculates the power of the received voice; transmitting-side power calculation means that calculates the power of the transmitted voice; receiving-side sound detection means that determines the sound/silence state of the received voice from the received voice power supplied by the receiving-side power calculation means; transmitting-side sound detection means that determines the sound/silence state of the transmitted voice from the transmitted voice power supplied by the transmitting-side power calculation means; receiving-side suppression means that suppresses the power of the received voice; transmitting-side suppression means that suppresses the power of the transmitted voice; and determination means that decides, based on the voice state of the received voice from the receiving-side sound detection means and the voice state of the transmitted voice from the transmitting-side sound detection means, which of the received voice at the receiving-side suppression means and the transmitted voice at the transmitting-side suppression means is to be suppressed; wherein at least one of the receiving-side and transmitting-side sound detection means is configured to learn a threshold, while it judges its input voice to be silent, based on the time average of the input voice power or an equivalent value, and to detect sound/silence based on the comparison of this threshold with the input voice power.

[Claim 2] The voice switch of a telephone according to claim 1, wherein the time range of the input voice power used for learning the threshold of the sound detection means is changed according to the voice state.