JPH08202394A - Voice detector

Info

Publication number: JPH08202394A
Application number: JP7011575A
Authority: JP
Inventors: Masanori Morohishi; 正典諸菱
Original assignee: Kyocera Corp
Current assignee: Kyocera Corp
Priority date: 1995-01-27
Filing date: 1995-01-27
Publication date: 1996-08-09

Abstract

PURPOSE: In a voice detector that uses voice power to detect the presence or absence of a voice signal, to prevent clipping of the beginning of a word when the signal changes from silence to speech, thereby preserving the naturalness of the call and achieving high call quality.

CONSTITUTION: A voice detector that detects voice power and similar quantities in units of fixed-length frames to determine whether voice is present comprises a voice power calculator 10 that computes, at every sample of the discretized input signal, the voice power over a fixed-length section; a maximum value detection circuit 11 that detects the maximum output of the voice power calculator 10 within a section specified for each frame; and a determination circuit 12 that uses the output of the maximum value detection circuit 11 to decide whether the frame is voiced or silent.

Description

Detailed Description of the Invention

[0001]

Field of Industrial Application: The present invention relates to a voice detector, used in digital voice communication and the like, that detects the presence or absence of voice.

[0002]

Description of the Related Art: In recent years, mobile communication systems such as mobile phones have used a method (VOX) that suspends transmission during periods without voice in order to reduce power consumption. Because this method reduces transmission power by transmitting only the periods that contain voice, a highly accurate voice detector is desired.

[0003]

A conventional voice detector is described below. FIG. 7 shows a conventional voice detector. In FIG. 7, the presence or absence of voice is determined in units of fixed-length frames. Reference numeral 1 denotes a voice power calculator that calculates the voice power of the frame, and 2 denotes a zero-crossing rate measuring unit that measures the zero-crossing rate within the frame. The zero-crossing rate is an effective means of detecting fricative consonants, which have low overall power but high power in the high-frequency band. Reference numeral 3 denotes a determination circuit that decides from the voice power and the zero-crossing rate whether the frame is voiced or silent; 4 denotes a hangover generator that holds the voiced state for a fixed time after a transition from the voiced state to the silent state, in order to prevent clipping of word endings and to maintain the continuity of speech; and 5 denotes a threshold calculator that determines the thresholds to be used inside the determination circuit for the next frame.

[0004]

With this configuration, for each frame of a fixed length, for example about 20 msec, of the input voice signal, the voice power calculator 1 calculates the voice power and the zero-crossing rate calculator 2 calculates the zero-crossing rate. The determination circuit 3 compares the voice power with two thresholds to decide whether the frame is voiced, silent, or indeterminate. If the frame is indeterminate, the circuit decides between voiced and silent using the zero-crossing rate. It outputs 1 to the hangover generator for a voiced frame and 0 for a silent frame. The hangover generator 4 passes 0 through unchanged for a silent frame and 1 for a voiced frame, and when the decision switches from voiced to silent it continues to output 1 for a certain length of time, that is, for several frames. The threshold calculator 5 estimates the power level of the ambient noise from the determination result and the voice power, and sets the two thresholds used for the voice power decision in the determination circuit 3.

[0005]

Problems to Be Solved by the Invention: The conventional voice detector described above detects voiced and silent states frame by frame, and uses the power of a fixed-length section as the voice power of the frame. A state in which the beginning of a word falls at the end of a frame is therefore difficult to detect, and the beginning of the word is clipped. In addition, recent digital mobile phones encode voice in units of these frames, and the voice coding method used, VSELP (Vector Sum Excited Linear Prediction), generates the current voice signal by prediction from past signals, so even slight clipping of the beginning of a word degrades voice quality.

[0006]

The present invention solves the conventional problems described above, and its object is to provide an excellent voice detector, based on voice power, that reduces clipping of the beginning of words.

[0007]

Means for Solving the Problems: To achieve the above object, the present invention considers a section that contains the voice signal of the current frame together with the voice signal before and after it, shifts the start and end of the frame within that section, and takes the value of the section in which the voice power is greatest as the voice power of the frame, which the determination circuit then compares with the thresholds.

[0008]

Operation: With the above configuration, even for a frame in which the beginning of a word falls in the latter half, so that the voice power of the frame itself does not become large, the present invention calculates the voice power over sections of a certain length before and after the current frame and takes the maximum value as the voice power of the frame, which reduces clipping of the beginning of words. In addition, the voice power of the current frame section, obtained in the course of this process, is supplied to the threshold calculator, so the thresholds can be set by the same method as in the conventional detector.

[0009]

Embodiments: Embodiments of the present invention are described below with reference to the drawings. FIG. 1 shows a voice detector according to the present invention, FIG. 3 shows an example of a circuit that calculates voice power, FIG. 4 illustrates how the voice detector according to the present invention measures power, FIG. 5 shows an example of the determination circuit, and FIG. 6 illustrates the decision rule of the determination circuit.

[0010]

In the following description, the voice input to the voice detector is assumed to be sampled at 8 kHz and digitized in two's complement format. One frame is 20 msec, or 160 samples, and the voiced or silent state is detected for each frame.

[0011]

In FIG. 1, 10 denotes a voice power calculator that calculates the voice power at every sample, 11 denotes a maximum value detector that detects the maximum value within a given section, 13 denotes a zero-crossing rate calculator that calculates the zero-crossing rate, 12 denotes a determination circuit that decides between voiced and silent from the zero-crossing rate and the maximum voice power, and 14 denotes a hangover generator.

[0012]

For the input voice, the voice power calculator 10 obtains, at every sample, the voice power of a section of fixed length. This fixed length may be equal to the frame length, but it only needs to be long enough to suppress power fluctuation caused by the pitch period of the voice. It is therefore sufficient to choose a length that contains at least one period of the pitch waveform, and about 15 to 20 msec is normally enough. Here the length is taken to be 160 samples (20 msec), the same as the frame length.

[0013]

FIG. 3 shows a specific example of the voice power calculator, a circuit that calculates the voice power at every sample. Reference numeral 20 denotes a squaring circuit that squares the input sample, 21 denotes a shift register, 22 and 23 denote adders, and 24 denotes a one-sample delay element. Each input voice sample is squared by the squaring circuit 20. The squared value is fed into the shift register 21 and is also added by the adder 22 to the output of the delay element 24, which holds the voice power value of the previous sample. The adder 23 then subtracts the squared value from 160 samples earlier, which is read from the shift register 21 whose length equals the power measurement section. The result is the voice power at the current sample, obtained as a moving average of the squared voice samples. The voice power of the current sample is also fed into the delay element 24 so that it can be used to obtain the power at the next sample.
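The recursive update performed by the circuit of FIG. 3 can be sketched in Python as follows. This is only an illustrative sketch, not the circuit itself: the function name `running_power` and its `window` parameter are hypothetical, and the register is assumed to start filled with zeros. As described, the circuit accumulates a running sum of squares; dividing by the window length would give the mean, which differs only by a constant factor.

```python
from collections import deque

def running_power(samples, window=160):
    """Return, for every input sample, the sum of squares of the last `window` samples."""
    # Assumption: the shift register starts out holding zeros.
    history = deque([0] * window, maxlen=window)  # stands in for shift register 21
    power = 0                                     # stands in for delay element 24
    out = []
    for x in samples:
        sq = x * x                   # squaring circuit 20
        oldest = history[0]          # squared value from `window` samples earlier
        history.append(sq)           # shift in the new square, dropping the oldest
        power = power + sq - oldest  # adder 22 adds, adder 23 subtracts
        out.append(power)
    return out
```

Each update costs one addition and one subtraction regardless of the window length, which is what makes a per-sample power measurement practical.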

[0014]

The voice power calculated at every sample is input to the maximum value detector 11. For the frame section being processed, indicated by 31 in FIG. 4, the maximum value detector 11 detects the maximum voice power within the range indicated by 32. The offset (delay) between the frame being processed and the voice power measurement range may be chosen within the limit that the system can tolerate. In a digital mobile phone, for example, the VSELP voice coding method itself has to read 65 extra voice samples ahead of frame processing. If the present method is applied to a digital mobile phone, setting the offset between the frame 31 being processed and the voice power measurement range 32 to 65 samples therefore introduces no delay that would cause a problem for the system.

[0015]

As a result, when speech begins in the latter half of the frame being processed, as illustrated in FIG. 4, the value of the high-power speech portion extending from the latter half of that frame into the next frame is selected as the voice power of the frame.
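The selection of the maximum power over a range that extends past the frame can be illustrated with a short Python sketch. The exact extent of range 32 is defined by FIG. 4, which is not reproduced here, so the sketch assumes that the range covers the per-sample power values from the first sample of the frame up to 65 samples past its end. The function name and the parameters `frame_len` and `lookahead` are hypothetical.

```python
def frame_power_with_lookahead(power, frame_index, frame_len=160, lookahead=65):
    """Return the largest per-sample power value in the measurement range of one frame."""
    start = frame_index * frame_len
    # Assumption: the measurement range runs `lookahead` samples past the frame end.
    end = start + frame_len + lookahead
    window = power[start:end]
    if not window:
        raise ValueError("frame lies outside the power sequence")
    return max(window)

# Speech that starts shortly after frame 0 ends: the power is 0 for the
# first 200 samples and 100 afterwards.
onset = [0] * 200 + [100] * 200
plain_frame_power = onset[159]                             # value at the frame end only
extended_frame_power = frame_power_with_lookahead(onset, 0)
```

In this toy input the power measured at the frame end alone is 0, while the extended range already sees the onset, so the frame containing the start of the word is more likely to be judged voiced.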

[0016]

In parallel with this voice power calculation, the zero-crossing rate calculator 13 calculates the zero-crossing rate of the input voice over the frame section indicated by 31 in FIG. 4. The zero-crossing rate calculator 13 looks only at the sign bit of each input voice sample and counts the number of times the sign bit differs between adjacent samples.
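Counting sign-bit changes between adjacent samples is simple enough to show directly. The following Python sketch assumes integer samples, for which the test `x < 0` matches the sign bit of a two's complement value (zero counts as non-negative). The function name is hypothetical.

```python
def zero_crossing_count(frame):
    """Count how often the sign bit changes between adjacent samples of one frame."""
    signs = [x < 0 for x in frame]  # True where the sign bit would be set
    return sum(a != b for a, b in zip(signs, signs[1:]))
```

Dividing the count by the number of sample pairs would give a rate; for a fixed frame length the raw count can be compared with a threshold just as well.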

[0017]

The results of the maximum value detector 11 and the zero-crossing rate calculator 13 are input to the determination circuit 12 once per frame. The circuit decides whether the frame being processed is voiced or silent, and outputs 1 if it is voiced and 0 if it is silent.

[0018]

FIG. 5 shows an example of the determination circuit. The output of the maximum value detector 11 is applied to input 40, and the output of the zero-crossing rate calculator to input 41. Reference numerals 42, 43, and 44 denote comparator circuits that compare their inputs with the thresholds TH1, TH2, and TH3, respectively, and output 1 if the input is larger than the threshold and 0 if it is smaller. Reference numeral 45 denotes a NOT gate that inverts its input, 46 denotes an AND gate that takes the logical product of three inputs, and 47 denotes an OR gate that takes the logical sum of two inputs. The determination circuit outputs 1 if the input voice power is greater than the threshold TH1. It also outputs 1 if the voice power is smaller than TH1 and greater than TH2 and the zero-crossing rate is greater than TH3. In all other cases it outputs 0. The thresholds TH1 and TH2 are normally set so that TH1 > TH2.
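The decision rule can be written out as a small Python function that mirrors the gates one by one. FIG. 5 is not reproduced here, so the wiring is an assumption inferred from the stated behaviour: the NOT gate 45 is taken to invert the output of the TH1 comparator before it enters the three-input AND gate. The function and argument names are hypothetical.

```python
def decide_frame(max_power, zcr, th1, th2, th3):
    """Return 1 for a voiced frame and 0 for a silent frame."""
    c1 = max_power > th1  # comparator 42 (TH1)
    c2 = max_power > th2  # comparator 43 (TH2)
    c3 = zcr > th3        # comparator 44 (TH3)
    # Assumed wiring: NOT gate 45 inverts c1, AND gate 46 combines three inputs.
    indeterminate_but_noisy = (not c1) and c2 and c3
    return 1 if (c1 or indeterminate_but_noisy) else 0  # OR gate 47
```

With TH1 = 50, TH2 = 20 and TH3 = 10, for example, a power of 30 is voiced only when the zero-crossing value exceeds 10, which is how low-power fricatives are recovered.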

[0019]

As shown in FIG. 6, this means that the frame being processed is judged voiced if its voice power is at or above the threshold TH1, indeterminate if it is between TH2 and TH1, and silent if it is at or below TH2; in the indeterminate state, the value of the zero-crossing rate decides whether the frame is voiced or silent. In a frame such as the one shown in FIG. 4, even if the actual voice power of the frame being processed is only near the threshold TH1, according to the present invention the voice power value input to the determination circuit 12 is selected from the speech side of the voice power measurement range shown in FIG. 4. Because this value is larger than the actual voice power of the frame, the frame can be judged voiced, and clipping of the beginning of words is reduced.

[0020]

The output of the determination circuit is finally input to the hangover generator 14, which outputs the voiced or silent decision for the frame being processed. The hangover generator 14 passes 0 through unchanged for a silent frame and 1 for a voiced frame, and when the decision switches from voiced to silent it continues to output 1 for a certain length of time, that is, for several frames. This prevents the loss of speech portions that the voice detector fails to detect.
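A hangover stage of this kind can be sketched in Python as a counter that is reloaded by every voiced frame. The sketch assumes the hold applies after a voiced frame is followed by silent frames, as in the description of the conventional hangover generator, and the hold length `hang_frames` is a hypothetical parameter since the text only says "several frames".

```python
def apply_hangover(decisions, hang_frames=3):
    """Extend each run of voiced frames (1) by up to `hang_frames` extra frames."""
    out = []
    remaining = 0  # frames of hold still available
    for d in decisions:
        if d == 1:
            remaining = hang_frames  # every voiced frame reloads the hold counter
            out.append(1)
        elif remaining > 0:
            remaining -= 1           # silent input, but still inside the hold period
            out.append(1)
        else:
            out.append(0)
    return out
```

For example, with a hold of two frames the raw decisions `[0, 1, 0, 0, 0, 0, 1, 0]` become `[0, 1, 1, 1, 0, 0, 1, 1]`.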

[0021]

FIG. 2 shows a second embodiment of the present invention, in which the thresholds used in the determination circuit 12 can be set variably for each frame. Mobile phones and similar devices are often used outdoors, so the ambient noise cannot be known in advance. If the thresholds used in the determination circuit were fixed, the voice detector would judge every frame voiced under high noise. The thresholds can therefore be determined adaptively from the ambient noise.

[0022]

In FIG. 2, a threshold calculation circuit is added to the first embodiment of the present invention so that optimum thresholds are set for each frame and voice is detected accurately even under high noise. Everything other than the threshold calculator 15 operates as in the first embodiment, so only the threshold calculator 15 is described here. The ambient noise is assumed to be stationary.

[0023]

For each frame, the threshold calculator 15 receives from the voice power calculator 10 the voice power value of a fixed section, and it monitors the variation of that power from frame to frame. If the variation stays at or below a certain value for a certain length of time, the calculator judges that the section contains only ambient noise, estimates the ambient noise power from the values input during that section, and sets the thresholds used in the determination circuit.
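One way to picture this adaptive threshold setting is the Python sketch below. The text does not say how the variation is measured or how the thresholds are derived from the noise estimate, so everything specific here is an assumption made for illustration: the variation is taken as the spread (maximum minus minimum) of the recent frame powers, the noise estimate is their mean, and TH1 and TH2 are fixed multiples of it. All names and default values are hypothetical.

```python
def estimate_thresholds(recent_powers, previous, max_spread=50.0,
                        th1_factor=4.0, th2_factor=2.0):
    """Return (TH1, TH2), updated only when the recent frames look like steady noise."""
    if not recent_powers:
        return previous
    # Assumed variation measure: spread of frame power over the monitored period.
    spread = max(recent_powers) - min(recent_powers)
    if spread > max_spread:
        return previous  # power is fluctuating, probably speech: keep old thresholds
    noise = sum(recent_powers) / len(recent_powers)  # assumed noise power estimate
    return (th1_factor * noise, th2_factor * noise)
```

Because the factors keep TH1 above TH2, the ordering TH1 > TH2 required by the determination circuit is preserved as the noise level changes.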

[0024]

The length of time monitored by the threshold calculator 15 may also be made variable according to the result of the determination circuit 12.

[0025]

Furthermore, because the voice power calculator calculates the voice power at every sample, the variation can be monitored sample by sample, which allows the ambient noise to be estimated even more accurately.

[0026]

Effects of the Invention: As described above, according to the present invention, in a voice detector that decides from the voice power whether each fixed-length frame is voiced or silent, the measurement range of the voice power is extended before and after the frame, and the maximum value measured within that range is taken as the voice power of the frame. Voice can therefore be detected even in a frame where speech begins in the latter half and the power of the frame itself is not very large, so clipping of the beginning of words is reduced.

Brief Description of the Drawings

FIG. 1 is a block diagram of the voice detector according to the first embodiment of the present invention.

FIG. 2 is a block diagram of the voice detector according to the second embodiment of the present invention.

FIG. 3 is a diagram showing an example of the voice power calculator in the embodiments.

FIG. 4 is a diagram showing the frame being processed and the voice power measurement range in the present invention.

FIG. 5 is a diagram showing an example of the determination circuit in the embodiments.

FIG. 6 is a diagram showing the decision rule of the determination circuit in the embodiments.

FIG. 7 is a block diagram of a conventional voice detector.

Explanation of Reference Numerals

1: Power calculator
2: Zero-crossing rate calculator
3: Determination circuit
4: Hangover generator
5: Threshold calculator
10: Voice power calculator
11: Maximum value detector
12: Determination circuit
13: Zero-crossing rate calculator
14: Hangover generator
15: Threshold calculator
20: Squaring circuit
21: Shift register
22, 23: Adders
24: Delay element
42, 43, 44: Comparator circuits
45: NOT gate
46: AND gate
47: OR gate

Claims

1. A voice detector that detects voice power and the like in units of fixed-length frames to detect the presence or absence of voice, comprising: a voice power calculator 10 that measures, at every sample of a discretized input signal, the voice power over a fixed length; a maximum value detection circuit 11 that detects the maximum value of the output of the voice power calculator 10 within a section specified for each frame; and a determination circuit 12 that compares the output of the maximum value detection circuit 11 with a threshold to determine whether the frame is voiced or silent.

2. The voice detector according to claim 1, further comprising a threshold calculator 15 that varies the threshold of the determination circuit 12 based on the output of the voice power calculator 10.