JP2009134260A

JP2009134260A - Voice musical sound false broadband forming device, voice speech musical sound false broadband forming method, and its program and its record medium

Info

Publication number: JP2009134260A
Application number: JP2008230455A
Authority: JP
Inventors: Takeshi Mori; 岳至森; Shigeaki Sasaki; 茂明佐々木; Kimitaka Tsutsumi; 公孝堤; Yuusuke Hiwazaki; 祐介日和▲崎▼; Naka Omuro; 仲大室; Akitoshi Kataoka; 章俊片岡
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2007-10-30
Filing date: 2008-09-09
Publication date: 2009-06-18
Anticipated expiration: 2028-09-09
Also published as: JP4733727B2

Abstract

PROBLEM TO BE SOLVED: To improve the sound quality of a speech and musical sound pseudo-wideband apparatus.

SOLUTION: The apparatus converts a narrowband speech and musical sound signal into a frequency-domain signal to form a low-band signal, multiplies a signal derived from the low-band signal by a gain coefficient to create a high-band signal, and combines the high-band signal with the low-band signal to form a pseudo-wideband signal. A gain determination section determines the gain from the ratio of the power, or of the sum of absolute amplitudes, of signals in different ranges within the low-band signal: the gain is made smaller when the power or absolute-amplitude sum on the lower-frequency side is large, and larger when that on the higher-frequency side is large.

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a speech and musical sound pseudo-wideband method that extends a narrowband speech and musical sound signal into a wideband speech signal, and to an apparatus, a program, and a recording medium for that method.

The frequency band of speech signals that a conventional telephone system can transmit is about 300 Hz to 3.4 kHz. The speech coding technology of conventional telephone systems aims to minimize the amount of transmitted parameters, so speech beyond the frequency band of the coded speech signal cannot be obtained. Meanwhile, recent advances in acoustic technology and digital signal processing have improved the sound quality of equipment used in daily life. Against this background, there is demand for higher sound quality in telephony as well. Speech pseudo-wideband apparatuses and methods are used to meet this demand.

FIG. 13 shows the procedure of a conventional speech pseudo-wideband method, which is briefly described here. The conventional method consists of up-sampling processing 101, full-wave rectification processing 102, STFT (short-time Fourier transform) analysis processing 103 and 105, band-pass filtering processing 104, copy processing 106 that copies the frequency spectrum of the low frequency band to the high band, multiplication processing 107 and 108, STFT synthesis processing 109, and addition processing 110. The up-sampling processing 101 up-samples a narrowband speech signal sampled at, for example, 8 kHz into a speech signal sampled at 16 kHz. The STFT analysis processing 105 frequency-analyzes the up-sampled speech signal at fixed time intervals (frames) to generate a frequency spectrum. The copy processing 106 copies the low-band frequency spectrum as the frequency spectrum of the high band. The multiplication processing 108 adjusts the gain by multiplying the high-band frequency spectrum by a fixed factor. The steps from full-wave rectification processing 102 through multiplication processing 107 generate a low-band frequency spectrum that is not contained in the narrowband speech signal. Like the high-band spectrum, the low-band spectrum generated by the full-wave rectification processing 102 is multiplied by a fixed factor in the multiplication processing 107 to adjust its gain. The gain-adjusted high-band and low-band frequency spectra are synthesized by the STFT synthesis processing 109. The addition processing 110 adds the gain-adjusted low-band and high-band frequency spectra to the frequency spectrum obtained by frequency analysis of the narrowband speech signal, generating a pseudo-wideband speech signal.
Japanese Patent Laid-Open No. 9-90992, FIG. 1

The conventional speech pseudo-wideband method adjusts the gain by multiplying the frequency spectrum to be added by a fixed factor when the spectrum of the extended frequency range is added to the frequency spectrum of the narrowband speech signal. This method has the problem that it generates noise or makes speech unclear. FIGS. 14 and 15 show examples of the frequency spectrum of a speech signal. The horizontal axis is frequency and the vertical axis is amplitude. FIG. 14(a) shows a frequency spectrum in which the signal amplitude decays as frequency increases, as in a voiced portion of speech. When the copy processing 106 generates the high-band frequency spectrum by multiplying the signal of FIG. 14(a) by a fixed factor, the amplitude, which becomes very small near 4 kHz, rises sharply again above 4 kHz, as shown in FIG. 14(b). Such an extreme discontinuity at the boundary between the low-band and high-band signals causes noise. Conversely, for a frequency spectrum whose amplitude increases from the low band toward the high band, as in the unvoiced portion of speech shown in FIG. 15(a), generating the high-band signal by multiplying by a fixed factor may leave the high-band amplitude small, as shown in FIG. 15(b). In this case, the unvoiced portion of the pseudo-wideband speech becomes unclear and the speech becomes hard to hear.

The present invention has been made in view of these points. Its object is to provide a speech and musical sound pseudo-wideband apparatus that neither causes noise nor makes speech unclear, together with its method, program, and recording medium.

The speech and musical sound pseudo-wideband apparatus according to the present invention comprises a frequency conversion unit, a high-band signal generation unit, a gain determination unit, a gain multiplication unit, a combining unit, and an inverse frequency conversion unit. The frequency conversion unit converts a discretized narrowband speech and musical sound signal into a frequency-domain signal to generate a low-band signal. The high-band signal generation unit copies part or all of the low-band signal to generate a high-band signal. The gain determination unit determines a gain coefficient based on the power ratio, or the ratio of the sums of absolute amplitudes, of signals in different ranges within the low band: it makes the gain coefficient applied to the high-band signal smaller when the power or absolute-amplitude sum of the lower-frequency range is large, and larger when that of the higher-frequency range is large. The gain multiplication unit multiplies the high-band signal by the gain coefficient to generate an emphasized high-band signal. The combining unit combines the low-band signal and the emphasized high-band signal to generate a pseudo-wideband frequency signal. The inverse frequency conversion unit converts the pseudo-wideband frequency signal into a time-domain pseudo-wideband speech signal and outputs it.

In the speech and musical sound pseudo-wideband apparatus of the present invention, the gain determination unit determines the gain coefficient based on the power ratio, or the ratio of the sums of absolute amplitudes, of signals in different ranges within the low band. For a speech signal whose amplitude decays from the low band toward the high band, as shown in FIG. 14(a), the gain coefficient is made small. The spectrum of the pseudo-wideband frequency signal then has a structure in which the amplitude decreases toward higher frequencies, so the discontinuity is less likely to be emphasized and noise generation is suppressed. For a speech signal whose amplitude increases from the low band toward the high band, as shown in FIG. 15(a), the gain coefficient is made large. The spectrum of the pseudo-wideband frequency signal as a whole then has a continuous structure in which the amplitude increases toward higher frequencies, which improves, for example, the clarity of unvoiced sounds. In short, varying the gain coefficient according to the characteristics of the low-band signal prevents noise and makes unvoiced portions easier to hear, which improves the intelligibility of the pseudo-wideband speech.

Embodiments of the present invention are described below with reference to the drawings. Identical elements in multiple drawings are given the same reference numerals, and their description is not repeated.

FIG. 1 shows an example functional configuration of Embodiment 1 of the speech and musical sound pseudo-wideband apparatus of the present invention, and FIG. 2 shows its operation flow. The apparatus comprises a frequency conversion unit 11, a frequency extension unit 12, a high-band signal generation unit 13, a gain determination unit 14, a gain multiplication unit 15, a combining unit 16, and an inverse frequency conversion unit 17. The narrowband speech and musical sound signal In(t) input to the frequency conversion unit 11 is a time-domain signal discretized at a predetermined sampling frequency, and it is input to the input terminal 10 in units of D samples. The number of samples D may be a predetermined value or may vary from frame to frame. The input narrowband signal is denoted In(t) (t = 0, 1, ..., D−1). For example, the frame length is 20 ms and the sampling frequency is 8 kHz.

The frequency conversion unit 11 converts the time-domain narrowband speech and musical sound signal In(t) into a low-band signal, which is a frequency-domain signal (step S11). In the case of MDCT, the frequency conversion unit 11 uses In(t−d) of the immediately preceding frame, stored in a buffer or the like, together with the input In(t) (t = 0, 1, ..., D−1) to generate the frequency-domain signal InFreq(k) (k = 0, 1, ..., D−1). This example uses the MDCT (Modified Discrete Cosine Transform) as the frequency conversion method, but other methods such as the DCT or FFT may be used. In the following description, frequency-domain signals are written using the frequency index k. A smaller value of k represents a lower frequency.
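As a rough illustration of step S11, the following sketch computes a direct (quadratic-time) MDCT from the previous and current D-sample frames. The text does not specify a window function or a normalization, so this sketch assumes no window and no scaling; the function name `mdct` is hypothetical.

```python
import math


def mdct(prev_frame, cur_frame):
    """Direct MDCT of two consecutive D-sample frames, giving D coefficients.

    Assumptions (not stated in the source): rectangular window, no scaling.
    """
    d = len(cur_frame)
    if len(prev_frame) != d:
        raise ValueError("both frames must contain D samples")
    x = list(prev_frame) + list(cur_frame)  # 2D time-domain samples
    n0 = 0.5 + d / 2.0                      # standard MDCT phase offset
    return [
        sum(x[n] * math.cos(math.pi / d * (n + n0) * (k + 0.5))
            for n in range(2 * d))
        for k in range(d)
    ]


# Toy frames with D = 8; a real frame in the example above would have D = 160.
in_freq = mdct([0.0] * 8, [1.0, 0.5, -0.25, 0.0, 0.25, -0.5, 1.0, 0.0])
print(len(in_freq))  # one coefficient per frequency index k = 0 .. D-1
```

A practical implementation would normally apply an analysis window and use a fast algorithm; the direct sum is kept here only because it mirrors the definition.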

The frequency extension unit 12 generates an extended signal InFreqExp(k) by extending the low-band signal by a factor of N, where N is an integer of 2 or more. For example, D entries are appended to the low-band signal InFreq(k) (k = 0, 1, ..., D−1), extending the frequency range to InFreqExp(k) (k = 0, 1, ..., 2D−1), a frequency index range N = 2 times as large (step S12). With the frame length of 20 ms and sampling frequency of 8 kHz given above, D = 160. The extended signal InFreqExp(k) (k = 0, 1, ..., 2D−1) is, for example, the signal given by equation (1).

\[
\mathrm{InFreqExp}(k) =
\begin{cases}
\mathrm{InFreq}(k) & (0 \le k \le D-1) \\
\mathrm{MIN} & (D \le k \le 2D-1)
\end{cases}
\tag{1}
\]

Here the value of MIN may be 0 or a very small value. In other words, the extended signal leaves the low-band signal unchanged while extending the range of the frequency index, for example by a factor of two.
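The extension of equation (1) can be sketched as follows. This is a minimal illustration, assuming MIN = 0 (the text also allows a very small value); the helper name `extend_band` and the placeholder spectrum are hypothetical.

```python
MIN = 0.0  # assumption: the text allows 0 or a very small value


def extend_band(in_freq, n=2, fill=MIN):
    """Equation (1): keep the D low-band entries and pad up to n*D entries."""
    d = len(in_freq)
    return list(in_freq) + [fill] * ((n - 1) * d)


# Frame length 20 ms at 8 kHz sampling gives D = 160 samples per frame.
D = round(0.020 * 8000)
in_freq = [1.0] * D                  # placeholder low-band spectrum
in_freq_exp = extend_band(in_freq)   # N = 2, so 2D = 320 entries
print(D, len(in_freq_exp))
```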

The high-band signal generation unit 13 generates the high-band signal by copying the low-band signal into the extended high-band frequency range (step S13). FIG. 3 shows the specific operation flow of the high-band signal generation processing of step S13. FIG. 3 covers the case where the frequency range of the low-band copy source is fixed. In the following, D_L is the first frequency index of the low-band range to be copied, D_W is the width of the range to be copied, and D_H is the first frequency index of the copy destination in the high band. First, the values of D_L, D_W, and D_H are set (step S131). The frequency index k is set to k = 0 (step S132). MIN is written to the high-band signal FreqHigh(k) from k = 0 to k = D_H−1 (steps S133 to S135). When the frequency index reaches k = D_H, the amplitude of the signal at the first frequency index of the low-band copy source, k = D_L, is copied to FreqHigh(k) (step S136), because k − D_H + D_L = D_H − D_H + D_L = D_L. In this way the amplitudes of the copy source in the range k = D_L to (D_L + D_W − 1) are copied to the high-band range k = D_H to (D_H + D_W − 1) (the "No" loop of step S138). MIN is written to FreqHigh(k) for frequency indices k = (D_H + D_W) to (2D − 1) (steps S139 to S141). As a result, the high-band signal FreqHigh(k) is given by equation (2).

\[
\mathrm{FreqHigh}(k) =
\begin{cases}
\mathrm{MIN} & (0 \le k \le D_H - 1) \\
\mathrm{InFreqExp}(k - D_H + D_L) & (D_H \le k \le D_H + D_W - 1) \\
\mathrm{MIN} & (D_H + D_W \le k \le 2D - 1)
\end{cases}
\tag{2}
\]
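The copy described by equation (2) can be sketched like this. It is a minimal illustration, assuming MIN = 0; the function name `make_high_band` and the toy values of D_L, D_W, and D_H are hypothetical and chosen only to keep the example small.

```python
def make_high_band(in_freq_exp, d_l, d_w, d_h, fill=0.0):
    """Equation (2): FreqHigh(k) = InFreqExp(k - D_H + D_L) inside the copy range,
    and MIN (here `fill`, assumed 0) everywhere else."""
    total = len(in_freq_exp)  # 2D entries
    if d_l + d_w > total or d_h + d_w > total:
        raise ValueError("copy range exceeds the extended spectrum")
    freq_high = [fill] * total
    for k in range(d_h, d_h + d_w):
        freq_high[k] = in_freq_exp[k - d_h + d_l]
    return freq_high


# Toy case with D = 8: copy low-band indices 4..7 to high-band indices 8..11.
in_freq_exp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0] + [0.0] * 8
freq_high = make_high_band(in_freq_exp, d_l=4, d_w=4, d_h=8)
print(freq_high[8:12])  # [5.0, 6.0, 7.0, 8.0]
```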

FIG. 4 schematically shows how the low-band signal is copied into the extended frequency range to generate the high-band signal. The horizontal axis is the frequency index and the vertical axis is amplitude. The amplitudes in the range D_L to (D_L + D_W − 1) of the low band, whose frequency index runs from 0 to D−1, are copied to the range D_H to (D_H + D_W − 1) of the high band.

This example describes copying one contiguous part of the low-band portion of the extended signal to the high-band signal. However, the whole of the extended signal may be copied to the high-band signal, or several separate parts may be copied.

In the example above, the frequency extension unit 12 first extends the frequency index range, for example by a factor of two, and the high-band signal generation unit 13 then copies part or all of the low-band signal into the extended high band. The invention is not limited to this example. The high-band signal generation unit 13 may simply copy part or all of the low-band frequency-domain signal InFreq(k) as the high-band signal. That is, as shown in equation (3), it only cuts out part or all of the low-band frequency-domain signal InFreq(k).

\[
\mathrm{FreqHigh}(k) =
\begin{cases}
\mathrm{MIN} & (0 \le k \le D_H - 1) \\
\mathrm{InFreqExp}(k - D_H + D_L) & (D_H \le k \le D_H + D_W - 1) \\
\mathrm{MIN} & (D_H + D_W \le k \le D - 1)
\end{cases}
\tag{3}
\]

The combining unit 16 then synthesizes the output by placing on the high-frequency side the emphasized high-band signal, which is the high-band signal after gain adjustment by the gain multiplication unit 15 described later, and placing the low-band frequency-domain signal InFreq(k) on the low-frequency side. In this way, the frequency range may instead be extended in the combining unit 16.
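The variant around equation (3), in which the frequency range is extended in the combining unit rather than beforehand, might look like the sketch below. The function names and the gain value are hypothetical, and MIN is assumed to be 0.

```python
def cut_high_band(in_freq, d_l, d_w, d_h, fill=0.0):
    """Equation (3): a D-entry high-band signal cut out of the low-band signal."""
    d = len(in_freq)
    if d_l + d_w > d or d_h + d_w > d:
        raise ValueError("copy range exceeds the low-band spectrum")
    freq_high = [fill] * d
    for k in range(d_h, d_h + d_w):
        freq_high[k] = in_freq[k - d_h + d_l]
    return freq_high


def combine_by_placement(in_freq, freq_high, k_hg):
    """Low band on the low side, gain-adjusted high band on the high side."""
    return list(in_freq) + [k_hg * v for v in freq_high]


in_freq = [4.0, 3.0, 2.0, 1.0]                   # toy low band, D = 4
freq_high = cut_high_band(in_freq, d_l=2, d_w=2, d_h=0)
ps_freq = combine_by_placement(in_freq, freq_high, k_hg=0.5)  # illustrative gain
print(ps_freq)  # [4.0, 3.0, 2.0, 1.0, 1.0, 0.5, 0.0, 0.0]
```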

FIG. 5 schematically shows the above operation. The horizontal axis is the frequency index and the vertical axis is amplitude. FIG. 5(a) is the low-band frequency-domain signal InFreq(k). FIG. 5(b) is the high-band signal copied by the high-band signal generation unit 13. Note that the upper limit of the frequency index in FIG. 5(b) is D−1: it is simply a signal cut out from part of FIG. 5(a). FIG. 5(c) is the pseudo-wideband frequency signal synthesized by the combining unit 16. When the apparatus operates in this way, the frequency extension unit 12 is unnecessary.

The gain determination unit 14 determines the gain coefficient applied to the high-band signal based on the power ratio of signals in different ranges within the low band (step S14). FIG. 6 shows the specific operation flow of the gain determination processing of step S14. FIG. 6 covers the case where the frequency ranges used for the power ratio are fixed. In the following, k_0 is the first frequency index and d_0 the width of one of the two low-band ranges whose power is calculated, and k_1 and d_1 are the first frequency index and the width of the other range. First, the values of k_0, d_0, k_1, and d_1 are set (step S142). For D = 160 as above, they are set to values such as k_0 = 40, d_0 = 40, k_1 = 80, and d_1 = 40. Each variable is then initialized (step S143). The accumulated power p_0 of the first range, frequency indices k_0 to (k_0 + d_0 − 1), is calculated (steps S144 to S146). Next, the accumulated power p_1 of the other range, k_1 to (k_1 + d_0 − 1), is calculated (steps S147 to S150). Once p_0 and p_1 are obtained, the power ratio r = p_1/p_0 is calculated in step S151. That is, the signal power ratio r is expressed by equation (4).
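Equation (4) is not reproduced in this text. The sketch below assumes that "accumulated power" means the sum of squared spectral coefficients over each range, which is consistent with r = p_1/p_0 in step S151; the function names and the toy spectrum are hypothetical. The range values are the example values given above, where d_0 and d_1 are equal.

```python
def band_power(spectrum, start, width):
    """Accumulated power of `width` coefficients from index `start`
    (assumed here to be the sum of squares)."""
    return sum(v * v for v in spectrum[start:start + width])


def power_ratio(spectrum, k0, d0, k1, d1):
    """r = p1 / p0 for two ranges inside the low band (step S151)."""
    p0 = band_power(spectrum, k0, d0)
    p1 = band_power(spectrum, k1, d1)
    if p0 == 0.0:
        raise ValueError("lower range has zero power; ratio undefined")
    return p1 / p0


# Toy spectrum with D = 160 whose upper half is weaker than its lower half.
spectrum = [2.0] * 80 + [1.0] * 80
r = power_ratio(spectrum, k0=40, d0=40, k1=80, d1=40)
print(r)  # 0.25
```

The zero-power guard is an addition for robustness; the text does not say how such a frame is handled.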

The gain determination unit 14 determines the gain coefficient k_HG by evaluating the value of the signal power ratio r against, for example, several thresholds. For example, the thresholds are set as in Table 1, and the gain coefficient k_HG is determined as shown in Table 2 from the number of thresholds that are greater than or equal to r (step S152).

For example, when the power ratio r = 1.0, the gain coefficient k_HG = 0.6.
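Tables 1 and 2 are not reproduced in this text, so the following sketch uses invented placeholder values purely to show the counting rule of step S152. The thresholds and gains below are hypothetical and are not the patent's values; they were chosen only so that r = 1.0 maps to 0.6, the one data point the text gives.

```python
# HYPOTHETICAL stand-in for Table 1 (thresholds); not the patent's values.
THRESHOLDS = (0.5, 1.0, 2.0, 4.0)
# HYPOTHETICAL stand-in for Table 2: gain indexed by the number of
# thresholds that are >= r. Not the patent's values.
GAINS_BY_COUNT = (2.0, 1.5, 1.0, 0.6, 0.3)


def gain_from_ratio(r, thresholds=THRESHOLDS, gains=GAINS_BY_COUNT):
    """Count the thresholds that are >= r and look up the gain k_HG."""
    count = sum(1 for t in thresholds if t >= r)
    return gains[count]


print(gain_from_ratio(1.0))  # 0.6 with these placeholder tables
print(gain_from_ratio(5.0))  # 2.0: a rising spectrum gets a larger gain
print(gain_from_ratio(0.1))  # 0.3: a decaying spectrum gets a smaller gain
```

A small r (energy concentrated in the lower range) leaves many thresholds at or above r, so the lookup is arranged to return smaller gains as the count grows.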

The gain determination unit 14 may instead determine the gain coefficient applied to the high-band signal based on the ratio of the sums of absolute signal amplitudes of signals in different ranges within the low band (FIG. 7, step S142′). FIG. 7 shows the operation flow of the gain determination processing in this case. FIG. 7 differs from FIG. 6 only in that step S142 becomes step S142′, step S144 becomes step S144′, and step S148 becomes step S148′. Step S144′ calculates, as p_0, the sum of the absolute signal amplitudes in the first range, frequency indices k_0 to (k_0 + d_0 − 1). Step S148′ calculates, as p_1, the sum of the absolute signal amplitudes in the other range, k_1 to (k_1 + d_0 − 1).

After the absolute-amplitude sums p_0 and p_1 of the two ranges are obtained, the gain determination unit 14 calculates in step S151 the ratio r′ of the absolute-amplitude sums, given by equation (5).

The gain determination unit 14 evaluates the ratio r′ against several thresholds to determine the gain coefficient, in the same way as for the signal power ratio r described above.
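Equation (5) is likewise not reproduced here. Under the assumption that r′ is simply p_1/p_0 with each p taken as a sum of absolute coefficient values, the calculation can be sketched as follows; the function names and the toy spectrum are hypothetical.

```python
def band_abs_sum(spectrum, start, width):
    """Sum of absolute amplitudes over `width` coefficients from `start`."""
    return sum(abs(v) for v in spectrum[start:start + width])


def abs_sum_ratio(spectrum, k0, d0, k1, d1):
    """r' = p1 / p0 using absolute-amplitude sums (steps S144', S148', S151)."""
    p0 = band_abs_sum(spectrum, k0, d0)
    p1 = band_abs_sum(spectrum, k1, d1)
    if p0 == 0.0:
        raise ValueError("lower range sums to zero; ratio undefined")
    return p1 / p0


spectrum = [-2.0] * 80 + [1.0] * 80
r_prime = abs_sum_ratio(spectrum, k0=40, d0=40, k1=80, d1=40)
print(r_prime)  # 0.5, whereas the power ratio of the same ranges is 0.25
```

Because squaring is avoided, this variant needs fewer multiplications, at the cost of compressing the ratio toward 1 relative to the power ratio.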

The gain multiplication unit 15 calculates and outputs the emphasized high-band signal FreqHighGain(k) of equation (6) from the input high-band signal FreqHigh(k) (k = 0, 1, ..., 2D−1) and the gain coefficient k_HG (FIG. 2, step S15).

\[
\mathrm{FreqHighGain}(k) = \mathrm{FreqHigh}(k) \cdot k_{HG}
\tag{6}
\]

The combining unit 16 adds the extended signal output by the frequency extension unit 12 and the emphasized high-band signal output by the gain multiplication unit 15 to generate the pseudo-wideband frequency signal PsFreq(k) given by equation (7) (step S16).

\[
\mathrm{PsFreq}(k) = \mathrm{InFreqExp}(k) + \mathrm{FreqHighGain}(k)
\tag{7}
\]
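Equations (6) and (7) amount to an element-wise scale followed by an element-wise sum. A minimal sketch, with hypothetical function names and toy values:

```python
def apply_gain(freq_high, k_hg):
    """Equation (6): FreqHighGain(k) = FreqHigh(k) * k_HG."""
    return [v * k_hg for v in freq_high]


def combine(in_freq_exp, freq_high_gain):
    """Equation (7): PsFreq(k) = InFreqExp(k) + FreqHighGain(k)."""
    if len(in_freq_exp) != len(freq_high_gain):
        raise ValueError("both signals must cover the same 2D indices")
    return [a + b for a, b in zip(in_freq_exp, freq_high_gain)]


in_freq_exp = [4.0, 2.0, 0.0, 0.0]   # low band plus MIN = 0 padding (D = 2)
freq_high = [0.0, 0.0, 4.0, 2.0]     # copied band, MIN = 0 elsewhere
ps_freq = combine(in_freq_exp, apply_gain(freq_high, 0.5))  # illustrative gain
print(ps_freq)  # [4.0, 2.0, 2.0, 1.0]
```

With MIN = 0 the two signals occupy disjoint index ranges, so the sum simply places the scaled copy above the untouched low band.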

The inverse frequency conversion unit 17 converts the pseudo-wideband frequency signal PsFreq(k) into a time-domain pseudo-wideband speech signal out(k) (k = 0, 1, ..., 2D−1) and outputs it (step S17).

With the speech and musical sound pseudo-wideband apparatus described above, for a speech signal such as that shown in FIG. 8(a), where the power ratio r or the absolute-amplitude-sum ratio r′ of signals in different ranges within the low band is 1 or less, the gain coefficient k_HG becomes 1 or less and the amplitude of the emphasized high-band signal copied into the high band is attenuated. As a result, the spectrum of the pseudo-wideband frequency signal as a whole has a structure in which the amplitude decreases toward higher frequencies, and the discontinuity is less likely to be emphasized. For the speech signal shown in FIG. 9(a), where the power ratio r or the ratio r′ is 1 or more, the gain coefficient k_HG becomes 1 or more, so the amplitude of the emphasized high-band signal increases. The spectrum of the pseudo-wideband frequency signal therefore has a continuous structure in which the amplitude increases toward higher frequencies. As a result, the unvoiced portions of the pseudo-wideband speech become easier to hear and the intelligibility of the speech improves.

Embodiment 1 was described for the case N = 2 in the frequency extension unit, which generates an extended signal by extending the low-band signal by a factor of N, but N = 3 or N = 4 may also be used. The description also fixed the frequency index ranges used to obtain the power ratio of signals in different ranges within the low band, but those ranges may be made variable. Embodiment 2, in which the frequency index ranges used for the power ratio are variable, is described next. The following embodiments are described only for the power ratio, but they also apply to the ratio of the sums of absolute signal amplitudes described above.

The speech and musical sound pseudo-wideband apparatus of Embodiment 2 differs from Embodiment 1 in that the gain determination unit 14 additionally includes a moving accumulated-power calculation unit 14a, shown with a broken line in FIG. 1. The rest of the configuration is the same as in Embodiment 1. FIG. 10 shows part of the operation flow of the moving accumulated-power calculation unit 14a.

The moving accumulated-power calculation unit 14a dynamically finds, within the low-frequency side of the low band, k = 0 to (D/2 − d_0 − 1), and within the high-frequency side, k = D/2 to (D − d_0 − 1), the frequency index range that gives the maximum accumulated power on each side. First, the accumulated power p_0 is initialized, together with the variable p_temp, which holds the accumulated power of a range of width d_0 found during the search (step S80). The accumulated power p_0, which is calculated for each range of width d_0, is initialized (step S81). The accumulated power p_0 over k = 0 to k = (d_0 − 1) is calculated first (steps S82 to S84). The variable p_temp is then compared with the accumulated power p_0 just obtained (step S85). Initially p_temp is 0, so p_temp < p_0 always holds, and the p_0 found as the interim maximum power is assigned to the variable: p_temp = p_0 (step S86). The first frequency index k_0 of the range whose power is calculated is then set to k_0 = i (step S87). Initially, therefore, k_0 = 0. This operation is repeated, incrementing i by 1 in step S89, until i = (D/2 − d_0 − 1) (step S88). In other words, the unit finds the first frequency index k_0 of the accumulation range of width d_0 used to obtain the accumulated power p_0. For example, if the second accumulated power p_0 is larger than the variable p_temp from the first pass, k_0 = 1 is set in step S87. In this way, the first frequency index k_0 that gives the maximum accumulated power p_0 is obtained.

Similarly, the leading frequency index k_1, from which the power within the high-frequency-side range k = D/2 to (D − d_0 − 1) is calculated, can be obtained (step S91 onward). The operation flow is the same as that described above and is therefore omitted. Once k_0 and k_1 have been obtained in this way, the processing from step S143 onward, already described with reference to FIG. 6, is performed to obtain the cumulative powers p_0 and p_1. This yields a power ratio r computed from the maximum powers of the low-frequency-side and high-frequency-side ranges of the low band. This method requires a comparatively large amount of computation. Embodiment 3, described next, makes the frequency index range used for the power ratio variable with less computation.
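
The window search of Embodiment 2 can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function names are hypothetical, the power of each bin is assumed to be the squared magnitude of the low-band spectrum, and the power ratio is assumed to be r = p_1 / p_0 (high side over low side), which matches the stated behavior that the gain grows when the high-frequency side is strong.

```python
def max_window_start(power, first, last, d0):
    """Return (start, cumulative power) of the d0-wide window with the largest
    cumulative power among the start indices first..last (inclusive)."""
    best_start, p_temp = first, 0.0
    for i in range(first, last + 1):
        p = sum(power[i:i + d0])      # cumulative power of the window starting at i
        if p_temp < p:                # strict comparison, as in step S85
            p_temp, best_start = p, i
    return best_start, p_temp


def power_ratio_by_window_search(spectrum, d0):
    """Find the leading indices k_0, k_1 and the power ratio r for one frame."""
    D = len(spectrum)                 # number of low-band frequency indices
    power = [abs(x) ** 2 for x in spectrum]   # assumption: power = |X(k)|^2
    k0, p0 = max_window_start(power, 0, D // 2 - d0 - 1, d0)
    k1, p1 = max_window_start(power, D // 2, D - d0 - 1, d0)
    r = p1 / p0 if p0 > 0 else 0.0    # assumption: r = p_1 / p_0
    return k0, k1, r


# Toy frame with D = 16 bins: a strong pair at k = 2, 3 and a weaker pair at k = 10, 11.
spectrum = [1, 1, 3, 3, 1, 1, 1, 1, 0, 0, 2, 2, 0, 0, 0, 0]
k0, k1, r = power_ratio_by_window_search(spectrum, d0=2)
print(k0, k1, r)
```

For this toy frame the search selects k_0 = 2 and k_1 = 10. The sketch recomputes every window sum from scratch, as the described flow does, which is where the comparatively large amount of computation comes from.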

The voice musical sound pseudo-wideband apparatus of Embodiment 3 includes a peak detection unit 14b in place of the cumulative power shift calculation unit 14a of Embodiment 2. The peak detection unit 14b is shown by a broken line in FIG. 1. The other components are the same as in Embodiment 1 or 2. The operation is described with reference to FIG. 11, which shows the operation flow of the peak detection unit 14b.

The peak detection unit 14b dynamically determines the frequency indices k_0P and k_1P that show the maximum power within the low-frequency side of the low band, k = 0 to (D/2 − d_0 − 1), and within the high-frequency side, k = D/2 to (D − d_0 − 1), respectively. First, the variables are initialized in step S93. p_peak is a variable that stores the maximum power value within the range d_0. The power is calculated while the frequency index k is increased (step S94) and is compared with p_peak (step S95). If the calculated p_k is larger than p_peak, p_k is assigned to p_peak in step S96 and the frequency index k with the larger power is recorded as k_0P (step S97). This processing is repeated, with k incremented by 1 (step S98), until k = (D/2 − d_0 − 1) (step S99). As a result, k_0P holds the frequency index showing the maximum power in the range k = 0 to (D/2 − d_0 − 1).

Similarly, the frequency index k_1P showing the maximum power within the high-frequency-side range k = D/2 to (D − d_0 − 1) can be obtained. The operation flow is the same as that described above and is therefore omitted. After k_0P and k_1P have been obtained, the cumulative powers p_0 and p_1 are calculated over ranges of, for example, width d_0 centered on k_0P and k_1P respectively, and the power ratio r is obtained. Alternatively, k_0P and k_1P may be used as the leading frequency indices k_0 and k_1, and the cumulative powers obtained in the same way as in Embodiment 2. Embodiment 3 can reduce the amount of computation to 1/d_0 of that of Embodiment 2.

Alternatively, the frequency indices k_0P and k_1P may be determined dynamically from the maximum signal amplitude instead of the maximum power, and the ratio r′ of the sums of absolute signal amplitudes centered on each index may be calculated.
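
The peak-based variant of Embodiment 3 can be sketched in the same style. Again this is a hedged illustration under stated assumptions: the function names are hypothetical, power is taken as the squared magnitude, the ratio is assumed to be r = p_1 / p_0, and the way the d_0-wide window is centered on the peak and clamped at the array edges is a choice made here, since the text only says "centered on" the peak index.

```python
def peak_index(power, first, last):
    """Index of the largest power among first..last (inclusive), scanning upward."""
    k_peak, p_peak = first, 0.0
    for k in range(first, last + 1):
        if power[k] > p_peak:         # comparison of step S95
            p_peak, k_peak = power[k], k
    return k_peak


def centered_power(power, center, d0):
    """Cumulative power over a d0-wide window around center (assumed centering rule)."""
    lo = max(0, center - d0 // 2)
    hi = min(len(power), lo + d0)
    return sum(power[lo:hi])


def power_ratio_by_peak(spectrum, d0):
    """Find the peak indices k_0P, k_1P and the power ratio r for one frame."""
    D = len(spectrum)
    power = [abs(x) ** 2 for x in spectrum]   # assumption: power = |X(k)|^2
    k0p = peak_index(power, 0, D // 2 - d0 - 1)
    k1p = peak_index(power, D // 2, D - d0 - 1)
    p0 = centered_power(power, k0p, d0)
    p1 = centered_power(power, k1p, d0)
    r = p1 / p0 if p0 > 0 else 0.0    # assumption: r = p_1 / p_0
    return k0p, k1p, r


spectrum = [1, 1, 3, 3, 1, 1, 1, 1, 0, 0, 2, 2, 0, 0, 0, 0]
print(power_ratio_by_peak(spectrum, d0=2))
```

Each half of the band is scanned once and only one window sum per half is computed, which is the source of the saving relative to Embodiment 2. Replacing `abs(x) ** 2` with `abs(x)` would give the amplitude-based ratio r′ under the same assumptions.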

Embodiment 1 showed an example in which the gain coefficient k_HG is determined by comparing the power ratio r with a plurality of thresholds and using the table shown in Table 2. As another method, the value obtained by multiplying the cumulative power ratio r by a positive real number α, as shown in Equation (8), may be used as the gain coefficient k_HG. The positive real number α is shown by a broken line inside the gain determination unit 14 in FIG. 1.
k_HG = α · r    (8)

For example, setting α to a value of 1 or less, such as α = 0.5, allows the gain coefficient k_HG to be set finely. Using the positive real number α as a parameter also makes k_HG easy to change, which simplifies setting and adjusting the gain coefficient. Using a value multiplied by the positive real number α as the gain coefficient is also applicable to the ratio r′ of the sums of absolute signal amplitudes.
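
As a rough illustration of Equation (8), the sketch below computes k_HG = α · r and applies it to a high band formed by copying the low band. The function name is hypothetical, and copying the whole low band unchanged is only one of the options the text allows (part or all of the low band may be copied).

```python
def pseudo_wideband_spectrum(low_band, r, alpha=0.5):
    """Return (wideband spectrum, k_HG) for one frame.

    low_band: low-band frequency-domain samples
    r:        power ratio (or amplitude-sum ratio r') from the gain determination
    alpha:    positive real tuning parameter of Equation (8)
    """
    if alpha <= 0:
        raise ValueError("alpha must be a positive real number")
    k_hg = alpha * r                          # Equation (8)
    high_band = [k_hg * x for x in low_band]  # copy the low band, then scale it
    return list(low_band) + high_band, k_hg


wide, k_hg = pseudo_wideband_spectrum([1.0, 2.0], r=0.4, alpha=0.5)
print(k_hg, wide)
```

Because k_HG scales linearly with r, a single parameter α sets how strongly the copied high band follows the measured spectral tilt, in contrast to the threshold table of Embodiment 1.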

[Simulation Results]
FIG. 12 shows the results of a five-point MOS subjective evaluation of speech quality with and without the pseudo-wideband processing performed by the voice musical sound pseudo-wideband apparatus proposed in this invention. The horizontal axis indicates whether the processing was applied, and the vertical axis is the MOS subjective evaluation value. Larger values indicate a better rating.

Simulation conditions: 24 members of the general public rated sound sources from four male and four female speakers, with and without the pseudo-wideband processing of this invention. Compared with MOS = 3.19 without the processing, the processing of this invention achieved MOS = 3.55, an improvement of 0.36 points. Thus the voice musical sound pseudo-wideband apparatus and method of this invention can improve the speech quality of pseudo-wideband speech.

The embodiments above were described using examples that assume a telephone system, such as a sampling frequency of 8 kHz and a frame length of 20 ms, but this invention is not limited to these examples. The invention can be widely used as a technique for widening the bandwidth of voice musical sound signals.

The apparatus and method of this invention are not limited to the embodiments described above and may be modified as appropriate without departing from the spirit of the invention. The processes described for the above apparatus and method need not be executed in time series in the order described; they may also be executed in parallel or individually, depending on the processing capability of the apparatus executing them or as needed.

When the processing means of the above apparatus are implemented by a computer, the processing content of the functions that each apparatus should have is described by a program. Executing this program on the computer implements the processing means of each apparatus on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, a hard disk device, flexible disk, magnetic tape, or the like can be used as the magnetic recording device; a DVD (Digital Versatile Disc), DVD-RAM (Random Access Memory), CD-ROM (Compact Disc Read Only Memory), CD-R (Recordable)/RW (ReWritable), or the like as the optical disc; an MO (Magneto Optical disc) or the like as the magneto-optical recording medium; and an EEP-ROM (Electronically Erasable and Programmable-Read Only Memory) or the like as the semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. The program may also be distributed by storing it in a storage device of a server computer and transferring it from the server computer to other computers over a network.

Each means may be configured by executing a predetermined program on a computer, or at least part of the processing content may be implemented in hardware.

FIG. 1: Diagram showing an example functional configuration of Embodiments 1 to 4 of the voice musical sound pseudo-wideband apparatus of this invention.
FIG. 2: Diagram showing the operation flow of Embodiment 1.
FIG. 3: Diagram showing the specific operation flow of the high-band signal generation process (step S13).
FIG. 4: Diagram schematically showing the result of the high-band signal generation process.
FIG. 5: Diagram schematically showing another method for the high-band signal generation process; (a) shows the low-band frequency-domain signal InFreq(k), (b) shows the high-band signal copied by the high-band signal generation unit 13, and (c) shows the pseudo-wideband frequency signal synthesized by the combining unit 16.
FIG. 6: Diagram showing the specific operation flow of the gain determination process (step S14).
FIG. 7: Diagram showing another specific operation flow of the gain determination process.
FIG. 8: Diagram showing an example of the spectral structure of a speech signal; (a) shows a structure in which amplitude decreases toward higher frequencies, and (b) shows the spectrum of (a) after pseudo-wideband processing.
FIG. 9: Diagram showing an example of the spectral structure of a speech signal; (a) shows a structure in which amplitude increases toward higher frequencies, and (b) shows the spectrum of (a) after pseudo-wideband processing.
FIG. 10: Diagram showing part of the operation flow of the cumulative power shift calculation unit 14a.
FIG. 11: Diagram showing the operation flow of the peak detection unit 14b.
FIG. 12: Diagram showing the results of the five-point MOS subjective evaluation.
FIG. 13: Diagram showing the procedure of a conventional speech pseudo-wideband method.
FIG. 14: Diagram showing an example of the frequency spectrum of a speech signal; (a) shows the frequency spectrum of a voiced segment in which signal amplitude attenuates from low to high frequencies, and (b) shows the frequency spectrum of the signal of (a) after pseudo-wideband processing by the conventional voice musical sound pseudo-wideband method.
FIG. 15: Diagram showing an example of the frequency spectrum of a speech signal; (a) shows the frequency spectrum of an unvoiced segment in which signal amplitude increases from low to high frequencies, and (b) shows the frequency spectrum of the signal of (a) after pseudo-wideband processing by the conventional voice musical sound pseudo-wideband method.

Claims

A voice musical sound pseudo-wideband apparatus comprising:
a frequency conversion unit that converts a discretized narrowband voice musical sound signal into a frequency-domain signal and generates a low-band signal;
a high-band signal generation unit that generates a high-band signal by copying part or all of the low-band signal;
a gain determination unit that determines a gain coefficient by which the high-band signal is multiplied, based on the power ratio, or the ratio of the sums of absolute amplitudes, of signals in different ranges within the low band;
a gain multiplication unit that multiplies the high-band signal by the gain coefficient to generate an emphasized high-band signal;
a combining unit that combines the low-band signal and the emphasized high-band signal to generate a pseudo-wideband frequency signal; and
a frequency inverse conversion unit that converts the pseudo-wideband frequency signal into a time-domain pseudo-wideband speech signal and outputs it,
wherein the gain determination unit decreases the gain coefficient when the power, or the sum of absolute amplitudes, of the low-frequency-side signal of the different ranges is large, and increases the gain coefficient when the power, or the sum of absolute amplitudes, of the high-frequency-side signal is large.

The voice musical sound pseudo-wideband apparatus according to claim 1,
wherein the gain determination unit further includes a cumulative power shift calculation unit that calculates the cumulative power ratio, or the ratio of the sums of absolute amplitudes, of signals in different ranges of the low band, and determines the gain coefficient based on the cumulative power ratio or the ratio of the sums of absolute amplitudes obtained by the cumulative power shift calculation unit.

The voice musical sound pseudo-wideband apparatus according to claim 1,
wherein the gain determination unit further includes a peak detection unit that detects the frequency of the maximum signal power or maximum amplitude in each of the different ranges within the low band, and determines the gain coefficient based on the cumulative power ratio, or the ratio of the sums of absolute amplitudes, of signals in predetermined ranges centered on the frequency of the maximum signal power or maximum amplitude on the low-frequency side and on the high-frequency side, respectively.

The voice musical sound pseudo-wideband apparatus according to any one of claims 1 to 3,
wherein the gain determination unit uses, as the gain coefficient, a value obtained by multiplying the cumulative power ratio or the ratio of the sums of absolute amplitudes by a positive real number α.

A voice musical sound pseudo-wideband method comprising:
a frequency conversion step in which a frequency conversion unit converts a discretized narrowband voice musical sound signal into a frequency-domain signal and generates a low-band signal;
a frequency extension step in which a frequency extension unit generates an extended signal by extending the low-band signal by a factor of N (where N is an integer of 2 or more);
a high-band signal generation step in which a high-band signal generation unit generates a high-band signal by copying the low-band signal;
a gain determination step in which a gain determination unit, based on the power ratio, or the ratio of the sums of absolute amplitudes, of signals in different ranges of the low band, decreases the gain coefficient by which the high-band signal is multiplied when the power of the low-frequency-side signal is larger, and increases the gain coefficient when the power of the high-frequency-side signal is larger;
a gain multiplication step in which a gain multiplication unit multiplies the high-band signal by the gain coefficient to generate an emphasized high-band signal;
a combining step in which a combining unit combines the extended signal and the emphasized high-band signal to generate a pseudo-wideband frequency signal; and
a frequency inverse conversion step in which a frequency inverse conversion unit converts the pseudo-wideband frequency signal into a time-domain pseudo-wideband speech signal and outputs it.

The voice musical sound pseudo-wideband method according to claim 5,
wherein the gain determination step further includes a cumulative power shift calculation step in which a cumulative power shift calculation unit calculates the cumulative power ratio, or the ratio of the sums of absolute amplitudes, of signals in different ranges of the low band, and the gain coefficient is determined based on the cumulative power ratio or the ratio of the sums of absolute amplitudes obtained in the cumulative power shift calculation step.

The voice musical sound pseudo-wideband method according to claim 5,
wherein the gain determination step further includes a peak detection step in which a peak detection unit detects the frequency of the maximum signal power or maximum amplitude in the low band, and a step of obtaining the cumulative power ratio, or the ratio of the sums of absolute amplitudes, of signals in predetermined ranges centered on the frequency of the maximum signal power or maximum amplitude on the low-frequency side and on the high-frequency side, respectively.

The voice musical sound pseudo-wideband method according to any one of claims 5 to 7,
wherein the gain determination step further includes a step of obtaining the gain coefficient by multiplying the cumulative power ratio or the ratio of the sums of absolute amplitudes by a positive real number α.

A program for causing a computer to function as the voice musical sound pseudo-wideband apparatus according to any one of claims 1 to 4.

A computer-readable recording medium on which the program according to claim 9 is recorded.