JPH0242239B2

Info

Publication number: JPH0242239B2
Application number: JP58115538A
Authority: JP
Priority date: 1983-06-27
Filing date: 1983-06-27
Publication date: 1990-09-21
Also published as: JPS607500A

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a multi-pulse vocoder. Vocoders are well known that analyze an input speech signal, extract on the analysis side the spectral envelope information and sound source information that constitute the speech information of that signal, and send this information over a transmission path to the synthesis side, where the input speech signal is reproduced.

The spectral envelope information mentioned above represents the spectral distribution of the vocal tract system that generates the input speech signal. It is usually expressed by LPC coefficients, such as α parameters or K parameters, whose number corresponds to the analysis order of the LPC analysis. The sound source information represents the fine structure of the spectral envelope and is what is known as the residual signal, obtained by removing the spectral distribution information from the input speech signal. It contains information on the source strength, the pitch period, and the voiced/unvoiced state of the input speech signal, and it is well known that this information is usually extracted via the autocorrelation coefficients of each analysis frame of the input speech signal.

When the input speech signal is synthesized on the synthesis side of the vocoder, the spectral envelope information is used as the coefficients of an LPC synthesizer, which usually employs an all-pole digital filter to approximate the vocal tract system, and the sound source information is used as the driving sound source of this digital filter. The input speech signal is synthesized by this digital filter.
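As a rough illustration of the all-pole synthesis described above, the following Python sketch filters a driving sound source through H(z) = 1 / (1 - Σ a_k z^-k). The function name `lpc_synthesize`, the use of direct-form coefficients a_k rather than K parameters, and the sample values are assumptions made for this example; they are not taken from the patent.

```python
def lpc_synthesize(a, excitation):
    """Filter `excitation` through H(z) = 1 / (1 - sum_k a[k-1] * z^-k).

    a          -- hypothetical direct-form LPC coefficients a_1 .. a_p
    excitation -- driving sound source samples for one frame
    """
    out = []
    for n, d in enumerate(excitation):
        acc = d
        # Feed back the p most recent output samples.
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * out[n - k]
        out.append(acc)
    return out


# A single unit impulse as excitation returns the filter's impulse response h(n).
print(lpc_synthesize([0.5], [1.0, 0.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25, 0.125]
```

Driving the same function with a multi-pulse sequence instead of a unit impulse gives the synthesized frame.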

Conventional LPC vocoders of this kind can synthesize speech even at low bit rates of about 4 Kb (kilobits) or less and are widely used, but they have the drawback that high-quality speech synthesis is difficult even at high bit rates. The cause lies in the simple model assumed for the sound source information: voiced sound is approximated by a single impulse train at the extracted pitch period, and unvoiced sound, whose period is random, is approximated by white noise. The sound source information of the input speech signal is therefore not extracted faithfully, and the waveform information contained in it is neither analyzed nor synthesized.

The multi-pulse vocoder has recently become well known as one type of vocoder that addresses this problem of not transmitting the waveform: it synthesizes the input speech signal using waveform transmission.

FIG. 1 is a block diagram showing the basic configuration of a conventional multi-pulse vocoder.

The LPC synthesizer 1 has an all-pole digital filter that simulates the vocal tract. Its coefficients are the LPC coefficients obtained by the LPC analyzer 2, which analyzes, frame by frame, the input speech signal x(n) (n = 1, 2, 3, ...) received via the input terminal 2001. The sound source pulse generator 3 derives from the sound source information of the input speech signal a driving sound source sequence V(n) consisting of a plurality of impulses, that is, a multi-pulse, and supplies it to the LPC synthesizer 1 as its driving sound source.

The LPC synthesizer 1 uses the LPC coefficients thus supplied as the coefficients of its synthesis filter, which is usually an all-pole digital filter. Driven by the multi-pulse as its driving sound source, it outputs the synthesized signal x̃(n). Because the multi-pulse carries the waveform information of the input speech signal, the LPC synthesizer 1 synthesizes the input speech signal including its waveform information.

The subtracter 4 then takes the difference between the synthesized signal x̃(n) output from the LPC synthesizer 1 and the input speech signal x(n), and sends the resulting error e(n) to the perceptual weighter 5.

The perceptual weighter 5 applies perceptual weighting to the error e(n) with a weighting filter having the characteristic W(Z) given by equation (1), and sends the result to the squared-error minimizer 6.

$$W(Z) = \frac{1 - \sum_{k=1}^{p} a_k Z^{-k}}{1 - \sum_{k=1}^{p} a_k \gamma^{k} Z^{-k}} \qquad (1)$$

In equation (1), a_k are the LPC coefficients used as the coefficients of the all-pole digital filter of the LPC synthesizer 1, p is the order of that filter and hence the LPC analysis order, γ is the weighting coefficient, and Z is the variable of the Z-transform transfer function H(Z^{-1}) of the all-pole digital filter, with Z = exp(jλ), where λ = 2πΔTf, ΔT is the sampling period used for the analysis frame, and f is the frequency.

The weighting coefficient γ in equation (1) is set in the range 0 < γ < 1.

W(Z) in equation (1) varies between W(Z) = 1 for γ = 1 and W(Z) = 1 − P(Z) for γ = 0, where P(Z) = Σ_{k=1}^{p} a_k Z^{-k}. The value of γ is set within this range according to the degree to which the excessive level appearing in the formant regions of the frequency spectrum of the error e(n) is to be suppressed. It thus performs the perceptual weighting of the signal to be synthesized, and its optimum value is usually selected in advance by listening tests.
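The weighting filter of equation (1) can be sketched in the time domain as a difference equation. The Python below is only an illustration under assumed names (`perceptual_weight`, `a`, `gamma`) and assumed sample values; the two calls at the end check the limiting cases γ = 1 and γ = 0 discussed above, which lie at the edges of the range 0 < γ < 1 and are used here purely as a sanity check.

```python
def perceptual_weight(a, gamma, e):
    """Apply W(z) = (1 - sum a_k z^-k) / (1 - sum a_k gamma^k z^-k) to the error e(n)."""
    out = []
    for n in range(len(e)):
        acc = e[n]
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * e[n - k]                  # numerator: 1 - sum a_k z^-k
                acc += ak * gamma ** k * out[n - k]   # denominator: 1 - sum a_k gamma^k z^-k
        out.append(acc)
    return out


# gamma = 1: numerator and denominator cancel, so W(z) = 1.
print(perceptual_weight([0.5], 1.0, [1.0, 0.0, 0.0]))  # [1.0, 0.0, 0.0]
# gamma = 0: only the numerator 1 - P(z) remains.
print(perceptual_weight([0.5], 0.0, [1.0, 0.0, 0.0]))  # [1.0, -0.5, 0.0]
```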

The error e(n) weighted in this way is sent to the squared-error minimizer 6 in order to determine the driving sound source sequence V(n) output by the sound source pulse generator 3, that is, the optimum time positions and amplitudes of the multi-pulse. The squared error ε is computed by equation (2), and the driving sound source sequence V(n) is selected so as to minimize ε.

$$\varepsilon = \sum_{n=1}^{N} \left[ e(n) \ast w(n) \right]^{2} \qquad (2)$$

In equation (2), the symbol * denotes convolution with the weighting filter of the perceptual weighter 5, and N is the length of the interval over which the multi-pulse is computed.

This process is repeated for each pulse of the multi-pulse, so that synthesis is carried out as part of the analysis for every pulse. This is the so-called Analysis-by-Synthesis method (hereinafter abbreviated as the A-b-S method). As is clear from the above, the A-b-S method runs a loop of pulse generation, squared-error calculation, and pulse position and amplitude adjustment for each pulse of the multi-pulse. Although it is an effective technique in the low-bit-rate region, it has the drawback that its computational load is extremely large.

The A-b-S method is described in detail in, for example, B. S. Atal et al., "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates," Proc. ICASSP 82, pp. 614-617 (1982).

To address this drawback of the conventional A-b-S method, the following algorithm, which computes the optimum multi-pulse efficiently on the basis of correlation calculations, has recently been introduced.

The input speech signal x(n) is divided into processing frames of N samples each, and the multi-pulse is computed collectively for each frame.

Suppose that one analysis frame contains k sound source pulses, and that the i-th pulse is located at time position m_i from the frame edge and has amplitude g_i. The driving sound source d(n) of the LPC synthesis filter is then given by equation (3).

$$d(n) = \sum_{i=1}^{k} g_i \, \delta_{n, m_i} \qquad (3)$$

In equation (3), δ_{n,m_i} is the Kronecker delta: δ_{n,m_i} = 1 for n = m_i and δ_{n,m_i} = 0 for n ≠ m_i.
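Equation (3) simply places each amplitude g_i at its position m_i in an otherwise zero frame. A minimal Python sketch, assuming a hypothetical helper name `driving_source` and 0-based positions (the text counts positions from the frame edge), might look like this:

```python
def driving_source(pulses, frame_len):
    """Build d(n) for one frame from (position, amplitude) pairs.

    Positions are 0-based sample indices here, which is an assumption of this sketch.
    """
    d = [0.0] * frame_len
    for pos, amp in pulses:
        d[pos] += amp  # the Kronecker delta selects only n == m_i
    return d


print(driving_source([(2, 2.0), (5, -1.0)], 8))
# [0.0, 0.0, 2.0, 0.0, 0.0, -1.0, 0.0, 0.0]
```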

The LPC synthesis filter is driven by this driving sound source d(n) and outputs the synthesized signal x̃(n).

Take, for example, an all-pole digital filter as the LPC synthesis filter, and let its transfer function be represented by the impulse response h(n) (0 ≤ n ≤ M − 1). The synthesized signal x̃(n) is then given by equation (4).

$$\tilde{x}(n) = \sum_{l=1}^{M-1} d(l) \, h(n - l) \qquad (4)$$

In equation (4), d(l) represents the driving sound source.

Next, let e_w(n) be the weighted error obtained by applying the perceptual correction to the error between the input speech signal x(n) and the synthesized signal x̃(n). Then e_w(n) is given by equation (5).

$$e_w(n) = \{ x(n) - \tilde{x}(n) \} \ast w(n) \qquad (5)$$

The squared error follows from equation (5) and can be written as equation (6).

$$\sum_{n=1}^{M} e_w^{2}(n) = \sum_{n=1}^{M} \left[ \{ x(n) - \tilde{x}(n) \} \ast w(n) \right]^{2} \qquad (6)$$

In equation (6), M is the number of samples in the interval over which the error is minimized and is chosen, for example, as one analysis frame length. The optimum sound source pulse train, that is, the multi-pulse, is obtained by finding the g_i that minimizes equation (6). From equations (3), (4), and (6), g_i is derived as equation (7).

$$g_i(m_i) = \frac{\sum_{n=1}^{M} x_w(n) \, h_w(n - m_i) - \sum_{l=1}^{i-1} g_l \sum_{n=1}^{M} h_w(n - m_l) \, h_w(n - m_i)}{\sum_{n=1}^{M} h_w(n - m_i) \, h_w(n - m_i)} \qquad (7)$$

In equation (7), x_w(n) denotes x(n) * w(n) and h_w(n) denotes h(n) * w(n). The first term of the numerator on the right-hand side is the cross-correlation function φ_hx(m_i) between x_w(n) and h_w(n) at lag m_i. The inner sum of the second term, Σ_{n=1}^{M} h_w(n − m_l) h_w(n − m_i), is the covariance function φ_hh(m_l, m_i) (1 ≤ m_l, m_i ≤ M) of h_w(n). The covariance function φ_hh(m_l, m_i) becomes equal to the autocorrelation function R_hh(|m_l − m_i|), so equation (7) can be written as equation (8).

$$g_i(m_i) = \frac{\varphi_{hx}(m_i) - \sum_{l=1}^{i-1} g_l \, R_{hh}(|m_l - m_i|)}{R_{hh}(0)} \qquad (8)$$

According to equation (8), when a pulse is generated at time position m_i, its optimum amplitude g_i(m_i) can be determined. In equation (8), 1 ≤ m_i ≤ M.

In other words, for each sound source pulse, the amplitude is computed by equation (8) at the various time positions, and the position at which the absolute value of the amplitude is largest gives the pulse that minimizes the squared error of equation (6). Repeating this procedure yields the plurality of sound source pulses.
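The repeated search described above can be sketched as a greedy loop over equation (8). The Python below is an illustration only: the function name `multipulse_search`, the 0-based positions, the choice to skip positions that already hold a pulse, and the sample correlation values are all assumptions of this sketch rather than details given in the text.

```python
def multipulse_search(phi_hx, r_hh, num_pulses):
    """Greedy pulse search based on equation (8).

    phi_hx[m]  -- cross-correlation at candidate position m
    r_hh[lag]  -- autocorrelation of the weighted impulse response (zero beyond its length)
    Returns a list of (position, amplitude) pairs in the order they were found.
    """
    def r(lag):
        return r_hh[lag] if lag < len(r_hh) else 0.0

    pulses = []
    for _ in range(num_pulses):
        used = {pos for pos, _amp in pulses}
        best = None
        for m in range(len(phi_hx)):
            if m in used:  # assumption: at most one pulse per position
                continue
            # Numerator of equation (8): remove the contribution of earlier pulses.
            residual = phi_hx[m] - sum(g * r(abs(pos - m)) for pos, g in pulses)
            amp = residual / r_hh[0]
            if best is None or abs(amp) > abs(best[1]):
                best = (m, amp)
        if best is None:
            break
        pulses.append(best)
    return pulses


# Correlations for h_w = [1.0, 0.5] and true pulses 2.0 at n = 2 and -1.0 at n = 5.
phi = [0.0, 1.0, 2.5, 1.0, -0.5, -1.25, -0.5, 0.0]
print(multipulse_search(phi, [1.25, 0.5], 2))  # [(2, 2.0), (5, -1.0)]
```

In this small example the search recovers both pulses exactly because their responses do not overlap; in general the greedy choice is only an approximation to the joint optimum.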

This algorithm is described in detail in Ozawa, Araseki, and Ono, "A Study of a Multi-Pulse-Driven Speech Coding Method," Communication Systems Study Group of the Institute of Electronics and Communication Engineers of Japan, March 1983.

With multi-pulse generation based on this algorithm, the optimum multi-pulse can be computed from the cross-correlation function, the autocorrelation function, and a maximum-value search. The configuration is therefore greatly simplified, and a multi-pulse vocoder with a substantially reduced computational load can be realized.

However, even a multi-pulse vocoder improved in this way still has the following drawback.

When the number of pulses present in a frame exceeds the number of pulses generated in the frame, the synthesized signal no longer faithfully transmits the waveform of the sound source information of the input speech signal, and the speech quality of the synthesized signal degrades to a degree corresponding to the difference between the two pulse counts.

In a multi-pulse vocoder, the number of sound source drive pulses to be generated in one frame, for example with an analysis period of 20 ms, is normally a fixed number chosen in advance from the range 4 to 16 according to the bit rate. When the input speech has a short pitch period, as with a female or a child's voice, it is not unusual for the pitch period of the sound source signal to be about 2.5 ms. In that case at least eight driving sound source pulses are needed in one analysis frame. If the number of driving sound source pulses to be generated in the analysis frame is set to fewer than eight, for example four, a multi-pulse vocoder using such pulses produces synthesized speech with an effect similar to a double-pitch error, and the synthesized speech quality degrades markedly.
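The figure of eight pulses follows directly from dividing the frame length by the pitch period. As a small worked example in Python (the function name `pitch_pulses_per_frame` is made up for this illustration):

```python
import math


def pitch_pulses_per_frame(frame_ms, pitch_ms):
    """Number of whole pitch periods, and hence pitch pulses, that fit in one frame."""
    return math.floor(frame_ms / pitch_ms)


needed = pitch_pulses_per_frame(20, 2.5)
print(needed)      # 8
print(needed > 4)  # True: a preset of four pulses cannot cover every pitch pulse
```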

The object of the present invention is to eliminate this drawback and to provide a multi-pulse vocoder of simple configuration in which the degradation of synthesized speech quality is greatly reduced. To this end, the vocoder is provided with means which, when the number of pulses present in an analysis frame exceeds the number of driving sound source pulses to be generated in that frame, uses in place of the multi-pulse driving sound source pulses generated from a model based on the pitch period extracted from the input speech signal.

A specific form of the multi-pulse vocoder of the present invention is a multi-pulse vocoder that analyzes and synthesizes an input speech signal by using, as spectral envelope information, the LPC coefficients extracted by LPC analysis of the input speech signal for each analysis frame, and by representing the sound source information, which together with the spectral envelope information constitutes the speech information of the input speech signal, for each analysis frame as a plurality of impulses (a multi-pulse) whose time positions and amplitudes correspond to the features of that sound source information. The vocoder comprises means which determines, from the pitch period extracted for each analysis frame of the input speech signal, the number of pulses of the sound source information present in the analysis frame and which, when this number exceeds the number of multi-pulses to be generated in the analysis frame, generates in place of the multi-pulse a multi-pulse consisting of a plurality of impulses corresponding to the pitch period.

The present invention will now be described in detail with reference to the drawings. FIG. 2 is a block diagram showing an embodiment of the analysis side of the multi-pulse vocoder according to the present invention, and FIG. 3 is a block diagram showing an embodiment of its synthesis side.

The analysis side of the multi-pulse vocoder according to the present invention, shown in FIG. 2, comprises an LPC analyzer 7, a cross-correlation function calculator 8, a pitch extractor 9, a first encoder 10, an autocorrelation function calculator 11, a sound source pulse generator 12, a second encoder 13, a third encoder 14, and a multiplexer 15.

The input speech signal received via the input terminal 7001 is supplied to the LPC analyzer 7, the cross-correlation function calculator 8, and the pitch extractor 9.

The LPC analyzer 7 quantizes the input speech signal, frame by frame, into digital values with a preset number of bits, performs LPC analysis on the quantized speech signal to extract p-th order K parameters (partial autocorrelation coefficients) as the LPC coefficients, and supplies them to the first encoder 10 via the output line 701. In this embodiment the analysis frame is set to 20 ms.

The first encoder 10 quantizes and encodes the input LPC coefficients and then sends them to the multiplexer 15 via the output line 1001.

The LPC analyzer 7 also computes the impulse response h(n) (1 ≤ n ≤ N) from the LPC coefficients and supplies it, via the output line 702, the first encoder 10, and the output line 1002, to the cross-correlation function calculator 8 and the autocorrelation function calculator 11.

The cross-correlation function calculator 8 computes the cross-correlation function φ_hx from the input speech signal and the impulse response h(n), and sends it to the sound source pulse generator 12 via the output line 801.

The autocorrelation function calculator 11 computes the autocorrelation function R_hh of the input impulse response h(n) and sends it to the sound source pulse calculator 12 via the output line 1101.
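The two correlation functions produced by the calculators 8 and 11 correspond to the sums in equation (7). A minimal Python sketch is given below; the function names, the 0-based indexing, and the assumption that the inputs have already been perceptually weighted (x_w and h_w) are choices made for this illustration.

```python
def cross_correlation(x_w, h_w, m):
    """phi_hx(m) = sum_n x_w[n] * h_w[n - m], with h_w taken as zero outside its support."""
    upper = min(len(x_w), m + len(h_w))
    return sum(x_w[n] * h_w[n - m] for n in range(m, upper))


def autocorrelation(h_w, lag):
    """R_hh(lag) = sum_n h_w[n] * h_w[n + lag]."""
    return sum(h_w[n] * h_w[n + lag] for n in range(len(h_w) - lag))


# Example: h_w = [1.0, 0.5]; x_w is the response to pulses 2.0 at n = 2 and -1.0 at n = 5.
h_w = [1.0, 0.5]
x_w = [0.0, 0.0, 2.0, 1.0, 0.0, -1.0, -0.5, 0.0]
print(cross_correlation(x_w, h_w, 2))  # 2.5
print(autocorrelation(h_w, 0))         # 1.25
print(autocorrelation(h_w, 1))         # 0.5
```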

The sound source pulse calculator 12 uses the cross-correlation function φ_hx and the autocorrelation function R_hh thus supplied for each analysis frame to evaluate equation (8) and obtain a predetermined number of sound source pulses. It sends the amplitude and position information of these pulses via the output line 1201 to the second encoder 13, which quantizes and encodes it and then sends it to the multiplexer 15 via the output line 1301.

The LPC coefficients and multi-pulse data that are quantized, encoded, and sent to the multiplexer 15 in this way are time-division multiplexed by the multiplexer 15 in a predetermined format, as data representing the spectral envelope and sound source information of the input speech signal, and are transmitted over the transmission path 1501 from the analysis side shown in FIG. 2 to the synthesis side shown in FIG. 3. In the processing on the analysis side, however, the number of multi-pulses to be generated in an analysis frame is fixed in advance according to the transmission bit rate. As stated above, when a high-pitched input speech signal such as a female or a child's voice is received and the number of pulses of the sound source information present in the analysis frame is larger than this fixed number, the sound source information can no longer be analyzed faithfully for waveform transmission, and the speech quality reproduced on the synthesis side degrades according to the difference between the two pulse counts.

This embodiment therefore includes the pitch extractor 9 and the third encoder 14 shown in FIG. 2, together with the switch 20 and the alternative sound source pulse generator 23 shown in FIG. 3 and described later, and eliminates this drawback as follows.

On receiving the input speech signal, the pitch extractor 9 computes its autocorrelation function for each analysis frame and obtains the pitch period from it. This is a widely used technique based on the principle that, if the input speech signal is periodic, its autocorrelation function takes its maximum value at a lag equal to the pitch period.
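The autocorrelation principle stated above can be illustrated with a few lines of Python. This is a bare sketch: the function name `estimate_pitch_period`, the lag search range, and the synthetic test frame are assumptions, and a practical pitch extractor would add voicing decisions and protection against octave errors.

```python
def estimate_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] at which the frame's autocorrelation is largest."""
    def acf(lag):
        return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

    return max(range(min_lag, max_lag + 1), key=acf)


# Synthetic frame with one impulse every 5 samples.
frame = [1.0 if n % 5 == 0 else 0.0 for n in range(40)]
print(estimate_pitch_period(frame, 2, 20))  # 5
```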

The pitch extractor 9 extracts the pitch period for each analysis frame of the input speech signal. When it judges that this pitch period is shorter than the pitch period corresponding to the fixed number of multi-pulses preset in the multi-pulse vocoder shown in FIG. 2, this information is transmitted from the analysis side shown in FIG. 2 to the synthesis side shown in FIG. 3. There, instead of the preset number of multi-pulses, a multi-pulse modeled on the detected pitch period is generated and used as the sound source information, that is, as the driving sound source pulses, for LPC synthesis. The operation is as follows.

The pitch extractor 9 extracts the pitch period of each analysis frame and compares it with a pitch period decision threshold corresponding to the preset number of multi-pulses. For an analysis frame whose pitch period is shorter than this threshold, it sends the pitch information, the time position information of the first and last sound source pulses present in the analysis frame (or of the first pulse only), their amplitude information, and so on, to the third encoder 14 via the output line 901. It also outputs on the output line 902 a multi-pulse alternative control signal at binary logic level "1", which notifies the synthesis side that the number of sound source pulses in the analysis frame exceeds the number of multi-pulses, and sends it to the multiplexer 15.
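The threshold comparison can be sketched as follows. The text only says that the threshold corresponds to the preset number of multi-pulses; taking it as the frame length divided by that number is an assumption of this sketch, as are the function name `needs_alternative_pulses` and the default values.

```python
def needs_alternative_pulses(pitch_period_ms, frame_ms=20.0, preset_pulses=4):
    """Return the control signal level: 1 if the pitch period is below the threshold, else 0."""
    # Assumed threshold: the pitch period at which exactly `preset_pulses`
    # pitch pulses fit in one frame.
    threshold_ms = frame_ms / preset_pulses
    return 1 if pitch_period_ms < threshold_ms else 0


print(needs_alternative_pulses(2.5))  # 1: eight pitch pulses, only four multi-pulses
print(needs_alternative_pulses(5.0))  # 0: exactly four pitch pulses
print(needs_alternative_pulses(8.0))  # 0
```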

The third encoder 14 quantizes and encodes the data on the above information received from the pitch extractor 9 and sends it to the multiplexer 15 via the output line 1401.

Thus, when the pitch period extracted by the pitch extractor 9 is equal to or longer than the predetermined value, that is, when the number of sound source pulses found by the analysis to be present in the analysis frame is equal to or smaller than the preset number of multi-pulses to be generated in the analysis frame, the multiplexer 15 transmits the LPC coefficient data received via the output line 1001 and the multi-pulse data received via the output line 1301 together over the transmission path 1501 in a predetermined time-division format. When the pitch period extracted by the pitch extractor 9 is shorter than the predetermined value, that is, when the number of sound source pulses present in the analysis frame exceeds the preset number of multi-pulses, the multiplexer 15 additionally includes in the time-division transmission the data on the above information received via the output line 1401 and the multi-pulse alternative control signal received via the output line 902.

The synthesis side shown in FIG. 3 synthesizes the input speech signal from the data transmitted from the analysis side over the transmission path 1501. It comprises a demultiplexer 16, a first decoder 17, a second decoder 18, a third decoder 19, a switch 20, an LPC synthesizer 21, an LPF (low-pass filter) 22, an alternative sound source pulse generator 23, and so on.

The demultiplexer 16 restores the various data received over the transmission path 1501 to their state before conversion into the time-division transmission format by the multiplexer 15. The LPC coefficient data are supplied to the first decoder 17 via the output line 161, and the multi-pulse data to the second decoder 18 via the output line 162. For analysis frames whose extracted pitch period is below the decision threshold, the pitch period, the time position information of the first and last sound source pulses (or of the first pulse only), and their amplitude information are supplied to the third decoder 19 via the output line 163. These decoders decode the data and output them on the output lines 171, 181, and 191, respectively.

The multi-pulse alternative control signal data output from the demultiplexer 16 are sent via the output line 164 to the switch 20 and the alternative sound source pulse generator 23.

When no multi-pulse alternative control signal is received via the output line 164, the switch 20 switches so that the multi-pulse data received via the output line 181 are sent to the LPC synthesizer 21 via the output line 201. This is the case in which the pitch period of the analysis frame found by the pitch extractor 9 on the analysis side is equal to or longer than the pitch period corresponding to the multi-pulse, so that the number of multi-pulses generated in the analysis frame is equal to or greater than the number of sound source pulses present in it and the multi-pulse can be used as the sound source information without risk of degrading the synthesized speech quality.

LPC合成器２１は、このようにして入力する
マルチパルスを音源情報としてｐ次の全極型デジ
タルフイルタの駆動音源に利用し、また出力ライ
ン１７１を介して入力するｐ次のLPC係数デー
タを上記全極型デジタルフイルタの係数としてこ
のLPC合成フイルタを制御して入力音声信号を
合成し、これを出力ライン２１１を介してLPF
２２に送出し、所定の低域フイルタリングを行つ
てアナログ量の合成音声として出力ライン２２１
に送出する。 The LPC synthesizer 21 uses the multipulses input in this way as the excitation source of a p-th order all-pole digital filter, and uses the p-th order LPC coefficient data input via the output line 171 as the coefficients of that filter. It thereby controls the LPC synthesis filter to synthesize the input speech signal and sends the result via the output line 211 to the LPF 22, which applies a predetermined low-pass filtering and outputs the analog synthesized speech on the output line 221.
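The all-pole synthesis described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `lpc_synthesize`, the use of α parameters as direct-form coefficients, and the sign convention s[n] = e[n] + Σ α_k s[n-k] are assumptions made for illustration, as are the frame length and pulse values in the example.

```python
from typing import List, Sequence


def lpc_synthesize(excitation: Sequence[float], alpha: Sequence[float]) -> List[float]:
    """Drive a p-th order all-pole filter with an excitation sequence.

    Assumed convention (not stated in the patent):
        s[n] = e[n] + sum_{k=1..p} alpha[k-1] * s[n-k]
    """
    p = len(alpha)
    out: List[float] = []
    for n, e in enumerate(excitation):
        acc = e
        # Feed back up to p previous output samples.
        for k in range(1, min(p, n) + 1):
            acc += alpha[k - 1] * out[n - k]
        out.append(acc)
    return out


# Hypothetical example: four multipulses in a 16-sample frame,
# filtered by a first-order all-pole filter.
excitation = [0.0] * 16
for pos, amp in [(0, 1.0), (4, 0.5), (8, 0.5), (12, 0.25)]:
    excitation[pos] = amp
speech = lpc_synthesize(excitation, alpha=[0.5])
```

A practical synthesizer would also carry the filter state across frame boundaries and would be followed by the low-pass filter; both are omitted here for brevity.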

さて、出力ライン１６４を介して論理値“１”
のマルチパルス代替制御信号が切替器２０に供給
されるとき、すなわち分析側におけるピツチ抽出
器９によつて抽出された分析フレームのピツチ周
期が判定域値よりも小さく、従つて分析フレーム
内で発生されるマルチパルスの数よりも分析フレ
ーム内に存在する音源パルスの数が多いときに
は、切替器２０は次に述べるような代替音源パル
スを出力ライン２３１を介して代替音源パルス発
生器２３から受け、これを出力ライン２０１を介
して送出する音源パルスデータに代えて代替音源
パルスとして出力ライン２０２を介してLPC合
成器２１に供給するように切替える。 When a multipulse alternative control signal of logic value "1" is supplied to the switch 20 via the output line 164, the switch 20 receives the alternative sound source pulses described below from the alternative sound source pulse generator 23 via the output line 231 and supplies them to the LPC synthesizer 21 via the output line 202, in place of the sound source pulse data otherwise sent via the output line 201. This is the case in which the pitch period of the analysis frame, as extracted by the pitch extractor 9 on the analysis side, is smaller than the decision threshold, so that the number of sound source pulses present in the analysis frame exceeds the number of multipulses generated within the frame.
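The switching condition can be sketched as follows. The patent states only that the pitch period is compared with a decision threshold. Taking that threshold as the frame length divided by the number of multipulses is an assumption of this sketch, and the function names, the 160-sample frame, and the count of four multipulses are hypothetical.

```python
from typing import Sequence


def needs_alternative_pulses(pitch_period: int, frame_len: int, num_multipulses: int) -> bool:
    """Return True when the frame holds more pitch pulses than multipulses.

    All lengths are in samples. The threshold frame_len / num_multipulses
    is an assumed choice, not a value given in the patent.
    """
    threshold = frame_len / num_multipulses
    return pitch_period < threshold


def select_excitation(control_flag: bool,
                      multipulses: Sequence[float],
                      alternative: Sequence[float]) -> Sequence[float]:
    """Model of the switch 20: pass the alternative pulses only when flagged."""
    return alternative if control_flag else multipulses


# Hypothetical frame of 160 samples with 4 multipulses: threshold is 40 samples.
short_pitch_flag = needs_alternative_pulses(32, 160, 4)   # short pitch period
long_pitch_flag = needs_alternative_pulses(80, 160, 4)    # long pitch period
```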

代替音源パルス発生器２３は、出力ライン１６
４を介してピツチ周期が判定域値よりも小さい分
析フレームのピツチ周期に関するデータを、また
出力ライン１９１を介してこの分析フレームに含
まれる音源パルスの先頭パルスと最終パルスの時
間位置情報ならびに振幅情報に関するデータ、も
しくは先頭パルスのみの時間位置情報と振幅情報
に関するデータを入力する。 The alternative sound source pulse generator 23 receives, via the output line 164, data on the pitch period of an analysis frame whose pitch period is smaller than the decision threshold. Via the output line 191 it receives data on the time position and amplitude of the first and last sound source pulses contained in that analysis frame, or data on the time position and amplitude of the first pulse only.

分析フレームに対するスペクトル包絡情報なら
びに音源情報の標本値抽出のためのサンプリング
周波数は予め既知であり、分析フレームの時間も
また予め設定され、かつ短時間の分析フレーム内
では音声信号の変化は緩やかでほぼ定常的な信号
と見なしうる。従つて、サンプリング周波数のピ
ツチ周期に対して分析フレームに存在する音源パ
ルスのピツチ周期をその大きさに比例する整数比
で対応せしめ、かつ音源パルスの先頭パルスおよ
び最終パルスの時間的位置と振幅との情報にもと
づいて、ピツチ抽出器９によつて抽出したピツチ
周期でモデル化し複数のインパルス系列を容易に
発生することができ、しかも音源パルスの先頭パ
ルスと最終パルスの時間位置情報を利用しうるこ
とにより、これら複数のインパルス系列はすべて
分析フレーム内に含まれるようにすることができ
る。 The sampling frequency used to extract sample values of the spectral envelope information and sound source information for an analysis frame is known in advance, the duration of the analysis frame is likewise preset, and within a short analysis frame the speech signal changes slowly and can be regarded as nearly stationary. Therefore, by expressing the pitch period of the sound source pulses in the analysis frame as an integer multiple of the sampling period, in proportion to its magnitude, and by using the time position and amplitude information of the first and last sound source pulses, a plurality of impulse sequences modeled on the pitch period extracted by the pitch extractor 9 can readily be generated. Moreover, because the time positions of both the first and last pulses are available, all of these impulse sequences can be made to fall within the analysis frame.
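A minimal sketch of such a pulse generator is given below. The function name `alternative_pulse_train` and its parameters are hypothetical, and the linear interpolation of amplitude between the first and last pulses is an assumption of this sketch. The patent says only that the time position and amplitude information of those pulses is used.

```python
from typing import List


def alternative_pulse_train(pitch: int, first_pos: int, first_amp: float,
                            last_pos: int, last_amp: float,
                            frame_len: int) -> List[float]:
    """Place impulses every `pitch` samples from first_pos up to last_pos.

    Positions and the pitch period are integer sample counts. Amplitudes
    are interpolated linearly from first_amp to last_amp (an assumption).
    """
    if pitch <= 0 or not (0 <= first_pos <= last_pos < frame_len):
        raise ValueError("pulse positions must lie inside the frame")
    frame = [0.0] * frame_len
    positions = list(range(first_pos, last_pos + 1, pitch))
    span = positions[-1] - positions[0]
    for pos in positions:
        t = (pos - positions[0]) / span if span else 0.0
        frame[pos] = first_amp + t * (last_amp - first_amp)
    return frame


# Hypothetical frame of 160 samples with a pitch period of 32 samples.
frame = alternative_pulse_train(pitch=32, first_pos=5, first_amp=1.0,
                                last_pos=133, last_amp=0.5, frame_len=160)
pulse_positions = [i for i, v in enumerate(frame) if v != 0.0]
```

Because the last pulse position bounds the sequence, every generated impulse lies inside the frame, which is the property the paragraph above attributes to using both the first and last pulses.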

代替音源パルス発生器２３はこのような内容の
複数のインパルス系列を発生し、この分析ピツチ
周期でモデル化したマルチパルス群を通常のマル
チパルスに対する代替マルチパルス、すなわち代
替音源パルスとして出力ライン２３１に送出す
る。 The alternative sound source pulse generator 23 generates a plurality of impulse sequences of this kind and sends this group of multipulses, modeled on the analyzed pitch period, to the output line 231 as alternative multipulses replacing the normal multipulses, that is, as alternative sound source pulses.

なお、この代替音源パルス発生において利用す
べき音源パルスをその先頭パルスのみとしてモデ
ル化する場合には、発生する代替音源パルスの全
部が分析フレーム内に包含されず、代替音源パル
スのうち最終パルス分が次の分析フレームと共通
して存在する、いわゆる端数パルスの存在が起り
このぶんだけ再生音質の劣化を招くという点のみ
が異なる。 When only the first sound source pulse is used for modeling in generating the alternative sound source pulses, the only difference is that the generated alternative sound source pulses are not all contained within the analysis frame. The interval of the final pulse extends into the next analysis frame, producing a so-called fractional pulse, and the reproduced sound quality deteriorates to that extent.

このようにして発生した代替音源パルスは、切
替器２０によつて通常のマルチパルスと代替えて
LPC合成器２１に送出され、LPC合析器２１は
この代替音源パルスを駆動音源としてLPCフイ
ルタを制御して入力音声信号の再生のための
LPC合成を行ない、分析フレーム内で発生する
マルチパルスの数よりも分析フレーム内に存在す
る音源パルスの数多いときでも音質の劣化を招く
ことなく入力音声信号の分析、合成を行なうこと
ができる。 The alternative sound source pulses generated in this way are sent to the LPC synthesizer 21 by the switch 20 in place of the normal multipulses. The LPC synthesizer 21 uses them as the excitation source, controls the LPC filter, and performs LPC synthesis to reproduce the input speech signal. The input speech signal can thus be analyzed and synthesized without degrading sound quality even when the number of sound source pulses present in an analysis frame exceeds the number of multipulses generated within the frame.

次に図面を用いて代替音源パルス発生器２３の
機能を詳細に説明する。第４図は代替音源パルス
発生器２３の機能を説明するための波形図であ
る。第４図に於いては一般に設定される数より
もはるかに多くのマルチパルス（35パルス）を設
定した場合に求められるマルチパルス列であり、
線分４０１の長さは真のピツチ周期を示してい
る。は実際に設定されたマルチパルス数（４パ
ルス）に対応して求められたマルチパルス列であ
り、点４０２に於いてピツチパルスの欠落を生じ
ている。はのパルス列の左端のピークに一致
させて、抽出されたピツチ周期により発生された
代替音源パルスを示し、線分４０３の長さはピツ
チ抽出器９により抽出されたピツチ周期を示して
いる。はのパルス列の右端のピークに一致さ
せて抽出されたピツチ周期により発生された代替
音源パルスを示す。はのパルス列の左端のピ
ークと右端のピークとの各々に一致させ、且つほ
ぼピツチ周期に対応する位置に音源パルスを発生
する場合を示す。代替パルス発生器２３は上記
のパルス列の代りに，又はのパルス列を発
するものである。 Next, the function of the alternative sound source pulse generator 23 is described in detail with reference to the drawings. FIG. 4 is a waveform diagram for explaining that function. One waveform in FIG. 4 is the multipulse train obtained when far more multipulses (35 pulses) are set than is usual; the length of line segment 401 indicates the true pitch period. A second waveform is the multipulse train obtained for the number of multipulses actually set (4 pulses), in which a pitch pulse is missing at point 402. A third waveform shows alternative sound source pulses generated at the extracted pitch period and aligned with the left-end peak of the pulse train; the length of line segment 403 indicates the pitch period extracted by the pitch extractor 9. A fourth waveform shows alternative sound source pulses generated at the extracted pitch period and aligned with the right-end peak of the pulse train. A fifth waveform shows the case in which sound source pulses are aligned with both the left-end and right-end peaks of the pulse train and are otherwise generated at positions approximately corresponding to the pitch period. The alternative pulse generator 23 emits one of these three alternative pulse trains in place of the four-pulse multipulse train.
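The three placements described for FIG. 4 (aligned to the left-end peak, aligned to the right-end peak, and aligned to both) can be sketched as position generators. The function names and numeric values are hypothetical. The even spacing used in `both_aligned` is only one possible reading of "positions approximately corresponding to the pitch period".

```python
from typing import List


def left_aligned(left_peak: int, pitch: int, frame_len: int) -> List[int]:
    """Pulses at left_peak, left_peak + pitch, ... while inside the frame."""
    return list(range(left_peak, frame_len, pitch))


def right_aligned(right_peak: int, pitch: int) -> List[int]:
    """Pulses at right_peak, right_peak - pitch, ... while non-negative."""
    return sorted(range(right_peak, -1, -pitch))


def both_aligned(left_peak: int, right_peak: int, pitch: int) -> List[int]:
    """Pulses fixed at both peaks, evenly spaced at roughly the pitch period."""
    diff = right_peak - left_peak
    # Number of intervals closest to diff / pitch (integer round-half-up).
    intervals = max(1, (diff + pitch // 2) // pitch)
    return [left_peak + (i * diff + intervals // 2) // intervals
            for i in range(intervals + 1)]


# Hypothetical 160-sample frame, extracted pitch period of 32 samples,
# with the outermost peaks found at samples 5 and 130.
left = left_aligned(5, 32, 160)
right = right_aligned(130, 32)
both = both_aligned(5, 130, 32)
```

In this example the left-aligned and right-aligned trains each miss the opposite peak by a few samples, while the both-aligned train hits both peaks at the cost of a spacing that deviates slightly from the extracted pitch period.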

なお、第２図および第３図に示す本発明の実施
例においては、LPC係数としてＫパラメータを
用いているが、これは他のLPC係数、たとえば
αパラメータ等を利用してもよく、また符号化器
とマルチプレクサ、および復号化器とデマルチプ
レクサはそれぞれこれらを一体化た構成のものと
しても同様に実施し得ることは明らかであり、ま
たLPC合成フイルタは全極型以外の非極型デジ
タルフイルタ等と置換してもほぼ同様に実施しう
ることもまた明らかである。 In the embodiment of the present invention shown in FIGS. 2 and 3, K parameters are used as the LPC coefficients, but other LPC coefficients, such as α parameters, may be used instead. It is also clear that the invention can be practiced in the same way when the encoders and the multiplexer, and the decoders and the demultiplexer, are each combined into an integrated unit. It is likewise clear that it can be practiced in substantially the same way when the LPC synthesis filter is replaced with a digital filter other than the all-pole type, such as a non-pole type.
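Since K parameters and α parameters are interchangeable descriptions of the same filter, one can be converted to the other. The sketch below uses the standard step-up recursion associated with the Levinson-Durbin algorithm. The function name `k_to_alpha` is hypothetical, and the sign convention (a predictor of the form s[n] ≈ Σ α_i s[n-i]) is an assumption, since sign conventions for K parameters differ between texts.

```python
from typing import List, Sequence


def k_to_alpha(k: Sequence[float]) -> List[float]:
    """Convert K (reflection/PARCOR) parameters to α (predictor) parameters.

    Step-up recursion, for order m = 1..p:
        alpha_m(m) = k_m
        alpha_i(m) = alpha_i(m-1) - k_m * alpha_{m-i}(m-1),  1 <= i < m
    """
    alpha: List[float] = []
    for m, km in enumerate(k, start=1):
        prev = alpha
        # prev[m - 2 - i] is alpha_{m-i}(m-1) in zero-based indexing.
        alpha = [prev[i] - km * prev[m - 2 - i] for i in range(m - 1)] + [km]
    return alpha


# Hypothetical second-order example.
alpha = k_to_alpha([0.5, 0.25])
```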

以上説明した如く本発明によれば、マルチパル
スボコーダにおいて、分析フレーム内で発生すべ
きマルチパルスの数よりも分析フレーム内に存在
する音源パルスの数が多いときには分析フレーム
から抽出したピツチ周期にもとづいてモデル化し
た複数のインパルス系列を前記マルチパルスの代
替用の代替音源パルスとして利用するという手段
を備えて入力音声信号の分析、合成を図ることに
よつて入力音声信号が女声もしくは幼児声の如く
そのピツチ周期が短い高声の場合でも再生音質の
劣化を大幅に改善することができるマルチパルス
型ボコーダが実現できるという効果がある。 As described above, according to the present invention, a multipulse vocoder is provided with means for substituting alternative sound source pulses for the multipulses when the number of sound source pulses present in an analysis frame exceeds the number of multipulses to be generated within the frame. The alternative sound source pulses are a plurality of impulse sequences modeled on the pitch period extracted from the analysis frame, and the input speech signal is analyzed and synthesized using them. This makes it possible to realize a multipulse vocoder that greatly reduces the degradation of reproduced sound quality even for high-pitched voices with a short pitch period, such as female or children's voices.

【図面の簡単な説明】[Brief explanation of drawings]

第１図は従来のマルチパルス型ボコーダの基本
的構成を示すブロツク図、第２図は本発明による
マルチパルス型ボコーダの分析側の一実施例を示
すブロツク図、第３図は本発明によるマルチパル
ス型ボコーダの合成側の一実施例を示すブロツク
図、第４図は代替パルス発生器２３の機能を説明
するための波形図である。１……LPC合成器、２……LPC分析器、３…
…音源パルス発生器、４……減算器、５……聴感
重み付け器、６……２乗誤差最小化器、７……
LPC分析器、８……相互相関関数算出器、９…
…ピツチ抽出器、１０……第１の符号化器、１２
……音源パルス発生器、１３……第２の符号化
器、１４……第３の符号化器、１５……マルチプ
レクサ、１６……デマルチプレクサ、１７……第
１の復号化器、１８……第２の復号化器、１９…
…第３の復号化器、２０……切替器、２１……
LPC合成器、２２……LPF、２３……代替音源
パルス発生器。 FIG. 1 is a block diagram showing the basic configuration of a conventional multipulse vocoder; FIG. 2 is a block diagram showing an embodiment of the analysis side of the multipulse vocoder according to the present invention; FIG. 3 is a block diagram showing an embodiment of the synthesis side of the multipulse vocoder according to the present invention; and FIG. 4 is a waveform diagram for explaining the function of the alternative pulse generator 23.
1: LPC synthesizer; 2: LPC analyzer; 3: sound source pulse generator; 4: subtractor; 5: perceptual weighting unit; 6: squared-error minimizer; 7: LPC analyzer; 8: cross-correlation function calculator; 9: pitch extractor; 10: first encoder; 12: sound source pulse generator; 13: second encoder; 14: third encoder; 15: multiplexer; 16: demultiplexer; 17: first decoder; 18: second decoder; 19: third decoder; 20: switch; 21: LPC synthesizer; 22: LPF; 23: alternative sound source pulse generator.

Claims

【特許請求の範囲】[Claims]

１入力音声信号を分析フレームごとにLPC（Li
−near Prediction Coefficient、線形予測係数）
分析して抽出したLPC係数をスペクトル包絡情
報とし、このスペクトル包絡情報とともに前記入
力音声信号の音声情報を構成する音源情報を分析
フレームごとに、この音源情報の特徴に対応する
発生時間位置と振幅とを有する予め定めた複数個
のインパルス系列（マルチパルス）を以つて表現
して前記入力音声信号の分析および合成を行なう
マルチパルス型ボコーダにおいて、前記入力音声
信号の分析フレームごとにピツチ周期を抽出し、
前記抽出されたピツチ周期が予め定めた値より小
さい場合には代替音源情報を発生する手段を有す
ることを特徴とするマルチパルス型ボコーダ。1. A multipulse vocoder which uses, as spectral envelope information, LPC coefficients extracted by LPC (Linear Prediction Coefficient) analysis of an input speech signal for each analysis frame, and which, together with this spectral envelope information, represents the sound source information constituting the speech information of the input speech signal, for each analysis frame, by a predetermined plurality of impulses (multipulses) having occurrence time positions and amplitudes corresponding to the characteristics of the sound source information, thereby analyzing and synthesizing the input speech signal, the multipulse vocoder being characterized by comprising means for extracting a pitch period for each analysis frame of the input speech signal and for generating alternative sound source information when the extracted pitch period is smaller than a predetermined value.