JPH06214592A - Noise resisting phoneme model generating system - Google Patents

Noise resisting phoneme model generating system

Info

Publication number
JPH06214592A
JPH06214592A (application JP5005688A)
Authority
JP
Japan
Prior art keywords
hmm
noise
phoneme
phonological
model
Prior art date
Legal status
Granted
Application number
JP5005688A
Other languages
Japanese (ja)
Other versions
JP3247746B2 (en)
Inventor
Kiyohiro Shikano
Yasuhiro Minami
Tatsuo Matsuoka
Frank Martin
Current Assignee
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp filed Critical Nippon Telegraph and Telephone Corp
Priority to JP00568893A priority Critical patent/JP3247746B2/en
Publication of JPH06214592A publication Critical patent/JPH06214592A/en
Application granted granted Critical
Publication of JP3247746B2 publication Critical patent/JP3247746B2/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Abstract

PURPOSE: To improve speech recognition performance by adding, in the spectral domain, a noise HMM to a phoneme HMM generated beforehand from speech collected under noise-free conditions, thereby producing a phoneme model suited to the environment in which the speech is uttered.

CONSTITUTION: A phoneme HMM (Hidden Markov Model) for the noise-free condition and a noise HMM for the noisy condition are first prepared in the cepstrum domain. Each is cosine-transformed to obtain its log spectrum and then exponentially transformed to obtain its linear spectrum, and the output probabilities of phoneme and noise are combined by convolution. The resulting noise-added phoneme HMM is logarithmically transformed and then inverse-cosine-transformed back to the cepstrum domain. Because a noise model of the actual utterance environment is used, a phoneme HMM well matched to that environment is obtained and a high phoneme recognition rate is achieved.

Description

Detailed Description of the Invention

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to techniques for creating phoneme models used in speech recognition, and more particularly to the creation of noise-resistant phoneme models that remain effective for speech recognition in noisy environments.

[0002]

2. Description of the Related Art

A conventional speech recognition method using phoneme HMMs is described with reference to the drawings. FIG. 6 shows an example of a functional block diagram of a conventional speech recognition system. In the figure, 1 is a speech input unit, 2 is a recognition unit, and 3 is a recognition result output unit. 4 is a phoneme model storage unit, in which the phoneme HMMs are stored.

[0003]

First, the operation of FIG. 6 is described. The recognition unit 2 divides the speech received by the speech input unit 1 into frames and, for each frame, computes probabilities using the phoneme HMMs stored in advance in the phoneme model storage unit 4; the candidate with the highest probability is sent to the recognition result output unit 3 as the recognition result. In more detail, when speech is input over some period, the input is segmented into frames of unit length and feature parameters are extracted from each frame. For the feature parameters of the first frame, the phoneme HMM prepared for each phoneme is used to compute, for every phoneme, the probability that the frame is, for example, "a", the probability that it is "i", the probability that it is "u", and so on.

[0004]

Next, the feature parameters of the following frame are input, and the probability that this frame follows the previous state is computed for each phoneme. This is repeated for every frame of the input speech; if, for example, the phoneme HMM for "a" yields the highest probability, the input speech is recognized as "a".
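The frame-by-frame scoring described above is the standard HMM forward computation. The following sketch in Python/NumPy illustrates it; the function and variable names are illustrative, not taken from the patent:

```python
import numpy as np

def log_forward(log_b, log_A, log_pi):
    """Forward algorithm in the log domain: returns log P(frames | HMM).

    log_b  : (T, S) log output probability of each state for each frame
    log_A  : (S, S) log transition matrix
    log_pi : (S,)   log initial state probabilities
    """
    T, S = log_b.shape
    alpha = log_pi + log_b[0]
    for t in range(1, T):
        # log-sum-exp over predecessor states, then add this frame's output prob
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)

def recognize(frames_log_b, models):
    """Score every phoneme model and pick the best, as in the description.

    frames_log_b : dict phoneme -> (T, S) per-frame log output probabilities
    models       : dict phoneme -> (log_A, log_pi)
    """
    scores = {ph: log_forward(frames_log_b[ph], A, pi)
              for ph, (A, pi) in models.items()}
    return max(scores, key=scores.get)
```

A trellis of this kind is evaluated once per phoneme model, and the model with the highest total probability wins.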

[0005]

SUMMARY OF THE INVENTION

Conventionally, however, phoneme HMMs have been created from speech recorded in the absence of noise, and in real environments recognition performance degrades severely under noise. An alternative approach records speech in advance under various kinds of noise and builds phoneme HMMs from those recordings, but because the variety of possible noises is enormous, the system must grow very large to achieve high recognition performance.

[0006]

The object of the present invention is to solve these problems and to improve speech recognition performance by creating, with a simple system, a phoneme model matched to the utterance environment, so that high recognition performance is obtained even in noisy conditions.

[0007]

To achieve this object, the noise-resistant phoneme HMM creation method of the present invention creates a noise HMM from the noise at the utterance location and adds it, in the spectral domain, to a phoneme HMM created beforehand from speech collected in the absence of noise, thereby producing a noise-resistant phoneme HMM.

[0008]

This makes it possible to create a phoneme HMM that accounts for the noise at the utterance location, improving speech recognition performance under noise.

[0009]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 shows the procedure for creating a noise-resistant phoneme HMM according to the present invention; the procedure is described below with reference to that figure. Cepstral coefficients are widely used as the acoustic parameters of HMMs for speech recognition, and they are related to the log spectrum (log power spectrum) by a cosine transform. Ambient noise, on the other hand, is additive, so it can be added to speech in the (linear) spectral domain.

[0010]

First, a phoneme HMM for the noise-free environment and a noise HMM for the noisy environment are created in the cepstrum domain. Each is cosine-transformed to obtain its log spectrum, and an exponential transform then yields the corresponding linear spectrum. The output probabilities of phoneme and noise are combined by convolution, and the noise-added phoneme HMM thus constructed is logarithmically transformed and then inverse-cosine-transformed to produce the final phoneme HMM.

[0011]

These transforms are applied to the speech and the noise separately, and the two sets of parameters are added in the linear spectral domain. The sum is then logarithmically transformed back to the log-spectrum domain and cosine-transformed back to the cepstrum domain, yielding a phoneme HMM that accounts for the ambient noise.

[0012]

Here, the phoneme HMM and the noise HMM are assumed to use mixtures of normal distributions (so-called Gaussian mixtures) as output distributions, so each of the transforms above must be applied to the covariances as well as to the means. The concrete computation is as follows. Take as the HMM parameters the cepstral coefficients of order 0 through p, written as the vectors

C = (C_0, C_1, C_2, ..., C_{p-1}, C_p)
D = (D_0, D_1, D_2, ..., D_{p-1}, D_p)

where C holds the phoneme parameters and D the noise parameters. The map from cepstral coefficients to the log spectrum is the cosine transform; it is a linear transform, represented here by the (p+1) x m matrix (COS). Writing the log-spectrum vectors as LC and LD, the transformed means are

LC = C (COS)
LD = D (COS)

and, with Σ_C and Σ_D the cepstral covariances, the log-spectrum covariances are

Σ_LC = (COS) Σ_C (COS)^t
Σ_LD = (COS) Σ_D (COS)^t

Transforming the mean and covariance in this way gives the mean and covariance, in the log-spectrum domain, of the normal distributions of speech and noise.
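The push of a Gaussian through this linear transform can be sketched as follows. Here the transform is written as a matrix `T` acting on column vectors (mean maps as T c, covariance as T Σ T^t), and the particular cosine basis is an assumption for illustration — the patent relies only on the transform being linear:

```python
import numpy as np

def cosine_matrix(m, p):
    """An (m, p+1) cosine-basis matrix mapping p+1 cepstral coefficients
    to an m-point log spectrum (illustrative basis choice)."""
    i = np.arange(m)[:, None]
    k = np.arange(p + 1)[None, :]
    return np.cos(np.pi * i * k / m)

def gaussian_linear_map(mean, cov, T):
    """Push a Gaussian N(mean, cov) through the linear map x -> T x:
    the mean becomes T @ mean, the covariance T @ cov @ T.T."""
    return T @ mean, T @ cov @ T.T
```

The same helper serves for both the phoneme parameters C and the noise parameters D, and (with the inverse matrix) for the return trip to the cepstrum domain.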

[0013]

Next comes the exponential transform from the log spectrum to the linear spectrum. This transform does not preserve normality, so the result is approximated by a normal distribution. Computing the means SC, SD and covariances Σ_SC, Σ_SD of the exponentially transformed distributions gives

SC_i = exp(LC_i + Σ_LC,ii / 2)
SD_i = exp(LD_i + Σ_LD,ii / 2)
Σ_SC,ij = SC_i × SC_j × { exp(Σ_LC,ij) − 1 }
Σ_SD,ij = SD_i × SD_j × { exp(Σ_LD,ij) − 1 }

The sum of speech and additive noise is then taken in the linear-spectrum domain. Since summing two independent random variables corresponds to a convolution of their distributions, the mean M and covariance Σ_M of the convolved result are obtained simply as

M_i = SC_i + SD_i
Σ_M = Σ_SC + Σ_SD

The mean and covariance of the distribution obtained in this way are then transformed back to the cepstrum domain, reversing the steps taken so far.
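The log-normal moment matching and the linear-domain addition above can be sketched as follows (the helper names are hypothetical, not from the patent):

```python
import numpy as np

def exp_transform(mean_l, cov_l):
    """Log-normal moment matching: given a Gaussian in the log-spectrum
    domain, return the mean and covariance of its exponential:
    S_i = exp(L_i + Sigma_ii / 2)
    Sigma'_ij = S_i * S_j * (exp(Sigma_ij) - 1)
    """
    s = np.exp(mean_l + 0.5 * np.diag(cov_l))
    cov_s = np.outer(s, s) * (np.exp(cov_l) - 1.0)
    return s, cov_s

def add_in_linear_domain(sc, cov_sc, sd, cov_sd):
    """Speech plus additive noise in the linear-spectrum domain: the
    convolution of two (approximately) normal distributions has the
    summed mean and the summed covariance."""
    return sc + sd, cov_sc + cov_sd
```

Note that the Gaussian approximation of the exponentiated distribution is exactly the approximation the patent describes; the sum itself is exact for independent sources.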

[0014]

First, the logarithmic transform, which is the inverse of the exponential transform, is applied. Writing the log-transformed mean as LM and covariance as Σ_LM, inverting the transform above gives

LM_i = log(M_i) − (1/2) log(Σ_M,ii / M_i^2 + 1)
Σ_LM,ij = log(Σ_M,ij / (M_i M_j) + 1)

The log spectrum is then brought back to the cepstrum domain by the inverse cosine transform (of the same form as the cosine transform), the m x (p+1) matrix (COS'), giving the mean S and covariance Σ_S as

S = LM (COS')
Σ_S = (COS') Σ_LM (COS')^t

In short, the normal output distributions of the HMM obtained in the cepstrum domain are carried to the linear-spectrum domain, added to the noise model, and then inverse-transformed, producing a phoneme HMM with the noise added.
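The return trip can be sketched under the same column-vector conventions as before; `Tinv` stands for an assumed inverse (or pseudo-inverse) of the forward cosine basis:

```python
import numpy as np

def log_transform(m, cov_m):
    """Inverse of the log-normal moment matching:
    LM_i = log(M_i) - (1/2) * log(Sigma_ii / M_i**2 + 1)
    Sigma_LM_ij = log(Sigma_ij / (M_i * M_j) + 1)
    """
    lm = np.log(m) - 0.5 * np.log(np.diag(cov_m) / m**2 + 1.0)
    cov_lm = np.log(cov_m / np.outer(m, m) + 1.0)
    return lm, cov_lm

def to_cepstrum(lm, cov_lm, Tinv):
    """Inverse cosine transform back to the cepstrum domain."""
    return Tinv @ lm, Tinv @ cov_lm @ Tinv.T
```

Applied after the linear-domain addition, this pair of steps completes the round trip and yields the noise-added Gaussian in the cepstrum domain.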

[0015]

The discussion above dealt with two individual distributions and how to combine them. Next, the method of combining the two HMMs themselves is described. A phoneme HMM is usually represented by a right-to-left model of about three states, as shown in FIG. 2, while an ergodic HMM such as the one in FIG. 3 is well suited to modeling noise. Taking the product of these two models gives the product-space HMM shown in FIG. 4, in which each state is a combination of one phoneme-HMM state and one noise-HMM state. For each such pair of states, the corresponding output probabilities are transformed to the linear-spectrum domain as described above, convolved, and inverse-transformed.
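One way to realize the product-space composition is sketched below. It assumes (as is natural when the two sources evolve independently; the patent does not spell this out) that the composite transition probability factorizes, in which case the composite transition matrix is the Kronecker product of the two transition matrices, with one combined output distribution per state pair:

```python
import numpy as np
from itertools import product

def product_space_hmm(A_phoneme, A_noise, states_ph, states_nz):
    """Compose two HMMs on their product state space.

    Each composite state is a (phoneme state, noise state) pair; under
    independent transitions the composite transition matrix is the
    Kronecker product of the two transition matrices.
    """
    A = np.kron(A_phoneme, A_noise)
    states = list(product(states_ph, states_nz))
    return A, states
```

The output distribution of composite state (i, j) is then obtained by the spectral-domain convolution of the outputs of phoneme state i and noise state j, as described in the preceding paragraphs.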

[0016]

When the output probability is a single normal distribution, the transform above is applied to the two distributions directly. When the output distribution is a mixture of normal distributions, the transform is applied to every pairing of mixture components. The description so far has covered only cepstral coefficients, but the same transforms apply to the delta cepstrum and delta power commonly used in phoneme HMMs, since these are expressed as linear combinations of cepstral coefficients.
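For the Gaussian-mixture case, applying the per-Gaussian combination to every pair of components can be sketched as follows; `combine_pair` stands for the spectral-domain combination of one speech Gaussian with one noise Gaussian, and multiplying the mixture weights is an assumption consistent with independent sources:

```python
from itertools import product

def combine_mixtures(speech_mix, noise_mix, combine_pair):
    """speech_mix, noise_mix: lists of (weight, params) mixture components.
    Returns the combined mixture: one component per pair of inputs,
    with the weights multiplied."""
    return [(ws * wn, combine_pair(gs, gn))
            for (ws, gs), (wn, gn) in product(speech_mix, noise_mix)]
```

The combined mixture has |speech| x |noise| components, which is one reason the single-Gaussian case is cheaper.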

[0017]

FIG. 5 shows an embodiment in which the present invention is applied to a speech recognition apparatus. In the figure, 5 is a phoneme model storage unit, 6 is a noise input unit, 7 is a noise-resistant phoneme model creation unit that executes the noise-resistant phoneme model creation method of the present invention, and 8 is a noise-resistant phoneme model storage unit. A phoneme HMM created beforehand from speech obtained in a noise-free condition is placed in the phoneme model storage unit 5. The noise-resistant phoneme model creation unit 7 creates a noise HMM from the noise supplied by the noise input unit 6, builds a noise-resistant phoneme model according to the method of the present invention, and stores it in the noise-resistant phoneme model storage unit 8.

[0018]

The recognition unit 2 recognizes the speech received by the speech input unit 1 using the phoneme models stored in the noise-resistant phoneme model storage unit 8, and the result is output by the recognition result output unit 3.

[0019]

According to the present invention, a noise model of the utterance environment can be used, so a robust phoneme HMM matched to that environment can be created and a high phoneme recognition rate can be expected.

[Brief Description of the Drawings]

FIG. 1 shows the procedure for creating a noise-resistant phoneme model according to the present invention.

FIG. 2 shows an example of a three-state, three-loop phoneme HMM.

FIG. 3 shows an example of a noise HMM realized as a two-state ergodic HMM.

FIG. 4 shows the phoneme HMM obtained by combining the phoneme HMM of FIG. 2 and the noise HMM of FIG. 3 in the product space.

FIG. 5 illustrates an embodiment in which the present invention is applied to a speech recognition apparatus.

FIG. 6 illustrates the configuration of a conventional speech recognition apparatus.

Continuation of front page: (72) Inventor: Frank Martin, 4-6-29 Komaba, Meguro-ku, Tokyo

Claims (5)

[Claims]

1. A method of creating a phoneme hidden Markov model (hereinafter, HMM) for use in speech recognition, characterized in that a noise-superimposed phoneme HMM is created by adding, in the spectral domain, a phoneme HMM created in advance and a noise HMM.
2. The noise-resistant phoneme model creation method according to claim 1, characterized in that the phoneme HMM and the noise HMM are combined in a product space.
3. A noise-resistant phoneme model creation method characterized in that the output probability distributions of a phoneme HMM and a noise HMM expressed in the cepstrum domain are transformed to the spectral domain by a cosine transform and an exponential transform, their convolution is performed there to construct a phoneme HMM adapted to the noise, and the phoneme HMM is then transformed back to the cepstrum domain by the inverse transforms, namely a logarithmic transform and an inverse cosine transform.
4. A noise-resistant phoneme model creation method characterized in that a phoneme HMM and a noise HMM are each created by training in the spectral domain, their convolution is performed, and a logarithmic transform and an inverse cosine transform are applied to transform the phoneme HMM to the cepstrum domain.
5. The noise-resistant phoneme model creation method according to claim 2, characterized in that, in the combination of the phoneme HMMs, the corresponding output probability values of the two models are computed as in claim 3 or claim 4.
JP00568893A 1993-01-18 1993-01-18 Creating a noise-resistant phoneme model Expired - Lifetime JP3247746B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP00568893A JP3247746B2 (en) 1993-01-18 1993-01-18 Creating a noise-resistant phoneme model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP00568893A JP3247746B2 (en) 1993-01-18 1993-01-18 Creating a noise-resistant phoneme model

Publications (2)

Publication Number Publication Date
JPH06214592A (en) 1994-08-05
JP3247746B2 JP3247746B2 (en) 2002-01-21

Family

ID=11618046

Family Applications (1)

Application Number Title Priority Date Filing Date
JP00568893A Expired - Lifetime JP3247746B2 (en) 1993-01-18 1993-01-18 Creating a noise-resistant phoneme model

Country Status (1)

Country Link
JP (1) JP3247746B2 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002323900A (en) * 2001-04-24 2002-11-08 Sony Corp Robot device, program and recording medium
KR100442825B1 (en) * 1997-07-11 2005-02-03 삼성전자주식회사 Method for compensating environment for voice recognition, particularly regarding to improving performance of voice recognition system by compensating polluted voice spectrum closely to real voice spectrum
KR100434527B1 (en) * 1997-08-01 2005-09-28 삼성전자주식회사 Speech Model Compensation Method Using Vector Taylor Series
US7209881B2 (en) 2001-12-20 2007-04-24 Matsushita Electric Industrial Co., Ltd. Preparing acoustic models by sufficient statistics and noise-superimposed speech data
US7660717B2 (en) 2002-03-15 2010-02-09 Nuance Communications, Inc. Speech recognition system and program thereof

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101047104B1 (en) * 2009-03-26 2011-07-07 고려대학교 산학협력단 Acoustic model adaptation method and apparatus using maximum likelihood linear spectral transform, Speech recognition method using noise speech model and apparatus


Also Published As

Publication number Publication date
JP3247746B2 (en) 2002-01-21


Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20071102

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20081102

Year of fee payment: 7

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20091102

Year of fee payment: 8

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20101102

Year of fee payment: 9


FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20111102

Year of fee payment: 10


FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20121102

Year of fee payment: 11


FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20131102

Year of fee payment: 12

EXPY Cancellation because of completion of term