JPH036520B2 - Google Patents

Info

Publication number
JPH036520B2
JPH036520B2
Authority
JP
Japan
Prior art keywords
detector
output
circuit
nasal
oral
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP57032426A
Other languages
Japanese (ja)
Other versions
JPS58150997A (en)
Inventor
Toyozo Sugimoto
Takeo Murata
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Institute of Advanced Industrial Science and Technology AIST
Original Assignee
Agency of Industrial Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agency of Industrial Science and Technology filed Critical Agency of Industrial Science and Technology
Priority to JP3242682A priority Critical patent/JPS58150997A/en
Publication of JPS58150997A publication Critical patent/JPS58150997A/en
Publication of JPH036520B2 publication Critical patent/JPH036520B2/ja
Granted legal-status Critical Current

Description

[Detailed description of the invention]

The present invention relates to a pronunciation feature extraction device that recognizes pronunciation from information other than the speech signal itself. Speech is produced when the exhaled airstream from the lungs sets the vocal cords in the larynx vibrating, converting the airstream into voice, which is then modulated by changes in the shape of the passage leading to the lips and nasal cavity; the resulting sound is the product of the coordinated movement of these vocal organs. Conventionally, such speech has been analyzed by converting the sound wave into an electrical signal with an acoustic microphone, feeding that signal into a bank of filter circuits each covering a predetermined frequency band, and characterizing the pronunciation from the outputs of those filters. However, because speech is the result of the coordinated movement of the vocal organs, it is extremely difficult to extract the pronunciation features of every phoneme, and hence to perform speech recognition, from the speech wave alone. Non-stationary consonants in particular carry strong noise energy: apart from voiceless fricatives such as /s, ∫/, whose features can be extracted from the speech wave almost reliably, the voiceless fricative /h/, the voiceless plosives /p, t, k/, the voiced plosives /b, d, g/, and the nasals /m, n, η/ are very difficult to detect and separate.

In view of these drawbacks, the present invention provides a pronunciation feature extraction device that extracts pronunciation more accurately than the prior art by attaching or placing, near each part of the vocal organs, a detector that senses the movement of that part, and by processing the outputs of the detectors with a processing device.

An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows the block configuration of a pronunciation extraction device according to one embodiment. In the figure, 1 is a vocal cord vibration detector attached near the vocal cords of the larynx to detect their vibration; 2 is a nasal vibration detector attached near the center of the nasal wall to detect voice vibration in the nasal cavity; 3 is an oral airflow detector placed in front of the mouth to detect oral airflow; and 4 is a palate contact detector fitted to the palate to detect contact between the tongue and the palate. Reference numeral 5 denotes a processing device that extracts pronunciation features from the outputs of the vocal cord vibration detector 1, the nasal vibration detector 2, the oral airflow detector 3, and the palate contact detector 4; its configuration is described in detail with reference to FIG. 2.

In FIG. 2, 6 is a threshold circuit that decides, against a preset value, whether vocal cord vibration is present in the output of the vocal cord vibration detector 1; 7 is a threshold circuit that likewise decides whether nasal vibration is present in the output of the nasal vibration detector 2; 8 is a differentiating circuit that obtains the rate of change (acceleration) of the oral airflow by differentiating the output of the oral airflow detector 3; 9 is a threshold circuit that decides, against a preset value, whether that rate of change is present; 10 is a threshold circuit that decides, against a preset value, whether oral airflow is present in the output of the oral airflow detector 3; 11 is a tongue closure detection circuit that, after the palate contact information from the palate contact detector 4 has been converted by a measurement circuit 12 into tongue-palate contact signals, judges among the three states described later, namely front tongue closure, back tongue closure, and no closure; and 13 is a phoneme classification circuit that classifies phonemes from the presence/absence outputs of the threshold circuits 6, 7, 9, and 10 and the three-way output of the tongue closure detection circuit 11.

A concrete method of using the pronunciation feature extraction device configured as above is now explained with reference to FIG. 3. As the vocal cord vibration detector 1, an acceleration sensor 1′ is attached with medical double-sided tape to the vocal cord region of the larynx, as shown in FIG. 3, and detects vocal cord vibration. The detected vibration is fed to the threshold circuit 6, which outputs a presence (+) signal to the phoneme classification circuit 13 when the vibration exceeds a preset value and an absence (−) signal when it does not. Likewise, as the nasal vibration detector 2, an acceleration sensor 2′ is attached with medical double-sided tape near the center of the nasal wall and detects nasal vibration. The detected vibration is fed to the threshold circuit 7, which outputs a presence (+) signal when the vibration exceeds a preset value and an absence (−) signal otherwise. As the oral airflow detector 3, a hot-wire flowmeter sensor 3′ is fixed on a desk or the like in front of the mouth and detects the oral airflow. The detected airflow is fed to the differentiating circuit 8, which computes its rate of change and passes it to the threshold circuit 9; circuit 9 outputs a presence (+) signal when the rate of change exceeds a preset value and an absence (−) signal otherwise. The airflow detected by the sensor 3′ is also fed to the threshold circuit 10, which outputs a presence (+) signal when the airflow exceeds a preset value and an absence (−) signal otherwise. As the palate contact detector 4, a contact sensor 4′ of the kind shown in FIG. 4 is used. The sensor 4′ carries a large number of electrodes 4′a on the surface that meets the tongue, is fitted to the roof of the mouth by a retainer 4′b, and detects the state of contact with the tongue through the electrodes 4′a. The detected contact state is fed through the measurement circuit 12 to the tongue closure detection circuit 11, which outputs front tongue closure information to the phoneme classification circuit 13 when the contact forms the pattern of FIG. 5(a), back tongue closure information when it forms the pattern of FIG. 5(b), and no-closure information when there is no tongue contact. Finally, the phoneme classification circuit 13 determines the speech sound from an internal lookup table, shown below, using the information supplied by the threshold circuits 6, 7, 9, and 10 and the tongue closure detection circuit 11.
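The chain of threshold circuits, with the differentiating circuit on the oral-airflow path, can be sketched in a few lines of Python. This is a minimal model of the signal path only; the concrete threshold values, the sampling step, and all function names are illustrative assumptions and do not appear in the patent.

```python
def threshold(value, limit):
    """Model of a threshold circuit: '+' at or above the limit, '-' below."""
    return '+' if value >= limit else '-'

def differentiate(samples, dt=0.001):
    """Model of the differentiating circuit 8: finite-difference rate of change."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]

def extract_features(glottal, nasal, airflow, closure, dt=0.001):
    """Reduce the four detector signals to one decision tuple per frame.

    `closure` holds the tongue closure circuit's state per frame:
    'front', 'back', or 'none'.  Threshold values are hypothetical.
    """
    rate = [0.0] + differentiate(airflow, dt)   # pad so lists align
    return [
        (threshold(g, 0.5),          # circuit 6: vocal cord vibration?
         threshold(n, 0.5),          # circuit 7: nasal vibration?
         threshold(abs(r), 100.0),   # circuit 9: airflow changing rapidly?
         threshold(a, 0.2),          # circuit 10: oral airflow present?
         c)                          # circuit 11: tongue closure state
        for g, n, r, a, c in zip(glottal, nasal, rate, airflow, closure)
    ]
```

Each frame thus becomes a five-element tuple of (+/−) decisions plus the closure state, which is exactly the form the classification circuit consumes.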

[Table]

[Table]

For example, when the utterance "hana", whose phoneme waveform is shown in FIG. 6(a), is produced, the acceleration sensor 1′ outputs a waveform like that of FIG. 6(b) to the threshold circuit 6. Judging against its preset threshold, circuit 6 sends the phoneme classification circuit 13 an absence (−) signal during the "h" segment and a presence (+) signal during the "n" segment. The acceleration sensor 2′ likewise outputs a waveform like that of FIG. 6(c) to the threshold circuit 7, which sends an absence (−) signal during "h" and a presence (+) signal during "n". The hot-wire flowmeter sensor 3′ outputs a waveform like that of FIG. 6(d) to the differentiating circuit 8 and the threshold circuit 10. The threshold circuit 9, judging the differentiated value from circuit 8 against its threshold, sends an absence (−) signal during both "h" and "n", while the threshold circuit 10 sends a presence (+) signal during "h" and an absence (−) signal during "n". Meanwhile the contact sensor 4′ detects the contact state between the electrodes 4′a and the tongue and passes it through the measurement circuit 12 to the tongue closure detection circuit 11, which, from the contact pattern, outputs "no closure" information during "h" and "front tongue closure" information during "n" to the phoneme classification circuit 13. From this information the phoneme classification circuit 13 can recognize "h" and "n" using the internal lookup table shown in the table.

As described above, the movements of the vocal organs are detected by the vocal cord vibration detector 1, the nasal vibration detector 2, the oral airflow detector 3, and the palate contact detector 4, and the processing device 5 determines specific phonemes from a prestored table based on the detected information, so that speech recognition that was previously difficult can be performed accurately. Based on the vocal cord vibration information from the vocal cord vibration detector, the intranasal vibration information from the nasal vibration detector, the oral airflow information from the oral airflow detector, and the tongue-palate contact information from the palate contact detector, the invention can identify the plosive and nasal phonemes more accurately than before, and its practical effect is substantial.
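The final lookup performed by the phoneme classification circuit can be sketched as a dictionary keyed on the detector decisions. Only the two rows the "hana" walkthrough actually reports, for /h/ and /n/, are filled in here; the tuple layout (vocal cords, nasal, airflow rate, airflow, closure) is an assumption about how the stored table is organized.

```python
# Each key is (vocal cord vibration, nasal vibration, airflow rate of
# change, oral airflow, tongue closure); values are the recognized phoneme.
PHONEME_TABLE = {
    ('-', '-', '-', '+', 'none'):  'h',   # row reported for the "h" segment
    ('+', '+', '-', '-', 'front'): 'n',   # row reported for the "n" segment
}

def classify(features):
    """Model of the phoneme classification circuit 13: a table lookup.

    Returns None when the pattern has no stored row.
    """
    return PHONEME_TABLE.get(features)
```

Feeding in the two patterns from the walkthrough, `classify` returns 'h' and 'n' respectively; any pattern without a stored row yields no classification.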

[Brief description of the drawings]

FIG. 1 is a block diagram of a pronunciation feature extraction device according to an embodiment of the present invention; FIG. 2 is a block diagram of the processing device in that extraction device; FIG. 3 shows an example of use of the device; FIG. 4 is a plan view of the contact sensor; FIG. 5 shows contact patterns between the tongue and the palate; and FIG. 6 shows the waveform of each detector. 1: vocal cord vibration detector; 2: nasal vibration detector; 3: oral airflow detector; 4: palate contact detector; 5: processing device.

Claims (1)

[Claims]
1. A pronunciation feature extraction device comprising: a vocal cord vibration detector attached to the larynx; a nasal vibration detector attached to the nose; an oral airflow detector placed in front of the mouth; a palate contact detector that detects contact between the tongue and the palate; and a processing device that extracts the group of sounds p, t, k, b, d, g, and h based on the output of the oral airflow detector, extracts the nasals m and n based on the output of the nasal vibration detector, separates p, t, k, and h from b, d, and g based on the output of the vocal cord vibration detector, separates p, h, t, k, b, d, g, m, and n from one another based on the output of the palate contact detector, and further separates p from h by the rate of change of the oral airflow based on the output of the oral airflow detector.
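The separation sequence recited in the claim can be sketched as a decision cascade over the detector outputs. The claim names which detector output drives each split; the concrete mapping from closure states to particular consonants below is an illustrative assumption added for the sketch.

```python
def separate(airflow, nasal_vib, vocal_vib, closure, airflow_rate):
    """Cascade of decisions from the claim.

    `airflow`, `nasal_vib`, `vocal_vib`, and `airflow_rate` are boolean
    detector decisions; `closure` is 'front', 'back', or 'none'.
    Returns the identified consonant, or None if no group matches.
    """
    if nasal_vib:                        # nasal vibration isolates m, n
        return 'n' if closure == 'front' else 'm'
    if not airflow:                      # oral airflow isolates the p,t,k,b,d,g,h group
        return None
    if vocal_vib:                        # vocal cord vibration: voiced plosives
        return {'none': 'b', 'front': 'd', 'back': 'g'}[closure]
    if closure == 'front':               # palate contact narrows the voiceless group
        return 't'
    if closure == 'back':
        return 'k'
    return 'p' if airflow_rate else 'h'  # airflow rate of change splits p from h
```

For instance, no nasal or vocal cord vibration, oral airflow present, no closure, and a slowly changing airflow yields 'h', while the same pattern with a rapidly changing airflow yields 'p'.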
JP3242682A 1982-03-03 1982-03-03 Speech feature extractor Granted JPS58150997A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP3242682A JPS58150997A (en) 1982-03-03 1982-03-03 Speech feature extractor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP3242682A JPS58150997A (en) 1982-03-03 1982-03-03 Speech feature extractor

Publications (2)

Publication Number Publication Date
JPS58150997A JPS58150997A (en) 1983-09-07
JPH036520B2 true JPH036520B2 (en) 1991-01-30

Family

ID=12358621

Family Applications (1)

Application Number Title Priority Date Filing Date
JP3242682A Granted JPS58150997A (en) 1982-03-03 1982-03-03 Speech feature extractor

Country Status (1)

Country Link
JP (1) JPS58150997A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6111021A (en) * 1984-06-26 1986-01-18 Director-General of the Agency of Industrial Science and Technology Speaking exercise apparatus
EP1320850A2 (en) 2000-09-19 2003-06-25 Logometrix Corporation Palatometer and nasometer apparatus
AU2002236483A1 (en) 2000-11-15 2002-05-27 Logometrix Corporation Method for utilizing oral movement and related events
TWI576826B (en) * 2014-07-28 2017-04-01 jing-feng Liu Discourse Recognition System and Unit

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS501846A (en) * 1973-05-14 1975-01-09

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS501846A (en) * 1973-05-14 1975-01-09

Also Published As

Publication number Publication date
JPS58150997A (en) 1983-09-07

Similar Documents

Publication Publication Date Title
US7529670B1 (en) Automatic speech recognition system for people with speech-affecting disabilities
Stevens et al. A miniature accelerometer for detecting glottal waveforms and nasalization
Philips et al. Acoustic–phonetic descriptions of speech production in speakers with cleft palate and other velopharyngeal disorders
Abdul-Kadir et al. Difficulties of standard arabic phonemes spoken by non-arab primary school children based on formant frequencies
JPS6129000B2 (en)
JPH036520B2 (en)
JPH036519B2 (en)
Demolin et al. Whispery voiced nasal stops in Rwanda
JP2000276191A (en) Voice recognizing method
Garnier et al. Efforts and coordination in the production of bilabial consonants
JPH0475520B2 (en)
JPH0139600B2 (en)
Wang et al. Research on Children’s Mandarin Chinese Voiceless Consonant Airflow
JPH025099A (en) Voiced, voiceless, and soundless state display device
JPH034919B2 (en)
Wang et al. Aerodynamic Measurements: Normative Data for Children Ages 15-17 Years
Hu et al. Acoustic Study of Mongolian Unaspirated and Aspirated Consonants
JPS60238899A (en) Breathing flow detector
JPS6331795B2 (en)
JPS63175897A (en) Breathing flow detector
JPS6258519B2 (en)
JPS60238898A (en) Short syllable recognition
JPS616696A (en) Fracturing tendency detector
JPS6329759B2 (en)
Demolin et al. Double articulations in some Mangbutu-Efe languages