JP3327936B2

JP3327936B2 - Speech rate control type hearing aid

Info

Publication number: JP3327936B2
Application number: JP24596091A
Authority: JP
Inventors: 章中村; 信正清山; 徹都木; 栄一宮坂
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 1991-09-25
Filing date: 1991-09-25
Publication date: 2002-09-24
Anticipated expiration: 2017-09-24
Also published as: JPH0580796A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Industrial Field of Application] The present invention relates to a speech rate control type hearing aid device for use in audio and medical equipment such as televisions, radio tape recorders, and hearing aids. The device processes a speaker's voice and, in near real time, fits the speech rate to the listener's listening ability, thereby assisting hearing.

[0002] [Summary of the Invention] The present invention relates to a speech rate control type hearing aid method and device that compensate for a decline in listening ability, such as the critical speech discrimination speed (the maximum speech rate at which speech can be accurately discriminated), caused by aging or some impairment. The speech rate is converted with high quality through the listener's own operation while the speaker's individuality and phonemic characteristics are preserved, so that the listener obtains optimal speech intelligibility.

[0003]

[Prior Art] In general, when a listener's listening ability, such as the critical speech discrimination speed (the maximum speech rate at which speech can be accurately discriminated), declines because of aging or some impairment, the listener's ability to discriminate speech spoken at normal speed or spoken rapidly drops substantially.

[0004]

[Problem to Be Solved by the Invention] Conventionally, however, the hearing aid has been the only known means of hearing assistance for people with such hearing impairment. A hearing aid compensates only for the transfer characteristics of the outer and middle ear of the auditory system, simply by improving frequency characteristics, controlling gain, and the like. It has therefore been impossible to compensate for a decline in speech discrimination ability that mainly involves deterioration of the central auditory system.

[0005] The object of the present invention is to provide a speech rate control type hearing aid device that can compensate for hearing deterioration mainly attributable to the central auditory system by converting the speech rate, in near real time and with high quality, so that it is optimal for the listener's listening ability.

[0006]

[Means for Solving the Problem] The speech rate control type hearing aid device of the present invention is characterized in that each of the modules (1) to (10) below is implemented as a transputer module consisting of an IC for parallel computation:
(1) an A/D conversion module that A/D-converts an input speech signal to generate time-series speech data;
(2) an input buffer module that takes in the A/D-converted time-series speech data and buffers it for frame-by-frame processing;
(3) a first analysis module that executes section division processing, in which it calculates the average power, the number of zero crossings, and the autocorrelation coefficient of the time-series speech data taken in through the input buffer module and determines silent frames, unvoiced frames, and voiced frames by applying thresholds to these values;
(4) a second analysis module that decimates the section-divided time-series speech data to speed up processing;
(5) a third analysis module that obtains the autocorrelation coefficient of the decimated time-series speech data and obtains the pitch frequency of the speech for each frame;
(6) a fourth analysis module that smooths the trajectory of the obtained pitch frequency, determines the pitch start points and the number of pitches, and performs pitch section length extension and repetition processing;
(7) a fifth analysis module that determines each pitch section of the silent sections, unvoiced sections, and voiced sections on the basis of the determined pitch start points and number of pitches;
(8) a synthesis module that performs speech synthesis by converting the speech rate so as to match the silent sections, unvoiced sections, and voiced sections obtained by the fifth analysis module and the extension ratios of the silent and voiced sections obtained from a speech rate parameter setting means;
(9) an output buffer module that performs buffering to preserve the continuity of the speech synthesis output of the synthesis module; and
(10) a D/A conversion module that D/A-converts the output of the output buffer module and outputs it as a speech signal whose speech rate has been controlled.

[0007]

[0008]

[Operation] In the speech rate control type hearing aid device of the present invention, the A/D conversion module first A/D-converts the input speech signal to generate time-series speech data. The input buffer module takes in the A/D-converted time-series speech data and buffers it for frame-by-frame processing. The first analysis module calculates the average power, the number of zero crossings, and the autocorrelation coefficient of the time-series speech data taken in through the input buffer module, and executes section division processing that determines silent frames, unvoiced frames, and voiced frames by applying thresholds to these values. The second analysis module decimates the section-divided time-series speech data to speed up processing. The third analysis module obtains the autocorrelation coefficient of the decimated time-series speech data and obtains the pitch frequency of the speech for each frame. The fourth analysis module smooths the trajectory of the obtained pitch frequency, determines the pitch start points and the number of pitches, and performs pitch section length extension and repetition processing. The fifth analysis module determines each pitch section of the silent sections, unvoiced sections, and voiced sections on the basis of the determined pitch start points and number of pitches. The synthesis module performs speech synthesis by converting the speech rate so as to match the silent sections, unvoiced sections, and voiced sections obtained by the fifth analysis module and the extension ratios of the silent and voiced sections obtained from the speech rate parameter setting module. The output buffer module performs buffering that absorbs the portion of the synthesis module's output lengthened by speech expansion and preserves continuity. The output of the output buffer module is D/A-converted by the D/A conversion module and output as a speech signal whose speech rate has been controlled.

[0009]

[0010]

[0011]

[0012]

[0013]

[Embodiments] Embodiments of the present invention are described in detail below with reference to the drawings.

[0014] FIG. 1 is a functional block diagram of a speech rate control type hearing aid device. The device consists of a section division processing unit 1, a silent section extension processing unit 2, a pitch period extraction processing unit 3, a pitch section division processing unit 4, a pitch section length extension and repetition processing unit 5, a speech rate setting unit 6, and a synthesis unit 7.

[0015] The section division processing unit 1 performs section division processing that divides the input speech into silent sections, unvoiced sections, and voiced sections.

[0016] The silent section extension processing unit 2 extends the silent sections divided by the section division processing unit 1 at a ratio predetermined by the speech rate setting unit 6.

[0017] The pitch period extraction processing unit 3 extracts the pitch period from the voiced sections divided by the section division processing unit 1. The pitch section division processing unit 4 divides each voiced section into the pitch sections extracted by the pitch period extraction processing unit 3. The pitch section length extension and repetition processing unit 5 extends the length of each pitch section divided by the pitch section division processing unit 4 at a ratio predetermined by the speech rate setting unit 6.

[0018] The speech rate setting unit 6 sets the extension ratio of the silent sections and the extension ratio of the pitch sections of the voiced sections according to the speaker's speaking speed and the listener's listening ability.

[0019] The synthesis unit 7 resynthesizes, in the same order as in the input speech, the silent sections that have been extended, the unvoiced sections that have been left unprocessed, and the voiced sections whose pitch sections have been extended, and outputs the result as speech.

[0020] Next, the operation of the speech rate control type hearing aid device configured as above is described.

[0021] The section division processing unit 1 first divides the input speech into silent sections i, unvoiced sections ii, and voiced sections iii.

[0022] The silent section extension processing unit 2 then extends the silent sections i at the ratio set by the speech rate setting unit 6, thereby controlling the pauses in the speech.

[0023] For the voiced sections iii, the pitch period extraction processing unit 3 first extracts the pitch period of each voiced section iii, and the pitch section division processing unit 4 then divides the voiced section into pitch sections according to the pitch period extracted by the pitch period extraction processing unit 3. The pitch section length extension and repetition processing unit 5 then repeats the pitch sections divided by the pitch section division processing unit 4 the number of times predetermined by the speech rate setting unit 6, thereby extending the voiced section iii.
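As an illustration of the repetition step in paragraph [0023], the following is a minimal Python sketch rather than the patented implementation. It assumes the pitch sections are given as a sorted list of sample indices (`pitch_marks`, a hypothetical convention in which the last mark is the end of the voiced section) and that a fractional extension ratio is spread over the pitch sections with a running accumulator. The description itself only states that pitch sections are repeated a predetermined number of times.

```python
def stretch_voiced(samples, pitch_marks, ratio):
    """Lengthen a voiced section by repeating whole pitch sections.

    samples     : list of sample values for one voiced section
    pitch_marks : sorted indices where each pitch section starts; the last
                  entry marks the end of the section (assumed convention)
    ratio       : target extension ratio, e.g. 1.5 (assumed parameter)
    """
    out = []
    owed = 0.0  # copies of pitch sections still owed to the output
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        owed += ratio
        copies = int(owed)          # whole copies emitted for this section
        owed -= copies
        out.extend(samples[start:end] * copies)
    return out


if __name__ == "__main__":
    voiced = list(range(12))        # three pitch sections of four samples
    print(stretch_voiced(voiced, [0, 4, 8, 12], 1.5))
```

Because only whole pitch sections are copied, the waveform within each section is untouched, which is consistent with the stated aim of keeping the speaker's individuality.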

[0024] To preserve the speaker's individuality and phonemic characteristics, the unvoiced sections ii are not processed at all.

[0025] The synthesis unit 7 resynthesizes the silent sections i, unvoiced sections ii, and voiced sections iii processed in this way by the respective units, in the same order as in the input speech, and outputs the result as speech. The output speech thus has its speech rate converted according to the listener's listening ability.
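The order-preserving resynthesis described above can be sketched as follows. This is only an illustrative Python sketch under stated assumptions: sections arrive as `(kind, data)` tuples (a hypothetical representation), silent sections are lengthened by appending zero-valued samples, and every pitch section of a voiced section is repeated the same integer number of times. The description does not specify how the extra silence is generated.

```python
def resynthesize(sections, silence_ratio, repeats):
    """Concatenate processed sections in their original order.

    sections      : list of (kind, data); kind is "silent", "unvoiced" or
                    "voiced". For "voiced", data is a list of pitch sections
                    (each a list of samples); otherwise a flat sample list.
    silence_ratio : extension ratio for silent sections (>= 1.0, assumed)
    repeats       : how many times each pitch section is emitted (assumed)
    """
    out = []
    for kind, data in sections:
        if kind == "silent":
            out.extend(data)
            # Assumption: extra silence is zero padding.
            out.extend([0] * round(len(data) * (silence_ratio - 1.0)))
        elif kind == "voiced":
            for pitch_section in data:
                out.extend(pitch_section * repeats)
        else:
            # Unvoiced sections pass through unprocessed.
            out.extend(data)
    return out


if __name__ == "__main__":
    demo = [("silent", [0, 0]), ("unvoiced", [7, 8]), ("voiced", [[1, 2], [3, 4]])]
    print(resynthesize(demo, 2.0, 2))
```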

[0026] Next, a speech rate control type hearing aid device that is an embodiment of the present invention is described with reference to FIG. 2.

[0027] Ordinarily, the speech rate control type hearing aid device shown in FIG. 1 is implemented on a large computer. In that case the large amount of computation makes signal processing slow, so the device lacks real-time capability as well as portability and ease of operation. As a result, the duration of the input speech inevitably depends on the memory capacity and has to be cut off at a finite length.

[0028] In this embodiment, therefore, a plurality of ultra-compact transputer modules (ICs for parallel computation) are connected in cascade, and the signal processing algorithm is distributed optimally among the modules to speed up processing. This greatly shortens the processing time and achieves real-time operation. As a result, the device can process input speech that continues without limit, and it is also small, light, and highly portable. In addition, the user can vary the parameters required for speech rate conversion (the extension ratios of the silent sections i and voiced sections iii) by a simple manual operation and set the values that are optimal for that user, so the device is also easy to operate.

[0029] In the block diagram of FIG. 2, each block corresponds to one transputer module (TRPM), and the individual algorithms share the high-speed computation in cascade, each on its own module.

[0030] The speech rate control type hearing aid device of this embodiment consists of an A/D conversion module 11, an input buffer module 12, first to fifth analysis modules 13 to 17, a rotary encoder 18, a synthesis module 19, an output buffer module 20, and a D/A conversion module 21.

[0031] The A/D conversion module 11 A/D-converts the input speech signal with 16-bit quantization at a sampling rate of 48 kHz.

[0032] The input buffer module 12 takes in the A/D-converted time-series speech data sequentially and buffers it for frame-by-frame processing.
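The frame buffering of paragraph [0032] could be sketched in Python as a generator that collects incoming samples into fixed-length frames. The frame length is not stated in the description, so `frame_len` is an assumed parameter, and dropping a trailing partial frame is a simplification for this sketch.

```python
def frames(stream, frame_len):
    """Group a sample stream into consecutive frames of frame_len samples.

    A trailing partial frame is held back (here: dropped at end of stream).
    """
    buf = []
    for sample in stream:
        buf.append(sample)
        if len(buf) == frame_len:
            yield buf
            buf = []


if __name__ == "__main__":
    print(list(frames(range(7), 3)))
```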

[0033] The first analysis module 13, which is responsible for the section division processing, calculates the average power, the number of zero crossings, and the autocorrelation coefficient, and determines silent, unvoiced, and voiced frames i, ii, and iii by applying thresholds to these values.
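One way the three measures of paragraph [0033] might be combined is sketched below in Python. The decision order, the threshold values (`power_thr`, `zc_thr`, `ac_thr`), and the lag range are all assumptions made for illustration; the description only says that thresholds on average power, zero-crossing count, and autocorrelation coefficient are used.

```python
import math


def classify_frame(frame, power_thr=1e-4, zc_thr=0.3, ac_thr=0.5,
                   min_lag=20, max_lag=160):
    """Label one frame as "silent", "unvoiced" or "voiced".

    All thresholds and the lag range are hypothetical example values.
    """
    n = len(frame)
    power = sum(x * x for x in frame) / n            # average power
    if power < power_thr:
        return "silent"
    # Fraction of adjacent sample pairs whose sign differs.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    zc_rate = crossings / (n - 1)
    # Peak of the autocorrelation, normalized by the frame energy.
    energy = power * n
    peak = max(
        sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        for lag in range(min_lag, min(max_lag, n - 1) + 1)
    )
    if peak >= ac_thr and zc_rate <= zc_thr:
        return "voiced"
    return "unvoiced"


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 100 * i / 8000) for i in range(400)]
    print(classify_frame(tone))
```

A periodic, low-zero-crossing frame is treated as voiced, a frame with energy but frequent sign changes as unvoiced, and a frame with negligible power as silent.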

[0034] The second analysis module 14 performs decimation to speed up processing.
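A decimation step of this kind can be sketched in Python as follows. The decimation factor and the anti-aliasing filter are not given in the description; this sketch assumes a simple block average as a crude low-pass filter and a caller-chosen `factor`.

```python
def decimate(samples, factor):
    """Reduce the sample rate by `factor`.

    Each block of `factor` samples is averaged (a crude low-pass filter,
    assumed here) and replaced by one value; a trailing partial block is
    discarded.
    """
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]


if __name__ == "__main__":
    print(decimate([1, 1, 1, 3, 3, 3, 5], 3))
```

With fewer samples per frame, the autocorrelation in the next stage needs proportionally fewer multiply-add operations per lag, which is the kind of speed-up the paragraph refers to.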

[0035] The third analysis module 15, which is responsible for the pitch period extraction processing, obtains the autocorrelation coefficient of the decimated time-series speech data and uses it to calculate the pitch frequency of the speech for each frame.
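Autocorrelation-based pitch estimation, as named in paragraph [0035], can be sketched in Python like this. The search range (`f_min`, `f_max`) and the choice of the lag with the largest raw autocorrelation are assumptions of this sketch; the description does not give those details.

```python
import math


def pitch_frequency(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the pitch frequency (Hz) of one frame by autocorrelation.

    f_min and f_max bound the search and are hypothetical example values.
    """
    n = len(frame)
    lag_lo = int(sample_rate / f_max)
    lag_hi = min(int(sample_rate / f_min), n - 1)
    best_lag, best_val = lag_lo, float("-inf")
    for lag in range(lag_lo, lag_hi + 1):
        # Autocorrelation of the frame with itself shifted by `lag` samples.
        val = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 100 * i / 8000) for i in range(400)]
    print(pitch_frequency(tone, 8000))
```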

[0036] The fourth analysis module 16, which is responsible for the pitch section length extension and repetition processing, smooths the pitch frequency trajectory and determines the pitch start points and the number of pitches.
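The description does not name the smoothing method used on the pitch frequency trajectory. As one common choice, the Python sketch below applies a short median filter, which removes isolated outliers such as octave errors; the filter type and `width` are assumptions.

```python
from statistics import median


def smooth_pitch_track(track, width=3):
    """Median-smooth a per-frame pitch track; edge values are left as-is."""
    half = width // 2
    out = list(track)
    for i in range(half, len(track) - half):
        out[i] = median(track[i - half:i + half + 1])
    return out


if __name__ == "__main__":
    # The 200 Hz value is an isolated outlier between ~100 Hz neighbours.
    print(smooth_pitch_track([100, 102, 200, 104, 106]))
```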

[0037] The fifth analysis module 17 makes fine adjustments, such as final error correction, to the parameters obtained in the preceding blocks, and determines the silent sections i, unvoiced sections ii, voiced sections iii, and each pitch section.

[0038] The rotary encoder 18 quantizes to 8 bits, as angle information, the parameters required for speech rate conversion (the extension ratios of the silent and voiced sections) that are supplied by manual operation.
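How an 8-bit encoder code might be turned into an extension ratio can be sketched in Python as below. The linear mapping and the ratio range of 1.0 to 2.0 are assumptions for illustration; the description states only that the angle information is quantized to 8 bits.

```python
def code_to_ratio(code, min_ratio=1.0, max_ratio=2.0):
    """Map an 8-bit encoder code (0..255) linearly to an extension ratio.

    min_ratio and max_ratio are hypothetical limits, not from the source.
    """
    if not 0 <= code <= 255:
        raise ValueError("code must fit in 8 bits")
    return min_ratio + (max_ratio - min_ratio) * code / 255


if __name__ == "__main__":
    print(code_to_ratio(0), code_to_ratio(255))
```

With 256 steps over this assumed range, one step changes the ratio by about 0.004.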

[0039] The synthesis module 19, which is responsible for the synthesis processing, performs speech synthesis by converting the speech rate so as to match the silent sections i, unvoiced sections ii, and voiced sections iii obtained from the fifth analysis module 17 and the extension ratios of the silent and voiced sections obtained from the rotary encoder 18.

[0040] Because the synthesized speech has been lengthened, the output buffer module 20 performs buffering to absorb the extended portion.
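The role of such an output buffer can be illustrated with a small first-in, first-out queue in Python. The class name, the zero padding on underrun, and the fixed-size reads are assumptions of this sketch, not details from the description.

```python
from collections import deque


class OutputBuffer:
    """FIFO between a synthesizer that writes variable-length blocks and a
    D/A stage that reads fixed-size blocks (illustrative sketch only)."""

    def __init__(self):
        self._fifo = deque()

    def push(self, block):
        """Append a synthesized block of any length."""
        self._fifo.extend(block)

    def pull(self, n):
        """Return exactly n samples, padding with zeros if fewer are queued."""
        out = [self._fifo.popleft() for _ in range(min(n, len(self._fifo)))]
        return out + [0] * (n - len(out))

    def backlog(self):
        """Number of samples waiting to be played."""
        return len(self._fifo)


if __name__ == "__main__":
    buf = OutputBuffer()
    buf.push([1, 2, 3, 4, 5, 6])
    print(buf.pull(4), buf.backlog(), buf.pull(4))
```

The backlog grows while lengthened speech is being written faster than it is read, which is the extended portion the buffer has to absorb.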

[0041] The D/A conversion module 21 D/A-converts the synthesized speech and outputs it as a speech signal.

[0042] Thus the speech rate control type hearing aid device of this embodiment performs section division processing that divides the input speech into silent sections, unvoiced sections, and voiced sections. It extends the silent sections at a predetermined ratio and leaves the unvoiced sections unprocessed. For the voiced sections it extracts the pitch period, divides each section into pitch sections, and extends the length of each pitch section at a predetermined ratio. It then resynthesizes the extended silent sections, the unprocessed unvoiced sections, and the voiced sections with extended pitch sections in the same order as in the input speech, so that the speech can be converted in near real time to a speech rate suited to the listener's listening ability and output.

[0043]

[Effects of the Invention] As described above, the present invention converts the speech rate with high quality while preserving the speaker's individuality and phonemic characteristics, and lets the listener fit the speech rate to his or her own listening ability by his or her own operation, so that speech is converted in near real time to the speech rate that is optimal for that listener. The invention can therefore compensate for the listening ability of a listener whose critical speech discrimination speed has declined because of aging or some impairment, and lets that listener hear speech at an optimal speed in real time.

[Brief Description of the Drawings]

[FIG. 1] A functional block diagram of a speech rate control type hearing aid device.

[FIG. 2] A functional block diagram of a speech rate control type hearing aid device that is an embodiment of the present invention.

[Explanation of Reference Signs]

1 section division processing unit
2 silent section extension processing unit
3 pitch period extraction processing unit
4 pitch section division processing unit
5 pitch section length extension and repetition processing unit
6 speech rate setting unit
7 synthesis unit
11 A/D conversion module
12 input buffer module
13 first analysis module
14 second analysis module
15 third analysis module
16 fourth analysis module
17 fifth analysis module
18 rotary encoder
19 synthesis module
20 output buffer module
21 D/A conversion module

Continuation of front page
(72) Inventor: Eiichi Miyasaka, c/o Broadcasting Technology Research Laboratories, Japan Broadcasting Corporation, 1-10-11 Kinuta, Setagaya-ku, Tokyo
(56) References: JP-A-1-93795 (JP, A); JP-A-59-82608 (JP, A); JP-A-64-44995 (JP, A); JP-A-5-73089 (JP, A)
(58) Fields searched (Int. Cl.7, DB name): G10L 21/04; A61F 11/00; H03M 1/00; H04R 25/00

Claims

(57) [Claims]

[Claim 1] A speech rate control type hearing aid device characterized in that each of the modules (1) to (10) below is implemented as a transputer module consisting of an IC for parallel computation:
(1) an A/D conversion module that A/D-converts an input speech signal to generate time-series speech data;
(2) an input buffer module that takes in the A/D-converted time-series speech data and buffers it for frame-by-frame processing;
(3) a first analysis module that executes section division processing, in which it calculates the average power, the number of zero crossings, and the autocorrelation coefficient of the time-series speech data taken in through the input buffer module and determines silent frames, unvoiced frames, and voiced frames by applying thresholds to these values;
(4) a second analysis module that decimates the section-divided time-series speech data to speed up processing;
(5) a third analysis module that obtains the autocorrelation coefficient of the decimated time-series speech data and obtains the pitch frequency of the speech for each frame;
(6) a fourth analysis module that smooths the trajectory of the obtained pitch frequency, determines the pitch start points and the number of pitches, and performs pitch section length extension and repetition processing;
(7) a fifth analysis module that determines each pitch section of the silent sections, unvoiced sections, and voiced sections on the basis of the determined pitch start points and number of pitches;
(8) a synthesis module that performs speech synthesis by converting the speech rate so as to match the silent sections, unvoiced sections, and voiced sections obtained by the fifth analysis module and the extension ratios of the silent and voiced sections obtained from a speech rate parameter setting means;
(9) an output buffer module that performs buffering to preserve continuity in the speech synthesis output of the synthesis module;
(10) a D/A conversion module that D/A-converts the output of the output buffer module and outputs it as a speech signal whose speech rate has been controlled.