JP2856769B2 - Speech synthesizer

Info

Publication number: JP2856769B2
Application number: JP1148996A
Authority: JP
Inventors: 義幸原
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1989-06-12
Filing date: 1989-06-12
Publication date: 1999-02-10
Anticipated expiration: 2014-02-10
Also published as: JPH0313999A

Description

DETAILED DESCRIPTION OF THE INVENTION [Object of the Invention] (Industrial application field) The present invention relates to a speech synthesizer capable of synthesizing natural-sounding speech from a coded digit string.

(Prior Art) In general, when a rule-based speech synthesizer outputs a digit string such as "3561", the string is read with place values when it represents a price, as in "sanzen / gohyaku / rokujū / ichi-en". When it represents a telephone number, however, it is preferable to read it without place values, as in "san / gō / roku / ichi". Conventionally, when a digit string is read without place values, an accent break has been inserted after every single digit, and a pause has additionally been inserted at each accent break to make the output easier to hear.

However, when people actually say numbers aloud, they rarely separate the digits one by one; more often they utter them continuously.

(Problems to be Solved by the Invention) As described above, when a conventional speech synthesizer outputs a coded digit string without place-value reading, it utters each digit separately, and the result sounds unnatural.

Accordingly, an object of the present invention is to provide a speech synthesizer that can synthesize digit strings, telephone numbers in particular, with good naturalness by treating every two digits from the beginning of the input digit string as one accent phrase.

[Structure of the Invention] (Means for Solving the Problems) The speech synthesizer of the present invention outputs, as speech, a digit string supplied in coded form without place-value reading. It comprises prosody parameter generation means for generating prosody parameters with an accent boundary set every two digits from the beginning of the input digit string, and means for generating and outputting synthesized speech based on the prosody parameters generated by the prosody parameter generation means.

(Operation) According to the speech synthesizer of the present invention, an accent boundary is set every two digits from the beginning of the coded digit string, and an accent is assigned to each two-digit group or to the single remaining digit, so that synthesized speech for the digit string can be generated with good naturalness.
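
As a minimal sketch of the boundary rule described above, the following Python function returns the positions at which accent boundaries would fall when a digit string is cut every two digits from its beginning. The function name `accent_boundaries` is hypothetical and is not part of the disclosure.

```python
def accent_boundaries(digits: str) -> list[int]:
    """Return the indices (counted from the start of the string) at which
    an accent boundary falls when the string is cut every two digits.

    Hypothetical helper for illustration only.
    """
    if not digits.isdigit():
        raise ValueError("expected a non-empty string consisting only of digits")
    # A boundary lies after every second digit, but not at the very end.
    return list(range(2, len(digits), 2))
```

For a five-digit input such as "75387", the boundaries fall after the second and fourth digits, which leaves a single digit as the last accent phrase.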

(Embodiment) Hereinafter, an embodiment of the present invention will be described with reference to the drawings.

In FIG. 1, reference numeral 1 denotes an input unit, to which coded character strings, digit strings, and the like are input. Reference numeral 2 denotes an accent testing unit, which collates the code supplied through the input unit 1 with an accent dictionary 3 to obtain the accent type, reading, and part-of-speech information for the input. When the part-of-speech information for the input is "number", indicating a digit, every two digits are treated as one accent phrase.

The accent dictionary 3 registers the accent type, reading, part of speech, and so on for each word. The synthesis parameter generation unit 4 generates prosody parameters based on the accent type supplied from the accent testing unit 2, generates phoneme parameters based on the reading by referring to the speech unit file 5, and sends both to the synthesis unit 6. The speech unit file 5 stores parameters obtained by analyzing each single syllable.
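
The dictionary lookup can be pictured with a small data structure. The sketch below is illustrative only: the contents of FIG. 2 are not reproduced in this text, so the class `AccentEntry`, its field names, the romanized readings for 0, 4, 7, 8 and 9, and the helper names are all assumptions. The readings for 1, 2, 3, 5 and 6, the accent type "1" for a single digit, and the combination accent value "1" follow the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccentEntry:
    reading: str             # romanized reading (two morae per digit)
    accent_type: int         # accent type of the word on its own
    part_of_speech: str      # "number" for digits
    combination_accent: int  # digit-combination accent information


# Hypothetical stand-in for the accent dictionary 3 (digits only).
_READINGS = {
    "0": "zero", "1": "ichi", "2": "nii", "3": "san", "4": "yon",
    "5": "goo", "6": "roku", "7": "nana", "8": "hachi", "9": "kyuu",
}
ACCENT_DICTIONARY = {
    digit: AccentEntry(reading, 1, "number", 1)
    for digit, reading in _READINGS.items()
}


def lookup(text: str) -> list[AccentEntry]:
    """Collate each input character with the dictionary (KeyError if absent)."""
    return [ACCENT_DICTIONARY[ch] for ch in text]


def is_number_string(text: str) -> bool:
    """True when every character has the part of speech "number"."""
    entries = lookup(text)
    return bool(entries) and all(e.part_of_speech == "number" for e in entries)
```

A real dictionary would also hold ordinary words; only the digit entries are shown here because they are the ones the two-digit rule depends on.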

The synthesis unit 6 generates synthesized speech based on the prosody parameters and phoneme parameters sent from the synthesis parameter generation unit 4, and outputs it.

Next, the case where a digit string such as "75387" is input to this configuration will be described. The digit string "75387" is sent to the accent testing unit 2 via the input unit 1. The accent testing unit 2 collates it with the accent dictionary 3 shown in FIG. 2 and, because the part of speech of every character is "number", divides it into "75", "38", and "7". The accent type and reading of "75" are determined using the digit-combination accent information of "5" (type 3), and those of "38" are determined using the digit-combination accent information of "8". This information is sent to the synthesis parameter generation unit 4, which refers to the speech unit file 5 and generates the corresponding phoneme parameters and prosody parameters, each of which is sent to the synthesis unit 6. The synthesis unit 6 generates synthesized speech based on these synthesis parameters and outputs it.

Furthermore, every digit is read with two morae ("2" is read "nī" and "5" is read "gō"), so treating two digits as one accent phrase gives four morae. Since the digit-combination accent information is "1" for every digit, the accent type is "type 3" for any combination of digits.
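
The arithmetic behind "type 3" can be written out as a short sketch. The text states the inputs (two morae per digit, combination accent value "1") and the result (type 3) but not the formula itself. The sketch assumes that the accent type of a two-digit phrase is the mora count of the first digit plus the combination accent value of the second digit; this rule and the names below are assumptions made for illustration.

```python
MORAE_PER_DIGIT = 2           # every digit reading has two morae
DIGIT_COMBINATION_ACCENT = 1  # the same value for every digit


def phrase_morae(n_digits: int) -> int:
    """Total morae in an accent phrase made of n_digits digits."""
    return n_digits * MORAE_PER_DIGIT


def pair_accent_type(first_morae: int = MORAE_PER_DIGIT,
                     second_combination: int = DIGIT_COMBINATION_ACCENT) -> int:
    """Accent type of a two-digit phrase under the ASSUMED rule:
    morae of the first digit + combination accent value of the second."""
    return first_morae + second_combination
```

Under this assumption a two-digit phrase has `phrase_morae(2) == 4` morae and `pair_accent_type() == 3`, and because both inputs are the same for every digit, the result does not depend on which digits are paired.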

If a single digit remains, its accent type is set to "1". These rules can also be implemented, for example, by the numeric accent testing unit 7 shown by a broken line in FIG. 1. In this case, the input unit 1 detects the digits and sends them to the numeric accent testing unit 7. The numeric accent testing unit 7 divides the digit string into groups of two digits from the beginning and treats each two-digit group as one accent phrase. It sets the accent type of each two-digit group to "3" and, if a single digit remains, sets the accent type of that digit to "1". It then sends the readings and accent types of the digits to the synthesis parameter generation unit 4.
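
The behavior described for the numeric accent testing unit 7 can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the function name `numeric_accent_test`, the hyphen-joined romanized output format, and the readings for 0, 4, 7, 8 and 9 are hypothetical, while the two-digit grouping and the accent types "3" and "1" follow the description.

```python
# Romanized two-mora readings; entries for 0, 4, 7, 8 and 9 are assumed.
READINGS = {
    "0": "zero", "1": "ichi", "2": "nii", "3": "san", "4": "yon",
    "5": "goo", "6": "roku", "7": "nana", "8": "hachi", "9": "kyuu",
}
PAIR_ACCENT_TYPE = 3    # accent type of a two-digit accent phrase
SINGLE_ACCENT_TYPE = 1  # accent type of a single remaining digit


def numeric_accent_test(digits: str) -> list[tuple[str, int]]:
    """Split a digit string into two-digit accent phrases from the start and
    return (reading, accent type) for each phrase."""
    phrases = []
    for start in range(0, len(digits), 2):
        group = digits[start:start + 2]
        # KeyError here means the input contained a non-digit character.
        reading = "-".join(READINGS[d] for d in group)
        accent = PAIR_ACCENT_TYPE if len(group) == 2 else SINGLE_ACCENT_TYPE
        phrases.append((reading, accent))
    return phrases


print(numeric_accent_test("75387"))
# [('nana-goo', 3), ('san-hachi', 3), ('nana', 1)]
```

The returned pairs correspond to what would be handed to the synthesis parameter generation unit 4: one reading and one accent type per accent phrase.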

Thus, when a digit string such as a telephone number is output as speech, setting an accent boundary every two digits from the beginning of the string makes natural utterance of the digit string possible.

The present invention is not limited to the embodiment described above. For example, the schematic configuration of the speech synthesizer shown in FIG. 1 is merely one example, and the contents of the accent dictionary in FIG. 2 are likewise not limited to those of the embodiment. In addition, the part of speech that represents a digit need not be "number". In short, the present invention can be implemented with various modifications without departing from its gist.

[Effects of the Invention] As described in detail above, according to the present invention, an accent boundary is set every two digits from the beginning of the input digit string, and an accent is assigned to each two-digit group or to the single remaining digit. This makes it possible to provide a speech synthesizer whose output sounds as natural as human speech.

[Brief description of the drawings]

FIG. 1 is a schematic configuration diagram of a speech synthesizer according to one embodiment of the present invention, and FIG. 2 is a diagram showing the contents of the accent dictionary in the embodiment. 1: input unit, 2: accent testing unit, 3: accent dictionary, 4: synthesis parameter generation unit, 5: speech unit file, 6: synthesis unit, 7: numeric accent testing unit.

Claims

(57) [Claims]

1. A speech synthesizer that outputs, as speech, a digit string supplied in coded form without place-value reading, the speech synthesizer comprising: prosody parameter generation means for generating prosody parameters with an accent boundary set every two digits from the beginning of the input digit string; and means for generating and outputting synthesized speech based on the prosody parameters generated by the prosody parameter generation means.

2. The speech synthesizer according to claim 1, wherein the prosody parameter generation means generates the prosody parameters by setting the accent type of each two-digit segment to "type 3".