JPS6136796A

JPS6136796A - Pronunciation dictionary editor

Info

Publication number: JPS6136796A
Application number: JP16002484A
Authority: JP
Inventors: 神山　ふかみ
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1984-07-30
Filing date: 1984-07-30
Publication date: 1986-02-21

Abstract

(57) [Abstract] This publication consists of application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to an editing device for a pronunciation dictionary used in generating synthesized speech.

[Conventional technology]

Synthesized speech is used for voice response, automatic announcements, reading text aloud, and similar applications. As outlined in Fig. 2, a synthesized-speech generator has a pronunciation dictionary 10. When it receives a mixed kanji-kana sentence expressed in character codes, it searches the pronunciation dictionary with the input sentence to obtain a kana sentence that also indicates accent positions, passes that kana sentence to the speech synthesis unit 12, obtains the corresponding speech signal from the synthesis unit, and outputs it from the speaker 14. The pronunciation dictionary is kanji-based: each kanji entry representing a word or the like carries its reading (close to, but not necessarily identical to, its furigana), its accent position, and grammatical information such as part of speech.
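The entry layout described above might be modelled as follows. This is a minimal sketch rather than the patent's implementation: the `Entry` class, its field names, the `lookup_by_kanji` helper, and the accent values are all assumptions made for illustration, since the source gives neither a storage format nor accent data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    kanji: str    # headword as written, e.g. "科学"
    reading: str  # katakana reading passed on to the synthesizer
    accent: int   # accent position (placeholder values below)
    pos: str      # part of speech


# Kanji-keyed dictionary: the headword itself is the lookup key.
# Accent positions here are placeholders, not data from the source.
DICTIONARY = {
    entry.kanji: entry
    for entry in [
        Entry("科学", "カガク", 1, "noun"),
        Entry("化学", "カガク", 1, "noun"),
    ]
}


def lookup_by_kanji(kanji: str) -> Optional[Entry]:
    """Return the entry for a kanji headword, or None if it is not registered."""
    return DICTIONARY.get(kanji)
```

With this layout, finding an entry requires knowing its kanji headword; the reading is only a stored attribute, not a key.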

Some kanji words are written differently but pronounced identically. For example, 市立 (municipal) and 私立 (private) are both read シリツ (shiritsu) and cannot be told apart by sound. The same is true of 化学 (chemistry) and 科学 (science). Because such words are confusing when text is read aloud, measures are taken such as changing the reading of 市立 to イチリツ (ichiritsu) to distinguish it from 私立 (シリツ), or changing the reading of 化学 to バケガク (bakegaku) to distinguish it from 科学 (カガク). In addition, kanji words that are expected to be used but are not yet in the dictionary are registered, and entries that are in the dictionary but are unnecessary are deleted. Processing of this kind is what is meant here by editing.
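The editing operations named above (register, delete, change a reading) can be sketched on a plain kanji-keyed mapping. The function names, the dictionary layout, and the accent value `0` are hypothetical and chosen only for this illustration; the patent does not specify them.

```python
def register(dictionary, kanji, reading, accent, pos):
    """Add a word that is expected to be used but is not yet registered."""
    if kanji in dictionary:
        raise KeyError(f"{kanji} is already registered")
    dictionary[kanji] = {"reading": reading, "accent": accent, "pos": pos}


def delete(dictionary, kanji):
    """Remove an entry that is no longer needed."""
    del dictionary[kanji]


def change_reading(dictionary, kanji, new_reading):
    """Replace the stored reading, e.g. to separate two homophones."""
    dictionary[kanji]["reading"] = new_reading


def confusable_groups(dictionary):
    """Group headwords by reading; keep readings shared by several words."""
    groups = {}
    for kanji, entry in dictionary.items():
        groups.setdefault(entry["reading"], []).append(kanji)
    return {reading: words for reading, words in groups.items() if len(words) > 1}


# Accent 0 is a placeholder, not data from the source.
words = {}
register(words, "市立", "シリツ", 0, "noun")
register(words, "私立", "シリツ", 0, "noun")
before = confusable_groups(words)      # both words share the reading シリツ
change_reading(words, "市立", "イチリツ")
after = confusable_groups(words)       # no shared readings remain
```

After the reading of 市立 is changed to イチリツ, the two words no longer collide when read aloud, which is the kind of correction the text describes.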

[Problems to be solved by the invention]

Because the pronunciation dictionary is kanji-based, conventional editing devices for it are configured to look the dictionary up by kanji. That is, a kanji code such as a JEF code is entered from the keyboard and used to look up the dictionary; the corresponding kanji, its reading, accent position, part of speech, and so on are obtained and shown on a CRT display. The operator examines these, judges whether they are appropriate, and corrects them if they are not. With this method, however, a code must be entered for every kanji (making up a word or the like), which is quite cumbersome when the dictionary holds a large number of kanji.

Many kanji words have the same or nearly the same reading. The カガク and シリツ examples above are cases in point. To give a few more: 威光, 移行, 移項, 偉功, 意向, and 遺稿 are all read イコウ (ikou), and 完工, 勘考, 敢行, 感光, 慣行, 観光, 緘口, and 還幸 are all read カンコウ (kankou). If the dictionary could be searched by kana, as in a Japanese-language dictionary rather than a kanji dictionary, entering just イコウ or カンコウ would retrieve every one of these words, with no need to enter their kanji codes one at a time.

The present invention focuses on this point and aims to make the pronunciation dictionary searchable by kana in the pronunciation dictionary editing device. The dictionary itself must of course remain kanji-based, because it is built into the speech synthesizer and must take mixed kanji-kana sentences as input and output the corresponding speech, and because without the kanji the meaning is not fixed and the accent position may be undetermined. The dictionary is therefore left unchanged, and the editing device must be able to treat it, in effect, as if it were kana-based.
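One way to treat an unchanged kanji-based dictionary "as if it were kana-based" is to derive a reading-to-headword index on the editor side. The sketch below assumes this approach; the patent does not say how the equivalence is achieved, and `KANJI_DICTIONARY`, `build_reading_index`, and the simplified entries (reading only, no accent or part of speech) are illustrative assumptions.

```python
from collections import defaultdict

# Kanji-keyed dictionary, left untouched. Each value is reduced to the
# reading alone to keep the example short.
KANJI_DICTIONARY = {
    "威光": "イコウ",
    "移行": "イコウ",
    "意向": "イコウ",
    "完工": "カンコウ",
    "観光": "カンコウ",
}


def build_reading_index(kanji_dictionary):
    """Map each reading to the list of headwords that carry it."""
    index = defaultdict(list)
    for kanji, reading in kanji_dictionary.items():
        index[reading].append(kanji)
    return dict(index)


READING_INDEX = build_reading_index(KANJI_DICTIONARY)
```

Looking up `READING_INDEX["イコウ"]` then returns all three headwords at once, while the synthesizer can keep using `KANJI_DICTIONARY` exactly as before.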

Furthermore, with kana-based lookup, some people write a long vowel such as the コウ in イコウ or カンコウ as コー, and it would be inconvenient if a search failed whenever the input was not the exact furigana. The present invention is intended to address this problem as well.

[Means and operation for solving the problems]

The present invention is an editing device that adds, deletes, and changes the entries of a pronunciation dictionary used to output speech corresponding to an input mixed kanji-kana sentence. It is characterized by comprising means that receives the reading of a kanji entered in kana or romaji, judges the similarity of readings, and searches the pronunciation dictionary, and a display that shows the dictionary contents of all the kanji found by the search means.

Fig. 1 shows the basic configuration of the present invention: 10 is the pronunciation dictionary described above, 16 is a keyboard on which kana and/or romaji can be entered, 18 is the reading-similarity judgment unit, 20 is the search means for the pronunciation dictionary, 22 is a display such as a CRT, and 26 is the overall control unit. When a word is entered from the keyboard 16 in kana or romaji, all the kanji expressed by that kana or romaji are output. For example, entering セーシ or SEISHI outputs the kanji 世子, 生死, 正史, 正使, 正視, and so on. Entering セイシ outputs the same kanji. In other words, the device does not output kanji for one prescribed furigana; when a reading or pronunciation is entered in kana or romaji, it outputs every kanji word that is read or pronounced that way.

There are many cases in which the イ of the furigana becomes a long vowel or エ in pronunciation. For example, 経営 (ケイエイ) can be written ケーエー, ケイエー, ケエエー, and so on. Whichever of these is entered, they are treated as identical and the kanji words with that reading are output.

[Input/output table not reproduced in the source text.]

To give further examples of the input/output behavior, entering シリツ outputs kanji such as 市立 and 私立, and entering カガク outputs kanji such as 下学, 化学, 科学, and 歌学.
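Since the keyboard accepts romaji as well as kana, a romaji query such as SEISHI has to be brought into kana before the dictionary is searched. The converter below is a minimal sketch under stated assumptions: the patent does not describe a conversion method, the Hepburn-style table is deliberately partial (no doubled consonants, no apostrophe handling for ん), and the names `ROWS`, `TABLE`, and `romaji_to_kana` are hypothetical.

```python
# Basic syllable rows, keyed by their leading consonant.
ROWS = {
    "": "アイウエオ", "k": "カキクケコ", "s": "サシスセソ", "t": "タチツテト",
    "n": "ナニヌネノ", "h": "ハヒフヘホ", "m": "マミムメモ", "r": "ラリルレロ",
    "g": "ガギグゲゴ", "z": "ザジズゼゾ", "d": "ダヂヅデド", "b": "バビブベボ",
    "p": "パピプペポ",
}
TABLE = {c + v: row[i] for c, row in ROWS.items() for i, v in enumerate("aiueo")}

# Hepburn spellings and the remaining single kana.
TABLE.update({
    "shi": "シ", "chi": "チ", "tsu": "ツ", "fu": "フ", "ji": "ジ",
    "ya": "ヤ", "yu": "ユ", "yo": "ヨ", "wa": "ワ", "n": "ン",
})

# Contracted sounds such as kyo -> キョ and sho -> ショ.
CONTRACTED = {"ky": "キ", "sh": "シ", "ch": "チ", "ny": "ニ", "hy": "ヒ",
              "my": "ミ", "ry": "リ", "gy": "ギ", "j": "ジ", "by": "ビ", "py": "ピ"}
for prefix, kana in CONTRACTED.items():
    for vowel, small in zip("auo", "ャュョ"):
        TABLE[prefix + vowel] = kana + small


def romaji_to_kana(romaji: str) -> str:
    """Convert romaji to katakana by greedy longest-match lookup."""
    text = romaji.lower()
    out = []
    i = 0
    while i < len(text):
        for size in (3, 2, 1):  # try the longest spelling first
            chunk = text[i:i + size]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"cannot convert {text[i:]!r}")
    return "".join(out)
```

For instance, `romaji_to_kana("SEISHI")` yields セイシ, which can then go through the same similarity judgment as a query typed directly in kana.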

セイシ and セーシ are regarded as identical; トウキョウ, トオキョウ, and トーキョウ are all the same (東京); and トウカ and トオカ are both the same, yielding 十日, 等価, 灯下, 灯火, and so on. The following are likewise regarded as identical.

Devoiced and non-devoiced sounds: シ and ス sandwiched between other syllables may be devoiced. For example, the シ in 常識 is devoiced, but it is regarded as identical to the non-devoiced sound.

Nasalized and non-nasalized g sounds: ガ-row sounds that appear anywhere other than the start of a phrase become nasal. For example, the ガ in 音楽, the ギ in 白菊, and the グ in 道具 are nasalized, but they are treated as identical to the non-nasalized sounds, that is, to the ガ in 学校, the ギ in 義士, and the グ in 群雄.

A long vowel is regarded as a continuation of the preceding syllable's vowel: a long sound, that is, a prolonged sound such as サー or クー, is treated as a repetition of the preceding syllable's vowel (here ア and ウ) and can therefore be written サア and クウ.

Euphonic changes (onbin) are regarded as identical to the sound before the change: some syllables that follow another syllable change to イ, ウ, ン, or ツ (onbin), and these are treated as the same as the syllable before the change. For example, the い in 掻い繕う is treated as the き in 掻き繕う.

Differences in accent are ignored: because this device takes a reading or pronunciation as input, accent could also be an issue, but since the input means is a keyboard rather than voice, accent is ignored (with text input such differences are generally ignored automatically).
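Of the equivalence rules above, the long-vowel ones lend themselves to a small normalization step that maps every accepted spelling to one canonical form before comparison. The sketch below is one possible reading of those rules, not the patent's circuit or algorithm: `normalize_reading` and the vowel table are assumptions, and the devoicing, nasalization, onbin, and accent rules are left out (ordinary kana input is assumed not to mark the first two, and onbin would need word-level knowledge).

```python
# Katakana grouped by the vowel they end in; ン and ッ carry no vowel.
VOWEL_ROWS = {
    "ア": "アカサタナハマヤラワガザダバパァャ",
    "イ": "イキシチニヒミリギジヂビピィ",
    "ウ": "ウクスツヌフムユルグズヅブプゥュ",
    "エ": "エケセテネヘメレゲゼデベペェ",
    "オ": "オコソトノホモヨロヲゴゾドボポォョ",
}
VOWEL_OF = {kana: vowel for vowel, row in VOWEL_ROWS.items() for kana in row}


def normalize_reading(kana: str) -> str:
    """Rewrite long-vowel spellings as a repeated vowel of the previous syllable."""
    out = []
    for ch in kana:
        prev_vowel = VOWEL_OF.get(out[-1]) if out else None
        if ch == "ー" and prev_vowel is not None:
            ch = prev_vowel          # サー -> サア, クー -> クウ
        elif ch == "イ" and prev_vowel == "エ":
            ch = "エ"                # セイ -> セエ
        elif ch == "ウ" and prev_vowel == "オ":
            ch = "オ"                # トウ -> トオ
        out.append(ch)
    return "".join(out)
```

Two readings are then judged similar when their normalized forms are equal, so セイシ and セーシ compare equal, as do トウキョウ, トオキョウ, and トーキョウ. The rule is intentionally broad: it also merges spellings where イ or ウ is a separate syllable rather than a lengthening, which only widens the candidate list shown to the operator.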

Unit 18 judges the similarity of readings using this logic. The search means 20 accesses the pronunciation dictionary 10, and the read-out output, that is, each kanji's accented reading (pronunciation), part of speech, and so on, is sent to the CRT output section 28. The operator views the CRT screen 22 and makes the corrections described above.

[Effects of the invention]

With this pronunciation dictionary editing device, entering a reading or pronunciation outputs (displays all at once) the pronunciation dictionary contents of every kanji word that has that reading or pronunciation. There is therefore no need to enter a code for each kanji word. Because the entries are displayed together, kanji words that are confusing when read aloud (same reading but different meaning, and hard to tell apart even from context) can easily be found and changed as appropriate, which is highly effective. In addition, because a reading is entered, access does not fail merely because the input departs from the prescribed spelling, as it would with furigana input.

[Brief description of the drawings]

Fig. 1 is a block diagram outlining the device of the present invention, and Fig. 2 is a block diagram outlining a speech synthesizer. In the drawings, 10 is the pronunciation dictionary, 12 is the speech synthesis unit, 14 is the speaker, 16 is the keyboard, 18 is the reading-similarity judgment unit, and 20 is the search unit.

Claims

[Claims] A pronunciation dictionary editing device that adds, deletes, and changes the entries of a pronunciation dictionary used to output speech corresponding to an input mixed kanji-kana sentence, characterized by comprising: means that receives the reading of a kanji entered in kana or romaji, judges the similarity of readings, and searches the pronunciation dictionary; and a display that shows the dictionary contents of all the kanji found by the search means.