JP2793992B2

JP2793992B2 - Homonym recognition device

Info

Publication number: JP2793992B2
Application number: JP3089507A
Authority: JP
Inventors: 久夫中村; 力高橋; 元彦長谷川
Original assignee: ENU TEI TEI ADOBANSU TEKUNOROJI KK
Current assignee: ENU TEI TEI ADOBANSU TEKUNOROJI KK
Priority date: 1991-03-27
Filing date: 1991-03-27
Publication date: 1998-09-03
Anticipated expiration: 2013-09-03
Also published as: JPH05120331A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Industrial Field of Application] The present invention is used in Japanese word processors, computer devices incorporating a Japanese word processor, printing devices, Japanese speech recognition devices, automatic translation devices that translate Japanese into other languages, and other devices that recognize Japanese. The present invention relates to a device that recognizes Japanese homonyms according to their context within a sentence.

[0002]

[Prior Art] The applicant has filed, as Japanese Patent Application No. Hei 1-313045, a technique for narrowing down homonymous kanji strings in a kana-kanji conversion device. When the input text that contains the character string to be converted into kanji includes a predicate, the technique uses the concept of a case frame to narrow down the homonyms that are candidates for conversion, which raises the efficiency of homonym selection.

[0003] That invention introduces the concept of a "case frame" or "extended case frame" into the kana-kanji conversion dictionary. It provides a predicate dictionary that describes case frames and a term dictionary that describes semantic features, extracts the predicate from the input character string, and narrows down homonyms by matching each noun phrase in the sentence against the case positions of the predicate's case frame.
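The narrowing step described in this paragraph can be sketched roughly as follows. This is a minimal illustration, not the dictionary layout of the prior application: the names `CaseSlot`, `PREDICATES`, `TERMS` and `narrow_homonyms`, as well as the semantic features assigned to each word, are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseSlot:
    surface_case: str    # particle marking the slot, e.g. "が" or "を"
    features: frozenset  # semantic features accepted in this slot


# Hypothetical predicate dictionary: predicate -> case frame (list of slots).
PREDICATES = {
    "生える": [CaseSlot("が", frozenset({"PLNT", "PART"}))],
}

# Hypothetical term dictionary: reading -> {kanji string: semantic features}.
TERMS = {
    "は": {"歯": {"PART"}, "葉": {"PLNT", "PART"}, "刃": {"INST"}},
}


def narrow_homonyms(reading, particle, predicate):
    """Return the homonyms of `reading` whose features fit the predicate's slot."""
    candidates = TERMS.get(reading, {})
    for slot in PREDICATES.get(predicate, []):
        if slot.surface_case == particle:
            # Keep only terms sharing at least one feature with the slot.
            return [k for k, feats in candidates.items() if feats & slot.features]
    # No slot for this particle: nothing can be filtered out.
    return list(candidates)


print(narrow_homonyms("は", "が", "生える"))
```

With these invented entries, the reading "は" followed by "が生える" keeps 歯 and 葉 and drops 刃, because only the first two share a feature with the slot.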

[0004]

[Problem to Be Solved by the Invention] However, even with the invention of this prior application, the homonym recognition rate falls for unknown words that are not registered in the homonym analysis dictionary, and in some cases homonyms cannot be recognized even with the extended case frame rules.

[0005] The present invention solves the above problems. Its object is to provide a homonym recognition device whose recognition rate is improved by giving the analysis dictionary a learning function based on frequency calculation.

[0006]

[Means for Solving the Problem] The present invention is a homonym recognition device comprising: input means for inputting a character string; storage means for storing, as a dictionary, a very large number of kanji strings that each consist of one or more characters corresponding to a character string, have a specific meaning, and are in common use in Japanese (a kanji string here includes a string of one or more kanji, a kana string, an alphabetic string, or a combination of these, and may be a single character; the same applies hereinafter); and control means for searching this dictionary for, and reading out, the kanji string corresponding to a character string that is input from the input means and is to be converted into a kanji string. The dictionary includes predicates, each assigned a case frame consisting of one or more semantic features and surface cases, and terms, each of which describes semantic features corresponding to the semantic features of the case frame and whose semantic feature description is variable. The device further comprises: determination means for determining whether a predicate is present in the character string that is input from the input means and contains the character string to be converted into a kanji string; and search means for, when the determination means determines that a predicate is present, searching the dictionary for the kanji string into which each character string is to be converted in accordance with the case frame determined by that predicate, with the search restricted to terms whose semantic features correspond to the semantic features of that case frame. The device is characterized by further comprising means that, for each combination of a predicate having a possible semantic feature found by the search means and a term having the corresponding semantic feature, when there are a plurality of such semantic features, recalculates the hit rate (that is, the adoption rate) of each semantic feature of the predicate and of the term every time a conversion candidate is confirmed, stores this hit rate in association with the semantic feature, and at each search displays the kanji strings in descending order of the hit rate of the predicate-term semantic feature combination.

[0007]

[Operation] In the present invention, a predicate is extracted from the character string and the terms whose semantic features correspond to the case frame of that predicate are selected. When a plurality of homonym candidates are to be output and displayed, a hit rate is calculated for each of the corresponding semantic features of the predicate and the term, and the candidates are output in descending order of hit rate. Recalculating the hit rate each time and displaying the candidates in descending order of its value further improves the homonym recognition rate.
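The display rule in this paragraph amounts to a descending sort on the stored hit rate. A minimal sketch follows, assuming each candidate is represented as a `(kanji_string, hit_rate)` pair; that representation, the function name `rank_candidates` and the sample values are illustrative only.

```python
def rank_candidates(candidates):
    """Sort (kanji_string, hit_rate) pairs by hit rate, highest first.

    sorted() is stable, so candidates with equal hit rates keep their
    original dictionary order.
    """
    return sorted(candidates, key=lambda c: c[1], reverse=True)


# Hypothetical hit rates for three homonyms that share one reading.
print(rank_candidates([("歯", 0.2), ("葉", 0.7), ("刃", 0.1)]))
```

Because the hit rates are updated after every confirmed conversion, the same call can return a different order the next time the reading is entered.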

[0008]

[Embodiment] An embodiment of the present invention will be described with reference to the drawings.

[0009] FIG. 1 is a block diagram of a homonym recognition device according to one embodiment of the present invention. In the figure, the device comprises: a keyboard 1 serving as input means for inputting a character string; a read-only memory 2 that stores the level 1 and level 2 kanji sets specified in Japanese Industrial Standard C 6226; a fixed disk control unit 3A including a fixed disk 3 that, with reference to the read-only memory 2, stores as a dictionary a very large number of kanji strings that each consist of one or more characters corresponding to an input character string, have a specific meaning, and are in common use in Japanese; and control means 4 that searches this dictionary for, and reads out, the kanji string corresponding to a character string that is input from the keyboard 1 and is to be converted into a kanji string.

[0010] A CRT display device 5, which successively displays the character string input from the keyboard 1 and the kanji string read out by the control means 4, consists of a display character data memory 5A, a dot image display data memory 5B, a display control data memory 5C, and a CRT display unit 5D. The device also comprises a floppy disk control unit 7A including a floppy disk 7 that stores conversion results, and a printer 8 for printed output. Reference numeral 9 denotes an expandable read/write memory, whose role is normally taken over by the fixed disk 3. Reference numeral 10 denotes a line interface conforming to the RS-232C standard.

[0011] The dictionary stored on the fixed disk 3 consists of dictionary entries that have been assigned case frames (called predicates), dictionary entries that have been assigned semantic features (called terms), and so on. A semantic feature is a marker that indicates the approximate meaning of a term such as a noun, and corresponds to a semantic feature in the case frame, which is an attribute of a predicate. The control means 4 comprises: first determination means for determining whether a predicate is present in the character string that is input from the keyboard 1 and contains the character string to be converted into a kanji string; second determination means for determining, when the first determination means determines that a predicate is present, whether homonyms exist among the kanji strings stored on the fixed disk 3 that correspond to this character string; and search means for, when the second determination means determines that homonyms exist, searching for the kanji string into which each character string having homonyms is to be converted, in accordance with the case frame determined by the predicate, with the search targeting the dictionary of terms that have been assigned semantic features corresponding to this case frame. The embodiment is characterized in that the control means 4 further comprises hit rate recalculation means. When the first and second determination means determine that homonyms exist, then for the plurality of homonyms found by the search, the hit rate is recalculated for each semantic feature of each combination of a predicate's case frame and the corresponding term. The homonym candidates are then displayed in descending order of hit rate on the basis of the recalculated result, which improves kana-kanji conversion efficiency and raises the homonym recognition rate.

[0012] Semantic features are explained here. A semantic feature is a semantic marker assigned in the dictionary to a "term" such as a noun, and corresponds to a semantic feature in the case frame, which is an attribute of a predicate. Semantic features include HUMN (human), ANIM (animal), PLAC (place), ACTN (action), CONC (concrete object), ABST (abstract object), CHAR (character information), INST (tool or means), PLNT (plant), TIME (time), PHEN (phenomenon, involuntary action), MENT (thought), DIRC (direction), DEGR (numerical expression, degree), VEHC (vehicle), ROLE (role), ATRB (attribute), and DIVR (places no restriction on the kinds of terms that may appear). A predicate is a word that has been assigned a case frame in the dictionary.
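For reference, the feature codes listed in this paragraph can be written down as an enumeration. The sketch below is illustrative only: the class name `SemanticFeature` and the helper `slot_accepts` are not from the patent, and treating DIVR as a wildcard that accepts any term is an assumed reading of "places no restriction on the kinds of terms that may appear".

```python
from enum import Enum


class SemanticFeature(Enum):
    HUMN = "human"
    ANIM = "animal"
    PLAC = "place"
    ACTN = "action"
    CONC = "concrete object"
    ABST = "abstract object"
    CHAR = "character information"
    INST = "tool or means"
    PLNT = "plant"
    TIME = "time"
    PHEN = "phenomenon, involuntary action"
    MENT = "thought"
    DIRC = "direction"
    DEGR = "numerical expression, degree"
    VEHC = "vehicle"
    ROLE = "role"
    ATRB = "attribute"
    DIVR = "no restriction on the kinds of terms that may appear"


def slot_accepts(slot_feature, term_features):
    """Assumed matching rule: DIVR accepts any term; otherwise the term
    must carry the same feature as the case-frame slot."""
    if slot_feature is SemanticFeature.DIVR:
        return True
    return slot_feature in term_features


print(slot_accepts(SemanticFeature.PLNT, {SemanticFeature.PLNT}))
```

The paragraph says the set "includes" these codes, so the list is not exhaustive; the worked examples later in the description use further codes such as PART, SEGM and IDEA.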

[0013] The hit rate recalculation processing of this embodiment is described below using examples.

[0014] FIGS. 2 and 3 are control flowcharts of the homonym recognition device of this embodiment. In this embodiment, in the conversion result confirmation processing, as shown in FIG. 4(A), after a candidate has been selected and confirmed, the hit rates of the semantic features are recalculated. When a plurality of candidates are subsequently selected and displayed, they are displayed in descending order of this recalculated hit rate. Through this recalculation the device learns from the frequency with which the homonym candidates occur, which improves the recognition rate.

[0015] FIG. 7 is a flowchart explaining the semantic feature recalculation processing shown in FIG. 4(A). This processing is divided into recalculation of the hit rates of the predicate's semantic features and recalculation of the hit rates of the term's semantic features. In the drawings the hit rate is denoted HR.

[0016] In recalculating the hit rates of the predicate's semantic features, between the confirmed predicate and term, the predicate adds to its own hit rate the hit rate of the term's semantic feature that is the same as its own. In recalculating the hit rates of the term's semantic features, between the confirmed predicate and term, the term receives all of the predicate's semantic features together with their hit rates and recalculates.

[0017] Next, the recalculation of the hit rates of the predicate's semantic features is explained with its formula.

[0018] To recalculate the hit rates of the predicate's semantic features, let n be the number of semantic features of the confirmed predicate, let their original hit rates be vj1, vj2, ..., vjn, and let the sum of these hit rates be vj1 + vj2 + ... + vjn = Svj. Let the hit rates of the corresponding semantic features of the term be vk1, vk2, ..., vkn (the value is 0 where the term has no corresponding semantic feature), and let the sum of the term's hit rates be vk1 + vk2 + ... + vkn = Svk. Then the new hit rates j1, j2, ..., jn of the predicate's semantic features are given by

jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj
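The predicate-side update can be sketched in a few lines, using the formula jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj that the worked examples later in this description quote. The function name `recalc_predicate` and the dictionary representation are assumptions; the sample values (植える with PLAT 1 and IDEA 1, 花 with FLOW 0.5 and PLAT 0.5) are read off the worked expressions for 「花を植える」, since the dictionary tables themselves are not reproduced in this text.

```python
def recalc_predicate(pred_rates, term_rates):
    """Recalculate the hit rates of a confirmed predicate.

    pred_rates: semantic feature -> original hit rate vjn of the predicate
    term_rates: semantic feature -> hit rate of the confirmed term
    Assumes the predicate's hit rates do not sum to zero.
    """
    s_vj = sum(pred_rates.values())
    # vkn is the term's hit rate for the same feature, or 0 if it has none.
    vk = {f: term_rates.get(f, 0.0) for f in pred_rates}
    s_vk = sum(vk.values())
    return {f: (vj / s_vj + vk[f]) / (1 + s_vk) * s_vj
            for f, vj in pred_rates.items()}


ueru = {"PLAT": 1.0, "IDEA": 1.0}   # 植える
hana = {"FLOW": 0.5, "PLAT": 0.5}   # 花
print(recalc_predicate(ueru, hana))
```

For these inputs PLAT rises to about 1.333 and IDEA falls to about 0.667, matching the values given for 「花を植える」, and the total stays at 2, in line with the statement that the total hit rate of the predicate's candidate group is unchanged by the calculation.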

[0019] For a term that is an unknown word, vk1, vk2, ..., vkn and the sum Svk of the hit rates are all 0, so jn = vjn and the predicate's hit rates do not change. When the predicate has only one element, Svj = vj1 and Svk = vk1, so j1 = vj1 and the predicate's hit rate does not change regardless of the term's hit rate.
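Both special cases follow by substitution into the predicate formula quoted in the worked examples, jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj. The derivation below is a reader's check written in LaTeX notation, not text from the patent.

```latex
% General predicate update
j_n = \frac{v_{jn}/S_{vj} + v_{kn}}{1 + S_{vk}}\, S_{vj}

% Unknown-word term: v_{kn} = 0 and S_{vk} = 0
j_n = \frac{v_{jn}/S_{vj} + 0}{1 + 0}\, S_{vj} = v_{jn}

% Predicate with a single semantic feature: S_{vj} = v_{j1},\; S_{vk} = v_{k1}
j_1 = \frac{v_{j1}/v_{j1} + v_{k1}}{1 + v_{k1}}\, v_{j1}
    = \frac{1 + v_{k1}}{1 + v_{k1}}\, v_{j1} = v_{j1}
```

In both cases the factor multiplying the original hit rate reduces to 1, which is why the predicate's hit rates are left unchanged.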

[0020] To recalculate the hit rates of the term's semantic features, let m be the number of semantic features of the confirmed predicate. Let the term's original hit rates for these features be ak1, ak2, ..., akm (the value is 0 for a semantic feature the term did not have), and let the sum of these hit rates be ak1 + ak2 + ... + akm = Sak. Let the corresponding hit rates of the predicate's semantic features be aj1, aj2, ..., ajm (the value is 0 where there is no corresponding semantic feature), and let the sum of the predicate's hit rates be aj1 + aj2 + ... + ajm = Saj. The new hit rates k1, k2, ..., km of the term's semantic features are then calculated as follows.

[0021] For a term that is an unknown word, ak1, ak2, ..., akm and Sak are all 0, so km = ajm/Saj.

[0022] For a term that is a known word, the sum Sak of the hit rates is normally 1, so km = (akm + ajm/Saj)/2.
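The term-side update can be sketched with the two formulas that the worked examples quote: km = ajm/Saj for an unknown word and km = (akm + ajm/Saj)/2 for a known word. The function name `recalc_term`, the dictionary representation and the rule "a term with no stored hit rates is an unknown word" are assumptions made for this sketch; the sample values are read off the worked expressions for 「花を植える」 and 「ポインセチアを植える」.

```python
def recalc_term(term_rates, pred_rates):
    """Recalculate the hit rates of a confirmed term.

    term_rates: semantic feature -> original hit rate akm of the term
                (empty for an unknown word)
    pred_rates: semantic feature -> hit rate ajm of the confirmed predicate
    Assumes the predicate's hit rates do not sum to zero.
    """
    s_aj = sum(pred_rates.values())
    # Unknown word (Sak = 0): km = ajm/Saj; known word: km = (akm + ajm/Saj)/2.
    divisor = 2.0 if sum(term_rates.values()) > 0 else 1.0
    # The term receives every semantic feature of the predicate and keeps its own.
    features = list(term_rates) + [f for f in pred_rates if f not in term_rates]
    return {f: (term_rates.get(f, 0.0) + pred_rates.get(f, 0.0) / s_aj) / divisor
            for f in features}


ueru = {"PLAT": 1.0, "IDEA": 1.0}                    # 植える
print(recalc_term({"FLOW": 0.5, "PLAT": 0.5}, ueru))  # 花, a known word
print(recalc_term({}, ueru))                          # ポインセチア, an unknown word
```

With these inputs 花 becomes FLOW 0.25, PLAT 0.5, IDEA 0.25, and the unknown word ポインセチア inherits PLAT 0.5 and IDEA 0.5 from the predicate, so a previously unregistered word acquires semantic features from the context in which it was confirmed.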

[0023] Next, the hit rate recalculation processing is explained with concrete examples of character string input.

[0024] Suppose there is a dictionary that gives, for the predicates 「生える」 (haeru, "to grow") and 「植える」 (ueru, "to plant"), their possible semantic features, surface cases, and hit rate values, and that there are also dictionary entries for the terms 「ポインセチア」 (poinsettia), 「歯」 (ha, "tooth"), and 「葉」 (ha, "leaf"). The semantic features, surface cases, and hit rates of 「生える」 and 「植える」 are assumed to be as follows.

[0025]

[0026]

[0027] The initial hit rate of a predicate is 1.0 per deep case, and the hit rate of a term is always 1.0 per term.

[0028] Between the confirmed predicate and term, the predicate adds to its own hit rate the hit rate of the term's semantic feature that is the same as its own, and the term receives all of the predicate's semantic features together with their hit rates and recalculates its hit rates. The total hit rate of the predicate's candidate group is the same before and after the calculation. First, the hit rates are recalculated for the input character string 「ポインセチアが生える」 ("a poinsettia grows"). The hit rates are likewise recalculated for the input character string 「ポインセチアを植える」 ("plant a poinsettia"). In this case the hit rates of 「植える」 are calculated with the above formula jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj, and those of 「ポインセチア」 with the above formula km = (akm + ajm/Saj)/2.

[0029] Next, for 「歯が生える」 ("a tooth grows"), 「生える」 is calculated with the formula j1 = vj1:

    「生える」 A1 +SEGM: 1.0

and 「歯」 with the formula km = (akm + ajm/Saj)/2:

    「歯」 SEGM: (1.0 + 1.0/1)/2 = 1.0

[0030] Next, for 「葉が生える」 ("a leaf grows"), from jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj:

    「生える」 B1 +PLAT: (0.5/1 + 0)/2*1 = 0.125
                  +PART: (0.5/1 + 1.0)/2*1 = 0.625

and from km = (akm + ajm/Saj)/2:

    「葉」 PART: (1.0 + 0.5/1)/2 = 0.75
           PLAT: (0 + 0.5/1)/2 = 0.25

[0031] Next, for 「ポインセチアを植える」 ("plant a poinsettia"), from jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj and jn = vjn:

    「植える」 A2 +PLAT: (1/2 + 0)/1*2 = 1
               B2 +IDEA: (1/2 + 0)/1*2 = 1

For 「ポインセチア」, calculating with the above formula km = ajm/Saj gives:

    「ポインセチア」 PLAT: 1/2 = 0.5
                     IDEA: 1/2 = 0.5

For 「花を植える」 ("plant a flower"), from jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj:

    「植える」 A2 +PLAT: {(1/2) + 0.5}/1.5*2 = 1.333
               B2 +IDEA: {(1/2) + 0.0}/1.5*2 = 0.667

and from km = (akm + ajm/Saj)/2:

    「花」 FLOW: (0.5 + 0/2)/2 = 0.5
           PLAT: (0.5 + 1.2/2)/2 = 0.3
           IDEA: (0 + 0.8/2)/2 = 0.2

If the order of conversion is changed, 「花を植える」 gives, from jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj and jn = vjn:

    「植える」 A2 +PLAT: {(1/2) + 0.5}/1.5*2 = 1.333
               B2 +IDEA: {(1/2) + 0.0}/1.5*2 = 0.667

and from km = (akm + ajm/Saj)/2:

    「花」 FLOW: (0.5 + 0/2)/2 = 0.25
           PLAT: (0.5 + 1/2)/2 = 0.5
           IDEA: (0.0 + 1/2)/2 = 0.25

「ポインセチアを植える」 then gives, from jn = vjn:

    「植える」 A2 +PLAT: 1.333
               B2 +IDEA: 0.667

and from km = akm + ajm/Saj:

    「ポインセチア」 PLAT: 1.333/2 = 0.667
                     IDEA: 0.667/2 = 0.333

Thus the result depends on the order of conversion, with more weight placed on the most recent information.
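The order dependence can be replayed with a short script. This is a sketch under stated assumptions: `recalc_predicate`, `recalc_term` and `confirm` are hypothetical names, both updates are assumed to read the hit rates as they stood before the confirmation, and the starting values (植える with PLAT 1 and IDEA 1, 花 with FLOW 0.5 and PLAT 0.5, ポインセチア unknown) are read off the worked expressions above because the dictionary tables are not reproduced in this text.

```python
def recalc_predicate(pred, term):
    """jn = {(vjn/Svj) + vkn} / (1 + Svk) * Svj"""
    s_vj = sum(pred.values())
    s_vk = sum(term.get(f, 0.0) for f in pred)
    return {f: (v / s_vj + term.get(f, 0.0)) / (1 + s_vk) * s_vj
            for f, v in pred.items()}


def recalc_term(term, pred):
    """km = ajm/Saj for an unknown word, (akm + ajm/Saj)/2 for a known word."""
    s_aj = sum(pred.values())
    divisor = 2.0 if sum(term.values()) > 0 else 1.0
    feats = list(term) + [f for f in pred if f not in term]
    return {f: (term.get(f, 0.0) + pred.get(f, 0.0) / s_aj) / divisor
            for f in feats}


def confirm(pred, term):
    """Apply one confirmed conversion; both sides use the pre-update values."""
    return recalc_predicate(pred, term), recalc_term(term, pred)


ueru = {"PLAT": 1.0, "IDEA": 1.0}   # 植える
hana = {"FLOW": 0.5, "PLAT": 0.5}   # 花
unknown = {}                        # ポインセチア, not yet in the dictionary

# Order 1: 「ポインセチアを植える」 is confirmed first.
_, poinsettia_first = confirm(ueru, unknown)

# Order 2: 「花を植える」 is confirmed first, then 「ポインセチアを植える」.
ueru_after_flower, _ = confirm(ueru, hana)
_, poinsettia_second = confirm(ueru_after_flower, unknown)

print(poinsettia_first)
print(poinsettia_second)
```

In the first order ポインセチア receives PLAT 0.5 and IDEA 0.5. In the second it receives roughly PLAT 0.667 and IDEA 0.333, because 植える had already shifted toward PLAT when 花 was confirmed. These are the two results for ポインセチア given in the paragraph above.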

[0032]

[Effects of the Invention] As described above, recognition efficiency can be improved by providing a learning function that recalculates the hit rates of a plurality of homonym candidates on the basis of their original hit rates, and that displays and converts the candidates in descending order of hit rate.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a homonym recognition device according to one embodiment of the present invention.

FIG. 2 is a control flowchart of the embodiment of the present invention.

FIG. 3 is a control flowchart of the embodiment of the present invention.

FIG. 4 is a control flowchart of the embodiment of the present invention.

FIG. 5 is a control flowchart of the embodiment of the present invention.

FIG. 6 is a control flowchart of the embodiment of the present invention.

FIG. 7 is a control flowchart of the hit rate recalculation processing of FIG. 4.

[Description of Reference Numerals]

1 Keyboard
1A, 8A Interface
2 Read-only memory
3 Fixed disk
3A Fixed disk control unit
4 Control means
5 CRT display device
5A Display character data memory that stores the text character information displayed on the display unit 5D
5B Dot image display data memory that stores image data in dot units
5C Display control data memory that stores control data for mixing text character information and image data
5D CRT display unit
6 Read/write memory (capacity 640 KB)
7 Floppy disk (capacity 1 MB and 640 KB)
7A Floppy disk control unit
8 Printer
9 Expandable read/write memory
10 Line interface

Continuation of the front page

(56) References: JP-A-1-134563 (JP, A); JP-A-63-65566 (JP, A); JP-A-61-40672 (JP, A)

Claims

(57) [Claims]

[Claim 1] A homonym recognition device comprising:
input means for inputting a character string;
storage means for storing, as a dictionary, a very large number of kanji strings that each consist of one or more characters corresponding to a character string, have a specific meaning, and are in common use in Japanese (a kanji string here includes a string of one or more kanji, a kana string, an alphabetic string, or a combination of these, and may be a single character; the same applies hereinafter); and
control means for searching this dictionary for, and reading out, the kanji string corresponding to a character string that is input from the input means and is to be converted into a kanji string,
wherein the dictionary includes predicates, each assigned a case frame consisting of one or more semantic features and surface cases, and terms, each of which describes semantic features corresponding to the semantic features of the case frame and whose semantic feature description is variable,
the device further comprising:
determination means for determining whether a predicate is present in the character string that is input from the input means and contains the character string to be converted into a kanji string; and
search means for, when the determination means determines that a predicate is present, searching the dictionary for the kanji string into which each character string is to be converted in accordance with the case frame determined by that predicate, with the search restricted to terms whose semantic features correspond to the semantic features of that case frame,
the homonym recognition device being characterized by comprising means that, for each combination of a predicate having a possible semantic feature found by the search means and a term having the semantic feature corresponding to that predicate, when there are a plurality of such semantic features, recalculates the hit rate (that is, the adoption rate) of each semantic feature of the predicate and of the term every time a conversion candidate is confirmed, stores this hit rate in association with the semantic feature, and at each search displays the kanji strings in descending order of the hit rate of the predicate-term semantic feature combination.