JPH07152781A

JPH07152781A - Method and device for processing document

Info

Publication number: JPH07152781A
Application number: JP5299300A
Authority: JP
Inventors: Makoto Hirota; 誠廣田; Tsuyoshi Yagisawa; 津義八木沢; Kazue Kaneko; 和恵金子; Shogo Shibata; 昇吾柴田; Minoru Fujita; 稔藤田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1993-11-30
Filing date: 1993-11-30
Publication date: 1995-06-16
Anticipated expiration: 2018-01-08
Also published as: JP3363552B2

Abstract

PURPOSE: To provide a document processing method and device in which the user inputs a sentence describing the word he or she wants to find, the input sentence is flexibly compared with each piece of word information in a dictionary, and the word corresponding to the sentence is retrieved at high speed. CONSTITUTION: Words and the word information related to each word are stored in a dictionary 3. When a sentence for retrieval is input, a word retrieval processing part 2 determines whether each word included in the input sentence matches the word information stored in the dictionary 3. The word corresponding to the input sentence is then retrieved and output on the basis of the word information determined to match.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a document processing apparatus and, more particularly, to a document processing method and apparatus in which, for example, the meaning of a word is input and the word that best fits that meaning is retrieved.

[0002]

[Prior Art] In general, a word is retrieved from its meaning by using, for example, the UNIX command "grep" or a full-text search.

[0003]

[Problems to Be Solved by the Invention] In a search carried out with a command of this kind, however, the user inputs as a keyword either a word related to the word being sought or some phrase likely to appear in that word's word-meaning sentence. The search simply looks for word-meaning sentences that contain the keyword and returns the corresponding words, so its accuracy is poor.

[0004] The present invention has been made in view of the above prior art. Its object is to provide a document processing method and apparatus in which the user freely composes and inputs a word-meaning sentence for the word he or she wants to know, the input sentence is flexibly compared with the wording of the word-meaning sentence of each word in a dictionary, and the word corresponding to the input sentence is retrieved at high speed.

[0005]

[Means for Solving the Problems] To achieve the above object, the document processing apparatus of the present invention has the following configuration. That is, it comprises dictionary storage means for storing words and word information related to each word, input means for inputting a sentence to be used for retrieval, determination means for determining whether the words included in the sentence input by the input means match the word information stored in the dictionary storage means, and retrieval means for retrieving the word corresponding to the sentence on the basis of the word information determined to match by the determination means.

[0006] To achieve the above object, the document processing method of the present invention comprises the following steps. That is, it is a document processing method for inputting a document and retrieving a corresponding word, comprising the steps of: inputting a sentence to be used for retrieval; determining whether each of the words included in the input sentence matches word information stored in a dictionary that stores words and word information related to each word; and retrieving from the dictionary the word corresponding to the input sentence on the basis of the word information determined to match.

[0007]

[Operation] With the above configuration, even when the input sentence and the word information stored in the dictionary do not match exactly, the two are compared in a way that absorbs differences in wording to some extent. The user can therefore input a free-form sentence and search without having to know how the word information is worded in the dictionary.

[0008]

[Embodiment] A preferred embodiment of the present invention will now be described in detail with reference to the accompanying drawings.

[0009] FIG. 1 is a block diagram showing the schematic configuration of a natural language processing apparatus according to an embodiment of the present invention. In FIG. 1, reference numeral 1 denotes an input sentence holding unit, which holds a sentence input from a keyboard or the like (described later). Reference numeral 2 denotes a word search processing unit, which searches for the corresponding word on the basis of the input sentence held in the input sentence holding unit 1. Reference numeral 3 denotes a "word / meaning" dictionary in which, as shown for example in FIG. 4, words and their word-meaning sentences are stored in association with each other. Reference numeral 4 denotes a word output unit, which outputs the word retrieved by the word search processing unit 2.

[0010] FIG. 2 is a block diagram showing the specific circuit configuration of the natural language processing apparatus of this embodiment.

[0011] In FIG. 2, reference numeral 101 denotes a CPU that controls the entire apparatus in accordance with a control program (shown, for example, in the flowchart of FIG. 3) stored in a program memory 104. The CPU 101 and the program memory 104 correspond to the word search processing unit 2 of FIG. 1. Reference numeral 102 denotes a keyboard, which is operated by an operator to input various data, such as the input sentences and document data described later, and various instruction commands. Reference numeral 103 denotes a pointing device such as a mouse, used for command input, menu selection, and the like. Reference numeral 105 denotes a RAM, which is used as a work area while the CPU 101 operates and temporarily stores various data such as the input sentence S, word A, variable M, and counter i described later. The RAM 105 therefore also serves as the input sentence holding unit 1. Reference numeral 106 denotes a display unit such as a CRT or liquid crystal display. It displays commands and document data input from the keyboard 102, as well as messages to the operator, and also serves as the word output unit 4. Reference numeral 3 denotes the dictionary shown in FIG. 1, which stores the relationship between words and their meanings. Reference numeral 108 denotes an external storage device such as a hard disk, which may store document data, image data, and also the contents of the dictionary 3.

[0012] Next, the operation of the apparatus of this embodiment will be described in detail with reference to the flowchart of FIG. 3.

[0013] First, in step S1, an input sentence from the user, entered for example from the keyboard 102, is received and held in the input sentence holding unit 1 (for example, a document data storage area of the RAM 105; the input sentence held there is hereinafter called input sentence S). The word output unit 4, which holds the word retrieved by the subsequent search processing (for example, a search word storage area of the RAM 105; the word held there is hereinafter called word A), is first set to an empty character string (for example, a string of null codes). In addition, a variable M, which holds the maximum value of the score indicating the degree of matching between the input sentence and a word-meaning sentence in the dictionary (hereinafter called score SC), and a counter i, which counts the number of words checked, are each initialized to "0". The score SC, the variable M, and the counter i are stored in the work area of the RAM 105.

[0014] The process then advances to step S2, where it is checked whether the value of the counter i has exceeded a preset value N (for example, the total number of words in the dictionary). If it has not, the process advances to step S3, where the counter i is incremented (+1). The process then advances to step S4, where the "word / meaning" dictionary 3 is first consulted.

[0015] FIG. 4 shows an example of the data structure of the "word / meaning" dictionary 3.

[0016] FIG. 4 shows several words in the dictionary, each associated with a word-meaning sentence that states the meaning of the word.
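
The association between a word and its word-meaning sentence can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the `Entry` class, the `DICTIONARY` list, and the `definition_of` helper are hypothetical names, and the two entries are the ones the description itself quotes from FIG. 4; the real dictionary layout is not specified beyond the word-to-sentence association.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    word: str        # headword W_i
    definition: str  # word-meaning sentence G_i


# Hypothetical in-memory dictionary; only entries quoted in the description.
DICTIONARY: List[Entry] = [
    Entry("趣旨", "その事をする中心的なねらいや目的"),
    Entry("諭旨", "目上の者から目下の者にさとして言い聞かせること"),
]


def definition_of(i: int) -> str:
    """Return G_i for a 1-based index i, the way step S4 reads it."""
    return DICTIONARY[i - 1].definition


print(definition_of(1))
```

A flat list indexed by the counter i is enough here because the search described in this embodiment visits every entry in turn.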

[0017] In step S4, the word-meaning sentence of the i-th word indicated by the value of the counter i (hereinafter called word-meaning sentence G_i) is read from the dictionary 3 and compared with the input sentence S. The function f(S, G_i) shown in FIG. 3 examines the degree of matching between the input sentence S and the word-meaning sentence G_i and returns the resulting score. The score returned by this function is held in a variable m in the RAM 105. The process then advances to step S5, where the score m obtained in step S4 is compared with the maximum score M obtained so far. If m is less than or equal to M (m ≤ M), the process returns to step S2; if m is greater than M (m > M), the process advances to step S6.

[0018] In step S6, the maximum score M is overwritten with the score m just obtained, and the word A held in the word output unit 4 is overwritten with the i-th word W_i in the dictionary. The process then returns to step S2. When the value of the counter i exceeds N in step S2, the process advances to step S7, where the word held as word A is output as the search result, and the process ends.
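
The loop of steps S1 to S7 can be sketched in Python as follows. This is a minimal sketch, not the patent's implementation: `search_word`, the list-of-tuples dictionary, and the placeholder scorer `shared_chars` are hypothetical. The loop bound is written as `i < n`, an assumption made so that the 1-based index never runs past the last entry, whereas the text tests whether i has "exceeded" N.

```python
from typing import Callable, Sequence, Tuple


def search_word(s: str,
                dictionary: Sequence[Tuple[str, str]],
                f: Callable[[str, str], float]) -> str:
    """Return the word whose word-meaning sentence scores highest against s."""
    a = ""         # S1: word A starts as the empty string
    big_m = 0.0    # S1: maximum score M so far
    i = 0          # S1: counter i
    n = len(dictionary)
    while i < n:                   # S2: stop once every entry has been checked
        i += 1                     # S3
        w_i, g_i = dictionary[i - 1]
        m = f(s, g_i)              # S4: score the i-th word-meaning sentence
        if m > big_m:              # S5: only a strictly larger score wins
            big_m, a = m, w_i      # S6
    return a                       # S7


def shared_chars(s: str, g: str) -> float:
    """Placeholder scorer (assumption): number of distinct shared characters."""
    return float(len(set(s) & set(g)))


demo = [("x", "abc"), ("y", "abd"), ("z", "zzz")]
print(search_word("abd", demo, shared_chars))
```

Because step S5 requires m > M, the first of several equally scored words is kept, and an input that matches nothing leaves word A as the empty string.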

[0019] FIG. 5 is a flowchart showing the processing in step S4 of FIG. 3 that compares the input sentence S with the word-meaning sentence G_i and assigns a score (that is, the content of the function f).

[0020] As shown in FIG. 6(A), the p-th character from the beginning of the input sentence S is denoted X_p, and, as shown in FIG. 6(B), the q-th character from the beginning of the word-meaning sentence G_i is denoted Y_q. The score SC is initially set to "0". The value of the score SC is stored in the RAM 105.

[0021] Matching is determined by taking the characters X_1, X_2, ... of the input sentence S in order from the beginning and searching the word-meaning sentence G_i (from the front) for a character that matches each one. A list is prepared to record which character of the word-meaning sentence each character of the input sentence matched (that is, the value of q). Initially the list is entirely empty ("0") (step S11). Suppose that the character X_p in FIG. 6(A) is currently being examined. If it is not the end-of-sentence code, the process advances from step S12 to step S13, and the corresponding character is searched for in the word-meaning sentence G_i (from Y_1 to the end of the sentence). However, a character at a position already recorded in the list is not tested for a match, because it has already been matched with another character of the input sentence. In FIG. 5, steps S12 to S17 correspond to this processing. That is, the value of q is first set to "1" in step S13 (the beginning of the word-meaning sentence G_i), and the characters of G_i are compared with the character X_p one after another from the beginning. If the value of q is already registered in the list, the process advances from step S15 to step S17; only otherwise is the character X_p compared with the q-th character Y_q of the word-meaning sentence in step S16. This processing continues to the end of the word-meaning sentence G_i.

[0022] When the character X_p matches the character Y_q of the word-meaning sentence in step S16, the process advances to step S18, where the score SC is increased by a fixed amount and the value q, which indicates the position of the matched character in the word-meaning sentence G_i, is added to the list. In step S19, p is incremented by 1 (p = p + 1) to advance to the next character position in the input sentence, and the same processing is performed for the next character X_{p+1} of the input sentence S.
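
The character matching of steps S11 to S19 can be sketched in Python as below. It is a sketch under stated assumptions rather than the flowchart of FIG. 5 itself: the name `match_score` and the `increment` parameter are hypothetical, a Python set stands in for the "list" of matched positions, and the end-of-sentence code is replaced by ordinary iteration over the strings.

```python
def match_score(s: str, g: str, increment: float = 1.0) -> float:
    """Score input sentence s against word-meaning sentence g, character by character."""
    sc = 0.0
    used = set()                               # S11: matched positions q, initially empty
    for x_p in s:                              # S12 / S19: walk the input sentence
        for q, y_q in enumerate(g, start=1):   # S13 / S17: scan g from q = 1
            if q in used:                      # S15: position already matched, skip it
                continue
            if x_p == y_q:                     # S16: compare X_p with Y_q
                sc += increment                # S18: raise the score by a fixed amount
                used.add(q)                    # S18: remember the matched position
                break                          # go on to the next input character
    return sc


print(match_score("ab", "ba"))   # both characters match despite the swapped order
print(match_score("aa", "a"))    # a position in g can be matched only once
```

Recording the matched positions is what keeps one character of the word-meaning sentence from being counted twice when the input repeats a character.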

[0023] By performing this comparison processing, differences in wording between the input sentence S and the word-meaning sentence G_i are absorbed, and flexible matching can be performed.

[0024] For example, suppose the input sentence S is 『物事をする理由や目的など』 ("the reason, purpose, or the like for doing something"). When its characters are compared with the word-meaning sentence 『その事をする中心的なねらいや目的』 ("the central aim or purpose of doing that thing") of the word 「趣旨」 in the dictionary example of FIG. 4, characters such as "事をする" and "目的" match even though the two sentences are worded differently. The two are judged to express the same content, and the word-meaning sentence of the word 「趣旨」 obtains a high score.

[0025] The matching processing described above places no constraint on the order in which characters appear in the sentences being compared (for example, the first and fifth characters of the input sentence S are allowed to match the fourth and second characters of the word-meaning sentence G_i, respectively). Consequently, the input sentence S and the word-meaning sentence G_i are compared successfully even in a so-called crossed state, in which character strings appear in reversed order.

[0026] As an example of crossed character strings, consider the case where the input sentence S is 『目下の者に対し、目上の者が言い聞かせること』 ("a superior admonishing a subordinate"). In the word-meaning sentence 『目上の者から目下の者にさとして言い聞かせること』 ("admonishment given by a superior to a subordinate") of the word 「諭旨」 in the dictionary example of FIG. 4, the positions of the character strings "目上の者" (superior) and "目下の者" (subordinate) are reversed relative to the input sentence. This relationship is called a cross of character strings. In this embodiment, the character strings are matched successfully with each other even when they are crossed in this way.
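
The crossed state can be made visible by recording which position p of the input sentence matched which position q of the word-meaning sentence. The Python sketch below is illustrative only: `matched_pairs` and `has_cross` are hypothetical helpers, and "cross" is taken here to mean that some later input character matched an earlier position in the word-meaning sentence, which is an interpretation of the description rather than a definition given in it.

```python
from typing import List, Tuple


def matched_pairs(s: str, g: str) -> List[Tuple[int, int]]:
    """Return 1-based (p, q) pairs for each input character that found a match."""
    used = set()
    pairs = []
    for p, x_p in enumerate(s, start=1):
        for q, y_q in enumerate(g, start=1):
            if q not in used and x_p == y_q:
                used.add(q)
                pairs.append((p, q))
                break
    return pairs


def has_cross(pairs: List[Tuple[int, int]]) -> bool:
    """True if q ever decreases while p increases, i.e. two matches cross."""
    return any(q2 < q1 for (_, q1), (_, q2) in zip(pairs, pairs[1:]))


S = "目下の者に対し、目上の者が言い聞かせること"
G = "目上の者から目下の者にさとして言い聞かせること"
pairs = matched_pairs(S, G)
print(pairs)
print(has_cross(pairs))
```

Since the pairs are produced in increasing p, checking neighbouring pairs for a drop in q is enough to detect any crossing.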

[0027] FIG. 7 shows a specific example of operation of the language processing apparatus of this embodiment. Here, the input sentence is 「物事をする理由や目的など」, words whose meaning corresponds to that sentence are retrieved, and the degree of matching is shown as a score for each: "8.0" for the word 「趣旨」, "5.0" for the word 「要旨」, and "3.0" for the word 「趣旨」.

[0028] Reference numeral 701 in FIG. 7 shows an example of comparing the word-meaning sentence 『その事をする中心的なねらいや目的』 of the word 「趣旨」 with the input sentence. As is clear from this example, eight characters match between the word-meaning sentence and the input sentence, so the score is "8.0".
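
The count of eight can be checked with a short Python sketch that collects the matched characters. The helper `matched_chars` is hypothetical, and an increment of 1.0 per matched character is assumed, which is consistent with the score of "8.0" stated in the text.

```python
from typing import List


def matched_chars(s: str, g: str) -> List[str]:
    """Return the characters of s that find an unused matching position in g."""
    used = set()
    hits = []
    for x_p in s:
        for q, y_q in enumerate(g):
            if q not in used and x_p == y_q:
                used.add(q)
                hits.append(x_p)
                break
    return hits


hits = matched_chars("物事をする理由や目的など", "その事をする中心的なねらいや目的")
print("".join(hits), float(len(hits)))
```

Under these assumptions the matched characters are 事, を, す, る, や, 目, 的, and な, which gives the eight matches stated in the text.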

[0029] In the embodiment described above, the word whose word-meaning sentence has the highest score is retrieved, but a plurality of candidate words may instead be output in descending order of score. Because this embodiment evaluates the degree of matching quantitatively as a score, this is easy to implement.
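
Outputting several candidates in descending order of score could look like the following Python sketch. The names `top_candidates` and `match_score`, the parameter `k`, and the sample dictionary are hypothetical; `heapq.nlargest` from the standard library is used here only as one convenient way to rank, not as something the description prescribes.

```python
import heapq
from typing import List, Sequence, Tuple


def match_score(s: str, g: str) -> float:
    """Order-free character matching; each position of g is used at most once."""
    used = set()
    sc = 0.0
    for x in s:
        for q, y in enumerate(g):
            if q not in used and x == y:
                used.add(q)
                sc += 1.0
                break
    return sc


def top_candidates(s: str,
                   dictionary: Sequence[Tuple[str, str]],
                   k: int = 3) -> List[Tuple[float, str]]:
    """Return up to k (score, word) pairs, highest score first."""
    scored = [(match_score(s, g), w) for w, g in dictionary]
    return heapq.nlargest(k, scored, key=lambda t: t[0])


sample = [("w1", "abc"), ("w2", "abd"), ("w3", "xyz")]
print(top_candidates("abd", sample, k=2))
```

Ranking on the score alone keeps words with equal scores in dictionary order.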

[0030] In the above embodiment the score is increased by a fixed amount each time a matching character is found, but the present invention is not limited to this, and various methods that make use of a priori knowledge are possible. For example, a match between kanji can be regarded as likely to be a match in meaning as well, so the score increment for a kanji match may be made larger than that for a hiragana or katakana match. Various other fine adjustments become possible if the formula for calculating the score can be freely set or written.
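
A kanji-weighted variant of the scoring might be sketched in Python as follows. The weights 2.0 and 1.0 are arbitrary assumptions, `weighted_score` and `is_kanji` are hypothetical names, and kanji are approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), which leaves out rarer extension blocks.

```python
def is_kanji(ch: str) -> bool:
    """Approximate test: character lies in CJK Unified Ideographs (U+4E00 to U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


def weighted_score(s: str, g: str,
                   kanji_weight: float = 2.0,
                   other_weight: float = 1.0) -> float:
    """Order-free character matching where a kanji match counts more than other matches."""
    used = set()
    sc = 0.0
    for x_p in s:
        for q, y_q in enumerate(g):
            if q not in used and x_p == y_q:
                used.add(q)
                sc += kanji_weight if is_kanji(x_p) else other_weight
                break
    return sc


print(weighted_score("物事をする理由や目的など", "その事をする中心的なねらいや目的"))
```

Exposing the weights as parameters is one simple way to let the score formula be adjusted without touching the matching loop.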

[0031] Although this embodiment has been described for Japanese, it can be applied to any language, such as English or German.

[0032] In this embodiment the target of matching has been described as the word-meaning sentence, but information other than word-meaning sentences, such as the synonyms, antonyms, and example sentences found in commercially available dictionaries, may also be stored in the dictionary, and those character strings may be used as matching targets.

[0033] Furthermore, the embodiment has been described for simple matching on character strings, but it may also be combined with DP matching at the character or word level (see Makoto Nagao, "Language Engineering" (言語工学), Shokodo) or with techniques that use sentence analysis.

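As one possible illustration of DP matching at the character level, the Python sketch below computes the length of the longest common subsequence by dynamic programming. This is an assumption about what such a combination might look like: the description does not say which DP formulation is meant, and `lcs_score` is a hypothetical name. Unlike the order-free matching of the embodiment, this score only counts matches that keep their relative order.

```python
def lcs_score(s: str, g: str) -> int:
    """Length of the longest common subsequence of s and g (row-by-row DP)."""
    prev = [0] * (len(g) + 1)          # DP row for the empty prefix of s
    for x in s:
        cur = [0]                      # column 0: empty prefix of g
        for j, y in enumerate(g, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)          # extend a common subsequence
            else:
                cur.append(max(prev[j], cur[j - 1]))  # carry over the better neighbour
        prev = cur
    return prev[-1]


print(lcs_score("ab", "ba"))     # crossed characters cannot both be counted
print(lcs_score("abc", "abc"))
```

An order-preserving score of this kind could be blended with the order-free score when word order is thought to carry meaning.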
[0034] The present invention may be applied to a system made up of a plurality of devices or to an apparatus consisting of a single device. Needless to say, the present invention is also applicable where it is achieved by supplying a system or apparatus with a program that implements the invention.

[0035] As described above, in this embodiment the input word-meaning sentence is compared with each word-meaning sentence in the dictionary without sentence analysis; differences in wording are absorbed to some extent, whether the two match is checked as quickly as possible, and the degree of matching is evaluated quantitatively. Furthermore, the word whose word-meaning sentence is judged to match the input word-meaning sentence best is retrieved. A user who wants to know a word having a certain meaning can therefore find it easily in the dictionary by inputting a word-meaning sentence that freely describes that meaning.

[0036]

[Effects of the Invention] As described above, according to the present invention, the user inputs a sentence describing what he or she wants to find, the input sentence is flexibly compared with each piece of word information in the dictionary, and the word corresponding to the sentence can be retrieved at high speed.

[Brief Description of the Drawings]

FIG. 1 is a functional block diagram showing the basic configuration of the natural language processing apparatus of this embodiment.

FIG. 2 is a block diagram showing the specific configuration of the natural language processing apparatus of this embodiment.

FIG. 3 is a flowchart showing the processing procedure in the natural language processing apparatus of this embodiment.

FIG. 4 is a diagram showing an example of the specific contents of the "word / meaning" dictionary of this embodiment.

FIG. 5 is a flowchart showing the procedure for matching the input sentence against a dictionary word-meaning sentence and assigning a score in step S4 of FIG. 3.

FIG. 6 is a diagram for explaining the character positions indicated by the pointers into the input sentence and the dictionary word-meaning sentence.

FIG. 7 is a diagram for explaining an example of operation of the natural language processing apparatus of this embodiment.

[Explanation of Reference Numerals]

1 Input sentence holding unit
2 Word search processing unit
3 "Word / meaning" dictionary
4 Word output unit
101 CPU
104 Program memory
105 RAM
106 Display unit
108 External storage device

Continuation of the front page
(72) Inventor: Shogo Shibata, c/o Canon Inc., 3-30-2 Shimomaruko, Ota-ku, Tokyo
(72) Inventor: Minoru Fujita, c/o Canon Inc., 3-30-2 Shimomaruko, Ota-ku, Tokyo

Claims

[Claims]

1. A document processing apparatus comprising: dictionary storage means for storing words and word information related to each word; input means for inputting a sentence to be used for retrieval; determination means for determining whether the words included in the sentence input by the input means match the word information stored in the dictionary storage means; and retrieval means for retrieving the word corresponding to the sentence on the basis of the word information determined to match by the determination means.

2. The document processing apparatus according to claim 1, wherein the dictionary storage means stores words and a word-meaning sentence for each word.

3. The document processing apparatus according to claim 1, wherein the retrieval means sequentially compares each of the words included in the input sentence with the words included in the word-meaning sentences, and outputs as the search result the word corresponding to the word-meaning sentence with the largest number of matching words.

4. A document processing method for inputting a document and retrieving a corresponding word, comprising the steps of: inputting a sentence to be used for retrieval; determining whether each of the words included in the input sentence matches word information stored in a dictionary that stores words and word information related to each word; and retrieving from the dictionary the word corresponding to the input sentence on the basis of the word information determined to match.