JPH01137366A

JPH01137366A - Production system for data base of adversative dictionary

Info

Publication number: JPH01137366A
Application number: JP62296672A
Authority: JP
Inventors: Koji Hashiguchi; 幸治橋口
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1987-11-25
Filing date: 1987-11-25
Publication date: 1989-05-30

Abstract

PURPOSE: To automatically produce an adversative dictionary data base by extracting the words including the key words as suffixes out of an input document for production of said data base. CONSTITUTION: A morpheme analyzing part 1 applies the morpheme analysis to an input document (a manual, etc.) to divide this document into parts of speech. A word extracting part 2 extracts the words having the designated key words as suffixes out of those divided parts of speech and stores these words into a word file 3 as the adversative dictionary data base. In such a way, an adversative dictionary data base is automatically produced.

Description

[Detailed Description of the Invention]

[Summary]

The invention concerns a reverse word dictionary database creation method that creates a reverse word dictionary database by searching for terms having a keyword as a suffix. Its object is to search a document for terms having a keyword as a suffix and to create the reverse word dictionary database quickly and automatically. The system comprises a morphological analysis unit that analyzes an input document and divides it into parts of speech, and a term extraction unit that extracts, from the parts of speech divided by the morphological analysis unit, those having the specified keyword as a suffix. The terms extracted by the term extraction unit are stored in the reverse word dictionary database.

[Industrial application field]

The present invention relates to a reverse word dictionary database creation method that creates a reverse word dictionary database by searching for terms having a keyword as a suffix.

[Problems to be solved by conventional technology and invention]

When creating the INDEX, glossary, abbreviation list, and the like of a manual, the work can be done extremely quickly if a database of a so-called reverse word dictionary is available, in which terms are listed under the keyword they have as a suffix.

Here, a reverse word dictionary is one in which, for example, looking up the headword (key word) "clock" gives a list of the terms that have "clock" as a suffix, such as "basic clock," "system clock," "12 times period clock," and "burst clock."
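The shape of such a dictionary can be pictured as a mapping from each headword to its list of terms. The following is only a minimal sketch: the names `reverse_dictionary`, `look_up`, and `is_consistent` are hypothetical, and the entries are the English renderings of three of the terms quoted above.

```python
# Hypothetical in-memory picture of a reverse word dictionary:
# each headword maps to the terms that carry it as a suffix.
reverse_dictionary = {
    "clock": ["basic clock", "system clock", "burst clock"],
}


def look_up(dictionary, headword):
    """Return the terms listed under a headword, or an empty list."""
    return dictionary.get(headword, [])


def is_consistent(dictionary):
    """Check the defining property: every term ends with its headword."""
    return all(
        term.endswith(headword)
        for headword, terms in dictionary.items()
        for term in terms
    )


print(look_up(reverse_dictionary, "clock"))
print(is_consistent(reverse_dictionary))
```

A lookup for a headword that has no entry simply returns an empty list in this sketch.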

Building this reverse word dictionary database by hand, however, requires a very large number of man-hours, which has been a problem.

The object of the present invention is to search a document for terms having a keyword as a suffix and to create a reverse word dictionary database quickly and automatically.

[Means for solving problems]

The means for solving the problems are explained with reference to FIG. 1.

In FIG. 1, the morphological analysis unit 1 performs morphological analysis on an input document and divides it into parts of speech.

The term extraction unit 2 extracts, from the parts of speech divided by the morphological analysis unit 1, the terms (parts of speech) that have the keyword as a suffix.

The term file 3 stores the terms, extracted by the term extraction unit 2, that have the keyword as a suffix. The stored group of terms forms the reverse word dictionary database.

[Operation]

As shown in FIG. 1, in the present invention the morphological analysis unit 1 morphologically analyzes an input document (a manual or the like) and divides it into parts of speech, and the term extraction unit 2 extracts, from the divided parts of speech, the terms having the specified keyword as a suffix and stores them in the term file 3 as the reverse word dictionary database.

It is therefore possible to extract the terms having the keyword as a suffix from the input document and to create the reverse word dictionary database automatically.
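The three stages described above (analysis, extraction, storage) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `analyze`, `extract_terms`, and `store` are hypothetical names, the term file is modeled as a plain dict, and the analysis stage is replaced by a stand-in that expects a document already segmented with `/` marks. The segmentation of the sample sentence is likewise assumed.

```python
def analyze(document, delimiter="/"):
    """Stand-in for the morphological analysis unit 1.

    Assumption: the document is already segmented with '/' marks.
    A real analyzer would have to find these boundaries itself.
    """
    return [segment for segment in document.split(delimiter) if segment]


def extract_terms(segments, keyword):
    """Stand-in for the term extraction unit 2.

    Keeps segments that end with the keyword and are longer than it.
    """
    return [s for s in segments if s.endswith(keyword) and s != keyword]


def store(term_file, keyword, terms):
    """Stand-in for the term file 3: a dict mapping keyword -> terms."""
    bucket = term_file.setdefault(keyword, [])
    for term in terms:
        if term not in bucket:  # skip terms that are already stored
            bucket.append(term)
    return term_file


# Sample sentence with an assumed, hand-made segmentation.
document = "SVPプログラム/は/3つ/の/AOF制御レジスタ/を/使って/コマンドシーケンス/を/実行する"
term_file = {}
store(term_file, "レジスタ", extract_terms(analyze(document), "レジスタ"))
print(term_file)
```

Keeping the stages as separate functions mirrors the block structure of FIG. 1, so the stand-in analyzer could be swapped for a real one without touching the other two stages.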

[Embodiment]

Next, the configuration and operation of one embodiment of the present invention are explained in detail, in order, with reference to FIGS. 1 to 4.

In FIG. 1, the file editor 4 edits (sorts, merges, deletes, and so on) the terms read from the term file 3 (terms having a keyword as a suffix) and stores the edited result in the output file 5. In this way, for example, an alphabetically ordered INDEX is created.

The operation of the configuration of FIG. 1 is explained in detail with reference to FIGS. 2 and 3.

In FIG. 2, ■ indicates the state in which a document, for example a manual, is read from a document file and input to the morphological analysis unit 1. This means, for example, inputting the sentence shown in FIG. 3(a): "The SVP program executes a command sequence using three AOF control registers."

■ in the figure indicates the state in which a search WORD (keyword) is input. This means, for example, inputting the search WORD "register" shown in FIG. 3(a).

■ in the figure indicates the state in which morphological analysis is performed. This means that the morphological analysis unit 1 of FIG. 1 divides the input document into parts of speech. At this time, in accordance with the back trace analysis rules (■ in the figure), the morphological analysis works backward from the specified position in the document and divides the document into parts of speech. As shown in FIG. 3(b), this means dividing the text into parts of speech, marked by the underlines in the figure, by back-tracing from the end of the specified "AOF control register."
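The source does not give the back trace analysis rules themselves, so the following is only a rough sketch of the idea of working backward from the keyword. It substitutes a simple character-class heuristic for real morphological analysis: starting at each occurrence of the keyword, it walks toward the beginning of the text while the characters are letters or digits other than hiragana, treating hiragana and punctuation as term boundaries. The names `is_term_char` and `back_trace_terms` and the boundary heuristic are assumptions.

```python
def is_term_char(ch):
    """Heuristic: letters and digits belong to a term, except hiragana.

    Assumption: hiragana (U+3040 to U+309F) marks particles or
    inflections and therefore ends the term; punctuation also ends it.
    """
    return ch.isalnum() and not ("\u3040" <= ch <= "\u309f")


def back_trace_terms(text, keyword):
    """Collect terms ending in keyword by scanning backward from it."""
    terms = []
    pos = text.find(keyword)
    while pos != -1:
        end = pos + len(keyword)
        # The keyword must close the term to count as a suffix.
        ends_term = end == len(text) or not is_term_char(text[end])
        start = pos
        while start > 0 and is_term_char(text[start - 1]):
            start -= 1  # step back one character at a time
        if ends_term and start < pos:
            terms.append(text[start:end])
        pos = text.find(keyword, pos + 1)
    return terms


sentence = "SVPプログラムは3つのAOF制御レジスタを使ってコマンドシーケンスを実行する。"
print(back_trace_terms(sentence, "レジスタ"))
```

For the sample sentence the scan stops at the particle の, which yields the term AOF制御レジスタ. Text in which terms are not bounded by hiragana or punctuation would need genuine analysis rules.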

■ in the figure indicates the state in which terms are extracted. This means, for example, extracting the term "AOF control register," which has the keyword (search WORD) "register" as a suffix, from the parts of speech underlined in FIG. 3(b), as shown in FIG. 3(c).

■ in the figure indicates the state in which the extracted terms are stored in the term file.

As a result, a reverse word dictionary database is created.

■ in the figure is the file editor, which edits (sorts, merges, deletes, and so on) the terms read from the term file and stores the result in the term file.
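The editing operations named here (sort, merge, delete) can be illustrated with a short sketch. The function name `edit_terms` and its parameters are hypothetical, and the sample terms are the English renderings of terms quoted earlier in the description. The sort uses plain code-point order, which is an assumption. The source says only that an ordered INDEX results.

```python
def edit_terms(existing, incoming, to_delete=()):
    """Merge two term lists, delete unwanted terms, and sort the rest."""
    merged = set(existing) | set(incoming)  # merge, dropping duplicates
    merged -= set(to_delete)                # delete the listed terms
    return sorted(merged)                   # sort in code-point order


edited = edit_terms(
    existing=["system clock", "basic clock"],
    incoming=["burst clock", "basic clock"],
    to_delete=["system clock"],
)
print(edited)
```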

■ in the figure is an output utility that dumps the edited terms (the reverse word dictionary database) stored in the term file to various output media, for example a floppy disk.

■ in the figure is an automatic terminology processing system that performs various creation processes such as INDEX creation, glossary creation, synonym/related-word list creation, abbreviation list creation, and reverse word dictionary creation.

By the above procedure, the input document is divided into parts of speech, and the terms having the keyword (search WORD) as a suffix are then extracted from those parts of speech, so that the reverse word dictionary database can be created automatically.

FIG. 4 shows an example of a reverse word dictionary database, created by the procedure shown in the flowchart of FIG. 2. "M-900 Operation Manual" on the second line is the title of the document.

"Port" on the third line is a headword (search WORD, keyword).

"Address exchange port" and the other entries following the third line are the extracted term group, that is, the terms having the headword as a suffix, extracted by the procedure of FIG. 2.

Similarly, for the headword "register," the lines following the headword list the extracted terms that have "register" as a suffix.

[Effect of the invention]

As explained above, the present invention adopts a configuration in which terms having a keyword as a suffix are extracted from an input document to create a reverse word dictionary database, so the reverse word dictionary database can be created automatically.

The automatically created reverse word dictionary database can be edited to produce the INDEX, glossary, abbreviation list, synonym/related-word list, and the like of a manual. This makes it possible to reduce the man-hours needed to create indexes for manuals and books, to improve their quality, to standardize terminology by sharing the reverse word dictionary database, and to promote the conversion of documents into electronic files.

[Brief explanation of the drawings]

FIG. 1 is a configuration diagram of one embodiment of the present invention, FIG. 2 is a flowchart explaining the operation of the present invention, FIG. 3 shows an example of morphological analysis and term extraction, and FIG. 4 shows an example of a reverse word dictionary database. In the figures, 1 denotes the morphological analysis unit, 2 the term extraction unit, 3 the term file, and 4 the file editor.

Claims

[Claims]

A reverse word dictionary database creation method for creating a reverse word dictionary database by searching for terms having a keyword as a suffix, comprising: a morphological analysis unit (1) that analyzes an input document and divides it into parts of speech; and a term extraction unit (2) that extracts, from the parts of speech divided by the morphological analysis unit (1), those having the specified keyword as a suffix; the method being characterized in that the terms extracted by the term extraction unit (2) are stored in the reverse word dictionary database.