JPH11272709A

JPH11272709A - File retrieval system

Info

Publication number: JPH11272709A
Application number: JP10090899A
Authority: JP
Inventors: Atsuko Niimura; 敦子新村; Osamu Aihara; 理相原; Koushi Yamanaka; 航史山中
Original assignee: NTT Data Corp
Current assignee: NTT Data Group Corp
Priority date: 1998-03-19
Filing date: 1998-03-19
Publication date: 1999-10-08

Abstract

PROBLEM TO BE SOLVED: To present a user with appropriate information for narrowing down when a plurality of files obtained as a search result are narrowed down to specific files. SOLUTION: When the user inputs a keyword for document data retrieval (step S21), retrieval of document data is started based on this keyword (step S22). When each retrieved document data is registered in its storage area of a database, keywords are extracted from the document data, and importance data and appearance rate data for each keyword are calculated and registered in the corresponding storage areas (step S23). The importance of each document data is then calculated (step S24). The importance data and appearance rate data of each keyword, together with the importance data of each document data, are used to calculate data D indicating how effective each keyword is for narrowing down the document data (step S27). The top K (or top L%) keywords by effectiveness are displayed as keywords for narrowing down the document data (step S28), and their effectiveness for each document data is also displayed (step S29).

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an improvement of a file retrieval system that narrows down, from a plurality of files obtained as a search result, the files that match the search target.

[0002]

[Prior Art] Conventionally, as a method of narrowing down the results of a file search performed on a terminal such as a personal computer, a method is known in which keywords related to the keyword used for the file search (the search keyword) are extracted using a synonym dictionary or a learning dictionary built into the terminal and are presented to the user via the display unit of the terminal.

[0003]

[Problems to Be Solved by the Invention] With the narrowing-down method described above, however, keywords that have little relation to the files given as the search result are also extracted, so the files cannot always be narrowed down efficiently. In view of this, a method has been proposed in which keywords contained in the group of files given as the search result are extracted from a database and presented to the user via the display unit of the terminal, so that the user can narrow down the files efficiently. This method is disclosed in Japanese Unexamined Patent Publication No. Sho 64-1030.

[0004] However, even with the proposed method, when the volume of documents to be searched is enormous or the number of keywords registered in the database is large, the number of keywords offered for narrowing down may also become large. In such a situation the user may hesitate when choosing a keyword and, as a result, select an inappropriate one, so that the files may not be narrowed down efficiently.

[0005] To cope with this, the use of search techniques has also been studied, such as free-text search, in which keywords are automatically extracted from free text entered by the user and used to search for files, and full-text search, in which the search keyword is matched against the text in the files to be searched. However, even with free-text search or full-text search, when the volume of documents to be searched is enormous or the number of keywords registered in the database is large, the problem of a large number of narrowing-down keywords cannot be avoided, just as in the proposed method.

[0006] Furthermore, in the proposed method, presenting only keywords with a high appearance frequency, or only keywords with a specific appearance frequency, is also conceivable, but such keywords are not necessarily appropriate keywords for narrowing down.

[0007] Accordingly, an object of the present invention is to make it possible to present the user with appropriate information for narrowing down when specific files are narrowed down from a plurality of files obtained as a search result.

[0008] Another object of the present invention is to enable efficient file search through simple operation of a terminal.

[0009]

[Means for Solving the Problems] A file retrieval system according to a first aspect of the present invention narrows down, from a plurality of files obtained as a search result, the files that match the search target, and comprises means for extracting, from each file, information representing the features of that file, and means for selecting, from the extracted feature information, information suitable for narrowing down to specific files and presenting it.

[0010] With this configuration, information suitable for narrowing down to specific files is selected from the extracted feature information and presented, so that appropriate information for narrowing down can be provided to the user.

[0011] In a preferred embodiment according to the first aspect of the present invention, the feature information is the keywords present in each file. The file search is executed when information for specifying the files to be searched is input. As this file-specifying input, any of keyword information representing the features of a file, free-text information, and voice information is used. The file search is performed by full-text search or by a technique similar to full-text search. When the plurality of files obtained as the search result are registered, the extracting means extracts the feature information of each file from these files.

[0012] In the above embodiment, the selecting and presenting means comprises first means for calculating the importance of the extracted feature information within each file, second means for calculating the appearance rate of the extracted feature information within each file, third means for calculating the importance of each file, and fourth means for calculating the effectiveness of the feature information from the calculated importance of the feature information, the calculated appearance rate of the feature information, and the calculated importance of the file. Feature information whose calculated effectiveness is relatively high is presented as feature information suitable for narrowing down. The first means calculates the importance of the feature information either by classifying the feature information into graded levels according to the position or location at which it appears in each file, or by referring to importance values registered in advance. The second means calculates the appearance rate of the feature information from the number of occurrences of that feature information in each file and the number of occurrences of all feature information in that file. The third means calculates the importance of each file based on the rank of each file when ranks are output, or based on a value indicating the certainty or importance of each file when such a value is output. Alternatively, only the several files of highest importance may be extracted and passed to the fourth means. The fourth means presents the several pieces of feature information having the highest calculated effectiveness as feature information suitable for narrowing down.

[0013] A file retrieval system according to a second aspect of the present invention narrows down, from a plurality of files obtained as a search result, the files that match the search target, and comprises means for extracting, from each file, information representing the features of that file, means for selecting, from the extracted feature information, information suitable for narrowing down to specific files and presenting it, and means for searching the files again, when specific information is designated from among the presented information, based on that information.

[0014] With this configuration, when specific information is designated from among the presented information, the files are searched again based on that information, so that files can be searched efficiently through simple operation of the terminal.

[0015] In a preferred embodiment according to the second aspect of the present invention, the feature information is the keywords present in each file. The file search is executed when information for specifying the files to be searched is input. As this file-specifying input, any of keyword information representing the features of a file, free-text information, and voice information is used. The file search is performed by full-text search or by a technique similar to full-text search. When the plurality of files obtained as the search result are registered, the extracting means extracts the feature information of each file from these files.

[0016] In the above embodiment, the selecting and presenting means comprises first means for calculating the importance of the extracted feature information within each file, second means for calculating the appearance rate of the extracted feature information within each file, third means for calculating the importance of each file, and fourth means for calculating the effectiveness of the feature information from the calculated importance of the feature information, the calculated appearance rate of the feature information, and the calculated importance of the file. Feature information whose calculated effectiveness is relatively high is presented as feature information suitable for narrowing down. The first means calculates the importance of the feature information either by classifying the feature information into graded levels according to the position or location at which it appears in each file, or by referring to importance values registered in advance. The second means calculates the appearance rate of the feature information from the number of occurrences of that feature information in each file and the number of occurrences of all feature information in that file. The third means calculates the importance of each file based on the rank of each file when ranks are output, or based on a value indicating the certainty or importance of each file when such a value is output. Alternatively, only the several files of highest importance may be extracted and passed to the fourth means. The fourth means presents the several pieces of feature information having the highest calculated effectiveness as feature information suitable for narrowing down.

[0017] A program medium according to a third aspect of the present invention carries, in computer-readable form, a computer program for causing a computer to operate as each of the means of a file retrieval system that narrows down, from a plurality of files obtained as a search result, the files that match the search target, the system comprising means for extracting, from each file, information representing the features of that file, and means for selecting, from the extracted feature information, information suitable for narrowing down to specific files and presenting it.

[0018]

[Embodiments of the Invention] Embodiments of the present invention are described below in detail with reference to the drawings.

[0019] FIG. 1 is an explanatory diagram showing a database used in a file retrieval system according to one embodiment of the present invention.

[0020] The database is built into a terminal, such as a personal computer, on which the file retrieval system according to one embodiment of the present invention is constructed. The database stores the plurality of files obtained as a result of a search performed by operating the terminal (hereinafter referred to as "document data"), as well as the various data used to narrow them down.

[0021] That is, the database 1 has a storage area 3 for document data, a storage area 5 for the keywords present in each document data, a storage area 7 for data indicating the importance of those keywords (importance data storage area), and a storage area 9 for data indicating the appearance rate of those keywords in each document data (appearance rate data storage area). A plurality of storage areas 3 are provided so that the plurality of document data I, II, ... retrieved by the file retrieval system according to this embodiment can be registered individually, and a plurality of storage areas 5, 7, and 9 are provided in correspondence with the respective storage areas 3.

[0022] The document data I, II, ... obtained as a result of the search by the retrieval system are registered in the respective document data storage areas 3. The keywords (A, B, ...) extracted from the document data I, II, ... when those document data are registered in the storage areas 3 are registered in the respective keyword storage areas 5. The importance data b (b1, b2, ...) calculated for the extracted keywords (A, B, ...) when the document data I, II, ... are registered in the storage areas 3 are registered in the importance data storage areas 7. Likewise, the appearance rate data c (c1, c2, ...) calculated for the extracted keywords (A, B, ...) at that time are registered in the appearance rate data storage areas 9.
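
The storage layout described for database 1 can be sketched as a small data structure. This is only an illustrative sketch: the class name `DocumentRecord`, the field names, and the sample values are assumptions that do not appear in the patent, which specifies only the storage areas 3, 5, 7, and 9.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DocumentRecord:
    """One registered document plus its per-keyword data (hypothetical layout)."""
    text: str                                                        # storage area 3
    keywords: List[str] = field(default_factory=list)                # storage area 5
    importance: Dict[str, float] = field(default_factory=dict)       # storage area 7 (b)
    appearance_rate: Dict[str, float] = field(default_factory=dict)  # storage area 9 (c)


# Database 1: one record per document data I, II, ...
database: Dict[str, DocumentRecord] = {}
database["I"] = DocumentRecord(
    text="sample text",                       # made-up content
    keywords=["A", "B"],
    importance={"A": 3.0, "B": 1.0},          # made-up values
    appearance_rate={"A": 0.6, "B": 0.4},     # made-up values
)
```

Keeping the three per-keyword areas inside the same record mirrors the statement that areas 5, 7, and 9 are provided in correspondence with each area 3.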

[0023] Of the data registered in the storage areas 3 to 9, the document data I, II, ... registered in the storage areas 3 are read out when their importance a (a1, a2, ...) is calculated and when they are displayed on the display unit of the terminal. The data registered in the storage areas 7 and 9 are used, together with the importance of the document data I, II, ..., to calculate the effectiveness of each keyword (A, B, ...) for narrowing down the document data I, II, ..., and are read out when that effectiveness is calculated. The keywords (A, B, ...) registered in the storage areas 5 are read out when those ranked highest in effectiveness are displayed on the display unit of the terminal as keywords suitable for narrowing down the document data I, II, ....

[0024] FIGS. 2 and 3 are flowcharts showing the processing operation of the file retrieval system according to one embodiment of the present invention.

[0025] In FIGS. 2 and 3, when the user operates the keyboard (or mouse) or the like of the terminal and inputs a keyword for document data retrieval into the terminal (step S21), retrieval of document data is started based on that keyword (the input information). The technique applied to this retrieval is not limited to any particular one. However, in order to obtain the information needed to calculate the importance of the document data in step S24 described below, it is desirable to use a technique, such as full-text search, that outputs an importance, a certainty, or a ranking by certainty as part of the search result (step S22).

[0026] Next, when the document data I, II, ... obtained as a result of the search are registered in the respective storage areas 3 of the database 1, keywords (A, B, ...) are extracted from the document data I, II, .... The importance data b (b1, b2, ...) and the appearance rate data c (c1, c2, ...) of these keywords (A, B, ...) are then calculated and registered in the storage areas 7 and 9, respectively (step S23).

[0027] The procedure for calculating each of these data is described next.

[0028] To calculate the importance data b (b1, b2, ...), a technique is applied that classifies the importance of each keyword (A, B, ...) into G levels according to the position or the location at which the keyword appears in each document data I, II, .... In classification by appearance position, an importance is defined in advance for each position in the document data, for example by assigning a high importance to keywords that appear at the beginning or the end of the document data, and that importance is adopted as the importance data of the keyword (A, B, ...) appearing at that position. In classification by appearance location, an importance is defined in advance for each of certain terms in the document data, for example by assigning a high importance to keywords that appear near terms such as "overview", "abstract", or "summary", and that importance is adopted as the importance data of the keyword (A, B, ...) appearing near that term.
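
The classification by appearance position can be illustrated with a minimal sketch. The patent does not define the G levels or where their boundaries lie, so the function name `importance_by_position`, the default of three levels, and the 10% edge margin are all assumptions chosen for illustration; only the idea that keywords at the beginning or end of a document receive a high importance comes from the text.

```python
from typing import Sequence


def importance_by_position(tokens: Sequence[str], keyword: str,
                           levels: int = 3, edge: float = 0.1) -> int:
    """Return an importance level for `keyword` (hypothetical rule).

    A keyword that occurs within the first or last `edge` fraction of the
    document gets the highest level; any other occurrence gets level 1;
    a keyword that never occurs gets 0.
    """
    n = len(tokens)
    positions = [i for i, token in enumerate(tokens) if token == keyword]
    if not positions:
        return 0
    margin = max(1, int(n * edge))  # size of the "beginning" and "end" zones
    near_edge = any(p < margin or p >= n - margin for p in positions)
    return levels if near_edge else 1


tokens = ["search", "a", "b", "c", "d", "e", "f", "g", "h", "index"]
first_word_level = importance_by_position(tokens, "search")  # at the beginning
middle_word_level = importance_by_position(tokens, "d")      # in the middle
```

Classification by appearance location (proximity to terms such as "summary") could follow the same pattern with a distance test against cue terms instead of the edge test.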

[0029] To calculate the appearance rate data c (c1, c2, ...), let c<ij> be the appearance rate of keyword <i> in document <j>; c<ij> is then given by equation (1) below.

[0030] c<ij> = (number of occurrences of keyword <i> in document <j>) / (number of occurrences of all keywords in document <j>) ... (1)

When the processing in step S23 is finished, the document data I, II, ... are read out from the storage areas 3, and the importance a (a1, a2, ...) of each document data I, II, ... is calculated (step S24).
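
Equation (1) amounts to a relative frequency within one document. The following minimal sketch assumes that keyword extraction has already produced the list of keyword occurrences for a document; the function name `appearance_rates` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def appearance_rates(keyword_occurrences: Iterable[str]) -> Dict[str, float]:
    """Compute c<ij> for every keyword <i> of one document <j> per equation (1)."""
    counts = Counter(keyword_occurrences)
    total = sum(counts.values())  # occurrences of all keywords in the document
    if total == 0:
        return {}
    return {keyword: n / total for keyword, n in counts.items()}


rates = appearance_rates(["A", "A", "B", "C"])
```

By construction the rates of one document sum to 1, which makes values comparable between long and short documents.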

[0031] As one example of this importance calculation, when a rank x (x1, x2, ...) is output for each document data I, II, ..., the rank x is regarded as the importance a (a1, a2, ...) of that document data, and a = x (that is, a1 = x1, a2 = x2, ...). In other words, the rank x is substituted for the importance a of each document data I, II, .... As another example, when a value y (y1, y2, ...) indicating certainty or importance is output for each document data I, II, ..., that value is regarded as the importance a (a1, a2, ...) of the document data, and either a = y (that is, a1 = y1, a2 = y2, ...) or a = y/Y (that is, a1 = y1/Y, a2 = y2/Y, ...). In other words, the value y or y/Y is substituted for the importance a of each document data I, II, .... Here, Y = Σy (that is, y1 + y2 + ...).
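
The two score-based variants, a = y and a = y/Y, can be sketched as follows. The function name `document_importance` and the raw scores in the example are assumptions; the scores were picked only so that the normalized result is easy to check by hand.

```python
from typing import Dict


def document_importance(scores: Dict[str, float],
                        normalize: bool = True) -> Dict[str, float]:
    """Map search-engine scores y to document importance a.

    normalize=False gives a = y; normalize=True gives a = y / Y with Y = sum of y.
    """
    if not normalize:
        return dict(scores)
    total = sum(scores.values())  # Y
    if total == 0:
        raise ValueError("scores must not sum to zero")
    return {doc: y / total for doc, y in scores.items()}


# Hypothetical raw scores for five documents.
a = document_importance({"I": 20.0, "II": 18.0, "III": 9.0, "IV": 2.0, "V": 1.0})
```

The rank-based variant a = x corresponds to passing the ranks through unchanged, that is, calling the function with `normalize=False`.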

[0032] When the above processing is finished, the importance data b (b1, b2, ...) of the keywords (A, B, ...) are read out from the storage areas 7 of the database 1, and the appearance rate data c (c1, c2, ...) of the keywords (A, B, ...) are read out from the storage areas 9 (steps S25 and S26). Using these data and the importance data a (a1, a2, ...) of the document data I, II, ... calculated in step S24, data (D) indicating the effectiveness of each keyword (A, B, ...) for narrowing down the document data I, II, ... (hereinafter, "effectiveness data (D)") is calculated (step S27).

[0033] Next, the procedure for calculating the effectiveness data (D) is described, taking as an example the case in which, for the importance data a (a1, a2, ...) of the document data I, II, ..., the importance data b (b1, b2, ...) of the keywords (A, B, ...), and the appearance rate data c (c1, c2, ...) alike, a larger numerical value represents a higher importance or appearance rate.

[0034] The effectiveness data (D) is calculated so that it takes a higher value for keywords with a high importance within the document data I, II, ... and for keywords contained in document data with a high importance. Let d<ij> be the effectiveness of keyword <i> in document data <j>; d<ij> is then given by equation (2) below.

[0035] d<ij> = (P*a<j> + Q*b<ij>) * (R + c<ij>) ... (2)

Here, a<j> is the importance of document data <j>, b<ij> is the importance of keyword <i> in document data <j>, c<ij> is, as described above, the appearance rate of keyword <i> in document data <j>, and P, Q, and R are coefficients. The coefficients P, Q, and R are provided in equation (2) because the equation multiplies (the appearance frequency of the keyword in each document data) by (the importance of the document data) and (the importance of the keyword), and the coefficients moderate the otherwise large influence of the appearance frequency on the value of the effectiveness (D).
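
Equation (2) translates directly into a one-line function. In this sketch the function name `effectiveness` is hypothetical, and the defaults P = Q = 1, R = 0 are the simplified coefficient values that the description uses for its worked figures; the input values in the example are made up.

```python
def effectiveness(a_j: float, b_ij: float, c_ij: float,
                  P: float = 1.0, Q: float = 1.0, R: float = 0.0) -> float:
    """d<ij> = (P*a<j> + Q*b<ij>) * (R + c<ij>), per equation (2)."""
    return (P * a_j + Q * b_ij) * (R + c_ij)


d_default = effectiveness(a_j=0.4, b_ij=3.0, c_ij=0.5)         # R = 0
d_damped = effectiveness(a_j=0.4, b_ij=3.0, c_ij=0.5, R=1.0)   # R = 1
```

With R = 0 a keyword whose appearance rate is halved also has its effectiveness halved; a positive R adds a constant to the rate factor, so the same change in appearance rate has a proportionally smaller effect.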

[0036] Further, let D<i> be the effectiveness of keyword <i>; D<i> is then given by equation (3) below.

[0037] D<i> = Σ_j d<ij> ... (3)

Here, Σ_j d<ij> is the cumulative total, over the document data I, II, ..., of the effectiveness of keyword <i>.
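
The accumulation in equation (3) can be sketched as a sum over documents for each keyword. The representation of the per-document values as a dictionary keyed by (keyword, document) pairs and the name `total_effectiveness` are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Tuple


def total_effectiveness(d: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Sum d<ij> over all documents <j> to obtain D<i> for each keyword <i>."""
    totals: Dict[str, float] = defaultdict(float)
    for (keyword, _document), value in d.items():
        totals[keyword] += value
    return dict(totals)


# Made-up per-document effectiveness values.
D = total_effectiveness({("A", "I"): 1.5, ("A", "II"): 0.5, ("B", "I"): 0.25})
```

A keyword that is absent from a document simply contributes nothing for that document.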

[0038] When the processing in step S27 is finished, the top K (or top L%) keywords by effectiveness D are displayed on the display unit of the terminal, based on the calculation results, as keywords for narrowing down the document data I, II, ... registered in the storage areas 3 in step S23 (step S28). The document data I, II, ... are displayed along with them (step S29).
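
Selecting the top K or top L% keywords in step S28 can be sketched as a sort followed by a cut. The function name `top_keywords`, the rounding up of the percentage cut, and the tie handling (ties keep their input order) are assumptions; the patent does not specify them.

```python
import math
from typing import Dict, List, Optional


def top_keywords(D: Dict[str, float], k: Optional[int] = None,
                 percent: Optional[float] = None) -> List[str]:
    """Return the keywords with the highest effectiveness D.

    Pass either k (top K keywords) or percent (top L% of keywords,
    rounded up so that a non-zero percentage yields at least one keyword).
    """
    ranked = sorted(D, key=lambda keyword: D[keyword], reverse=True)
    if k is None:
        if percent is None:
            raise ValueError("either k or percent is required")
        k = math.ceil(len(ranked) * percent / 100)
    return ranked[:k]


scores = {"A": 2.0, "B": 0.25, "C": 1.0, "D": 0.5}  # made-up values
by_count = top_keywords(scores, k=2)
by_percent = top_keywords(scores, percent=25)
```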

[0039] Next, suppose that the user checks the document data I, II, ... displayed as the search result and the keywords displayed as effective for narrowing them down, and inputs a command to select specific document data from among the displayed document data I, II, ... (step S30). In this case, the selected document data is displayed on the display unit (step S31), and when the user inputs a command to acquire that document data (step S32), the series of operations for the file retrieval processing is terminated. If no such command is given in step S32, the process returns to step S30 and waits for the user to decide whether to select one of the displayed narrowing-down keywords and search the document data again, or to acquire one of the displayed document data I, II, ....

[0040] On the other hand, suppose that in step S30 the user selects a specific keyword from among the displayed narrowing-down keywords and designates it as a search keyword. In this case, the designated keyword is added to the search keywords (step S33), the process moves to step S22, and the processing of steps S22 to S30 is executed again, so that the document data are searched again.
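
The loop formed by steps S22 to S33 can be sketched as follows. This is a simplified control-flow illustration, not the patented implementation: the search engine, the keyword suggestion step, and the user's choice are passed in as callables, and every name here (`refine_search`, `search`, `suggest`, `choose`, `max_rounds`) is hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def refine_search(
    search: Callable[[List[str]], List[str]],
    suggest: Callable[[List[str]], List[str]],
    choose: Callable[[List[str], List[str]], Optional[str]],
    initial_keywords: Sequence[str],
    max_rounds: int = 10,
) -> Tuple[List[str], List[str]]:
    """Repeat search and narrowing until the user stops picking keywords.

    Returns the final search keywords and the documents from the last search.
    """
    keywords = list(initial_keywords)
    documents: List[str] = []
    for _ in range(max_rounds):
        documents = search(keywords)              # step S22
        suggestions = suggest(documents)          # steps S23 to S28
        picked = choose(documents, suggestions)   # step S30
        if picked is None:                        # user takes a document instead
            break
        keywords.append(picked)                   # step S33, then back to S22
    return keywords, documents
```

Passing the user interaction in as `choose` keeps the sketch independent of any particular display or input device; the `max_rounds` limit is a safety measure added here and is not part of the described flow.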

[0041] As described above, according to one embodiment of the present invention, keywords appropriate for narrowing down to specific document data from among the plurality of document data obtained as a search result can be presented to the user. Because these appropriate keywords are presented when the user narrows down the document data, the user can efficiently obtain the desired document data. Furthermore, document data can be searched efficiently through simple operation of the terminal, and the time required for the search can be shortened.

[0042] The above description relates only to one embodiment of the present invention and, of course, does not mean that the present invention is limited to it.

[0043] For example, in the above embodiment the information input to the terminal in step S21 is a keyword, but free-text input or voice input may be used instead of a keyword.

[0044] In the above embodiment, the importance data of the keywords extracted in step S23 is calculated with reference to the position or location at which each keyword appears in each document data; alternatively, the importance of each keyword may be registered in a dictionary in advance and obtained by referring to those values.

[0045] Furthermore, in the above embodiment, the rank or a value indicating certainty or importance is used to calculate the importance a (a1, a2, ...) of the document data I, II, ... in step S24; instead, a technique of narrowing down to only the top (N or M%) document data by importance may be adopted. If the processing from step S25 onward is executed with this technique, the processing time can be shortened even when the volume of retrieved documents is enormous.

[0046] Next, examples of the importance data of each piece of document data, the keyword importance data, the keyword appearance rate data, and the keyword effectiveness data obtained by the file search method according to the embodiment described above are explained with reference to the data shown in FIGS. 4 to 10.

[0047] For simplicity, the example shown in FIGS. 4 to 10 illustrates the calculation of the effectiveness (D) for a case in which the search returns five pieces of document data. In practice, both the number of pieces of document data obtained as a search result and the number of keywords contained in each piece of document data are generally much larger. Also for simplicity, the coefficients P, Q, and R in equation (2) above are set to P = Q = 1 and R = 0.

[0048] FIG. 4 is an explanatory diagram showing the importance data of each piece of document data.

[0049] In the example shown in FIG. 4, the importance a of each piece of document data, given by the value of y/Yy (or Σy) described above, is highest for document data I at 40% and decreases in the order of document data II, III, IV, and V (36% for document data II, 18% for document data III, 4% for document data IV, and 2% for document data V).

[0050] FIG. 5 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data I.

[0051] In document data I, as illustrated, of the six keywords listed, "multimedia" has the highest appearance rate at 0.357, while "document management" has the highest importance and effectiveness, with an importance of 50 and an effectiveness of 16.1.
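As a rough illustration of how an appearance rate such as 0.357 can arise, the sketch below assumes that the rate is a keyword's occurrence count divided by the total occurrence count of all keywords extracted from the same document. The occurrence counts and the names "keyword C" to "keyword F" are hypothetical; the counts are chosen only so that "multimedia" yields 0.357 among six keywords.

```python
def appearance_rates(counts):
    """Return each keyword's share of all keyword occurrences in one document."""
    total = sum(counts.values())
    return {keyword: n / total for keyword, n in counts.items()}


# Hypothetical occurrence counts for six keywords in one document (total 28).
counts = {
    "multimedia": 10,
    "document management": 5,
    "keyword C": 4,
    "keyword D": 4,
    "keyword E": 3,
    "keyword F": 2,
}
rates = appearance_rates(counts)
print(round(rates["multimedia"], 3))  # 10 / 28, i.e. 0.357
```

By construction the rates of all keywords in a document sum to 1, so a keyword that dominates a document's vocabulary scores high regardless of the document's length.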

[0052] FIG. 6 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data II.

[0053] In document data II, as illustrated, of the five keywords listed, "document management" has the highest appearance rate and effectiveness, with an appearance rate of 0.357 and an effectiveness of 23.6, while "synonym expansion" has the highest importance at 40.

[0054] FIG. 7 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data III.

[0055] In document data III, as illustrated, all four keywords listed have the same appearance rate of 0.250, and "database" and "security" have the highest importance and effectiveness, each with an importance of 30 and an effectiveness of 12.0.

[0056] FIG. 8 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data IV.

[0057] In document data IV, as illustrated, of the three keywords listed, "character recognition" is highest in appearance rate, importance, and effectiveness alike, with an appearance rate of 0.667, an importance of 50, and an effectiveness of 36.0.
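The values quoted for FIGS. 4, 7, and 8 can be cross-checked with a short sketch. Equation (2) is defined earlier in the description and is not reproduced in this section, so the form used below, d = appearance rate × (P × keyword importance + Q × document importance in percent), is only an assumption, chosen because it reproduces the quoted values with P = Q = 1. The role of R, which the example sets to 0, is not modeled, and the function name `effectiveness` is hypothetical.

```python
def effectiveness(rate, keyword_importance, doc_importance_pct, p=1.0, q=1.0):
    """Per-document effectiveness d of one keyword under the ASSUMED form
    d = rate * (p * keyword_importance + q * doc_importance_pct).
    """
    return rate * (p * keyword_importance + q * doc_importance_pct)


# Document data III: "database" has rate 0.250 and importance 30; the document's importance is 18%.
d_iii = effectiveness(0.250, 30, 18)
# Document data IV: "character recognition" has rate 0.667 and importance 50; the document's importance is 4%.
d_iv = effectiveness(0.667, 50, 4)
print(round(d_iii, 1), round(d_iv, 1))  # 12.0 36.0
```

Under this assumed form, a keyword scores high when it is frequent within a document and when either the keyword or the document carries high importance.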

[0058] FIG. 9 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data V.

[0059] In document data V, as illustrated, of the four keywords listed, "automatic collection" has the highest appearance rate and effectiveness, with an appearance rate of 0.455 and an effectiveness of 14.5, while "automatic collection" and "distributed processing" share the highest importance at 30.

[0060] FIG. 10 is an explanatory diagram showing the keyword effectiveness data (D) across document data I to V.

[0061] As described above, the effectiveness data (D) shown in FIG. 10 is the cumulative value (Σd) of the effectiveness of each keyword over all the document data obtained as a search result (document data I to V in this example).

[0062] FIG. 10 lists 14 keywords as the keywords extracted from document data I to V.

[0063] Of these, the keyword with the highest effectiveness is "document management", which was extracted from both document data I and document data II. As illustrated, its cumulative effectiveness is 39.7, larger than that of any of the remaining 13 keywords. The second most effective keyword is "character recognition", extracted only from document data IV, whose cumulative value is the 36.0 obtained from document data IV. The least effective keyword is "network", extracted from document data II, whose cumulative value is the 2.6 obtained from document data II.
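The accumulation D = Σd and the selection of the top keywords can be sketched as below. Only the per-document effectiveness values quoted in the text are used, so the other keywords of FIG. 10 are omitted; the function names and the data layout are assumptions made for this illustration.

```python
from collections import defaultdict


def cumulative_effectiveness(per_document):
    """Sum each keyword's per-document effectiveness d over all documents (D = sum of d)."""
    totals = defaultdict(float)
    for keywords in per_document.values():
        for keyword, d in keywords.items():
            totals[keyword] += d
    return dict(totals)


def top_keywords(totals, k):
    """Return the k keywords with the largest cumulative effectiveness."""
    return sorted(totals, key=totals.get, reverse=True)[:k]


# Per-document effectiveness values quoted in the description (other keywords omitted).
per_document = {
    "I": {"document management": 16.1},
    "II": {"document management": 23.6, "network": 2.6},
    "IV": {"character recognition": 36.0},
}
totals = cumulative_effectiveness(per_document)
print(round(totals["document management"], 1))  # 16.1 + 23.6 = 39.7
print(top_keywords(totals, 2))  # ['document management', 'character recognition']
```

A keyword that appears in several retrieved documents accumulates effectiveness from each of them, which is why "document management" ranks above "character recognition" even though the latter has the largest single-document value.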

[0064]

[Effects of the Invention] As described above, according to the present invention, when a plurality of files obtained as a search result are narrowed down to specific files, appropriate information for the narrowing can be presented to the user.

[0065] In addition, file searches can be performed efficiently through simple operation of the terminal.

[Brief Description of the Drawings]

FIG. 1 is an explanatory diagram showing the database provided in a file search method according to one embodiment of the present invention.

FIG. 2 is a flowchart showing the processing operation of the file search method according to one embodiment of the present invention.

FIG. 3 is a flowchart showing the processing operation of the file search method according to one embodiment of the present invention.

FIG. 4 is an explanatory diagram showing the importance data of each piece of document data retrieved by the file search method of the embodiment.

FIG. 5 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data I.

FIG. 6 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data II.

FIG. 7 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data III.

FIG. 8 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data IV.

FIG. 9 is an explanatory diagram showing the keyword appearance rate data, importance data, and effectiveness data for document data V.

FIG. 10 is an explanatory diagram showing the keyword effectiveness data across document data I to V.

[Explanation of Reference Numerals]

1: database; 3, 5, 7, 9: storage areas

Claims

[Claims]

1. A file search method for narrowing down, from a plurality of files obtained as a search result, the files that correspond to a search target, comprising: means for extracting, from each of the files, information representing the characteristics of that file; and means for selecting and presenting, from the extracted feature information, information suitable for narrowing down to a specific file.

2. The file search method according to claim 1, wherein the feature information is a keyword present in each of the files.

3. The file search method according to claim 1, wherein the search for the files is executed when information for specifying the files to be searched is input.

4. The file search method according to claim 3, wherein the input file-specifying information is any one of keyword information representing characteristics of the files, free-text information, and voice information.

5. The file search method according to claim 3 or claim 4, wherein the search for the files is executed by full-text search or by a technique similar to full-text search.

6. The file search method according to claim 1, wherein, when the plurality of files obtained as a search result are registered, the extracting means extracts the feature information of each file from those files.

7. The file search method according to claim 1, wherein the selecting and presenting means comprises: first means for calculating the importance of the extracted feature information within each of the files; second means for calculating the appearance rate of the extracted feature information within each of the files; third means for calculating the importance of each of the files; and fourth means for calculating the effectiveness of the feature information from the calculated importance of the feature information, the calculated appearance rate of the feature information, and the calculated importance of the file; and wherein feature information having a relatively high effectiveness among the calculated effectiveness values is presented as feature information suitable for narrowing down.

8. The file search method according to claim 7, wherein the first means calculates the importance of the feature information by classifying the feature information in stages according to the position or location at which it appears in each of the files.

9. The file search method according to claim 7, wherein the first means calculates the importance of the feature information by taking into account importance values registered in advance.

10. The file search method according to claim 7, wherein the second means calculates the appearance rate of the feature information from the number of occurrences of the feature information in each of the files and the number of occurrences of all feature information in each of the files.

11. The file search method according to claim 7, wherein the third means calculates the importance of each of the files based on the rank when a rank of each file is output, and based on the value when a value indicating the likelihood or importance of each file is output.

12. The file search method according to claim 7, wherein the third means extracts only a plurality of top-ranked files of high importance and supplies them to the fourth means.

13. The file search method according to claim 7, wherein the fourth means presents, as feature information suitable for narrowing down, a plurality of top-ranked pieces of feature information having relatively high effectiveness among the calculated feature information.

14. A file search method for narrowing down, from a plurality of files obtained as a search result, the files that correspond to a search target, comprising: means for extracting, from each of the files, information representing the characteristics of that file; means for selecting and presenting, from the extracted feature information, information suitable for narrowing down to a specific file; and means for searching the files again, when specific information is designated from among the presented information, based on that information.

15. The file search method according to claim 14, wherein the feature information is a keyword present in each of the files.

16. The file search method according to claim 14, wherein the search for the files is executed when information for specifying the files to be searched is input.

17. The file search method according to claim 16, wherein the input file-specifying information is any one of keyword information representing characteristics of the files, free-text information, and voice information.

18. The file search method according to claim 16 or claim 17, wherein the search for the files is executed by full-text search or by a technique similar to full-text search.

19. The file search method according to claim 14, wherein, when the plurality of files obtained as a search result are registered, the extracting means extracts the feature information of each file from those files.

20. The file search method according to claim 14, wherein the selecting and presenting means comprises: first means for calculating the importance of the extracted feature information within each of the files; second means for calculating the appearance rate of the extracted feature information within each of the files; third means for calculating the importance of each of the files; and fourth means for calculating the effectiveness of the feature information from the calculated importance of the feature information, the calculated appearance rate of the feature information, and the calculated importance of the file; and wherein feature information having a relatively high effectiveness among the calculated effectiveness values is presented as feature information suitable for narrowing down.

21. The file search method according to claim 20, wherein the first means calculates the importance of the feature information by classifying the feature information in stages according to the position or location at which it appears in each of the files.

22. The file search method according to claim 20, wherein the first means calculates the importance of the feature information by taking into account importance values registered in advance.

23. The file search method according to claim 20, wherein the second means calculates the appearance rate of the feature information from the number of occurrences of the feature information in each of the files and the number of occurrences of all feature information in each of the files.

24. The file search method according to claim 20, wherein the third means calculates the importance of each of the files based on the rank when a rank of each file is output, and based on the value when a value indicating the likelihood or importance of each file is output.

25. The file search method according to claim 20, wherein the third means extracts only a plurality of top-ranked files of high importance and supplies them to the fourth means.

26. The file search method according to claim 20, wherein the fourth means presents, as feature information suitable for narrowing down, a plurality of top-ranked pieces of feature information having relatively high effectiveness among the calculated feature information.

27. A computer-readable program medium carrying a computer program for causing a computer to operate as each of the means of a file search method for narrowing down, from a plurality of files obtained as a search result, the files that correspond to a search target, the file search method comprising: means for extracting, from each of the files, information representing the characteristics of that file; and means for selecting and presenting, from the extracted feature information, information suitable for narrowing down to a specific file.