JP2015108982A

JP2015108982A - Information processing apparatus and program

Info

Publication number: JP2015108982A
Application number: JP2013251668A
Authority: JP
Inventors: Yutaka Komatsu (小松 裕)
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2013-12-05
Filing date: 2013-12-05
Publication date: 2015-06-11
Anticipated expiration: 2033-12-05
Also published as: JP6217362B2

Abstract

PROBLEM TO BE SOLVED: To facilitate acquisition of electronic information related to a character string contained in a document having a hierarchical structure, on the basis of the position of that character string in the hierarchical structure.

SOLUTION: An information processing apparatus includes: a character string acquisition part 201 for acquiring a character string contained in a component that belongs to any of index components having a first hierarchical structure; a character string position information acquisition part 204 for acquiring character string position information indicating a position of the acquired character string in the first hierarchical structure; an electronic information extraction part 202 for extracting, from among plural pieces of electronic information accommodated in any of categorical elements having a second hierarchical structure, plural pieces of electronic information whose electronic information names contain the acquired character string; and means for acquiring electronic information related to the acquired character string from among the extracted pieces of electronic information, on the basis of the character string position information and storage place information indicating a storage place of the extracted pieces of electronic information in the second hierarchical structure.

Description

The present invention relates to an information processing apparatus and a program.

Patent Document 1 discloses a technique in which each work process of a work-process manual document is associated in advance with the electronic information generated by carrying out that work process, and the corresponding electronic information is presented to the user according to the work process.

Patent Document 1: JP 2009-64347 A

One object of the present invention is to make it easy to acquire electronic information related to a character string included in a hierarchically structured document, based on the position of that character string in the hierarchical structure.

The invention according to claim 1 is an information processing apparatus comprising: character string acquisition means for acquiring a character string included in a component belonging to any of heading elements configured with a first hierarchical structure; character string position information acquisition means for acquiring character string position information indicating the position of the acquired character string in the first hierarchical structure; electronic information extraction means for extracting, from among a plurality of pieces of electronic information each stored in any of classification elements configured with a second hierarchical structure, a plurality of pieces of electronic information whose electronic information names include the acquired character string; and electronic information acquisition means for acquiring, from among the extracted pieces of electronic information, the electronic information related to the acquired character string, based on the character string position information and on storage location information indicating the storage locations of the extracted pieces of electronic information in the second hierarchical structure.

The invention according to claim 2 is the information processing apparatus according to claim 1, further comprising determination means for determining whether the storage location information and the character string position information are similar, wherein the electronic information acquisition means acquires the electronic information corresponding to the storage location information determined to be similar to the character string position information.

The invention according to claim 3 is the information processing apparatus according to claim 2, wherein the character string position information includes, in order, the heading element names of the heading elements from the layer to which the component including the acquired character string belongs up to the top layer; the storage location information includes, in order, the classification element names of the classification elements from the layer in which the electronic information is stored up to the top layer; and the determination means determines whether each of the heading element names included in the character string position information is similar to the classification element name in the same layer of the storage location information.

The invention according to claim 4 is the information processing apparatus according to claim 2, wherein the character string position information includes, in order, the heading element names of the heading elements from the layer to which the component including the acquired character string belongs up to the top layer; the storage location information includes, in order, the classification element names of the classification elements from the layer to which the electronic information belongs up to the top layer; and the determination means determines similarity based on one or more heading element names extracted from the heading element names included in the character string position information.

The invention according to claim 5 is the information processing apparatus according to any one of claims 2 to 4, wherein the character string position information includes, in order, the heading element names of the heading elements from the layer to which the component including the acquired character string belongs up to the top layer; the storage location information includes, in order, the classification element names of the classification elements from the layer to which the electronic information belongs up to the top layer; and, for each heading element name included in the character string position information, when that heading element name is not similar to the classification element name in the same layer of the storage location information, the determination means determines whether that heading element name is similar to a classification element name in a layer above that same layer.

The invention according to claim 6 is a program that causes a computer to function as: character string acquisition means for acquiring a character string included in a component belonging to any of heading elements configured with a first hierarchical structure; character string position information acquisition means for acquiring character string position information indicating the position of the acquired character string in the first hierarchical structure; electronic information extraction means for extracting, from among a plurality of pieces of electronic information each stored in any of classification elements configured with a second hierarchical structure, a plurality of pieces of electronic information whose electronic information names include the acquired character string; and electronic information acquisition means for acquiring, from among the extracted pieces of electronic information, the electronic information related to the acquired character string, based on the character string position information and on storage location information indicating the storage locations of the extracted pieces of electronic information in the second hierarchical structure.

According to the inventions of claims 1 and 6, electronic information related to a character string included in a component belonging to a hierarchically structured heading element is acquired based on the position information of that character string in the hierarchical structure.

According to the invention of claim 2, electronic information related to a character string included in a component belonging to a hierarchically structured heading element is acquired based on the similarity between the position information of the character string in its hierarchical structure and the storage location information of the electronic information in its hierarchical structure.

According to the invention of claim 3, it can be determined whether the position information of the character string in its hierarchical structure completely matches the storage location information of the electronic information in its hierarchical structure.

According to the invention of claim 4, the similarity between a part of the position information of the character string in its hierarchical structure and the storage location information of the electronic information in its hierarchical structure can be determined.

According to the invention of claim 5, the similarity can be determined while excluding low-relevance information from the storage location information of the electronic information in its hierarchical structure.

FIG. 1 is a diagram showing an example of the configuration of an electronic information management system according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of a manual document.
FIG. 3 is a diagram schematically showing an example of a directory system implemented in the electronic information management server 30.
FIG. 4 is a block diagram showing an example of the main functions executed by the information processing apparatus 10 according to the embodiment.
FIG. 5 is a diagram showing an example of the information stored in the "購買No002" directory of the directory system shown in FIG. 3.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a diagram showing an example of the configuration of an electronic information management system according to an embodiment of the present invention. As shown in FIG. 1, the electronic information management system includes an information processing apparatus 10, a manual management server 20, and an electronic information management server 30. The information processing apparatus 10, the manual management server 20, and the electronic information management server 30 are connected to communication means such as a LAN or the Internet and can communicate with one another.

The information processing apparatus 10 includes a control unit, which is a program-controlled device such as a CPU that operates according to a program installed in the information processing apparatus 10; a storage unit, which consists of storage elements such as ROM and RAM, a hard disk drive, and the like; a communication unit, which is a communication interface such as a network board; and a user interface unit, such as a mouse and a display. These elements are connected via a bus. The storage unit of the information processing apparatus 10 stores the program executed by the control unit of the information processing apparatus 10. The storage unit also serves as the work memory of the information processing apparatus 10.

The manual management server 20 manages manual documents that describe work procedures and the like. A manual document has a hierarchical structure (the first hierarchical structure) built from heading elements such as chapters, sections, and subsections. Each heading element corresponds to a node in the hierarchical structure, the character string of the heading element corresponds to the node name, and components describing work content belong to each node. A component may be, for example, a sentence, a figure, or a table.

FIG. 2 shows an example of a manual document. As shown in FIG. 2, in the hierarchical structure of the purchasing manual, chapter-level headings such as "第１章見積" (Chapter 1: Quotation) and "第２章購入" (Chapter 2: Purchase) are in the first layer; section-level headings such as "１．１基本事項" (1.1 Basic matters) and "１．２相見積先の選定" (1.2 Selection of competitive quotation suppliers) are in the second layer; and subsection-level headings such as "１．４．１回答書チェック" (1.4.1 Response form check) are in the third layer. The heading strings "見積" (quotation), "回答" (response), and "回答書チェック" (response form check) each serve as node names. In this case, for example, for the term "見積書" (written quotation) in the work description "見積書を見積先に送付し・・・" (send the written quotation to the supplier ...), its position in the hierarchical structure is expressed by concatenating the node names from the top layer (the chapter heading) downward, separated by the delimiter "/", as "見積/見積/".
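The construction of the character string position information can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Heading` class, the `find_position_info` function, and the small `manual` excerpt (including a section named "見積" under the chapter "見積") are hypothetical and only modeled on the FIG. 2 description.

```python
from dataclasses import dataclass, field


@dataclass
class Heading:
    """A heading element (node) in the manual's hierarchy (hypothetical model)."""
    name: str                                       # node name, e.g. "見積"
    components: list = field(default_factory=list)  # sentences under this heading
    children: list = field(default_factory=list)    # lower-layer headings


def find_position_info(headings, text, path=()):
    """Return the '/'-terminated chain of node names, top layer first, for the
    first component that contains `text`; return None if nothing matches."""
    for heading in headings:
        here = path + (heading.name,)
        if any(text in component for component in heading.components):
            return "".join(name + "/" for name in here)
        found = find_position_info(heading.children, text, here)
        if found is not None:
            return found
    return None


# Assumed excerpt of the purchasing manual; only the node names are modeled.
manual = [
    Heading("見積", children=[
        Heading("基本事項"),
        Heading("見積", components=["見積書を見積先に送付し"]),
    ]),
    Heading("購入"),
]

print(find_position_info(manual, "見積書"))  # 見積/見積/
```

The search is depth-first and returns the first match; a real document could contain the same string under several headings, which this sketch does not address.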

The electronic information management server 30 includes a control unit, which is a program-controlled device such as a CPU that operates according to a program installed in the electronic information management server 30; a storage unit, which consists of storage elements such as ROM and RAM, a hard disk drive, and the like; and a communication unit, which is a communication interface such as a network board. These elements are connected via a bus. The storage unit of the electronic information management server 30 stores the program executed by the control unit of the electronic information management server 30. The storage unit also serves as the work memory of the electronic information management server 30.

The electronic information management server 30 uses a directory system to manage the deliverables produced by working according to the manual documents. A deliverable is electronic information generated according to the work content and may be, for example, document information, image information, video information, or audio information. The storage unit of the electronic information management server 30 stores tree structure data representing a tree structure (the second hierarchical structure) made up of the directories that exist in the directory system. The tree structure data includes the directory name of each directory, and each directory name is associated with the directory name of its parent directory. FIG. 3 schematically shows an example of the directory system implemented in the electronic information management server 30. As shown in FIG. 3, the root directory "購買申請書フォルダ" (purchase application folder) is in the fifth layer; "購買No0001" (purchase No0001) is in the fourth layer; "見積" (quotation) and "購入依頼" (purchase request) are in the third layer; "相見積" (competitive quotation), "回答書" (response form), "見積先決定" (supplier selection), "テンプレート" (template), and "（旧）ドラフト版" ((old) draft version) are in the second layer; and "チェックシート" (check sheet) is in the first layer. Users are free to create and delete directories.

The storage unit of the electronic information management server 30 also stores the pieces of electronic information that exist in the directory system. In FIG. 3, for example, "見積書.doc" (written quotation) is a piece of electronic information. Each piece of electronic information is stored in one of the directories, and its electronic information name is stored in the storage unit in association with the directory name of the directory that holds it. For example, according to FIG. 3, the electronic information "Ａ社見積書.doc" (Company A written quotation) is stored in the first-layer directory "相見積" (competitive quotation). The storage location of "Ａ社見積書.doc" in the directory "相見積" is expressed by concatenating the directory names from the top-layer directory downward, separated by a delimiter, as "/購買申請書フォルダ/購買No0001/見積/相見積/". The user selects the directory in which each piece of electronic information is stored.

The hierarchical structure and directory names of the directories that store electronic information, that is, the deliverables created by following the work procedures of a manual document, can be presumed to resemble the hierarchical structure and heading element names of that manual document. In this embodiment, therefore, the electronic information corresponding to a character string written in the manual document is identified from the similarity between the hierarchical structure and directory names of the directories storing the electronic information and the hierarchical structure and heading element names of the manual document.

FIG. 4 is a block diagram showing an example of the main functions executed by the information processing apparatus 10 according to this embodiment. As shown in FIG. 4, the information processing apparatus 10 functionally includes a manual document acquisition unit 200, a character string acquisition unit 201, an electronic information extraction unit 202, a storage location information acquisition unit 203, a character string position information acquisition unit 204, a similarity determination unit 205, and a display unit 206. Functions other than those shown in FIG. 4 may also be realized in the information processing apparatus 10. These functions are realized by the control unit executing the program stored in the storage unit. The program is supplied to the information processing apparatus 10 via a computer-readable information storage medium such as an optical disk, a magnetic disk, a magnetic tape, a magneto-optical disk, or a flash memory, or via communication means such as the Internet.

The manual document acquisition unit 200 acquires a manual document from the storage unit of the manual management server 20.

The character string acquisition unit 201 acquires a character string designated by the user from the text of the manual document acquired by the manual document acquisition unit 200. For example, the character string acquisition unit 201 may acquire an arbitrary character string that the user designates within the text of the manual document, or a character string selected from predefined character strings.

The electronic information extraction unit 202 extracts, from the pieces of electronic information stored in the storage unit of the electronic information management server 30, the electronic information whose electronic information name includes the character string acquired by the character string acquisition unit 201. Specifically, when the character string acquisition unit 201 acquires "見積書" (written quotation) from the manual document shown in FIG. 2, four pieces of electronic information whose names include "見積書" are extracted from the directory system shown in FIG. 3: "Ａ社見積書.doc", "Ｂ社見積書.doc", "見積書.doc" (stored in the directory "テンプレート" (template)), and "見積書.doc" (stored in the directory "（旧）ドラフト版" ((old) draft version)). Here the extraction is performed over the electronic information stored in the "購買No0001" directory, but it may instead be performed over all electronic information stored in the "購買申請書" (purchase application) directory, or over any selected directory.

The storage location information acquisition unit 203 acquires storage location information indicating where each piece of electronic information extracted by the electronic information extraction unit 202 is stored in the storage unit of the electronic information management server 30. Specifically, for example, the storage location information acquisition unit 203 acquires "/購買申請書フォルダ/購買No0001/見積/相見積/" as the storage location information of "Ａ社見積書.doc"; "/購買申請書フォルダ/購買No0001/見積/相見積/" as that of "Ｂ社見積書.doc"; "/購買申請書フォルダ/購買No0001/見積/テンプレート/" as that of "見積書.doc" (stored in the directory "テンプレート"); and "/購買申請書フォルダ/購買No0001/見積/（旧）ドラフト版/" as that of "見積書.doc" (stored in the directory "（旧）ドラフト版").
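The extraction by name and the acquisition of storage locations can be sketched together as a walk over a directory tree. The nested-dict model `tree`, the `extract` function, and the restriction to the directories named in the text are assumptions made for illustration; the patent does not specify a data structure.

```python
# Hypothetical in-memory model of part of the FIG. 3 directory system.
# Each directory maps to {"files": [names], "dirs": {sub-directories}}.
tree = {
    "購買申請書フォルダ": {"files": [], "dirs": {
        "購買No0001": {"files": [], "dirs": {
            "見積": {"files": [], "dirs": {
                "相見積": {"files": ["Ａ社見積書.doc", "Ｂ社見積書.doc"], "dirs": {}},
                "テンプレート": {"files": ["見積書.doc"], "dirs": {}},
                "（旧）ドラフト版": {"files": ["見積書.doc"], "dirs": {}},
            }},
            "購入依頼": {"files": [], "dirs": {}},
        }},
    }},
}


def extract(dirs, text, location="/"):
    """Yield (file name, storage location) for every file whose name
    contains `text`, walking the tree from the top-layer directory down."""
    for name, node in dirs.items():
        here = f"{location}{name}/"
        for file_name in node["files"]:
            if text in file_name:
                yield file_name, here
        yield from extract(node["dirs"], text, here)


hits = list(extract(tree, "見積書"))
for file_name, location in hits:
    print(file_name, location)
```

Returning the name together with its location keeps the two files that share the name "見積書.doc" distinguishable, which a name-keyed dictionary would not.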

The character string position information acquisition unit 204 acquires, from the manual management server 20, character string position information indicating where in the manual document the character string acquired by the character string acquisition unit 201 appears. Specifically, for example, the character string position information acquisition unit 204 acquires "見積/見積/" as the character string position information of "見積書".

The similarity determination unit 205 determines the similarity between the character string position information acquired by the character string position information acquisition unit 204 and the storage location information of each piece of electronic information acquired by the storage location information acquisition unit 203. The determination rests on the presumption that the electronic information corresponding to a character string acquired from the manual document is stored in a directory whose hierarchical structure resembles that of the manual document. That is, the unit determines whether the heading element names included in the character string position information, and their order, are similar to the classification element names included in the storage location information, and their order, and from this determines the similarity between the character string position information and the storage location information. This makes it possible to identify, among the pieces of electronic information whose names include the acquired character string, the electronic information that is highly related (highly similar) to that character string.

First, a first method for determining the similarity is described. In the first method, each heading element name included in the character string position information is compared with the classification element name in the same layer of the storage location information of the electronic information under evaluation, and it is determined whether they are similar. Specifically, for example, a phrase similarity S_i (0 ≤ S_i ≤ 100) between the heading element name and the classification element name is calculated for each layer i (i ≥ 1), and the average of the calculated values S_i is taken as the similarity Sv (0 ≤ Sv ≤ 100). When Sv is 100, the heading element names in all layers of the character string position information exactly match the classification element names in the same layers. The similarity between the storage location information and the character string position information may be judged to be high when Sv is at or above a predetermined value (for example, 80 or more). This makes it possible to identify storage location information with high similarity, in which both the one or more heading element names included in the character string position information and their layer order match.
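The averaging in the first method can be restated as a formula. The symbol n, denoting the number of layers in the character string position information, is introduced here for illustration and does not appear in the original text.

```latex
S_v = \frac{1}{n}\sum_{i=1}^{n} S_i,
\qquad 0 \le S_i \le 100,
\qquad 0 \le S_v \le 100
```

Because every S_i is at most 100, S_v reaches 100 only when S_i = 100 in every layer, which is the exact-match case described above.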

An existing technique may be used to determine the similarity between the phrases of a heading element name and a classification element name (the phrase similarity S). For example, morphological analysis is performed on each phrase, each phrase is divided into words based on the result, and part-of-speech information such as noun, verb, or function word is assigned to each word. Consecutive function words may be treated as a single word. The words or clauses (a content word such as a noun or verb together with the function words that follow it) making up each phrase are then compared, and if the phrases differ by no more than a predetermined number of words or clauses, the phrases are judged to be similar. The phrase similarity S is set according to the ratio of the number of words or clauses detected as differences between the phrases to the total number of words or clauses making up the phrases. A phrase similarity S obtained by any other method of judging the similarity between phrases may also be used.
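One possible reading of the word-difference approach is sketched below. It assumes the phrases have already been split into words by a morphological analyzer (not shown), and the scoring formula, the function names, and the `max_diff` default are all assumptions, since the text leaves the exact mapping from the difference ratio to S open.

```python
from collections import Counter


def differing_words(words_a, words_b):
    """Count words that appear in one phrase but not the other (as multisets)."""
    a, b = Counter(words_a), Counter(words_b)
    return sum(((a - b) + (b - a)).values())


def is_similar(words_a, words_b, max_diff=1):
    """Judge two phrases similar if they differ by at most `max_diff` words."""
    return differing_words(words_a, words_b) <= max_diff


def phrase_similarity(words_a, words_b):
    """Assumed scoring: 100 minus the percentage of differing words among
    all words of both phrases."""
    total = len(words_a) + len(words_b)
    if total == 0:
        return 100.0
    return 100.0 * (1 - differing_words(words_a, words_b) / total)


# The segmentation of "相見積" into ["相", "見積"] is a hypothetical analyzer output.
print(is_similar(["相", "見積"], ["見積"]))    # True
print(phrase_similarity(["見積"], ["見積"]))  # 100.0
```

Under this particular scoring, a one-word difference between short phrases lowers the score considerably, so a practical system would need to calibrate the formula or the threshold to its vocabulary.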

A specific example of the first method is given with reference to FIG. 3. Consider determining the similarity between the character string position information "見積/見積/" (estimate/estimate/) of the character string "見積書" (written estimate), acquired by the character string position information acquisition unit 204, and the storage location information "購買申請書フォルダ/購買No0001/見積/相見積" (purchase application folder/purchase No0001/estimate/competitive estimate) of "Ａ社見積書.doc" (Company A written estimate), acquired by the storage location information acquisition unit 203. First, the phrase similarity S_1 between the first-level heading element name "見積" of the character string position information and the first-level classification element name "相見積" of the storage location information is calculated. Because the classification element name "相見積" differs from the heading element name "見積" only by the character "相", the phrase similarity S_1 is judged to be high (for example, S_1 > 80). Next, the phrase similarity S_2 between the second-level heading element name "見積" of the character string position information and the second-level classification element name "見積" of the storage location information is calculated. Because these two names match exactly, the phrase similarity S_2 is judged to be 100. The average of S_1 and S_2 is the similarity Sv (here Sv > 90, so the similarity Sv is judged to be high). When the similarity to the character string position information "見積/見積/" is determined in the same way for the other electronic information extracted by the electronic information extraction unit 202, namely "Ｂ社見積書.doc", "見積書.doc" (stored in the directory "テンプレート" (template)), and "見積書.doc" (stored in the directory "（旧）ドラフト版" ((old) draft version)), the storage location information judged to have a high similarity Sv is that of "Ａ社見積書.doc" and "Ｂ社見積書.doc".
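In the example above, level 1 is the element nearest the file ("相見積"), and levels are counted upward toward the root. The sketch below turns a path-like string into a list ordered that way. It assumes that both kinds of information are written top level first and separated by "/", and that a trailing separator can be ignored. The function name is hypothetical.

```python
from typing import List


def levels_from_path(path: str, sep: str = "/") -> List[str]:
    """Split a path-like string and order its elements from level 1 upward.

    Assumption: the string lists the top level first, so the last
    non-empty element is level 1.
    """
    parts = [part for part in path.split(sep) if part]
    return parts[::-1]


# Level 1 first: ["相見積", "見積", "購買No0001", "購買申請書フォルダ"]
categories = levels_from_path("購買申請書フォルダ/購買No0001/見積/相見積")
# The trailing "/" is ignored: ["見積", "見積"]
headings = levels_from_path("見積/見積/")
```

With both lists in this order, index 0 of each list holds the level-1 names that are compared to obtain S_1, index 1 holds the names compared for S_2, and so on.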

Next, a second method for determining similarity will be described. The second method yields a less certain similarity than the first method described above, but it can identify highly similar storage location information over a wider range.

In the second method, when a heading element name included in the character string position information is judged to have low similarity to the classification element name at the same level in the storage location information, its similarity to the classification element names at levels above that level is determined. When the classification elements and classification element names included in the storage location information can be set freely by the user, the storage location information may include a classification element name that is entirely unrelated to the heading element names. The second method makes it possible to leave the phrase similarity S_i with such an unrelated classification element name out of the determination. Specifically, for example, when the phrase similarity S_1 between the first-level heading element name and the first-level classification element name is judged to be low, the phrase similarity between the first-level heading element name and the classification element name one level up, at the second level, is determined. This processing is repeated until the phrase similarity between the first-level heading element name and the classification element name of some level is judged to be high, or until the similarity to the top-level classification element name has been determined. When the phrase similarity to the classification element name of some level is judged to be high, that phrase similarity is used as the value of S_1. When none of the classification element names from the lowest level to the top level is judged to have a high phrase similarity, the phrase similarity S_1 between the first-level heading element name and the first-level classification element name is used as is. The same processing is performed for the heading element names of all levels, and the average of the resulting phrase similarities S_i is taken as the similarity Sv. As a result, even when the storage location information includes a classification element name unrelated to the heading element names, the unrelated classification element name can be skipped when determining the similarity Sv between the character string position information and the storage location information. The similarity Sv may also be calculated by weighting the number of classification elements skipped by the second method and the phrase similarities S_i.
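The upward search of the second method can be sketched as follows. This is an illustration under stated assumptions rather than the embodiment itself. "Low" is taken to mean below the same threshold that marks "high" (80). A heading level with no classification level scores 0. The `toy_similarity` function, with its fixed value of 85 for containment, is a made-up stand-in for the phrase similarity. Both sequences are ordered from level 1 upward.

```python
from typing import Callable, Sequence

PhraseSimilarity = Callable[[str, str], float]


def toy_similarity(a: str, b: str) -> float:
    """Stand-in phrase similarity (assumption): 100 if equal, 85 if one
    name contains the other, otherwise 0."""
    if a == b:
        return 100.0
    if a in b or b in a:
        return 85.0
    return 0.0


def second_method_similarity(
    headings: Sequence[str],
    categories: Sequence[str],
    phrase_similarity: PhraseSimilarity = toy_similarity,
    high: float = 80.0,
) -> float:
    """Return Sv, skipping unrelated classification levels where possible."""
    if not headings:
        raise ValueError("at least one heading element name is required")
    scores = []
    for level, heading in enumerate(headings):
        if level >= len(categories):
            scores.append(0.0)  # assumption: missing level scores 0
            continue
        score = phrase_similarity(heading, categories[level])
        if score < high:
            # Walk upward until a highly similar level is found or the
            # top level has been compared.
            for upper in categories[level + 1:]:
                candidate = phrase_similarity(heading, upper)
                if candidate >= high:
                    score = candidate
                    break
            # If nothing was found, the same-level score is kept as is.
        scores.append(score)
    return sum(scores) / len(scores)
```

With headings `["見積", "見積"]` and classification levels `["見積", "Ｘ社", "相見積"]`, the second heading skips "Ｘ社" and is scored against "相見積".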

A specific example of the second method is given with reference to FIG. 5. FIG. 5 shows an example of the information stored in the "購買No0002" (purchase No0002) directory of the directory system shown in FIG. 3. First, when the character string acquisition unit 201 acquires "見積書" (written estimate) from the manual document shown in FIG. 2, the electronic information extraction unit 202 extracts, from the electronic information stored in the "購買No0002" directory shown in FIG. 5, the electronic information "見積書.doc", whose electronic information name includes "見積書". Consider determining the similarity between the storage location information "購買申請書フォルダ/購買No0002/相見積/Ｘ社/見積" (purchase application folder/purchase No0002/competitive estimate/Company X/estimate) of the extracted electronic information "見積書.doc" and the character string position information "見積/見積/" of "見積書" acquired by the character string position information acquisition unit 204. First, the phrase similarity S_1 between the first-level heading element name "見積" of the character string position information and the first-level classification element name "見積" of the storage location information is calculated. The two names match exactly, so S_1 is 100. Next, the phrase similarity S_2 between the second-level heading element name "見積" of the character string position information and the second-level classification element name "Ｘ社" of the storage location information is calculated. The names have nothing in common, so S_2 is 0. As it stands, the similarity Sv, which is the average of S_1 (= 100) and S_2 (= 0), would be judged to be low. With the second method, because S_2 is judged to be low, the phrase similarity between the second-level heading element name "見積" and the classification element name one level up, the third-level "相見積", is calculated. Because "相見積" and "見積" differ only by the character "相", the phrase similarity is judged to be high (for example, > 80), and this phrase similarity becomes the value of S_2. The similarity Sv, which is the average of S_1 (= 100) and S_2 (> 80), is then judged to be high. In this way, even when the storage location includes a directory that was created for the user's convenience and is not directly related to "見積書", such as the "Ｘ社" directory shown in FIG. 5, skipping that directory allows the similarity to be determined over a wide range.
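The arithmetic of this example can be written out as follows, where S_v denotes the similarity Sv and 80 is the example threshold used in the text. The first line compares same-level names only. The second line uses the value obtained after skipping the unrelated second-level directory.

```latex
\begin{align*}
\text{same-level comparison only:}\quad
  & S_v = \frac{S_1 + S_2}{2} = \frac{100 + 0}{2} = 50 < 80 \\
\text{after skipping the unrelated level:}\quad
  & S_v = \frac{S_1 + S_2}{2} > \frac{100 + 80}{2} = 90
\end{align*}
```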

As a third method for determining similarity, one or more of the heading elements included in the character string position information may be extracted and used as the target of the similarity determination. Specifically, for example, one or more heading elements are extracted in order from the top level from among the plurality of heading elements included in the character string position information, and the extracted elements are used as the character string position information. For example, when the number of heading elements included in the character string position information is greater than the number of classification elements included in the storage location information, the same number of heading elements as classification elements may be extracted. Because the upper-level heading elements of the character string position information are major headings such as chapters and sections, their names are likely to be used as classification element names, and extracting the upper-level heading elements can be expected to make the similarity determination easier. Similarly, one or more classification elements may be extracted from among the plurality of classification elements included in the storage location information and used as the target of the similarity determination.
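The extraction step of the third method can be sketched as below. The function names are hypothetical, and the heading list is assumed to be ordered from level 1 (the lowest level) upward, so the top-level headings sit at the end of the list.

```python
from typing import List, Sequence


def top_headings(headings: Sequence[str], count: int) -> List[str]:
    """Keep the `count` heading element names nearest the top level."""
    if count <= 0:
        return []
    return list(headings[-count:])


def trim_to_categories(headings: Sequence[str], categories: Sequence[str]) -> List[str]:
    """When there are more headings than classification elements, keep as
    many top-level headings as there are classification elements."""
    if len(headings) > len(categories):
        return top_headings(headings, len(categories))
    return list(headings)
```

The trimmed list then takes the place of the character string position information in the similarity determination.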

The similarity determination unit 205 may use any one of the first, second, and third methods described above, or may use them in combination. Storage location information judged to be highly similar by the first method and storage location information judged to be highly similar by the second or third method may also be distinguished from each other according to the difference in the accuracy of the similarity calculation.

The display unit 206 acquires, from the electronic information management server 30, the electronic information corresponding to the storage location information for which the similarity determination unit 205 has judged the similarity Sv to be high, and displays it on the user interface unit. At this time, electronic information corresponding to storage location information judged by the first method to have a similarity Sv of 100 is displayed differently from electronic information corresponding to storage location information judged by the other methods to have a high similarity Sv. Electronic information corresponding to storage location information judged by the second or third method to have a high similarity Sv may also be displayed with an indication that the accuracy of the similarity calculation is low.
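The display distinction can be sketched as a simple labeling step. The `Match` record, its field names, and the label strings are all hypothetical. The text only requires that the cases be displayed differently and does not say how.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    name: str    # electronic information name, e.g. a file name
    sv: float    # similarity Sv, already judged to be high
    method: int  # 1, 2, or 3: the method that produced Sv


def display_label(match: Match) -> str:
    """Choose a display label for a highly similar match."""
    if match.method == 1 and match.sv == 100:
        return "exact"                  # first method, Sv = 100
    if match.method in (2, 3):
        return "high (lower accuracy)"  # second or third method
    return "high"                       # first method, high Sv below 100
```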

The present invention is not limited to the embodiment described above.

For example, the functions executed by the information processing apparatus 10 in the embodiment described above may instead be executed by the manual management server 20 or the electronic information management server 30. In addition, although an example has been shown in which the information processing apparatus 10, the manual management server 20, and the electronic information management server 30 are separate apparatuses, they may be a single integrated apparatus.

DESCRIPTION OF SYMBOLS 10 information processing apparatus, 20 manual document management server, 30 electronic information management server, 200 manual document acquisition unit, 201 character string acquisition unit, 202 electronic information extraction unit, 203 storage location information acquisition unit, 204 character string position information acquisition unit, 205 similarity determination unit, 206 display unit.

Claims

An information processing apparatus comprising:
character string acquisition means for acquiring a character string included in a constituent element belonging to any of heading elements configured with a first hierarchical structure;
character string position information acquisition means for acquiring character string position information indicating the position of the acquired character string in the first hierarchical structure;
electronic information extraction means for extracting, from among a plurality of pieces of electronic information each stored in any of classification elements configured with a second hierarchical structure, a plurality of pieces of electronic information whose electronic information names include the acquired character string; and
electronic information acquisition means for acquiring, from among the extracted plurality of pieces of electronic information, the electronic information related to the acquired character string, based on storage location information indicating the storage locations of the extracted plurality of pieces of electronic information in the second hierarchical structure and on the character string position information.

The information processing apparatus according to claim 1, further comprising determination means for determining whether the storage location information and the character string position information are similar,
wherein the electronic information acquisition means acquires the electronic information corresponding to the storage location information determined to be similar to the character string position information.

The information processing apparatus according to claim 2,
wherein the character string position information includes, in order, the heading element names of the heading elements from the level to which the constituent element including the acquired character string belongs up to the top level, and the storage location information includes, in order, the classification element names of the classification elements from the level at which the electronic information is stored up to the top level, and
the determination means determines whether each of the heading element names included in the character string position information is similar to the classification element name at the same level in the storage location information.

The information processing apparatus according to claim 2,
wherein the character string position information includes, in order, the heading element names of the heading elements from the level to which the constituent element including the acquired character string belongs up to the top level, and the storage location information includes, in order, the classification element names of the classification elements from the level to which the electronic information belongs up to the top level, and
the determination means determines whether the two are similar based on one or more heading element names extracted from the heading element names included in the character string position information.

The information processing apparatus according to any one of claims 2 to 4,
wherein the character string position information includes, in order, the heading element names of the heading elements from the level to which the constituent element including the acquired character string belongs up to the top level, and the storage location information includes, in order, the classification element names of the classification elements from the level to which the electronic information belongs up to the top level, and
the determination means determines, for each of the heading element names included in the character string position information, when that heading element name is not similar to the classification element name at the same level in the storage location information, whether that heading element name is similar to a classification element name at a level above the same level.

A program for causing a computer to function as:
character string acquisition means for acquiring a character string included in a constituent element belonging to any of heading elements configured with a first hierarchical structure;
character string position information acquisition means for acquiring character string position information indicating the position of the acquired character string in the first hierarchical structure;
electronic information extraction means for extracting, from among a plurality of pieces of electronic information each stored in any of classification elements configured with a second hierarchical structure, a plurality of pieces of electronic information whose electronic information names include the acquired character string; and
electronic information acquisition means for acquiring, from among the extracted plurality of pieces of electronic information, the electronic information related to the acquired character string, based on storage location information indicating the storage locations of the extracted plurality of pieces of electronic information in the second hierarchical structure and on the character string position information.