JP4936946B2

JP4936946B2 - Data processing apparatus, data processing method, and program

Info

Publication number: JP4936946B2
Application number: JP2007078062A
Authority: JP
Inventors: 丈志竹内; 守加藤; 光則郡; 隆顕中村
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2007-03-26
Filing date: 2007-03-26
Publication date: 2012-05-23
Anticipated expiration: 2027-03-26
Also published as: JP2008242539A

Description

The present invention relates to techniques for storing data and for extracting stored data, and in particular to techniques for storing e-mail and for extracting stored e-mail.

As computers and the Internet have become widespread and are used to make work more efficient, e-mail has become an indispensable business tool.
While e-mail is used to convey information in every aspect of business, it is also frequently involved in the transmission of illicit information, such as fraudulent accounting data or personal information, and in information leakage incidents.
Against this background, there is a growing movement to have every computer that relays e-mail save all of the e-mail it relays, so that the saved e-mail can help in investigating causes and preserving evidence when an information leakage incident or the like occurs. Moves to make this a legal requirement are also spreading.

E-mail has the following characteristics.
An e-mail consists of a header, a body, and attached files. An e-mail consisting only of a header and a body is usually small, on the order of several kilobytes, whereas an e-mail consisting of a header, a body, and attached files is often large, ranging from several hundred kilobytes to several tens of megabytes.

In a conventional e-mail archive system, link information indicating how the pieces of data are connected is added to such a large e-mail, and the entire e-mail is displayed by referring to this link information (see, for example, Patent Document 1).
Patent Document 1: JP-A-11-46214 (page 2)

This e-mail archive system displays the entire e-mail by referring to the link information that indicates how the individual e-mails are connected.

However, this method requires a process of referring to the link information of an e-mail in order to acquire the e-mail that continues it, so acquiring the entire e-mail takes time.
One of the main objects of the present invention is to solve this problem: when a large e-mail is stored, it is stored in such a way that the entire e-mail can be acquired quickly at the extraction stage.

Although this specification describes e-mail as an example of input data, the present invention is not limited to e-mail and is also applicable to data in general.

A data processing apparatus according to the present invention comprises:
a database management unit that, when the data size of input data is less than or equal to a predetermined upper limit data size, outputs the input data to a database system without dividing it, and that, when the data size of the input data exceeds the upper limit data size, divides the input data into a plurality of pieces of divided data so that the data size of each piece is less than or equal to the upper limit data size, adds to each piece of divided data the number of divisions, which is the number of pieces of divided data, and a division number, which is a serial number among the pieces of divided data, and outputs the pieces to the database system; and
a database system that sets a serial data management number for each piece of data input from the database management unit and stores each piece of data in data management number order, and that, when a plurality of pieces of divided data divided by the database management unit are input, sets a data management number for each piece of divided data in division number order and stores each piece of divided data, together with the number of divisions and the division number, in data management number order.

According to the present invention, data that exceeds the upper limit data size is divided into pieces of divided data, each no larger than the upper limit data size, and the resulting pieces are stored in the database system consecutively, in division number order, together with the number of divisions and the division number. Consequently, the pieces of divided data that make up one item of data can be extracted together in a single operation, and even data that has been divided into a plurality of pieces can be acquired at high speed.

Embodiment 1.
An example of the basic configuration of the e-mail archive system according to Embodiment 1 of the present invention will be described with reference to FIG. 1.
In FIG. 1, an e-mail archive system 101 (data processing apparatus) is a data archive system to which a search system has been added. For e-mail acquired from a mail server 106 by an e-mail receiving unit 105, it provides accumulation processing, search processing, and extraction processing to a user terminal 100 used by a user (such as a system administrator).
The e-mail archive system 101 comprises a database management unit 102, a database system 103, and a search system 104.
In this specification, a “mail message” denotes a single unit of data in the form in which an e-mail is stored in the e-mail archive system. Mail messages are described in detail later.

The operation of the e-mail archive system 101 is described in detail later; it is first outlined here.
When the data size of an e-mail input from the e-mail receiving unit 105 is less than or equal to a predetermined upper limit data size, the database management unit 102 outputs the e-mail to the database system 103 without dividing it. When the data size of the input e-mail exceeds the upper limit data size, the database management unit 102 divides the e-mail into a plurality of pieces of divided data so that the data size of each piece is less than or equal to the upper limit data size. A mail message corresponds either to the entire e-mail, when the e-mail is no larger than the upper limit data size, or to one piece of divided data, when the e-mail exceeds the upper limit data size.
When an e-mail has been divided, the database management unit 102 adds to each mail message the number of divisions, which is the number of mail messages, and a division number, which is a serial number among the mail messages, and outputs the mail messages to the database system 103.
The upper limit data size is the largest data size that the database system 103 can manage as one record.

The database system 103 sets a serial record number (data management number) for each mail message input from the database management unit 102 and stores the mail messages in record number order. When a plurality of mail messages (divided data) divided by the database management unit 102 are input, the database system 103 sets a record number for each mail message in division number order and stores each mail message, together with the number of divisions and the division number, in record number order.
In other words, for a plurality of mail messages divided from one e-mail, the database system 103 stores the mail messages, together with the number of divisions and the division numbers, in consecutive records in division number order. As described later, extracting the mail messages of several consecutive records at once therefore makes it possible to extract all of the mail messages that make up one e-mail in a single operation, so that even an e-mail divided into a plurality of mail messages can be acquired at high speed.

The database management unit 102 also receives from the database system 103 the record numbers set by the database system 103. It requests the database system 103 to extract a specific mail message (specific data) for which a specific record number is set, together with the N-1 (N ≥ 1) subsequent mail messages (subsequent data) that follow the specific mail message in record number order, and receives N mail messages from the database system 103.
The database management unit 102 then determines whether a division number is added to the specific mail message received from the database system 103. If so, it compares the number of divisions added to the specific mail message with the total number of mail messages received (the specific mail message plus the subsequent mail messages) and determines whether as many mail messages as the number of divisions have already been received from the database system 103.
If they have, the database management unit 102 combines the mail messages corresponding to the number of divisions to obtain the e-mail as it was before division.
If the comparison shows that fewer mail messages than the number of divisions have been received, the database management unit 102 requests the database system 103 to extract as many further subsequent mail messages as are missing, receives them from the database system 103, and combines the mail messages corresponding to the number of divisions to obtain the e-mail as it was before division.

Next, the data storage method used in the database system 103 according to the present embodiment will be described.
The database system 103 according to the present embodiment uses a data storage method with the following features (see FIGS. 18 to 23).

(1) Saving data
When new data is saved in the database system 103 in which N-1 items of data are already saved, as shown in FIG. 18, record number #N, which follows consecutively from the immediately preceding record number #N-1, is assigned to the data to be saved, and the data is saved on disk in chronological order, as shown in FIG. 20.
The inside of the database system 103 is partitioned into management blocks, each consisting of a plurality of records. In the example of FIG. 18, the data is delimited into chronological units called ranges 1801, and the data is managed by storing the lower and upper limits of the record numbers 1802 of the data belonging to each range 1801 in a range management table 1901 such as that shown in FIG. 19.
When new data is saved in the database system 103 as described above, the upper limit of the record numbers 1802 of the current range in the range management table 1901 is updated from N-1 to N, as shown in FIG. 21.
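
The save behavior in (1) can be illustrated with a minimal Python sketch. It is not taken from the patent: the class names `RangeTable` and `RecordStore`, the in-memory list standing in for the disk, and the convention that an empty range is written as `[lower, lower - 1]` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RangeTable:
    """Hypothetical stand-in for range management table 1901."""
    # Each entry is [lower, upper] record number; the last entry is the
    # current range. [1, 0] denotes an empty first range (an assumption).
    bounds: list = field(default_factory=lambda: [[1, 0]])


@dataclass
class RecordStore:
    """Toy append-only store: records are kept in record-number order."""
    records: list = field(default_factory=list)
    table: RangeTable = field(default_factory=RangeTable)

    def save(self, data: bytes) -> int:
        # The new record number follows consecutively from the last one.
        record_number = len(self.records) + 1
        self.records.append(data)
        # Only the upper limit of the current range changes (N-1 -> N).
        self.table.bounds[-1][1] = record_number
        return record_number


store = RecordStore()
for payload in (b"a", b"b", b"c"):
    store.save(payload)
print(store.table.bounds)  # [[1, 3]]
```

In this sketch a save touches only the tail of the record list and one table entry, which mirrors the append-in-chronological-order behavior described above.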

(2) Extracting data
To extract data, the range management table 1901 is consulted using the record number 1802 of the data to be extracted as a key, thereby identifying the range 1801 to which that data belongs.
Once the range 1801 has been identified, all of the data belonging to that range 1801 is read out and searched for the data to be extracted.
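
The two-stage lookup in (2) might look like the following Python sketch. The table contents, the function names `find_range` and `extract`, and the use of binary search over the lower limits are assumptions; the text only states that the table is consulted with the record number as a key and that the whole range is then read.

```python
from bisect import bisect_right

# Hypothetical range table: (lower, upper) record numbers per range, in order.
RANGES = [(1, 100), (101, 200), (201, 260)]


def find_range(record_number: int) -> int:
    """Return the index of the range that contains record_number."""
    lowers = [lower for lower, _ in RANGES]
    index = bisect_right(lowers, record_number) - 1
    if index < 0 or record_number > RANGES[index][1]:
        raise KeyError(record_number)
    return index


def extract(records: dict, record_number: int) -> bytes:
    """Read every record in the owning range, then search for the target."""
    lower, upper = RANGES[find_range(record_number)]
    block = [(n, records[n]) for n in range(lower, upper + 1)]  # whole range
    for number, data in block:
        if number == record_number:
            return data
    raise KeyError(record_number)


records = {n: f"mail-{n}".encode() for n in range(1, 261)}
print(extract(records, 150))  # b'mail-150'
```

Because the whole range is read in one pass, records that neighbor the target are already in memory, which is what later allows the pieces of a divided e-mail to be returned together.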

(3) Adding a range
A range 1801 can be added at any time.
When a new range 1801 is added to the database system 103 in the state shown in FIG. 20, the data of record numbers #201 to #N comes to belong to the newly added range (range #3), as shown in FIG. 22, and data saved from then on, with record numbers #N+1 and later, belongs to the current range.
In step with the addition of the range 1801, the range management table 1901 is also updated from the state shown in FIG. 21 to the state shown in FIG. 23.
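
As a rough illustration of (3), the table update on range addition could be sketched as below. FIGS. 21 and 23 are not reproduced here, so the table layout, the function name `add_range`, the example value N = 260, and the `[lower, lower - 1]` notation for an empty range are assumptions.

```python
def add_range(bounds: list, last_record: int) -> list:
    """Close the current range at last_record and open a new current range.

    bounds is a list of [lower, upper] pairs; the last pair is the current
    range. A new list is returned and the input is left unchanged.
    """
    updated = [pair[:] for pair in bounds]
    updated[-1][1] = last_record                     # e.g. range #3: 201..N
    updated.append([last_record + 1, last_record])   # new current range, empty
    return updated


table = [[1, 100], [101, 200], [201, 260]]  # N = 260 in this made-up example
table = add_range(table, 260)
print(table)  # [[1, 100], [101, 200], [201, 260], [261, 260]]
```

No stored record is moved in this sketch; only the table gains an entry, which is consistent with the statement that a range can be added at any time.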

From here on, the operation of the e-mail archive system 101 according to Embodiment 1 of the present invention is described in detail, process by process.

FIG. 2 shows an example of operation between the e-mail archive system 101 and the user terminal 100 according to the present embodiment.
Although FIG. 2 shows two user terminals 100, both are the same user terminal. The upper user terminal 100 represents the user terminal as the source of requests to the e-mail archive system 101, and the lower user terminal 100 represents the user terminal as the destination of responses from the e-mail archive system 101.

In FIG. 2, the accumulation processing is executed by the e-mail archive system 101 with an accumulation request from the user terminal 100 as input, and a completion notification is returned to the user terminal 100.
As shown in FIG. 3, the accumulation processing within the e-mail archive system 101 is executed in the following order: e-mail acquisition processing (S301), database system save processing (S302) (data processing step, data storage step), and search system registration processing (S303).

In the e-mail acquisition processing of step S301, as shown in FIG. 4, the database management unit 102, having received an accumulation request from the user terminal 100, first sends an e-mail acquisition request to the e-mail receiving unit 105. As noted above, the user of the user terminal 100 is, for example, a system administrator who stores and manages e-mail in the e-mail archive system 101.
Next, the e-mail receiving unit 105 sends an e-mail acquisition request to the mail server 106, and the mail server 106, on receiving it, returns e-mail to the e-mail receiving unit 105.
Having acquired the e-mail, the e-mail receiving unit 105 returns it to the database management unit 102. The e-mail acquisition processing of step S301 is complete when the database management unit 102 has acquired the e-mail.
Separately from the processing shown in FIG. 4, the mail server 106 also transmits e-mail to the user terminals of e-mail users (ordinary users who are not system administrators) so that they can read their e-mail.
The mail server 106 holds, for example, a copy of each e-mail that it transmits to an e-mail user, and transmits that copy to the e-mail receiving unit 105.

In the database system save processing of step S302, as shown in FIG. 5, the database management unit 102, which acquired the e-mail in step S301, first converts the e-mail into mail messages and sends them to the database system 103.
Next, the database system 103 saves the mail messages received from the database management unit 102 and returns the record number 1802 assigned to each mail message to the database management unit 102.
The database system save processing of step S302 is complete when the database management unit 102 has acquired the record numbers 1802.

In the search system registration processing of step S303, as shown in FIG. 6, the database management unit 102 first extracts search information for the e-mail from the e-mail acquired in the e-mail acquisition processing of step S301. The search information is an index that the database system 103 uses for full-text search when searching for mail messages.
Next, the database management unit 102 sends the extracted search information and the record number 1802 acquired in the database system save processing of step S302 to the search system 104.
The search system 104 associates the search information and the record number 1802 received from the database management unit 102 with each other, registers them, and returns a completion notification to the database management unit 102.
The search system registration processing of step S303 is complete when the database management unit 102 has acquired the completion notification.
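
The association between search information and record numbers can be pictured with a small inverted index. This Python sketch is only illustrative: the patent does not specify how the search system 104 is implemented, and the class name `SearchIndex`, whitespace tokenization, and all-terms matching are assumptions.

```python
from collections import defaultdict


class SearchIndex:
    """Toy stand-in for search system 104: term -> record numbers."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)

    def register(self, search_info: str, record_number: int) -> None:
        # Associate each term of the search information with the record number.
        for term in search_info.lower().split():
            self._postings[term].add(record_number)

    def search(self, query: str) -> list:
        # Return record numbers whose search information has every query term.
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(hits)


index = SearchIndex()
index.register("quarterly budget report", 17)
index.register("budget meeting", 42)
print(index.search("budget"))         # [17, 42]
print(index.search("budget report"))  # [17]
```

The search result is a list of record numbers rather than mail bodies, so the bodies stay in the database system and are fetched only when an extraction is requested.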

The database system save processing of step S302 is now described in detail.
There is an upper limit on the size of data that can be saved in the database system 103 in one save operation. In other words, there is an upper limit on the size of data that the database system 103 can manage as one record.
To save all of the data of every e-mail despite this limit, the database management unit 102 according to the present embodiment divides an e-mail on the basis of the save processing unit, which is the upper limit data size that can be saved in one save operation, as shown in FIG. 25. When saving to the database system 103, it adds to each piece a division number 2501, which is a serial number within the divided e-mail, and a number of divisions 2502, which represents the total number of pieces into which the e-mail was divided (the number of pieces of divided data).
In the example of FIG. 25, an e-mail whose data size exceeds the save processing unit is divided into three mail messages; “3” is added to each mail message as the number of divisions 2502, and the serial division numbers 2501 “1” to “3” are added to the respective mail messages.
When the data size of an e-mail is smaller than the save processing unit and the e-mail is not divided, a Null value, for example, is added to the mail message as the number of divisions and as the division number.
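
The division shown in FIG. 25 can be sketched in Python as follows. The names `MailMessage` and `to_mail_messages`, the use of `None` for the Null value, and the fixed-size slicing are assumptions for illustration; the field `division_count` corresponds to the number of divisions 2502.

```python
from typing import NamedTuple, Optional


class MailMessage(NamedTuple):
    body: bytes
    division_number: Optional[int]  # 1-based serial number, None if not split
    division_count: Optional[int]   # total number of pieces, None if not split


def to_mail_messages(email: bytes, upper_limit: int) -> list:
    """Split email into pieces no larger than upper_limit bytes."""
    if upper_limit <= 0:
        raise ValueError("upper_limit must be positive")
    if len(email) <= upper_limit:
        return [MailMessage(email, None, None)]
    pieces = [email[i:i + upper_limit] for i in range(0, len(email), upper_limit)]
    return [MailMessage(p, n, len(pieces)) for n, p in enumerate(pieces, start=1)]


messages = to_mail_messages(b"x" * 25, upper_limit=10)
print([(m.division_number, m.division_count, len(m.body)) for m in messages])
# [(1, 3, 10), (2, 3, 10), (3, 3, 5)]
```

Every piece carries the same number of divisions, so a reader holding any one piece knows how many records make up the whole e-mail.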

FIG. 7 shows the specific flow of processing.
First, in step S701, the database management unit 102 compares the data size b of the e-mail with the upper limit data size a that the database system 103 can save in one save operation.

If (upper limit size a) ≥ (e-mail size b) in step S701 (No in S701), the processing proceeds to step S702, where the database management unit 102 saves the e-mail as it is, as a single mail message, in the database system 103, and then, in step S706, acquires the record number 1802 uniquely assigned to the mail message.

If (upper limit size a) < (e-mail size b) in step S701 (Yes in S701), the processing proceeds to step S703, where the database management unit 102 divides the e-mail into mail messages in units based on the upper limit size a.
Then, in step S704, the database management unit 102 assigns to each mail message a division number 2501, which is a serial number within the e-mail before division, and the number of divisions 2502 (see FIG. 25).
Next, in step S705, the database management unit 102 saves all of these mail messages in the database system 103.
At this time, the database management unit 102 adds the division number 2501 and the number of divisions 2502 to each mail message before saving it in the database system 103.
Finally, in step S706, the database management unit 102 acquires the record numbers 1802 assigned to the mail messages by the database system 103.
The record number 1802 acquired from the database system 103 here may be the one assigned to just one of the mail messages.
For example, the record number assigned to the mail message with the smallest division number 2501 is acquired.
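
Steps S701 to S706 can be traced in a short Python sketch. The class `ToyDatabase`, the function `store_email`, and the in-memory row list are hypothetical stand-ins for the database system 103 and the database management unit 102; the sketch follows the variant in which only the record number of the first piece is kept.

```python
class ToyDatabase:
    """Hypothetical stand-in for database system 103."""

    def __init__(self) -> None:
        self.rows = []  # row i holds record number i + 1

    def save(self, body: bytes, division_number=None, division_count=None) -> int:
        self.rows.append((body, division_number, division_count))
        return len(self.rows)  # record number assigned to this row


def store_email(db: ToyDatabase, email: bytes, upper_limit: int) -> int:
    """Store one e-mail and return the record number of its first mail message."""
    if len(email) <= upper_limit:                      # S701: No
        return db.save(email)                          # S702, S706
    pieces = [email[i:i + upper_limit]                 # S703
              for i in range(0, len(email), upper_limit)]
    numbers = [db.save(piece, n, len(pieces))          # S704, S705
               for n, piece in enumerate(pieces, start=1)]
    return numbers[0]                                  # S706: smallest division number


db = ToyDatabase()
first = store_email(db, b"short", upper_limit=10)
second = store_email(db, b"y" * 25, upper_limit=10)
print(first, second, len(db.rows))  # 1 2 4
```

Because the pieces are saved back to back, they occupy consecutive record numbers (2, 3, and 4 in this run), which is the property the extraction processing relies on.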

By saving mail messages in the database system 103 in this way during the e-mail accumulation processing, all of the data of every e-mail can be saved regardless of the processing unit of the save processing of the database system 103.

The search processing shown in FIG. 2 is executed by the e-mail archive system 101 with a search request and a search query from the user terminal 100 as input, and the record numbers 1802 that constitute the search result are returned to the user terminal 100.
As shown in FIG. 8, the search processing within the e-mail archive system 101 consists solely of search system search processing (S801).

In the search system search processing of step S801, as shown in FIG. 9, the database management unit 102, having received a search request and a search query from the user terminal 100, first sends the search query to the search system 104.
Next, the search system 104 performs a search according to the search query received from the database management unit 102 and returns record numbers 1802 as the search result.
The search system search processing of step S801 is complete when the database management unit 102 has acquired the record numbers 1802.

The extraction processing shown in FIG. 2 is executed by the e-mail archive system 101 with an extraction request and a record number 1802 from the user terminal 100 as input, and the e-mail that constitutes the extraction result is returned to the user terminal 100.
As shown in FIG. 10, the extraction processing within the e-mail archive system 101 consists solely of database system extraction processing (S1001).

In the database system extraction processing of step S1001, as shown in FIG. 11, the database management unit 102, having received an extraction request and a record number 1802 from the user terminal 100, first sends the record number 1802 to the database system 103.
Next, the database system 103 searches for the mail message corresponding to the record number 1802 received from the database management unit 102 and returns that mail message as the extraction result.
If the mail message received from the database system 103 is part of an e-mail that was divided for saving, the database management unit 102 may acquire further mail messages from the database system 103 and combine the received mail messages to obtain the e-mail as it was before division.
The database system extraction processing of step S1001 is complete when the database management unit 102 has acquired the mail message and the e-mail obtained from it.

The database system extraction processing of step S1001 is now described in detail.
Because there is an upper limit on the size of data that can be saved in the database system 103 in one save operation, an e-mail exceeding this limit is divided and saved as a plurality of mail messages.
In the extraction processing, therefore, a predetermined number of mail messages following the record number 1802 to be extracted must also be acquired at the same time, and, if the mail message corresponding to that record number 1802 is part of a divided e-mail, the mail messages acquired with it must be combined.

FIG. 12 shows the specific flow of processing.
First, in step S1201, the database management unit 102 sends the record number 1802 of the mail message to be extracted and requests the database system 103 to extract a predetermined number of mail messages.
That is, where R is the record number 1802 of the mail message to be extracted and N is the predetermined number, the database management unit 102 requests the database system 103 to extract the mail messages with record numbers 1802 from R to (R+N-1).
Next, in step S1202, the database system 103 acquires the N specified records and returns them to the database management unit 102.
Next, in step S1203, the database management unit 102 determines whether a division number 2501 is added to the mail message with record number R.

If a division number 2501 is added (Yes in S1203), the database management unit 102 proceeds to step S1204, where it refers to the number of divisions 2502 and determines whether as many mail messages as the number of divisions 2502 have already been acquired.

If fewer mail messages than the number of divisions 2502 have been acquired (No in S1204), the database management unit 102 proceeds to step S1205 and acquires the missing mail messages from the database system 103.
The processing then proceeds to step S1206, where the acquired mail messages are combined to obtain the e-mail as it was before division, and the database system extraction processing ends.

一方、ステップＳ１２０４において、分割件数２５０２分のメールメッセージを取得済みの場合（Ｓ１２０４でＹｅｓの場合）、ステップＳ１２０６へ進み、取得したメールメッセージを結合して分割前の電子メールを取得することで、データベースシステム抽出処理を終了する。 On the other hand, if as many mail messages as the division count 2502 have already been acquired in step S1204 (Yes in S1204), the process proceeds to step S1206, where the acquired mail messages are combined to obtain the e-mail as it was before division, and the database system extraction process ends.

また、ステップＳ１２０３において、分割番号２５０１が付加されていないとき（Ｓ１２０３でＮｏの場合）、指定のレコード番号Ｒのメールメッセージをそのまま電子メールとして取得することで、データベースシステム抽出処理は終了する。
ステップＳ１２０２において同時に取得したレコード番号（Ｒ＋１）〜（Ｒ＋Ｎ−１）のメールメッセージについては、使用されず破棄される。 When the division number 2501 has not been added in step S1203 (No in S1203), the mail message with the designated record number R is acquired as the e-mail as it is, and the database system extraction process ends.
The mail messages with the record numbers (R + 1) to (R + N−1) acquired at the same time in step S1202 are not used and are discarded.
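
The flow of steps S1201 to S1206 can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `MailMessage` class, the `fetch` helper, the in-memory `store` dictionary, and the use of `None` to mean "no division number added" are all assumptions introduced here. It also assumes that the requested record number R points at the first part of a divided e-mail and that the parts occupy consecutive record numbers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MailMessage:
    record_number: int
    payload: bytes
    division_number: Optional[int] = None  # None: stored undivided (assumed encoding)
    division_count: Optional[int] = None   # total number of parts when divided


def fetch(store: Dict[int, MailMessage], start: int, count: int) -> List[MailMessage]:
    """Hypothetical stand-in for the database system: consecutive records from `start`."""
    return [store[r] for r in range(start, start + count) if r in store]


def extract_email(store: Dict[int, MailMessage], r: int, n: int) -> bytes:
    batch = fetch(store, r, n)                      # S1201/S1202: request records R..R+N-1
    if not batch or batch[0].record_number != r:
        raise KeyError(f"record {r} not found")
    first = batch[0]
    if first.division_number is None:               # S1203: No
        return first.payload                        # records R+1..R+N-1 are discarded
    needed = first.division_count                   # S1204: enough parts acquired?
    if len(batch) < needed:                         # S1205: fetch only the missing parts
        batch += fetch(store, r + len(batch), needed - len(batch))
    return b"".join(m.payload for m in batch[:needed])  # S1206: combine the parts
```

With N at least as large as the division count, the second `fetch` call is never needed, which is the case the text aims for.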

ここで、ステップＳ１２０１およびＳ１２０２における、データベースシステム１０３の動作について、より詳細に説明する。
データベースシステム１０３では、データは時系列順に連番のレコード番号１８０２を割り当てられて並び、ストレージに格納されている。
メールメッセージの取得の要求に対しては、まず指定のメールメッセージのレコード番号１８０２が属するデータベースシステム１０３中の範囲１８０１を特定する。
次に、この範囲１８０１に含まれるメールメッセージ全てを読み出した上で、指定のメールメッセージのレコード番号１８０２を探索し、該当するメールメッセージを取り出してデータベース管理部１０２に返す。 Here, the operation of the database system 103 in steps S1201 and S1202 will be described in more detail.
In the database system 103, data is arranged in a time-series order with sequential record numbers 1802 assigned and stored in the storage.
In response to a mail message acquisition request, first, a range 1801 in the database system 103 to which the record number 1802 of the designated mail message belongs is specified.
Next, after reading all the mail messages included in the range 1801, the record number 1802 of the designated mail message is searched, the corresponding mail message is extracted and returned to the database management unit 102.

レコード番号１８０２の局所性により、指定のメールメッセージ（レコード番号１８０２をＲとする）が、任意の範囲１８０１に存在した場合、それに連続するメールメッセージ（レコード番号（Ｒ＋１）、（Ｒ＋２）、…）もこの範囲１８０１中に存在する確率が高い。
今、ある範囲１８０１にレコード番号Ｒ１〜Ｒ２（Ｒ１≦Ｒ≦Ｒ２）のメールメッセージが存在するとする。
仮に、Ｒ＋Ｎ−１≦Ｒ２であれば、指定したＮ個のメールメッセージはすべて同じ範囲１８０１内に存在する。
このような場合、データ取得時間ではストレージのアクセス時間が支配的であるため、レコード番号Ｒのデータ１件のみ取得する時間と、Ｒ〜Ｒ＋Ｎ−１のＮ件を取得する時間は、いずれも同一の範囲１８０１を読み出すストレージのアクセス時間に大部分が依存するため、その時間差は無視できる。 Because of the locality of record numbers 1802, if a designated mail message (with record number 1802 denoted R) exists in a given range 1801, the mail messages that follow it (record numbers (R + 1), (R + 2), ...) are also likely to exist in that range 1801.
Now, assume that the mail messages with record numbers R1 to R2 (R1 ≦ R ≦ R2) exist in a certain range 1801.
If R + N − 1 ≦ R2, the N designated mail messages all exist within the same range 1801.
In such a case, storage access time dominates the data acquisition time. The time to acquire only the one record with record number R and the time to acquire the N records R to R + N − 1 both depend mostly on the storage access time for reading the same range 1801, so the difference between them is negligible.
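
The condition R + N − 1 ≦ R2 can be checked with a few lines of code. The sketch below is an illustration under assumptions: the function name `ranges_touched` and the representation of each range 1801 as a pair (first record number, last record number) are hypothetical and do not come from the patent. It counts how many ranges a request for records R to R + N − 1 has to read; a result of 1 corresponds to the case where the extra N − 1 records cost almost nothing.

```python
from typing import List, Tuple


def ranges_touched(bounds: List[Tuple[int, int]], r: int, n: int) -> int:
    """Count the ranges that overlap the requested records r .. r + n - 1."""
    last = r + n - 1
    return sum(
        1
        for first_rec, last_rec in bounds
        if first_rec <= last and r <= last_rec  # interval overlap test
    )
```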

ここで、分割された電子メールはできるだけＮ個以内のメールメッセージに分割されることで追加取得のステップをできるだけ避けられる。
１度読み出した範囲１８０１内に存在する任意のメールメッセージに対して複数回の抽出処理を再実行することで、同一の範囲１８０１に対する読出し処理が複数回行われることによる無駄を省くことができる。
このようなＮを既定の抽出レコード数として設定することにより、過剰なメールメッセージ取得のペナルティを最小に抑えながら、分割された電子メールの取得の高速化を実現できる。 Here, if an e-mail is divided into N or fewer mail messages wherever possible, the additional acquisition step can be avoided as far as possible.
In addition, by performing repeated extraction processes for mail messages in a range 1801 against the data already read once, the waste of reading the same range 1801 multiple times can be eliminated.
By setting such an N as the default number of records to extract, the acquisition of divided e-mails can be sped up while the penalty of acquiring excess mail messages is kept to a minimum.

このように、本実施の形態に係る電子メールアーカイブシステムは、データベースシステムが１回の保存処理で保存可能なデータサイズを超える電子メールの保存において、電子メールを保存可能なデータサイズ単位に分割することで、全ての電子メールの全データの保存を可能にするものである。 As described above, the e-mail archiving system according to the present embodiment divides the e-mail into storable data size units when storing e-mails that exceed the data size that the database system can store in one saving process. This makes it possible to save all data of all e-mails.

以上、本実施の形態では、以下の２点を構成要素とし、データベースシステムの１回の保存処理単位を超える入力データを分割して保存し、１回の抽出処理で既定件数を取得して結合すること、を主な特徴とする、データアーカイブシステムについて説明した。
（ア）以下を特徴とするデータベースシステム。
１）データを時系列順で追記的に保存する。
２）保存される全データに対して、連番で管理されるレコード番号を付与する。
３）保存される全データに対して、分割番号を付与する。
４）保存される全データに対して、分割件数を付与する。
５）任意の時系列単位で範囲を追加し、範囲ごとにデータを管理する。
（イ）以下を特徴とするデータベース管理部。
１）データベースシステムの１回の保存処理単位を超える入力データを、複数個に分割することで、データベースシステムへの保存を可能とする。
２）分割された各データに対して、入力データ内で連番となる、分割番号を割り当てて、データベースシステムへ保存する。
３）分割された各データに対して、入力データを構成する分割されたデータの合計件数を表す、分割番号を割り当てて、データベースシステムへ保存する。 As described above, the present embodiment has described a data archive system whose main features are that it comprises the following two components, divides and stores input data exceeding the unit of one save process of the database system, and acquires and combines a predetermined number of records in one extraction process.
(A) A database system characterized by the following.
1) Data is stored in an append-only manner in chronological order.
2) A record number managed as a serial number is assigned to all stored data.
3) A division number is assigned to all stored data.
4) A division count is assigned to all stored data.
5) Ranges are added in arbitrary time-series units, and data is managed for each range.
(B) A database management unit characterized by the following.
1) Input data exceeding the unit of one save process of the database system is divided into a plurality of pieces so that it can be stored in the database system.
2) Each piece of divided data is assigned a division number, which is a serial number within the input data, and is stored in the database system.
3) Each piece of divided data is assigned a division count, which represents the total number of pieces of divided data constituting the input data, and is stored in the database system.
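
The division and numbering performed by the database management unit can be illustrated as follows. This is a minimal sketch under stated assumptions, not the actual storage format: the function name `split_for_storage`, the dictionary keys, numbering the parts from 1, and leaving the division fields as `None` for data that fits in one save process are all choices made here for illustration.

```python
from typing import Dict, List


def split_for_storage(data: bytes, max_size: int, next_record: int) -> List[Dict]:
    """Split `data` into records no larger than `max_size` bytes.

    Record numbers are consecutive, starting at `next_record`.
    """
    if len(data) <= max_size:
        # Fits in one save process: stored as a single, undivided record.
        return [{"record_number": next_record, "division_number": None,
                 "division_count": None, "payload": data}]
    chunks = [data[i:i + max_size] for i in range(0, len(data), max_size)]
    return [
        {
            "record_number": next_record + i,  # serial number across the database
            "division_number": i + 1,          # serial number within this input
            "division_count": len(chunks),     # total number of parts
            "payload": chunk,
        }
        for i, chunk in enumerate(chunks)
    ]
```

Storing the division count on every part lets the extraction side learn how many records it needs from whichever part it reads first.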

また、本実施の形態では、データベースシステム中のデータを検索するための検索問い合わせを入力してデータの検索を行い、検索結果としてレコード番号を返す検索システムをさらに備えること、を主な特徴とするデータアーカイブシステムについて説明した。 The main feature of the present embodiment is that it further includes a search system that inputs a search query for searching for data in the database system, searches for data, and returns a record number as a search result. The data archive system was explained.

また、本実施の形態では、分割保存されたデータに対する検索結果として返すレコード番号を１個とすること、を主な特徴とするデータアーカイブシステムについて説明した。 In the present embodiment, the data archiving system whose main feature is that one record number is returned as a search result for divided and stored data has been described.

また、本実施の形態では、データの検索を全文検索により行うこと、を主な特徴とするデータアーカイブシステムについて説明した。 In the present embodiment, the data archiving system whose main feature is that data search is performed by full-text search has been described.

実施の形態２．
この発明の実施の形態２における電子メールアーカイブシステムの基本構成は、実施の形態１と同じく、図１のようになる。
電子メールアーカイブシステム１０１は、データアーカイブシステムに検索システムを付加し、電子メール受信部１０５によってメールサーバ１０６から取得した電子メールに対して、蓄積処理・検索処理・抽出処理をユーザ端末１００に提供する。
この電子メールアーカイブシステム１０１は、データベース管理部１０２と、データベースシステム１０３と、検索システム１０４と、から構成される。 Embodiment 2.
The basic configuration of the e-mail archive system according to the second embodiment of the present invention is as shown in FIG. 1, as in the first embodiment.
The e-mail archive system 101 adds a search system to the data archive system and provides the user terminal 100 with storage processing, search processing, and extraction processing for the e-mail acquired from the mail server 106 by the e-mail receiving unit 105.
The e-mail archive system 101 includes a database management unit 102, a database system 103, and a search system 104.

データベースシステム１０３は、実施の形態１で説明したように、複数個のレコードを単位とする範囲１８０１（管理ブロック）に区分して、格納しているメールメッセージを管理している。
そして、本実施の形態では、データベース管理部１０２は、データベースシステム１０３に格納されているメールメッセージの分割件数の総数とデータベースシステム１０３の範囲１８０１の個数とに基づきメールメッセージの抽出要求数（データ抽出要求数）を決定し、データベースシステム１０３に対して特定のメールメッセージの抽出を要求する際に、レコード番号の順序において特定のメールメッセージから抽出要求数分のメールメッセージを抽出するよう要求する。
データベース管理部１０２は、分割件数の総数を範囲１８０１の個数で除算して得られる値を、メールメッセージの抽出要求数とする。 As described in the first embodiment, the database system 103 manages the stored mail messages by partitioning them into ranges 1801 (management blocks), each of which is a unit of a plurality of records.
In this embodiment, the database management unit 102 determines the number of mail message extraction requests (data extraction request count) based on the total division count of the mail messages stored in the database system 103 and the number of ranges 1801 in the database system 103. When requesting the database system 103 to extract a specific mail message, it requests extraction of that many mail messages, starting from the specific mail message in record number order.
The database management unit 102 uses the value obtained by dividing the total division count by the number of ranges 1801 as the number of mail message extraction requests.

本実施の形態においても、電子メールアーカイブシステム１０１とユーザ端末１００との間の動作は図２と同様である。
図２に示す蓄積処理は、ユーザ端末１００からの蓄積要求を入力として、電子メールアーカイブシステム１０１によって実行され、完了通知をユーザ端末１００に返す。
電子メールアーカイブシステム１０１内における蓄積処理の流れは、図１３に示すように、電子メール取得処理（Ｓ３０１）、電子メール統計情報取得処理（Ｓ１３０１）、データベースシステム保存処理（Ｓ３０２）、検索システム登録処理（Ｓ３０３）の順に実行される。 Also in the present embodiment, the operation between the electronic mail archive system 101 and the user terminal 100 is the same as that in FIG.
The storage process shown in FIG. 2 is executed by the email archive system 101 with the storage request from the user terminal 100 as an input, and a completion notification is returned to the user terminal 100.
As shown in FIG. 13, the flow of the accumulation process in the e-mail archive system 101 includes an e-mail acquisition process (S301), an e-mail statistical information acquisition process (S1301), a database system storage process (S302), and a search system registration process. (S303) are executed in this order.

また、電子メールアーカイブシステム１０１内における蓄積処理の流れは、図１４に示すように、電子メール取得処理（Ｓ３０１）、データベースシステム保存処理（Ｓ３０２）、電子メール統計情報取得処理（Ｓ１４０１）、検索システム登録処理（Ｓ３０３）の順に実行されてもよい。 As shown in FIG. 14, the flow of the accumulation process in the e-mail archive system 101 includes an e-mail acquisition process (S301), a database system storage process (S302), an e-mail statistical information acquisition process (S1401), and a search system. The registration processing (S303) may be performed in this order.

ステップＳ３０１の電子メール取得処理、ステップＳ３０２のデータベースシステム保存処理、ステップＳ３０３の検索システム登録処理については、実施の形態１に示した通りである。 The e-mail acquisition process in step S301, the database system storage process in step S302, and the search system registration process in step S303 are as described in the first embodiment.

ステップＳ１３０１の電子メール統計情報取得処理では、データベース管理部１０２が、分割保存される電子メールのメールメッセージの件数（分割メールメッセージ件数）をカウントしておき、これをデータベース管理部１０２に保存する。
分割メールメッセージ件数のカウントは、データベース管理部１０２が、電子メールを入力として、電子メールのデータサイズをデータベースシステム１０３が１回の保存処理で保存可能なデータサイズで割って商を求め、入力した電子メールごとの商を累計することで、求めることができる。 In the e-mail statistical information acquisition process in step S1301, the database management unit 102 counts the number of mail messages into which e-mails are divided for storage (the divided mail message count) and stores this count in the database management unit 102.
The divided mail message count can be obtained as follows: taking each e-mail as input, the database management unit 102 divides the data size of the e-mail by the data size that the database system 103 can store in one save process to obtain a quotient, and accumulates the quotients for all input e-mails.

具体的な処理の流れを図１５に示す。
まず、ステップＳ１５０１において、データベース管理部１０２は、入力となる電子メールのデータサイズを取得する。
次に、ステップＳ１５０２において、データベース管理部１０２は、データベースシステムが１回の保存処理で保存可能なデータサイズで、電子メールのデータサイズを割った商を求める。
次に、ステップＳ１５０３において、データベース管理部１０２は、ステップＳ１５０２で求めた商について余りがあるかどうかの判定処理を行う。
余りが存在する場合（Ｓ１５０３でＹｅｓの場合）は、ステップＳ１５０４に進み、ステップＳ１５０２で求めた商に１を加算して終了する。
余りが存在しない場合は、そのまま終了する。
そして、このように電子メールごとに求めた商を、全ての電子メールについて加算することで、分割メールメッセージ件数を得ることができる。 A specific processing flow is shown in FIG. 15.
First, in step S1501, the database management unit 102 acquires the data size of an input e-mail.
Next, in step S1502, the database management unit 102 obtains a quotient obtained by dividing the data size of the e-mail by the data size that the database system can store in one storing process.
Next, in step S1503, the database management unit 102 determines whether there is a remainder for the quotient obtained in step S1502.
If there is a remainder (Yes in S1503), the process proceeds to step S1504, 1 is added to the quotient obtained in step S1502, and the process ends.
If there is no remainder, the process ends.
The number of divided mail messages can be obtained by adding the quotient obtained for each e-mail in this way for all e-mails.
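
Steps S1501 to S1504 amount to a ceiling division per e-mail followed by a running total. The sketch below illustrates this; the function names `division_count` and `total_divided_messages` are hypothetical, and sizes are assumed to be given in bytes.

```python
from typing import Iterable


def division_count(email_size: int, max_size: int) -> int:
    """Number of mail messages needed to store one e-mail of `email_size` bytes."""
    quotient, remainder = divmod(email_size, max_size)  # S1502: quotient
    if remainder:                                       # S1503: is there a remainder?
        quotient += 1                                   # S1504: add 1 to the quotient
    return quotient


def total_divided_messages(email_sizes: Iterable[int], max_size: int) -> int:
    """Accumulate the per-e-mail counts over all input e-mails."""
    return sum(division_count(size, max_size) for size in email_sizes)
```

For example, with a limit of 4 bytes per save process, e-mails of 10, 8, and 3 bytes need 3, 2, and 1 mail messages respectively, for a total of 6.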

ステップＳ１３０１の電子メール統計情報取得処理では、上記した処理手順以外に、図２４に示すように、分割メールメッセージ件数を、データベースシステムの範囲１８０１ごとに、データベースシステム１０３の範囲管理テーブル１９０１に保存しておいてもよい。
つまり、図２４のように範囲１８０１ごとに分割メールメッセージ件数を管理していてもよいし、データベースシステム１０３の全範囲に対して分割メールメッセージ件数の合計値を一括して管理していてもよい。 In the e-mail statistical information acquisition process in step S1301, as an alternative to the above procedure, the divided mail message count may be stored for each range 1801 of the database system in the range management table 1901 of the database system 103, as shown in FIG. 24.
That is, the divided mail message count may be managed for each range 1801 as shown in FIG. 24, or the total divided mail message count may be managed collectively for all ranges of the database system 103.

また、蓄積処理の流れを、図１４に示す順に実行する場合、ステップＳ１４０１の電子メール統計情報取得処理では、データベース管理部１０２が、データベースシステム１０３が返すレコード番号１８０２を全て取得し、これを入力として、その件数をカウントして加算することによっても求めることができる。 When the accumulation process is executed in the order shown in FIG. 14, the count can also be obtained in the e-mail statistical information acquisition process in step S1401 by having the database management unit 102 acquire all the record numbers 1802 returned by the database system 103 and, taking them as input, count and add up their number.

また、図２に示す検索処理は、ユーザ端末１００からの検索要求および検索問い合わせ文を入力として、電子メールアーカイブシステム１０１によって実行され、検索結果となるレコード番号１８０２をユーザ端末１００に返す。
電子メールアーカイブシステム１０１内における検索処理の流れは、図８に示すように、検索システム検索処理（Ｓ８０１）のみが実行される。
ステップＳ８０１の検索システム検索処理については、実施の形態１に示した通りである。 The search process shown in FIG. 2 is executed by the e-mail archive system 101 with the search request and search query text from the user terminal 100 as inputs, and returns a record number 1802 as a search result to the user terminal 100.
As shown in FIG. 8, only the search system search process (S801) is executed as the search process flow in the e-mail archive system 101.
The search system search process in step S801 is as described in the first embodiment.

また、図２に示す抽出処理は、ユーザ端末１００からの抽出要求およびレコード番号１８０２を入力として、電子メールアーカイブシステム１０１によって実行され、抽出結果となる電子メールをユーザ端末１００に返す。
電子メールアーカイブシステム１０１内における抽出処理の流れは、図１０に示すように、データベースシステム抽出処理（Ｓ１００１）のみが実行される。
ステップＳ１００１のデータベースシステム取得処理については、実施の形態１に示した通りである。 The extraction process shown in FIG. 2 is executed by the e-mail archive system 101 with the extraction request from the user terminal 100 and the record number 1802 as inputs, and returns an e-mail as an extraction result to the user terminal 100.
As for the flow of extraction processing in the e-mail archive system 101, as shown in FIG. 10, only the database system extraction processing (S1001) is executed.
The database system acquisition process in step S1001 is as described in the first embodiment.

ただし、具体的な処理の流れを図１６に示すものとする。
これは、図１２のステップＳ１２０２をステップＳ１６０２に置き換えたもので、それ以外のステップＳ１２０１およびＳ１２０３〜Ｓ１２０６については、実施の形態１に示した通りである。 However, the specific processing flow is as shown in FIG. 16.
This is obtained by replacing step S1202 in FIG. 12 with step S1602, and other steps S1201 and S1203 to S1206 are as described in the first embodiment.

ここで、ステップＳ１６０２の「蓄積時の統計情報を基にした件数」について、詳細に説明する。
ステップＳ１３０１の電子メール統計情報取得処理を、データベース管理部１０２が分割メールメッセージ件数をカウントし、これをデータベース管理部１０２に保存する手順とした場合の具体的な処理の流れを図１７のように設定する。
まず、ステップＳ１７０１において、データベース管理部１０２は、保存している分割メールメッセージ件数を取得する。
次に、ステップＳ１７０２において、データベース管理部１０２は、データベースシステム中に存在する範囲１８０１数をデータベースシステム１０３の範囲管理テーブル１９０１から取得する。
最後に、ステップＳ１７０３において、データベース管理部１０２は、ステップＳ１７０１で取得した分割メールメッセージ件数をステップＳ１７０２で取得した範囲１８０１数で割った商を、メールメッセージの抽出要求数として「蓄積時の統計情報を基にした件数」に設定し、終了する。 Here, the “number of cases based on the statistical information at the time of accumulation” in step S1602 will be described in detail.
FIG. 17 shows the specific processing flow when the e-mail statistical information acquisition process in step S1301 is the procedure in which the database management unit 102 counts the divided mail message count and stores it in the database management unit 102.
First, in step S1701, the database management unit 102 acquires the stored divided mail message count.
Next, in step S1702, the database management unit 102 acquires the number of ranges 1801 existing in the database system from the range management table 1901 of the database system 103.
Finally, in step S1703, the database management unit 102 sets the quotient obtained by dividing the divided mail message count acquired in step S1701 by the number of ranges 1801 acquired in step S1702 as the “number of cases based on the statistical information at the time of accumulation”, that is, as the number of mail message extraction requests, and the process ends.
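
Steps S1701 to S1703 reduce to one integer division. The following sketch shows the calculation; the function name `extraction_request_count` is hypothetical, and clamping the result to at least 1 is an assumption added here (the text does not say what happens when the quotient is 0, but a request for zero records would return nothing).

```python
def extraction_request_count(total_divided_messages: int, range_count: int) -> int:
    """Quotient of the divided mail message count by the number of ranges (S1703)."""
    if range_count <= 0:
        raise ValueError("range_count must be positive")
    quotient = total_divided_messages // range_count
    return max(1, quotient)  # assumption: always request at least the target record
```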

また、ステップＳ１３０１の電子メール統計情報取得処理を、図２４に示すように、データベースシステム１０３の範囲管理テーブル１９０１に範囲１８０１ごとに分割メールメッセージ件数を保存する手順とした場合、データベース管理部１０２は、ステップＳ１２０１におけるレコード番号１８０２が存在する範囲１８０１の分割メールメッセージ件数を、データベースシステム１０３の範囲管理テーブル１９０１から取得し、これをメールメッセージの抽出要求数として「蓄積時の統計情報を基にした件数」に設定する。
つまり、データベース管理部１０２は、図２４の例では、分割メールメッセージ件数６件（１＋２＋３＋０＝６）を、範囲数である４で除算して、１．５を得る。
この１．５に対して、データベース管理部１０２は、１又は２をメールメッセージの抽出要求数（蓄積時の統計情報を基にした件数）として決定する。 When the e-mail statistical information acquisition process in step S1301 is the procedure of storing the divided mail message count for each range 1801 in the range management table 1901 of the database system 103, as shown in FIG. 24, the database management unit 102 acquires, from the range management table 1901 of the database system 103, the divided mail message count of the range 1801 in which the record number 1802 of step S1201 exists, and sets it as the “number of cases based on the statistical information at the time of accumulation”, that is, as the number of mail message extraction requests.
That is, in the example of FIG. 24, the database management unit 102 divides the divided mail message count of 6 (1 + 2 + 3 + 0 = 6) by 4, the number of ranges, to obtain 1.5.
For this 1.5, the database management unit 102 determines 1 or 2 as the number of mail message extraction requests (the number of cases based on the statistical information at the time of accumulation).
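
The two variants described above, a per-range lookup and the average over all ranges, can be compared on the numbers from the FIG. 24 example. In this sketch the table layout (a dictionary from a range identifier to its divided mail message count), the range identifiers 1 to 4, the assignment of the counts 1, 2, 3, 0 to those identifiers, and the function names are all assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

# Divided mail message count per range; the counts follow the 1 + 2 + 3 + 0 example.
range_table: Dict[int, int] = {1: 1, 2: 2, 3: 3, 4: 0}


def per_range_count(table: Dict[int, int], range_id: int) -> int:
    """Per-range variant: use the count stored for the range holding record R."""
    return table[range_id]


def average_candidates(table: Dict[int, int]) -> Tuple[int, int]:
    """Average variant: total count divided by the number of ranges, rounded both ways."""
    average = sum(table.values()) / len(table)
    return math.floor(average), math.ceil(average)
```

Here `average_candidates(range_table)` yields the two candidates 1 and 2 for the average of 1.5, while the per-range variant would request 3 records for a record number located in the third range.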

そして、データベース管理部１０２は、図１６のステップＳ１２０１において、抽出を要求する特定のレコード番号と、以上のようにして決定された抽出要求数（蓄積時の統計情報を基にした件数）をデータベースシステム１０３に対して通知し、データベースシステム１０３は、特定のレコード番号から抽出要求数分のメールメッセージを抽出してデータベース管理部１０２に出力する。
データベース管理部１０２では、図１６のステップＳ１６０２において、特定のレコード番号から抽出要求数分のメールメッセージを取得することになる。 Then, in step S1201 of FIG. 16, the database management unit 102 stores the specific record number requesting the extraction and the number of extraction requests determined as described above (the number based on the statistical information at the time of accumulation) in the database. Notifying the system 103, the database system 103 extracts mail messages corresponding to the number of extraction requests from a specific record number and outputs them to the database management unit 102.
In step S1602 of FIG. 16, the database management unit 102 acquires as many e-mail messages as the number of extraction requests from a specific record number.

このように、蓄積時の統計情報から最適な取得件数を自動的に設定することで、適用先のシステムの特性やアーカイブ対象となる電子メールのデータサイズの変化に柔軟に対応した、効率的なメールメッセージ取得機能を提供できる。
本実施の形態では、蓄積時の統計情報に基づいて、分割メールメッセージ件数の総数を範囲数（管理ブロックの個数）で除算して得られる値を、メールメッセージの抽出要求数とし、データベースシステムが、この抽出要求数に従ってメールメッセージを抽出する。この抽出要求数は、一つの範囲に存在する分割メールメッセージ件数の平均値である。
本実施の形態に係るデータベースシステムは、抽出要求数に従い、一つの範囲に存在する分割メールメッセージの平均件数分のメールメッセージを一回の抽出処理で抽出しているため、一つの電子メールを構成する全てのメールメッセージを一回の抽出処理で抽出できる可能性が増える。
実施の形態１の例では、一つの電子メールを構成する複数のメールメッセージの全てを一回の抽出処理で取得できない場合や、一回の抽出処理において抽出された複数のメールメッセージの多くを使用することなく破棄する場合が生じるが、本実施の形態によれば、このような事態の発生頻度を抑制することができ、メールメッセージ抽出処理の効率化を図ることができる。 In this way, by automatically setting the optimal number of acquisitions from the statistical information at the time of accumulation, it is efficient to flexibly respond to changes in the characteristics of the target system and the data size of the email to be archived. A mail message acquisition function can be provided.
In this embodiment, based on the statistical information at the time of accumulation, a value obtained by dividing the total number of divided mail messages by the number of ranges (number of management blocks) is set as the number of mail message extraction requests, and the database system The mail message is extracted according to the number of extraction requests. The number of extraction requests is an average value of the number of divided mail messages existing in one range.
Because the database system according to the present embodiment extracts, in one extraction process and in accordance with the number of extraction requests, as many mail messages as the average number of divided mail messages existing in one range, it becomes more likely that all the mail messages constituting one e-mail can be extracted in a single extraction process.
In the example of the first embodiment, when all of a plurality of mail messages constituting one electronic mail cannot be obtained by one extraction process, or many of a plurality of mail messages extracted by one extraction process are used. However, according to the present embodiment, the occurrence frequency of such a situation can be suppressed, and the efficiency of the mail message extraction process can be improved.

このように、本実施の形態に係る電子メールアーカイブシステムは、分割保存された電子メールの取得において、電子メールのデータ特性と電子メールアーカイブシステムで用いるデータベースシステムのディスクＩ／Ｏ特性を考慮し、１回のメールメッセージの抽出処理における抽出件数を複数件に設定することで、抽出処理実行回数の削減および取得処理１回における不要な電子メールの抽出件数を最小限に抑え、電子メールアーカイブシステム全体の処理の効率化を実現するものである。 As described above, the e-mail archive system according to the present embodiment takes into account the data characteristics of e-mail and the disk I / O characteristics of the database system used in the e-mail archive system in obtaining divided and stored e-mails. By setting the number of extractions in one email message extraction process to multiple, the number of extraction process executions can be reduced and the number of unnecessary emails extracted in one acquisition process can be minimized, and the entire email archive system To improve the efficiency of the process.

以上、本実施の形態では、分割保存された入力データの取得において、入力データのデータベースシステムへの保存時における統計情報として、データベースシステムに分割保存されたデータ数の総数を保存しておき、これとデータベースシステム中に存在する範囲数を用いて、最適な取得件数を設定し、取得対象のレコード番号に続く設定済みの取得件数分のデータを、データベースシステムから取得すること、を主な特徴とする、データアーカイブシステムについて説明した。 As described above, the present embodiment has described a data archive system whose main features are as follows: for acquiring input data that has been divided and stored, the total number of pieces of data divided and stored in the database system is saved as statistical information when the input data is stored in the database system; the optimum acquisition count is set using this total and the number of ranges existing in the database system; and as many pieces of data as the set acquisition count, following the record number to be acquired, are acquired from the database system.

また、本実施の形態では、最適な取得件数の設定において、データベースシステムに分割保存されたデータ数を、データベースシステム中に存在する範囲数で割った商に最適な取得件数を設定すること、を主な特徴とする、データアーカイブシステムについて説明した。 The present embodiment has also described a data archive system whose main feature is that, in setting the optimum acquisition count, the optimum acquisition count is set to the quotient obtained by dividing the number of pieces of data divided and stored in the database system by the number of ranges existing in the database system.

また、本実施の形態では、最適な取得件数において、入力データの保存時における統計情報として分割保存されたデータ数を、データベースシステム中に存在する範囲ごとに保持しておき、これを各範囲に対する最適な取得件数とすること、を主な特徴とするデータアーカイブシステムについて説明した。 Further, in the present embodiment, in the optimum number of acquisition cases, the number of data divided and stored as statistical information at the time of storing input data is held for each range existing in the database system, and this is stored for each range. The data archiving system whose main feature is to obtain the optimum number of acquisitions has been described.

最後に実施の形態１、２に示した電子メールアーカイブシステム１０１のハードウェア構成例について説明する。
図２６は、実施の形態１、２に示す電子メールアーカイブシステム１０１のハードウェア資源の一例を示す図である。なお、図２６の構成は、あくまでも電子メールアーカイブシステム１０１のハードウェア構成の一例を示すものであり、電子メールアーカイブシステム１０１のハードウェア構成は図２６に記載の構成に限らず、他の構成であってもよい。 Finally, a hardware configuration example of the electronic mail archive system 101 shown in the first and second embodiments will be described.
FIG. 26 is a diagram showing an example of the hardware resources of the e-mail archive system 101 shown in the first and second embodiments. The configuration in FIG. 26 is merely an example of the hardware configuration of the e-mail archive system 101; the hardware configuration of the e-mail archive system 101 is not limited to the configuration shown in FIG. 26 and may be another configuration.

図２６において、電子メールアーカイブシステム１０１は、プログラムを実行するＣＰＵ９１１（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ、中央処理装置、処理装置、演算装置、マイクロプロセッサ、マイクロコンピュータ、プロセッサともいう）を備えている。ＣＰＵ９１１は、バス９１２を介して、例えば、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）９１３、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）９１４、通信ボード９１５、表示装置９０１、キーボード９０２、マウス９０３、磁気ディスク装置９２０と接続され、これらのハードウェアデバイスを制御する。更に、ＣＰＵ９１１は、ＦＤＤ９０４（ＦｌｅｘｉｂｌｅＤｉｓｋＤｒｉｖｅ）、コンパクトディスク装置９０５（ＣＤＤ）、プリンタ装置９０６、スキャナ装置９０７と接続していてもよい。また、磁気ディスク装置９２０の代わりに、光ディスク装置、メモリカード読み書き装置などの記憶装置でもよい。
ＲＡＭ９１４は、揮発性メモリの一例である。ＲＯＭ９１３、ＦＤＤ９０４、ＣＤＤ９０５、磁気ディスク装置９２０の記憶媒体は、不揮発性メモリの一例である。これらは、記憶装置あるいは記憶部の一例である。
通信ボード９１５、キーボード９０２、スキャナ装置９０７、ＦＤＤ９０４などは、入力部、入力装置の一例である。
また、通信ボード９１５、表示装置９０１、プリンタ装置９０６などは、出力部、出力装置の一例である。 In FIG. 26, the e-mail archive system 101 includes a CPU 911 (also referred to as a central processing unit, a central processing unit, a processing unit, an arithmetic unit, a microprocessor, a microcomputer, and a processor) that executes a program. The CPU 911 is connected to, for example, a ROM (Read Only Memory) 913, a RAM (Random Access Memory) 914, a communication board 915, a display device 901, a keyboard 902, a mouse 903, and a magnetic disk device 920 via a bus 912. Control hardware devices. Further, the CPU 911 may be connected to an FDD 904 (Flexible Disk Drive), a compact disk device 905 (CDD), a printer device 906, and a scanner device 907. Further, instead of the magnetic disk device 920, a storage device such as an optical disk device or a memory card read / write device may be used.
The RAM 914 is an example of a volatile memory. The storage media of the ROM 913, the FDD 904, the CDD 905, and the magnetic disk device 920 are an example of a nonvolatile memory. These are examples of a storage device or a storage unit.
The communication board 915, the keyboard 902, the scanner device 907, the FDD 904, and the like are examples of an input unit and an input device.
Further, the communication board 915, the display device 901, the printer device 906, and the like are examples of an output unit and an output device.

通信ボード９１５は、ネットワークに接続されている。例えば、通信ボード９１５は、ＬＡＮ（ローカルエリアネットワーク）、インターネット、ＷＡＮ（ワイドエリアネットワーク）などに接続されていても構わない。
磁気ディスク装置９２０には、オペレーティングシステム９２１（ＯＳ）、ウィンドウシステム９２２、プログラム群９２３、ファイル群９２４が記憶されている。プログラム群９２３のプログラムは、ＣＰＵ９１１、オペレーティングシステム９２１、ウィンドウシステム９２２により実行される。 The communication board 915 is connected to the network. For example, the communication board 915 may be connected to a LAN (local area network), the Internet, a WAN (wide area network), or the like.
The magnetic disk device 920 stores an operating system 921 (OS), a window system 922, a program group 923, and a file group 924. The programs in the program group 923 are executed by the CPU 911, the operating system 921, and the window system 922.

上記プログラム群９２３には、実施の形態１、２の説明において「〜部」として説明している機能を実行するプログラムが記憶されている。プログラムは、ＣＰＵ９１１により読み出され実行される。
ファイル群９２４には、実施の形態１、２の説明において、「〜の判断」、「〜の計算」、「〜の比較」、「〜の取得」、「〜の抽出」、「〜の設定」、「〜の登録」、「〜の更新」「〜の分割」「〜のカウント」等として説明している処理の結果を示す情報やデータや信号値や変数値やパラメータが、「〜ファイル」や「〜データベース」の各項目として記憶されている。「〜ファイル」や「〜データベース」は、ディスクやメモリなどの記録媒体に記憶される。ディスクやメモリになどの記憶媒体に記憶された情報やデータや信号値や変数値やパラメータは、読み書き回路を介してＣＰＵ９１１によりメインメモリやキャッシュメモリに読み出され、抽出・検索・参照・比較・演算・計算・処理・編集・出力・印刷・表示などのＣＰＵの動作に用いられる。抽出・検索・参照・比較・演算・計算・処理・編集・出力・印刷・表示のＣＰＵの動作の間、情報やデータや信号値や変数値やパラメータは、メインメモリ、レジスタ、キャッシュメモリ、バッファメモリ等に一時的に記憶される。
また、実施の形態１、２で説明しているフローチャートの矢印の部分は主としてデータや信号の入出力を示し、データや信号値は、ＲＡＭ９１４のメモリ、ＦＤＤ９０４のフレキシブルディスク、ＣＤＤ９０５のコンパクトディスク、磁気ディスク装置９２０の磁気ディスク、その他光ディスク、ミニディスク、ＤＶＤ等の記録媒体に記録される。また、データや信号は、バス９１２や信号線やケーブルその他の伝送媒体によりオンライン伝送される。 The program group 923 stores programs for executing the functions described as “˜units” in the description of the first and second embodiments. The program is read and executed by the CPU 911.
The file group 924 stores, as items of “˜file” or “˜database”, the information, data, signal values, variable values, and parameters indicating the results of the processes described in the first and second embodiments as “determination of ˜”, “calculation of ˜”, “comparison of ˜”, “acquisition of ˜”, “extraction of ˜”, “setting of ˜”, “registration of ˜”, “update of ˜”, “division of ˜”, “counting of ˜”, and the like. The “˜file” and “˜database” are stored in a recording medium such as a disk or a memory. Information, data, signal values, variable values, and parameters stored in a storage medium such as a disk or memory are read out to the main memory or cache memory by the CPU 911 via a read/write circuit and used for CPU operations such as extraction, search, reference, comparison, operation, calculation, processing, editing, output, printing, and display. During these CPU operations, the information, data, signal values, variable values, and parameters are temporarily stored in the main memory, registers, cache memory, buffer memory, and the like.
The arrows in the flowcharts described in the first and second embodiments mainly indicate input / output of data and signals. The data and signal values are the memory of the RAM 914, the flexible disk of the FDD904, the compact disk of the CDD905, and the magnetic field. Recording is performed on a recording medium such as a magnetic disk of the disk device 920, other optical disks, mini disks, DVDs, and the like. Data and signals are transmitted online via a bus 912, signal lines, cables, or other transmission media.

また、実施の形態１、２の説明において「〜部」として説明しているものは、「〜回路」、「〜装置」、「〜機器」、であってもよく、また、「〜ステップ」、「〜手順」、「〜処理」であってもよい。すなわち、「〜部」として説明しているものは、ＲＯＭ９１３に記憶されたファームウェアで実現されていても構わない。或いは、ソフトウェアのみ、或いは、素子・デバイス・基板・配線などのハードウェアのみ、或いは、ソフトウェアとハードウェアとの組み合わせ、さらには、ファームウェアとの組み合わせで実施されても構わない。ファームウェアとソフトウェアは、プログラムとして、磁気ディスク、フレキシブルディスク、光ディスク、コンパクトディスク、ミニディスク、ＤＶＤ等の記録媒体に記憶される。プログラムはＣＰＵ９１１により読み出され、ＣＰＵ９１１により実行される。すなわち、プログラムは、実施の形態１、２の「〜部」としてコンピュータを機能させるものである。あるいは、実施の形態１、２の「〜部」の手順や方法をコンピュータに実行させるものである。 In addition, what is described as “˜unit” in the description of the first and second embodiments may be “˜circuit”, “˜device”, “˜device”, and “˜step”. , “˜procedure”, and “˜processing”. That is, what is described as “˜unit” may be realized by firmware stored in the ROM 913. Alternatively, it may be implemented only by software, or only by hardware such as elements, devices, substrates, and wirings, by a combination of software and hardware, or by a combination of firmware. Firmware and software are stored as programs in a recording medium such as a magnetic disk, a flexible disk, an optical disk, a compact disk, a mini disk, and a DVD. The program is read by the CPU 911 and executed by the CPU 911. That is, the program causes the computer to function as the “˜unit” in the first and second embodiments. Alternatively, the computer executes the procedure and method of “to unit” in the first and second embodiments.

このように、実施の形態１、２に示す電子メールアーカイブシステム１０１は、処理装置たるＣＰＵ、記憶装置たるメモリ、磁気ディスク等、入力装置たるキーボード、マウス、通信ボード等、出力装置たる表示装置、通信ボード等を備えるコンピュータであり、上記したように「〜部」として示された機能をこれら処理装置、記憶装置、入力装置、出力装置を用いて実現するものである。 As described above, the electronic mail archive system 101 shown in the first and second embodiments includes a CPU as a processing device, a memory as a storage device, a magnetic disk, a keyboard as an input device, a mouse, a communication board, and a display device as an output device, A computer including a communication board or the like, and implements the functions indicated as “˜units” as described above using these processing devices, storage devices, input devices, and output devices.

Diagram showing a system configuration example according to Embodiments 1 and 2.
Diagram showing an operation example of the user terminal and the e-mail archive system according to Embodiments 1 and 2.
Flowchart showing an example of accumulation processing according to Embodiment 1.
Diagram showing the data flow in e-mail acquisition processing according to Embodiment 1.
Diagram showing the data flow in database system storage processing according to Embodiment 1.
Diagram showing the data flow in search system registration processing according to Embodiment 1.
Flowchart showing an example of database system storage processing according to Embodiment 1.
Flowchart showing an example of search processing according to Embodiment 1.
Diagram showing the data flow in search system search processing according to Embodiment 1.
Flowchart showing an example of extraction processing according to Embodiment 1.
Diagram showing the data flow in database system extraction processing according to Embodiment 1.
Flowchart showing an example of database system extraction processing according to Embodiment 1.
Flowchart showing an example of accumulation processing according to Embodiment 2.
Flowchart showing an example of accumulation processing according to Embodiment 2.
Flowchart showing an example of e-mail statistical information acquisition processing according to Embodiment 2.
Flowchart showing an example of database system extraction processing according to Embodiment 2.
Flowchart showing an example of calculating the number of extraction requests from statistical information gathered at accumulation time, according to Embodiment 2.
Diagram showing a data storage example of the database system according to Embodiments 1 and 2.
Diagram showing an example of the range management table according to Embodiment 1.
Diagram showing a data storage example of the database system according to Embodiments 1 and 2.
Diagram showing an example of the range management table according to Embodiment 1.
Diagram showing a data storage example of the database system according to Embodiments 1 and 2.
Diagram showing an example of the range management table according to Embodiment 1.
Diagram showing an example of the range management table according to Embodiment 2.
Diagram showing an example of e-mail division according to Embodiments 1 and 2.
Diagram showing a hardware configuration example of the e-mail archive system according to Embodiments 1 and 2.

Description of Reference Numerals

100: user terminal; 101: e-mail archive system; 102: database management unit; 103: database system; 104: search system; 105: e-mail receiving unit; 106: mail server.

Claims

A data processing apparatus comprising:
a database management unit that, when the data size of input data is less than or equal to a predetermined upper limit data size, outputs the input data to a database system without dividing it, and, when the data size of the input data exceeds the upper limit data size, divides the input data into a plurality of divided data items each having a data size less than or equal to the upper limit data size and outputs the plurality of divided data items to the database system; and
a database system that assigns a sequential data management number to each data item input from the database management unit and stores the data items in order of data management number,
wherein the database system
manages the stored data by partitioning it into management blocks, each made up of a plurality of data items, and
wherein the database management unit,
before requesting the database system to extract data, determines, as a data extraction request count, the number of data items whose extraction is to be requested from the database system, based on the total number of divided data items stored in the database system and the number of management blocks of the database system,
requests the database system to extract, up to the data extraction request count in total, specific data to which a specific data management number is assigned together with subsequent data that follows the specific data in data management number order, and
receives as input, from the database system, the specific data and subsequent data amounting to the data extraction request count.
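
The splitting, sequential numbering, and "specific data plus subsequent data" extraction recited in this claim can be sketched as follows. This is a minimal illustration only, not the implementation described in the specification: `split_input`, `ToyDatabase`, `store`, and `extract` are hypothetical names, data sizes are measured in bytes, and an in-memory dictionary stands in for the database system (management blocks are omitted here).

```python
from typing import Dict, List


def split_input(data: bytes, upper_limit: int) -> List[bytes]:
    """Return the data as-is if it fits, else chunks of at most upper_limit bytes."""
    if upper_limit <= 0:
        raise ValueError("upper_limit must be positive")
    if len(data) <= upper_limit:
        return [data]  # not divided
    return [data[i:i + upper_limit] for i in range(0, len(data), upper_limit)]


class ToyDatabase:
    """Hypothetical stand-in for the database system (assumption, in-memory only)."""

    def __init__(self) -> None:
        self._rows: Dict[int, bytes] = {}
        self._next_no = 1  # next data management number (sequential)

    def store(self, pieces: List[bytes]) -> List[int]:
        """Assign consecutive data management numbers and store in that order."""
        numbers = []
        for piece in pieces:
            self._rows[self._next_no] = piece
            numbers.append(self._next_no)
            self._next_no += 1
        return numbers

    def extract(self, start_no: int, request_count: int) -> List[bytes]:
        """Return the record at start_no plus the records that follow it,
        up to request_count records in total."""
        wanted = range(start_no, start_no + request_count)
        return [self._rows[n] for n in wanted if n in self._rows]


db = ToyDatabase()
numbers = db.store(split_input(b"abcdefgh", upper_limit=3))
first_two = db.extract(numbers[0], request_count=2)
```

Because the pieces of one input receive consecutive numbers, a single request for the first piece and its successors can return several pieces of the same input at once.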

The data processing apparatus according to claim 1, wherein the database management unit,
each time new input data is input, outputs to the database system the new input data or new divided data obtained by dividing the new input data, and,
each time it outputs new input data or new divided data to the database system, updates the data extraction request count based on the total number of divided data items stored in the database system at the time of that output and the number of management blocks of the database system.

The data processing apparatus according to claim 1, wherein the database management unit
uses, as the data extraction request count, the value obtained by dividing the total number of divided data items by the number of management blocks.
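
The division recited in this claim can be written as a one-line calculation. The sketch below is illustrative only; the function name `extraction_request_count` is hypothetical, and because the claim does not say how a fractional quotient is handled, rounding up and a minimum of one record are assumptions made here.

```python
def extraction_request_count(total_divided: int, block_count: int) -> int:
    """Total number of divided data items divided by the number of management blocks.

    Assumptions (not stated in the claim): the quotient is rounded up,
    and at least one record is always requested.
    """
    if block_count <= 0:
        raise ValueError("block_count must be positive")
    if total_divided < 0:
        raise ValueError("total_divided must not be negative")
    quotient_rounded_up = (total_divided + block_count - 1) // block_count
    return max(1, quotient_rounded_up)
```

For example, with 10 divided data items spread over 4 management blocks, this sketch yields a request count of 3.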

The data processing apparatus according to claim 1, wherein the database system
stores the number of divided data items in association with the management block in which those divided data items are stored, and
wherein the database management unit
obtains the total number of divided data items by adding up the per-management-block counts of divided data items stored in the database system.
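
As a rough illustration of the per-block bookkeeping in this claim, the sketch below keeps one divided-data count per management block and sums them. The `BlockEntry` layout and its field names are hypothetical; the actual columns of the range management table are defined in the specification and figures, not here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockEntry:
    """Hypothetical per-management-block record (field names are assumptions)."""
    block_id: int
    divided_count: int  # number of divided data items stored in this block


def total_divided(entries: List[BlockEntry]) -> int:
    """Add up the per-block counts to obtain the total number of divided data items."""
    return sum(entry.divided_count for entry in entries)


table = [BlockEntry(0, 4), BlockEntry(1, 0), BlockEntry(2, 6)]
total = total_divided(table)
```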

The data processing apparatus according to claim 1, wherein the database system,
when extraction of the specific data is requested by the database management unit, identifies the management block corresponding to the specific data management number whose extraction was requested, reads all data in the identified management block, and searches for the data to which the specific data management number is assigned.
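
The block-level lookup in this claim can be sketched as below, assuming that each management block covers a contiguous range of data management numbers. The range list, the block layout, and the names `find_block` and `lookup` are hypothetical and serve only to illustrate the "identify block, read whole block, then search" order of operations.

```python
from typing import List, Optional, Tuple

# Assumed layout: ranges[i] = (first_no, last_no) covered by management block i,
# blocks[i] = list of (data_management_no, payload) stored in that block.
Range = Tuple[int, int]
Row = Tuple[int, bytes]


def find_block(ranges: List[Range], number: int) -> Optional[int]:
    """Identify the management block whose number range contains the request."""
    for index, (first_no, last_no) in enumerate(ranges):
        if first_no <= number <= last_no:
            return index
    return None


def lookup(ranges: List[Range], blocks: List[List[Row]], number: int) -> Optional[bytes]:
    """Read every row of the identified block, then search it for the number."""
    index = find_block(ranges, number)
    if index is None:
        return None
    rows = list(blocks[index])  # the whole block is read, not a single row
    for row_no, payload in rows:
        if row_no == number:
            return payload
    return None


ranges = [(1, 3), (4, 6)]
blocks = [[(1, b"a"), (2, b"b"), (3, b"c")], [(4, b"d"), (5, b"e"), (6, b"f")]]
found = lookup(ranges, blocks, 5)
```

Since the whole block is already read to serve one record, returning its neighbours in the same request costs little extra, which is the motivation for requesting subsequent data together with the specific data.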

The data processing apparatus according to claim 1, wherein the database management unit,
when dividing input data into a plurality of divided data items, adds to each divided data item a division count, which is the number of divided data items, and a division number, which is sequential among the divided data items, and outputs the divided data items to the database system,
wherein the database system,
when a plurality of divided data items divided by the database management unit are input, assigns a data arrangement number to each divided data item in order of division number, and stores each divided data item, together with its division count and division number, in order of data management number, and
wherein the database management unit
determines whether a division number is added to the specific data input from the database system,
when a division number is added to the specific data, compares the division count added to the specific data with the number of items of the specific data and subsequent data, and judges whether as many data items as the division count have already been input, and,
when as many data items as the division count have already been input, merges the specific data and subsequent data amounting to the division count to obtain the data as it was before division.
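
The tagging and merge check recited in this claim might look like the following sketch. It is illustrative only: `Piece`, `tag_pieces`, and `try_merge` are hypothetical names, and the sketch assumes that the specific data is the first divided data item (division number 1) of its input.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Piece:
    payload: bytes
    part_no: Optional[int] = None     # division number; None if the input was not divided
    part_count: Optional[int] = None  # division count; None if the input was not divided


def tag_pieces(chunks: List[bytes]) -> List[Piece]:
    """Attach the division count and a sequential division number to each chunk."""
    if len(chunks) == 1:
        return [Piece(chunks[0])]  # undivided input carries no division number
    return [Piece(c, n, len(chunks)) for n, c in enumerate(chunks, start=1)]


def try_merge(fetched: List[Piece]) -> Optional[bytes]:
    """fetched[0] is the specific data, the rest are subsequent data.

    Returns the data before division, or None when fewer pieces than the
    division count have been received so far.
    """
    if not fetched:
        raise ValueError("no data received")
    first = fetched[0]
    if first.part_no is None:
        return first.payload                 # was never divided
    if len(fetched) < first.part_count:
        return None                          # some pieces are still missing
    return b"".join(p.payload for p in fetched[:first.part_count])


pieces = tag_pieces([b"ab", b"cd", b"e"])
merged = try_merge(pieces + [Piece(b"unrelated")])
```

Subsequent data beyond the division count belongs to other inputs, so the sketch ignores it when merging.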

The data processing apparatus according to claim 6, wherein the database management unit,
when the comparison of the division count added to the specific data with the number of items of the specific data and subsequent data shows that fewer data items than the division count have been input, requests the database system to extract as many subsequent data items as are missing relative to the division count, and
receives as input, from the database system, the missing subsequent data items, and merges the specific data and subsequent data amounting to the division count to obtain the data as it was before division.
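
The follow-up request for missing pieces can be sketched as a second fetch. In this illustration, `fetch` is a hypothetical callable standing in for an extraction request to the database system; it takes a starting data management number and a count and returns `(payload, division_count)` rows, with a division count of 1 assumed for undivided data.

```python
from typing import Callable, List, Tuple

Row = Tuple[bytes, int]  # (payload, division count); layout is an assumption


def fetch_whole(fetch: Callable[[int, int], List[Row]],
                start_no: int, request_count: int) -> bytes:
    """Fetch the specific data plus subsequent data, then top up if pieces are missing."""
    rows = fetch(start_no, request_count)
    if not rows:
        raise LookupError("specific data not found")
    part_count = rows[0][1]
    missing = part_count - len(rows)
    if missing > 0:
        # Request only the pieces that the first request did not return.
        rows = rows + fetch(start_no + len(rows), missing)
    if len(rows) < part_count:
        raise LookupError("divided data is incomplete")
    return b"".join(payload for payload, _ in rows[:part_count])


# Toy data: one input divided into five pieces numbered 1..5.
table = {n: (bytes([96 + n]), 5) for n in range(1, 6)}
calls: List[Tuple[int, int]] = []


def toy_fetch(start_no: int, count: int) -> List[Row]:
    calls.append((start_no, count))
    return [table[n] for n in range(start_no, start_no + count) if n in table]


whole = fetch_whole(toy_fetch, start_no=1, request_count=2)
```

When the request count is at least the division count, the first request already returns everything and no second request is issued.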

The data processing apparatus according to claim 1, wherein the database management unit
uses, as the upper limit data size, the maximum data size that the database system can manage as a single record.

A data processing method comprising:
a data processing step in which a computer, when the data size of input data is less than or equal to a predetermined upper limit data size, outputs the input data without dividing it, and, when the data size of the input data exceeds the upper limit data size, divides the input data into a plurality of divided data items each having a data size less than or equal to the upper limit data size and outputs the plurality of divided data items;
a data storage step in which the computer assigns a sequential data management number to each data item output by the data processing step, and causes a storage device, which manages data by partitioning it into management blocks each made up of a plurality of data items, to store the data items in order of data management number;
a step in which the computer, before requesting the storage device to extract data, determines, as a data extraction request count, the number of data items whose extraction is to be requested from the storage device, based on the total number of divided data items stored in the storage device and the number of management blocks of the database system;
a step in which the computer requests the storage device to extract, to the extent of the data extraction request, specific data to which a specific data management number is assigned together with subsequent data that follows the specific data in data management number order; and
a step in which the computer receives as input, from the storage device, the specific data and subsequent data to the extent of the data extraction request.

A program that causes a computer to execute:
a data processing process that, when the data size of input data is less than or equal to a predetermined upper limit data size, outputs the input data without dividing it, and, when the data size of the input data exceeds the upper limit data size, divides the input data into a plurality of divided data items each having a data size less than or equal to the upper limit data size and outputs the plurality of divided data items;
a data storage process that assigns a sequential data management number to each data item output by the data processing step, and causes a storage device, which manages data by partitioning it into management blocks each made up of a plurality of data items, to store the data items in order of data management number;
a process that, before requesting the storage device to extract data, determines, as a data extraction request count, the number of data items whose extraction is to be requested from the storage device, based on the total number of divided data items stored in the storage device and the number of management blocks of the database system;
a process that requests the storage device to extract, to the extent of the data extraction request, specific data to which a specific data management number is assigned together with subsequent data that follows the specific data in data management number order; and
a process that receives as input, from the storage device, the specific data and subsequent data to the extent of the data extraction request.