JP6626198B2

JP6626198B2 - Management device, execution environment setting method, stream data processing system

Info

Publication number: JP6626198B2
Application number: JP2018524589A
Authority: JP
Inventors: 隼之土田; 常之今木
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2016-06-27
Filing date: 2016-06-27
Publication date: 2019-12-25
Anticipated expiration: 2036-06-27
Also published as: WO2018002976A1; JPWO2018002976A1

Description

The present invention relates generally to data analysis.

In recent years, demand has grown for IoT systems that connect large numbers of diverse things such as computers and sensors, analyze log data obtained from computers and network devices to predict network load, and monitor and maintain equipment.

In network analysis, basic statistics such as the average and maximum response time of network communication are generally calculated at fixed intervals, and the calculated values are then used for advanced analysis such as load prediction.

For example, when a company analyzes its in-house network, it analyzes access logs that record the network accesses of the computers connected to the network, calculates basic statistics such as the average and maximum response time of network communication at fixed intervals, and uses the calculated values for advanced analysis such as load prediction. In a factory, a conceivable system measures the temperature and vibration of equipment and notifies an administrator when an abnormal value is detected.

When data is processed by a conventional database management system (DBMS), data that arrives from moment to moment (stream data) must first be converted into a format that the database can search and then stored in a secondary storage device. This conversion and storage take considerable time and are more efficient when the data is processed in bulk, so they are often run as daily batches, which makes real-time data analysis impossible. One reason bulk processing is more efficient is that adding entries to a secondary index, as commonly used in conventional databases, in a single batch rather than one record at a time reduces the amount of access to secondary storage, which is costly in time. However, rather than storing the log data described above in a conventional database and processing it in daily batches, it is preferable to use stream data processing, which processes the data in real time while it is still fresh. For network load prediction, for example, predicting when the network will become overloaded and switching or reinforcing network devices makes it possible to prevent the network from going down. Likewise, the earlier an equipment failure in a factory is detected, the more the resulting loss can be reduced.

However, building a dedicated real-time processing application for each stream data processing task lengthens the development period and makes it difficult to respond quickly to changes in the processing. General-purpose stream data processing systems and stream data processing queries have therefore been proposed. For example, Non-Patent Document 1 discloses a stream data processing system.

As a technique related to stream data processing systems, Patent Document 1 proposes a method of reducing the number of communications between devices in stream data processing.

US Patent US 9,141,677 B2

Arvind Arasu, Shivnath Babu, Jennifer Widom: "The CQL continuous query language: semantic foundations and query execution.", VLDB J. (2006)

When the network load of a large company is calculated, tens of thousands to hundreds of thousands of computers connected to the network are monitored. To analyze the log data of these computers, the log data must be transmitted from each computer or network device to a server or the like. To reduce network load and the power consumed by communication, it is desirable to lengthen the interval at which log data is transmitted from computers and network devices to the server and thereby reduce the number of transmissions.

In addition, the wireless sensors commonly used in IoT systems are often battery-powered, so it is desirable to lengthen the sensor data transmission interval and reduce the number of transmissions, which lowers the power consumed by communication and extends operating time. A longer sensor data transmission interval also increases the number of slave units that one wireless master unit can handle, which can lower the cost of the entire system. This is because, under the widely used CSMA/CD protocol, in which a transmission is interrupted when multiple transmissions overlap and is retried after a certain time, overlaps become less likely as the communication interval grows.

However, simply lengthening the data transmission interval delays the output of results, and processing can no longer be performed in real time. The data transmission interval therefore has to be calculated from the trade-off between the result output delay and the reduction in the number of communications.

However, the prior art does not consider the data transmission interval in stream data processing. As a result, a short transmission interval may be used even when data could be sent at a longer interval and still be processed in real time. Patent Document 1 proposes a method of reducing the number of communications between devices in stream data processing by processing sets of queries that communicate heavily with each other on the same device. Because that method does not address the communication interval, it cannot be expected to reduce network load or save power by lengthening the stream data transmission interval and reducing the number of transmissions.

A representative embodiment of the present invention is a management device connected via a network to a first processing device that transmits stream data and to a second processing device that executes processing, based on queries, on the stream data received from the first processing device. The management device includes a processor and a storage unit. The storage unit stores information about a query graph composed of a plurality of the queries, and at least one of the queries constituting the query graph includes information about processing that executes the processing of the stream data at a predetermined interval. The processor assigns one or more of the queries to the second processing device based on the query graph, calculates, based on the predetermined interval and the query graph, a data transmission interval that is the interval at which the first processing device transmits the stream data to the second processing device, and transmits information about the data transmission interval to the first processing device.

The present invention makes it possible to reduce the power consumption of the entire system and the network load within a range that does not affect the output of processing results by the stream data processing system.

The drawings show the following:
A configuration of the entire system according to the first embodiment.
A flow of stream data processing according to the first embodiment.
An example of a query graph.
An example of log data.
An example of a query graph in which allowable transmission intervals are set.
An example of a straight-line query graph in which allowable transmission intervals are set.
An example of an expanded query graph.
An example of a method for calculating an allowable transmission interval.
A flow of the data transmission interval calculation processing according to the first embodiment.
An example of a method for calculating the allowable transmission interval of the entire query graph.
An example of a query graph expanded into straight lines after the allowable transmission intervals of the queries have been calculated.
An example of a method for calculating the allowable transmission interval of the entire query graph.
An example of a query graph expanded into straight lines after the allowable transmission intervals of the queries have been calculated.
A flow of the query group assignment processing.
An example of the relationship between servers and assigned query groups.
An example of configuration information.
A flow of the query group extraction processing.
An example of the relationship between servers and assigned query groups.
An example of the relationship between servers and assigned query groups.
An example of an output device.
An example of a query graph.
A flow of the data transmission interval calculation processing according to the second embodiment.

Several embodiments are described below with reference to the drawings.

FIG. 1 shows an embodiment of the present invention and illustrates an example of a stream data processing system.

When a query graph is executed, the management server 100 transmits to the computer 300 a data transmission interval, which is the interval at which the computer 300 transmits the log data 341 to the server 200, and assigns to the server 200 a query group that executes data processing on the log data 341. Based on the execution results of the query groups on the servers 200, it then outputs the execution result of the query graph to the outside. The management server 100, one or more servers 200, and one or more computers 300 are connected via a communication network 400. The management server 100 and the servers 200 need not be physical servers and may be virtual machines, as long as they perform data processing. As the protocol for communication via the communication network 400, for example, FC (Fibre Channel), SCSI (Small Computer System Interface), or TCP/IP (Transmission Control Protocol/Internet Protocol) may be adopted. IEEE 802.11, IEEE 802.15.1, or IEEE 802.15.4, for example, may also be adopted.

As its hardware configuration, the management server 100 includes a memory 110, a storage device 120, a processor 130, a network interface 140, an input device 150, and an output device 160.

The processor 130 executes programs stored in the memory 110. The functions of the management server 100 are realized by the processor 130 executing the programs. In the following, when processing is described with a functional unit as the subject, this means that the processor 130 is executing the program that realizes that functional unit.

The network interface 140 is an interface for connecting to other devices via a network.

The storage device 120 stores one or more query graphs 121, which describe the processing to be performed on stream data, and configuration information 122. A query graph arranges a plurality of queries into a tree; it indicates the processing content of each query and the connections between queries, and includes information such as the execution order of the processing set in the queries. FIG. 3 shows an example of a query graph. For example, the query 3003 receives the execution result of the query 3002 and then performs its own processing. The execution result of the query 3003 is passed to the query 3004. The storage device 120 may be a storage system having a controller and a plurality of storage media. It may also be a general computer having a storage medium, or the storage medium itself. Examples of the storage medium include an HDD (Hard Disk Drive) and an SSD (Solid State Drive).
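As a rough illustration of the query graph described above, the following Python sketch stores each query with its processing description and the IDs of the queries that receive its result. The class name `Query`, its fields, and the helper functions are assumptions made for this example and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """One node of a query graph (field names are illustrative)."""
    query_id: int
    description: str
    # IDs of the queries that receive this query's execution result.
    downstream: list[int] = field(default_factory=list)


def build_graph(queries: list[Query]) -> dict[int, Query]:
    """Index queries by ID so that connections can be followed."""
    return {q.query_id: q for q in queries}


def execution_order(graph: dict[int, Query], start: int) -> list[int]:
    """Follow connections depth-first from `start`, returning IDs in visiting order."""
    order, stack, seen = [], [start], set()
    while stack:
        qid = stack.pop()
        if qid in seen:
            continue
        seen.add(qid)
        order.append(qid)
        # Reverse so that the first-listed downstream query is visited first.
        stack.extend(reversed(graph[qid].downstream))
    return order


# The 3002 -> 3003 -> 3004 chain mentioned in the text.
graph = build_graph([
    Query(3002, "extract normal data", [3003]),
    Query(3003, "maximum response time per device", [3004]),
    Query(3004, "keep maximum response time >= 100 ms", []),
])
print(execution_order(graph, 3002))
```

Storing only the downstream IDs keeps the connection relationship explicit while leaving the processing content of each query free-form.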

The input device 150 is used to input data. Examples of the input device 150 include a keyboard, a mouse, a touch panel, a numeric keypad, and a scanner. The output device 160 outputs data. Examples of the output device 160 include a display and a printer.

The memory 110 stores programs that realize a transmission interval calculation unit 111, a transmission interval transmission unit 112, a query group assignment unit 113, and a query group execution result processing unit 114. The function of each unit is described later.

The server 200 executes the processing of the query group assigned by the management server 100 on the log data 341 transmitted from the computer 300. As its hardware configuration, the server 200 includes a network interface 210, a processor 220, and a memory 230. The server 200 need not be a physical server and may be a virtual machine.

The network interface 210 is an interface for connecting to other devices via a network.

The processor 220 executes programs stored in the memory 230. The functions of the server 200 are realized by the processor 220 executing the programs.

The memory 230 stores a program that realizes a query group execution unit 231. The query group execution unit 231 executes the processing of the query group assigned by the management server 100 on the log data 341 received from the computer 300. The processing result is output to the management server 100 or to another server 200.

The computer 300 transmits the log data 341 to the server 200 at the data transmission interval received from the management server 100. As its hardware configuration, the computer 300 includes a network interface 310, a processor 320, a memory 330, and a storage device 340. The computer 300 may be any device that accumulates stream data, and may be, for example, a sensor device in a factory.

The network interface 310 is an interface for connecting to other devices via a network.

The processor 320 executes programs stored in the memory 330. The functions of the computer 300 are realized by the processor 320 executing the programs.

The memory 330 stores a program that realizes a data transmission unit 331. The data transmission unit 331 transmits the log data 341 to the server 200 based on the data transmission interval received from the management server 100.
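One way to picture the behavior of the data transmission unit 331 is a buffer that releases accumulated records once per data transmission interval. The sketch below is only an illustration under assumptions: the class `BatchingSender`, the `send` callback, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are hypothetical and are not described in the patent.

```python
class BatchingSender:
    """Buffers records and releases them in batches once per interval."""

    def __init__(self, interval_s: float, send):
        self.interval_s = interval_s
        self.send = send          # callable that receives a list of records
        self.buffer = []
        self.last_flush = None    # time of the previous transmission

    def add(self, record, now: float) -> None:
        """Store a record; transmit the buffer if one interval has elapsed."""
        if self.last_flush is None:
            self.last_flush = now
        self.buffer.append(record)
        if now - self.last_flush >= self.interval_s:
            self.send(self.buffer)
            self.buffer = []      # start a fresh list so sent batches stay intact
            self.last_flush = now


sent = []
sender = BatchingSender(interval_s=2.0, send=sent.append)
for t in [0.0, 0.5, 1.0, 2.0, 2.5, 4.0]:
    sender.add({"t": t}, now=t)
print([len(batch) for batch in sent])
```

With a 2-second interval, the six records in this example leave the sender in two transmissions rather than six, which is the reduction in communication count that the longer interval is meant to achieve.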

The storage device 340 stores the log data 341, which is a log of the computer's accesses to the network. The storage device 340 may be a storage system having a controller and a plurality of storage media. It may also be a general computer having a storage medium, or the storage medium itself. Examples of the storage medium include an HDD (Hard Disk Drive) and an SSD (Solid State Drive).

FIG. 2 is an example of a flow of executing stream data processing, based on the query graph 121, on the log data 341 accumulated in the computer 300.
(S201) When processing based on the query graph 121 is started for the log data 341, the transmission interval calculation unit 111 calculates the data transmission interval from the computer 300 to the server 200 based on the query graph 121. The method of calculating this interval is described later.
(S202) The transmission interval transmission unit 112 transmits the calculated data transmission interval to the computer 300. The computer 300 transmits the log data 341 to the server 200 based on the received data transmission interval.
(S203) The query group assignment unit 113 assigns the query groups in the query graph 121 to the servers 200 based on the configuration information 122, the query graph 121, and the calculated data transmission interval. The method of assigning query groups to the servers 200 is described later.
(S204) The query group execution unit 231 executes the assigned query group on the log data 341 transmitted by the computer 300.
(S205) The query group execution result processing unit 114 processes the execution results of the query groups of the servers 200 as the execution result of the query graph 121.
(S206) The query group execution result processing unit 114 outputs the execution result of the query graph 121 to the output device 160 and ends the processing.

In S205 and S206, the execution results of the query groups on the servers 200 are processed and output by the management server 100. Alternatively, a server 200 may output its processing result to another server 200, or a server 200 may have its own output device and output its processing result to that device.

The method by which the transmission interval calculation unit 111 calculates, in S201, the data transmission interval of the log data 341 from the computer 300 to the server 200 based on the query graph 121 is now described in detail.

FIG. 3 is an example of a query graph for network analysis. In network analysis, basic statistics such as the average and maximum response time of network communication are generally calculated at fixed intervals, and the calculated values are then used for advanced analysis such as load prediction. An example in which the basic statistics are calculated by stream data processing is described below.

The query graph 121 of FIG. 3 is a stream data processing query graph that reads the network log to be analyzed and calculates, for each device, the maximum response time of normal data and the number of communication errors. It consists of nine queries. Reference numerals 3001, 3002, 3003, 3004, 3005, 3006, 3007, 3008, and 3009 denote queries, and the arrows denote the connections between queries. The processing performed by each query is as follows. The query 3001 receives the log data 341 to be analyzed, the query 3002 extracts only normal data from the received data, and the query 3003 calculates the maximum response time for each device every 4 seconds. The query 3004 extracts from that result only the data whose maximum response time is 100 milliseconds or more, and the query 3005 outputs the maximum response time result to the output device 160. The query 3006 extracts only communication errors from the data read by the query 3001, and the query 3007 calculates the number of errors for each device every 30 seconds. From the result of the query 3007, the query 3008 extracts only the data of devices whose total error count is 5 or more. The query 3009 outputs the extraction result to the output device 160.
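The two branches of this query graph can be approximated in a few lines of Python. This is a minimal sketch, not the patent's implementation: it assumes timestamps given as integer seconds, non-overlapping (tumbling) windows, and the status codes 200 (normal) and 50 (communication error) used in the description of FIG. 4. The function names, dictionary keys, and sample records are invented for the example.

```python
from collections import defaultdict

NORMAL, ERROR = 200, 50  # status codes as used in the description of FIG. 4


def max_response_branch(records, window_s=4, threshold_ms=100):
    """Queries 3002-3004: normal data, per-device maximum per window, threshold filter."""
    maxima = defaultdict(int)
    for r in records:
        if r["status"] == NORMAL:                    # 3002: keep normal data only
            key = (r["t"] // window_s, r["id"])      # 3003: (window index, device)
            maxima[key] = max(maxima[key], r["response_ms"])
    # 3004: keep only maxima of at least threshold_ms
    return {k: v for k, v in maxima.items() if v >= threshold_ms}


def error_count_branch(records, window_s=30, min_errors=5):
    """Queries 3006-3008: errors, per-device count per window, threshold filter."""
    counts = defaultdict(int)
    for r in records:
        if r["status"] == ERROR:                     # 3006: keep communication errors only
            counts[(r["t"] // window_s, r["id"])] += 1   # 3007
    # 3008: keep only devices with at least min_errors errors
    return {k: v for k, v in counts.items() if v >= min_errors}


# Invented sample data: three normal records and five errors from device 3.
records = [
    {"t": 0, "id": 1, "response_ms": 70, "status": NORMAL},
    {"t": 1, "id": 2, "response_ms": 250, "status": NORMAL},
    {"t": 5, "id": 1, "response_ms": 120, "status": NORMAL},
] + [{"t": t, "id": 3, "response_ms": 0, "status": ERROR} for t in range(5)]

print(max_response_branch(records))
print(error_count_branch(records))
```

Both functions consume the same input list, mirroring how the queries 3002 and 3006 both read the data received by the query 3001.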

In particular, the query 3001 receives the log data 341 from the computer 300, and the queries 3005 and 3009 output the results of the processing executed in the query graph.

FIG. 4 is an example of the log data 341, which is the input data of the processing example. The log data 341 has a plurality of records. Each record has TIMESTAMP 4001, ID 4002, RESPONSE_TIME 4003, and STATUS_CODE 4004 as data attributes.

TIMESTAMP 4001 indicates the date and time at which the communication log entry was generated. ID 4002 indicates information that identifies the device that performed the logged communication. RESPONSE_TIME 4003 indicates the response time of the communication. STATUS_CODE 4004 indicates the status code of the communication.

For example, the record 4005 in FIG. 4 indicates that the communication log generation time (TIMESTAMP 4001) is "08:03:01:01", the device identification information (ID 4002) is "1", the communication response time (RESPONSE_TIME 4003) is "70", and the status code (STATUS_CODE 4004) is "200". Here, the status code "200" indicates that the data was communicated normally, and "50" indicates that a communication error occurred.

The records 4005 and 4007 have the status code "200", so their data was communicated normally and they are processed on the 3002 side of the branch in the query graph 121. The record 4005 has a response time of 70 milliseconds and is therefore not extracted by the query 3004, whereas the record 4007, with a response time of 250 milliseconds, goes on to be processed by the query 3005. The record 4006 has the status code "50", which indicates that a communication error occurred, so it is processed on the 3006 side of the branch in the query graph 121.
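The routing of these three records can be traced with a small sketch. It deliberately treats each record in isolation and ignores the windowed aggregation performed by the queries 3003 and 3007, so it is a simplification. The class `LogRecord`, the function `route`, and every field value not stated in the text (the ID and timestamp of records 4006 and 4007, and the response time of record 4006) are assumptions made only so the example runs.

```python
from dataclasses import dataclass

NORMAL, ERROR = 200, 50  # status codes as described for FIG. 4


@dataclass
class LogRecord:
    timestamp: str          # TIMESTAMP 4001
    device_id: int          # ID 4002
    response_time_ms: int   # RESPONSE_TIME 4003
    status_code: int        # STATUS_CODE 4004


def route(record: LogRecord, threshold_ms: int = 100) -> str:
    """Return which part of the FIG. 3 graph a single record ends up in."""
    if record.status_code == NORMAL:
        if record.response_time_ms >= threshold_ms:
            return "3002 branch: passes 3004, output by 3005"
        return "3002 branch: filtered out by 3004"
    if record.status_code == ERROR:
        return "3006 branch"
    return "unhandled status"


rec_4005 = LogRecord("08:03:01:01", 1, 70, NORMAL)   # values stated in the text
rec_4006 = LogRecord("08:03:01:02", 2, 0, ERROR)     # only the status code is from the text
rec_4007 = LogRecord("08:03:01:03", 3, 250, NORMAL)  # only 250 ms and status 200 are from the text

for rec in (rec_4005, rec_4006, rec_4007):
    print(route(rec))
```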

The queries 3003 and 3007 include RSTREAM processing, which accumulates data and outputs it every few seconds. Therefore, instead of transmitting and processing the data one item at a time, the data can be transmitted in batches at the RSTREAM time interval and then processed (for example, to calculate statistics) without affecting the time interval of result output. The interval up to which transmission can be lengthened in this way is called the allowable transmission interval.

FIG. 5 is an example of the query graph of FIG. 3 after an allowable transmission interval has been set for each query. Reference numerals 5001, 5002, 5003, 5004, 5005, 5006, 5007, 5008, and 5009 denote queries. By reading the query graph of FIG. 3, an allowable transmission interval of 4 seconds is set for the query 5003 and an allowable transmission interval of 30 seconds is set for the query 5007.

Next, a method of calculating the allowable transmission interval of an entire query graph whose queries have allowable transmission intervals set is described. The allowable transmission interval of the entire query graph is an allowable transmission interval that applies to the query graph as a whole and is calculated from the allowable transmission intervals of the individual queries. Once it has been calculated, the exchange of data between queries can be lengthened up to this interval without delaying data output. This allowable transmission interval of the entire query graph is called the overall allowable transmission interval. The method of calculating it is described below.

An example of how to set the overall allowable transmission interval of a query graph that forms a straight line is described with reference to FIG. 6. The query graph 601 is a straight-line query graph consisting of the queries 6001, 6002, 6003, and 6004, and allowable transmission intervals are set for the queries 6002 and 6003. The query 6001 receives data, and the query 6004 outputs the data processing result. In this case, transmitting the data in batches every 2 seconds, which is the minimum of the allowable transmission intervals of the queries 6002 and 6003, does not delay data processing. The overall allowable transmission interval of the query graph 601 is therefore 2 seconds.
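The straight-line rule amounts to taking the minimum over the queries that have an interval set. The sketch below assumes intervals are given in seconds and that `None` marks a query without a set interval. The function name is invented, and the value 6 used for the larger of the two intervals is a placeholder, since the text only states that the minimum is 2 seconds.

```python
from typing import Optional


def linear_overall_interval(intervals: list[Optional[float]]) -> Optional[float]:
    """Smallest allowable interval among queries that have one; None if no query has one."""
    set_values = [i for i in intervals if i is not None]
    return min(set_values) if set_values else None


# Queries 6001..6004 in order; only 6002 and 6003 carry an interval.
# The value 6 is a hypothetical placeholder for the larger interval.
print(linear_overall_interval([None, 2, 6, None]))  # 2
```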

FIG. 7 is an example of the query graph of FIG. 3 after each path from a query that outputs processing results back to the query that receives data has been expanded into a straight line. The straight line A 701 and the straight line B 702 are the expanded paths. An allowable transmission interval of 4 seconds is set for the query 5003, and an allowable transmission interval of 30 seconds is set for the query 5007. The greatest common divisor of these allowable transmission intervals, 2 seconds, is the allowable transmission interval of the entire query graph. In this case, the query 5001, which receives the log data sent from the computer 300 to the server 200, can receive 2 seconds' worth of data at once, corresponding to the overall allowable transmission interval, without delaying data output. Receiving 2 seconds of data at a time instead of receiving each item individually reduces the number of communications and thus the network load and power consumption.
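The expansion into straight lines and the greatest-common-divisor step can be sketched as follows. This is one possible reading, stated as an assumption: each expanded path takes the smallest interval set on it (the straight-line rule described for FIG. 6), and the paths are then combined by GCD. The edge list is reconstructed from the description of FIG. 3, paths are enumerated from the receiving query toward the output queries (which yields the same set of paths as expanding from the outputs), and intervals are assumed to be whole seconds. All names are illustrative.

```python
from math import gcd

# Edges of the FIG. 5 graph as described in the text (input side -> output side).
EDGES = {
    5001: [5002, 5006],
    5002: [5003], 5003: [5004], 5004: [5005], 5005: [],
    5006: [5007], 5007: [5008], 5008: [5009], 5009: [],
}
ALLOWABLE_S = {5003: 4, 5007: 30}  # per-query allowable transmission intervals (seconds)


def expand_paths(edges, start):
    """Return every path from `start` to a query with no outgoing edge."""
    if not edges[start]:
        return [[start]]
    return [[start] + rest for nxt in edges[start] for rest in expand_paths(edges, nxt)]


def overall_interval(edges, allowable, start):
    """GCD over the per-path intervals; returns 0 if no query has an interval set."""
    result = 0  # gcd(0, x) == x, so 0 is a neutral starting value
    for path in expand_paths(edges, start):
        on_path = [allowable[q] for q in path if q in allowable]
        if on_path:
            result = gcd(result, min(on_path))
    return result


print(expand_paths(EDGES, 5001))
print(overall_interval(EDGES, ALLOWABLE_S, 5001))  # 2
```

Using the GCD rather than the minimum means that every per-path interval is a whole multiple of the batch length, so each window boundary coincides with the end of a batch.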

When the query graph has no branches, the allowable transmission interval of its single straight line can be used as the overall allowable transmission interval of the query graph.

Next, a method of calculating, within a query graph, the allowable transmission interval of a query that has none set from the queries that do have one set is described with reference to FIG. 8.

FIG. 8A shows an example of the case where, among the queries forming a branch, there are multiple input-side queries, the input-side queries have allowable transmission intervals set, and the output-side query has none. 801, 802, and 803 are queries. Query 801 has an allowable transmission interval of 2 seconds, and query 802 has an allowable transmission interval of 6 seconds. In this case, the greatest common divisor of the input-side intervals is used as the allowable transmission interval of the output-side query. Accordingly, 2 seconds, the greatest common divisor of the 2 seconds of 801 and the 6 seconds of 802, is set as the allowable transmission interval of 803.

FIG. 8B shows an example of the case where, among the queries forming a branch, there are multiple output-side queries, the input-side query has no allowable transmission interval set, and the output-side queries do. 804, 805, and 806 are queries. Query 805 has an allowable transmission interval of 2 seconds, and query 806 has an allowable transmission interval of 3 seconds. In this case, the greatest common divisor of the output-side intervals is used as the allowable transmission interval of the input-side query. Accordingly, 1 second, the greatest common divisor of the 2 seconds of 805 and the 3 seconds of 806, is set as the allowable transmission interval of 804.

FIG. 8C shows an example of the case where, among the queries forming a branch, there are multiple input-side queries, the input-side queries have no allowable transmission interval set, and the output-side query does. 807, 808, and 809 are queries. Query 809 has an allowable transmission interval of 2 seconds. As in FIG. 8C, the allowable transmission interval of the output-side query is used unchanged as the allowable transmission interval of each input-side query. Accordingly, the 2 seconds of 809 is set as the allowable transmission interval of 807 and 808.

FIG. 8D shows an example of the case where, among the queries forming a branch, there are multiple output-side queries, the input-side query has an allowable transmission interval set, and the output-side queries have none. 810, 811, and 812 are queries. Query 810 has an allowable transmission interval of 1 second. In this case, the allowable transmission interval of the input-side query is used unchanged as the allowable transmission interval of each output-side query. Accordingly, the 1 second of 810 is set as the allowable transmission interval of 811 and 812.

The calculation described above may be applied again to queries that still have no allowable transmission interval, using the intervals of queries that were set by this method. In addition, when the query immediately before or after a query with an allowable transmission interval has none set, and there is only one such preceding or following query, the allowable transmission interval of that neighboring query may be set equal to that of the query that has one.
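The four branch rules of FIG. 8 and their repeated application can be sketched as a fixed-point loop. This is an illustrative reading of the text, not the patent's implementation: the edge-list representation and the name `propagate_intervals` are assumptions, intervals are assumed to be whole seconds, and on larger graphs the result may depend on the order in which rules fire, which the source does not specify.

```python
from functools import reduce
from math import gcd
from typing import Dict, List, Tuple


def propagate_intervals(
    edges: List[Tuple[int, int]], intervals: Dict[int, int]
) -> Dict[int, int]:
    """Derive missing intervals around branches until nothing changes.

    `edges` holds (input_side_query, output_side_query) pairs.
    `intervals` maps query id -> seconds for queries that already have one.
    """
    preds: Dict[int, List[int]] = {}
    succs: Dict[int, List[int]] = {}
    for up, down in edges:
        succs.setdefault(up, []).append(down)
        preds.setdefault(down, []).append(up)

    result = dict(intervals)
    changed = True
    while changed:
        changed = False
        for q in set(preds) | set(succs):
            for neighbours in (preds.get(q, []), succs.get(q, [])):
                if len(neighbours) < 2:
                    continue  # not a branch on this side
                if q not in result:
                    if all(n in result for n in neighbours):
                        # FIG. 8A / 8B: GCD of the branch-side intervals
                        result[q] = reduce(gcd, (result[n] for n in neighbours))
                        changed = True
                else:
                    # FIG. 8C / 8D: copy this query's interval to unset neighbours
                    for n in neighbours:
                        if n not in result:
                            result[n] = result[q]
                            changed = True
    return result


print(propagate_intervals([(801, 803), (802, 803)], {801: 2, 802: 6})[803])  # 2
print(propagate_intervals([(804, 805), (804, 806)], {805: 2, 806: 3})[804])  # 1
```

Each pass only ever adds intervals, so the loop terminates once no further query can be filled in.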

An example of the flow in which the transmission interval calculation unit 111 sets the data transmission interval based on the query graph 121 is described with reference to FIG. 9.

When the management server 100 receives an instruction to analyze log data, the transmission interval calculation unit 111 starts calculating the interval for sending data from the computer 300 to the server 200. The log data analysis may be triggered by a request from the administrator of the management system of the management server 100 to execute the query graph, or may start when a query graph is newly stored in the storage device 121.
(S901) When the data transmission interval calculation starts, the transmission interval calculation unit 111 sets, for each query in the query graph 121 that performs RSTREAM conversion, the conversion period in seconds as that query's allowable transmission interval.
(S902) In the query graph with allowable transmission intervals set, it is determined from the relationships between preceding and following queries whether there is a query on which an allowable transmission interval can newly be set. If there is, S903 is performed; if not, S904 is performed.
(S903) As described with reference to FIG. 8, the allowable transmission interval of the query is set by referring to the preceding and following queries. Processing then returns to S902.
(S904) It is determined whether the query graph has a branch. If it does, S905 is performed; if not, S906 is performed.
(S905) As described with reference to FIG. 7, the query graph is expanded into straight lines at the branch. Processing then returns to S904.
(S906) For each expanded straight line, the allowable transmission interval of the line is calculated as described with reference to FIG. 6. Specifically, the minimum of the allowable transmission intervals of the queries on the line is used as the allowable transmission interval of the line.
(S907) It is determined whether an allowable transmission interval has been set for every expanded line. If so, S908 is performed; if any line has none, S910 is performed.
(S908) The minimum of the allowable transmission intervals of all expanded lines is set as the allowable transmission interval of the whole query graph. Alternatively, the minimum of the allowable transmission intervals of all queries may be chosen as the allowable transmission interval of the whole query graph.
(S909) The transmission interval calculation unit 111 notifies the transmission interval transmission unit 112 of the allowable transmission interval of the whole query graph as the data transmission interval for log data sent from the computer 300 to the server 200. Processing then ends.
(S910) The transmission interval calculation unit 111 notifies the transmission interval transmission unit 112 that a data transmission interval from the computer 300 to the server 200 cannot be set. Processing then ends. When the allowable transmission interval of the whole query graph cannot be set, data is sent from the computer 300 to the server 200 item by item.

Steps S902 and S903 may be skipped.
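Steps S904 to S910 can be sketched as follows, under the assumption that the query graph is acyclic and is given as a successor map. The names `enumerate_lines` and `overall_interval` are hypothetical, and the sketch follows the minimum-based rule of S906 and S908; it skips S902 and S903, as the text permits.

```python
from typing import Dict, Iterator, List, Optional


def enumerate_lines(succs: Dict[str, List[str]], source: str) -> Iterator[List[str]]:
    """Expand the graph into straight lines: every path from the receiving
    query to a query with no successor (S904/S905). Assumes no cycles."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        following = succs.get(path[-1], [])
        if not following:
            yield path
        for nxt in following:
            stack.append(path + [nxt])


def overall_interval(
    succs: Dict[str, List[str]], source: str, intervals: Dict[str, int]
) -> Optional[int]:
    """Return the interval for the whole graph, or None when some line has
    no interval at all (S910: fall back to sending data item by item)."""
    line_values = []
    for line in enumerate_lines(succs, source):
        on_line = [intervals[q] for q in line if q in intervals]
        if not on_line:
            return None                    # S907 -> S910
        line_values.append(min(on_line))   # S906
    return min(line_values)                # S908


# Hypothetical two-branch graph used only for illustration.
graph = {"recv": ["a", "b"], "a": ["out1"], "b": ["out2"]}
print(overall_interval(graph, "recv", {"a": 2, "b": 3}))
print(overall_interval(graph, "recv", {"a": 2}))
```

The second call returns `None` because the line through `b` carries no interval, which corresponds to the S910 branch.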

The flow up to the point where the transmission interval calculation unit 111 notifies the transmission interval transmission unit 112 of the data transmission interval from the computer 300 to the server 200, based on the query graph 121, is described concretely with reference to FIGS. 10 and 11. In the query graph of FIG. 10, 1001 to 1028 all represent queries. Query 1001 receives data, and queries 1007 and 1008 output processing results.

For the query graph of FIG. 10A, which includes queries that perform RSTREAM conversion, allowable transmission intervals are set on the queries as shown in FIG. 10B. For query 1012, 1 second, the greatest common divisor of the 2-second allowable transmission interval of query 1013 and the 3-second allowable transmission interval of query 1014, is set as the allowable transmission interval. For query 1017, 1 second, the greatest common divisor of the 6 seconds of query 1015, the 4 seconds of query 1016, and the 3 seconds of query 1014, is set as the allowable transmission interval. For query 1018, the 3 seconds of query 1014 is set as the allowable transmission interval. For query 1021, 1 second, the allowable transmission interval of query 1022, is set as the allowable transmission interval.

FIG. 11 shows FIG. 10C expanded into straight lines at the branches of the query graph, arranged from the top as 1101, 1102, 1103, and 1104. For each of the lines 1101, 1102, 1103, and 1104, the allowable transmission interval is calculated as 1 second, the minimum of the allowable transmission intervals of the queries on that line.

The allowable transmission intervals of the lines are 1 second, 1 second, 1 second, and 1 second, so the minimum, 1 second, is calculated as the allowable transmission interval of the whole query graph. The transmission interval calculation unit 111 notifies the transmission interval transmission unit 112 of 1 second as the data transmission interval from the computer 300 to the server 200. Even if the computer 300 sends data to the server 200 in 1-second batches, the output result is not delayed. In this way, the number of communications is reduced within the constraints of the query graph, and network load and power consumption are lower than when data is sent item by item.

Next, a case in which an overall allowable transmission interval cannot be set for the query graph is described with reference to FIGS. 12 and 13. 1201 to 1218 represent queries. Query 1201 receives data, and queries 1207 and 1208 output processing results.

For the query graph of FIG. 12A, which includes queries that perform RSTREAM conversion, allowable transmission intervals are set on the queries as shown in FIG. 12B. No query exists whose allowable transmission interval can be derived from the intervals of the preceding and following queries. FIG. 13 shows FIG. 12B expanded into straight lines at its branches, arranged from the top of the query graph as 1301, 1302, 1303, and 1304. For 1301 and 1302 the allowable transmission interval of the line is calculated as 2 seconds, but for 1303 and 1304 it cannot be calculated. The overall allowable transmission interval of the query graph therefore cannot be set, and the transmission interval calculation unit 111 notifies the transmission interval transmission unit 112 that data is to be sent from the computer 300 to the server 200 item by item.

Next, the process by which the query group assignment unit 113 assigns query groups to the servers 200, based on the query graph 121, the configuration information 122, and the calculated data transmission interval, is described. By assigning query groups so that each server 200 can process its data within the data transmission interval, the query group assignment unit 113 allows the acquired log data to be analyzed and output without delay.

FIG. 14 shows an example of the flow of assigning query groups to the servers 200. The query group assignment process may start when the transmission interval calculation unit 111 has calculated the data transmission interval, or on an instruction from the administrator of the management system of the management server 100.
(S1401) The amount of data sent between each pair of queries in the query graph is calculated.
(S1402) The range from the query at which data output starts up to the query where the amount of data sent between queries is smallest is set as a candidate query group to assign to a server.
(S1403) Based on the server configuration information 122, it is determined whether there is a server that can process the candidate query group within the calculated data transmission interval.
(S1404) If a server that can process the candidate query group within the calculated data transmission interval exists, S1406 is performed; if not, S1405 is performed.
(S1405) The range up to the query with the next-smallest amount of data sent between queries is set as the candidate query group to assign to a server.
(S1406) The candidate query group is assigned to the server determined in S1404.
(S1407) It is determined whether every query in the query graph has been assigned to a server. If so, the query group assignment process ends. If not, S1402 is performed for the queries in the query graph that have not yet been assigned to a query group.

With the above operations, when multiple servers 200 analyze the log data 341 based on the query graph 121, processing can be performed without delaying the data output.

Next, the process of actually assigning query groups to servers is described with reference to FIGS. 15 and 16.

FIG. 15 shows the queries to be assigned. 1501 to 1505 are queries. Using the method of the present invention, the data transmission interval is calculated to be 1 second. FIG. 16 is a concrete example of the configuration information 122 of FIG. 1 and stores, for example, the estimated number of tuples processed by each query, the processing time per tuple, and the estimated processing time. The estimated number of processed tuples can be calculated, for example, from the input tuples and the query information. Specifically, for a query that performs aggregation (such as Group by) at a specific time interval, the product of the numbers of distinct values in the Group by columns, among the tuples input to the query within that interval, is the upper bound on the number of tuples passed to the downstream query. For other queries, the number of tuples processed by the upstream query carries over unchanged as the number processed by the downstream query. Besides Group by, SUM or AVG processing over a single column, for example, also reduces the number of tuples passed to the downstream query. The estimated processing time can be calculated, for example, from the measured processing time of the query.
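The upper bound on tuples leaving a Group by query can be sketched as below. The function name `max_output_tuples` and the sample counts are hypothetical. Capping the result at the number of input tuples is an added assumption that the source does not state; it is included because an aggregation cannot emit more groups than it received tuples.

```python
from math import prod
from typing import Sequence


def max_output_tuples(distinct_counts: Sequence[int], input_tuples: int) -> int:
    """Upper bound on tuples passed downstream by a Group by query.

    `distinct_counts` holds, per Group by column, the number of distinct
    values seen within the aggregation interval. Their product bounds the
    number of groups. The cap at `input_tuples` is an assumption of this
    sketch, not a rule taken from the source.
    """
    return min(input_tuples, prod(distinct_counts))


# Hypothetical example: two Group by columns with 3 and 4 distinct values.
print(max_output_tuples([3, 4], input_tuples=1000))  # 12
```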

First, for the query graph of 15A, the amount of data sent between queries is calculated, and the query group candidate that minimizes the amount of data sent between query groups is found. How this minimum candidate is found is explained later. When all queries in the query graph are assigned to a single query group, no data needs to be sent between query groups, so the data transmission amount is at its minimum.

15B shows an example in which all queries are processed in one query group. First, for 15B, which has the smallest data transmission amount, the processing time is calculated and it is determined whether a server exists that can process it within the allowable transmission interval. The processing time is calculated using the information in FIG. 16. The sum of the estimated processing times of the queries in a query group is the estimated processing time of that query group. In 15B, the total estimated processing time for the data received in 1 second is 2 seconds, which is longer than the calculated data transmission interval. Even if data is received every second, the received data cannot all be processed, and unprocessed data accumulates without limit.

15C is the query group assignment with the next-smallest amount of data sent between query groups after 15B. The total estimated processing time of the query group assigned to server 1 is 1 second, so the input data volume and the processing capacity are balanced. The query group assignment unit 113 then assigns query 1505, which has not yet been assigned to a query group, to a query group on another server. With the above operations, data can be processed without delaying the output of the processing result.
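The comparison of 15B and 15C can be sketched as a search over candidate splits, ordered from the smallest inter-group data volume upward. This is a simplified illustration: the function name `first_feasible_split` is hypothetical, every server is assumed to have the same capacity, and the per-query times are assumed values chosen only so that the totals match the text (2 seconds for 15B, 1 second for server 1 in 15C).

```python
from typing import Dict, List, Optional

Split = List[List[int]]  # one inner list of query ids per query group


def first_feasible_split(
    splits: List[Split], est_time: Dict[int, float], interval: float
) -> Optional[Split]:
    """Return the first split in which every query group's summed estimated
    processing time fits within the data transmission interval.

    `splits` must already be ordered by inter-group data volume, ascending.
    """
    for split in splits:
        if all(sum(est_time[q] for q in group) <= interval for group in split):
            return split
    return None


# Assumed per-query estimated processing times (seconds) per 1 s of input.
est_time = {1501: 0.25, 1502: 0.25, 1503: 0.25, 1504: 0.25, 1505: 1.0}
split_15b = [[1501, 1502, 1503, 1504, 1505]]    # one group, total 2.0 s
split_15c = [[1501, 1502, 1503, 1504], [1505]]  # server 1 total 1.0 s
print(first_feasible_split([split_15b, split_15c], est_time, interval=1.0))
```

The check uses `<=` because the text treats a 1-second total against a 1-second interval as balanced rather than overloaded.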

FIG. 17 shows an example of the flow for extracting query groups with low inter-group communication volume. The query group extraction process may start when the transmission interval calculation unit 111 has calculated the data transmission interval, or on an instruction from the administrator of the management system of the management server 100.
(S1701) In the query graph, the "query group candidate with the largest number of queries that perform RSTREAM processing including Group by on the data input side" is extracted, and its "number of RSTREAM processes including Group by on the data input side" is stored.
(S1702) For every query group candidate that has the same "number of RSTREAM processes including Group by on the data input side" as the extracted query group, it is determined whether it is a "query group candidate in which data makes a round trip between query groups". If a candidate judged to have no round trip exists, S1703 is performed; if not, S1704 is performed.
(S1703) The query group candidate with no round trip is passed to the query group assignment process, and processing ends.
(S1704) It is determined whether there is no "query group candidate whose number of queries performing RSTREAM processing including Group by on the data input side is the next-fewest after the extracted candidate". If there is none, S1705 is performed; if one exists, S1706 is performed.
(S1705) The query group assignment process is notified that no query group candidate without a round trip exists, and processing ends.
(S1706) The "query group candidate whose number of queries performing RSTREAM processing including Group by on the data input side is the next-fewest after the extracted candidate" is extracted. Its "number of RSTREAM processes including Group by on the data input side" is stored, and S1702 is performed.

With the above operations, query groups with low inter-group communication volume can be extracted, which makes it possible to reduce the network load.
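The loop of S1701 to S1706 can be sketched as below. This is one possible reading, with assumed data structures: each candidate is a hypothetical tuple of a name, its count of input-side "RSTREAM processing including Group by" queries, and a flag saying whether data makes a round trip between query groups. The function name `extract_candidates` and the sample candidates are illustrative only.

```python
from typing import List, Tuple

Candidate = Tuple[str, int, bool]  # (name, input-side RSTREAM+Group by count, round trip?)


def extract_candidates(candidates: List[Candidate]) -> List[str]:
    """Walk the counts from largest to smallest (S1701, S1706) and return
    the names of the first candidates found without a round trip (S1702,
    S1703). Return an empty list when no such candidate exists (S1705)."""
    counts = sorted({count for _, count, _ in candidates}, reverse=True)
    for count in counts:
        usable = [
            name
            for name, n, round_trip in candidates
            if n == count and not round_trip
        ]
        if usable:
            return usable
    return []


# Hypothetical candidates used only to exercise the loop.
sample = [("g1", 3, True), ("g2", 2, True), ("g3", 2, False), ("g4", 1, False)]
print(extract_candidates(sample))  # ['g3']
```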

Next, the extraction of query groups with low inter-group communication volume is described with reference to FIGS. 18 and 19. FIG. 18 shows the queries to be assigned. 1801 to 1806 and 1811 to 1816 represent queries. FIG. 19 shows concrete examples of a case where data makes a round trip between query groups and a case where it does not. When all queries in the query graph are assigned to a single query group, no data needs to be sent between query groups, so the data transmission amount is at its minimum.

Processing starts from the case shown in 18A, in which all queries are contained in one query group. 18A is passed to the query group assignment process as the query group with the lowest inter-group communication volume; if the assignment cannot be made, the query group with the next-lowest inter-group transmission volume is extracted. That query group is 18B, which excludes query 1815, an "RSTREAM process including Group by". The query group assignment process is performed for this query group. If the 18B query group cannot be assigned either, the query group with the next-fewest "RSTREAM processes including Group by" is extracted. During this extraction, it is desirable that no data round trip between query groups, as shown in FIG. 19, occurs. Queries 1901 and 1902 are data-output queries.

In 19A of FIG. 19, data makes a round trip between query group 1 and query group 2. In 19B, by contrast, data does not make a round trip between query group 3 and the other query group. In query graph 19A, data is therefore exchanged between groups, that is, between servers, twice before the data is output, whereas in query graph 19B data is exchanged between servers only once. As one way to detect a round trip between query groups, each path from data input to data output is checked to see whether it passes through the same query group more than once; if it does, data is judged to make a round trip between query groups.
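The round-trip check described above can be sketched for a single path. The function name `has_round_trip` and the query and group labels are hypothetical; the path is assumed to be an ordered list of query ids from data input to data output.

```python
from itertools import groupby
from typing import Dict, Sequence


def has_round_trip(path: Sequence[str], group_of: Dict[str, int]) -> bool:
    """Return True when the path enters the same query group more than once.

    Consecutive queries in the same group are collapsed first, so staying
    inside one group does not count as a repeated visit.
    """
    visited = [g for g, _ in groupby(group_of[q] for q in path)]
    return len(visited) != len(set(visited))


# A path like 19A: group 1 -> group 2 -> back to group 1.
print(has_round_trip(["q1", "q2", "q3"], {"q1": 1, "q2": 2, "q3": 1}))  # True
# A path like 19B: the group changes once and never returns.
print(has_round_trip(["q1", "q2", "q3"], {"q1": 3, "q2": 3, "q3": 4}))  # False
```

Applying the function to every input-to-output path and combining the results with `any` gives the judgment for a whole candidate.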

FIG. 20 shows an example in which the administrator of the management server 100 checks the query graph 121 through the output device 160. Query graph 2002 is the query graph for which the data transmission interval is calculated. Calculation result 2003 is the result of calculating the data transmission interval. Tuning point 2004 shows a way to lengthen the data transmission interval.

In this example, calculation result 2003 shows that the data transmission interval is 1 second and that the allowable transmission intervals of line A and line B are each 1 second. Tuning point 2004 shows that setting the allowable transmission intervals of query a and query b to 2 seconds would lengthen the data transmission interval to 2 seconds.

For example, the administrator of the management system of the management server 100 can check the query graph 121 on the interface 2001 and, by modifying queries within the range that satisfies the business requirements to set or change their allowable transmission intervals, control the data transmission interval of the log data. By controlling the interval at which log data is sent from the computer 300 to the server 200, the administrator can configure the system to send log data less often and reduce power consumption.

Next, another embodiment of the method of calculating the data transmission interval will be described with reference to FIGS. 21 and 22.

The overall allowable transmission interval of a query graph can also be calculated by a method other than the straight-line expansion of the first embodiment. Specifically, the paths from the query that receives the data to the query that outputs the processing result are traced through the query graph, and each path is checked for a query on which an allowable transmission interval can be set. If an allowable transmission interval can be set on every path, the greatest common divisor of the allowable transmission intervals of the paths can be used as the allowable transmission interval of the entire process.

FIG. 21 is an example of a query graph after allowable transmission intervals have been set for the queries. The number shown on a query is its allowable transmission interval. Reference numerals 2111 through 2117, 2121 through 2124, and 2131 through 2135 denote queries. An allowable transmission interval of 2 seconds is set for 2114, 1 second for 2115, 1 second for 2116, 2 seconds for 2123, and 3 seconds for 2133.

The data transmission interval is calculated for the processing of computer 1 (2110). Each path from the output 2140, which is the query that outputs the processing result based on the query graph, to 2112, which is the query that receives data, is checked to see whether an allowable transmission interval can be set. There are three paths from 2140 to 2111; an allowable transmission interval of 1 second is set for 2115 and 2116, and 2 seconds for 2114. The greatest common divisor of the allowable transmission intervals of the paths, 1 second, is set as the data transmission interval for the processing of computer 1.

The data transmission interval is calculated for the processing of computer 2 (2120). Each path from the output 2140, which is the query that outputs the processing result based on the query graph, to 2122, which is the query that receives data, is checked to see whether an allowable transmission interval can be set. There is one path from 2140 to 2122, and an allowable transmission interval of 2 seconds is set for 2123. This allowable transmission interval of 2 seconds is set as the data transmission interval for the processing of computer 2.

The data transmission interval is calculated for the processing of computer 3 (2130). Each path from the output 2140, which is the query that outputs the processing result based on the query graph, to 2132, which is the query that receives data, is checked to see whether an allowable transmission interval can be set. There are two paths from 2140 to 2132, and an allowable transmission interval of 3 seconds is set for 2133. However, no allowable transmission interval can be set on the path through 2140, 2134, and 2132, so no data transmission interval can be set for the processing of computer 3.
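The three calculations above can be reproduced with a short sketch. It is illustrative only and rests on several assumptions: the adjacency lists are a simplified, hypothetical topology chosen to give the stated number of paths (the actual edges of FIG. 21 are not reproduced here), intervals are whole seconds, and a path carrying several interval queries is represented by the minimum of them, as in the claim that takes the greatest common divisor of per-path minimum values. All function names are hypothetical.

```python
from functools import reduce
from math import gcd


def all_paths(graph, start, goal):
    """Enumerate every path from start to goal in a DAG given as an adjacency dict."""
    if start == goal:
        return [[goal]]
    paths = []
    for nxt in graph.get(start, []):
        for tail in all_paths(graph, nxt, goal):
            paths.append([start] + tail)
    return paths


def data_transmission_interval(graph, intervals, receive, output):
    """GCD of per-path intervals, or None if some path has no interval query."""
    per_path = []
    for path in all_paths(graph, receive, output):
        on_path = [intervals[q] for q in path if q in intervals]
        if not on_path:
            return None  # this path carries no allowable transmission interval
        per_path.append(min(on_path))
    return reduce(gcd, per_path) if per_path else None


# Allowable transmission intervals in seconds, as stated for FIG. 21.
intervals = {2114: 2, 2115: 1, 2116: 1, 2123: 2, 2133: 3}

# Assumed, simplified topologies (receive query -> ... -> output 2140).
computer1 = {2112: [2114, 2115, 2116], 2114: [2140], 2115: [2140], 2116: [2140]}
computer2 = {2122: [2123], 2123: [2140]}
computer3 = {2132: [2133, 2134], 2133: [2140], 2134: [2140]}

interval1 = data_transmission_interval(computer1, intervals, 2112, 2140)
interval2 = data_transmission_interval(computer2, intervals, 2122, 2140)
interval3 = data_transmission_interval(computer3, intervals, 2132, 2140)
```

With these inputs the sketch yields 1 second for computer 1, 2 seconds for computer 2, and no interval (`None`) for computer 3, in line with the results described in the text.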

The processing performed in this embodiment is described below. FIG. 22 shows an example of the flow of the data transmission interval calculation process.

When the management server 100 receives an instruction to analyze log data, the transmission interval calculation unit 111 starts the process of calculating the data transmission interval from the computer 300 to the server 200. The analysis of the log data may be started by a query graph execution request from the administrator of the management system of the management server 100, or at the time a query graph is newly stored in the storage device 121.
(S2201) When the allowable transmission interval calculation process starts, the transmission interval calculation unit 111 sets the allowable transmission interval of each query in the query graph.
(S2202) The transmission interval calculation unit 111 determines whether the query graph contains a path that has not yet been checked for whether an allowable transmission interval can be set. If the result is affirmative, S2203 is executed; if it is negative, S2204 is executed.
(S2203) The transmission interval calculation unit 111 checks whether an allowable transmission interval can be set for one of the paths of the query graph and, if it can, calculates that allowable transmission interval.
(S2205) The transmission interval calculation unit 111 determines whether an allowable transmission interval has been set for every path. If the result is affirmative, S2206 is executed; if it is negative, S2207 is executed.
(S2206) The transmission interval calculation unit 111 takes the greatest common divisor of the allowable transmission intervals of the paths as the allowable transmission interval of the entire query graph.
(S2207) The transmission interval calculation unit 111 notifies the transmission interval transmission unit 112 of the allowable transmission interval of the entire query graph as the data transmission interval of log data from the computer 300 to the server 200. The process then ends.
(S2208) The transmission interval calculation unit 111 notifies the transmission interval transmission unit 112 that a data transmission interval from the computer 300 to the server 200 cannot be set. The process then ends. When the allowable transmission interval of the entire query graph cannot be set, data is sent from the computer 300 to the server 200 sequentially, as it is generated.

Note that S2205 may be performed without performing S2202 and S2203. This reduces the number of steps needed to calculate the data transmission interval.
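The effect of the notified result on the sending side can be illustrated with a small sketch. It is a hypothetical model, not the data transmission unit of the embodiment: record timestamps are assumed to be in seconds, records falling in the same interval-length window are assumed to be sent together, and `None` stands for the case in which no data transmission interval could be set, so that each record is sent on its own. The function name and sample timestamps are invented for illustration.

```python
def plan_transmissions(record_times, interval):
    """Group record timestamps (seconds) into the batches that are sent together."""
    if interval is None:
        # No interval could be set: every record is sent sequentially.
        return [[t] for t in record_times]
    batches = {}
    for t in record_times:
        slot = int(t // interval)  # index of the window this record falls in
        batches.setdefault(slot, []).append(t)
    return [batches[slot] for slot in sorted(batches)]


# Hypothetical log record timestamps in seconds.
record_times = [0.2, 0.7, 1.1, 1.9, 2.4, 3.5]

sends_sequential = len(plan_transmissions(record_times, None))
sends_1s = len(plan_transmissions(record_times, 1))
sends_2s = len(plan_transmissions(record_times, 2))
```

For these sample timestamps the number of transmissions falls from six (sequential) to four with a 1-second interval and to two with a 2-second interval, which is the kind of reduction in transmission count that the longer interval is intended to achieve.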

The present invention is not limited to the embodiments described above. As an example, a case where the present invention is applied to an IoT system in a factory is described below.

Data may be collected from sensor devices installed in a factory, and equipment failures detected based on the query graph 121. Because sensor devices are installed in hazardous locations, many are battery-powered and must send the acquired data to the server wirelessly. Setting the data transmission interval based on the query graph 121 reduces the number of data transmissions, which suppresses battery consumption and reduces how often batteries must be replaced.

100 Management server
110 Memory
111 Transmission interval calculation unit
112 Transmission interval transmission unit
113 Query group assignment unit
114 Query group execution result processing unit
120 Storage device
121 Query graph
122 Configuration information
130 Processor
140 Network interface
150 Input device
160 Output device
200 Server
210 Network interface
220 Processor
230 Memory
231 Query group execution unit
300 Computer
310 Network interface
320 Processor
330 Memory
331 Data transmission unit
340 Storage device
341 Log data
400 Communication network

Claims

A management device connected via a network to a first processing device that transmits stream data and to a second processing device that executes processing, based on queries, on the stream data received from the first processing device,
the management device comprising a processor and a storage unit, wherein
the storage unit stores information on a query graph composed of a plurality of the queries,
at least one of the queries constituting the query graph includes information on processing that executes processing of the stream data at a predetermined interval, and
the processor
assigns one or more of the queries to the second processing device based on the query graph,
calculates, based on the predetermined interval and the query graph, a data transmission interval that is an interval at which the first processing device transmits the stream data to the second processing device, and
transmits information on the data transmission interval to the first processing device.

The management device according to claim 1, wherein
the query graph includes
a reception query relating to processing in which the second processing device receives the stream data, and
an output query relating to processing in which the second processing device outputs a result of executing processing based on the queries, and
the processor calculates the data transmission interval based on the predetermined interval set in the query that exists on a path between the reception query and the output query in the query graph.

The management device according to claim 2, wherein
the data transmission interval is the greatest common divisor of the predetermined intervals set in the queries.

The management device according to claim 2, wherein
the data transmission interval is the minimum value of the predetermined intervals set in the queries.

The management device according to claim 2, wherein
the data transmission interval is the greatest common divisor of the minimum values of the predetermined intervals of the queries on the respective paths between the reception query and the output query in the query graph.

The management device according to claim 1, wherein
the storage unit stores configuration information including information on the execution time of processing based on the queries by the second processing device, and
the processor assigns one or more of the queries to the second processing device based on the query graph and the configuration information.

The management device according to claim 6, wherein
the processor assigns, based on the query graph and the configuration information, one or more of the queries to the second processing device such that the time required by the second processing device for processing based on the assigned queries falls within the data transmission interval.

An execution environment setting method for stream data processing, performed by a management system connected via a network to a first processing device that transmits stream data and to a second processing device that executes processing, based on queries, on the stream data received from the first processing device, wherein
the management system
assigns one or more of the queries to the second processing device based on a query graph composed of a plurality of the queries,
calculates, based on the query graph and an execution interval that is set in at least one of the queries and is an interval at which processing of the stream data is executed, a data transmission interval that is an interval at which the first processing device transmits the stream data to the second processing device, and
transmits information on the data transmission interval to the first processing device.

The execution environment setting method according to claim 8, wherein
the management system calculates the data transmission interval based on the execution interval set in the query that exists on a path, in the query graph, between a reception query relating to processing in which the second processing device receives the stream data and an output query relating to processing in which the second processing device outputs a result of executing processing based on the queries.

The execution environment setting method according to claim 9, wherein
the management system calculates, based on the execution intervals and the query graph, the data transmission interval as the greatest common divisor of the execution intervals.

The execution environment setting method according to claim 9, wherein
the management system calculates, based on the execution intervals and the query graph, the data transmission interval as the minimum value of the execution intervals.

The execution environment setting method according to claim 9, wherein
the management system
calculates, for each path between the reception query and the output query in the query graph, the minimum value of the execution intervals of the queries on that path, and
calculates the data transmission interval as the greatest common divisor of the calculated minimum values.

The execution environment setting method according to claim 8, wherein
the management system assigns one or more of the queries to the second processing device based on the query graph and on configuration information including information on the execution time of processing based on the queries by the second processing device.

The execution environment setting method according to claim 13, wherein
the management system assigns, based on the query graph and the configuration information, one or more of the queries to the second processing device such that the time required by the second processing device for processing based on the assigned queries falls within the data transmission interval.

A stream data processing system comprising a management device, a first processing device, and a second processing device, wherein
the management device includes a first processor and a first storage unit,
the first storage unit stores information on a query graph composed of a plurality of queries,
at least one of the queries constituting the query graph includes information on processing that executes processing of stream data at a predetermined interval,
the first processor
assigns one or more of the queries to the second processing device based on the query graph,
calculates, based on the predetermined interval and the query graph, a data transmission interval that is an interval at which the first processing device transmits the stream data to the second processing device, and
transmits information on the data transmission interval to the first processing device,
the first processing device includes a second processor and a second storage unit,
the second storage unit stores the received information on the data transmission interval and data to be transmitted as the stream data,
the second processor transmits, based on the received information on the data transmission interval, the data stored in the second storage unit to the second processing device as the stream data,
the second processing device includes a third processor, and
the third processor
executes processing on the received stream data based on the assigned queries, and
outputs a result of the processing.