JP7187065B1

JP7187065B1 - Calculation method determination system, calculation method determination method, and calculation method determination program

Info

Publication number: JP7187065B1
Application number: JP2021105479A
Authority: JP
Inventors: 督基平賀; 博史三宅
Original assignee: Morpho Inc
Current assignee: Morpho Inc
Priority date: 2021-06-25
Filing date: 2021-06-25
Publication date: 2022-12-12
Anticipated expiration: 2041-06-25
Also published as: JP2023004024A

Abstract

[Problem] To provide a system that can determine the calculation method for each layer of a network structure while appropriately taking calculation costs into account. [Solution] A calculation method determination system includes: a determination unit that determines a plurality of combinations of calculation methods for the layers between the input and the output of the network structure, based on at least one calculation method prepared in advance for each layer; an adaptation unit that, for each of the combinations, judges whether the calculation methods of two layers that exchange data satisfy a predetermined relationship and determines an adaptation layer to be executed for the data exchange between the two layers; a calculation unit that calculates a total cost for each of the combinations based on the calculation costs of the calculation methods and the calculation costs of the adaptation layers; and a selection unit that selects one of the combinations based on the total cost calculated for each combination. [Selected drawing] FIG. 4

Description

The present disclosure relates to a calculation method determination system, a calculation method determination method, and a calculation method determination program.

Patent Document 1 discloses a system for a recognition device having a hierarchical network composed of multiple layers: an input layer, an intermediate layer, and an output layer. The recognition device performs recognition using a trained network structure and weight data. The system actually runs calculation methods prepared in advance in the execution environment to obtain their calculation costs, and selects, for each layer, the calculation method with the lowest calculation cost.

Patent Document 1: JP 2019-128831 A

In the system described in Patent Document 1, the calculation cost of the entire network structure is obtained by adding up the calculation costs of the individual layers. On that assumption, selecting the calculation method with the lowest calculation cost in each layer also minimizes the calculation cost of the entire network structure. However, when the calculation methods of two layers that exchange data differ in the device, environment, data format, or the like used for the calculation, additional processing is needed to exchange data between the two layers. Consequently, selecting the lowest-cost calculation method in each layer does not necessarily minimize the calculation cost of the entire network structure. The present disclosure provides a technique that can determine the calculation method for each layer of a network structure more appropriately.

One aspect of the present disclosure is a calculation method determination system that determines a calculation method for each layer of a network structure in an execution environment in which calculations for processing input data are performed using the network structure and weight data. The system includes a determination unit, an adaptation unit, a calculation unit, and a selection unit. The determination unit determines a plurality of combinations of calculation methods for the layers between the input and the output of the network structure, based on at least one calculation method prepared in advance for each layer. The adaptation unit judges, for each of the combinations, whether the calculation methods of two layers that exchange data satisfy a predetermined relationship, and determines an adaptation layer to be executed for the data exchange between the two layers. The calculation unit calculates a total cost for each of the combinations based on the calculation costs of the calculation methods and the calculation costs of the adaptation layers. The selection unit selects one of the combinations based on the total cost calculated for each combination.

In this calculation method determination system, a plurality of combinations of calculation methods for the layers between the input and the output of the network structure are determined. For each of the combinations, it is judged whether the calculation methods of two layers that exchange data satisfy a predetermined relationship, and an adaptation layer to be executed for the data exchange between the two layers is determined. A total cost is then calculated for each combination based on the calculation costs of the calculation methods and the calculation costs of the adaptation layers, and one combination is selected based on those total costs. Because the total cost of each combination takes the calculation costs of the adaptation layers into account, the system can calculate the calculation cost of the entire network structure more accurately than by simply adding up the calculation costs of the individual layers. The calculation method determination system can therefore determine the calculation method for each layer of the network structure more appropriately.
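The four steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of the disclosure: the two-layer network, the method names `cpu` and `gpu`, the cost figures, and the single fixed `TRANSFER_COST` are all hypothetical, and exhaustive enumeration is used purely for clarity.

```python
from itertools import product

# Hypothetical measured costs (ms): one dict per layer, keyed by method name.
LAYER_COSTS = [
    {"cpu": 5.0, "gpu": 4.0},   # layer 1: "gpu" is cheapest in isolation
    {"cpu": 3.0, "gpu": 6.0},   # layer 2: "cpu" is cheapest in isolation
]
TRANSFER_COST = 4.0  # assumed cost of one adaptation layer between differing methods


def determine_combinations(layer_costs):
    """Determination step: every assignment of one method per layer."""
    return list(product(*(costs.keys() for costs in layer_costs)))


def adaptation_costs(combo):
    """Adaptation step: one adaptation layer wherever adjacent methods differ."""
    return [TRANSFER_COST for a, b in zip(combo, combo[1:]) if a != b]


def total_cost(combo, layer_costs):
    """Calculation step: method costs plus adaptation-layer costs."""
    methods = sum(layer_costs[i][m] for i, m in enumerate(combo))
    return methods + sum(adaptation_costs(combo))


def select(layer_costs):
    """Selection step: the combination with the smallest total cost."""
    combos = determine_combinations(layer_costs)
    return min(combos, key=lambda c: total_cost(c, layer_costs))


best = select(LAYER_COSTS)
greedy = tuple(min(c, key=c.get) for c in LAYER_COSTS)  # per-layer minimum
print(best, total_cost(best, LAYER_COSTS))      # ('cpu', 'cpu') 8.0
print(greedy, total_cost(greedy, LAYER_COSTS))  # ('gpu', 'cpu') 11.0
```

With these made-up numbers, choosing the cheapest method in each layer separately costs 11.0 once the adaptation layer is counted, while the combination selected by total cost costs 8.0.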

In one embodiment, the at least one calculation method may include a plurality of algorithms that are executable in the execution environment and that each provide the same function through different operations. In this case, the calculation method determination system can actually run, in the execution environment, a plurality of algorithms that output the same result through different operations, and determine the optimum algorithm based on the calculation costs obtained.

In one embodiment, the at least one calculation method may include a plurality of algorithms that are executable in the execution environment and that each provide the same function using a different device, and the adaptation layer may execute data transfer between the different devices. In this case, the calculation method determination system can include the calculation cost of the data transfer between different devices in the total cost, and can therefore determine the calculation method for each layer of the network structure more appropriately.

In one embodiment, the at least one calculation method may include a plurality of algorithms that are executable in the execution environment and that each perform the same operation on data of a different data type, and the adaptation layer may execute data type conversion. In this case, the calculation method determination system can include the calculation cost of the data type conversion in the total cost, and can therefore determine the calculation method for each layer of the network structure more appropriately.

In one embodiment, the at least one calculation method may include a plurality of algorithms that are executable in the execution environment and that each perform the same operation on data with a different channel position, and the adaptation layer may execute a change of data layout. In this case, the calculation method determination system can include the calculation cost of the data layout change in the total cost, and can therefore determine the calculation method for each layer of the network structure more appropriately.
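The three kinds of adaptation layer described in these embodiments (device transfer, data type conversion, layout change) can be illustrated as follows. This is a hedged sketch: the `Method` class, its fields, the operation strings, and the `NCHW`/`NHWC` layout names are assumptions introduced for illustration and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    """Hypothetical description of one calculation method's data interface."""
    device: str   # e.g. "cpu" or "gpu"
    dtype: str    # e.g. "float32" or "int8"
    layout: str   # channel position, e.g. "NCHW" or "NHWC"


def adaptation_ops(producer: Method, consumer: Method) -> list[str]:
    """Return the adaptation operations needed between two connected layers."""
    ops = []
    if producer.device != consumer.device:
        ops.append(f"transfer:{producer.device}->{consumer.device}")
    if producer.dtype != consumer.dtype:
        ops.append(f"convert:{producer.dtype}->{consumer.dtype}")
    if producer.layout != consumer.layout:
        ops.append(f"relayout:{producer.layout}->{consumer.layout}")
    # An empty list means the two methods are compatible as they are,
    # so no adaptation layer has to be inserted.
    return ops


gpu_f32 = Method("gpu", "float32", "NCHW")
cpu_i8 = Method("cpu", "int8", "NCHW")
print(adaptation_ops(gpu_f32, cpu_i8))
print(adaptation_ops(gpu_f32, gpu_f32))
```

Each returned operation would carry its own calculation cost, which is what gets added to the total cost of a combination.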

In one embodiment, the selection unit may select, from among the plurality of combinations, the combination with the lowest total cost. In this case, the calculation method determination system can determine the calculation method for each layer of the network structure most appropriately.

In one embodiment, when the network structure has a branch structure in which a branch source layer branches into a plurality of layers and a plurality of outputs derived from the branch source layer merge at a merge layer, the determination unit may treat the calculation method of the branch source layer as a constraint condition and determine the combinations for each constraint condition such that the layers following the branch source layer satisfy the constraint condition. If, for example, the lowest-cost calculation method were selected independently in each layer of the branch structure, an inconsistency could arise in which the calculation method of the branch source layer is a first method in one branch and a second method in the other branch. The calculation method determination system therefore determines the calculation methods of the layers following the branch source layer with the calculation method of the branch source layer as a constraint condition, and determines the combinations for each constraint condition, which suppresses such inconsistencies.
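The role of the constraint condition can be seen in a small sketch. Everything here is an illustrative assumption: a branch source layer `src` feeding two one-layer branches `a` and `b`, the cost table, and a single adaptation cost `ADAPT`; the merge layer is omitted for brevity. The source method is fixed first, each branch is optimized under that fixed choice, and the cheapest constrained result is kept.

```python
METHODS = ("cpu", "gpu")
# Hypothetical costs (ms) per layer and method.
COST = {
    "src": {"cpu": 2.0, "gpu": 1.0},   # branch source layer
    "a":   {"cpu": 1.0, "gpu": 5.0},   # first branch
    "b":   {"cpu": 6.0, "gpu": 1.0},   # second branch
}
ADAPT = 3.0  # assumed cost of an adaptation layer between differing methods


def edge(m_from, m_to):
    """Adaptation cost on the connection between two layers."""
    return 0.0 if m_from == m_to else ADAPT


def best_branch(layer, src_method):
    """Cheapest method for one branch layer, given the fixed source method."""
    return min(METHODS, key=lambda m: COST[layer][m] + edge(src_method, m))


def solve():
    candidates = []
    for src in METHODS:  # one candidate combination per constraint condition
        a, b = best_branch("a", src), best_branch("b", src)
        total = (COST["src"][src]
                 + COST["a"][a] + edge(src, a)
                 + COST["b"][b] + edge(src, b))
        candidates.append((total, {"src": src, "a": a, "b": b}))
    return min(candidates, key=lambda c: c[0])


total, combo = solve()
print(total, combo)
```

Because both branches are evaluated against the same fixed source method, the source layer can never end up with two different methods in one combination.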

In one embodiment, when the network structure has a plurality of input layers, the determination unit may generate, upstream of the input layers, an auxiliary input layer connected to each of the input layers, treat the calculation method of the auxiliary input layer as a constraint condition, and determine the combinations for each constraint condition such that the layers following the auxiliary input layer satisfy the constraint condition. In this case, even for a network structure with a plurality of input layers, generating the auxiliary input layer allows the calculation method determination system to apply the same logic as for a branch structure, so inconsistencies can be suppressed.

In one embodiment, when the network structure has a plurality of output layers, the determination unit may generate, downstream of the output layers, an auxiliary output layer connected to each of the output layers, treat as a constraint condition the calculation method of a layer at which the network branches upstream of the output layers, and determine the combinations for each constraint condition such that the constraint condition is satisfied. In this case, even for a network structure with a plurality of output layers, generating the auxiliary output layer allows the calculation method determination system to apply the same logic as for a branch structure, so inconsistencies can be suppressed.
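Generating the auxiliary layers amounts to a small graph transformation. The sketch below assumes the network is given as a dictionary mapping each layer name to the list of layers it feeds; that representation and the names `aux_input` and `aux_output` are hypothetical choices made for this example.

```python
def add_auxiliary_layers(successors: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return a copy of the graph with auxiliary input/output layers added
    when the network has more than one input or more than one output layer."""
    has_pred = {dst for dsts in successors.values() for dst in dsts}
    inputs = [layer for layer in successors if layer not in has_pred]
    outputs = [layer for layer, dsts in successors.items() if not dsts]

    graph = {layer: list(dsts) for layer, dsts in successors.items()}
    if len(inputs) > 1:
        # One auxiliary layer upstream, connected to every input layer.
        graph = {"aux_input": inputs, **graph}
    if len(outputs) > 1:
        # One auxiliary layer downstream, fed by every output layer.
        for layer in outputs:
            graph[layer].append("aux_output")
        graph["aux_output"] = []
    return graph


net = {"in1": ["mid"], "in2": ["mid"], "mid": ["out1", "out2"], "out1": [], "out2": []}
print(add_auxiliary_layers(net))
```

After the transformation the network has a single entry and a single exit, so the multi-input and multi-output cases look like ordinary branch structures.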

Another aspect of the present disclosure is a calculation method determination method for determining a calculation method for each layer of a network structure in an execution environment in which calculations for processing input data are performed using the network structure and weight data. The method includes a determination step, an adaptation step, a calculation step, and a selection step. In the determination step, a plurality of combinations of calculation methods for the layers between the input and the output of the network structure are determined based on at least one calculation method prepared in advance for each layer. In the adaptation step, for each of the combinations, it is judged whether the calculation methods of two layers that exchange data satisfy a predetermined relationship, and an adaptation layer to be executed for the data exchange between the two layers is determined. In the calculation step, a total cost is calculated for each of the combinations based on the calculation costs of the calculation methods and the calculation costs of the adaptation layers. In the selection step, one of the combinations is selected based on the total cost calculated for each combination.

Yet another aspect of the present disclosure is a calculation method determination program that causes a computer to determine a calculation method for each layer of a network structure in an execution environment in which calculations for processing input data are performed using the network structure and weight data. The program causes the computer to function as a determination unit, an adaptation unit, a calculation unit, and a selection unit. The determination unit determines a plurality of combinations of calculation methods for the layers between the input and the output of the network structure, based on at least one calculation method prepared in advance for each layer. The adaptation unit judges, for each of the combinations, whether the calculation methods of two layers that exchange data satisfy a predetermined relationship, and determines an adaptation layer to be executed for the data exchange between the two layers. The calculation unit calculates a total cost for each of the combinations based on the calculation costs of the calculation methods and the calculation costs of the adaptation layers. The selection unit selects one of the combinations based on the total cost calculated for each combination.

The calculation method determination method and the calculation method determination program provide the same effects as the calculation method determination system described above.

According to various aspects of the present disclosure, the calculation method for each layer of a network structure can be determined while appropriately taking calculation costs into account.

FIG. 1 is a diagram illustrating the recognition unit.
FIG. 2 is a diagram illustrating the neural network in the recognition unit.
FIG. 3 is a diagram illustrating the artificial neuron shown in FIG. 2.
FIG. 4 is a functional block diagram of the calculation method determination system according to the first embodiment.
FIG. 5 is a block diagram showing the hardware configuration of the device shown in FIG. 4.
FIG. 6(A) is a diagram illustrating two layers of a network structure. FIG. 6(B) is a diagram illustrating an example in which an adaptation layer is inserted between the two layers of FIG. 6(A).
FIG. 7 is a flowchart showing the operation of the calculation method determination system.
FIG. 8(A) shows a serially connected network structure, FIG. 8(B) a branched network structure, FIG. 8(C) a network structure with multiple branches, and FIG. 8(D) a network structure with multiple inputs and outputs.
FIG. 9 is a functional block diagram of the calculation method determination system according to the second embodiment.

Embodiments are described below with reference to the accompanying drawings. In the description of the drawings, identical elements are given identical reference signs, and redundant explanations are omitted.

[First Embodiment]
(Overview of the calculation method determination system)
The calculation method determination system 100 according to the embodiment (see FIG. 4) is a system that determines the optimum calculation method for a processing device having a hierarchical network. A hierarchical network is a network with a layered structure; a neural network is one example. A neural network is defined by a network structure, weight data, and the like. Details of the neural network are described later.

The processing device has an execution environment capable of executing various programs. In the execution environment, calculations for processing input data are performed using the network structure and the weight data. Input data is the data processed to achieve the purpose of the hierarchical network. For example, when the purpose of the hierarchical network is recognition, the input data is the recognition target data. The processing device is not particularly limited as long as it has a hierarchical network. It may be, for example, a terminal device that labels the content of images, a surveillance camera that locates objects (such as people) in an image, or a general-purpose personal computer. In the following, a terminal device having a recognition unit 11 (see FIGS. 1 and 4) that recognizes the content of recognition target data using a neural network is described as an example of the processing device. Recognition target data is data to be recognized by a computer, such as image data, audio data, or text data.

The calculation method of the recognition unit 11 that is determined by the calculation method determination system 100 is a method of performing an operation on an input and outputting the result. A calculation method is also called a routine. On the premise that the same result is output for a given input with a predetermined accuracy, the calculation method determination system 100 determines the optimum algorithm from among multiple kinds of executable algorithms, or from among multiple instances of the same algorithm that differ in the type, amount, or manner of resource usage. Resources include hardware resources and software resources. Hardware resources are, for example, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), or a cache used for the operation. A software resource is, for example, a library. The resources that an algorithm uses are defined, for example, by the parameters of the algorithm.

The calculation method determination system 100 determines the calculation method of the recognition unit 11 in consideration of calculation cost. As one example, calculation cost is evaluated by the time required for the calculation (processing time); in this case, the longer the calculation takes, the higher the calculation cost. Calculation cost may instead be evaluated by resource usage; in this case, the greater the resource usage, the higher the calculation cost. Calculation cost may also be evaluated using both the time required for the calculation and the resource usage.
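When calculation cost is evaluated by processing time, it can be obtained by timing each candidate in the execution environment. The following is a minimal sketch using the Python standard library; the helper name `measure_cost`, the best-of-N policy, and the two toy algorithms are assumptions made for illustration.

```python
import time


def measure_cost(fn, *args, repeats: int = 5) -> float:
    """Return the best-of-`repeats` wall-clock time of fn(*args), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def sum_loop(values):
    """One algorithm: explicit accumulation."""
    total = 0
    for v in values:
        total += v
    return total


def sum_builtin(values):
    """Another algorithm with the same result: the built-in sum."""
    return sum(values)


data = list(range(100_000))
costs = {f.__name__: measure_cost(f, data) for f in (sum_loop, sum_builtin)}
print(costs)
```

Both functions return the same result for the same input, so only their measured costs distinguish them. Resource usage could be recorded alongside the time if cost is to reflect both.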

The calculation method determination system 100 determines, from among the calculation methods executable in the execution environment of the recognition unit 11, the calculation method with the lowest calculation cost. Alternatively, the calculation method determination system 100 may select a calculation method whose calculation cost is lower than the average calculation cost of the executable calculation methods. In this way, the calculation method determination system 100 optimizes the calculation speed of the recognition unit 11 or its resource usage.
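The two selection policies mentioned here differ only in the comparison applied to the measured costs. In the sketch below, the method names and cost values are hypothetical.

```python
from statistics import mean

# Hypothetical measured costs (ms) of three methods for one layer.
costs = {"direct": 9.0, "im2col": 4.0, "winograd": 5.0}

# Policy 1: the single method with the lowest cost.
cheapest = min(costs, key=costs.get)

# Policy 2: every method whose cost is below the average cost.
average = mean(costs.values())
below_average = [name for name, c in costs.items() if c < average]

print(cheapest, below_average)
```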

When determining a calculation method, the calculation method determination system 100 runs the calculation methods that are executable in the execution environment of the recognition unit 11, evaluates them based on the execution results, and determines the optimum calculation method.

As one specific example, the calculation method determination system 100 is configured to be able to: execute, in the execution environment of the recognition unit 11, the calculation of each layer of the network structure on predetermined data using at least one calculation method prepared in advance for each layer; and obtain, based on the calculation results of each layer, the calculation cost of the at least one calculation method for each layer of the network structure.

As another specific example, the calculation method determination system 100 is configured to be able to: determine a plurality of combinations of calculation methods for the layers between the input and the output of the network structure, based on at least one calculation method prepared in advance for each layer; judge, for each of the combinations, whether the calculation methods of two layers that exchange data satisfy a predetermined relationship, and determine an adaptation layer to be executed for the data exchange between the two layers; calculate a total cost for each of the combinations based on the calculation costs of the calculation methods and the calculation costs of the adaptation layers; and select one of the combinations based on the total cost calculated for each combination.

(Details of the recognition unit)
First, the recognition unit 11, for which the calculation method determination system determines the calculation method, is described. The following takes as an example the case in which the recognition target data is image data and what is recognized is the content of the image (a person, an animal, an object, a landscape, an indoor scene, and so on).

FIG. 1 is a diagram illustrating the recognition unit 11. As shown in FIG. 1, the recognition unit 11 is provided in a terminal device 10. The recognition unit 11 receives recognition target data G1, which is image data, and outputs a recognition result. As shown in FIG. 1(A), the recognition target data G1 is image data of an image depicting a dog. The recognition unit 11 receives the image data (more specifically, pixel values) and, using learning results (for example, a network structure and weight data), outputs a label representing the content of the image. A label is used to classify the content of recognition target data and is information identifying a category set in advance by the system user. In the example of FIG. 1(A), the recognition unit 11 outputs the label "dog". The label is assigned to the recognition target data G1 by the recognition unit 11. Assigning means associating: for example, only the relationship between the recognition target data G1 and the label may be recorded in an association table or the like, or the label may be embedded in the recognition target data G1 itself. Because the recognition unit 11 can receive image data and assign labels to it, image data can be classified automatically and desired images can be searched for on the Web.

When the system user has set multiple labels in advance, there are two kinds of processing: single-label processing, which assigns only the most probable label to the recognition target data, and multi-label processing, which assigns every label that reaches a certain level of probability. As shown in FIG. 1(B), recognition target data G2 is image data of an image depicting a person and flowers. When the recognition unit 11 performs single-label processing, it assigns the label "person" to the recognition target data G2. When the recognition unit 11 performs multi-label processing, it assigns two labels, "person" and "flower", to the recognition target data G2.

FIG. 2 is a diagram illustrating the neural network in the recognition unit 11. The recognition unit 11 uses a neural network to recognize the label corresponding to image data. A neural network is an information processing system modeled on the cranial nervous system. As shown in FIG. 2, the neural network of the recognition unit 11 is a so-called hierarchical neural network, in which many artificial neurons, drawn as circles, are connected to form layers. The hierarchical neural network includes artificial neurons for input, artificial neurons for processing, and artificial neurons for output.

The artificial neurons for input acquire the recognition target data and distribute it to the artificial neurons for processing. In the following, the signal itself that is exchanged within the neural network is called a score. A score is a numerical value. The artificial neurons for input are arranged in parallel to form an input layer 111.

The artificial neurons for processing are connected to the artificial neurons for input, process their inputs according to the function of the artificial neuron, and pass their outputs on to other neurons. The artificial neurons for processing are arranged in parallel to form an intermediate layer 112. The intermediate layer 112 may consist of multiple layers. A neural network with three or more layers that includes the intermediate layer 112 is called a deep neural network.

The artificial neurons for output output recognition scores to the outside. As many artificial neurons for output are provided as there are labels; that is, the neural network outputs a recognition score for each label. In the example of FIG. 2, three artificial neurons are provided for the three labels "dog", "person", and "flower". The artificial neurons for output output a recognition score B1 corresponding to the label "dog", a recognition score B2 corresponding to the label "person", and a recognition score B3 corresponding to the label "flower". A recognition score is a score representing the recognition result. When training uses "1" for a positive evaluation and "0" for a negative evaluation, a higher recognition score for a label means a higher likelihood that the label describes the content of the image. The artificial neurons for output are arranged in parallel to form an output layer 113.

認識部１１は、出力層１１３によって出力された認識スコアを用いて、付与ラベルを決定する。例えば、認識部１１は、所定値以上の認識スコアに対応するラベルを認識対象データに付与する。これにより、認識対象データにその内容を示すラベルが自動的に付与される。なお、シングルラベル処理の場合には、認識部１１は、最も高い認識スコアに対応するラベルを認識対象データに付与する。 The recognition unit 11 uses the recognition score output by the output layer 113 to determine the attached label. For example, the recognition unit 11 assigns a label corresponding to a recognition score equal to or greater than a predetermined value to recognition target data. As a result, a label indicating its content is automatically assigned to the recognition target data. In the case of single label processing, the recognition unit 11 assigns the label corresponding to the highest recognition score to the recognition target data.
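The two labeling rules described above (threshold-based assignment and single-label assignment) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the threshold value, and the score values are all hypothetical.

```python
from typing import Dict, List


def assign_labels(scores: Dict[str, float], threshold: float) -> List[str]:
    """Multi-label case: keep every label whose score is at or above the threshold."""
    return [label for label, score in scores.items() if score >= threshold]


def assign_single_label(scores: Dict[str, float]) -> str:
    """Single-label case: keep only the label with the highest score."""
    return max(scores, key=scores.get)


# Made-up recognition scores standing in for B1, B2, and B3.
scores = {"dog": 0.9, "person": 0.6, "flower": 0.1}
multi = assign_labels(scores, threshold=0.5)
single = assign_single_label(scores)
```

With these illustrative values, the threshold rule keeps both "dog" and "person", while the single-label rule keeps only "dog".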

図３は、図２に示す人工ニューロンを説明する図である。図３の（Ａ）に示す人工ニューロンは、ｘ_１，ｘ_２，ｘ_３を入力し、それぞれに対応する重み係数ｗ_１，ｗ_２，ｗ_３をそれぞれ積算する。人工ニューロンは、積算値（ｘ_１・ｗ_１，ｘ_２・ｗ_２，ｘ_３・ｗ_３）とバイアス値ｂとの総和を算出する。この総和を活性化関数に代入して、人工ニューロンの出力とする。 FIG. 3 is a diagram explaining the artificial neuron shown in FIG. 2. The artificial neuron shown in (A) of FIG. 3 receives inputs x₁, x₂, and x₃ and multiplies each by its corresponding weighting coefficient w₁, w₂, or w₃. The artificial neuron calculates the sum of the products (x₁·w₁, x₂·w₂, x₃·w₃) and the bias value b. This sum is substituted into the activation function to obtain the output of the artificial neuron.

より詳細には、対象の人工ニューロンの出力は、以下の数式（１）となる。 More specifically, the output of the artificial neuron of interest is given by Equation (1) below.

ここで、ｇは活性化関数であり、例えばシグモイド関数である。 Here, g is an activation function, for example a sigmoid function.
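Equation (1) itself is not reproduced in this text. From the description, it presumably has the form $g(x_1 w_1 + x_2 w_2 + x_3 w_3 + b)$. The following sketch computes that quantity. The function names are hypothetical, and the choice of a sigmoid as the default activation follows the example given in the text.

```python
import math
from typing import Callable, Sequence


def sigmoid(z: float) -> float:
    """Logistic sigmoid, the activation function named as an example in the text."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(
    x: Sequence[float],
    w: Sequence[float],
    b: float = 0.0,
    g: Callable[[float], float] = sigmoid,
) -> float:
    """Weighted sum of the inputs plus bias, passed through the activation g."""
    if len(x) != len(w):
        raise ValueError("inputs and weights must have the same length")
    total = sum(xi * wi for xi, wi in zip(x, w)) + b
    return g(total)


# Three inputs, as in (A) of FIG. 3; the numeric values are illustrative only.
y = neuron_output([1.0, 2.0, 3.0], [0.5, -0.25, 0.0], b=0.0)
```

Passing `b=0.0` corresponds to the bias-free variant that the description also allows.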

図３の（Ｂ）は、Ｎ階層（Ｎ＝３）の人工ニューロンを説明する図である。図３の（Ｂ）に示されるように、三階層の場合には、二階層に位置する人工ニューロンの出力ｈ_１ ^（２）、ｈ_２ ^（２）、ｈ_３ ^（２）はそれぞれ以下の数式（３）～（５）となる。ここで、ｎは対象階層の人工ニューロンの数、w_１ｊ ^（１）は二階層一番目の人工ニューロンにおける一階層ｊ番目の出力に対応する重み係数、ｂ_１ ^（１）は一階層のバイアス値である。 (B) of FIG. 3 is a diagram explaining artificial neurons arranged in N layers (N = 3). As shown in (B) of FIG. 3, in the three-layer case, the outputs h₁⁽²⁾, h₂⁽²⁾, and h₃⁽²⁾ of the artificial neurons located in the second layer are given by Equations (3) to (5) below, respectively. Here, n is the number of artificial neurons in the target layer, w_1j⁽¹⁾ is the weighting coefficient that the first artificial neuron of the second layer applies to the j-th output of the first layer, and b₁⁽¹⁾ is a bias value of the first layer.

ｗ_２ｊ ^（１）は二階層二番目の人工ニューロンにおける一階層ｊ番目の出力に対応する重み係数、ｗ_３ｊ ^（１）は二階層三番目の人工ニューロンにおける一階層ｊ番目の出力に対応する重み係数、ｂ_２ ^（１）は一階層二番目のバイアス値、ｂ_３ ^（１）は一階層三番目のバイアス値である。これにより、三階層の人工ニューロンの出力ｈ_１ ^（３）は以下の数式（６）で表される。 w_2j⁽¹⁾ is the weighting coefficient that the second artificial neuron of the second layer applies to the j-th output of the first layer, w_3j⁽¹⁾ is the weighting coefficient that the third artificial neuron of the second layer applies to the j-th output of the first layer, b₂⁽¹⁾ is the second bias value of the first layer, and b₃⁽¹⁾ is the third bias value of the first layer. The output h₁⁽³⁾ of the artificial neuron in the third layer is then expressed by Equation (6) below.

なお、バイアス値ｂは必ずしも必要ではなく、前段の人工ニューロンの出力と重み係数との積算値だけで出力を演算してもよい。 Note that the bias value b is not strictly required; the output may be calculated from only the products of the preceding-stage neuron outputs and the weighting coefficients.
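Equations (3) to (6) did not survive in this text. Under the definitions given above, they presumably take a form like the following. This is a reconstruction from the surrounding prose, not the published formulas: the symbol $h_j^{(1)}$ for the j-th output of the first layer, and the second-stage parameters $w_{1j}^{(2)}$ and $b_1^{(2)}$, are assumed by analogy with the first stage.

```latex
\begin{align*}
h_k^{(2)} &= g\Bigl(\sum_{j=1}^{n} w_{kj}^{(1)}\, h_j^{(1)} + b_k^{(1)}\Bigr),
  \qquad k = 1, 2, 3 \\
h_1^{(3)} &= g\Bigl(\sum_{j=1}^{n} w_{1j}^{(2)}\, h_j^{(2)} + b_1^{(2)}\Bigr)
\end{align*}
```

The first line stands for the three second-layer outputs (one equation per value of k), and the second line for the third-layer output. Dropping the bias terms gives the bias-free variant mentioned in the text.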

人工ニューロンは上記に限定されるものではなく、一般化したものでもよい。ｉ番目の中間層１１２の機能に関する一般式は以下の数式（７）となる。

ここで、ｘ⁽ⁱ⁾は中間層１１２への入力ベクトル、ｗ⁽ⁱ⁾は中間層１１２の重みパラメータベクトル、ｂ⁽ⁱ⁾はバイアスベクトル、v⁽ⁱ⁾は中間層１１２の出力ベクトルである。画像認識で一般的に使用される中間層１１２の一例として、全結合層及び畳み込み層がある。図３で表現されている全結合層の出力は、一般的には以下の数式（８）となる。

ここで、ｘ_ｐ ⁽ⁱ⁾はｉ番目の中間層１１２の入力の第ｐ成分、v_ｑ ⁽ⁱ⁾は中間層１１２の出力の第ｑ成分、ｗ_ｐ，ｑ ⁽ⁱ⁾は中間層１１２の重み係数のｐ，ｑ成分である。また、畳み込み層の出力は以下の数式（９）となる。

ここで、ｘ_{ｐ，（r，s）} ⁽ⁱ⁾はｉ番目の中間層１１２の入力の第ｐチャンネルの（r，s）成分、v_{ｑ，（r，s）} ⁽ⁱ⁾は中間層１１２の出力の第ｑチャンネルの（r，s）成分、ｗ_{ｐ，ｑ，（r’，s’）} ⁽ⁱ⁾は中間層１１２の畳み込みフィルタに関する重み係数である。r’，s’は、０から畳み込みフィルタの（幅－１）、(高さ－１)の値まで変化する。以上のような中間層１１２及び活性化関数ｇ⁽ⁱ⁾の計算を繰り返すことにより、出力層１１３直前の中間層の出力が以下の数式１０となる。

The artificial neuron is not limited to the above, and may be generalized. A general formula for the function of the i-th intermediate layer 112 is the following formula (7).

Here, x⁽ⁱ⁾ is the input vector to the intermediate layer 112, w⁽ⁱ⁾ is the weight parameter vector of the intermediate layer 112, b⁽ⁱ⁾ is the bias vector, and v⁽ⁱ⁾ is the output vector of the intermediate layer 112. Fully connected layers and convolutional layers are examples of intermediate layers 112 commonly used in image recognition. The output of the fully connected layer represented in FIG. 3 is generally given by Equation (8) below.

Here, x_p⁽ⁱ⁾ is the p-th component of the input to the i-th intermediate layer 112, v_q⁽ⁱ⁾ is the q-th component of the output of the intermediate layer 112, and w_{p,q}⁽ⁱ⁾ is the (p, q) component of the weighting coefficients of the intermediate layer 112. The output of a convolutional layer is given by Equation (9) below.

Here, x_{p,(r,s)}⁽ⁱ⁾ is the (r, s) component of the p-th channel of the input to the i-th intermediate layer 112, v_{q,(r,s)}⁽ⁱ⁾ is the (r, s) component of the q-th channel of the output of the intermediate layer 112, and w_{p,q,(r',s')}⁽ⁱ⁾ is the weighting coefficient of the convolution filter of the intermediate layer 112. r' and s' range from 0 to (width - 1) and (height - 1) of the convolution filter, respectively. By repeating the calculation of the intermediate layers 112 and the activation functions g⁽ⁱ⁾ as described above, the output of the intermediate layer immediately before the output layer 113 is given by Equation (10) below.
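The fully connected and convolutional layers described above can be sketched in plain Python as follows. Equations (8) and (9) are not reproduced in this text, so several details here are assumptions: a bias term is added per output component, the convolution uses stride 1 with no padding, and the input index is taken as (r + r', s + s'). The function names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]
Tensor3 = List[List[List[float]]]       # [channel][row][col]
Tensor4 = List[List[List[List[float]]]]  # [in_ch][out_ch][filter_row][filter_col]


def fully_connected(x: Vector, w: Matrix, b: Vector) -> Vector:
    """v_q = sum over p of w[p][q] * x[p], plus b[q]."""
    return [
        sum(w[p][q] * x[p] for p in range(len(x))) + b[q]
        for q in range(len(b))
    ]


def conv2d(x: Tensor3, w: Tensor4, b: Vector) -> Tensor3:
    """v[q][r][s] = b[q] + sum over p, r', s' of w[p][q][r'][s'] * x[p][r + r'][s + s']."""
    in_ch, height, width = len(x), len(x[0]), len(x[0][0])
    out_ch = len(w[0])
    fh, fw = len(w[0][0]), len(w[0][0][0])
    out: Tensor3 = []
    for q in range(out_ch):
        plane = []
        for r in range(height - fh + 1):
            row = []
            for s in range(width - fw + 1):
                acc = b[q]
                for p in range(in_ch):
                    for dr in range(fh):
                        for ds in range(fw):
                            acc += w[p][q][dr][ds] * x[p][r + dr][s + ds]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


fc_out = fully_connected([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
conv_out = conv2d([[[1.0, 2.0], [3.0, 4.0]]], [[[[1.0, 1.0], [1.0, 1.0]]]], [0.0])
```

The six nested loops in `conv2d` are one of many possible loop orders. Reordering them gives the same result at a different speed, which is the kind of alternative calculation method the system chooses among.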

上述したネットワーク構造、重み係数（重みデータ）及びバイアス値は、後述する学習装置３０で学習され、学習結果として認識部１１へ配布されたものである。つまり、学習装置３０は、認識対象データの特徴量とその内容を示すラベルとを対応させるためのネットワーク構造、重み係数及びバイアス値を学習する装置である。なお、認識部１１がバイアス値ｂを用いない場合には、学習装置３０は、ネットワーク構造及び重み係数のみを学習する。 The network structure, weighting coefficients (weighting data) and bias values described above are learned by the learning device 30, which will be described later, and distributed to the recognition unit 11 as learning results. In other words, the learning device 30 is a device that learns a network structure, a weighting factor, and a bias value for associating the feature quantity of the recognition target data with the label indicating its content. Note that when the recognition unit 11 does not use the bias value b, the learning device 30 learns only the network structure and weighting coefficients.

（計算手法決定システムの構成） (Configuration of the calculation method determination system)
図４は、実施形態に係る計算手法決定システム１００の機能ブロック図である。図４に示されるように、計算手法決定システム１００は、端末装置１０及び提供装置４０を含み、学習装置３０に接続される。学習装置３０は、画像データを収集して学習する。学習装置３０の学習結果は、提供装置４０を介して端末装置１０へ提供される。 FIG. 4 is a functional block diagram of the calculation method determination system 100 according to the embodiment. As shown in FIG. 4, the calculation method determination system 100 includes a terminal device 10 and a providing device 40 and is connected to a learning device 30. The learning device 30 collects image data and learns from it. The learning result of the learning device 30 is provided to the terminal device 10 via the providing device 40.

（ハードウェア構成） (Hardware configuration)
最初に、端末装置１０、学習装置３０及び提供装置４０のハードウェアについて説明する。図５は、図４に示す装置のハードウェア構成を示すブロック図である。図５に示すように、端末装置１０は、物理的には、ＣＰＵ１０１、ＲＡＭ（Random Access Memory）１０２及びＲＯＭ（Read Only Memory）１０３などの主記憶装置、タッチパネルやキーボードなどの入力デバイス１０４、ディスプレイなどの出力デバイス１０５、ハードディスクなどの補助記憶装置１０６などを含む通常のコンピュータシステムとして構成される。端末装置１０の各機能は、ＣＰＵ１０１が、ＲＡＭ１０２、ＲＯＭ１０３などのハードウェア上に所定のコンピュータソフトウェアを読み込ませ、ＣＰＵ１０１の制御の元で入力デバイス１０４及び出力デバイス１０５を動作させるとともに、主記憶装置や補助記憶装置１０６におけるデータの読み出し及び書き込みを行うことで実現される。なお、端末装置１０は、上記以外のハードウェアを備えてもよい。例えば、端末装置１０は、ＧＰＵ、ＦＰＧＡ（Field-Programmable Gate Array）、ＤＳＰなどを備えてもよい。 First, the hardware of the terminal device 10, the learning device 30, and the providing device 40 will be described. FIG. 5 is a block diagram showing the hardware configuration of the devices shown in FIG. 4. As shown in FIG. 5, the terminal device 10 is physically configured as an ordinary computer system including a CPU 101, main storage devices such as a RAM (Random Access Memory) 102 and a ROM (Read Only Memory) 103, an input device 104 such as a touch panel or keyboard, an output device 105 such as a display, and an auxiliary storage device 106 such as a hard disk. Each function of the terminal device 10 is realized by the CPU 101 loading predetermined computer software onto hardware such as the RAM 102 and the ROM 103, operating the input device 104 and the output device 105 under the control of the CPU 101, and reading and writing data in the main storage devices and the auxiliary storage device 106. Note that the terminal device 10 may include hardware other than the above. For example, the terminal device 10 may include a GPU, an FPGA (Field-Programmable Gate Array), a DSP, and the like.

学習装置３０及び提供装置４０のハードウェアも端末装置１０と同一のハードウェアで構成可能である。すなわち、学習装置３０は、物理的には、ＣＰＵ３０１、ＲＡＭ３０２及びＲＯＭ３０３などの主記憶装置、入力デバイス３０４、出力デバイス３０５、補助記憶装置３０６などを含む通常のコンピュータシステムとして構成される。提供装置４０は、物理的には、ＣＰＵ４０１、ＲＡＭ４０２及びＲＯＭ４０３などの主記憶装置、入力デバイス４０４、出力デバイス４０５、補助記憶装置４０６などを含む通常のコンピュータシステムとして構成される。提供装置４０の各機能は、ＣＰＵ４０１が、ＲＡＭ４０２、ＲＯＭ４０３などのハードウェア上に所定のコンピュータソフトウェアを読み込ませ、ＣＰＵ４０１の制御の元で入力デバイス４０４及び出力デバイス４０５を動作させるとともに、主記憶装置や補助記憶装置４０６におけるデータの読み出し及び書き込みを行うことで実現される。なお、学習装置３０及び提供装置４０は、ハードウェアとして１つの筐体に収容されている必要はなく、いくつかの装置に分離していてもよい。 The learning device 30 and the providing device 40 can be configured with the same hardware as the terminal device 10. That is, the learning device 30 is physically configured as an ordinary computer system including a CPU 301, main storage devices such as a RAM 302 and a ROM 303, an input device 304, an output device 305, an auxiliary storage device 306, and the like. The providing device 40 is physically configured as an ordinary computer system including a CPU 401, main storage devices such as a RAM 402 and a ROM 403, an input device 404, an output device 405, an auxiliary storage device 406, and the like. Each function of the providing device 40 is realized by the CPU 401 loading predetermined computer software onto hardware such as the RAM 402 and the ROM 403, operating the input device 404 and the output device 405 under the control of the CPU 401, and reading and writing data in the main storage devices and the auxiliary storage device 406. Note that the learning device 30 and the providing device 40 need not each be housed in a single enclosure and may be split across several devices.

（機能的構成） (Functional configuration)
次に、学習装置３０の機能的構成について説明する。学習装置３０は、画像データを収集して学習する。学習装置３０は、画像データを格納したデータベース２１、画像データを生成するカメラ２２、画像データをダウンロード可能なＷｅｂサイト２３などに接続されており、学習の入力データとなる画像データを取得することができる。もちろん、学習装置３０は、外部記憶媒体を接続して画像データを取得してもよいし、通信を介して画像データを受信してもよく、画像データ取得の態様には限定されない。 Next, the functional configuration of the learning device 30 will be described. The learning device 30 collects image data and learns from it. The learning device 30 is connected to a database 21 that stores image data, a camera 22 that generates image data, a website 23 from which image data can be downloaded, and the like, and can acquire image data to be used as input data for learning. Of course, the learning device 30 may acquire image data from a connected external storage medium or receive image data via communication; the manner of acquiring image data is not limited.

学習装置３０は、ニューラルネットワークを機械学習により構築するプラットフォームを有する。プラットフォームは、学習モデルを学習データに基づいて機械学習する動作環境である。学習モデルは、例えばネットワーク構造及び重みデータなどであり、学習データは、例えば画像データである。プラットフォームは、機械学習を行うソフトウェア群であるフレームワークを備える。プラットフォーム及びフレームワークは、使用言語の違いや設計思想の違いなどによって、学習結果出力に複数のフォーマットが存在する。 The learning device 30 has a platform for constructing a neural network by machine learning. A platform is an operating environment for machine learning a learning model based on learning data. The learning model is, for example, network structure and weight data, and the learning data is, for example, image data. The platform has a framework, which is a group of software that performs machine learning. Platforms and frameworks have multiple formats for outputting learning results due to differences in languages used and design concepts.

提供装置４０は、学習装置３０から取得した、フォーマットの異なる学習結果を、統一フォーマットに変換し、端末装置１０へ提供する。これにより、端末装置１０は、あらゆるプラットフォームで学習された学習結果を利用することができる。提供装置４０は、変換部４１を備える。変換部４１は、ネットワーク構造及び重みデータを学習装置３０から取得し、統一フォーマットの学習結果Ｍ１に変換する。 The providing device 40 converts learning results in different formats acquired from the learning device 30 into a unified format and provides the same to the terminal device 10 . As a result, the terminal device 10 can use learning results learned on any platform. The providing device 40 includes a conversion unit 41 . The conversion unit 41 acquires the network structure and weight data from the learning device 30 and converts them into a learning result M1 in a unified format.

端末装置１０は、認識部１１の動作が効率的となるように、学習結果Ｍ１のネットワーク構造を事前に動作させ、計算コストが最小となる計算手法を決定しておく。これにより、端末装置１０は、学習結果Ｍ１を利用した認識部１１の動作を効率化させることができる。このために、端末装置１０は、事前計算部１２、候補決定部１３、組み合わせ決定部１４（決定部の一例）、適合部１５、コスト取得部１６、算出部１７及び選択部１８を備える。 The terminal device 10 operates the network structure of the learning result M1 in advance so that the operation of the recognition unit 11 is efficient, and determines a calculation method that minimizes the calculation cost. As a result, the terminal device 10 can efficiently operate the recognition unit 11 using the learning result M1. For this purpose, the terminal device 10 includes a pre-calculation unit 12 , a candidate determination unit 13 , a combination determination unit 14 (an example of a determination unit), an adaptation unit 15 , a cost acquisition unit 16 , a calculation unit 17 and a selection unit 18 .

事前計算部１２は、学習結果Ｍ１のネットワーク構造を端末装置１０の実行環境で動作させるように構成される。事前計算部１２は、提供装置４０から学習結果Ｍ１を取得する。そして、学習結果Ｍ１に含まれる層の情報を候補決定部１３へ出力する。 The pre-calculation unit 12 is configured to operate the network structure of the learning result M1 in the execution environment of the terminal device 10. The pre-calculation unit 12 acquires the learning result M1 from the providing device 40 and then outputs the information on the layers included in the learning result M1 to the candidate determination unit 13.

候補決定部１３は、実行環境に基づいて、ネットワーク構造の各層に対して少なくとも１つの計算手法を準備する。一例として、候補決定部１３は、端末装置１０の実行環境のハードウェア構成プロファイル、ソフトウェアライブラリなどを参照しつつ、学習結果Ｍ１に含まれる層の情報に基づいて、各層に対して実行可能な計算手法ＴＤ１を決定する。これにより、事前計算部１２の実行前に、層ごとに計算手法が予め準備される。 The candidate determination unit 13 prepares at least one calculation method for each layer of the network structure based on the execution environment. As an example, the candidate determination unit 13 refers to the hardware configuration profile of the execution environment of the terminal device 10, the software library, etc., and based on the information of the layers included in the learning result M1, performs calculations that can be performed on each layer. Determine method TD1. Thus, a calculation method is prepared in advance for each layer before execution by the pre-calculation unit 12 .

計算手法ＴＤ１は、例えば、実行環境で実行可能であり、それぞれが異なる演算で同一機能を発揮する複数のアルゴリズムを含む。計算手法ＴＤ１は、それぞれが異なる演算で同一機能を発揮するアルゴリズムであれば、同一リソースを用いても異なるリソースを用いてもよい。計算手法ＴＤ１の一例としては、ループの階層の順番が異なるアルゴリズムや、ＣＰＵ拡張機能を利用したアルゴリズムとＣＰＵ拡張機能を利用しないアルゴリズムなどである。 Calculation technique TD1 includes, for example, a plurality of algorithms that can be executed in an execution environment, each performing the same function with different operations. The calculation method TD1 may use the same resource or different resources as long as they are algorithms that exhibit the same function in different operations. An example of the calculation method TD1 is an algorithm in which the order of loop layers is different, an algorithm using CPU extended functions and an algorithm not using CPU extended functions, and the like.

計算手法ＴＤ１は、それぞれが異なるデバイスを用いて同一機能を発揮する複数のアルゴリズムを含んでもよい。例えば、計算手法ＴＤ１は、演算に用いられるデバイス（プロセッサ）がＣＰＵであるアルゴリズム、ＧＰＵであるアルゴリズム、及びＤＳＰであるアルゴリズムを含む。計算手法ＴＤ１は、実行環境で実行可能であり、それぞれが異なるデータ型のデータに対して同一の演算を行う複数のアルゴリズムを含んでもよい。データ型は、一例として浮動小数点のビット数である。例えば、計算手法ＴＤ１は、処理データの浮動小数点が８ビット、３２ビット、及び６４ビットでそれぞれ表現されるアルゴリズムを含む。計算手法ＴＤ１は、所定のビット数で量子化されたデータを処理するアルゴリズムを含んでもよい。また、計算手法ＴＤ１は、実行環境で実行可能であり、それぞれが異なるチャネル位置のデータに対して同一の演算を行う複数のアルゴリズムを含んでもよい。チャネル位置は、画像データにおけるＲＧＢチャネルのデータ位置のことである。チャネルファーストの場合には画像データの次元の並びはチャネル、高さ、幅の順で表現され、チャネルラストの場合には画像データの次元の並びは高さ、幅、チャネルの順で表現される。つまり、計算手法ＴＤ１は、チャネルファーストデータを扱うアルゴリズムと、チャネルラストデータを扱うアルゴリズムとを含んでもよい。 The calculation technique TD1 may include multiple algorithms each performing the same function using different devices. For example, the calculation method TD1 includes an algorithm in which the device (processor) used for calculation is a CPU, an algorithm in which it is a GPU, and an algorithm in which it is a DSP. The calculation technique TD1 can be executed in an execution environment and may include multiple algorithms each performing the same operation on data of different data types. The data type is, for example, the number of floating-point bits. For example, calculation method TD1 includes algorithms in which floating point data to be processed are represented by 8 bits, 32 bits, and 64 bits, respectively. Calculation technique TD1 may include an algorithm that processes data quantized with a predetermined number of bits. The calculation technique TD1 may also include multiple algorithms that are executable in an execution environment and each performs the same operation on data at different channel locations. The channel position is the data position of the RGB channels in the image data. In the case of channel-first, the dimensional order of the image data is expressed in the order of channel, height, and width; in the case of channel-last, the dimensional order of the image data is expressed in the order of height, width, and channel. . 
That is, the calculation technique TD1 may include an algorithm that handles channel first data and an algorithm that handles channel last data.
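The difference between the two layouts can be made concrete with a small sketch that converts nested lists between channel-first (channel, height, width) and channel-last (height, width, channel) order. The function names are hypothetical and the pixel values are made up; a real implementation would more likely operate on contiguous buffers.

```python
from typing import List

Tensor3 = List[List[List[int]]]


def channel_first_to_last(chw: Tensor3) -> Tensor3:
    """Reorder (channel, height, width) data into (height, width, channel)."""
    channels, height, width = len(chw), len(chw[0]), len(chw[0][0])
    return [
        [[chw[c][h][w] for c in range(channels)] for w in range(width)]
        for h in range(height)
    ]


def channel_last_to_first(hwc: Tensor3) -> Tensor3:
    """Reorder (height, width, channel) data into (channel, height, width)."""
    height, width, channels = len(hwc), len(hwc[0]), len(hwc[0][0])
    return [
        [[hwc[h][w][c] for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


# A 1x2 image with three channels (R, G, B), stored channel-first.
chw = [[[1, 2]], [[3, 4]], [[5, 6]]]
hwc = channel_first_to_last(chw)
```

Either conversion touches every element once, so its cost grows with the size of the data passed between layers. This is why a layout mismatch between adjacent layers is not free.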

組み合わせ決定部１４は、計算手法ＴＤ１に基づいて、ネットワーク構造の入力から出力までの間の各層の計算手法の組み合わせを複数決定する。例えば、第一層が第一計算手法ＭＡ１、第二層が第二計算手法ＭＡ２、第三層が第三計算手法ＭＡ３である場合、計算手法の組み合わせは、（ＭＡ１，ＭＡ２，ＭＡ３）となる。少なくとも１つの層の計算手法が複数存在する場合には、計算手法の組み合わせも複数存在することになる。 The combination determination unit 14 determines a plurality of combinations of calculation methods for each layer from the input to the output of the network structure based on the calculation method TD1. For example, if the first layer is the first calculation method MA1, the second layer is the second calculation method MA2, and the third layer is the third calculation method MA3, the combination of calculation methods is (MA1, MA2, MA3). . If there are multiple calculation methods for at least one layer, there are also multiple combinations of calculation methods.
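Enumerating the combinations amounts to taking the Cartesian product of the per-layer candidate lists. A minimal sketch follows, in which the alternative methods `MB1` and `MB3` are hypothetical additions to the `MA1`, `MA2`, `MA3` example from the text.

```python
from itertools import product

# Candidate calculation methods per layer (first, second, third layer).
# MB1 and MB3 are hypothetical alternatives added for illustration.
candidates = [
    ["MA1", "MB1"],
    ["MA2"],
    ["MA3", "MB3"],
]

# Every way of picking one method per layer.
combinations = list(product(*candidates))
```

Here two layers have two candidates each and one layer has a single candidate, so there are 2 × 1 × 2 = 4 combinations. The count is the product of the per-layer candidate counts, which is why it grows quickly for deeper networks.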

端末装置１０の実行環境においては、ネットワーク構造に含まれる二層間が所定の関係を満たすとき、当該二層の間のデータのやり取りのための処理が自動で実行される。図６の（Ａ）はネットワーク構造の二層を説明する図である。図６に示されるように、第一層Ｌ１と第二層Ｌ２とは互いに接続され、第一層Ｌ１の出力が第二層Ｌ２の入力となる。端末装置１０は、第一層Ｌ１の計算手法と第二層Ｌ２の計算手法とが所定の関係を満たす場合に、適合層を自動挿入する。 In the execution environment of the terminal device 10, when two layers included in the network structure satisfy a predetermined relationship, processing for exchanging data between the two layers is automatically executed. FIG. 6A is a diagram illustrating the two layers of the network structure. As shown in FIG. 6, the first layer L1 and the second layer L2 are connected together, and the output of the first layer L1 becomes the input of the second layer L2. The terminal device 10 automatically inserts a matching layer when the calculation method of the first layer L1 and the calculation method of the second layer L2 satisfy a predetermined relationship.

図６の（Ｂ）は図６の（Ａ）の二層の間に適合層が挿入された例を説明する図である。図６の（Ｂ）に示されるように、第一層Ｌ１と第二層Ｌ２との間に適合層ＡＬ１が挿入されている。例えば、データを受け取る後段の第二層Ｌ２の計算手法が、前段の第一層Ｌ１の計算手法で用いられたデバイスとは異なるデバイスを用いる場合、異種デバイス間のデータ転送の処理が必要になる。より具体的な一例としては、第一層Ｌ１がＧＰＵでデータを処理した場合であって、第二層Ｌ２がＣＰＵでデータを処理する場合、デバイス間でデータ転送を行う適合層ＡＬ１が挿入される。 (B) of FIG. 6 is a diagram illustrating an example in which an adaptation layer is inserted between the two layers of (A) of FIG. 6. As shown in (B) of FIG. 6, an adaptation layer AL1 is inserted between the first layer L1 and the second layer L2. For example, when the calculation method of the subsequent second layer L2, which receives the data, uses a device different from the one used by the calculation method of the preceding first layer L1, data transfer between the different devices is required. As a more specific example, when the first layer L1 processes data on a GPU and the second layer L2 processes data on a CPU, an adaptation layer AL1 that transfers data between the devices is inserted.

あるいは、データを受け取る後段の第二層Ｌ２の計算手法が、前段の第一層Ｌ１の計算手法で用いられたデータ型とは異なるデータ型のデータを用いる場合、データ型の変換の処理が必要になる。より具体的な一例としては、第一層Ｌ１の計算手法として浮動小数点のビット数が６４ビットで表現されたデータが処理され、第二層Ｌ２の計算手法として浮動小数点のビット数が３２ビットで表現されたデータを処理する場合、６４ビットから３２ビットへのデータ型の変換を行う適合層ＡＬ１が挿入される。 Alternatively, when the calculation method of the subsequent second layer L2, which receives the data, uses data of a data type different from the one used by the calculation method of the preceding first layer L1, data type conversion is required. As a more specific example, when the calculation method of the first layer L1 processes data represented in 64-bit floating point and the calculation method of the second layer L2 processes data represented in 32-bit floating point, an adaptation layer AL1 that converts the data type from 64 bits to 32 bits is inserted.

あるいは、データを受け取る後段の第二層Ｌ２の計算手法が、前段の第一層Ｌ１の計算手法で用いられたチャネル位置とは異なるチャネル位置のデータを用いる場合、データレイアウトの変更の処理が必要になる。例えば、第一層Ｌ１の計算手法としてチャネルファーストデータが処理され、第二層Ｌ２の計算手法としてチャネルラストデータを処理する場合、チャネルラストデータからチャネルファーストデータへの変換、つまり、データレイアウトの変更を行う適合層ＡＬ１が挿入される。なお、上記の例示は、重複して発生することもある。例えば、状況に応じて、異種デバイス間のデータ転送の処理だけでなく、データレイアウトの変更の処理も同時に実行され得る。 Alternatively, when the calculation method of the subsequent second layer L2, which receives the data, uses data with a channel position different from the one used by the calculation method of the preceding first layer L1, a change of data layout is required. For example, when the calculation method of the first layer L1 processes channel-first data and the calculation method of the second layer L2 processes channel-last data, an adaptation layer AL1 is inserted that performs the conversion from channel-last data to channel-first data, that is, a change of data layout. Note that the cases above may occur together. For example, depending on the situation, a change of data layout may be performed in addition to data transfer between different devices.

このように、ネットワーク構造に含まれる二層間が所定の関係を満たすとき、当該二層の間のデータのやり取りのための処理が自動で実行される。計算手法決定システム１００は、自動で実行される適合層ＡＬ１の計算コストを考慮するために、適合部１５を備える。適合部１５は、組み合わせ決定部１４によって決定された計算手法の組み合わせごとに、適合層ＡＬ１が自動挿入される箇所及び適合層ＡＬ１の処理内容を判定する。例えば、適合部１５は、組み合わせ決定部１４によって決定された複数の組み合わせごとに、データのやり取りを行う二層の計算手法が所定の関係を満たすか否かを判定する。所定の関係とは、二層の間のデータのやり取りのために処理が必要となる上述した関係である。 In this way, when two layers included in the network structure satisfy a predetermined relationship, processing for exchanging data between the two layers is automatically executed. The calculation method determination system 100 comprises an adaptation unit 15 in order to consider the calculation cost of the adaptation layer AL1 that is automatically executed. For each combination of calculation methods determined by the combination determination unit 14, the matching unit 15 determines the location where the matching layer AL1 is automatically inserted and the processing content of the matching layer AL1. For example, the adapting unit 15 determines whether or not the two-layer calculation method for exchanging data satisfies a predetermined relationship for each of the multiple combinations determined by the combination determining unit 14 . Predetermined relationships are the relationships described above that require processing for the exchange of data between the two layers.

そして、適合部１５は、所定の関係が満たされる場合には二層の間のデータのやり取りのために実行される適合層を決定する。適合層を決定とは、二層間で実施される処理を決定することである。適合部１５は、使用されるデバイスが異なる関係である場合にはデバイス間のデータ転送を行う適合層ＡＬ１を決定する。適合部１５は、データ型に違いが生じる関係である場合にはデータ型の変換を行う適合層ＡＬ１を決定する。適合部１５は、チャネル位置に違いが生じる関係である場合にはデータレイアウトの変更を行う適合層ＡＬ１を決定する。 When the predetermined relationship is satisfied, the adaptation unit 15 determines the adaptation layer to be executed for the exchange of data between the two layers. Determining the adaptation layer means determining the processing to be performed between the two layers. When the two layers use different devices, the adaptation unit 15 determines an adaptation layer AL1 that transfers data between the devices. When the two layers use different data types, the adaptation unit 15 determines an adaptation layer AL1 that converts the data type. When the two layers use different channel positions, the adaptation unit 15 determines an adaptation layer AL1 that changes the data layout.
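The three checks performed by the adaptation unit 15 can be sketched as a comparison of the properties of the two adjacent calculation methods. The `Schema` record, its field names, and the string encoding of each adaptation step are all hypothetical; the text only specifies that device, data type, and channel position are compared.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Schema:
    """Properties of a calculation method that affect data exchange (assumed fields)."""
    device: str  # e.g. "CPU", "GPU", "DSP"
    dtype: str   # e.g. "float32", "float64"
    layout: str  # "channel_first" or "channel_last"


def required_adaptations(producer: Schema, consumer: Schema) -> List[str]:
    """List the processing an adaptation layer must perform between two layers.

    An empty list means no adaptation layer is inserted.
    """
    steps: List[str] = []
    if producer.device != consumer.device:
        steps.append(f"transfer:{producer.device}->{consumer.device}")
    if producer.dtype != consumer.dtype:
        steps.append(f"convert:{producer.dtype}->{consumer.dtype}")
    if producer.layout != consumer.layout:
        steps.append(f"relayout:{producer.layout}->{consumer.layout}")
    return steps


first = Schema(device="GPU", dtype="float64", layout="channel_first")
second = Schema(device="CPU", dtype="float32", layout="channel_first")
steps = required_adaptations(first, second)
```

Because the checks are independent, several kinds of processing can be required between the same pair of layers, which matches the remark that the cases may occur together.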

事前計算部１２は、ネットワーク構造の層ごとに予め準備された計算手法ＴＤ１を用いて、所定データに対するネットワーク構造の各層の計算を実行する。所定データは、ニューラルネットワークで処理可能な予め定められたデータであれば何でもよく、事前に用意されたテストデータであってもよい。事前計算部１２は、計算手法ＴＤ１を実行する。例えば、畳み込み層に３つの計算手法が準備されている場合、事前計算部１２は、畳み込み層の計算をそれぞれの計算手法で１回または複数回実行する。 The pre-calculation unit 12 uses a calculation method TD1 prepared in advance for each layer of the network structure to calculate each layer of the network structure for predetermined data. The predetermined data may be any predetermined data that can be processed by the neural network, and may be test data prepared in advance. The pre-calculator 12 executes the calculation technique TD1. For example, when three calculation methods are prepared for the convolution layer, the pre-calculation unit 12 performs the calculation of the convolution layer once or multiple times using each calculation method.

事前計算部１２は、上述した適合層ＡＬ１についても計算コストを事前に取得すべく、処理を行う。事前計算部１２は、計算手法ＴＤ１を参照して、挿入される適合層ＡＬ１を判定し、適合層ＡＬ１に関する処理を実行する。あるいは、事前計算部１２は、適合部１５から適合層ＡＬ１に係る情報を取得し、適合層ＡＬ１に関する処理を実行してもよい。 The pre-calculation unit 12 also performs processing to obtain in advance the calculation cost for the adaptive layer AL1 described above. The pre-calculation unit 12 refers to the calculation method TD1 to determine the adaptive layer AL1 to be inserted, and executes processing related to the adaptive layer AL1. Alternatively, the pre-calculation unit 12 may acquire information related to the matching layer AL1 from the matching unit 15 and execute processing related to the matching layer AL1.

コスト取得部１６は、事前計算部１２の計算結果に基づいて、ネットワーク構造の層ごとに少なくとも１つの計算手法の計算コストを取得する。ネットワーク構造の層ごとに計算コストを取得するとは、上述したニューラルネットワークの層ごとのそれぞれ計算コストを取得することである。例えば、中間層１１２それぞれの計算コストが取得される。具体的な一例として、コスト取得部１６は、第一の畳み込み層に対して準備された複数のアルゴリズムそれぞれの計算コスト、第一の全結合層に対して準備された複数のアルゴリズムそれぞれの計算コスト、第二の畳み込み層に対して準備された１のアルゴリズムの計算コストなど、層ごとに準備された計算手法で、実際に処理を実行した結果から計算コストを取得する。さらに、コスト取得部１６は、適合層ＡＬ１の計算コストも取得する。コスト取得部１６は、層（適合層ＡＬ１を含む）と計算手法と計算コストとを対応させた結果リストＴＤ２を生成する。一例として、コスト取得部１６は、層と計算手法と演算速度とを関連付けた結果リストＴＤ２を生成する。以下では、結果リストＴＤ２をプロファイルともいう。 The cost acquisition unit 16 acquires, based on the calculation results of the pre-calculation unit 12, the calculation cost of at least one calculation method for each layer of the network structure. Acquiring the calculation cost for each layer of the network structure means acquiring the calculation cost of each layer of the neural network described above. For example, the calculation cost of each intermediate layer 112 is acquired. As a specific example, the cost acquisition unit 16 acquires calculation costs from the results of actually executing the processing with the calculation methods prepared for each layer, such as the calculation cost of each of the plurality of algorithms prepared for the first convolutional layer, the calculation cost of each of the plurality of algorithms prepared for the first fully connected layer, and the calculation cost of the single algorithm prepared for the second convolutional layer. Furthermore, the cost acquisition unit 16 also acquires the calculation cost of the adaptation layer AL1. The cost acquisition unit 16 generates a result list TD2 that associates layers (including the adaptation layer AL1), calculation methods, and calculation costs. As an example, the cost acquisition unit 16 generates a result list TD2 that associates layers, calculation methods, and calculation speeds. Hereinafter, the result list TD2 is also referred to as a profile.
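A profile of this kind can be built by timing each candidate routine on the predetermined data. The sketch below is one possible approach under stated assumptions: it uses wall-clock time from `time.perf_counter` as the cost and keeps the best of several runs, neither of which is specified in the text. The layer name, routine names, and stand-in routines are hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple

ProfileKey = Tuple[str, str]  # (layer name, calculation method name)


def measure(routine: Callable[[List[int]], object], data: List[int], repeats: int = 3) -> float:
    """Run the routine several times and return the shortest elapsed time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        routine(data)
        best = min(best, time.perf_counter() - start)
    return best


def build_profile(
    routines: Dict[ProfileKey, Callable[[List[int]], object]],
    data: List[int],
) -> Dict[ProfileKey, float]:
    """Associate each (layer, calculation method) pair with its measured cost."""
    return {key: measure(fn, data) for key, fn in routines.items()}


# Two stand-in routines that compute the same result in different ways.
routines = {
    ("conv1", "generator_sum"): lambda xs: sum(x * x for x in xs),
    ("conv1", "map_sum"): lambda xs: sum(map(lambda x: x * x, xs)),
}
profile = build_profile(routines, list(range(1000)))
```

Entries for adaptation processing (transfer, type conversion, layout change) would be measured the same way and stored under their own keys.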

算出部１７は、計算手法の計算コストと適合層の計算コストとに基づいて、複数の組み合わせごとに総コストを算出する。算出部１７は、組み合わせ決定部１４及び適合部１５から、複数の組み合わせと、組み合わせごとの適合層ＡＬ１に関する情報を取得する。適合層ＡＬ１に関する情報とは、ネットワーク構造における挿入位置及び処理内容である。算出部１７は、組み合わせ決定部１４及び適合部１５から取得された複数の組み合わせ及び適合層ＡＬ１に対応する計算コストを、結果リストＴＤ２を参照して取得する。そして、複数の組み合わせごとに総コストを算出する。算出部１７は、適合層ＡＬ１を含むネットワーク構造全体において実行される層の計算コストの総和を算出し、総コストとする。 The calculation unit 17 calculates the total cost for each of the multiple combinations based on the calculation cost of the calculation method and the calculation cost of the matching layer. The calculation unit 17 acquires a plurality of combinations and information about the matching layer AL1 for each combination from the combination determination unit 14 and the matching unit 15 . The information about the adaptation layer AL1 is the insertion position and processing contents in the network structure. The calculation unit 17 obtains the calculation costs corresponding to the plurality of combinations obtained from the combination determination unit 14 and the matching unit 15 and the matching layer AL1 by referring to the result list TD2. Then, the total cost is calculated for each of the multiple combinations. The calculation unit 17 calculates the sum of the calculation costs of the layers executed in the entire network structure including the adaptive layer AL1, and sets it as the total cost.

選択部１８は、組み合わせごとに算出された総コストに基づいて、複数の組み合わせの中から一つの組み合わせを選択する。例えば、選択部１８は、複数の組み合わせの中から総コストが最小となる組み合わせを選択する。選択部１８は、選択された組み合わせに係る計算手法を学習結果Ｍ１に付与した、学習結果Ｍ２を生成する。 The selection unit 18 selects one combination from a plurality of combinations based on the total cost calculated for each combination. For example, the selection unit 18 selects a combination with the lowest total cost from among multiple combinations. The selection unit 18 generates a learning result M2 by adding the calculation method associated with the selected combination to the learning result M1.
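The total-cost calculation and the selection can be sketched together for a simple chain of layers. The cost values and routine names below are made up, and the sketch assumes that layers are connected in sequence and that adaptation costs are looked up per pair of adjacent routines; the text only states that the total is the sum of the layer costs and the adaptation layer costs.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

Combo = Tuple[str, ...]


def total_cost(
    combo: Sequence[str],
    layer_cost: Dict[str, float],
    adapt_cost: Dict[Tuple[str, str], float],
) -> float:
    """Sum the per-layer costs plus the cost of any adaptation between neighbors."""
    cost = sum(layer_cost[r] for r in combo)
    for prev, nxt in zip(combo, combo[1:]):
        cost += adapt_cost.get((prev, nxt), 0.0)  # no entry means no adaptation layer
    return cost


# Illustrative profile: each layer has a CPU and a GPU routine.
layer_cost = {"L1_cpu": 5.0, "L1_gpu": 2.0, "L2_cpu": 3.0, "L2_gpu": 4.0}
# Switching devices between the layers requires a data transfer.
adapt_cost = {("L1_gpu", "L2_cpu"): 4.0, ("L1_cpu", "L2_gpu"): 4.0}

combos = list(product(["L1_cpu", "L1_gpu"], ["L2_cpu", "L2_gpu"]))
best = min(combos, key=lambda c: total_cost(c, layer_cost, adapt_cost))

# Picking the fastest routine for each layer in isolation ignores the transfer.
greedy = ("L1_gpu", "L2_cpu")
```

With these numbers the per-layer fastest choice costs 2 + 3 + 4 = 9, while keeping both layers on the GPU costs 2 + 4 = 6. This illustrates why the adaptation layer cost has to be part of the total.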

認識部１１は、学習結果Ｍ２を参照し、選択部１８によりネットワーク構造の各層に対応付けられた計算手法を用いて、入力データに対するネットワーク構造の各層の計算を実行環境において実行する。これにより、認識処理効率が向上する。 The recognition unit 11 refers to the learning result M2, and uses the calculation method associated with each layer of the network structure by the selection unit 18 to perform calculation of each layer of the network structure for the input data in the execution environment. This improves the recognition processing efficiency.

（計算手法決定システムの動作：計算手法決定方法） (Operation of the calculation method determination system: calculation method determination method)
図７は、計算手法決定システムの動作を示すフローチャートである。図７に示されるフローチャートは、端末装置１０により実行される。 FIG. 7 is a flowchart showing the operation of the calculation method determination system. The flowchart shown in FIG. 7 is executed by the terminal device 10.

As shown in FIG. 7, the combination determination unit 14 of the terminal device 10 acquires, as a combination determination process (S10: an example of the determination step), the calculation methods for each layer of the network structure from the calculation method TD1. The combination determination unit 14 then determines a plurality of combinations of calculation methods. Subsequently, as a determination process (S12: an example of the matching step), the matching unit 15 of the terminal device 10 determines the insertion position and content of the adaptation layer AL1 for each of the combinations determined in the combination determination process (S10). Subsequently, as a calculation process (S14: an example of the calculation step), the calculation unit 17 of the terminal device 10 refers to the result list TD2 and calculates the total cost for each combination. Subsequently, as a selection process (S16: an example of the selection step), the selection unit 18 of the terminal device 10 selects one combination from among the plurality of combinations based on the total cost of each combination of calculation methods. Then, as a storage process (S18), the selection unit 18 stores the calculation methods of the selected combination in the learning result M1 and generates the learning result M2. When the storage process (S18) ends, the flowchart of FIG. 7 ends. By executing the flowchart shown in FIG. 7, a calculation method capable of efficiently executing the learning model is provided to the recognition unit 11.
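As an illustration of steps S10 to S16, the following Python sketch enumerates every combination, marks the edges that need an adaptation layer, adds up the costs, and keeps the cheapest combination. All names (`choose_combination`, the layer and routine names, the cost figures) are hypothetical and not taken from the disclosure. The rule used here for the adaptation layer (insert one whenever the schemas at the two ends of an edge differ) is a simplifying assumption.

```python
from itertools import product

def choose_combination(layers, edges, routine_cost, adapt_cost):
    """Exhaustively pick the cheapest combination of routines.

    layers:       {layer: [(routine, schema), ...]} candidate routines per layer
    edges:        [(src_layer, dst_layer), ...] data flow of the network
    routine_cost: {(layer, routine): measured time}
    adapt_cost:   {(src_schema, dst_schema): measured time of the adaptation layer}
    """
    names = list(layers)
    best = None
    # S10: enumerate every combination of one routine per layer.
    for combo in product(*(layers[n] for n in names)):
        chosen = dict(zip(names, combo))
        # S12: an adaptation layer is assumed necessary where schemas differ.
        adapters = [(s, d) for s, d in edges if chosen[s][1] != chosen[d][1]]
        # S14: total cost = routine costs + adaptation-layer costs.
        total = sum(routine_cost[(n, chosen[n][0])] for n in names)
        total += sum(adapt_cost[(chosen[s][1], chosen[d][1])] for s, d in adapters)
        # S16: keep the combination with the smallest total cost.
        if best is None or total < best[0]:
            best = (total, chosen, adapters)
    return best

# Hypothetical three-layer series network with made-up timings.
layers = {n: [("r_fp32", "fp32"), ("r_int8", "int8")] for n in ("L1", "L2", "L3")}
edges = [("L1", "L2"), ("L2", "L3")]
routine_cost = {
    ("L1", "r_fp32"): 5, ("L1", "r_int8"): 3,
    ("L2", "r_fp32"): 4, ("L2", "r_int8"): 1,
    ("L3", "r_fp32"): 2, ("L3", "r_int8"): 6,
}
adapt_cost = {("fp32", "int8"): 2, ("int8", "fp32"): 5}
total, chosen, adapters = choose_combination(layers, edges, routine_cost, adapt_cost)
```

With these made-up numbers, the per-layer fastest choices (int8, int8, fp32) cost 11 once the int8-to-fp32 conversion is counted, so the all-int8 combination at 10 is selected. This is the effect that including the adaptation-layer cost is meant to capture.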

(Details of calculating the total cost of a combination of calculation methods)
The number of combinations of calculation methods grows as the network structure becomes more complex, and calculating the total cost takes correspondingly more time. In particular, taking the adaptation layer AL1 into account complicates the combination patterns. The following describes an algorithm for determining the optimal calculation method for each layer of the network structure with the adaptation layer AL1 taken into account.

First, a formal definition of the network structure is given. The network structure is a directed acyclic graph (DAG) that consists of n layers and has one input and one output. The n layers are identified as L_1, L_2, ..., L_n, where L_1 is the input layer and L_n is the output layer. Each layer L_i has a set R_i of calculation methods (routines). R_i(λ) is the subset of R_i consisting of the routines whose calculation content (schema) is λ. Let T(r) be the processing time of a routine r obtained from the profile, that is, the processing time measured by executing the routine r in the execution environment. The fastest routine F_i(λ) among the routines in the subset R_i(λ) is then defined by the following formula (11).
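Formula (11) itself does not appear in this text. Under the definitions above, one consistent way to write it, given here as an assumption rather than as the published formula, is:

```latex
F_i(\lambda) = \operatorname*{arg\,min}_{r \in R_i(\lambda)} T(r) \tag{11}
```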

A combination of calculation methods is also called a routine path, because it traces the routines of the layers in order. The routine path S can be expressed by the following formula (12).

Here, if an edge (L_i, L_j) exists in the network structure and its two layers have different schemas, the adaptation routine r^a_{i,j}, which is the routine of the adaptation layer, is included in the routine path S. Based on this definition, the optimal routine path S^* is defined. The optimal routine path S^* is the routine path with the shortest overall processing time among all routine paths S of the network structure, and is expressed by the following formulas (13) and (14).

The optimal routine path S^* is the solution of the combinatorial optimization of formula (13).
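Formulas (12) to (14) do not appear in this text. The following LaTeX is a reconstruction from the surrounding description only, and its exact form is an assumption: a routine path collects one routine per layer plus the adaptation routines on edges whose schemas differ, and the optimal path minimizes the summed processing time T.

```latex
\begin{align}
S &= \{ r_1, \dots, r_n \} \cup \{\, r^{a}_{i,j} \mid (L_i, L_j) \text{ is an edge and } r_i, r_j \text{ have different schemas} \,\},
  \quad r_i \in R_i \tag{12} \\
S^{*} &= \operatorname*{arg\,min}_{S} T(S) \tag{13} \\
T(S) &= \sum_{r \in S} T(r) \tag{14}
\end{align}
```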

(Series structure)
First, the simplest network structure is described. FIG. 8(A) shows an example of a network structure connected in series, having a first layer L1, a second layer L2, and a third layer L3. Let the set of schemas in the routine path be Λ = {α, β}. That is, the optimal routine path S^* is calculated for the case where two schemas α and β exist. The schemas α and β may be set arbitrarily. For example, schema α may mean that the data processed by a layer is 32-bit floating point, and schema β may mean that the data processed by a layer is quantized to 8 bits. In this case, because either schema α or schema β can be executed at each of the n layers, the size of the search range for the optimal routine path S^* is O(2^n). Dynamic programming can be employed to reduce the size of the search range to O(n).
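Formula (15), the dynamic programming recurrence, does not appear in this text. A form consistent with the explanation that follows it, offered as an assumption, is shown below. Here fastest(·) is taken to return whichever of its arguments has the smaller total processing time T, and S_i(λ) denotes the best routine path ending at layer L_i with schema λ.

```latex
S_i(\alpha) = \{ F_i(\alpha) \} \cup
  \operatorname{fastest}\Bigl(
    S_{i-1}(\alpha),\;
    S_{i-1}(\beta) \cup \{ F^{a}_{i-1,i}(\beta \to \alpha) \}
  \Bigr) \tag{15}
```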

In formula (15), F^a_{i,j}(β→α) is the fastest routine for adapting from layer L_i to layer L_j. When i = 1, S_1(λ) = {F_1(λ)}, where λ is an element of the set Λ (λ ∈ Λ). Formula (15) means that the optimal routine path at layer L_i is obtained by selecting the fastest routine of layer L_i and calculating the optimal routine path at layer L_{i-1}. The latter half of formula (15) includes the calculation cost of adapting when the schema changes from that of S_{i-1}(α) or S_{i-1}(β). Formula (15) is written for schema α, and the same relationship holds for schema β. Based on formula (15), the proposition holds that the optimal routine path is S^* = fastest_{λ∈Λ} S_n(λ).
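The recurrence described above can be sketched in Python as follows. This is a minimal illustration under assumptions: `optimal_series_path`, the schema names, and the timings are hypothetical, each layer is represented only by the time of its fastest routine per schema (the role of F_i(λ)), and the adaptation cost is assumed to depend only on the pair of schemas.

```python
def optimal_series_path(fastest, adapt):
    """Dynamic programming over a series network.

    fastest: list (one entry per layer) of {schema: time of the fastest routine}
    adapt:   {(from_schema, to_schema): time of the fastest adaptation routine}
    Returns (total_time, [schema chosen at each layer]).
    """
    # Base case: S_1(lam) = {F_1(lam)}.
    best = {lam: (t, [lam]) for lam, t in fastest[0].items()}
    for layer in fastest[1:]:
        nxt = {}
        for lam, t_layer in layer.items():
            # Cheapest way to arrive at schema lam, paying for an adaptation
            # routine whenever the previous layer used a different schema.
            t_prev, path = min(
                ((t + (0 if prev == lam else adapt[(prev, lam)]), p)
                 for prev, (t, p) in best.items()),
                key=lambda c: c[0],
            )
            nxt[lam] = (t_prev + t_layer, path + [lam])
        best = nxt
    # S* is the fastest of S_n(lam) over all schemas lam.
    return min(best.values(), key=lambda c: c[0])

# Hypothetical timings for a three-layer series network.
fastest = [{"fp32": 5, "int8": 3}, {"fp32": 4, "int8": 1}, {"fp32": 2, "int8": 6}]
adapt = {("fp32", "int8"): 2, ("int8", "fp32"): 5}
total, path = optimal_series_path(fastest, adapt)
```

Each layer is visited once and only one partial path per schema is kept, so the number of candidates evaluated grows linearly with n rather than as 2^n.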

(Branched structure)
Next, an example in which the output of a layer branches in two is described. FIG. 8(B) shows a branched network structure. In a branched structure, the branch-source layer branches into a plurality of layers, and the plurality of outputs derived from the branch-source layer merge at a merging layer. In the example shown in FIG. 8(B), the output of the first layer L1 branches to the second layer L2 and the fourth layer L4. The branched paths merge at the fifth layer L5. The idea of the series structure can be extended to the branched structure. The optimal routine path S_5(λ_5) at the fifth layer L5, where the paths merge, can be obtained by calculating the sum of the routine paths S_3(λ_3) and S_4(λ_4), each including the adaptation routine for connecting the third layer L3 or the fourth layer L4 to the fifth layer L5. However, this calculation requires the constraint that the schema of the first layer L1, the branch source, be the same for the routine paths S_3(λ_3) and S_4(λ_4). If the schema of the first layer L1 were not the same, the routine path S_5(λ_5) would have two routines at the first layer L1, which is a contradiction. To reflect this constraint, the routine path is written as S_i(λ; C_i), and the definition of the routine path S_i(λ) is modified to include the constraint C_i. In other words, the routine of the branch-source layer is used as a constraint condition, and the combinations are determined for each constraint condition so that the layers following the branch-source layer satisfy the constraint condition. For example, the routine path S_5(λ; λ_1 = α) means that the routine of the first layer L1 must have schema α. With this extension, the optimal routine path at the fifth layer L5 is obtained by the following formula (16).

The fastest routine path is obtained by considering the combinations of the optimal routine paths S_3(λ_3) and S_4(λ_4) at the third layer L3 and the fourth layer L4. Each path has two candidates, depending on the selected schema. The fastest routine path can therefore be obtained by evaluating a total of four candidates. In a more general treatment, when there are m constrained layers, a constraint can be regarded as an element of the m-th power Λ^m of the set Λ of schemas. Let A = {a_1, ..., a_m} be the set of m constrained layers. The tuple λ_A = (λ_a1, ..., λ_am) ∈ Λ^m is then a combination of schema constraints on those layers. When the tuple λ_A is the constraint, the fastest routine F_j(λ_j) of a layer L_j with schema λ_j must be included in the optimal routine path S_i(λ_i; λ_A) that satisfies the constraint condition. Consequently, the optimal routine path S^* that satisfies the constraint condition is given by the following formula (17).
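The constrained evaluation for the single branch of FIG. 8(B) can be sketched in Python as below. This is an illustration under assumptions, not the formulas of the disclosure: the function names, schema names, and timings are hypothetical, the topology is taken to be L1→L2→L3→L5 and L1→L4→L5 as described in the text, and each layer is represented by the time of its fastest routine per schema.

```python
from itertools import product

def adapt_time(adapt, src, dst):
    """Adaptation cost between two schemas (zero when they match)."""
    return 0 if src == dst else adapt[(src, dst)]

def extend(table, layer_cost, adapt):
    """One step of the series recurrence: best cost of reaching each schema."""
    return {
        lam: t_layer + min(c + adapt_time(adapt, p, lam) for p, c in table.items())
        for lam, t_layer in layer_cost.items()
    }

def diamond_costs(cost, adapt, schemas):
    """Return {(lam5, lam1): cost of S_5(lam5; lam_1 = lam1)} for the diamond."""
    results = {}
    for lam1 in schemas:                      # constraint on the branch source L1
        src = {lam1: 0}                       # L1's own cost is added once below
        upper = extend(extend(src, cost["L2"], adapt), cost["L3"], adapt)
        lower = extend(src, cost["L4"], adapt)
        for lam5 in schemas:
            # Four candidates: every pairing of the schemas of L3 and L4.
            merged = min(
                upper[l3] + adapt_time(adapt, l3, lam5)
                + lower[l4] + adapt_time(adapt, l4, lam5)
                for l3, l4 in product(schemas, repeat=2)
            )
            results[(lam5, lam1)] = cost["L1"][lam1] + merged + cost["L5"][lam5]
    return results

# Hypothetical timings for schemas "a" and "b".
cost = {
    "L1": {"a": 2, "b": 1}, "L2": {"a": 3, "b": 1}, "L3": {"a": 3, "b": 1},
    "L4": {"a": 1, "b": 4}, "L5": {"a": 2, "b": 3},
}
adapt = {("a", "b"): 1, ("b", "a"): 1}
table = diamond_costs(cost, adapt, ("a", "b"))
best_cost = min(table.values())
```

Fixing the schema of L1 before extending the two branches ensures that both branches agree on a single routine for the branch source, which is the contradiction the constraint is meant to rule out.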

(Multiple branched structures)
Next, a case in which the network structure has a plurality of branched structures is described. FIG. 8(C) shows a network structure with multiple branches. In the example shown in FIG. 8(C), the output of the first layer L1 branches to the second layer L2 and the sixth layer L6. Further, the output of the second layer L2 branches to the third layer L3 and the fourth layer L4. The paths branched from the second layer L2 merge at the fifth layer L5, and the paths branched from the first layer L1 merge at the eighth layer L8. Extending the idea of the single branched structure described above, constraint conditions are set based on the first layer L1 and the second layer L2, which are the branch sources. Assume that there are two schemas. First, the optimal routine paths of the layers located downstream of the second layer L2 are calculated. Here, the layers located downstream of the second layer L2 are four layers: the third layer L3, the fourth layer L4, the fifth layer L5, and the eighth layer L8. A total of eight candidates are therefore evaluated, twice the number in FIG. 8(B). As a simple generalization, a network structure with b branches yields 2^b combinations of schemas.

Because the number of combinations increases exponentially with the number of branches, some means of reducing the computational cost is needed. One approach is to relax a constraint when branches are merged, that is, when branches join. For example, when the paths of the fifth layer L5 and the seventh layer L7 are merged at the eighth layer L8, the only constraint required is the schema of the first layer L1. This is because the constraint condition of the second layer L2 must already have been satisfied at the fifth layer L5, when the third layer L3 and the fourth layer L4 were merged. The constraint condition of the second layer L2 can therefore be relaxed at the fifth layer L5, as in the following formula (18).

In formula (18), the constraint on schema λ_2 is removed by selecting the faster path. The same relaxation can be applied to S_5(α; λ_1 = β), S_5(β; λ_1 = α), and S_5(β; λ_1 = β). More generally, let A_i be the set of layers through which all downstream paths of a branching layer L_i pass. The constraint on layer L_i can be relaxed at a layer L_j only if L_j belongs to the set A_i. If L_j ∈ A_i, all downstream paths of layer L_i pass through layer L_j, so no merge involving the constraint on layer L_i occurs downstream of layer L_j, and the constraint can be relaxed at layer L_j. If L_j ∉ A_i, however, there is a path ρ downstream of layer L_i that does not pass through layer L_j. That is, downstream of layer L_j there is a path that meets the path ρ at some layer L_k (j < k ≤ n). In this case, the constraint cannot be relaxed at layer L_j.
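One way to compute the sets A_i is sketched below in Python. It is an illustration only: the function name is hypothetical, and the edge list for FIG. 8(C) is inferred from the description (in particular, the edge L6→L7 is an assumption). The computation walks the graph from the output backwards and keeps, for each layer, the layers shared by every one of its downstream paths.

```python
from graphlib import TopologicalSorter

def relaxation_layers(edges):
    """Return {layer: A_layer}, the layers that every downstream path passes through."""
    succ = {}
    for s, d in edges:
        succ.setdefault(s, set()).add(d)
        succ.setdefault(d, set())
    # TopologicalSorter expects {node: predecessors}. Passing successors
    # instead yields an order in which each layer comes after its successors.
    order = TopologicalSorter(succ).static_order()
    through = {}
    for v in order:
        downstream = [through[s] for s in succ[v]]
        common = set.intersection(*downstream) if downstream else set()
        through[v] = {v} | common
    return {v: through[v] - {v} for v in succ}

# Edges assumed for FIG. 8(C).
edges = [
    ("L1", "L2"), ("L1", "L6"), ("L2", "L3"), ("L2", "L4"),
    ("L3", "L5"), ("L4", "L5"), ("L5", "L8"), ("L6", "L7"), ("L7", "L8"),
]
A = relaxation_layers(edges)
```

For this graph the set for L2 contains L5, which agrees with the statement that the constraint of the second layer L2 can be relaxed at the fifth layer L5, while the constraint of L1 can only be relaxed at L8.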

(Multiple inputs and outputs)
Next, a case in which the network structure has a plurality of inputs or a plurality of outputs is described. FIG. 8(D) shows a network structure having a plurality of inputs and outputs. As shown in FIG. 8(D), two layers, the first layer L1 and the second layer L2, function as input layers, and two layers, the fourth layer L4 and the fifth layer L5, function as output layers. To apply the discussion above, the concepts of an auxiliary input layer AU1 and an auxiliary output layer AU2 are introduced into the network structure. The auxiliary input layer AU1 is virtually introduced upstream of the first layer L1 and the second layer L2 in order to align the data input conditions of the first layer L1 and the second layer L2. That is, the calculation method of the auxiliary input layer AU1 is used as a constraint condition, and the schema path is determined so that the calculation methods of the first layer L1 and the second layer L2 match the calculation method of the auxiliary input layer AU1. Such determination of the schema path is made possible by virtually introducing the auxiliary input layer AU1. This allows the same discussion as for the branched structure described above, and avoids the inconsistency in which the data types or handling devices of the data of the first layer L1 and the second layer L2 differ because each of the two layers individually selected its own fastest calculation method.

The auxiliary output layer AU2 is virtually introduced downstream of the fourth layer L4 and the fifth layer L5 in order to align the data input conditions of the fourth layer L4 and the fifth layer L5. In other words, it is introduced to avoid the inconsistency in which the fourth layer L4 and the fifth layer L5 require different calculation methods of the upstream layer from which they branch (here, the third layer L3). The calculation method of the third layer L3 is used as a constraint condition, and the schema path is determined so that the calculation methods of the fourth layer L4 and the fifth layer L5 match the calculation method of the auxiliary output layer AU2. Such determination of the schema path is made possible by virtually introducing the auxiliary output layer AU2. This allows the same discussion as for the branched structure described above, and avoids the inconsistency in which the data types or handling devices of the data of the fourth layer L4 and the fifth layer L5 differ because each of the two layers individually selected its own fastest calculation method.
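The graph transformation described for the auxiliary layers can be sketched in Python as follows. The function name and the edge list are hypothetical; the topology assumed for FIG. 8(D) is L1→L3, L2→L3, L3→L4, L3→L5, inferred from the description. The sketch only adds the virtual layers so that the graph again has one input and one output; it does not assign calculation methods.

```python
def add_auxiliary_layers(edges, aux_in="AU1", aux_out="AU2"):
    """Add a virtual input and/or output layer when a DAG has several of them."""
    nodes = {n for edge in edges for n in edge}
    with_pred = {dst for _, dst in edges}
    with_succ = {src for src, _ in edges}
    inputs = sorted(nodes - with_pred)    # layers with no upstream layer
    outputs = sorted(nodes - with_succ)   # layers with no downstream layer
    new_edges = list(edges)
    if len(inputs) > 1:
        # Virtual layer upstream of, and connected to, every input layer.
        new_edges += [(aux_in, n) for n in inputs]
    if len(outputs) > 1:
        # Virtual layer downstream of, and connected to, every output layer.
        new_edges += [(n, aux_out) for n in outputs]
    return new_edges

# Topology assumed for FIG. 8(D).
edges = [("L1", "L3"), ("L2", "L3"), ("L3", "L4"), ("L3", "L5")]
augmented = add_auxiliary_layers(edges)
```

After this step AU1 behaves like a branch source and AU2 like a merging layer, so the branched-structure procedure can be applied without special cases.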

Next, a calculation method determination program for functioning as the calculation method determination system 100 is described. The calculation method determination program includes a main module, a determination module, a matching module, a calculation module, and a selection module. The main module is the part that controls the device as a whole. The functions realized by executing the determination module, the matching module, the calculation module, and the selection module are the same as the functions of the combination determination unit 14, the matching unit 15, the calculation unit 17, and the selection unit 18 described above, respectively.

The program is provided by, for example, a non-transitory recording medium such as a ROM or a semiconductor memory. The program may also be provided via communication such as a network.

(Summary of the first embodiment)
In the calculation method determination system 100, the total cost of each combination of calculation methods for the layers is calculated in consideration of the calculation cost of the adaptation layer AL1. The calculation method determination system 100 can therefore calculate the calculation cost of the entire network structure more accurately than when only the calculation costs of the individual layers are added. Accordingly, the calculation method determination system 100 can determine the calculation method for each layer of the network structure more appropriately. This reduces the calculation load of the recognition unit 11.

According to the calculation method determination system 100, a plurality of algorithms that output the same result by different operations are actually run in the execution environment, and the optimal algorithm can be determined based on the calculation costs obtained. By including the calculation cost of data transfer between different types of devices in the total cost, the calculation method determination system 100 can determine the calculation method for each layer of the network structure more appropriately. By including the calculation cost of data type conversion in the total cost, the calculation method determination system 100 can determine the calculation method for each layer of the network structure more appropriately. By including the calculation cost of changing the data layout in the total cost, the calculation method determination system 100 can determine the calculation method for each layer of the network structure more appropriately. By selecting, from among the plurality of combinations, the combination with the smallest total cost, the calculation method determination system 100 can determine the calculation method for each layer of the network structure most appropriately.

According to the calculation method determination system 100, the calculation methods of the layers following a branch-source layer are determined with the calculation method of the branch-source layer as a constraint condition, and the combinations are determined for each constraint condition, so that the occurrence of inconsistency can be suppressed. According to the calculation method determination system 100, generating an auxiliary input layer in a network structure having a plurality of input layers suppresses the occurrence of inconsistency. According to the calculation method determination system 100, generating an auxiliary output layer in a network structure having a plurality of output layers suppresses the occurrence of inconsistency.

[Second embodiment]
The calculation method determination system 100A according to the second embodiment differs from the calculation method determination system 100 according to the first embodiment in that the functions of the combination determination unit 14, the matching unit 15, the calculation unit 17, and the selection unit 18 reside on the providing device 40 side, and is otherwise the same. In the second embodiment, descriptions that overlap with the first embodiment are not repeated.

FIG. 9 is a functional block diagram of the calculation method determination system 100A according to the second embodiment. As shown in FIG. 9, the calculation method determination system 100A includes a providing device 40A and a terminal device 10A. The providing device 40A differs from the providing device 40 in that it includes the combination determination unit 14, the matching unit 15, the calculation unit 17, and the selection unit 18. The functions of the combination determination unit 14, the matching unit 15, the calculation unit 17, and the selection unit 18 are the same as those described in the first embodiment. The other configurations of the calculation method determination system 100A are the same as those of the calculation method determination system 100.

(Summary of the second embodiment)
In the calculation method determination system 100A, the total cost of each combination of calculation methods for the layers is calculated in consideration of the calculation cost of the adaptation layer AL1. The calculation method determination system 100A can therefore calculate the calculation cost of the entire network structure more accurately than when only the calculation costs of the individual layers are added. Accordingly, the calculation method determination system 100A can determine the calculation method for each layer of the network structure more appropriately. This reduces the calculation load of the recognition unit 11. In addition, compared with the calculation method determination system 100, the calculation method determination system 100A can minimize the functions installed in the terminal device 10A. This improves the maintainability of the system.

Note that the present disclosure is not limited to the above embodiments. Various modifications can be made to the present disclosure without departing from its gist.

Although the calculation method determination system 100 has been illustrated as including the terminal device 10 and the providing device 40 as hardware, it is not limited to this. For example, the calculation method determination system 100 may be configured as a collection of devices, one prepared for each function, connected via a communication network. Alternatively, the calculation method determination system 100 may be configured as a single piece of hardware capable of performing all the functions. Similarly, the calculation method determination system 100A may be configured from various pieces of hardware or from a single piece of hardware.

Because the calculation method determination system 100 determines the optimal calculation method for each layer of the recognition unit 11, one set of optimal calculation methods is provided for one recognition unit 11. However, the number of sets of optimal calculation methods provided is not limited to one, and two or more sets may be provided. In this case, the recognition unit 11 may, for example, execute each of the provided sets and select the set that was processed fastest.

The calculation method determination system 100 may deploy the optimal calculation method to other terminals having the same environment. For example, assume that there is a terminal device 10X having the same environment as the terminal device 10. The terminal device 10X is, for example, the same model as the terminal device 10. The calculation method determination system 100 may further include a providing unit that provides the calculation method selected by the selection unit 18 to the terminal device 10X. In this case, the calculation method determination system 100 can apply a calculation method determined on another terminal having the same environment to a terminal in that environment, without actually performing the calculations on that terminal.

10, 10A … terminal device, 11 … recognition unit, 12 … pre-calculation unit, 13 … candidate determination unit, 14 … combination determination unit, 15 … matching unit, 16 … cost acquisition unit, 17 … calculation unit, 18 … selection unit, 100, 100A … calculation method determination system.

Claims

1. A calculation method determination system that determines a calculation method for each layer of a network structure in an execution environment in which calculation for processing input data is performed using the network structure and weight data, the system comprising:
a determination unit that determines, based on at least one calculation method prepared in advance for each layer, a plurality of combinations of calculation methods for the layers from the input to the output of the network structure, wherein a plurality of calculation methods exist for at least one layer;
a matching unit that determines, for each of the plurality of combinations, whether or not the calculation methods of two layers that exchange data satisfy a predetermined relationship and, if the predetermined relationship is satisfied, determines an adaptation layer to be executed for the exchange of data between the two layers;
a calculation unit that calculates a total cost for each of the plurality of combinations based on the calculation costs of the calculation methods and the calculation cost of the adaptation layer; and
a selection unit that selects one combination from among the plurality of combinations based on the total cost calculated for each combination.

2. The calculation method determination system according to claim 1, wherein the at least one calculation method is executable in the execution environment and includes a plurality of algorithms each performing the same function by a different operation.

3. The calculation method determination system according to claim 1 or 2, wherein
the at least one calculation method is executable in the execution environment and includes a plurality of algorithms each performing the same function using a different device, and
the adaptation layer executes processing of data transfer between different types of devices.

4. The calculation method determination system according to any one of claims 1 to 3, wherein
the at least one calculation method is executable in the execution environment and includes a plurality of algorithms each performing the same operation on the data of a different data type, and
the adaptation layer executes processing of data type conversion.

5. The calculation method determination system according to any one of claims 1 to 4, wherein
the at least one calculation method is executable in the execution environment and includes a plurality of algorithms each performing the same operation on the data having a different channel position, and
the adaptation layer executes processing of changing the data layout.

6. The calculation method determination system according to any one of claims 1 to 5, wherein the selection unit selects, from among the plurality of combinations, the combination with the smallest total cost.

7. The calculation method determination system according to any one of claims 1 to 6, wherein, when the network structure has a branched structure in which a branch-source layer branches into a plurality of layers and a plurality of outputs derived from the branch-source layer merge at a merging layer, the determination unit uses the calculation method of the branch-source layer as a constraint condition and determines the combinations for each constraint condition so that the layers following the branch-source layer satisfy the constraint condition.

The calculation method determination system according to any one of claims 1 to 7, wherein, when the network structure has a plurality of input layers, the determination unit generates, upstream of the plurality of input layers, an auxiliary input layer connected to each of the plurality of input layers, uses the calculation method of the auxiliary input layer as a constraint condition, and determines the combinations for each constraint condition such that the layers following the auxiliary input layer satisfy the constraint condition.

The calculation method determination system according to any one of claims 1 to 7, wherein, when the network structure has a plurality of output layers, the determination unit generates, downstream of the plurality of output layers, an auxiliary output layer connected to each of the plurality of output layers, uses as a constraint condition the calculation method of a layer at which the network branches upstream of the plurality of output layers, and determines the combinations for each constraint condition such that the constraint condition is satisfied.

A calculation method determination method for determining a calculation method for each layer of a network structure in an execution environment in which calculations for processing input data are performed using the network structure and weight data, wherein a determination unit, an adaptation unit, a calculation unit, and a selection unit are implemented in a computer, the method comprising:
a determination step in which the determination unit determines a plurality of combinations of calculation methods for the layers from the input to the output of the network structure, based on at least one calculation method prepared in advance for each layer, wherein a plurality of calculation methods exist for at least one layer;
an adaptation step in which the adaptation unit determines, for each of the plurality of combinations, whether the calculation methods of two layers that exchange data satisfy a predetermined relationship and, when the predetermined relationship is satisfied, determines an adaptation layer to be executed for the data exchange between the two layers;
a calculation step in which the calculation unit calculates a total cost for each of the plurality of combinations based on the calculation cost of the calculation methods and the calculation cost of the adaptation layer; and
a selection step in which the selection unit selects one combination from among the plurality of combinations based on the total cost calculated for each combination.
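The four steps of the method can be sketched as follows for a simple chain of layers. This is a minimal illustration under stated assumptions, not the patented implementation: the candidate table, the cost values, and all names (`CANDIDATES`, `CAST_COST`, `RELAYOUT_COST`, `adaptation_cost`, `total_cost`, `select_combination`) are hypothetical, the "predetermined relationship" is assumed to hold when two adjacent methods differ in data type or layout, and the combinations are enumerated exhaustively only because the example is small.

```python
from itertools import product

# Hypothetical candidates per layer: (name, data type, layout, calculation cost).
CANDIDATES = [
    [("conv_f32_nchw", "f32", "NCHW", 10.0), ("conv_i8_nhwc", "i8", "NHWC", 7.0)],
    [("relu_f32_nchw", "f32", "NCHW", 1.0), ("relu_i8_nhwc", "i8", "NHWC", 0.9)],
    [("fc_f32_nchw", "f32", "NCHW", 6.0)],
]

# Hypothetical costs of the two kinds of adaptation layer.
CAST_COST = 2.0      # data type conversion
RELAYOUT_COST = 3.0  # data layout change


def adaptation_cost(prev, nxt):
    """Adaptation step: cost of the adaptation layers needed between two methods."""
    cost = 0.0
    if prev[1] != nxt[1]:  # data types differ, so a conversion is assumed necessary
        cost += CAST_COST
    if prev[2] != nxt[2]:  # layouts differ, so a layout change is assumed necessary
        cost += RELAYOUT_COST
    return cost


def total_cost(combo):
    """Calculation step: method costs plus adaptation-layer costs for one combination."""
    compute = sum(method[3] for method in combo)
    adapt = sum(adaptation_cost(a, b) for a, b in zip(combo, combo[1:]))
    return compute + adapt


def select_combination(candidates):
    """Determination and selection steps: enumerate combinations, keep the cheapest."""
    return min(product(*candidates), key=total_cost)


best = select_combination(CANDIDATES)
print([method[0] for method in best], total_cost(best))
```

With these made-up numbers, picking the cheapest method layer by layer would choose the two `i8` methods followed by `fc_f32_nchw`, but the conversion back to `f32`/`NCHW` before the last layer makes that combination more expensive overall than keeping every layer in `f32`/`NCHW`. This is why the adaptation-layer cost is included in the total.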

A calculation method determination program that causes a computer to operate so as to determine a calculation method for each layer of a network structure in an execution environment in which calculations for processing input data are performed using the network structure and weight data, the program causing the computer to function as:
a determination unit that determines a plurality of combinations of calculation methods for the layers from the input to the output of the network structure, based on at least one calculation method prepared in advance for each layer, wherein a plurality of calculation methods exist for at least one layer;
an adaptation unit that determines, for each of the plurality of combinations, whether the calculation methods of two layers that exchange data satisfy a predetermined relationship and, when the predetermined relationship is satisfied, determines an adaptation layer to be executed for the data exchange between the two layers;
a calculation unit that calculates a total cost for each of the plurality of combinations based on the calculation cost of the calculation methods and the calculation cost of the adaptation layer; and
a selection unit that selects one combination from among the plurality of combinations based on the total cost calculated for each combination.