CN110532291B

CN110532291B - Method and system for converting deep learning frame model based on minimum execution cost

Info

Publication number: CN110532291B
Application number: CN201910676904.9A
Authority: CN
Inventors: 何文婷; 程学旗; 钟巧灵; 张志斌; 郭嘉丰; 赵鹏
Original assignee: Institute of Computing Technology of CAS
Current assignee: Institute of Computing Technology of CAS
Priority date: 2019-07-25
Filing date: 2019-07-25
Publication date: 2022-07-12
Anticipated expiration: 2039-07-25
Also published as: CN110532291A

Abstract

The invention provides a method and a system for converting a model between deep learning frameworks based on minimum execution cost. Building on the prior art, the method adds an execution cost value to each operation conversion and supplements the mapping table with fusion mappings for the cases where several independent operations can be fused. Model conversion is carried out as the conversion of the operations that make up the model; in this stage, the converted model structure with the lowest execution cost is obtained by a dynamic programming algorithm according to the model conversion mapping table. Through operation fusion, the invention reduces the reading and writing of intermediate results between operations, which optimizes computing performance and storage space and thereby lowers the execution cost of the converted model. When several fusion options exist, the conversion with the minimum execution cost is obtained by the dynamic programming algorithm.

Description

Method and system for converting deep learning frame model based on minimum execution cost

Technical Field

The invention relates to the field of deep learning, and in particular to a method and a system for converting models between deep learning frameworks based on minimum execution cost, applicable in particular to model conversion between different frameworks.

Background

Current deep learning models may be implemented with different frameworks. After a model is trained under framework A, the user can convert it into a model under framework B, so that inference can be run directly under framework B without retraining the model. As shown in fig. 1, each model is composed of a plurality of different operations, and each operation (OP, operator) generally has an operation type, inputs (zero or more), outputs (one or more), and attributes. As shown in fig. 2, the operations are connected to one another through their inputs and outputs; the model structure of fig. 2 includes five OPs: Conv, SpatialBN, Relu, AveragePool, and FC. The Conv operation, for example, takes data and weight as inputs and has a single output, which is one of the inputs of the SpatialBN operation.

A model can therefore be defined as a sequence of n operations: Model = OP₁, OP₂, OP₃, …, OPₙ.
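As a minimal sketch of this definition, the model of fig. 2 can be written as an ordered list of operation records. The class name `Op`, the tensor names, and the attribute values below are assumptions made for illustration; they are not taken from any particular framework.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """One operation: a type, inputs, outputs, and attributes (cf. fig. 1)."""
    op_type: str
    inputs: list[str]
    outputs: list[str]
    attrs: dict = field(default_factory=dict)


# Hypothetical tensor names; only the operation types come from fig. 2.
model = [
    Op("Conv", ["data", "conv_w"], ["conv_out"], {"kernel": 3}),
    Op("SpatialBN", ["conv_out", "bn_scale", "bn_bias"], ["bn_out"]),
    Op("Relu", ["bn_out"], ["relu_out"]),
    Op("AveragePool", ["relu_out"], ["pool_out"]),
    Op("FC", ["pool_out", "fc_w", "fc_b"], ["pred"]),
]

# The single output of Conv is one of the inputs of SpatialBN.
conv_feeds_bn = model[0].outputs[0] in model[1].inputs
```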

Existing model conversion between frameworks such as PyTorch, Caffe2, TensorFlow, and ONNX works as follows: each node/layer in the model is traversed, each operation of the original model is mapped one by one to an operation under the new framework, and the converted operations are finally assembled into the model structure required by the new framework.

During model conversion, one-to-one mapping is performed according to an operation conversion table. For example, the mapping relationship is shown in table 1:

Original operation	Target operation
Conv	Conv
SpatialBN	BatchNormalization
Relu	Relu
AveragePool	GlobalAveragePool
FC	Gemm
…	…

TABLE 1

The model structure of fig. 2 is converted, according to the operation conversion rules of table 1, into the model structure under the new framework shown in fig. 3. This approach is simple and clear: model conversion only requires an operation mapping table between the frameworks and one-to-one matching against that table.
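The one-to-one conversion described above amounts to a table lookup per operation. The sketch below assumes the model is given simply as a list of operation type names; `TABLE_1` and `convert_one_to_one` are hypothetical names, and the table holds only the rows shown in table 1.

```python
# Rows of table 1: original operation type -> target operation type.
TABLE_1 = {
    "Conv": "Conv",
    "SpatialBN": "BatchNormalization",
    "Relu": "Relu",
    "AveragePool": "GlobalAveragePool",
    "FC": "Gemm",
}


def convert_one_to_one(op_types):
    """Map each original operation to its target operation, in order."""
    converted = []
    for op_type in op_types:
        if op_type not in TABLE_1:
            raise KeyError(f"no mapping rule for operation {op_type!r}")
        converted.append(TABLE_1[op_type])
    return converted


fig2_model = ["Conv", "SpatialBN", "Relu", "AveragePool", "FC"]
converted_model = convert_one_to_one(fig2_model)
```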

However, the current model conversion method has the following problem: it does not consider the case where multiple independent operations can be fused. For example, X1 and X2 map individually as X1 -> Y1 and X2 -> Y2, but when the output of X1 is the input of X2, the pair can be fused as X1 + X2 -> Y3. If the cost of executing Y3 is lower than that of Y1 plus Y2, the model conversion should use Y3 rather than Y1 and Y2.

Disclosure of Invention

The invention aims to solve the prior-art problem that the converted model has an excessively high execution cost when a deep learning model is converted between different frameworks, and provides a method and a system for converting models between deep learning frameworks based on minimum execution cost, which generate the model with the minimum execution cost.

By fusing several operations of the original framework, the technique provided by the invention reduces the reading and writing of the intermediate results of those fusible operations, thereby optimizing storage space and computing performance. In addition, when the operations of the whole model are converted, the converted model with the minimum execution cost under the new framework is obtained by a dynamic programming method.

To address the defects of the prior art, the invention provides a method for converting models between deep learning frameworks based on minimum execution cost, comprising the following steps:

step 1, obtaining an operation conversion table between an original deep learning framework and a target deep learning framework, wherein the operation conversion table comprises mapping relations from original operations in the original deep learning framework to target operations in the target deep learning framework; adding to the operation conversion table fusion mapping relations, in which a plurality of operations in the original deep learning framework are fused and mapped to a target operation in the target deep learning framework, to form a mapping rule table; and taking the computation amount or complexity of the converted operation of each mapping relation in the mapping rule table as its execution cost;

step 2, inputting an original model structure under the original deep learning framework, traversing the operation sequence in the original model structure to obtain an original operation queue, and sequentially extracting operations from the original operation queue as the current operation;

step 3, searching the mapping rule table with the current operation to obtain the matching mapping relations, selecting the mapping relation with the minimum execution cost among them as the current rule, taking the target operation in the current rule as the conversion result of the current operation, and appending the conversion result to a target operation queue in order;

step 4, repeating step 2 and step 3 until every operation in the original operation queue has a conversion result, and outputting the model structure formed by the target operation queue under the target deep learning framework as the model conversion result.

In the method for converting models between deep learning frameworks based on minimum execution cost, the current operation comprises a single operation and a plurality of operations in the original operation queue.

In the method for converting models between deep learning frameworks based on minimum execution cost, the plurality of operations is: a subset of the original operation queue headed by the single operation, the operations in the subset being in a serial relation, where a serial relation means that the output of the previous operation is the input of the next operation.
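To illustrate the serial relation, the sketch below lists the candidate operation groups headed by a given operation: the single operation itself, then each longer run in which every operation consumes an output of the one before it. The tuple representation `(name, inputs, outputs)`, the sample queue, and the function name `serial_groups` are assumptions for illustration.

```python
def serial_groups(queue, start):
    """Return the runs queue[start:start+k] whose operations are chained.

    Each operation is a tuple (name, inputs, outputs). As an assumption of
    this sketch, a run is extended only while some output of the previous
    operation is an input of the next one.
    """
    groups = [[queue[start][0]]]
    for nxt in range(start + 1, len(queue)):
        prev_outputs = set(queue[nxt - 1][2])
        if not prev_outputs & set(queue[nxt][1]):
            break  # chain broken: the next operation reads a different tensor
        groups.append(groups[-1] + [queue[nxt][0]])
    return groups


# Hypothetical queue: X1 -> X2 -> X3 are chained, X4 reads a different tensor.
queue = [
    ("X1", ["in"], ["a"]),
    ("X2", ["a"], ["b"]),
    ("X3", ["b"], ["c"]),
    ("X4", ["other"], ["d"]),
]
groups_from_x1 = serial_groups(queue, 0)
```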

The deep learning inter-frame model conversion method based on the minimum execution cost is characterized in that the execution cost of the model conversion result is as follows:

wherein OP₁, OP₂, OP₃, …, OPₙ represent the first to the n-th operations of the original operation queue; Cost(OPᵢ) represents the operation cost of converting the i-th operation independently; and Cost(OPᵢ+OPᵢ₊₁) represents the operation cost after fusing the i-th operation with the (i+1)-th operation.

In the method for converting models between deep learning frameworks based on minimum execution cost, the current operation includes a plurality of operations in the original operation queue, and step 3 includes: selecting the set with the minimum execution cost from the non-empty power set of the current operation, and taking the mapping relation of that set in the mapping rule table as the current rule.

The invention also provides a system for converting models between deep learning frameworks, comprising:

module 1, which obtains an operation conversion table between an original deep learning framework and a target deep learning framework, wherein the operation conversion table comprises mapping relations from original operations in the original deep learning framework to target operations in the target deep learning framework; adds to the operation conversion table fusion mapping relations, in which a plurality of operations in the original deep learning framework are fused and mapped to a target operation in the target deep learning framework, to form a mapping rule table; and takes the computation amount or complexity of the converted operation of each mapping relation in the mapping rule table as its execution cost;

module 2, which inputs an original model structure under the original deep learning framework, traverses the operation sequence in the original model structure to obtain an original operation queue, and sequentially extracts operations from the original operation queue as the current operation;

module 3, which searches the mapping rule table with the current operation to obtain the matching mapping relations, selects the mapping relation with the minimum execution cost among them as the current rule, takes the target operation in the current rule as the conversion result of the current operation, and appends the conversion result to a target operation queue in order;

module 4, which repeatedly invokes module 2 and module 3 until every operation in the original operation queue has a conversion result, and outputs the model structure formed by the target operation queue under the target deep learning framework as the model conversion result.

In the system for converting models between deep learning frameworks, the current operation comprises a single operation and a plurality of operations in the original operation queue.

In the system for converting models between deep learning frameworks, the plurality of operations is: a subset of the original operation queue headed by the single operation, the operations in the subset being in a serial relation, where a serial relation means that the output of the previous operation is the input of the next operation.

The model conversion system between the deep learning frameworks is characterized in that the execution cost of the model conversion result is as follows:

wherein OP₁, OP₂, OP₃, …, OPₙ represent the first to the n-th operations of the original operation queue; Cost(OPᵢ) represents the operation cost of converting the i-th operation independently; and Cost(OPᵢ+OPᵢ₊₁) represents the operation cost after fusing the i-th operation with the (i+1)-th operation.

In the system for converting models between deep learning frameworks, the current operation includes a plurality of operations in the original operation queue, and module 3 includes: selecting the set with the minimum execution cost from the non-empty power set of the current operation, and taking the mapping relation of that set in the mapping rule table as the current rule.

The above scheme gives the invention the following advantages:

Compared with the prior art, the invention reduces the reading and writing of intermediate results between operations through operation fusion, which optimizes computing performance and storage space and thereby lowers the execution cost of the converted model. When several fusion options exist, the conversion with the minimum execution cost is obtained by a dynamic programming algorithm.

Drawings

FIG. 1 is a basic definition diagram of a single operation;

FIG. 2 is a schematic structural diagram of a single model;

FIG. 3 is a diagram of the model structure under the new framework after conversion;

FIG. 4 is a detailed flow chart of the present invention;

FIG. 5 is a block diagram of a pre-conversion model according to an embodiment of the present invention;

FIG. 6 is a diagram of a transformed model structure according to an embodiment of the present invention.

Detailed Description

In researching how to obtain an optimal model when converting a deep learning model between different frameworks, the inventors found that the defect in the prior art arises because only the conversion of single, independent operations is considered: the prior art does not consider fusing several independent operations into one operation, and therefore cannot determine which of the available conversion options, fused or unfused, has the lowest execution cost. Compared with the existing method, the model conversion method of the invention obtains a model structure with lower execution cost.

The invention mainly comprises the following key points:

Key point 1: when a model is converted between different frameworks, operation fusion is considered for the operation conversion according to a preset list, and the mapping relations of operations that can be fused are added to the mapping table used during operation conversion. The technical effect is that several operations are fused and the reading and writing of the intermediate results of those fusible operations is reduced, thereby optimizing storage space and computing performance.

The operation mapping table in this key point is not merely a mapping between operations; it also includes the execution cost after operation conversion. That is, each entry of the table in the invention includes at least: the original operation, the converted target operation, and the execution cost of the target operation. For example, an operation mapping table as shown in table 2 below is generated:

Original OP	OP after conversion	Execution cost
X1	Y1	2
X2	Y2	2
X3	Y3	3
X4	Y4	1
X1+X2	Y2	2
X2+X3	Y5	4.5
X1+X2+X3	Y6	5
…	…	…

TABLE 2

Note: X1+X2 indicates that, when the output of operation X1 is the input of X2, the two operations can be fused and converted into the single operation Y2.

Key point 2: when there are several choices for operation fusion, a model structure with the globally optimal execution cost is obtained by a dynamic programming method. The technical effect is that the model structure finally obtained by conversion has the lowest cost when executed.

specifically, the calculation method of the model cost is as follows:

the specific meanings of the variables in this formula are as follows:

OP₁, OP₂, OP₃, …, OPₙ: the original operation queue of the model under the original framework, in order from the first operation to the n-th operation.

Cost(OPᵢ): the cost of converting the i-th operation independently. From the table above, for example, Cost(X1) = 2.

Cost(OPᵢ+OPᵢ₊₁): the cost after fusing operations that are connected consecutively by input and output. If the operations cannot be fused, the cost is infinite and the converted OP in the table is replaced by a special symbol, for example "?". From the table above, Cost(X1+X2) = 2 and Cost(X3+X4) = +∞.
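A minimal sketch of such a table lookup, using the rows of table 2: a fused group that has no rule gets an infinite cost and the placeholder target "?". The names `TABLE_2` and `lookup` are hypothetical.

```python
import math

# Rows of table 2: tuple of original operations -> (target operation, cost).
TABLE_2 = {
    ("X1",): ("Y1", 2),
    ("X2",): ("Y2", 2),
    ("X3",): ("Y3", 3),
    ("X4",): ("Y4", 1),
    ("X1", "X2"): ("Y2", 2),
    ("X2", "X3"): ("Y5", 4.5),
    ("X1", "X2", "X3"): ("Y6", 5),
}


def lookup(*ops):
    """Return (target, cost); groups that cannot be fused cost +infinity."""
    return TABLE_2.get(tuple(ops), ("?", math.inf))


fused_12 = lookup("X1", "X2")   # a fusion rule exists
fused_34 = lookup("X3", "X4")   # no fusion rule in table 2
```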

The dynamic programming method is as follows: Cost(X1, X2) = min(Cost(X1) + Cost(X2), Cost(X1+X2)) = min(2+2, 2) = 2.
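Generalizing the two-operation case, one recurrence that is consistent with the definitions above and with the worked examples in this description can be written as follows. The suffix notation Cost(OP_i, …, OP_n), the group length k, and the empty-suffix base case are assumptions introduced here for illustration; fused groups without a rule take the cost +∞.

```latex
\mathrm{Cost}(OP_i, \dots, OP_n)
  = \min_{1 \le k \le n-i+1}
    \Bigl[ \mathrm{Cost}(OP_i + \dots + OP_{i+k-1})
         + \mathrm{Cost}(OP_{i+k}, \dots, OP_n) \Bigr],
\qquad \mathrm{Cost}(\varnothing) = 0 .
```

With k = 1 the first term is the independent conversion cost Cost(OP_i); with k = n-i+1 the remaining suffix is empty and contributes 0.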

The model conversion system based on minimum cost evaluation provided by the invention can be realized in two stages:

Operation mapping table generation stage: building on the prior art, an evaluation value of the operation conversion cost is added (the smaller the value, the better), and fusion mappings are supplemented for the cases where several independent operations can be fused.

Model conversion stage: model conversion is carried out as the conversion of the operations that make up the model; in this stage, the converted model structure with the lowest execution cost is obtained by a dynamic programming algorithm according to the model conversion mapping table. For example, if the model structure, ordered by its input-output relations, is X1, X2, X3, X4, then the minimum cost of the model conversion according to the table above is:

Cost(X1, X2, X3, X4) = Cost(X1+X2+X3) + Cost(X4) = 5 + 1 = 6.
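The minimum above can be checked with a short memoized recursion over the operation queue. This is a sketch under the assumption that only consecutive runs of the queue are fusion candidates; `COST` holds the costs of table 2, and `min_cost` is a hypothetical name.

```python
import math
from functools import lru_cache

# Costs from table 2, keyed by the run of original operations being converted.
COST = {
    ("X1",): 2, ("X2",): 2, ("X3",): 3, ("X4",): 1,
    ("X1", "X2"): 2, ("X2", "X3"): 4.5, ("X1", "X2", "X3"): 5,
}


def min_cost(queue):
    """Minimum total cost of converting the whole queue (top-down, memoized)."""
    queue = tuple(queue)

    @lru_cache(maxsize=None)
    def solve(i):
        if i == len(queue):
            return 0.0  # nothing left to convert
        best = math.inf
        for k in range(1, len(queue) - i + 1):
            # A run without a rule cannot be fused: its cost is +infinity.
            group_cost = COST.get(queue[i:i + k], math.inf)
            best = min(best, group_cost + solve(i + k))
        return best

    return solve(0)


cost_x1_to_x4 = min_cost(["X1", "X2", "X3", "X4"])
```

Under table 2 the value 6 is reached both by X1+X2+X3 followed by X4 (5 + 1) and by X1+X2, X3, X4 (2 + 3 + 1), so the minimum is not unique.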

in order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.

The specific embodiment of the invention comprises two stages, namely an operation mapping rule generation stage and a model conversion stage. The specific flow of the two stages is shown in fig. 4.

The specific execution process of the operation mapping rule generation stage is as follows:

1. Acquiring the original framework of the model to be converted and the target framework of the converted model. When models are converted between different frameworks, an operation of the original framework may be converted to A under framework 1, while under framework 2 it may be converted to A or B; the two cases follow no common rule. Therefore, the original and target frameworks of the model need to be determined first.

2. Determining the mapping relation of each operation between the original framework and the target framework by judging whether the semantics of the operations in the two frameworks are consistent, wherein the mapping relations comprise the mapping information of each independent operation under the original framework and the mapping information of fusions of several operations under the original framework. This information is filled into an operation mapping rule table. The basic elements of the table are: the original operation, the converted operation (target operation), and the execution cost of the converted operation.

3. Calculating the execution cost of the converted operation for each operation mapping relation in the operation mapping rule table. There are several ways to calculate the execution cost, for example measuring the execution cost of an operation by the amount of computation or the complexity it requires.

4. Outputting the operation mapping rule table for the given original and target frameworks, for use in model conversion.

The specific implementation process of the model conversion stage is as follows:

1. Inputting the original model structure under the original framework;

2. Traversing the operations in the original model structure to form a topological ordering (the original operation queue);

3. Matching the current operation against the operation mapping rule table (the next operation, the one after it, and so on are visible at the same time);

4. If only one rule is matched, converting directly; otherwise, when several rules are matched, calculating the minimum execution cost of the operation set currently being converted by using a dynamic programming algorithm;

5. Finally, outputting the converted model structure with the minimum execution cost, adapted to the new framework.
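Step 2 of the model conversion stage can be sketched with Python's standard `graphlib` module, deriving the dependencies from which operation produces each input tensor. The operation records and tensor names below are assumptions for illustration, listed out of order on purpose.

```python
from graphlib import TopologicalSorter

# Hypothetical records for the model of fig. 2: name -> (inputs, outputs).
ops = {
    "FC": (["pool_out", "fc_w"], ["pred"]),
    "Relu": (["bn_out"], ["relu_out"]),
    "Conv": (["data", "conv_w"], ["conv_out"]),
    "AveragePool": (["relu_out"], ["pool_out"]),
    "SpatialBN": (["conv_out"], ["bn_out"]),
}

# Which operation produces each tensor.
producer = {t: name for name, (_, outs) in ops.items() for t in outs}

# Each operation depends on the producers of its inputs; external inputs
# such as "data" or the weights have no producer and add no dependency.
deps = {
    name: {producer[t] for t in ins if t in producer}
    for name, (ins, _) in ops.items()
}

# static_order() yields every operation after all of its predecessors.
original_queue = list(TopologicalSorter(deps).static_order())
```

Because this model is a single chain, the ordering is unique; for models with branches, any valid topological order would serve as the original operation queue.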

A specific example of the above embodiment is as follows:

The original framework is A and the target framework is B; the model is shown in fig. 5.

S1, constructing an operation mapping rule table between frameworks A and B. The table includes three elements: the original operation, the converted operation, and the execution cost of the converted operation.

S2, filling the mapping relation of the operation and calculating the execution cost of each converted operation:

S3, for this model, the operations to be converted are X1, X2, X3, X4 and X5. The cost of converting each operation independently and the execution cost of fusing several operations are obtained by matching against the mapping table.

S3.1, when the original operation of X1 is matched, the following rules can be matched:

Original OP	OP after conversion	Execution cost
X1	Y1	2
X1+X2	Y2	1
X1+X2+X3	Y6	4.5

Then Cost(X1, X2, X3, X4, X5) = min(Cost(X1) + Cost(X2, X3, X4, X5), Cost(X1+X2) + Cost(X3, X4, X5), Cost(X1+X2+X3) + Cost(X4, X5)) = min(2 + Cost(X2, X3, X4, X5), 1 + Cost(X3, X4, X5), 4.5 + Cost(X4, X5));

S3.2, when the original operation X2 is matched, the following rules are matched:

Original OP	OP after conversion	Execution cost
X2	Y2	2
X2+X3	Y5	2

Therefore, Cost(X2, X3, X4, X5) = min(Cost(X2) + Cost(X3, X4, X5), Cost(X2+X3) + Cost(X4, X5)) = min(2 + Cost(X3, X4, X5), 2 + Cost(X4, X5));

S3.3, when the original operation X3 is matched, the following rules are matched:

Original OP	OP after conversion	Execution cost
X3	Y3	3
X3+X4	Y4	5

Therefore, Cost(X3, X4, X5) = min(Cost(X3) + Cost(X4, X5), Cost(X3+X4) + Cost(X5)) = min(3 + Cost(X4, X5), 5 + Cost(X5));

S3.4, when the original operation X4 is matched, the following rules are matched:

Original OP	OP after conversion	Execution cost
X4	Y4	5
X4+X5	Y8	6

Therefore, Cost(X4, X5) = min(Cost(X4) + Cost(X5), Cost(X4+X5)) = min(5 + Cost(X5), 6);

S3.5, when the original operation X5 is matched, the following rule is matched:

Original OP	OP after conversion	Execution cost
X5	Y7	2

This gives Cost(X5) = 2.

S3.6, substituting back step by step gives:

Cost(X4, X5) = min(5 + Cost(X5), 6) = min(5+2, 6) = 6

Cost(X3, X4, X5) = min(3 + Cost(X4, X5), 5 + Cost(X5)) = min(3+6, 5+2) = 7

Cost(X2, X3, X4, X5) = min(2 + Cost(X3, X4, X5), 2 + Cost(X4, X5)) = min(2+7, 2+6) = 8

Cost(X1, X2, X3, X4, X5) = min(2 + Cost(X2, X3, X4, X5), 1 + Cost(X3, X4, X5), 4.5 + Cost(X4, X5)) = min(2+8, 1+7, 4.5+6) = 8

The lowest cost for converting the entire model is thus 8, obtained as Cost(X1+X2) + Cost(X3+X4) + Cost(X5) = 1 + 5 + 2.

S4, the final converted model obtained by the dynamic programming algorithm in step S3 has an execution cost of 8 = Cost(X1+X2) + Cost(X3+X4) + Cost(X5), which is lower than the cost of the original one-to-one mapping, 2+2+3+5+2 = 14, under the following one-to-one rules:

Original OP	OP after conversion	Execution cost
X1	Y1	2
X2	Y2	2
X3	Y3	3
X4	Y4	5
X5	Y7	2

The conversion rules actually selected are as follows:

Original OP	OP after conversion	Execution cost
X1+X2	Y2	1
X5	Y7	2
X3+X4	Y4	5

The resulting transformed model structure is shown in fig. 6 (the parameters and inputs required for the operations after transformation need to be reset according to the operation definitions).
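The worked example S1 to S4 can be reproduced with a bottom-up dynamic program that also records which rule was chosen at each position. This is a sketch, assuming fusion candidates are consecutive runs of the queue; `RULES` copies the rules matched in S3.1 to S3.5, and `convert` is a hypothetical name.

```python
import math

# Rules matched in S3.1 to S3.5: original group -> (target operation, cost).
RULES = {
    ("X1",): ("Y1", 2), ("X1", "X2"): ("Y2", 1), ("X1", "X2", "X3"): ("Y6", 4.5),
    ("X2",): ("Y2", 2), ("X2", "X3"): ("Y5", 2),
    ("X3",): ("Y3", 3), ("X3", "X4"): ("Y4", 5),
    ("X4",): ("Y4", 5), ("X4", "X5"): ("Y8", 6),
    ("X5",): ("Y7", 2),
}


def convert(queue):
    """Return (minimum cost, chosen rules) for the whole operation queue."""
    n = len(queue)
    best = [math.inf] * n + [0.0]   # best[i]: min cost of converting queue[i:]
    take = [0] * n                  # take[i]: size of the group chosen at i
    for i in range(n - 1, -1, -1):
        for k in range(1, n - i + 1):
            rule = RULES.get(tuple(queue[i:i + k]))
            if rule is not None and rule[1] + best[i + k] < best[i]:
                best[i], take[i] = rule[1] + best[i + k], k
    if math.isinf(best[0]):
        raise ValueError("no rule sequence covers the whole queue")
    plan, i = [], 0
    while i < n:                    # walk forward along the recorded choices
        group = tuple(queue[i:i + take[i]])
        plan.append(("+".join(group), RULES[group][0]))
        i += take[i]
    return best[0], plan


total, plan = convert(["X1", "X2", "X3", "X4", "X5"])
one_to_one = sum(RULES[(op,)][1] for op in ["X1", "X2", "X3", "X4", "X5"])
```

Here `total` evaluates to 8 with the plan X1+X2 -> Y2, X3+X4 -> Y4, X5 -> Y7, against 14 for the one-to-one mapping, matching S3.6 and S4.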

The following are system examples corresponding to the above method examples, and this embodiment can be implemented in cooperation with the above embodiments. The related technical details mentioned in the above embodiments are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related technical details mentioned in the present embodiment can also be applied to the above embodiments.

In the system for converting models between deep learning frameworks, the plurality of operations is: a subset of the original operation queue headed by the single operation, the operations in the subset being in a serial relation, where a serial relation means that the output of the previous operation is the input of the next operation.

wherein OP₁, OP₂, OP₃, …, OPₙ represent the first to the n-th operations of the original operation queue; Cost(OPᵢ) represents the operation cost of converting the i-th operation independently; and Cost(OPᵢ+OPᵢ₊₁) represents the operation cost after fusing the i-th operation with the (i+1)-th operation.

In the system for converting models between deep learning frameworks, the current operation includes a plurality of operations in the original operation queue, and module 3 includes: selecting the set with the minimum execution cost from the non-empty power set of the current operation, and taking the mapping relation of that set in the mapping rule table as the current rule.

Claims

1. A method for converting models between deep learning frameworks based on minimum execution cost, characterized by comprising the following steps:

step 1, obtaining an operation conversion table between an original deep learning framework and a target deep learning framework, wherein the operation conversion table comprises mapping relations from original operations in the original deep learning framework to target operations in the target deep learning framework; adding to the operation conversion table fusion mapping relations, in which a plurality of operations in the original deep learning framework are fused and mapped to a target operation in the target deep learning framework, to form a mapping rule table; and taking the execution complexity of the converted operation of each mapping relation in the mapping rule table as its execution cost;

2. The method of claim 1, wherein the current operation comprises a single operation and a plurality of operations in the original operation queue.

3. The method of claim 2, wherein the plurality of operations is: a subset of the original operation queue headed by the single operation, the operations in the subset being in a serial relation, where a serial relation means that the output of the previous operation is the input of the next operation.

4. The method for converting models between deep learning frameworks based on minimum execution cost as claimed in claim 1, wherein the execution cost of the model conversion result is:

wherein OP₁, OP₂, OP₃, …, OPₙ represent the first to the n-th operations of the original operation queue; Cost(OPᵢ) represents the execution cost of converting the i-th operation independently; and Cost(OPᵢ+OPᵢ₊₁) represents the execution cost after fusing the i-th operation with the (i+1)-th operation.

5. The method of claim 1, wherein the current operation comprises a plurality of operations in the original operation queue, and step 3 comprises: selecting the set with the minimum execution cost from the non-empty power set of the current operation, and taking the mapping relation of that set in the mapping rule table as the current rule.

6. A system for converting models between deep learning frameworks, comprising:

module 1, which obtains an operation conversion table between an original deep learning framework and a target deep learning framework, wherein the operation conversion table comprises mapping relations from original operations in the original deep learning framework to target operations in the target deep learning framework; adds to the operation conversion table fusion mapping relations, in which a plurality of operations in the original deep learning framework are fused and mapped to a target operation in the target deep learning framework, to form a mapping rule table; and takes the execution complexity of the converted operation of each mapping relation in the mapping rule table as its execution cost;

7. The system for converting models between deep learning frameworks of claim 6, wherein the current operation comprises a single operation and a plurality of operations in the original operation queue.

8. The system for converting models between deep learning frameworks of claim 7, wherein the plurality of operations is: a subset of the original operation queue headed by the single operation, the operations in the subset being in a serial relation, where a serial relation means that the output of the previous operation is the input of the next operation.

9. The system of claim 6, wherein the execution cost of the model conversion result is:

wherein OP₁, OP₂, OP₃, …, OPₙ represent the first to the n-th operations of the original operation queue; Cost(OPᵢ) represents the execution cost of converting the i-th operation independently; and Cost(OPᵢ+OPᵢ₊₁) represents the execution cost after fusing the i-th operation with the (i+1)-th operation.

10. The system of claim 6, wherein the current operation comprises a plurality of operations in the original operation queue, and module 3 comprises: selecting the set with the minimum execution cost from the non-empty power set of the current operation, and taking the mapping relation of that set in the mapping rule table as the current rule.