CN110991659A - Abnormal node identification method and device, electronic equipment and storage medium

Info

Publication number: CN110991659A
Application number: CN201911250256.7A
Authority: CN
Inventors: 屈伟; 董峰
Original assignee: Beijing QIYI Century Science and Technology Co Ltd
Current assignee: Beijing QIYI Century Science and Technology Co Ltd
Priority date: 2019-12-09
Filing date: 2019-12-09
Publication date: 2020-04-10
Anticipated expiration: 2039-12-09
Also published as: CN110991659B

Abstract

The embodiment of the invention provides an abnormal node identification method, an abnormal node identification device, electronic equipment and a storage medium. The method comprises the following steps: inputting feature data of a test image into a deep learning model to be recognized, wherein the deep learning model to be recognized comprises a plurality of nodes; monitoring the processing duration of a designated node among the plurality of nodes while the deep learning model to be recognized processes the feature data, wherein the processing duration of the designated node is the time the designated node takes to process the data it receives; and when the processing duration of the designated node is greater than a preset duration threshold, determining the designated node to be an abnormal node. With the scheme provided by the embodiment of the invention, abnormal nodes are identified from the plurality of nodes contained in the deep learning model. Once identified, the abnormal nodes can be further processed, which facilitates in-depth research into methods for accelerating deep learning model inference and helps increase the running speed of the deep learning model.

Description

Abnormal node identification method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of machine learning technologies, and in particular, to an abnormal node identification method and apparatus, an electronic device, and a storage medium.

Background

In the technical field of machine learning, deep learning models have developed rapidly and are widely applied. In practical applications, however, a deep learning model may contain redundancy in its parameters and structure, that is, redundant nodes or parameters exist in the model, so that the model takes a long time to run. In addition, some nodes in existing deep learning models have long running times, and these nodes slow down the model as a whole. Optimizing the nodes with long running times is therefore beneficial to improving the running speed of the deep learning model. Before such nodes can be optimized, they must first be identified from the plurality of nodes contained in the deep learning model, and how to do so is an important problem.

Disclosure of Invention

An object of an embodiment of the present invention is to provide an abnormal node identification method, so as to identify an abnormal node from a plurality of nodes included in a deep learning model. The specific technical scheme is as follows:

To achieve the above object, an embodiment of the present invention provides an abnormal node identification method, including:

inputting feature data of a test image into a deep learning model to be recognized, wherein the deep learning model to be recognized comprises a plurality of nodes;

monitoring the processing duration of a designated node in the plurality of nodes in the process of processing the feature data by the deep learning model to be recognized, wherein the processing duration of the designated node is the duration of processing the received data by the designated node;

and when the processing duration of the designated node is greater than a preset duration threshold, determining that the designated node is an abnormal node.

Further, the monitoring the processing duration of a designated node of the plurality of nodes includes:

monitoring an input time point when a designated node of the plurality of nodes receives data to be processed and an output time point when the received data is processed;

and calculating the difference value obtained by subtracting the input time point from the output time point to be used as the processing time length of the designated node.

Further, the monitoring the processing duration of a designated node of the plurality of nodes includes:

monitoring the duration from the time when the feature data is input into the deep learning model to be recognized to the time when each node in the plurality of nodes receives the data to be processed, wherein the duration is used as the arrival duration;

and calculating the difference obtained by subtracting the arrival duration of the designated node from the arrival duration of the next node of the designated node, to be used as the processing duration of the designated node.

Further, the calculating the difference obtained by subtracting the arrival duration of the designated node from the arrival duration of the next node of the designated node includes:

when the designated node has a plurality of next nodes, selecting the next node with the minimum arrival duration from the plurality of next nodes;

and calculating the difference obtained by subtracting the arrival duration of the designated node from the arrival duration of the selected next node with the minimum arrival duration.

Further, the monitoring the processing duration of a designated node of the plurality of nodes includes:

monitoring the duration from the time when the feature data is input into the deep learning model to be recognized to the time when each node in the plurality of nodes completes processing of the received data, wherein the duration is used as the output duration;

and calculating the difference obtained by subtracting the output duration of the previous node of the designated node from the output duration of the designated node, to be used as the processing duration of the designated node.

Further, the calculating the difference obtained by subtracting the output duration of the previous node of the designated node from the output duration of the designated node includes:

when the designated node has a plurality of previous nodes, selecting the previous node with the maximum output duration from the plurality of previous nodes;

and calculating the difference obtained by subtracting the output duration of the selected previous node with the maximum output duration from the output duration of the designated node.
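The output-duration variant above can be illustrated with a minimal Python sketch. The function name `processing_from_outputs`, the dictionary-based graph representation, and the sample values are hypothetical and are not part of the disclosure; treating a node without any previous node as starting at the initial time zero is likewise an assumption made for the illustration.

```python
from typing import Dict, List


def processing_from_outputs(output: Dict[str, float],
                            preds: Dict[str, List[str]]) -> Dict[str, float]:
    """Processing duration = the node's own output duration minus the largest
    output duration among its previous nodes."""
    result = {}
    for node, t_out in output.items():
        prev = preds.get(node, [])
        # Assumption: a node with no previous node starts at the initial time 0.
        start = max((output[p] for p in prev), default=0.0)
        result[node] = t_out - start
    return result


# Hypothetical graph: nodes A and B both feed node C.
output = {"A": 2.0, "B": 5.0, "C": 9.0}
preds = {"C": ["A", "B"]}
durations = processing_from_outputs(output, preds)
```

Choosing the previous node with the maximum output duration reflects that a node with several inputs cannot begin until its last input has been produced.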

Further, the deep learning model to be identified is a model obtained by optimizing an original deep learning model based on a high-performance neural network inference engine TensorRT; or

The deep learning model to be identified is obtained by optimizing an original deep learning model based on open visual reasoning and neural network optimization tool OpenVINO.

In order to achieve the above object, an embodiment of the present invention further provides an abnormal node identification apparatus, including:

the input module is used for inputting feature data of a test image into a deep learning model to be recognized, wherein the deep learning model to be recognized comprises a plurality of nodes;

the monitoring module is used for monitoring the processing duration of a designated node in the plurality of nodes in the process of processing the characteristic data by the deep learning model to be recognized, wherein the processing duration of the designated node is the duration of processing the received data by the designated node;

and the determining module is used for determining the designated node as an abnormal node when the processing time length of the designated node is greater than a preset time length threshold.

Further, the monitoring module includes:

the monitoring submodule is used for monitoring an input time point when a designated node in the plurality of nodes receives data to be processed and an output time point when the received data is processed;

and the calculating submodule is used for calculating the difference value obtained by subtracting the input time point from the output time point to be used as the processing time length of the designated node.

Further, the monitoring module includes:

the monitoring submodule is used for monitoring the time from the characteristic data input into the deep learning model to be identified to the time when each node in the plurality of nodes receives the data to be processed, and the time is used as the arrival time;

and the calculation submodule is used for calculating the difference obtained by subtracting the arrival duration of the designated node from the arrival duration of the next node of the designated node, to be used as the processing duration of the designated node.

Further, the calculation submodule is specifically configured to, when the designated node has a plurality of next nodes, select the next node with the minimum arrival duration from the plurality of next nodes; and calculate the difference obtained by subtracting the arrival duration of the designated node from the arrival duration of the selected next node with the minimum arrival duration.

Further, the monitoring module includes:

the monitoring submodule is used for monitoring the time from the time when the characteristic data is input into the deep learning model to be identified to the time when the received data is processed by each node in the plurality of nodes, and the time is used as the output time;

and the calculation submodule is used for calculating the difference obtained by subtracting the output duration of the previous node of the designated node from the output duration of the designated node, to be used as the processing duration of the designated node.

Further, the calculation submodule is specifically configured to, when the designated node has a plurality of previous nodes, select the previous node with the maximum output duration from the plurality of previous nodes; and calculate the difference obtained by subtracting the output duration of the selected previous node with the maximum output duration from the output duration of the designated node.

In order to achieve the above object, an embodiment of the present invention provides an electronic device, which includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

and the processor is used for realizing any one of the abnormal node identification method steps when executing the program stored in the memory.

In order to achieve the above object, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any of the above described steps of the abnormal node identification method.

In order to achieve the above object, an embodiment of the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to perform any of the above described steps of the abnormal node identifying method.

The embodiment of the invention has the following beneficial effects:

the abnormal node identification method provided by the embodiment of the invention monitors the duration for which a designated node processes the data it receives while the deep learning model to be recognized processes the feature data of a test image, and determines a designated node whose processing duration is greater than a preset duration threshold to be an abnormal node. In this way, abnormal nodes are identified from the plurality of nodes contained in the deep learning model.

Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.

Fig. 1 is a first flowchart of an abnormal node identification method according to an embodiment of the present invention;

Fig. 2 is a second flowchart of an abnormal node identification method according to an embodiment of the present invention;

Fig. 3 is a third flowchart of an abnormal node identification method according to an embodiment of the present invention;

Fig. 4 is a fourth flowchart of an abnormal node identification method according to an embodiment of the present invention;

Fig. 5 is a first structural diagram of an abnormal node identification apparatus according to an embodiment of the present invention;

Fig. 6 is a second structural diagram of an abnormal node identification apparatus according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;

Fig. 8 is a schematic structural diagram of a part of nodes of the deep learning model according to the embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.

Because the existing deep learning model has the problem that the running speed of the deep learning model is low due to long running time of part of nodes, in order to solve the technical problem, the embodiment of the invention provides an abnormal node identification method, as shown in fig. 1, which comprises the following steps:

Step 101, inputting feature data of a test image into a deep learning model to be recognized, wherein the deep learning model to be recognized comprises a plurality of nodes.

Step 102, monitoring the processing duration of a designated node in the plurality of nodes in the process of processing the feature data by the deep learning model to be recognized, wherein the processing duration of the designated node is the duration of processing the received data by the designated node.

Step 103, when the processing duration of the designated node is greater than a preset duration threshold, determining that the designated node is an abnormal node.

By adopting the method provided by the embodiment of the invention, the time length for processing the received data by the designated node is monitored, and the designated node with the processing time length being greater than the preset time length threshold value is taken as the abnormal node, so that the abnormal node is identified from a plurality of nodes contained in the deep learning model.
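The threshold comparison in the final step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_abnormal_nodes`, the node names, and the millisecond values are hypothetical, and the processing durations are assumed to have been measured already.

```python
from typing import Dict, List


def find_abnormal_nodes(durations: Dict[str, float], threshold: float) -> List[str]:
    """Return the nodes whose processing duration is strictly greater than the threshold."""
    return [name for name, d in durations.items() if d > threshold]


# Hypothetical per-node processing durations in milliseconds.
durations_ms = {"conv1": 1.2, "transpose1": 9.5, "relu1": 0.3}
abnormal = find_abnormal_nodes(durations_ms, threshold=5.0)
```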

In applications of a deep learning model, a node whose processing duration is greater than the preset duration threshold may reduce the running speed of the model and affect its application. Therefore, among the nodes of the deep learning model, nodes whose processing duration is greater than the preset duration threshold are treated as abnormal nodes. With the method provided by the embodiment of the invention, the abnormal nodes of the deep learning model can be identified and then further processed, which facilitates in-depth research into methods for accelerating deep learning model inference, so that the model can be optimized and its running speed increased.

The method and the apparatus for identifying an abnormal node according to the embodiments of the present invention are described in detail below with reference to specific embodiments.

The embodiment of the invention discloses an abnormal node identification method, which comprises the following steps as shown in figure 2:

Step 201, inputting feature data of a test image into a deep learning model to be recognized, wherein the deep learning model to be recognized comprises a plurality of nodes.

In the embodiment of the invention, the deep learning model is formed by a plurality of interconnected nodes. Each node represents a specific output function, called an activation function, for example Sigmoid (the logistic function). Each node receives data to be processed from the upper-level node connected to it, processes the data, and outputs the processed data to the lower-level node connected to it.

In the embodiment of the invention, the deep learning model to be identified can be a model obtained by optimizing an original deep learning model based on TensorRT (high performance neural network inference engine); or, the deep learning model to be identified may also be a model obtained by optimizing an original deep learning model based on OpenVINO (open visual inference and neural network optimization tool); alternatively, the deep learning model to be recognized can also be an original deep learning model which is not subjected to any optimization processing.

At present, both a model obtained by optimizing an original deep learning model based on TensorRT and a model obtained by optimizing the original deep learning model based on OpenVINO can run faster than the original deep learning model. However, existing optimization approaches have limitations: some of the optimizations are unreasonable, so the speed of the optimized model improves only slightly. For example, when OpenVINO is used to optimize the original deep learning model, the default input layout is NCHW, whereas the default layout used by TensorFlow (an artificial intelligence learning system) is NHWC, so transposition operations need to be performed in the optimized model. Here, the NCHW format indicates that the input data is stored in memory in the order N, C, H, W (batch size, number of channels, number of pixels in the height direction, number of pixels in the width direction), and the NHWC format indicates that the input data is stored in memory in the order N, H, W, C (batch size, number of pixels in the height direction, number of pixels in the width direction, number of channels). Because the time overhead of transposition for three-dimensional convolution is large, the transposition operation seriously reduces the running speed of the optimized model. Therefore, a deep learning model obtained by optimizing the original deep learning model can be used as the deep learning model to be identified, so that its abnormal nodes can be identified.
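To make the layout difference concrete, the following pure-Python sketch computes the flat memory offset of an element under each storage order and performs an NHWC-to-NCHW transposition element by element. The helper names `offset_nchw`, `offset_nhwc` and `nhwc_to_nchw` and the tiny sample tensor are hypothetical; real inference engines perform this step with optimized kernels, and the sketch only shows why every element has to be moved.

```python
def offset_nchw(n, c, h, w, C, H, W):
    # NCHW: the width index varies fastest, then height, then channel, then batch.
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    # NHWC: the channel index varies fastest, then width, then height, then batch.
    return ((n * H + h) * W + w) * C + c


def nhwc_to_nchw(flat, N, C, H, W):
    """Copy every element from its NHWC position to its NCHW position."""
    out = [None] * (N * C * H * W)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    src = offset_nhwc(n, c, h, w, C, H, W)
                    out[offset_nchw(n, c, h, w, C, H, W)] = flat[src]
    return out


# One 2x2 image with 3 channels stored as NHWC (the channels of a pixel are adjacent).
nhwc = ["r00", "g00", "b00", "r01", "g01", "b01",
        "r10", "g10", "b10", "r11", "g11", "b11"]
nchw = nhwc_to_nchw(nhwc, N=1, C=3, H=2, W=2)
```

After the transposition, all values of one channel are contiguous, which is the order an NCHW consumer expects.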

In this step, a plurality of node names of the deep learning model to be recognized may also be obtained, for example, after the original deep learning model is optimized by using OpenVINO, the optimized structure information of the deep learning model to be recognized may be obtained, where the structure information of the deep learning model to be recognized includes the names of all nodes of the deep learning model to be recognized.

Step 202, monitoring an input time point when a designated node of the plurality of nodes receives data to be processed and an output time point when the received data is processed.

In the embodiment of the invention, when the abnormal node identification is carried out on the deep learning model to be identified, all nodes of the deep learning model to be identified can be selected to carry out the abnormal node identification, and the abnormal node identification can also be carried out only on the nodes in part of parameter layers of the deep learning model to be identified. When abnormal node identification is carried out on nodes in a part of parameter layers of the deep learning model to be identified, the nodes in the part of parameter layers can be determined as designated nodes, and then abnormal nodes can be identified from the designated nodes.

In this step, a plurality of node names of the deep learning model to be recognized may be obtained based on the structural information of the deep learning model to be recognized, and then, a specific node may be selected from the obtained plurality of nodes, and the selected specific node may be one specific node, or a plurality of specific nodes, or all nodes of the deep learning model to be recognized.

In this step, the designated node receives data representing the characteristics of the test image, and after receiving the data representing the characteristics of the test image, the designated node may perform corresponding processing on the received data to obtain processed data, and the designated node may output the processed data to a next designated node.

For the designated nodes, the input time point at which each designated node receives the data to be processed and the output time point at which it finishes processing the received data are recorded in sequence, following the order in which the data representing the features of the test image is processed. When the marker data within the data representing the features of the test image has changed compared with the data to be processed that the designated node received, it can be determined that the designated node has finished processing that data, and the time point at which the marker data changed can be recorded as the output time point of the designated node. For example, in one possible implementation, designated node A receives the data to be processed a, b, c, d, e and f representing the features of the test image, where f is the marker data. When f is observed to change to f′, designated node A has finished processing the received data a, b, c, d, e and f, and the time point at which f changes to f′ can be recorded as the output time point of designated node A.

In a possible implementation manner, after the feature data of the test image is input into the deep learning model to be recognized, the logging information function in the deep learning model to be recognized may be used to record an input time point when each node of the deep learning model to be recognized receives data to be processed and an output time point when the received data is completely processed. Furthermore, an input time point when a specified node in the deep learning model to be recognized receives data to be processed and an output time point when the received data is processed can be obtained, so that the specified node in the deep learning model to be recognized can be monitored.
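One way to picture log-based recording of input and output time points is the sketch below. It assumes Python's standard `logging` module stands in for the logging information function mentioned above, which is an assumption rather than something the text confirms; the `ListHandler` class, the `run_node` wrapper, the logger name, and the sample node function are all hypothetical.

```python
import logging
import time


class ListHandler(logging.Handler):
    """Collects formatted log messages in memory so they can be inspected later."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record.getMessage())


logger = logging.getLogger("node_timing")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the timing records out of the root logger
handler = ListHandler()
logger.addHandler(handler)


def run_node(name, func, data):
    # Log the input time point, run the node, then log the output time point.
    logger.info("%s input %.9f", name, time.perf_counter())
    result = func(data)
    logger.info("%s output %.9f", name, time.perf_counter())
    return result


# Hypothetical node "A" that doubles every value it receives.
result = run_node("A", lambda xs: [x * 2 for x in xs], [1, 2, 3])
```

Each node then contributes one input record and one output record, from which the time points of any designated node can be looked up.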

Step 203, calculating the difference obtained by subtracting the input time point from the output time point, as the processing duration of the designated node.

In this step, the difference between the output time point and the input time point of the designated node represents the time length consumed by the designated node in the processing process of the data representing the test image characteristics.

In one possible implementation, the time point at which designated node A receives the data to be processed representing the features of the test image is t₁, and the time point at which designated node A finishes processing the received data is t₂. The value of (t₂ - t₁) can be calculated as the processing duration of designated node A; it represents the time designated node A spends processing the data representing the features of the test image.
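The calculation of (t₂ - t₁) can be sketched as follows, assuming for simplicity that the designated nodes form a sequential chain of Python callables. The function `time_nodes`, the node names, and the use of `time.perf_counter` as the clock are assumptions made for the illustration, not details taken from the embodiment.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_nodes(nodes: List[Tuple[str, Callable]], data):
    """Run the nodes in order and record (output time point - input time point) for each."""
    durations: Dict[str, float] = {}
    for name, func in nodes:
        t1 = time.perf_counter()  # input time point: the node receives its data
        data = func(data)
        t2 = time.perf_counter()  # output time point: the node has finished processing
        durations[name] = t2 - t1
    return data, durations


# Hypothetical two-node chain.
nodes = [
    ("scale", lambda xs: [2 * x for x in xs]),
    ("shift", lambda xs: [x + 1 for x in xs]),
]
out, durations = time_nodes(nodes, [1, 2, 3])
```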

Step 204, when the processing duration of the designated node is greater than a preset duration threshold, determining that the designated node is an abnormal node.

In this step, the preset duration threshold may be set according to the particular deep learning model to be recognized and the performance of the device running it, and different preset duration thresholds may be set for different designated nodes.

In a possible implementation manner, 5 times the processing duration of a designated convolution node B in the deep learning model to be identified may be set as the preset duration threshold. For each designated node, it is determined whether its processing duration is greater than 5 times the processing duration of convolution node B; if so, the designated node may be determined to be an abnormal node.
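The relative threshold described here can be sketched as below. The function name `abnormal_by_reference`, the node names, and the millisecond values are hypothetical; only the factor of 5 and the strict greater-than comparison come from the text.

```python
from typing import Dict, List


def abnormal_by_reference(durations: Dict[str, float],
                          reference_node: str,
                          factor: float = 5.0) -> List[str]:
    """Flag nodes whose duration exceeds factor times the reference node's duration."""
    threshold = factor * durations[reference_node]
    return [name for name, d in durations.items() if d > threshold]


# Hypothetical durations in milliseconds; conv_B plays the role of convolution node B.
durations_ms = {"conv_B": 2.0, "transpose": 11.0, "pool": 1.0, "concat": 10.0}
flagged = abnormal_by_reference(durations_ms, "conv_B")
```

With these values the threshold is 10.0 ms, so `concat` at exactly 10.0 ms is not flagged because the comparison is strict.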

With the method provided by the embodiment of the invention, the input time point at which the designated node receives the data to be processed and the output time point at which it finishes processing the received data are monitored, the difference obtained by subtracting the input time point from the output time point is calculated as the processing duration of the designated node, and a designated node whose processing duration is greater than the preset duration threshold is treated as an abnormal node, so that abnormal nodes are identified from the plurality of nodes contained in the deep learning model. After the abnormal nodes are identified, they can be further processed, which facilitates in-depth research into methods for accelerating deep learning model inference, allows the model to be optimized in a targeted manner, increases its running speed, and improves the efficiency of model optimization.

In another embodiment of the present invention, as shown in fig. 3, the method for identifying an abnormal node according to an embodiment of the present invention may include the following steps:

Step 301, inputting feature data of a test image into a deep learning model to be recognized, wherein the deep learning model to be recognized comprises a plurality of nodes.

This step is the same as step 201, and is not described herein again.

Step 302, monitoring the time length from the time when the characteristic data is input into the deep learning model to be identified to the time when each node in the plurality of nodes receives the data needing to be processed, wherein the time length is used as the arrival time length.

In this step, the plurality of nodes of the deep learning model to be recognized receive data representing the characteristics of the test image, and after each node of the plurality of nodes receives the data representing the characteristics of the test image, the received data can be correspondingly processed to obtain processed data, and the processed data can be output to the next node.

In this step, the time when the feature data is input into the deep learning model to be recognized may be used as an initial time, and for a plurality of nodes, the time point when each node receives data to be processed is sequentially recorded according to the processing sequence of the data representing the feature of the test image, where the initial time may be set according to a specific application scenario, for example, the initial time may be set to zero.

In a possible implementation manner, after the feature data of the test image is input into the deep learning model to be recognized, the time point at which the designated node of the deep learning model to be recognized receives the data to be processed can be recorded by using the logging information function.

Step 303, determining whether the designated node corresponds to only one next node; if yes, executing step 304a, and if no, executing step 304b.

Step 304a, calculating the difference obtained by subtracting the arrival duration of the designated node from the arrival duration of the next node of the designated node, as the processing duration of the designated node.

In this step, when the designated node corresponds to only one next node, the arrival duration of the designated node is subtracted from the arrival duration of its next node. The resulting difference represents the time the designated node spends processing the data representing the features of the test image, and can be used as the processing duration of the designated node.

In a possible implementation, the arrival duration at which designated node C receives the data to be processed is t_C, the only next node of designated node C is designated node D, and the arrival duration at which designated node D receives the data to be processed is t_D. The value of (t_D - t_C) can be calculated as the processing duration of designated node C; it represents the time designated node C spends processing the data representing the features of the test image.

Step 304b, calculating the difference obtained by subtracting the arrival duration of the designated node from the arrival duration of the next node with the minimum arrival duration among the plurality of next nodes, as the processing duration of the designated node.

In this step, when the designated node has a plurality of next nodes, the next node with the minimum arrival duration may be selected from the plurality of next nodes, and the arrival duration of the designated node is subtracted from the arrival duration of the selected next node to give the processing duration of the designated node.

In a possible implementation, the arrival duration at which designated node E receives the data to be processed is t_E, and designated node E has a plurality of next nodes: designated node F₁, designated node F₂, designated node F₃ and designated node F₄. The arrival durations at which designated nodes F₁, F₂, F₃ and F₄ receive the data to be processed are t_F1, t_F2, t_F3 and t_F4, respectively. The values t_F1, t_F2, t_F3 and t_F4 are compared and the smallest, min{t_F1, t_F2, t_F3, t_F4}, is selected. The value of (min{t_F1, t_F2, t_F3, t_F4} - t_E) can then be calculated as the processing duration of designated node E; it represents the time designated node E spends processing the data representing the features of the test image.
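Steps 304a and 304b can be combined in one small Python sketch, since taking the minimum over a single next node gives that node's own arrival duration. The function name `processing_from_arrivals`, the dictionary-based graph representation, and the sample arrival durations are hypothetical.

```python
from typing import Dict, List


def processing_from_arrivals(arrival: Dict[str, float],
                             succs: Dict[str, List[str]]) -> Dict[str, float]:
    """Processing duration = minimum arrival duration among the next nodes
    minus the designated node's own arrival duration."""
    result = {}
    for node, nexts in succs.items():
        if not nexts:
            continue  # no next node, so this variant cannot measure the node
        earliest = min(arrival[s] for s in nexts)
        result[node] = earliest - arrival[node]
    return result


# Hypothetical arrival durations for designated node E and its four next nodes.
arrival = {"E": 3.0, "F1": 7.5, "F2": 6.0, "F3": 8.0, "F4": 9.0}
succs = {"E": ["F1", "F2", "F3", "F4"]}
durations = processing_from_arrivals(arrival, succs)
```

The earliest arrival among the next nodes is used because it marks the first moment at which the designated node's output was available downstream.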

Step 305, when the processing duration of the designated node is greater than a preset duration threshold, determining that the designated node is an abnormal node.

This step is the same as step 204, and is not described herein again.

With the method provided by the embodiment of the invention, the processing duration of the designated node is calculated from the monitored arrival durations, and a designated node whose processing duration is greater than the preset duration threshold is treated as an abnormal node, so that abnormal nodes are identified from the plurality of nodes contained in the deep learning model. After the abnormal nodes are identified, they can be further processed, which facilitates in-depth research into methods for accelerating deep learning model inference, so that the model can be optimized and its running speed increased.

In another embodiment of the present invention, as shown in fig. 4, the method for identifying an abnormal node according to the embodiment of the present invention may include the following steps:

Step 401, inputting feature data of a test image into a deep learning model to be recognized, where the deep learning model to be recognized includes a plurality of nodes.

This step is the same as step 201, and is not described herein again.

Step 402, monitoring the time length from inputting the characteristic data into the deep learning model to be identified to the completion of the processing of the received data by each node in the plurality of nodes, and taking the time length as the output time length.

In this step, the plurality of nodes of the deep learning model to be recognized receive data representing the features of the test image. After receiving such data, each node performs its corresponding processing to obtain processed data. For each node, the time length from inputting the feature data into the deep learning model to be recognized until the node outputs its processed data to the next node is used as the output duration of that node.

In this step, the time when the feature data is input into the deep learning model to be recognized may be used as an initial time, and for the plurality of nodes, the time point when the received data is processed by each node is sequentially recorded according to the processing sequence of the data representing the feature of the test image, where the initial time may be set according to a specific application scenario, for example, the initial time may be set to zero.

In a possible implementation manner, after the feature data of the test image is input into the deep learning model to be identified, the time point at which the designated node of the deep learning model to be identified completes the processing of the received data can be recorded by means of logging.
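As an illustration of recording completion time points by logging, the following sketch runs a small chain of nodes and logs each node's output duration relative to the initial time. The source does not say which logging facility is used; Python's standard `logging` module and `time.perf_counter` are assumed here, and the function `run_with_output_log` and the two sample nodes are hypothetical.

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("node_timing")


def run_with_output_log(nodes, data):
    """Run nodes in order and record when each one finishes, relative to model input."""
    start = time.perf_counter()  # initial time, treated as zero
    output_duration = {}
    for name, fn in nodes:
        data = fn(data)
        output_duration[name] = time.perf_counter() - start
        log.info("node %s finished at %.6f s", name, output_duration[name])
    return data, output_duration


# Hypothetical two-node chain standing in for a deep learning model.
nodes = [
    ("scale", lambda xs: [x * 2 for x in xs]),
    ("shift", lambda xs: [x + 1 for x in xs]),
]
result, output_duration = run_with_output_log(nodes, [1, 2, 3])
```

A monotonic clock such as `time.perf_counter` is used so that the recorded durations never decrease along the chain, even if the system clock is adjusted during the run.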

Step 403, determining whether the designated node corresponds to only one previous node; if yes, executing step 404a, and if no, executing step 404b.

Step 404a, subtracting the output duration of the previous node of the designated node from the output duration of the designated node, and taking the difference as the processing duration of the designated node.

In this step, when the designated node corresponds to only one previous node, the output duration of the previous node may be subtracted from the output duration of the designated node. The obtained difference represents the length of time consumed by the designated node in processing the data representing the test image features, and may be used as the processing duration of the designated node.

In one possible embodiment, the time length for the designated node H to complete the processing of the received data is t_H. The only previous node of the designated node H is the designated node G, and the time length for the designated node G to complete the processing of the received data is t_G. The value of (t_H - t_G) can be calculated as the processing duration of the designated node H, which represents the length of time consumed by the designated node H in processing the data representing the test image features.

Step 404b, subtracting the output duration of the previous node with the maximum output duration among the plurality of previous nodes from the output duration of the designated node, and taking the difference as the processing duration of the designated node.

In this step, when the designated node has a plurality of previous nodes, the previous node with the largest output duration may be selected from the plurality of previous nodes, and the output duration of the selected previous node is subtracted from the output duration of the designated node; the resulting difference serves as the processing duration of the designated node.

In a possible embodiment, the time length for the designated node M to complete the processing of the received data is t_M. The designated node M corresponds to a plurality of previous nodes: designated nodes L₁, L₂, L₃, L₄ and L₅, whose time lengths for completing the processing of the received data are t_L1, t_L2, t_L3, t_L4 and t_L5, respectively. Comparing t_L1, t_L2, t_L3, t_L4 and t_L5, the largest value max{t_L1, t_L2, t_L3, t_L4, t_L5} is selected, and the value of (t_M - max{t_L1, t_L2, t_L3, t_L4, t_L5}) can be calculated as the processing duration of the designated node M, which represents the length of time consumed by the designated node M in processing the data representing the test image features.
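Steps 404a and 404b can be covered by one small function, sketched below. The function name `processing_duration_from_outputs` and the sample output durations are hypothetical. Treating a node without any previous node as starting at the initial time zero is an assumption of this sketch, not something the source states.

```python
def processing_duration_from_outputs(output, node, previous_nodes):
    """Output duration of the node minus the largest output duration of its previous nodes."""
    if not previous_nodes:
        # Assumption: a node with no previous node starts at the initial time (zero).
        return output[node]
    latest_previous = max(output[p] for p in previous_nodes)
    return output[node] - latest_previous


# Hypothetical output durations in milliseconds, measured from model input.
output_ms = {"G": 5, "H": 9, "L1": 12, "L2": 20, "L3": 15, "L4": 18, "L5": 11, "M": 27}

# Step 404a: node H has the single previous node G.
duration_h = processing_duration_from_outputs(output_ms, "H", ["G"])

# Step 404b: node M has five previous nodes; the slowest one (L2) determines its start.
duration_m = processing_duration_from_outputs(
    output_ms, "M", ["L1", "L2", "L3", "L4", "L5"]
)
print(duration_h, duration_m)  # 9 - 5 = 4 and 27 - 20 = 7
```

Taking the maximum reflects that node M cannot begin until the last of its inputs has been produced, so time spent waiting for slower predecessors is not charged to M.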

Step 405, when the processing time length of the designated node is greater than the preset time length threshold, determining that the designated node is an abnormal node.

This step is the same as step 204, and is not described herein again.
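The threshold comparison of this step can be sketched as follows. The function name `find_abnormal_nodes`, the node names, and all numeric values are hypothetical; the comparison is strict ("greater than"), as in the step description.

```python
def find_abnormal_nodes(processing_duration, threshold):
    """Return the nodes whose processing duration strictly exceeds the threshold."""
    return [node for node, d in processing_duration.items() if d > threshold]


# Hypothetical processing durations in milliseconds and a hypothetical 5 ms threshold.
processing_ms = {"conv1": 3, "transpose": 12, "relu": 1, "pool": 5}
abnormal = find_abnormal_nodes(processing_ms, threshold=5)
print(abnormal)  # ['transpose']; 'pool' equals the threshold and is not flagged
```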

By adopting the method provided by the embodiment of the invention, the processing duration of the designated node is calculated by monitoring output durations, and a designated node whose processing duration is greater than the preset duration threshold is taken as an abnormal node, so that abnormal nodes are identified among the plurality of nodes contained in the deep learning model. After an abnormal node of the deep learning model is identified, it can be further processed, which facilitates further study of methods for accelerating deep learning model inference, further optimization of the deep learning model, and a faster running speed of the deep learning model.

In the embodiment of the invention, after an abnormal node of the deep learning model to be recognized is determined, the identified abnormal node can be processed in either of the following modes:

The first mode: when the identified abnormal node is a redundant node of the deep learning model, the identified abnormal node may be deleted. Deleting the abnormal node yields a new deep learning model, which can then be applied to an applicable scenario.

The second mode: when the identified abnormal node is not a redundant node of the deep learning model, a node that takes less time and performs the same function may be selected to replace the identified abnormal node, yielding a new deep learning model. Whether the new deep learning model operates normally can then be checked; if it operates normally and its running speed is increased, the new deep learning model can be applied to the application scenario.
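The two modes can be illustrated with a deliberately simplified sketch in which the model is a linear list of named callables. Real deep learning models are graphs, and removing or replacing a node there also requires rewiring its inputs and outputs; the helper names `run`, `remove_node`, and `replace_node` and the sample nodes are hypothetical.

```python
def run(pipeline, data):
    """Feed data through every node of the pipeline in order."""
    for _name, fn in pipeline:
        data = fn(data)
    return data


def remove_node(pipeline, name):
    """First mode: delete a redundant node."""
    return [(n, fn) for n, fn in pipeline if n != name]


def replace_node(pipeline, name, new_fn):
    """Second mode: swap in a node that performs the same function."""
    return [(n, new_fn if n == name else fn) for n, fn in pipeline]


def slow_sum(xs):
    total = 0
    for x in xs:
        total += x
    return total


pipeline = [
    ("double", lambda xs: [2 * x for x in xs]),
    ("copy", lambda xs: list(xs)),  # redundant: does not change the result
    ("slow_sum", slow_sum),         # not redundant, but replaceable
]
sample = [1, 2, 3]
baseline = run(pipeline, sample)

pruned = remove_node(pipeline, "copy")
swapped = replace_node(pruned, "slow_sum", sum)

# Check that the new model still operates normally on the test input.
still_correct = run(swapped, sample) == baseline
print(baseline, still_correct)  # 12 True
```

Comparing the output of the modified model with the baseline on the same test input is one simple way to check that the new model operates normally before its running speed is measured.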

The scheme provided by the embodiment of the invention can also identify abnormal nodes in an optimized deep learning model. For example, when a deep learning model is optimized by using development tools such as OpenVINO or TensorRT, the overall running speed of the optimized model increases, but the running time of some nodes of the optimized deep learning model may be longer than before optimization. For such a problem, the scheme provided by the embodiment of the present invention may be adopted, with the optimized deep learning model used as the deep learning model to be identified.

In one possible embodiment, 398 short videos, each 10-20 seconds long, were used as test data, and the test took 1190 seconds with the original deep learning model. The default data layout (channel order) used when training the original deep learning model is NHWC, whereas the default input layout when OpenVINO optimizes the original deep learning model is NCHW. Therefore, when the OpenVINO tool is used to optimize the original deep learning model, the default NHWC layout of the original deep learning model needs to be converted into the NCHW layout. This conversion requires introducing a node named 3D transit (a transpose node). Optimizing the original deep learning model with the OpenVINO tool yields a first optimization model that contains the node 3D transit, and the test took 427 seconds with the first optimization model. The scheme provided by the embodiment of the invention was then applied with the first optimization model as the deep learning model to be identified, and the node 3D transit was identified as an abnormal node: its processing duration is greater than the preset duration threshold. After the abnormal node 3D transit is identified, the original deep learning model can be optimized with the OpenVINO tool without introducing the node 3D transit, yielding a second optimization model; the test took 300 seconds with the second optimization model.
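The three test times reported above can be put side by side with a few lines of arithmetic. Only the values 1190, 427 and 300 seconds come from the text; the variable names are illustrative.

```python
original_s = 1190    # original deep learning model
first_opt_s = 427    # first optimization model (contains the transpose node)
second_opt_s = 300   # second optimization model (transpose node not introduced)

speedup_first = original_s / first_opt_s
speedup_second = original_s / second_opt_s
saved_s = first_opt_s - second_opt_s
saved_fraction = saved_s / first_opt_s

print(round(speedup_first, 2))         # 2.79
print(round(speedup_second, 2))        # 3.97
print(saved_s)                         # 127
print(round(100 * saved_fraction, 1))  # 29.7
```

In other words, avoiding the abnormal node cut the test time of the already optimized model by 127 seconds, roughly 30 percent, and raised the overall speedup over the original model from about 2.8 times to about 4 times.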

Therefore, for an optimized deep learning model, after an abnormal node of the deep learning model is identified, the abnormal node can be further processed, so that the deep learning model is further optimized, methods for accelerating deep learning model inference can be studied further, and the running speed of the deep learning model is further increased.

The deep learning model in the embodiment of the present invention may specifically be a DNN (Deep Neural Network), and specifically may include: CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), and LSTM (Long Short-Term Memory network).

The deep learning model in the embodiment of the present invention may be specifically used for:

Target classification: target classification is an object recognition problem based on a classification task, i.e., a computer finds out which items in given data are the desired targets, for example, cat and dog classification or flower classification;

Target detection: target detection determines the specific position of a target to be detected in a current image; it is widely applied, for example in power system inspection and medical image detection;

Target segmentation: in the deep learning field, research on target segmentation is mainly divided into semantic segmentation and instance segmentation. Semantic segmentation classifies each pixel in an image and determines which pixels belong to which target, while instance segmentation determines not only which pixels belong to a target, but also which pixels belong to a first target and which belong to a second target. In medical imaging, the current focus is on segmenting human organs;

Speech recognition: speech recognition aims to pass natural language to a computer in the form of an acoustic signal so that the computer can understand and respond to it. An example application scenario is driving navigation software that gives directions and announces road conditions to the driver by means of speech technology;

automatic driving: in the automatic driving technique, a deep learning model may be used to identify a vehicle driving environment condition.

In the embodiment of the present invention, the nodes of the deep learning model may be identified by using DFS (Depth-First Search) and BFS (Breadth-First Search). FIG. 8 shows some of the nodes of the deep learning model, nodes a1-a19, where node a1 is the root node.

The nodes a1-a19 can be found by a breadth-first search algorithm. The breadth-first algorithm starts from the root node and traverses, level by level, the next-level nodes adjacent to each previously visited node; nodes that have already been traversed are not traversed a second time. The specific traversal steps can be as follows:

Traversal may begin with root node a1, then traverse node a2, which is adjacent to root node a1, and then sequentially traverse the nodes adjacent to node a2: node a13, node a17, node a19, node a3; then traverse the node adjacent to node a13: node a10, the node adjacent to node a17: node a16, the node adjacent to node a19: node a15, and the node adjacent to node a3: node a4; then traverse the node adjacent to node a10: node a9, the nodes adjacent to node a16: node a14, node a18, the node adjacent to node a15: node a8, and the node adjacent to node a4: node a6; then traverse the node adjacent to node a14: node a12, and the node adjacent to node a18: node a7; then traverse the node adjacent to node a12: node a11. The nodes a1-a19 of the deep learning model are thus found.
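The traversal just described can be reproduced with a standard queue-based breadth-first search. In the sketch below, the adjacency dictionary `EDGES` is transcribed from the traversal steps in the text; FIG. 8 itself may contain further edges that the text does not mention. The function name `bfs` is illustrative.

```python
from collections import deque

# Child lists transcribed from the traversal steps described in the text.
EDGES = {
    "a1": ["a2"],
    "a2": ["a13", "a17", "a19", "a3"],
    "a13": ["a10"], "a17": ["a16"], "a19": ["a15"], "a3": ["a4"],
    "a10": ["a9"], "a16": ["a14", "a18"], "a15": ["a8"], "a4": ["a6"],
    "a14": ["a12"], "a18": ["a7"], "a12": ["a11"],
}


def bfs(edges, root):
    """Return the visiting order and the parent through which each node was first reached."""
    order, parent = [], {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in edges.get(node, []):
            if child not in parent:  # already-traversed nodes are not visited again
                parent[child] = node
                queue.append(child)
    return order, parent


order, parent = bfs(EDGES, "a1")
print(order)
```

The `parent` dictionary records, for every node, the previous-level node through which it was first reached, which is the parent relation used later when processing durations are computed.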

For any node of the deep learning model, the previous-level node adjacent to the node is the parent node of the node, and the next-level node adjacent to the node is a child node of the node. The breadth-first search algorithm can find the nodes of the deep learning model, and the parent node of each child node can be determined according to the adjacency relation between the nodes.

The node depth of a node of the deep learning model is the length of the path from the root node to the node, namely the number of intermediate nodes passed between the root node and the node plus 2, where the depth of the root node is 1. After each node of the deep learning model is found, the found nodes can be sorted by node depth and summarized. As shown in fig. 8, the depth of the root node a1 is 1, the maximum depth of the node a2 is 2, the maximum depth of the node a3 is 3, the maximum depth of the node a4 is 4, the maximum depth of the node a6 is 5, the maximum depth of the node a7 is 6, the maximum depth of the node a8 is 5, the maximum depth of the node a9 is 10, the maximum depth of the node a10 is 9, the maximum depth of the node a11 is 7, the maximum depth of the node a12 is 6, the maximum depth of the node a13 is 8, the maximum depth of the node a14 is 5, the maximum depth of the node a15 is 4, the maximum depth of the node a16 is 4, the maximum depth of the node a17 is 3, the maximum depth of the node a18 is 5, and the maximum depth of the node a19 is 3. The nodes in fig. 8 can be sorted by maximum node depth, and the sorted nodes are: node a1, node a2, node a3, node a17, node a19, node a4, node a16, node a15, node a6, node a14, node a18, node a8, node a7, node a12, node a11, node a13, node a10, and node a9.
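The maximum depths listed above can be reproduced by a longest-path computation over the node graph. The sketch below makes two assumptions that are not stated in the text: that "maximum depth" means the length of the longest path from the root, and that FIG. 8 contains an edge from node a11 to node a13. That edge is inferred only because the listed depths of a13, a10 and a9 (8, 9 and 10) cannot be obtained from the edges named in the traversal steps alone. The names `EDGES`, `PARENTS`, and `max_depth` are illustrative.

```python
from functools import lru_cache

# Edges named in the traversal steps, plus one ASSUMED edge a11 -> a13
# (inferred from the listed depths; not stated in the text).
EDGES = {
    "a1": ["a2"],
    "a2": ["a13", "a17", "a19", "a3"],
    "a13": ["a10"], "a17": ["a16"], "a19": ["a15"], "a3": ["a4"],
    "a10": ["a9"], "a16": ["a14", "a18"], "a15": ["a8"], "a4": ["a6"],
    "a14": ["a12"], "a18": ["a7"], "a12": ["a11"],
    "a11": ["a13"],  # assumption
}

# Invert the child lists to get each node's parents.
PARENTS = {}
for src, children in EDGES.items():
    for child in children:
        PARENTS.setdefault(child, []).append(src)


@lru_cache(maxsize=None)
def max_depth(node):
    """Depth along the longest path from the root; the root itself has depth 1."""
    parents = PARENTS.get(node, [])
    if not parents:
        return 1
    return 1 + max(max_depth(p) for p in parents)


nodes = set(EDGES) | set(PARENTS)
depth = {n: max_depth(n) for n in nodes}
ranked = sorted(nodes, key=lambda n: (depth[n], int(n[1:])))
print([(n, depth[n]) for n in ranked])
```

Nodes with equal depth are ordered here by node number, which is an arbitrary tie-break; the order of equal-depth nodes in the text differs and does not affect the depth-based ranking.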

By first using BFS, the parent node of each node of the deep learning model can be determined, and the nodes are then summarized through DFS. On the one hand, this identifies the nodes of the deep learning model; on the other hand, for each node, the time difference between the node and its parent node is the processing duration corresponding to the node. Then, according to the scheme provided by the embodiment of the invention, when the processing duration of a node is greater than the preset duration threshold, the node is determined to be an abnormal node.

Based on the same inventive concept, and corresponding to the abnormal node identification method provided in the above embodiment of the present invention, another embodiment of the present invention further provides an abnormal node identification apparatus, a schematic structural diagram of which is shown in fig. 5. The apparatus specifically includes:

an input module 501, configured to input feature data of a test image into a deep learning model to be recognized, where the deep learning model to be recognized includes multiple nodes;

the monitoring module 502 is configured to monitor a processing duration of a designated node in the plurality of nodes in a process of processing the feature data by the deep learning model to be recognized, where the processing duration of the designated node is a duration of processing the received data by the designated node;

the determining module 503 is configured to determine that the designated node is an abnormal node when the processing duration of the designated node is greater than the preset duration threshold.

Therefore, by adopting the device provided by the embodiment of the invention, the duration for which the designated node processes the received data is monitored, and a designated node whose processing duration is greater than the preset duration threshold is taken as an abnormal node, so that abnormal nodes are identified among the plurality of nodes contained in the deep learning model. After an abnormal node of the deep learning model is identified, it can be further processed, which facilitates further study of methods for accelerating deep learning model inference, further optimization of the deep learning model, and a faster running speed of the deep learning model.

Further, as shown in fig. 6, the monitoring module 502 includes:

a monitoring submodule 601, configured to monitor an input time point when a designated node in the plurality of nodes receives data to be processed, and an output time point when the received data is processed;

and the calculating submodule 602 is configured to calculate a difference between the output time point and the input time point as a processing duration of the designated node.
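The division of work between the monitoring submodule 601 and the calculating submodule 602 in this variant can be sketched as a single helper that records the input and output time points around one node call and returns their difference. The function name `monitor_node`, the injectable `clock` parameter, and the sample node are hypothetical; `time.perf_counter` is assumed as the clock.

```python
import time


def monitor_node(fn, data, clock=time.perf_counter):
    """Return the node's output and its processing duration (output point minus input point)."""
    t_in = clock()   # input time point: the node receives the data to be processed
    out = fn(data)
    t_out = clock()  # output time point: the node has finished processing
    return out, t_out - t_in


# Hypothetical node that doubles every element of its input.
out, duration = monitor_node(lambda xs: [2 * x for x in xs], [1, 2, 3])
print(out, duration >= 0)  # [2, 4, 6] True
```

Passing the clock in as a parameter keeps the helper easy to test with a fake clock that returns fixed time points.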

Further, as shown in fig. 6, the monitoring module 502 includes:

the monitoring submodule 601 is used for monitoring the time from the time when the characteristic data is input into the deep learning model to be identified to the time when each node in the plurality of nodes receives the data to be processed, and the time is used as the arrival time;

the calculating submodule 602 is configured to subtract the arrival duration of the designated node from the arrival duration of the next node of the designated node, and take the difference as the processing duration of the designated node.

Further, as shown in fig. 6, the calculating submodule 602 is specifically configured to, when the designated node has multiple next nodes, select the next node with the smallest arrival duration from the multiple next nodes, and subtract the arrival duration of the designated node from the arrival duration of the selected next node.

Further, as shown in fig. 6, the monitoring module 502 includes:

the monitoring submodule 601 is used for monitoring the time length from the time when the characteristic data is input into the deep learning model to be identified to the time when the received data is processed by each node in the plurality of nodes, and the time length is used as the output time length;

the calculating submodule 602 is configured to subtract the output duration of the previous node of the designated node from the output duration of the designated node, and take the difference as the processing duration of the designated node.

Further, as shown in fig. 6, the calculating submodule 602 is specifically configured to, when the designated node has a plurality of previous nodes, select the previous node with the largest output duration from the plurality of previous nodes, and subtract the output duration of the selected previous node from the output duration of the designated node.

Further, the deep learning model to be recognized is a model obtained by optimizing the original deep learning model based on TensorRT; or the deep learning model to be recognized is a model obtained by optimizing the original deep learning model based on OpenVINO.

Based on the same inventive concept, according to the abnormal node identification method provided in the above embodiment of the present invention, correspondingly, another embodiment of the present invention further provides an electronic device, referring to fig. 7, the electronic device according to the embodiment of the present invention includes a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702 and the memory 703 complete mutual communication through the communication bus 704.

A memory 703 for storing a computer program;

the processor 701 is configured to implement the following steps when executing the program stored in the memory 703:

The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the electronic equipment and other equipment.

The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.

The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.

In another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the above-mentioned abnormal node identification methods.

In yet another embodiment, a computer program product containing instructions is provided, which when run on a computer, causes the computer to perform any one of the above-described methods for abnormal node identification.

In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device, the electronic apparatus and the storage medium embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and the relevant points can be referred to the partial description of the method embodiments.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. An abnormal node identification method is characterized by comprising the following steps:

2. The method of claim 1, wherein monitoring a processing duration of a given node of the plurality of nodes comprises:

3. The method of claim 1, wherein monitoring a processing duration of a given node of the plurality of nodes comprises:

4. The method of claim 3, wherein calculating the difference obtained by subtracting the arrival time of the designated node from the arrival time of the next node of the designated node comprises:

5. The method of claim 1, wherein monitoring a processing duration of a given node of the plurality of nodes comprises:

6. The method of claim 5, wherein calculating the difference obtained by subtracting the output duration of the previous node of the designated node from the output duration of the designated node comprises:

7. The method according to claim 1, wherein the deep learning model to be identified is a model obtained by optimizing an original deep learning model based on a high-performance neural network inference engine TensorRT; or

8. An abnormal node identifying apparatus, comprising:

9. The apparatus of claim 8, wherein the monitoring module comprises:

10. The apparatus of claim 8, wherein the monitoring module comprises:

11. The apparatus of claim 8, wherein the monitoring module comprises:

12. The device according to claim 8, wherein the deep learning model to be identified is a model obtained by optimizing an original deep learning model based on a high-performance neural network inference engine TensorRT; or

13. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

a processor for implementing the method steps of any of claims 1 to 7 when executing a program stored in the memory.

14. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.