CN112099848B - Service processing method, device and equipment


Info

Publication number
CN112099848B
Authority
CN
China
Prior art keywords
program
executed
flow graph
processing
executable file
Legal status
Active
Application number
CN202010954140.8A
Other languages
Chinese (zh)
Other versions
CN112099848A (en)
Inventor
Tong Chao (童超)
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202010954140.8A
Publication of CN112099848A
Application granted
Publication of CN112099848B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/74 Reverse engineering; Extracting design information from source code
    • G06F8/71 Version control; Configuration management
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The application provides a service processing method, apparatus and device, wherein the method comprises the following steps: acquiring a target computation flow graph corresponding to a target service; acquiring a program to be executed corresponding to the target service according to the target computation flow graph, and acquiring a configuration file corresponding to the program to be executed according to the target computation flow graph; and performing service processing on data to be processed through the program to be executed and the configuration file to obtain a service processing result matching the target service. With this technical scheme, various types of service processing can be implemented using machine learning techniques, and an application development framework based on computation flow graphs can be realized.

Description

Service processing method, device and equipment
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a service processing method, device and equipment.
Background
Machine learning is one way to realize artificial intelligence. It is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. Machine learning studies how computers imitate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continually improve their own performance. Machine learning focuses on algorithm design, enabling a computer to automatically learn rules from data and use those rules to predict unknown data.
Machine learning has found very wide application, for example: data mining, computer vision, natural language processing, biometric identification, search engines, medical diagnosis, credit card fraud detection, stock market analysis, DNA sequencing, speech and handwriting recognition, strategy games, robotics, and the like.
Although machine learning techniques are widely used, there is currently no reasonable implementation for carrying out various types of service processing, such as face detection and vehicle detection, using machine learning techniques.
Disclosure of Invention
The application provides a service processing method, which comprises the following steps:
acquiring a target computation flow graph corresponding to a target service, wherein the target computation flow graph at least comprises a plurality of operation nodes, a function type of each operation node, and connection relations of the plurality of operation nodes;
acquiring a program to be executed corresponding to the target service according to the target computation flow graph, and acquiring a configuration file corresponding to the program to be executed according to the target computation flow graph, wherein the program to be executed comprises a plurality of executable files corresponding to the function types of the plurality of operation nodes, and the configuration file comprises input-output relations of the plurality of executable files, the input-output relations being determined based on the connection relations of the plurality of operation nodes; and
performing service processing on data to be processed through the program to be executed and the configuration file to obtain a service processing result matching the target service, wherein, for each executable file in the program to be executed, input data of the executable file is processed to obtain a data processing result, and the data processing result is output, based on the input-output relations in the configuration file, to the next executable file having an output relation with that executable file.
The application provides a service processing apparatus, which comprises:
an acquisition module, configured to acquire a target computation flow graph corresponding to a target service, wherein the target computation flow graph at least comprises a plurality of operation nodes, a function type of each operation node, and connection relations of the plurality of operation nodes; and to acquire, according to the target computation flow graph, a program to be executed corresponding to the target service and a configuration file corresponding to the program to be executed, wherein the program to be executed comprises a plurality of executable files corresponding to the function types of the plurality of operation nodes, and the configuration file comprises input-output relations of the plurality of executable files, determined based on the connection relations of the plurality of operation nodes; and
a processing module, configured to perform service processing on data to be processed through the program to be executed and the configuration file to obtain a service processing result matching the target service, wherein, for each executable file in the program to be executed, input data of the executable file is processed to obtain a data processing result, and the data processing result is output, based on the input-output relations in the configuration file, to the next executable file having an output relation with that executable file.
The application provides a service processing device, comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor;
the processor is configured to execute the machine-executable instructions to perform the following steps:
acquiring a target computation flow graph corresponding to a target service, wherein the target computation flow graph at least comprises a plurality of operation nodes, a function type of each operation node, and connection relations of the plurality of operation nodes;
acquiring a program to be executed corresponding to the target service according to the target computation flow graph, and acquiring a configuration file corresponding to the program to be executed according to the target computation flow graph, wherein the program to be executed comprises a plurality of executable files corresponding to the function types of the plurality of operation nodes, and the configuration file comprises input-output relations of the plurality of executable files, determined based on the connection relations of the plurality of operation nodes; and
performing service processing on the data to be processed through the program to be executed and the configuration file to obtain a service processing result matching the target service, wherein, for each executable file in the program to be executed, input data of the executable file is processed to obtain a data processing result, and the data processing result is output, based on the input-output relations in the configuration file, to the next executable file having an output relation with that executable file.
As can be seen from the above technical solutions, in the embodiments of the present application, a program to be executed and a configuration file may be obtained based on a target computation flow graph corresponding to a target service, and used to perform service processing on data to be processed. The target computation flow graph includes a plurality of operation nodes and the function type of each operation node; the program to be executed includes a plurality of executable files corresponding to the plurality of operation nodes; and the configuration file includes the input-output relations of the plurality of executable files. Because the executable files can implement machine learning, performing service processing on the data to be processed through the plurality of executable files makes it possible to implement various types of service processing, such as face detection and vehicle detection, using machine learning techniques. The method can realize an application development framework based on computation flow graphs: machine learning models are used to construct a computation flow graph that handles analysis tasks over various multi-modal data (such as audio/video data and sensor data) and can run on various hardware platforms. The application development framework covers inference of machine learning models, supports users in quickly building a target service by adding new models or reusing existing general image processing and machine learning models, and performs targeted optimization according to the hardware resources of different hardware platforms, balancing resource usage against service processing effect so as to utilize hardware resources to the greatest extent.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the following briefly introduces the drawings required for describing the embodiments. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art may obtain other drawings from them.
FIG. 1 is a flow chart of a business processing method in one embodiment of the present application;
FIG. 2A is a schematic diagram of splitting a target service in one embodiment of the present application;
FIG. 2B is a schematic diagram of a computational flow graph in one embodiment of the present application;
FIG. 2C is a schematic diagram of a program to be executed in one embodiment of the present application;
FIG. 2D is a flow diagram of performing scheduling in one embodiment of the present application;
FIG. 3A is a schematic diagram of an application development framework in one embodiment of the present application;
FIG. 3B is a schematic diagram of a process for model package importation in one embodiment of the present application;
FIG. 4 is a block diagram of a service processing apparatus in one embodiment of the present application;
FIG. 5 is a block diagram of a service processing device in one embodiment of the present application.
Detailed Description
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this application and the claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Furthermore, depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining".
Before describing the technical scheme of the application, concepts related to the embodiments of the application are described.
Machine learning: machine learning is a way to implement artificial intelligence, studying how computers simulate or implement human learning behavior to obtain new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance. Deep learning is a subclass of machine learning that models specific real-world problems with mathematical models in order to solve similar problems in the field. Neural networks are an implementation of deep learning; for ease of description, the structure and function are described below by taking neural networks as an example, and other subclasses of machine learning are similar in structure and function.
Neural network: neural networks include, but are not limited to, convolutional neural networks (CNN), recurrent neural networks (RNN), fully connected networks, etc., and the structural units of a neural network include, but are not limited to, convolution layers (Conv), pooling layers (Pool), excitation layers, fully connected layers (FC), etc.
In practical applications, the neural network may be constructed by combining one or more convolution layers, one or more pooling layers, one or more excitation layers, and one or more fully-connected layers according to different requirements.
In a convolution layer, the input data features are enhanced by performing a convolution operation with a convolution kernel. The convolution kernel may be an m×n matrix; convolving the input data features of the convolution layer with the convolution kernel yields the output data features of the convolution layer. The convolution operation is actually a filtering process.
In a pooling layer, operations such as taking the maximum, the minimum, or the average are performed on the input data features (such as the output of a convolution layer), thereby subsampling the input data features by exploiting local correlation, reducing the amount of processing while keeping features invariant. The pooling operation is actually a downsampling process.
In an excitation layer, the input data features may be mapped using an activation function (e.g., a nonlinear function) to introduce nonlinearity, so that the neural network can enhance its expressive power through nonlinear combination.
The activation function may include, but is not limited to, the ReLU (Rectified Linear Unit) function, which sets features less than 0 to 0 while features greater than 0 remain unchanged.
In a fully connected layer, all data features input to the layer are fully connected, yielding a feature vector that may comprise a plurality of data features.
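As background only, the layer operations above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not part of the patented method, and the function names are ours:

```python
import numpy as np

def conv2d(x, k):
    # Valid cross-correlation, as CNN "convolution" is usually implemented:
    # slide the m x n kernel over the feature map and sum the products.
    m, n = k.shape
    H, W = x.shape
    return np.array([[np.sum(x[i:i + m, j:j + n] * k)
                      for j in range(W - n + 1)]
                     for i in range(H - m + 1)])

def max_pool(x, s=2):
    # Keep the maximum of each s x s block (downsampling by local correlation).
    H, W = x.shape
    x = x[:H - H % s, :W - W % s]
    return x.reshape(x.shape[0] // s, s, x.shape[1] // s, s).max(axis=(1, 3))

def relu(x):
    return np.maximum(x, 0)  # features < 0 become 0; others unchanged

def fully_connect(x, w):
    return w @ x.ravel()     # all input features mapped to one feature vector
```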
The technical solutions of the embodiments of the present application are described below with reference to specific embodiments.
An embodiment of the present application provides a service processing method. Referring to FIG. 1, a flowchart of the service processing method, the method may be applied to any device and may include the following steps:
step 101, obtaining a target computation flow diagram corresponding to a target service, where the target computation flow diagram at least includes a plurality of operation nodes, function types of each operation node, and connection relations of the plurality of operation nodes.
In one possible implementation, the target computation flow graph may be acquired as follows:
In step 1011, an initial computation flow graph corresponding to the target service is obtained, where the initial computation flow graph may at least include a plurality of operation nodes, a function type of each operation node, and connection relations of the plurality of operation nodes.
This embodiment provides an application development framework, and each user may implement its own service based on the framework. For ease of distinction, such a service is referred to as a target service, which may be face detection, body detection, vehicle detection, and so on; the target service is not limited. For example, the target service may be a service implementing face detection and gender detection, as shown in FIG. 2A, which is a schematic diagram of the target service.
For example, for a target service, the target service may be split into multiple functions, as shown in fig. 2A, and the target service for implementing face detection and gender detection may be split into a video decoding function, an image scaling function, a face detection function, an image matting function, and a face gender classification function.
Based on the above functions of the target service, the processing procedure of the target service may be, in order: for input data (such as a video image), performing video decoding; performing image scaling on the decoded data; performing face detection on the scaled data; performing image matting on the data after face detection; performing face gender classification on the matted data; and finally outputting the face frames and genders.
Each function of the target service may correspond to one operation node. After the target service is split into a plurality of functions, a plurality of operation nodes corresponding to those functions are obtained, and the initial computation flow graph corresponding to the target service may include the plurality of operation nodes and the function type of each operation node.
For example, after the target service is split into a video decoding function, an image scaling function, a face detection function, an image matting function and a face gender classification function, a video decoding operation node corresponding to the video decoding function, an image scaling operation node corresponding to the image scaling function, a face detection operation node corresponding to the face detection function, an image matting operation node corresponding to the image matting function and a face gender classification operation node corresponding to the face gender classification function can be obtained, namely, the initial computation flow graph comprises the video decoding operation node, the image scaling operation node, the face detection operation node, the image matting operation node and the face gender classification operation node. For another example, the initial computation flow graph may include a function type "video decoding function" of the operation node 1 and the operation node 1, a function type "image scaling function" of the operation node 2 and the operation node 2, a function type "face detection function" of the operation node 3 and the operation node 3, a function type "image matting function" of the operation node 4 and the operation node 4, and a function type "face gender classification function" of the operation node 5 and the operation node 5. For convenience of description, the former case is taken as an example.
For example, the connection relations of the plurality of operation nodes may be determined based on the relations among the plurality of functions of the target service. As shown in FIG. 2A, since both the image scaling processing and the image matting processing are performed on the decoded data, the video decoding operation node is connected to the image scaling operation node, and the video decoding operation node is also connected to the image matting operation node. The image scaling operation node is connected to the face detection operation node because face detection is performed on the scaled data. The face detection operation node is connected to the image matting operation node because image matting is performed on the data after face detection. The image matting operation node is connected to the face gender classification operation node because face gender classification is performed on the matted data.
In summary, the connection relations among the video decoding operation node, the image scaling operation node, the face detection operation node, the image matting operation node, and the face gender classification operation node may be obtained, and thus the initial computation flow graph corresponding to the target service may be obtained, as shown in FIG. 2B, which is an example of the initial computation flow graph.
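Using the sketch introduced under step 101, the initial computation flow graph of FIG. 2B might be assembled as follows (node names are illustrative):

```python
g = ComputationFlowGraph()
g.add_node("video_decode", "video decoding function")
g.add_node("image_scale", "image scaling function")
g.add_node("face_detect", "face detection function")
g.add_node("image_matting", "image matting function")
g.add_node("face_gender", "face gender classification function")

# Connection relations per FIG. 2B: video decoding feeds both image scaling
# and image matting; matting also consumes the face detection output.
g.connect("video_decode", "image_scale")
g.connect("video_decode", "image_matting")
g.connect("image_scale", "face_detect")
g.connect("face_detect", "image_matting")
g.connect("image_matting", "face_gender")
```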
For example, the initial computation flow graph may further include a subgraph made up of at least one operation node. For example, the video decoding operation node and the image scaling operation node may be combined into subgraph 1, so that the initial computation flow graph includes subgraph 1, the face detection operation node, the image matting operation node, and the face gender classification operation node, while subgraph 1 in turn includes the video decoding operation node and the image scaling operation node. In other words, the initial computation flow graph is a hierarchical structure: its first layer contains subgraph 1, the face detection operation node, the image matting operation node, and the face gender classification operation node, and the video decoding operation node and image scaling operation node sit inside subgraph 1.
Illustratively, for the initial computation flow graph, the connection relationships between the operation nodes in the initial computation flow graph may include a sequential execution relationship, for example, operation node 1 is connected to operation node 2, operation node 2 is connected to operation node 3, operation node 3 is connected to operation node 5, and so on.
For the initial computation flow graph, the connection relations between operation nodes may include a judgment (conditional) execution relationship. For example, operation node 1 is connected to operation node 2 and to operation node 3; when condition A is satisfied, the output of operation node 1 is passed to operation node 2, and when condition B is satisfied, the output of operation node 1 is passed to operation node 3.
For the initial computation flow graph, the connection relations between operation nodes may include a loop execution relationship. For example, operation node 1 is connected to operation node 2 and operation node 2 is connected back to operation node 1: the output of operation node 1 is passed to operation node 2, the output of operation node 2 is passed back to operation node 1, and so on, until a preset condition is met and the loop between operation node 1 and operation node 2 is exited.
Of course, the sequential execution relationship, the judgment execution relationship and the loop execution relationship are just a few examples of the connection relationship, and the connection relationship is not limited as long as the connection relationship exists between the operation nodes.
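One hedged way to model all three relationship kinds is to attach an optional condition to each edge. This is an illustrative extension of the earlier sketch, not the patent's representation:

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class Edge:
    src: str
    dst: str
    # None -> unconditional (sequential) edge; otherwise the edge is taken
    # only when the predicate holds for the source node's output.
    condition: Optional[Callable[[Any], bool]] = None

# Sequential: node 1 always feeds node 2.
seq = Edge("node1", "node2")
# Judgment: node 1 feeds node 2 under condition A, node 3 under condition B.
cond_a = Edge("node1", "node2", condition=lambda out: out["score"] > 0.5)
cond_b = Edge("node1", "node3", condition=lambda out: out["score"] <= 0.5)
# Loop: a back edge from node 2 to node 1, taken until a stop condition holds.
loop = Edge("node2", "node1", condition=lambda out: not out["converged"])
```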
Illustratively, to obtain an initial computational flow graph, the following may be used:
the first mode splits the target service into a plurality of functions, and the splitting mode is not limited, so long as the plurality of functions are combined to realize the target service. Then, a plurality of operation nodes corresponding to the plurality of functions are acquired, that is, the function corresponding to the operation node can be embodied by the name of the operation node. Then, the connection relationship of the plurality of operation nodes is determined based on the relationship of the plurality of functions of the target service, for example, when the output data of the function 1 is processed by the function 2, the operation node corresponding to the function 1 is connected to the operation node corresponding to the function 2. Then, an initial computation flow graph corresponding to the target service is generated based on the plurality of operation nodes and connection relations of the plurality of operation nodes, wherein the initial computation flow graph comprises the plurality of operation nodes, function types of the operation nodes (the function types of the operation nodes can be embodied through names of the operation nodes), and connection relations of the plurality of operation nodes, and is shown in fig. 2B.
For example, key steps in the target service may be identified, based on which the target service is split into multiple functions, each key step in the target service representing a function of the target service.
Illustratively, the initial computation flow graph is a data flow graph used to describe the processing of the target service.
For example, the computation flow graph in the present embodiment may also be referred to as a data flow graph, a computation graph, or the like.
In the second manner, a computation flow graph configuration file corresponding to the target service is received, where the configuration file describes a plurality of operation nodes, the function type of each operation node, and the connection relations of the plurality of operation nodes. An initial computation flow graph corresponding to the target service is then generated based on the computation flow graph configuration file; the initial computation flow graph includes the plurality of operation nodes, the function types of the operation nodes, and the connection relations of the plurality of operation nodes, as shown in FIG. 2B, which is a schematic diagram of the initial computation flow graph.
For example, the user may split the target service into a plurality of functions, obtain the operation nodes corresponding to those functions, and determine the connection relations of the operation nodes based on the relations among the functions. The user may then input a computation flow graph configuration file to the device, including the operation nodes and their connection relations. Upon receiving the computation flow graph configuration file, the device may generate the initial computation flow graph from it.
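As an illustration of this second manner, a computation flow graph configuration file could look like the following. The JSON layout is a hypothetical format (the patent does not specify one), and the loader reuses the ComputationFlowGraph sketch from step 101:

```python
import json

FLOW_GRAPH_CONFIG = """
{
  "nodes": [
    {"name": "video_decode",  "function_type": "video decoding function"},
    {"name": "image_scale",   "function_type": "image scaling function"},
    {"name": "face_detect",   "function_type": "face detection function"},
    {"name": "image_matting", "function_type": "image matting function"},
    {"name": "face_gender",   "function_type": "face gender classification function"}
  ],
  "connections": [
    ["video_decode", "image_scale"],
    ["video_decode", "image_matting"],
    ["image_scale", "face_detect"],
    ["face_detect", "image_matting"],
    ["image_matting", "face_gender"]
  ]
}
"""

def load_initial_flow_graph(text: str) -> ComputationFlowGraph:
    spec = json.loads(text)
    g = ComputationFlowGraph()
    for node in spec["nodes"]:
        g.add_node(node["name"], node["function_type"])
    for src, dst in spec["connections"]:
        g.connect(src, dst)
    return g

initial_graph = load_initial_flow_graph(FLOW_GRAPH_CONFIG)
```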
In the third manner, an initial computation flow graph corresponding to the target service and input by a user is received, where the initial computation flow graph includes a plurality of operation nodes, the function type of each operation node, and the connection relations of the plurality of operation nodes.
For example, a user may directly construct the initial computation flow graph corresponding to the target service and input it to the device; that is, the device may directly receive the initial computation flow graph input by the user. For example, the user may use drag-and-drop programming in a visual graph editor to automatically generate the initial computation flow graph and input it into the device, thereby reducing the configuration effort of developers.
Of course, the first, second, and third manners are just examples of obtaining the initial computation flow graph; the manner of obtaining it is not limited, as long as the initial computation flow graph can be obtained.
Step 1012: obtaining the target computation flow graph according to the initial computation flow graph; or optimizing the initial computation flow graph and obtaining the target computation flow graph according to the optimized computation flow graph.
In one possible implementation, after the initial computation flow graph is obtained, the target computation flow graph may be obtained from it; for example, the initial computation flow graph may be used directly as the target computation flow graph.
In another possible implementation, after the initial computation flow graph is obtained, it may be optimized to obtain an optimized computation flow graph, and the target computation flow graph is obtained from the optimized graph; for example, the optimized computation flow graph may be used as the target computation flow graph.
Of course, the above manners are only a few examples of obtaining the target computation flow graph; the manner of obtaining it is not limited, as long as the target computation flow graph can be obtained based on the initial computation flow graph. The target computation flow graph may include a plurality of operation nodes, the function type of each operation node (which may be reflected in the node's name), and the connection relations of the plurality of operation nodes.
Illustratively, optimizing the initial computation flow graph to obtain the optimized computation flow graph may include, but is not limited to: optimizing the operation nodes of the initial computation flow graph and/or the connection relations of the operation nodes. The optimization includes, but is not limited to, at least one of: splitting, merging, memory sharing, hardware acceleration, and order adjustment.
For example, the initial computation flow graph is taken as candidate computation flow graph 1; a program 1 to be executed corresponding to the target service is acquired based on candidate computation flow graph 1 (for the acquisition manner, see the later description of acquiring a program to be executed based on the target computation flow graph); program 1 is deployed on a hardware platform, and the performance overhead of program 1 (such as at least one of compute overhead, storage overhead, and bandwidth overhead) is determined. Candidate computation flow graph 1 is then optimized to obtain candidate computation flow graph 2; a program 2 to be executed is acquired based on candidate computation flow graph 2, deployed on the hardware platform, and the performance overhead of program 2 is determined.
If the performance overhead of program 2 is smaller than that of program 1, candidate computation flow graph 2 is taken as the target computation flow graph. Otherwise, if the performance overhead of program 2 is greater than that of program 1, candidate computation flow graph 1 is taken as the target computation flow graph.
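The comparison just described can be sketched as follows; `build_program` and `deploy_and_measure` stand in for the acquisition and deployment steps and are assumptions:

```python
def pick_target_graph(candidate1, candidate2, build_program, deploy_and_measure):
    # Build a program from each candidate graph, deploy it, measure its
    # performance overhead, and keep the cheaper candidate as the target graph.
    cost1 = deploy_and_measure(build_program(candidate1))
    cost2 = deploy_and_measure(build_program(candidate2))
    # The cost could be a scalar or an aggregate over compute, storage,
    # and bandwidth overhead.
    return candidate2 if cost2 < cost1 else candidate1
```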
When candidate computation flow graph 1 is optimized to obtain candidate computation flow graph 2, candidate computation flow graph 1 may be split to obtain candidate computation flow graph 2; or merged to obtain candidate computation flow graph 2; or memory-sharing processing may be applied to it to obtain candidate computation flow graph 2; or hardware acceleration processing may be applied to obtain candidate computation flow graph 2; or order adjustment processing may be applied to obtain candidate computation flow graph 2.
Alternatively, when optimizing candidate computation flow graph 1 to obtain candidate computation flow graph 2, at least two of splitting, merging, memory sharing, hardware acceleration, and order adjustment may be applied. For ease of description, applying splitting and memory sharing to candidate computation flow graph 1 is taken as an example.
For example, candidate computation flow graph 1 is split to obtain a split computation flow graph, and memory-sharing processing is performed on the split computation flow graph to obtain candidate computation flow graph 2.
As another example, memory-sharing processing is performed on candidate computation flow graph 1 to obtain a memory-shared computation flow graph, which is then split to obtain candidate computation flow graph 2.
As yet another example, candidate computation flow graph 1 is split to obtain a split computation flow graph (denoted computation flow graph A); a program A to be executed is acquired based on computation flow graph A, deployed on the hardware platform, and its performance overhead is determined. If the performance overhead of program A is smaller than that of program 1, memory-sharing processing is performed on computation flow graph A to obtain candidate computation flow graph 2. Otherwise, if the performance overhead of program A is greater than that of program 1, memory-sharing processing is performed on candidate computation flow graph 1 to obtain candidate computation flow graph 2.
In one possible implementation, splitting the initial computation flow graph means splitting some or all of its operation nodes, for example splitting one operation node into at least two operation nodes. Merging the initial computation flow graph means merging some of its operation nodes, for example merging at least two operation nodes into one. Memory-sharing processing on the initial computation flow graph means letting some operation nodes share the same memory, for example having at least two operation nodes use the same memory, i.e., occupy the same memory space. Hardware acceleration processing on the initial computation flow graph means accelerating one or more operation nodes in hardware, i.e., implementing the processing function of those nodes through hardware.
In one possible implementation, order adjustment processing on the initial computation flow graph may mean adjusting the connection relations of its operation nodes (adjusting the connection relations essentially also adjusts the execution order) and/or adjusting the execution order of its operation nodes. For example, in the initial computation flow graph, operation node 1 is connected to operation node 2 and to operation node 3; after adjustment, operation node 1 is no longer connected to operation node 2 but remains connected to operation node 3, and operation node 2 is connected to operation node 3. As another example, the execution order of the operation nodes, originally operation node 1, operation node 2, operation node 3, is adjusted to: operation node 2, operation node 3, operation node 1.
Of course, the foregoing are merely a few examples of optimizing the initial computation flow graph; the optimization manner is not limited thereto, and this application does not limit the specific implementation of each kind of optimization.
In one possible implementation, the initial computation flow graph may be optimized to obtain the optimized computation flow graph as follows: based on the hardware capabilities of the hardware platform on which the program to be executed will be deployed, the operation nodes of the initial computation flow graph and/or their connection relations are optimized to obtain the optimized computation flow graph. The optimization includes, but is not limited to, at least one of: splitting, merging, memory sharing, hardware acceleration, and execution order adjustment.
For example, the program to be executed is acquired based on the target computation flow graph; the specific acquisition manner is described in subsequent embodiments and not repeated here. After the program to be executed is obtained, it may be deployed to a hardware platform, such as a CPU (Central Processing Unit), GPU (Graphics Processing Unit), ARM (Advanced RISC Machines) processor, or FPGA (Field Programmable Gate Array), and run on that platform, so that service processing of the target service is carried out through the program to be executed. On this basis, in order to optimize the initial computation flow graph, the hardware capabilities of the platform (such as at least one of compute capability, storage capability, and bandwidth capability) can be obtained, and, while ensuring computation accuracy, the operation nodes and/or connection relations of the initial computation flow graph can be optimized according to those capabilities to obtain the optimized computation flow graph.
For example, the initial computation flow graph is taken as candidate computation flow graph 1; program 1 to be executed is acquired based on it, deployed on the hardware platform, and its performance overhead is determined. If the performance overhead of program 1 exceeds the hardware capabilities of the platform (for example, the compute overhead exceeds the platform's compute capability, the storage overhead exceeds its storage capability, or the bandwidth overhead exceeds its bandwidth capability), candidate computation flow graph 1 must be optimized to obtain candidate computation flow graph 2. If the performance overhead of program 1 does not exceed the platform's capabilities, candidate computation flow graph 1 may be used directly as the target computation flow graph, or it may still be optimized to obtain candidate computation flow graph 2.
Once candidate computation flow graph 2 is obtained, program 2 to be executed is acquired based on it, deployed on the hardware platform, and its performance overhead is determined. If the performance overhead of program 2 exceeds the platform's capabilities, candidate computation flow graph 2 cannot serve as the target computation flow graph, and it is optimized further to obtain another candidate computation flow graph. If the performance overhead of program 2 does not exceed the platform's capabilities, candidate computation flow graph 2 may be used as the target computation flow graph, or it may be optimized further.
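A hedged sketch of this capability-bounded loop follows; all helper names are assumptions:

```python
def optimize_until_fits(graph, build_program, measure_cost, platform_caps, optimize):
    # Keep optimizing the candidate graph until the deployed program's
    # compute/storage/bandwidth overhead fits the platform's capabilities.
    while True:
        cost = measure_cost(build_program(graph))
        if all(cost[k] <= platform_caps[k]
               for k in ("compute", "storage", "bandwidth")):
            return graph  # usable as the target computation flow graph
        graph = optimize(graph)  # split / merge / share memory / reorder ...
```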
Step 102: acquiring a program to be executed corresponding to the target service according to the target computation flow graph, and acquiring a configuration file corresponding to the program to be executed according to the target computation flow graph. Illustratively, the program to be executed includes a plurality of executable files corresponding to the function types of the plurality of operation nodes in the target computation flow graph, and the configuration file includes the input-output relations of the plurality of executable files, determined based on the connection relations of the plurality of operation nodes.
In one possible implementation, the program to be executed and the configuration file may be acquired as follows:
In step 1021, based on the function type of each operation node in the target computation flow graph, an executable file corresponding to that operation node is determined, namely an executable file for implementing the function type.
For example, for each operation node in the target computation flow graph, an executable file for implementing the function type may be selected from the operator library based on the function type of the operation node, and the selected executable file is determined to be an executable file corresponding to the operation node. The operator library may include a plurality of executable files stored in advance, each executable file being used to implement at least one function type.
In one possible implementation, an operator library may be maintained that includes a plurality of pre-stored executable files (also referred to as library files), each used to implement at least one function type. For example, the operator library may include executable file 1, implementing function type 1 and function type 2, and executable file 2, implementing function type 3.
For an operator library, this may include, but is not limited to: executable files for implementing image processing functions (e.g., image filtering, geometric image transformation, color conversion, etc.), executable files for implementing machine learning functions (e.g., face detection, body detection, vehicle detection, face gender classification, etc.), executable files for implementing video analysis functions (e.g., motion estimation, object tracking, etc.), executable files for implementing pattern recognition functions (e.g., feature extraction, object detection, etc.), executable files for implementing deep learning model reasoning functions, etc. Of course, the foregoing is merely an example of an executable file, and is not limited in this regard.
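A minimal sketch of such an operator library follows, assuming executables are shared-object files keyed by function type; the file names are invented for illustration:

```python
class OperatorLibrary:
    """Registry mapping function types to pre-stored executable files."""

    def __init__(self):
        self._by_function_type = {}

    def register(self, executable_path, function_types):
        # One executable file may implement several function types.
        for ft in function_types:
            self._by_function_type[ft] = executable_path

    def lookup(self, function_type):
        return self._by_function_type[function_type]

ops = OperatorLibrary()
ops.register("libdecode.so", ["video decoding function"])
ops.register("libimgproc.so", ["image scaling function", "image matting function"])
ops.register("libfacedet.so", ["face detection function"])
ops.register("libfacegender.so", ["face gender classification function"])
```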
Illustratively, in step 1021, for each operation node in the target computation flow graph, the function type of the operation node may be determined first, e.g., by the name of the operation node. Then, an executable file for realizing the function type can be selected from the operator library, wherein the executable file is the executable file corresponding to the operation node.
For example, referring to fig. 2B, assuming that the target computation flow graph includes a video decoding operation node, an image scaling operation node, a face detection operation node, an image matting operation node, and a face gender classification operation node, it may be determined that a function type of the video decoding operation node is a video decoding function, a function type of the image scaling operation node is an image scaling function, a function type of the face detection operation node is a face detection function, a function type of the image matting operation node is an image matting function, and a function type of the face gender classification operation node is a face gender classification function.
Then, an executable file a1 for implementing the video decoding function, an executable file a2 for implementing the image scaling function, an executable file a3 for implementing the face detection function, an executable file a4 for implementing the image matting function, and an executable file a5 for implementing the face gender classification function are selected from the operator library.
The executable files for different function types may be the same executable file or different executable files; for example, the executable file a1 for implementing the video decoding function may be the same as or different from the executable file a2 for implementing the image scaling function. For ease of description, executable files a1 to a5 are taken to be different executable files.
In the above embodiments, an executable file is a binary file that can be loaded and executed by an operating system. In different operating system environments, executable files may be organized in different ways and may include a code segment, a data segment, a stack segment, an extension segment, and so on, where the code segment stores the computer's execution instructions (the operation instructions to be executed by the CPU), the data segment stores the data the CPU will use, and the stack segment stores register-related information. An executable file is a module, or group of modules, that performs a certain function.
In this embodiment, an executable file is a code file that implements a specified function; for example, executable file a1, which implements the video decoding function, is a code file capable of video decoding. This embodiment does not limit how executable file a1 implements the video decoding function.
In one possible implementation, the device itself contains a large number of executable files, which are stored in an operator library. In addition to the executable file provided by the device itself, the user is provided with registration capabilities of the executable file, i.e. the user is able to store the executable file in the operator library. In summary, the executable files stored in the operator library may be provided by the device itself or may be provided by the user.
Illustratively, because the device provides users with the registration capability for executable files, a user may send a registration message to the device through a user terminal. The device may receive the registration message, which includes a code file for implementing at least one function type, such as a code file for implementing the video decoding function. The code file is then converted into an executable file for implementing that function type, for example by compiling it into an executable file that the hardware platform can run; when the code file implements the video decoding function, the resulting executable file also implements the video decoding function, i.e., the function type of the executable file is the same as that of the code file. The executable file is then stored into the operator library.
For example, the user completes development of the code file according to the interface requirements of the device, such as developing a code file with the video decoding function, and sends a registration message including the code file to the device through the user terminal. For instance, the user may implement three interface functions of the code file according to the device's interface requirements, namely Init (initialization), Create (creation), and Process (main processing); after these three interface functions are implemented, the code file is obtained.
After the device obtains the code file, the code file can be compiled by utilizing a dynamic library automatic compiling tool to obtain an executable file which can be operated by a hardware platform. With respect to the process of compiling a code file, there is no limitation herein as long as the executable file obtained by compiling can be run by a hardware platform.
After the executable file is obtained, it can be stored into the operator library, and file information of the executable file, such as its file name, function type, and input parameter information, is added according to the registration format of the executable file.
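The registration flow might be sketched as follows; the compiler invocation and the metadata layout are illustrative assumptions, not the patent's actual tooling:

```python
import subprocess

def register_operator(library, code_file, function_types, input_params):
    # Compile the user's code file (implementing Init/Create/Process) into
    # a shared object the hardware platform can load and run.
    executable = code_file.rsplit(".", 1)[0] + ".so"
    subprocess.run(["cc", "-shared", "-fPIC", code_file, "-o", executable],
                   check=True)
    # Record the executable and its file information in the operator library.
    library.register(executable, function_types)
    return {"file": executable,
            "function_types": function_types,
            "input_params": input_params}
```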
Step 1022, determining input/output relationships of the plurality of executable files corresponding to the plurality of operation nodes in the target computation flow graph based on the connection relationships of the plurality of operation nodes in the target computation flow graph.
For example, after the executable file corresponding to each operation node in the target computation flow graph is obtained, a plurality of executable files can be obtained, and the input-output relationship of the plurality of executable files is determined based on the connection relationship of the plurality of operation nodes. For example, referring to fig. 2B, in the target computation flow graph, a video decoding operation node is connected to an image scaling operation node, the video decoding operation node is connected to an image matting operation node, the image scaling operation node is connected to a face detection operation node, the face detection operation node is connected to an image matting operation node, and the image matting operation node is connected to a face gender classification operation node, on the basis, the input-output relationships of the plurality of executable files include: the input of the executable file a1 is empty (i.e. the executable file a1 is the first file), and the output of the executable file a1 is the executable file a2 and the executable file a4; the input of the executable file a2 is an executable file a1, and the output of the executable file a2 is an executable file a3; the input of the executable file a3 is an executable file a2, and the output of the executable file a3 is an executable file a4; the input of the executable file a4 is an executable file a1 and an executable file a3, and the output of the executable file a4 is an executable file a5; the input of the executable file a5 is the executable file a4, the output of the executable file a5 is empty, i.e. the executable file a5 is the last file.
Step 1023, acquiring a program to be executed corresponding to the target service based on a plurality of executable files corresponding to a plurality of operation nodes in the target computation flow graph, and acquiring a configuration file corresponding to the program to be executed based on the input-output relation of the plurality of executable files.
For example, the program to be executed may include an executable file a1, an executable file a2, an executable file a3, an executable file a4, and an executable file a5. The configuration files may include an executable file a1, an executable file a2, an executable file a3, an executable file a4, and an input-output relationship of an executable file a5, for example, the output of the executable file a1 is the executable file a2 and the executable file a4, the output of the executable file a2 is the executable file a3, the output of the executable file a3 is the executable file a4, and the output of the executable file a4 is the executable file a5, which is shown in fig. 2C and is a schematic diagram of the configuration file.
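Deriving the configuration file from the graph can be sketched as follows, reusing the earlier ComputationFlowGraph sketch; `node_to_executable` maps node names to the files selected from the operator library, and the mapping shown is illustrative:

```python
def build_config(graph, node_to_executable):
    # Translate node connection relations into executable input-output
    # relations, e.g. a1 -> {"inputs": [], "outputs": ["a2", "a4"]}.
    io = {node_to_executable[n.name]: {"inputs": [], "outputs": []}
          for n in graph.nodes}
    for src, dst in graph.edges:
        io[node_to_executable[src]]["outputs"].append(node_to_executable[dst])
        io[node_to_executable[dst]]["inputs"].append(node_to_executable[src])
    return io

config = build_config(initial_graph, {
    "video_decode": "a1", "image_scale": "a2", "face_detect": "a3",
    "image_matting": "a4", "face_gender": "a5"})
# config["a1"] == {"inputs": [], "outputs": ["a2", "a4"]}
```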
Step 103: performing service processing on the data to be processed through the program to be executed and the configuration file to obtain a service processing result matching the target service. For each executable file in the program to be executed, the input data of the executable file is processed to obtain a data processing result, and the data processing result is output, based on the input-output relations in the configuration file, to the next executable file having an output relation with that executable file.
For example, referring to fig. 2C, after obtaining the data to be processed, the data to be processed is input to the executable file a1, and the data to be processed is processed by the executable file a1 to obtain data a, and the data a is output to the executable file a2 and the executable file a4 having an output relationship with the executable file a 1. The data A is processed through the executable file a2 to obtain data B, and the data B is output to the executable file a3 which has an output relation with the executable file a 2. The data B is processed through the executable file a3 to obtain data C, and the data C is output to the executable file a4 which has an output relation with the executable file a3. Since the input data of the executable file a4 is data a and data C, the data a and the data C are processed by the executable file a4 to obtain data D, and the data D is output to the executable file a5 having an output relationship with the executable file a4. And processing the data D through the executable file a5 to obtain a service processing result matched with the target service, and outputting the service processing result matched with the target service.
In one possible implementation, after the program to be executed is obtained, the program to be executed may be deployed on a hardware platform and run on the hardware platform. After the configuration file is obtained, the configuration file is sent to the hardware platform. In this way, the hardware platform may generate the execution sequence of each executable file according to the input-output relationship of each executable file in the configuration file, and execute each executable file in the program to be executed based on the execution sequence. For example, after receiving the data to be processed, the hardware platform uses the data to be processed as the input data of the program to be executed and processes the data sequentially through each executable file in the program to be executed; that is, the data stream flows according to the execution sequence of the executable files, the relevant processing calculation is performed when passing through each executable file, the output result is transferred to the next executable file, and so on until the processing is completed, so as to obtain the service processing result.
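As a hedged sketch of how the hardware platform might generate the execution sequence (the patent does not prescribe a scheduling algorithm; the `inputs`/`outputs` mappings reuse the hypothetical structures sketched above), a topological sort over the configuration file suffices:

```python
from collections import deque

def execution_order(inputs, outputs):
    # Kahn's topological sort over the input-output relationships in the
    # configuration file; yields e.g. ["a1", "a2", "a3", "a4", "a5"].
    indegree = {f: len(src) for f, src in inputs.items()}
    ready = deque(f for f, d in indegree.items() if d == 0)
    order = []
    while ready:
        f = ready.popleft()
        order.append(f)
        for nxt in outputs[f]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order

def run_program(order, inputs, executables, raw_data):
    # Each executable file processes the results of its upstream files;
    # the first file (with empty inputs) receives the data to be processed.
    results = {}
    for f in order:
        args = [results[src] for src in inputs[f]] or [raw_data]
        results[f] = executables[f](*args)
    return results[order[-1]]  # the service processing result
```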
In one possible implementation, the program to be executed may include, but is not limited to: a data processing subprogram, a model reasoning subprogram and a post-processing subprogram. Based on this, performing service processing on the data to be processed through the program to be executed and the configuration file to obtain a service processing result matched with the target service may include, but is not limited to, the following manner: based on the input-output relationship in the configuration file, determining that the output of the data processing subprogram is the model reasoning subprogram and the input of the model reasoning subprogram is the data processing subprogram; and determining that the output of the model reasoning subprogram is the post-processing subprogram and the input of the post-processing subprogram is the model reasoning subprogram. Then, the data to be processed is input to the data processing subprogram; the data processing subprogram preprocesses the data to be processed and inputs the preprocessed data to the model reasoning subprogram; the model reasoning subprogram performs model reasoning on the data and inputs the data after model reasoning to the post-processing subprogram; and the post-processing subprogram post-processes the data to obtain the service processing result matched with the target service.
For example, referring to fig. 2D, which shows a schematic diagram of this scheduling, the data to be processed may be input to the data processing subprogram, where the data processing subprogram is configured to preprocess the data to be processed, such as encoding, decoding, scaling, etc. The data processing subprogram inputs the preprocessed data to the model reasoning subprogram, and the model reasoning subprogram performs model reasoning on the data, such as face detection, face gender classification and other operations. The model reasoning subprogram inputs the data after model reasoning to the post-processing subprogram, and the post-processing subprogram performs post-processing operations on the data to obtain a service processing result matched with the target service.
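A minimal sketch of this three-stage chain, assuming the three subprograms are exposed as callables (an illustration, not the patent's actual interfaces):

```python
def run_pipeline(raw_data, preprocess, model_infer, postprocess):
    # data processing subprogram: e.g. decoding, scaling
    pre = preprocess(raw_data)
    # model reasoning subprogram: e.g. face detection, gender classification
    inferred = model_infer(pre)
    # post-processing subprogram: produces the service processing result
    return postprocess(inferred)
```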
In one possible implementation manner, before the service processing is performed on the data to be processed through the program to be executed and the configuration file, if the target computation flow graph further includes a network parameter corresponding to an operation node, then, for the executable file corresponding to that operation node in the program to be executed, the network parameter may be loaded in the executable file, and the executable file after loading the network parameter may be added to the program to be executed; that is, the executable files in the program to be executed are the executable files after the network parameters are loaded.
For example, when the target computation flow graph (initial computation flow graph) is obtained, the target computation flow graph may include, for each operation node, a network parameter corresponding to the operation node, or may not include a network parameter corresponding to the operation node, in addition to the connection relationship between the plurality of operation nodes and the plurality of operation nodes.
For example, the video decoding operation node corresponds to a video decoding function, and the target computation flow graph may include network parameters corresponding to the video decoding operation node, where the network parameters represent decoding parameters of the video image, for example, decoding the video image in decoding mode 1 or decoding mode 2. The executable file a1 corresponding to the video decoding operation node supports both decoding mode 1 and decoding mode 2; if the network parameter indicates that decoding mode 1 is adopted, the executable file a1 decodes the video image in decoding mode 1, and if the network parameter indicates that decoding mode 2 is adopted, the executable file a1 decodes the video image in decoding mode 2. The image scaling operation node corresponds to an image scaling function, and the target computation flow graph may include network parameters corresponding to the image scaling operation node, where the network parameters represent scaling parameters of the video image, for example, an input scaling parameter of 1920×1680 and an output scaling parameter of 960×720. Based on this, the executable file a2 corresponding to the image scaling operation node may scale a 1920×1680 video image to a 960×720 video image. The face detection operation node corresponds to a face detection function, and the target computation flow graph may not include network parameters corresponding to the face detection function. For another example, the face gender classification operation node corresponds to a face gender classification function, and the target computation flow graph may not include network parameters corresponding to the face gender classification function.
The image matting operation node corresponds to an image matting function, and the target computation flow graph may include network parameters corresponding to the image matting operation node, where the network parameters represent matting parameters of the video image; for example, the matting parameters may be matting coordinates, such as the upper-left corner coordinate together with the matting width and matting height, or the upper-left corner coordinate together with the lower-right corner coordinate, and the like. Based on this, the executable file a4 corresponding to the image matting operation node can select a sub-image matched with the matting coordinates from the video image and intercept the sub-image from the video image.
In another possible implementation manner, before the service processing is performed on the data to be processed through the program to be executed and the configuration file, for an executable file in the program to be executed that needs to load a network model, the network model corresponding to the executable file may be obtained from a model pool, the network model may be loaded in the executable file, and the executable file after loading the network model may be added to the program to be executed; that is, the executable files in the program to be executed may be the executable files after the network models are loaded.
For example, for each executable file in a program to be executed, the executable file may or may not need to load a network model. If the network model needs to be loaded, the network model corresponding to the executable file is obtained from the model pool, and the network model is loaded in the executable file, namely the executable file in the program to be executed loads the network model.
For example, the video decoding operation node corresponds to a video decoding function, and the executable file a1 corresponding to the video decoding operation node does not need to load a network model. The image scaling operation node corresponds to the image scaling function, and the executable file a2 corresponding to the image scaling operation node does not need to load a network model. The image matting operation node corresponds to an image matting function, and an executable file a4 corresponding to the image matting operation node does not need to load a network model.
The face detection operation node corresponds to a face detection function, and the executable file a3 corresponding to the face detection operation node needs to be loaded with a network model, i.e. a network model for realizing the face detection function, such as a face detection model, which is a machine learning network model trained based on a machine learning algorithm. The executable file a3 can realize the face detection function by loading the face detection model in the executable file a 3. The face gender classification operation node corresponds to a face gender classification function, and the executable file a5 corresponding to the face gender classification operation node needs to be loaded with a network model, namely, a network model for realizing the face gender classification function, such as a face gender classification model, is a machine learning network model trained based on a machine learning algorithm. By loading the face gender classification model in the executable file a5, the executable file a5 can realize face gender classification, namely determining the gender of the user based on the face of the user.
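A hypothetical sketch of loading network models from the model pool before execution; `ModelPool`, the `load_model` method and the model names are illustrative assumptions, not the patent's actual API:

```python
class ModelPool:
    """A minimal model pool: stores processed network models by name."""
    def __init__(self):
        self._models = {}

    def put(self, name, model):
        self._models[name] = model

    def get(self, name):
        return self._models[name]

def load_models(program, pool, needed):
    """`needed` maps an executable file name to the network model it must
    load; files absent from the map (a1, a2, a4) load no model."""
    for exe in program:
        if exe.name in needed:
            exe.load_model(pool.get(needed[exe.name]))  # hypothetical method

# e.g. needed = {"a3": "face_detection_model",
#                "a5": "face_gender_classification_model"}
```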
In another possible implementation manner, before the service processing is performed on the data to be processed through the program to be executed and the configuration file, if the target computation flow graph further includes a network parameter corresponding to an operation node, and the executable file corresponding to that operation node is an executable file that needs to load a network model, the network parameter may also be loaded in the executable file, the network model corresponding to the executable file may be obtained from the model pool and loaded in the executable file, and the executable file after loading the network parameter and the network model may be added to the program to be executed; that is, both the network parameter and the network model are loaded for the executable file in the program to be executed. For the implementation process of this embodiment, reference may be made to the above embodiments.
In the above embodiment, it is necessary to acquire the network model corresponding to the executable file from the model pool, and before acquiring the network model corresponding to the executable file from the model pool, it is also possible to acquire a machine learning network model that has completed training, and perform a specified process on the machine learning network model to obtain a processed network model, and store the processed network model in the model pool. By way of example, the designation process may include, but is not limited to, at least one of the following: quantization processing, encapsulation processing, compiling processing and encryption processing.
By way of example, and not limitation, the machine learning network model may be a neural network model (e.g., a convolutional neural network, etc.), and sample data may be used to train the various neural network parameters within the neural network model, such as convolutional layer parameters (e.g., convolution kernel parameters), pooling layer parameters, excitation layer parameters, fully-connected layer parameters, etc.
Obviously, by training each neural network parameter in the neural network model, the neural network model can fit the input-output mapping relationship; the training process of the neural network model is not limited herein. After the training is completed, a trained neural network model can be obtained.
For example, after the trained machine learning network model is obtained, the machine learning network model may be quantized; for example, model parameters, input features and output features represented by floating-point numbers are approximately represented by fixed-point values, so that the operation speed of the machine learning network model is improved and the machine learning network model is compressed. The machine learning network model includes a large number of parameters; these floating-point parameters occupy a large amount of storage space, and operating on them consumes a large amount of computing resources. If calculation can be performed using fixed-point parameters without affecting accuracy, the calculation speed can be increased, computing resources can be saved, and storage space can be saved. Quantization technology is therefore introduced: it compresses the machine learning network model by reducing the number of bits required to represent each weight, and based on it the floating-point parameters can be converted into fixed-point parameters.
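A minimal symmetric-quantization sketch illustrating the float-to-fixed-point conversion described above; practical schemes (per-channel scales, asymmetric zero points, calibration) are richer:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    # Map floating-point parameters to 8-bit fixed-point values plus a
    # scale factor, reducing the bits needed to represent each weight.
    scale = max(float(np.abs(weights).max()), 1e-12) / 127.0
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original floating-point parameters.
    return q.astype(np.float32) * scale
```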
For example, after the trained machine learning network model is obtained, the machine learning network model may be subjected to encapsulation processing. Since the machine learning network model may come from various training frameworks, and the hardware platforms for deploying the program to be executed are also various, in order to facilitate the hardware platform in identifying the information of the machine learning network model and executing the machine learning network model on the hardware platform to complete model reasoning and result analysis, the machine learning network model may be encapsulated, that is, packaged into a machine learning network model on which the hardware platform can perform reasoning. The encapsulation manner is not limited and is related to the machine learning network model and the hardware platform; for example, if the hardware platform is an FPGA, the machine learning network model may be encapsulated into a machine learning network model on which the FPGA can perform reasoning.
When the machine learning network model is encapsulated, at least one of a model identification, model service analysis information, operation platform information and model version information can be encapsulated. For example, at least one of the model identification, the model service analysis information, the operation platform information and the model version information is packaged into the machine learning network model, so that the hardware platform can identify the model identification, the model service analysis information, the operation platform information and the model version information of the machine learning network model by analyzing the machine learning network model, which ensures the security of the model information and facilitates cross-platform reasoning deployment.
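The patent does not specify a concrete container format; as one hypothetical illustration, the metadata could be carried as a length-prefixed header ahead of the model body:

```python
import json
import struct

def encapsulate(model_bytes: bytes, meta: dict) -> bytes:
    # Pack metadata (model identification, model service analysis info,
    # operation platform info, model version info) ahead of the model body.
    header = json.dumps(meta).encode("utf-8")
    return struct.pack("<I", len(header)) + header + model_bytes

def parse_meta(packed: bytes) -> dict:
    # The hardware platform recovers the metadata by parsing the header.
    (length,) = struct.unpack_from("<I", packed, 0)
    return json.loads(packed[4:4 + length].decode("utf-8"))

# e.g. encapsulate(raw_model, {"model_id": "face_det_001",
#                              "platform": "FPGA", "version": "1.0"})
```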
For example, after obtaining the machine learning network model that has been trained, the machine learning network model may be compiled. For example, because the machine learning network model may come from various different training frameworks and no special requirements are made on the training frameworks, the machine learning network model may be compiled into a general model exchange format (such as ONNX) to obtain a compiled machine learning network model, so that targeted optimization compilation is performed for different hardware platforms to improve reasoning efficiency.
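Assuming, purely for illustration, that the trained model comes from PyTorch (one of the training frameworks such a framework could accept), exporting to the general ONNX exchange format might look as follows; the input shape is an assumption:

```python
import torch

def compile_to_onnx(model: torch.nn.Module, path: str = "model.onnx"):
    model.eval()
    dummy = torch.randn(1, 3, 224, 224)  # assumed example input shape
    torch.onnx.export(model, dummy, path)
```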
For example, after the trained machine learning network model is obtained, the machine learning network model may be encrypted, thereby ensuring the security of the machine learning network model.
For example, after the trained machine learning network model is obtained, at least two of the quantization processing, the encapsulation processing, the compiling processing and the encryption processing may be performed on the machine learning network model; for example, the machine learning network model is quantized first, the quantized machine learning network model is compiled, the compiled machine learning network model is encapsulated, and the encapsulated machine learning network model is encrypted, so as to obtain the processed network model. Obviously, after such processing, the trained machine learning network model is converted into a network model on which model reasoning can be performed on various hardware platforms.
In summary, the processed network model may be obtained and stored in the model pool. The processed network model may be a human body detection model (i.e., a network model for implementing human body detection), a face detection model (i.e., a network model for implementing face detection), a vehicle detection model, a face gender classification model, etc.; the type of the network model is not limited herein.
For example, the above execution sequence is only an example given for convenience of description; in practical applications, the execution sequence between steps may be changed, which is not limited. Moreover, in other embodiments, the steps of the corresponding methods need not be performed in the order shown and described herein, and the methods may include more or fewer steps than described herein. Furthermore, an individual step described in this specification may, in other embodiments, be split into multiple steps, and multiple steps described in this specification may, in other embodiments, be combined into a single step.
As can be seen from the above technical solutions, in the embodiments of the present application, a program to be executed and a configuration file may be obtained based on a target computation flow graph corresponding to a target service, where the program to be executed and the configuration file are used to perform service processing on the data to be processed, the target computation flow graph includes a plurality of operation nodes and the function type of each operation node, the program to be executed includes a plurality of executable files corresponding to the plurality of operation nodes, and the configuration file includes the input-output relationships of the plurality of executable files. Since the executable files can implement machine learning, when performing service processing on the data to be processed based on the plurality of executable files, various types of service processing, such as face detection, vehicle detection and so on, can be implemented using machine learning technology. The method can realize an application development framework based on the computation flow graph, which utilizes machine learning models to construct computation flow graphs, processes analysis tasks of various multi-modal data (such as audio and video data, sensor data and the like), and can run on various hardware platforms. The application development framework involves the reasoning operation of machine learning models, supports users in quickly adding new models or utilizing existing general image processing and machine learning models to quickly build the target service, performs targeted optimization according to the hardware resource conditions of different hardware platforms, balances resource usage against service processing effects, and maximally utilizes hardware resources.
The above technical solutions of the embodiments of the present application are described below with reference to specific application scenarios.
An application development framework (an application development and reasoning framework) is provided in the embodiments of the application, which involves computation flow graphs constructed from machine learning models and deep learning models, can process analysis tasks of various multi-modal data (such as audio and video, sensors and the like), and can run on various hardware platforms (such as CPU, GPU, ARM, FPGA and the like). The application development framework can perform the reasoning operation of machine learning models, and can also combine existing or new machine learning models and algorithms in the form of a computation flow graph, so that the machine learning models and algorithms can run on various cross-platform devices; it can reasonably optimize the computation flow graph in combination with the hardware resource conditions, maximally utilize device resources, and balance efficiency against reasoning effect. Users can quickly add models or quickly build business applications using existing general image processing and machine learning algorithms.
The application development framework can provide a service application programming interface, service application module registration, machine learning model registration and a public operator library for developers, so that a developer can perform reasoning construction and scene application development based on machine learning models without knowing the details of the underlying algorithms. The application development framework abstracts the general algorithms of machine learning (perception) model reasoning calculation, image processing and computer vision into module components, constructs a modularized component graph for the data flow processing of the whole business application, and solves business application problems by maintaining and modifying the computation graph of the data flow. For example, a video and audio data stream can be input into a computation flow graph, and the video face detection and tracking task is completed by constructing a computation flow graph comprising functional modules such as video decoding, face detection and target tracking, so that face position frames are output in real time.
The application development framework supports running on different devices (such as embedded devices, mobile devices, workstations, servers and the like) and hardware platforms (such as CPU, GPU, ARM, FPGA and the like), including heterogeneous and non-heterogeneous platforms, and is a cross-platform processing framework. Developers can quickly register existing or new machine learning models and algorithms into the framework without paying attention to the running environments of different hardware platforms, focus on the development of algorithm modules and models, and construct a computation graph to realize cross-platform processing.
Referring to fig. 3A, which is a networking schematic of the application development framework, the application development framework may include, but is not limited to, an operator development and registration module, a model encapsulation import module, a computation flow graph construction module, a computation flow graph optimization module, a computation flow graph execution module, a private operator library, a public operator library, and a model pool. The operator development and registration module and the model encapsulation import module belong to an operator and model registration layer, the computation flow graph construction module belongs to a visual business graph construction layer, the computation flow graph optimization module and the computation flow graph execution module belong to a computation flow graph engine layer, and the private operator library, the public operator library and the model pool belong to a framework resource management layer.
The application development framework takes the operation node (an operation node can also be called an operator) as its center: the operation node is the most basic calculation unit, internally realizes the encapsulation of data stream operations, and externally provides the lowest-level functions. The computation flow graph is a directed graph representing the computation flow of the data stream; the data stream enters from the input node of the computation flow graph (namely, the first operation node of the computation flow graph), flows through each operation node of the computation flow graph, and finally converges at the output node. Each node in the computation flow graph can be an operation node or a subgraph, and a subgraph itself contains a complete computation flow graph, thereby realizing the combination of some fixed computations and making nodes convenient to reuse in different graphs, so that large-scale modular applications can be built quickly.
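A hypothetical sketch of this node-centric structure (the identifiers are illustrative; the patent does not publish its data structures): a graph node is either an operation node or a subgraph that itself wraps a complete computation flow graph:

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

@dataclass
class OperationNode:
    name: str
    function_type: str                 # e.g. "video_decoding"
    params: dict = field(default_factory=dict)

@dataclass
class ComputationFlowGraph:
    nodes: List[Union["OperationNode", "SubGraph"]]
    edges: List[Tuple[str, str]]       # (upstream name, downstream name)

@dataclass
class SubGraph:
    name: str
    graph: ComputationFlowGraph        # a reusable fixed combination
```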
Referring to FIG. 3A, the application development framework itself may store a large number of executable files into the public operator library. On this basis, the operator development and registration module is used for providing the registration capability of executable files for the user: the user completes the development of a code file (namely, a customized private code file) according to the interface requirements of the operator development and registration module, and inputs a registration message to the operator development and registration module, where the registration message includes the code file. The operator development and registration module parses the code file from the registration message and converts the code file into an executable file, for example by compiling the code file using a dynamic library automatic compilation tool to obtain an executable file that can be run by the hardware platform. The operator development and registration module stores the executable file into the private operator library or the public operator library, and adds the file information of the executable file, such as the file name, function type, input parameter information and the like of the executable file, according to the registration format of the executable file.
The application interface requirements of the operator development and registration module include: each newly added executable file needs to implement, according to the general interface requirements defined by the operator development and registration module, three interface functions, namely operator initialization, operator creation, and operator main processing.
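A hypothetical sketch of an operator satisfying those three interface functions; the method names and signatures are assumptions, since the actual interface definitions are internal to the module:

```python
class MyCustomOperator:
    @staticmethod
    def create():
        # operator creation: return a new operator instance
        return MyCustomOperator()

    def init(self, params: dict) -> bool:
        # operator initialization: e.g. load network parameters / models
        self.params = params
        return True

    def process(self, inputs: list) -> list:
        # operator main processing: consume input data, produce outputs
        return list(inputs)  # placeholder pass-through
```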
Referring to FIG. 3A, the application development framework itself may store network models into the model pool. On this basis, the model encapsulation import module is used for providing the import capability of network models for the user: the user inputs a trained machine learning network model to the model encapsulation import module, and the model encapsulation import module acquires the trained machine learning network model, performs the specified processing on the machine learning network model to obtain a processed network model, and stores the processed network model in the model pool.
Referring to fig. 3B, the model encapsulation import module performs quantization processing, encapsulation processing, compiling processing and encryption processing on the machine learning network model, thereby converting the machine learning network model into a model expression file (the processed network model) on which model reasoning can be performed on various hardware platforms. Model reasoning refers to deploying the model to a field where classification prediction of data is required, so as to identify known patterns or objects.
Because the machine learning network model trained by the user can come from various training frameworks (Caffe, PyTorch, TensorFlow, PaddlePaddle), and the hardware platforms that actually complete model reasoning are also various, in order to facilitate the application development framework in identifying model information and completing model reasoning and result analysis on the corresponding hardware platform, the models of the various training frameworks can be converted, through model encapsulation, into model expression files on which reasoning can be performed on different hardware platforms. In addition, the model identification, model service analysis information, operation platform information, model version information and the like can be packaged into the model expression file in the encapsulation stage, so that the purpose and the operation platform of the model can be rapidly identified through the corresponding analysis protocol, which ensures the security of the model information and facilitates cross-platform reasoning deployment.
Because the application development framework does not impose special requirements on the model training framework, the model can be converted into a general model exchange format (such as ONNX) through model conversion and compiling, and the machine learning network model can then be compiled with targeted optimization for different hardware platforms, so that the reasoning efficiency is improved.
Referring to fig. 3A, a computation flow graph construction module may obtain an initial computation flow graph corresponding to a target service, where the initial computation flow graph may include a plurality of operation nodes, and connection relationships of the plurality of operation nodes.
For example, the computation flow graph construction module splits the target service into a plurality of functions, acquires a plurality of operation nodes corresponding to the plurality of functions, determines connection relationships of the plurality of operation nodes based on relationships of the plurality of functions of the target service, and generates an initial computation flow graph based on the plurality of operation nodes and the connection relationships of the plurality of operation nodes. For example, the computational flow graph construction module identifies key steps in the target business, corresponds to available operational nodes or subgraphs, and finally generates an initial computational flow graph to describe the data flow graph of the whole business process.
For another example, a computational flow graph construction module receives a computational flow graph configuration file corresponding to a target service input by a user, where the computational flow graph configuration file may include a plurality of operation nodes and connection relationships of the plurality of operation nodes, and generates an initial computational flow graph corresponding to the target service based on the computational flow graph configuration file.
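A hypothetical computation flow graph configuration file for the target service of fig. 2B, expressed here as a Python structure; the concrete file format (JSON, YAML, etc.) is not specified by the patent:

```python
flow_graph_config = {
    "nodes": [
        {"name": "video_decoding",             "type": "video_decoding"},
        {"name": "image_scaling",              "type": "image_scaling"},
        {"name": "face_detection",             "type": "face_detection"},
        {"name": "image_matting",              "type": "image_matting"},
        {"name": "face_gender_classification", "type": "face_gender_classification"},
    ],
    "connections": [
        ["video_decoding", "image_scaling"],
        ["video_decoding", "image_matting"],
        ["image_scaling", "face_detection"],
        ["face_detection", "image_matting"],
        ["image_matting", "face_gender_classification"],
    ],
}
```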
For another example, a computational flowsheet construction module receives an initial computational flowsheet corresponding to a target business entered by a user. For example, a user can automatically generate an initial computational flow graph by drag programming with the help of a visual graph editor, reducing the configuration effort of a developer. The developer can arrange and modify the business process by dragging the graphical operation nodes, configuring network parameters of the operation nodes and connecting the operation nodes.
For example, referring to fig. 2A, for a target service for implementing face detection and gender detection, the target service may be split into a video decoding function, an image scaling function, a face detection function, an image matting function, and a face gender classification function. Referring to fig. 2B, an initial computation flow diagram corresponding to the target service is shown.
Referring to fig. 3A, the computation flow graph optimization module may optimize an initial computation flow graph to obtain an optimized computation flow graph, and take the optimized computation flow graph as a target computation flow graph. For example, the operation nodes of the initial computation flow graph and/or the connection relation of the operation nodes are optimized, and the optimized computation flow graph is obtained.
For example, the computation flow graph optimization module performs at least one of splitting processing, merging processing, memory sharing processing and hardware acceleration processing on the operation nodes of the initial computation flow graph under the condition of ensuring the computation accuracy based on the hardware capability (such as at least one of computation capability, storage capability and bandwidth capability) of the hardware platform.
The computation flow graph optimization module can also perform execution sequence adjustment processing on the connection relations of the operation nodes of the initial computation flow graph; for example, the execution sequence of the connection relations of the operation nodes is adjusted for data parallel processing, such as pipeline parallelism of different operation nodes. The execution sequence adjustment processing is not limited herein.
Through the above optimization steps, the optimal target computation flow graph is finally obtained.
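As a sketch of one such optimization decision (echoing the performance-cost comparison recited in claim 1; `split`, `build_program` and `cost` are placeholders for platform-specific logic):

```python
def optimize_by_splitting(initial_graph, split, build_program, cost):
    # Build programs from both the initial and the split computation flow
    # graph, and keep the split variant only if its performance cost is
    # lower; memory sharing is then applied to whichever graph is kept.
    split_graph = split(initial_graph)
    first_program = build_program(initial_graph)
    second_program = build_program(split_graph)
    if cost(second_program) < cost(first_program):
        return split_graph
    return initial_graph
```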
Referring to fig. 3A, for each operation node in the target computation flow graph, the computation flow graph execution module selects, based on the function type of the operation node, an executable file for implementing the function type, that is, an executable file corresponding to the operation node, from the public operator library and/or the private operator library. Based on the connection relation of a plurality of operation nodes in the target calculation flow graph, the input-output relation of a plurality of executable files is determined. And acquiring a program to be executed corresponding to the target service based on the plurality of executable files, and acquiring a configuration file corresponding to the program to be executed based on the input-output relation of the plurality of executable files, thereby obtaining the program to be executed and the configuration file.
And the computation flow graph execution module performs service processing on the data to be processed through the program to be executed and the configuration file to obtain a service processing result matched with the target service. For example, the program to be executed is deployed to the hardware platform, and the configuration file is sent to the hardware platform. And the hardware platform generates the execution sequence of each executable file in the program to be executed according to the input-output relation of each executable file in the configuration file. After receiving the data to be processed, the data to be processed is used as input data of a program to be executed, the data is sequentially processed through each executable file in the program to be executed, namely, data flows flow according to the execution sequence of each executable file, relevant processing calculation is respectively carried out through each executable file, an output result is transmitted to the next executable file, and the like until the processing is completed, and a service processing result is obtained.
Referring to fig. 2D, the application program may input the data to be processed into the program to be executed; the data processing subprogram is started and invokes the preprocessing interface to perform data preprocessing, such as completing encoding and decoding, scaling, and the like, and after the preprocessing is completed, the data is transferred to the model reasoning subprogram. The model reasoning subprogram calls the reasoning interface to complete the reasoning calculation by combining the data and the loaded model, and returns the reasoning result to the post-processing subprogram after the output result is obtained. The post-processing subprogram finishes the post-processing operation of the data and finally returns the post-processing result to the application program, thereby completing the execution process of the computation flow graph.
Before service processing is performed on the data to be processed through the program to be executed and the configuration file, if the target computation flow graph includes network parameters corresponding to an operation node, the network parameters are loaded in the executable file corresponding to that operation node in the program to be executed, and the executable file loaded with the network parameters is added into the program to be executed. And/or, for an executable file in the program to be executed that needs to load a network model, the network model corresponding to the executable file can be obtained from the model pool and loaded in the executable file, and the executable file after loading the network model is added into the program to be executed.
Referring to FIG. 3A, for public operator libraries and private operator libraries, may include, but are not limited to: executable files for implementing image processing functions (e.g., image filtering, geometric image transformation, color conversion, etc.), executable files for implementing machine learning functions (e.g., face detection, body detection, vehicle detection, face gender classification, etc.), executable files for implementing video analysis functions (e.g., motion estimation, object tracking, etc.), executable files for implementing pattern recognition functions (e.g., feature extraction, object detection, etc.), executable files for implementing deep learning model reasoning functions, etc. For the model pool, common models such as human body detection models, vehicle detection models, and the like can be included but are not limited to.
In consideration of users' usage security and the like, the application development framework can support encryption of the computation flow graph protocol, the operator libraries and the models, and decryption is performed automatically when they are used.
As can be seen from the above technical solutions, the present embodiment provides a machine learning application development framework based on computation flow graphs, which not only supports the reasoning application of machine learning models, but also supports users in quickly building various complex systems and industry scene applications around machine learning model application development; under this application development framework, users do not need to know the implementation details of the underlying algorithms. The application development framework is a cross-platform framework that can run on embedded platforms, workstations, servers and other devices, can perform targeted optimization for the characteristics of each platform, and fully utilizes the resources of the hardware platform. A unified end-edge-cloud application development framework is also supported, and after the development of the executable files and models is completed, they can be migrated smoothly across all platforms.
Based on the same application concept as the above method, in an embodiment of the present application, a service processing apparatus is further provided, as shown in fig. 4, which is a structural diagram of the apparatus, where the apparatus includes:
an obtaining module 41, configured to obtain a target computation flow graph corresponding to a target service; the target computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes; acquiring a program to be executed corresponding to a target service according to the target calculation flow diagram, and acquiring a configuration file corresponding to the program to be executed according to the target calculation flow diagram; the program to be executed comprises a plurality of executable files, and the executable files correspond to the function types of the operation nodes; the configuration file comprises input-output relations of the plurality of executable files, and the input-output relations of the plurality of executable files are determined based on the connection relations of the plurality of operation nodes;
the processing module 42 is configured to perform service processing on the data to be processed through the program to be executed and the configuration file, so as to obtain a service processing result matched with the target service; and processing the input data of the executable file aiming at each executable file in the program to be executed to obtain a data processing result, and outputting the data processing result of the executable file to the next executable file with an output relation with the executable file based on the input-output relation in the configuration file.
For example, the acquiring module 41 is specifically configured to, when acquiring a target computation flowsheet corresponding to a target service: acquiring an initial calculation flow diagram corresponding to a target service; the initial computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes;
acquiring the target computing flow graph according to the initial computing flow graph; or, optimizing the initial computation flow graph, and acquiring the target computation flow graph according to the optimized computation flow graph.
Illustratively, the obtaining module 41 is specifically configured to, when optimizing the initial computation flowsheet:
optimizing the operation nodes of the initial computation flow graph and/or the connection relation of the operation nodes to obtain an optimized computation flow graph; wherein the optimization process includes at least one of: splitting, merging, memory sharing, hardware acceleration and sequence adjustment.
For example, the obtaining module 41 is specifically configured to, when obtaining a program to be executed corresponding to a target service according to the target computation flowsheet: determining an executable file corresponding to each operation node based on the function type of each operation node in the target calculation flow graph, wherein the executable file is used for realizing the function type; and acquiring a program to be executed corresponding to the target service based on a plurality of executable files corresponding to a plurality of operation nodes in the target computation flow graph.
Illustratively, the obtaining module 41 is specifically configured to, based on the function type of each operation node in the target computation flow graph, determine an executable file corresponding to the operation node:
selecting an executable file for realizing the function type from an operator library according to the function type of each operation node in the target calculation flow diagram; the operator library comprises a plurality of executable files stored in advance, and each executable file is used for realizing at least one function type;
and determining the selected executable file as the executable file corresponding to the operation node.
Illustratively, the program to be executed includes a data processing subroutine, a model reasoning subroutine, and a post-processing subroutine; the processing module 42 performs service processing on the data to be processed through the program to be executed and the configuration file, and is specifically configured to:
based on the input-output relationship in the configuration file, determining that the output of the data processing subprogram is a model reasoning subprogram, and the input of the model reasoning subprogram is the data processing subprogram; and determining that the output of the model inference subroutine is a post-processing subroutine, the input of the post-processing subroutine being the model inference subroutine;
Inputting data to be processed into a data processing subprogram; the data processing subprogram preprocesses the data to be processed, and inputs the preprocessed data to the model reasoning subprogram; the model reasoning subprogram carries out model reasoning on the data, and the data after the model reasoning is completed is input to the post-processing subprogram; and the post-processing subprogram carries out post-processing on the data to obtain a service processing result matched with the target service.
Illustratively, the processing module 42 is further configured to perform service processing on the data to be processed through the program to be executed and the configuration file, and before obtaining a service processing result matched with the target service, further configured to:
if the target computation flow graph further comprises network parameters corresponding to the operation nodes, loading the network parameters in the executable files corresponding to the operation nodes in the program to be executed, and adding the executable files loaded with the network parameters into the program to be executed; or,
and aiming at an executable file which needs to load a network model in the program to be executed, acquiring the network model corresponding to the executable file from a model pool, loading the network model in the executable file, and adding the executable file loaded with the network model into the program to be executed.
Based on the same application concept as the above method, a service processing device is further provided in the embodiment of the present application, and from a hardware level, a schematic diagram of a hardware architecture of the service processing device may be shown in fig. 5. The service processing device may include: a processor 51 and a machine-readable storage medium 52, the machine-readable storage medium 52 storing machine-executable instructions executable by the processor 51; the processor 51 is configured to execute machine executable instructions to implement the methods disclosed in the above examples of the present application. For example, the processor 51 is configured to execute machine executable instructions to implement the steps of:
acquiring a target calculation flow diagram corresponding to a target service; the target computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes;
acquiring a program to be executed corresponding to a target service according to the target calculation flow diagram, and acquiring a configuration file corresponding to the program to be executed according to the target calculation flow diagram; the program to be executed comprises a plurality of executable files, and the executable files correspond to the function types of the operation nodes; the configuration file comprises input-output relations of the plurality of executable files, and the input-output relations of the plurality of executable files are determined based on the connection relations of the plurality of operation nodes;
Carrying out service processing on the data to be processed through the program to be executed and the configuration file to obtain a service processing result matched with the target service; and processing the input data of the executable file aiming at each executable file in the program to be executed to obtain a data processing result, and outputting the data processing result of the executable file to the next executable file with an output relation with the executable file based on the input-output relation in the configuration file.
Based on the same application concept as the above method, the embodiment of the present application further provides a machine-readable storage medium, where the machine-readable storage medium stores a number of computer instructions, where the computer instructions can implement the method disclosed in the above example of the present application when executed by a processor.
For example, the computer instructions, when executed by a processor, can implement the steps of:
acquiring a target calculation flow diagram corresponding to a target service; the target computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes;
Acquiring a program to be executed corresponding to a target service according to the target calculation flow diagram, and acquiring a configuration file corresponding to the program to be executed according to the target calculation flow diagram; the program to be executed comprises a plurality of executable files, and the executable files correspond to the function types of the operation nodes; the configuration file comprises input-output relations of the plurality of executable files, and the input-output relations of the plurality of executable files are determined based on the connection relations of the plurality of operation nodes;
carrying out service processing on the data to be processed through the program to be executed and the configuration file to obtain a service processing result matched with the target service; and processing the input data of the executable file aiming at each executable file in the program to be executed to obtain a data processing result, and outputting the data processing result of the executable file to the next executable file with an output relation with the executable file based on the input-output relation in the configuration file.
By way of example, the machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information, such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), or a similar storage medium, or a combination thereof.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may be in the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present application.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Moreover, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. A method of service processing, the method comprising:
acquiring an initial computation flow diagram corresponding to a target service, wherein the initial computation flow diagram at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes;
optimizing the operation nodes of the initial computation flow graph and/or the connection relation of the operation nodes to obtain an optimized computation flow graph, and acquiring a target computation flow graph according to the optimized computation flow graph; the target computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes; if the optimization processing comprises splitting processing and memory sharing processing, splitting the initial computation flow graph to obtain a computation flow graph after splitting processing, acquiring a first program to be executed based on the initial computation flow graph, and acquiring a second program to be executed based on the computation flow graph after splitting processing; if the performance cost of the second program to be executed is smaller than that of the first program to be executed, performing memory sharing processing on the split computation flow graph to obtain the optimized computation flow graph; or if the performance cost of the second program to be executed is greater than the performance cost of the first program to be executed, performing memory sharing processing on the initial computation flow graph to obtain the optimized computation flow graph;
Acquiring a program to be executed corresponding to a target service according to the target calculation flow diagram, and acquiring a configuration file corresponding to the program to be executed according to the target calculation flow diagram; the program to be executed comprises a plurality of executable files, and the executable files correspond to the function types of the operation nodes; the configuration file comprises input-output relations of the plurality of executable files, and the input-output relations of the plurality of executable files are determined based on the connection relations of the plurality of operation nodes;
carrying out service processing on the data to be processed through the program to be executed and the configuration file to obtain a service processing result matched with the target service; and processing the input data of the executable file aiming at each executable file in the program to be executed to obtain a data processing result, and outputting the data processing result of the executable file to the next executable file with an output relation with the executable file based on the input-output relation in the configuration file.
2. The method of claim 1, wherein acquiring the program to be executed corresponding to the target service according to the target computation flow graph comprises:
determining, based on the function type of each operation node in the target computation flow graph, an executable file corresponding to the operation node, wherein the executable file is used for implementing the function type;
and acquiring the program to be executed corresponding to the target service based on the plurality of executable files corresponding to the plurality of operation nodes in the target computation flow graph.
3. The method of claim 2, wherein determining, based on the function type of each operation node in the target computation flow graph, the executable file corresponding to the operation node comprises:
selecting, according to the function type of each operation node in the target computation flow graph, an executable file for implementing the function type from an operator library, wherein the operator library comprises a plurality of pre-stored executable files, and each executable file is used for implementing at least one function type;
and determining the selected executable file as the executable file corresponding to the operation node.
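As a hedged illustration of claims 2 and 3, the lookup from function type to executable file might look like the following Python sketch; the operator library contents and field names are invented for the example.

# Hypothetical operator library: function type -> pre-stored executable file.
OPERATOR_LIBRARY = {
    "decode": "/opt/ops/decode.so",
    "detect": "/opt/ops/detect.so",
    "nms": "/opt/ops/nms.so",
}

def build_program(target_graph: dict) -> list:
    # Select, for each operation node, the executable file implementing
    # its function type (claims 2-3).
    program = []
    for node in target_graph["nodes"]:
        executable = OPERATOR_LIBRARY[node["function_type"]]
        program.append({"node": node["name"], "executable": executable})
    return program

# Example: a two-node target computation flow graph.
graph = {"nodes": [{"name": "n1", "function_type": "decode"},
                   {"name": "n2", "function_type": "detect"}]}
print(build_program(graph))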
4. The method of claim 3, wherein before selecting the executable file for implementing the function type from the operator library based on the function type of the operation node, the method further comprises:
receiving a registration message sent by a user terminal, wherein the registration message comprises a code file;
converting the code file into an executable file for implementing at least one function type;
and storing the executable file into the operator library.
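A minimal sketch of the registration flow in claim 4 follows, assuming a toy compile step; compile_to_executable is a placeholder for a real compiler toolchain, not an actual API.

def compile_to_executable(code_file: str) -> str:
    # Placeholder: a real system would invoke a compiler here and return
    # the path of the produced executable file.
    return code_file.rsplit(".", 1)[0] + ".so"

def register_operator(message: dict, operator_library: dict) -> None:
    # Convert the code file carried by the registration message into an
    # executable file and store it under each function type it implements.
    executable = compile_to_executable(message["code_file"])
    for function_type in message["function_types"]:
        operator_library[function_type] = executable

library = {}
register_operator({"code_file": "my_op.c", "function_types": ["resize"]}, library)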
5. The method of claim 1, wherein the program to be executed comprises a data processing subroutine, a model inference subroutine and a post-processing subroutine;
and carrying out service processing on the data to be processed through the program to be executed and the configuration file to obtain the service processing result matched with the target service comprises:
determining, based on the input-output relations in the configuration file, that the output of the data processing subroutine is the model inference subroutine and the input of the model inference subroutine is the data processing subroutine, and that the output of the model inference subroutine is the post-processing subroutine and the input of the post-processing subroutine is the model inference subroutine;
inputting the data to be processed into the data processing subroutine, wherein the data processing subroutine preprocesses the data to be processed and inputs the preprocessed data to the model inference subroutine; the model inference subroutine performs model inference on the data and inputs the data after model inference to the post-processing subroutine; and the post-processing subroutine post-processes the data to obtain the service processing result matched with the target service.
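The three-stage flow of claim 5 corresponds to a configuration-driven pipeline. A minimal sketch, assuming a dictionary-based configuration file and stand-in subroutine bodies, follows.

# Hypothetical configuration file: each stage names its successor.
PIPELINE_CONFIG = {
    "preprocess": "inference",
    "inference": "postprocess",
}

def preprocess(x):
    # e.g. normalize pixel values
    return [v / 255.0 for v in x]

def inference(x):
    # stand-in for a network model forward pass
    return [v * 2 for v in x]

def postprocess(x):
    # e.g. reduce raw scores to a business-level result
    return {"result": max(x)}

SUBROUTINES = {"preprocess": preprocess,
               "inference": inference,
               "postprocess": postprocess}

def run(data, entry: str = "preprocess"):
    # Execute each subroutine and feed its output to the successor named
    # by the configuration, as recited in claim 5.
    stage = entry
    while stage is not None:
        data = SUBROUTINES[stage](data)
        stage = PIPELINE_CONFIG.get(stage)
    return data

print(run([0, 128, 255]))   # -> {'result': 2.0}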
6. The method of claim 1, wherein before carrying out service processing on the data to be processed through the program to be executed and the configuration file to obtain the service processing result matched with the target service, the method further comprises:
if the target computation flow graph further comprises network parameters corresponding to the operation nodes, loading the network parameters in the executable files corresponding to the operation nodes in the program to be executed, and adding the executable files loaded with the network parameters to the program to be executed; or,
for an executable file in the program to be executed that needs to load a network model, acquiring the network model corresponding to the executable file from a model pool, loading the network model in the executable file, and adding the executable file loaded with the network model to the program to be executed.
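As a sketch of claim 6 (illustrative names only), loading network models from a model pool into the executable files that need them might look like:

# Hypothetical model pool keyed by model name.
MODEL_POOL = {"detector": {"weights": [0.1, 0.2], "version": "1.0"}}

def load_models(program: list, model_pool: dict) -> list:
    loaded = []
    for executable in program:
        model_name = executable.get("model")   # absent if no model is needed
        if model_name is not None:
            # Acquire the network model from the pool, load it into the
            # executable file, then add it back to the program.
            executable["network_model"] = model_pool[model_name]
        loaded.append(executable)
    return loaded

program = [{"name": "decode"}, {"name": "infer", "model": "detector"}]
program = load_models(program, MODEL_POOL)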
7. The method of claim 6, wherein before acquiring the network model corresponding to the executable file from the model pool, the method further comprises:
acquiring a trained machine learning network model;
performing specified processing on the machine learning network model to obtain a processed network model;
and storing the processed network model in the model pool.
8. The method of claim 7, wherein the specified processing comprises at least one of: quantization processing, encapsulation processing, compilation processing and encryption processing;
wherein, when the machine learning network model is encapsulated, at least one of a model identifier, model service analysis information, operating platform information and model version information is packed with the model.
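Claims 7 and 8 can be pictured with the toy preparation pipeline below; the quantization and encryption steps are deliberately simplistic stand-ins, and all field names are assumptions for exposition.

def quantize(weights: list, scale: int = 127) -> list:
    # Toy int8-style quantization of float weights.
    return [round(w * scale) for w in weights]

def encapsulate(weights: list, model_id: str, version: str,
                platform: str, parse_info: str) -> dict:
    # Pack the model together with its identifier, service analysis
    # information, operating platform and version (claim 8).
    return {"id": model_id, "version": version, "platform": platform,
            "parse_info": parse_info, "weights": weights}

def encrypt(blob: dict, key: int = 0x5A) -> bytes:
    # Placeholder XOR "encryption" of the serialized model.
    return bytes(b ^ key for b in repr(blob).encode())

model_pool = {}
trained = [0.12, -0.56, 0.98]                       # trained model stand-in
packed = encapsulate(quantize(trained), "detector", "1.0", "arm64", "bbox")
model_pool["detector"] = encrypt(packed)            # store in the model pool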
9. A service processing apparatus, the apparatus comprising:
an acquisition module, configured to: acquire an initial computation flow graph corresponding to a target service, wherein the initial computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes; optimize the operation nodes of the initial computation flow graph and/or the connection relations of the operation nodes to obtain an optimized computation flow graph, and acquire a target computation flow graph according to the optimized computation flow graph, wherein the target computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes; if the optimization processing comprises splitting processing and memory sharing processing, split the initial computation flow graph to obtain a split computation flow graph, acquire a first program to be executed based on the initial computation flow graph, and acquire a second program to be executed based on the split computation flow graph; if the performance cost of the second program to be executed is smaller than that of the first program to be executed, perform memory sharing processing on the split computation flow graph to obtain the optimized computation flow graph; or, if the performance cost of the second program to be executed is greater than that of the first program to be executed, perform memory sharing processing on the initial computation flow graph to obtain the optimized computation flow graph;
and acquire a program to be executed corresponding to the target service according to the target computation flow graph, and acquire a configuration file corresponding to the program to be executed according to the target computation flow graph, wherein the program to be executed comprises a plurality of executable files, the plurality of executable files correspond to the function types of the plurality of operation nodes, the configuration file comprises input-output relations of the plurality of executable files, and the input-output relations of the plurality of executable files are determined based on the connection relations of the plurality of operation nodes; and
a processing module, configured to carry out service processing on data to be processed through the program to be executed and the configuration file to obtain a service processing result matched with the target service, wherein, for each executable file in the program to be executed, the input data of the executable file is processed to obtain a data processing result, and the data processing result of the executable file is output, based on the input-output relations in the configuration file, to the next executable file having an input-output relation with the executable file.
10. A service processing device, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor;
wherein the processor is configured to execute the machine-executable instructions to perform the following steps:
acquiring an initial computation flow graph corresponding to a target service, wherein the initial computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes;
optimizing the operation nodes of the initial computation flow graph and/or the connection relations of the operation nodes to obtain an optimized computation flow graph, and acquiring a target computation flow graph according to the optimized computation flow graph; the target computation flow graph at least comprises a plurality of operation nodes, function types of the operation nodes and connection relations of the operation nodes; if the optimization processing comprises splitting processing and memory sharing processing, splitting the initial computation flow graph to obtain a split computation flow graph, acquiring a first program to be executed based on the initial computation flow graph, and acquiring a second program to be executed based on the split computation flow graph; if the performance cost of the second program to be executed is smaller than that of the first program to be executed, performing memory sharing processing on the split computation flow graph to obtain the optimized computation flow graph; or, if the performance cost of the second program to be executed is greater than that of the first program to be executed, performing memory sharing processing on the initial computation flow graph to obtain the optimized computation flow graph;
acquiring a program to be executed corresponding to the target service according to the target computation flow graph, and acquiring a configuration file corresponding to the program to be executed according to the target computation flow graph; wherein the program to be executed comprises a plurality of executable files, the plurality of executable files correspond to the function types of the plurality of operation nodes, the configuration file comprises input-output relations of the plurality of executable files, and the input-output relations of the plurality of executable files are determined based on the connection relations of the plurality of operation nodes;
carrying out service processing on data to be processed through the program to be executed and the configuration file to obtain a service processing result matched with the target service; wherein, for each executable file in the program to be executed, the input data of the executable file is processed to obtain a data processing result, and the data processing result of the executable file is output, based on the input-output relations in the configuration file, to the next executable file having an input-output relation with the executable file.
CN202010954140.8A 2020-09-11 2020-09-11 Service processing method, device and equipment Active CN112099848B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010954140.8A CN112099848B (en) 2020-09-11 2020-09-11 Service processing method, device and equipment

Publications (2)

Publication Number Publication Date
CN112099848A CN112099848A (en) 2020-12-18
CN112099848B (en) 2024-03-05

Family

ID=73751897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010954140.8A Active CN112099848B (en) 2020-09-11 2020-09-11 Service processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN112099848B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722065B (en) * 2021-08-20 2023-08-18 中国电子科技集团公司第十四研究所 Resource scheduling method for embedded heterogeneous hardware based on sub-graph matching
CN113961267B (en) * 2021-10-15 2023-08-25 杭州海康威视数字技术股份有限公司 Service processing method, device and equipment
CN115687224A (en) * 2022-09-09 2023-02-03 杭州海康机器人股份有限公司 Asynchronous processing flow generation method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11914606B2 (en) * 2019-03-04 2024-02-27 Walmart Apollo, Llc Systems and methods for a machine learning framework
US11263554B2 (en) * 2019-03-04 2022-03-01 Walmart Apollo, Llc Systems and methods for a machine learning framework

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101814029A (en) * 2010-04-20 2010-08-25 中国科学院对地观测与数字地球科学中心 Building method capable of expanding processing function quickly in remote sensing image processing system
CN104408062A (en) * 2014-10-29 2015-03-11 中国建设银行股份有限公司 Stock file processing method and device
CN106250987A (en) * 2016-07-22 2016-12-21 无锡华云数据技术服务有限公司 A kind of machine learning method, device and big data platform
CN106775617A (en) * 2016-11-09 2017-05-31 深圳市中润四方信息技术有限公司 The service logic collocation method and configuration system of a kind of application software
CN108432208A (en) * 2016-12-15 2018-08-21 华为技术有限公司 A kind of arranging service method, apparatus and server
CN109324791A (en) * 2017-08-18 2019-02-12 深圳怡化电脑股份有限公司 Finance self-help terminal traffic process development approach, device and terminal device
CN107577458A (en) * 2017-08-18 2018-01-12 深圳怡化电脑股份有限公司 A kind of finance self-help terminal traffic flow development approach and device
CN107632825A (en) * 2017-08-18 2018-01-26 深圳怡化电脑股份有限公司 A kind of business software development approach and device
CN108984155A (en) * 2018-05-17 2018-12-11 阿里巴巴集团控股有限公司 Flow chart of data processing setting method and device
CN112368675A (en) * 2018-06-06 2021-02-12 起元技术有限责任公司 Updating executable graphs
CN110766164A (en) * 2018-07-10 2020-02-07 第四范式(北京)技术有限公司 Method and system for performing a machine learning process
CN113169990A (en) * 2018-11-30 2021-07-23 阿里巴巴集团控股有限公司 Segmentation of deep learning inference with dynamic offload
CN111369081A (en) * 2018-12-06 2020-07-03 北京嘀嘀无限科技发展有限公司 Flow configuration method and device, electronic equipment and storage medium
CN111382347A (en) * 2018-12-28 2020-07-07 广州市百果园信息技术有限公司 Object feature processing and information pushing method, device and equipment
CN110007902A (en) * 2019-03-12 2019-07-12 中国平安财产保险股份有限公司 The method and device of business processing flow configuration
CN109918084A (en) * 2019-03-12 2019-06-21 浪潮通用软件有限公司 A kind of data mapping method of business management system
CN110717584A (en) * 2019-09-30 2020-01-21 上海寒武纪信息科技有限公司 Neural network compiling method, compiler, computer device, and readable storage medium
CN110928529A (en) * 2019-11-06 2020-03-27 第四范式(北京)技术有限公司 Method and system for assisting operator development
CN111488211A (en) * 2020-04-09 2020-08-04 北京嘀嘀无限科技发展有限公司 Task processing method, device, equipment and medium based on deep learning framework

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on Acceleration Strategies for Data-Parallel Training of Deep Neural Networks"; Ye Junxian; China Masters' Theses Full-text Database, Information Science and Technology Series; full text *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant