CN115237457A - AI application operation method and related product

Info

Publication number
CN115237457A
Authority
CN
China
Prior art keywords
application
model
inference
functional unit
adaptation model
Prior art date
Legal status
Pending
Application number
CN202110445015.9A
Other languages
Chinese (zh)
Inventor
李雨洺
李泽奇
谢周意
Current Assignee
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co Ltd filed Critical Huawei Cloud Computing Technologies Co Ltd
Priority to CN202110445015.9A
Publication of CN115237457A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/60 Software deployment
    • G06F 8/61 Installation
    • G06F 8/63 Image based installation; Cloning; Build to order
    • G06F 8/70 Software maintenance or management
    • G06F 8/71 Version control; Configuration management
    • G06F 8/76 Adapting program code to run in a different environment; Porting

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Stored Programmes (AREA)

Abstract

The application discloses an AI application running method and a related product. The method is applied to an application execution device and includes the following steps: acquiring an AI application, where the AI application includes a functional unit group and a model configuration file, and the functional unit group includes an AI functional unit; and running the functional unit group in the AI application, where, when the AI functional unit is run, a target adaptation model adapted to the inference framework in the application execution device is called according to the model configuration file to perform inference and obtain an inference result. With this method, the application execution device can automatically call an adaptation model adapted to the inference framework in the device to perform inference. From another perspective, when developing the AI application, the developer does not need to consider how to modify its inference code to adapt to the different inference frameworks in different application execution devices, which reduces the development and deployment difficulty of the AI application.

Description

AI application operation method and related product
Technical Field
The present application relates to the technical field of Artificial Intelligence (AI), and in particular, to an operating method for AI applications and related products.
Background
In recent years, AI technology, and deep learning technology in particular, has developed rapidly, so AI applications are now widely used in many fields, for example, image and speech recognition, natural language translation, and computer gaming.
An AI application is an application program developed for a specific application scenario that includes at least one operator, where an operator in the AI application is a set of operations implementing part of its functionality. The functions of some operators in the AI application may be implemented by a trained AI model; that is, during the running of the AI application, the trained AI model may be called to perform inference and produce its output result. The AI model relies on an inference framework for inference. The inference framework is software that can be called by the AI application to drive the AI model in the AI application to perform inference and obtain the output result of the AI model.
Generally, one AI application needs to be deployed to application execution devices in a plurality of different environments (e.g., a terminal environment, an edge environment, or a cloud environment). These application execution devices may have different inference hardware, where inference hardware refers to hardware with computing capability that can implement an inference function, such as a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or a Neural-network Processing Unit (NPU). A hardware vendor generally provides a specific inference framework for the inference hardware it develops, so that calling the corresponding inference hardware through that framework to perform the inference operations in an AI application achieves higher inference performance. Based on the above background, when an AI application is deployed to different types of application execution devices, different inference frameworks need to be used to ensure the inference performance of the AI model in the AI application. This means the inference code of the AI application must be individually modified to adapt to each inference framework, which imposes a huge workload on AI application developers and increases the development difficulty of AI applications.
Disclosure of Invention
The application discloses an AI application running method and a related product. With this method, different application execution devices can automatically call an adaptation model adapted to the inference framework in the device to perform inference. Consequently, when developing an AI application, the developer does not need to consider how to modify its inference code to adapt to the different inference frameworks in different application execution devices, which reduces the development and deployment difficulty of the AI application and improves its deployment efficiency.
In a first aspect, the present application provides a method for running an AI application, where the method is applied to an application execution device, and the method includes the following steps:
acquiring an AI application, where the AI application includes a functional unit group and a model configuration file, and the functional unit group includes an AI functional unit;
and running the functional unit group in the AI application, where, when the AI functional unit is run, a target adaptation model adapted to the inference framework in the application execution device is called according to the model configuration file to perform inference and obtain an inference result.
In this method, the AI application includes a model configuration file, and the model configuration file includes the configuration information each inference framework requires to perform inference based on its corresponding adaptation model. Therefore, after acquiring the AI application, the application execution device can call the target adaptation model adapted to its inference framework according to the model configuration file to perform inference, thereby running the AI application. From another perspective, the AI application can be deployed to different application execution devices without modifying inference code for different inference frameworks, reducing the development and deployment difficulty of the AI application.
In a possible implementation manner of the first aspect, the model configuration file includes a storage path of at least one adaptation model and configuration information of each adaptation model, and before the target adaptation model adapted to the inference framework in the application execution device is called to perform inference, the method further includes: determining, according to the configuration information of each adaptation model, the target adaptation model adapted to the inference framework in the application execution device; and acquiring the target adaptation model according to its storage path in the model configuration file. In this way, the application execution device can automatically acquire the target adaptation model adapted to its inference framework.
In a possible implementation manner of the first aspect, acquiring the AI application specifically includes: establishing a connection with an AI application management platform, and acquiring from it the AI application, which is generated by the AI application management platform. The process by which the AI application management platform generates the AI application may specifically include: converting the trained AI model into at least one adaptation model, each adapted to one inference framework, and generating both a model configuration file, which includes the configuration information each inference framework requires to perform inference based on its corresponding adaptation model, and the functional unit group required to run the AI application. An AI application developer can therefore deploy the AI application to different application execution devices without adapting its inference code to different inference frameworks, reducing the development and deployment difficulty of the AI application.
In a possible implementation manner of the first aspect, the functional unit group further includes a preprocessing functional unit, and before the target adaptation model adapted to the inference framework in the application execution device is called to perform inference, the method further includes: running the preprocessing functional unit, where, when the preprocessing functional unit is run, the input data of the AI application is processed.
In a possible implementation manner of the first aspect, the configuration information of the target adaptation model includes a format of input data, and running the preprocessing functional unit includes: running the preprocessing functional unit when the input data of the AI application does not meet that format.
Since different adaptation models may have different requirements on the format of the input data (e.g., image size or color space), the application execution device needs to run the preprocessing functional unit to process the input data of the AI application into input data meeting the target adaptation model's requirements before performing inference with the target adaptation model. In this method, the application execution device can obtain each adaptation model's input-format requirements from its configuration information in the model configuration file, so it can run the preprocessing functional unit according to the configuration information of the target adaptation model to process the input data accordingly.
In a possible implementation manner of the first aspect, the functional unit group further includes a post-processing functional unit, and after the inference result is obtained, the method further includes: running the post-processing functional unit, where, when the post-processing functional unit is run, the inference result is processed. Because the AI application has its own requirements on the output result, the inference results output by different adaptation models may not satisfy them; for example, if the AI application requires an output image of size 640 × 480 while the target adaptation model outputs an image of size 512 × 288, the post-processing functional unit is run after the inference result is obtained to convert the output image of the target adaptation model to the size required by the AI application.
In a possible implementation manner of the first aspect, before performing the method, the method further includes: installing a deployment agent provided by the AI application management platform, the deployment agent being used to communicate with the AI application management platform. Through the deployment agent, the application execution device can establish a long connection with the AI application management platform, which facilitates acquiring the AI application from it.
In a second aspect, the present application provides an application execution device, which includes a processor and a memory; the processor executes code in the memory to perform:
acquiring an AI application, where the AI application includes a functional unit group and a model configuration file, and the functional unit group includes an AI functional unit;
and running the functional unit group in the AI application, where, when the AI functional unit is run, a target adaptation model adapted to the inference framework in the application execution device is called according to the model configuration file to perform inference and obtain an inference result.
In a possible implementation manner of the second aspect, the model configuration file includes a storage path of at least one adaptation model and configuration information of each adaptation model, and the processor executes the code in the memory and further performs: determining, according to the configuration information of each adaptation model, the target adaptation model adapted to the inference framework in the application execution device; acquiring the target adaptation model according to its storage path in the model configuration file; and calling the target adaptation model to perform inference to obtain the inference result.
In a possible implementation manner of the second aspect, the processor executes the code in the memory, and further executes: establishing connection with an AI application management platform; and acquiring the AI application from the AI application management platform, wherein the AI application is generated by the AI application management platform.
In a possible implementation manner of the second aspect, the functional unit group further includes a preprocessing functional unit, and the processor executes the code in the memory and further performs: running the preprocessing functional unit, where, when the preprocessing functional unit is run, the input data of the AI application is processed.
In a possible implementation manner of the second aspect, the configuration information of the target adaptation model includes a format of input data, and the processor executes the code in the memory and further performs: running the preprocessing functional unit when the input data of the AI application does not meet that format.
In a possible implementation manner of the second aspect, the functional unit group further includes a post-processing functional unit, and the processor executes the code in the memory and further performs: running the post-processing functional unit, where, when the post-processing functional unit is run, the inference result is processed.
In a possible implementation manner of the second aspect, the processor executes the code in the memory and further performs: installing a deployment agent provided by the AI application management platform, the deployment agent being used to communicate with the AI application management platform.
In a third aspect, the present application provides a computer-readable storage medium storing computer instructions that, when executed by a computing device, cause the computing device to perform the method provided in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, the present application provides a computer program product comprising computer instructions that, when executed by a computing device, cause the computing device to perform the method provided in the first aspect or any possible implementation of the first aspect.
Drawings
To illustrate the technical solutions in the present application more clearly, the drawings used in the description are briefly introduced below. Evidently, the drawings described below show some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario provided in the present application;
FIG. 2 is a schematic diagram of an application scenario provided in the present application;
FIG. 3 is a schematic diagram of a model configuration file provided in the present application;
FIG. 4 is a schematic diagram of an orchestration interface provided in the present application;
FIG. 5 is a schematic flowchart of an AI application running method provided in the present application;
FIG. 6 is a schematic diagram of a model configuration file provided in the present application;
FIG. 7 is a schematic structural diagram of an application execution device provided in the present application.
Detailed Description
In order to facilitate understanding of the technical solutions provided in the present application, some technical terms related to the present application are first introduced.
An AI application is an application realized through AI techniques such as machine learning. It is an application program developed for a particular application scenario and includes at least one operator, where an operator in the AI application is a set of operations implementing part of its functionality. The functions of some operators in an AI application may be implemented by an AI model.
An AI model is a model developed and trained with AI techniques such as machine learning (e.g., deep learning). The AI model may specifically be a neural network model, which can automatically derive rules from data and use those rules to perform inference on unknown data.
The application scenario of the present application is described below with reference to fig. 1. As shown in fig. 1, the AI application management platform 100 includes a development system 101, which may be deployed in a cloud environment. A cloud environment refers to a central cluster of computing devices, owned by a cloud service provider, that provides computing, storage, and communication resources; it includes at least one cloud device, which may be a computing device in the central cluster, such as a central server. The client 200 establishes a communication connection, which may be a long connection, with the development system 101 in the cloud environment. The client 200 may be a browser or a dedicated client.
An AI model developer may interact with the development system 101 through the client 200 to build and train an AI model on the development system 101 and publish the AI model after training. Further, an AI application developer may interact with the development system 101 through the client 200 to generate an AI functional unit based on the trained AI model and to generate an AI application from the AI functional unit together with the relevant pre-processing and post-processing functional units. The AI application management platform 100 also supports deploying an AI application to the application execution device 400 (e.g., the terminal device 401, the edge device 402, or the cloud device 403).
In some implementations, the AI application developer can also interact with the AI application management platform 100 through the client 200 to publish the AI application to the application marketplace 300. As such, the AI application user may trigger an AI application deployment operation through the application marketplace 300, and the application marketplace 300 may deploy the AI application to the application execution device 400, for example, the application marketplace 300 may deploy the AI application to the application execution device 400 through the AI application management platform 100.
The AI model developer and the AI application developer may be the same developer or different developers. The terminal device 401 includes, but is not limited to, a desktop computer, a laptop, a tablet, or a smartphone. The edge device 402 is a computing device in an edge environment, such as an edge server or a computing box, where an edge environment refers to a cluster of edge computing devices geographically close to terminal devices that provides computing, storage, and communication resources. The cloud device 403 is a computing device in a cloud environment, such as a central server.
AI applications typically rely on an inference framework to perform inference through an AI model. The inference framework is software that can be called by the AI application to drive the AI model in the AI application to perform inference and obtain its output result. Specifically, the inference framework provides an Application Programming Interface (API), and the AI application calls the API to drive the AI model to perform inference. The inference framework is typically provided by the hardware vendor when releasing the inference hardware. Different hardware vendors release different types of inference hardware, such as CPUs, GPUs, NPUs, or other AI chips, and to fully exploit the performance of the inference hardware, the vendor also provides an inference framework adapted to it.
AI applications can be used in different scenarios across domains. For example, an AI application may be a smart city management application for automating city management; for another example, it may be a content auditing application that automatically audits content to be published on a platform, improving auditing efficiency.
Currently, many AI applications need to be deployed to different environments; smart city management applications, for example, generally need to be deployed across environments to meet the business requirement of performing city management cooperatively. However, the types of inference hardware of application execution devices differ across environments: some application execution devices have CPUs as inference hardware, others have GPUs, and still others have NPUs. A hardware vendor generally provides a specific inference framework for the inference hardware it develops, so that calling the corresponding inference hardware through that framework to execute the inference operations in an AI application achieves high inference performance. When an AI application is deployed to different types of application execution devices (e.g., devices with different inference hardware), the inference code of the AI application needs to be modified separately to adapt to each inference framework in order to ensure the inference performance of the AI model. This imposes a huge workload on AI application developers and increases the development difficulty of AI applications.
To solve the above problem, the present application provides an AI application running method in which the AI application acquired by the application execution device 400 from the AI application management platform 100 includes a model configuration file containing the configuration information that different inference frameworks require to perform inference based on their corresponding adaptation models. The application execution device 400 can accordingly call a target adaptation model adapted to its inference framework, according to the model configuration file, to perform inference, thereby running the AI application. In this way, the AI application can be run by application execution devices with different inference hardware. From another perspective, when developing the AI application, an AI application developer need not consider how to modify its inference code to adapt to the different inference frameworks in different application execution devices. The developer can deploy the AI application directly to different application execution devices, and when running it, each device automatically calls an adaptation model adapted to its inference framework to perform inference according to the model configuration file. This reduces the development and deployment difficulty of the AI application.
The AI application including the model configuration file, together with its development, deployment, and running processes, is described below with reference to fig. 2 to 6.
Referring first to fig. 2, compared with fig. 1, the AI application management platform 100 shown in fig. 2 adds a deployment management system 102. The development system 101 and the deployment management system 102 establish a communication connection, which may be wired or wireless. The parts of the AI application management platform 100, such as the development system 101 and the deployment management system 102, may be deployed centrally in the cloud environment or distributed across it; fig. 2 illustrates centralized deployment.
The development system 101 is configured to convert the trained AI model into at least one adaptation model and generate a model configuration file. Specifically, when the trained AI model is obtained by training on a custom model template, the development system 101 may receive conversion scripts, written by an AI model developer, for the adaptation models corresponding to different inference frameworks, and then generate at least one adaptation model using these conversion scripts. Each adaptation model is adapted to one inference framework; for example, an OpenVINO model is adapted to the OpenVINO framework, and a pb model is adapted to the TensorFlow framework. The development system 101 then generates a model configuration file from the configuration information each inference framework requires to perform inference based on its corresponding adaptation model.
The model configuration file includes configuration information for at least one adaptation model, that is, the configuration information each inference framework requires to perform inference based on its corresponding adaptation model. Taking fig. 3 as an example, the model configuration file includes configuration information 1 required by inference framework 1 to perform inference based on corresponding adaptation model 1, configuration information 2 required by inference framework 2 to perform inference based on corresponding adaptation model 2, ..., and configuration information N required by inference framework N to perform inference based on corresponding adaptation model N, where N is a positive integer.
As shown in fig. 3, the configuration information of each adaptation model may specifically include the hardware type (device type) of the inference hardware corresponding to the inference framework, so that the inference framework can match the hardware type of its inference hardware to the corresponding configuration information and perform inference based on that information. The configuration information may also include other inference configuration items, such as the input data format of the adaptation model, the input identifier of the adaptation model, and the output identifier of the adaptation model. Taking image input data as an example, the input data format of the adaptation model may include, but is not limited to, one or more of the following: the color space of the image (e.g., RGB or NV21), the image size (width and height), the number of image channels, whether normalization is required, and the like. The input identifier of the adaptation model may be the name of its input node (e.g., the name of a pb model's input node), and the output identifier may be the name of its output node (e.g., the name of a pb model's output node). It should be understood that the above configuration information is only an example; different adaptation models have different requirements on the format of the input data, so in practical applications, the configuration information of each adaptation model is configured according to the specific adaptation model and application scenario.
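For intuition, the following is a minimal sketch of what such a model configuration file might look like, rendered here as a Python dictionary. Every field name and value is an illustrative assumption made for this description, not a schema defined by the present application.

```python
# A hypothetical model configuration file, shown as a Python dict purely for
# illustration; every field name and value here is an assumption.
model_config = {
    "models": [
        {
            "inference_framework": "TensorFlow",      # framework this adaptation model targets
            "device_type": "GPU",                     # hardware type of the matching inference hardware
            "model_path": "models/adapted_model.pb",  # storage path of the adaptation model
            "input_format": {
                "color_space": "RGB",
                "width": 512,
                "height": 288,
                "channels": 3,
                "normalize": True,
            },
            "input_id": "input_tensor:0",             # name of the model's input node
            "output_id": "output_tensor:0",           # name of the model's output node
        },
        {
            "inference_framework": "OpenVINO",
            "device_type": "CPU",
            "model_path": "models/adapted_model.xml",
            "input_format": {
                "color_space": "BGR",
                "width": 512,
                "height": 288,
                "channels": 3,
                "normalize": False,
            },
            "input_id": "data",
            "output_id": "prob",
        },
    ]
}
```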
In some possible implementations, the model configuration file further includes the at least one adaptation model itself, or the storage path of each of the at least one adaptation model. It should be understood that the purpose of including either in the model configuration file is to enable the application execution device 400 to obtain the corresponding adaptation model (see the related description of S104). When the model configuration file includes the at least one adaptation model itself, the application execution device 400 does not need to download the target adaptation model remotely, improving the startup efficiency of the AI application. When the model configuration file includes only storage paths, its size is greatly reduced, which reduces the resources consumed in transmitting it and improves resource utilization.
As shown in fig. 3, the model configuration file includes the storage path of adaptation model 1, the storage path of adaptation model 2, ..., and the storage path of adaptation model N. In this way, after acquiring the AI application, the application execution device 400 can obtain the corresponding adaptation model according to the storage path of each adaptation model in the model configuration file. In some embodiments, the application execution device 400 may instead send a model acquisition request to the deployment management system 102, and the deployment management system 102, in response, obtains the corresponding adaptation model according to its storage path and returns it to the application execution device 400.
The AI model may be obtained by an AI model developer performing model training through interaction with the development system 101 via the client 200. Specifically, the development system 101 may include a cloud integrated development environment (cloud IDE). The cloud IDE provides algorithms and data for model training and can train a model with these algorithms (e.g., deep learning algorithms) on the data to obtain a trained AI model.
Further, the development system 101 may also provide tool services (tools) through the cloud IDE, with functions including data management, training management, and deployment management. Training management includes recommending that users (such as AI model developers) build and train models from preset model templates; an AI model trained on a model template can be automatically converted into adaptation models adapted to different inference frameworks.
The cloud IDE is also provided with file packages, which include an algorithm package for training. An AI model developer selects an algorithm from the algorithm package; after the AI model is trained with that algorithm, it can be automatically converted using the conversion capability provided by the tool services, and a model configuration file is generated. For example, an AI model developer may click a model conversion control on the user interface to trigger a model conversion operation, and the cloud IDE automatically converts the trained AI model into at least one adaptation model through the tool services, each adaptation model adapted to one inference framework. In this way, differences between models can be masked.
In some possible implementations, the AI model developer can also use a custom model template for model training. When training with a custom model template, the development system 101 converts the trained AI model into at least one adaptation model based on at least one conversion script defined by the AI model developer. Conversion scripts and adaptation models may be in one-to-one correspondence, as in the sketch below.
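As a concrete illustration of this custom-template path, the sketch below registers one conversion script per inference framework. The converter bodies are placeholders standing in for vendor-specific conversion tooling, not a real conversion API.

```python
# A minimal sketch of custom conversion scripts, assuming one script per
# target inference framework. The converter bodies are placeholders for
# vendor-specific tooling, not real conversion APIs.

def convert_to_pb(trained_model_path: str, out_dir: str) -> str:
    # Placeholder: convert the trained AI model into a TensorFlow pb adaptation model.
    return f"{out_dir}/adapted_model.pb"

def convert_to_openvino(trained_model_path: str, out_dir: str) -> str:
    # Placeholder: convert the trained AI model into an OpenVINO adaptation model.
    return f"{out_dir}/adapted_model.xml"

# One conversion script per inference framework (one-to-one correspondence).
CONVERSION_SCRIPTS = {
    "TensorFlow": convert_to_pb,
    "OpenVINO": convert_to_openvino,
}

def convert_all(trained_model_path: str, out_dir: str) -> dict:
    """Run every registered conversion script; returns framework -> adaptation model path."""
    return {fw: script(trained_model_path, out_dir)
            for fw, script in CONVERSION_SCRIPTS.items()}
```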
The development system 101 is also used to generate AI applications. An AI application includes a functional unit group and a model configuration file. The functional unit group includes at least one functional unit, where a functional unit (also referred to as an operator) is a set of program code comprising a series of operations that implement an independent business function.
In a specific embodiment, the functional unit group may include an AI functional unit for performing AI model inference to implement the function of the AI model; the AI functional unit includes inference code capable of driving inference of one of the at least one adaptation model. Optionally, the functional unit group may further include a pre-processing functional unit for processing before AI model inference, a post-processing functional unit for processing after AI model inference, and the like. The pre-processing functional unit preprocesses the input data of the AI application to convert it into input data meeting the requirements of the adaptation model, and can be configured according to the input-format requirements of the different adaptation models and the possible input data of the AI application. Taking video input data as an example, the pre-processing functional unit may include functional units for image scaling, video encoding and decoding, image color space transformation, and the like. Similarly, the post-processing functional unit processes the output result of the AI model to meet the requirements of the AI application, and can be configured according to the output data formats of the different adaptation models and the AI application's requirements on the format of the adaptation model's output data, which is not specifically limited here.
In the embodiment of the present application, the functional unit group may be generated in a plurality of ways, and two possible ways are listed below.
The first implementation: an AI application developer triggers an orchestration operation through an orchestration tool in the development system 101, for example a visual orchestration interface. Orchestration refers to flexibly assembling normalized components, such as functional units, into a business process. The development system 101 arranges a plurality of functional units, including the AI functional unit, in response to the developer's orchestration operation to generate the functional unit group, enabling zero-code programming. Accordingly, the functional unit group is presented in graph form, referred to herein as a business logic diagram for convenience of description. After the application execution device 400 acquires the AI application, it can load the functional unit group in the form of this business logic diagram through a graph engine.
The second implementation: an AI application developer triggers a code-writing operation through the development system 101. Specifically, each functional unit provides an API, and the AI application developer can call the API of any functional unit by writing code; the development system 101 generates the functional unit group from the written code in response to the code-writing operation. Accordingly, the functional unit group takes the form of code, for example a sequence of API calls, and after the application execution device 400 acquires the AI application, it can load the functional unit group through these APIs. A rough sketch of this form follows.
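The following sketch models this code-written form of a functional unit group; the unit functions and their composition are assumptions made only for illustration.

```python
# A sketch of a functional unit group expressed as code. Each functional unit
# is modeled as a callable over the data flowing through the AI application;
# all names are illustrative assumptions.
from typing import Any, Callable, List

FunctionalUnit = Callable[[Any], Any]

def decode_unit(data: Any) -> Any:       # pre-processing functional unit (placeholder)
    return data

def ai_unit(data: Any) -> Any:           # AI functional unit: would drive model inference
    return data

def postprocess_unit(data: Any) -> Any:  # post-processing functional unit (placeholder)
    return data

# The functional unit group, in code form rather than as a business logic diagram.
functional_unit_group: List[FunctionalUnit] = [decode_unit, ai_unit, postprocess_unit]

def run_group(input_data: Any) -> Any:
    """Run the functional unit group in order, feeding each unit's output onward."""
    data = input_data
    for unit in functional_unit_group:
        data = unit(data)
    return data
```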
Considering that the input and output data formats of an AI application may differ across business scenarios, the development system 101 may also allow an AI application developer to customize the input data format and/or output data format of the AI application to increase its adaptability. Specifically, the development system 101 can present a user interface, such as a visual orchestration interface, through which it receives the input data format and/or output data format configured by the AI application developer.
Taking fig. 4 as an example, the orchestration interface includes a toolbar 501 and an editing area 502, where the toolbar 501 carries a plurality of functional units. An AI application developer selects functional units in the toolbar 501 and drags them to the editing area 502: for example, the developer sequentially selects pre-processing functional units (a de-encapsulation unit, a decoding unit, a service branching unit, and an image scaling (resize) unit), AI functional units (an inference 1 unit and an inference 2 unit), and post-processing functional units (a post-processing 1 unit, a post-processing 2 unit, and a service decision unit), drags them to the editing area 502, connects them there, adds an input node and an output node, configures the input data format of the AI application at the input node, and configures the output data format at the output node, thereby forming a business logic diagram. The business logic diagram can be packaged into an AI application and loaded and run with a graph engine.
The business logic diagram described above is generally static: once packaged into an AI application, the business process of the AI application typically cannot be modified dynamically. When a business process does need dynamic modification, a dynamic business logic diagram (hereinafter, dynamic diagram) can be generated, and the graph engine can load the dynamic diagram through function calls.
The development system 101 may also provide a debugging function to debug the business logic of the AI application. It should be noted that when the development system 101 does not provide a functional unit for some piece of business logic, it also supports the AI application developer manually writing the corresponding code.
The deployment management system 102 is used to deploy the developed AI application to the application execution device 400, which may be one or more devices in a resource pool, such as the terminal device 401, the edge device 402, or the cloud device 403. Specifically, the deployment management system 102 may package the model configuration file and the business logic diagram configured on the development system 101 with a packaging tool, for example a continuous integration/continuous delivery (CI/CD) tool, generating a file package in a set format, and then deploy the file package to the application execution device. When the application execution device provides services externally through containers, the deployment management system 102 may also build an image of the AI application with an image-building tool and then deploy the AI application from that image into, for example, a container of the application execution device, so that the image reduces the influence of environment differences.
In another possible implementation, the deployment management system 102 may deploy the developed AI application to the application execution device 400 as follows: an AI application developer first publishes the developed AI application to the application market 300 and then triggers an application deployment operation through the application market 300; in response, the application market 300 generates an application deployment request and sends it to the deployment management system 102 to request that the AI application be deployed to the application execution device 400.
After the application execution device 400 acquires the AI application from the deployment management system 102, because the AI application includes the model configuration file containing configuration information such as the inference hardware type, the application execution device 400 can match its own inference hardware type against the model configuration file, acquire the target adaptation model matching the inference framework that corresponds to its inference hardware type, and then call the target adaptation model to perform inference, completing the running of the AI application. The specific process by which the application execution device 400 runs the AI application is described below with reference to fig. 5.
FIG. 5 is a schematic flowchart of an AI application running method provided in the present application. The method includes, but is not limited to, the following steps:
s101: the application execution device 400 installs the deployment agent provided by the AI application management platform 100.
The deployment agent is used to implement communication between the application execution device 400 and the AI application management platform 100. Specifically, the deployment agent establishes a long connection with the deployment management system 102 and cooperates with it so that the application execution device 400 can acquire the AI application from the deployment management system 102.
S102: the application execution device 400 establishes a connection with the AI application management platform 100.
S103: the application execution device 400 acquires an AI application from the AI application management platform 100. The AI application includes a group of functional units and a model configuration file.
S104: the application execution device 400 acquires a target adaptation model adapted to the inference framework in the application execution device 400 according to the model configuration file.
In one embodiment, the model configuration file includes the storage path of at least one adaptation model and the configuration information of each adaptation model. In this case, the application execution device 400 acquires the target adaptation model adapted to its inference framework as follows: it determines the target adaptation model according to the configuration information of each adaptation model, and then obtains the target adaptation model according to the storage path of the target adaptation model in the model configuration file.
In another embodiment, the model configuration file includes at least one adaptation model and the configuration information of each adaptation model. In this case, the application execution device 400 determines the target adaptation model according to the configuration information of each adaptation model, and then obtains the target adaptation model directly from the model configuration file.
More specifically, the configuration information of each adaptation model includes the hardware type of the inference hardware corresponding to the inference framework that the adaptation model is adapted to. The application execution device 400 therefore determines the target adaptation model as follows: according to the hardware type of its own inference hardware and the configuration information of each adaptation model, it finds the adaptation model whose configured inference hardware type is consistent with its own, and determines that this target adaptation model is adapted to the inference framework in the application execution device 400.
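This matching logic of S104 can be pictured with the short sketch below, which reuses the hypothetical configuration fields from the earlier model configuration sketch; it is an illustration under those assumptions, not the method's required implementation.

```python
# A sketch of S104: select the target adaptation model by matching the
# device's inference hardware type against each model's configuration.
# Field names follow the hypothetical model_config sketch above.

def select_target_model(model_config: dict, device_hardware_type: str) -> dict:
    """Return the adaptation model entry whose device type matches this device."""
    for entry in model_config["models"]:
        if entry["device_type"] == device_hardware_type:
            return entry
    raise RuntimeError(f"no adaptation model for hardware type {device_hardware_type!r}")

# Usage: on a GPU device this picks the TensorFlow entry from the earlier sketch,
# after which the model is fetched via its storage path.
# target = select_target_model(model_config, "GPU")
# with open(target["model_path"], "rb") as f:
#     adapted_model_bytes = f.read()
```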
It is noted that an adaptation model may have a plurality of different versions, so the at least one adaptation model in the model configuration file may include different versions of the same adaptation model. In this case, the configuration information of each adaptation model may further include the version of the adaptation model, so that the application execution device 400 can select the adaptation model meeting its requirements, mitigating compatibility problems between versions of the same adaptation model.
S105: the application execution device 400 runs a functional unit group in the AI application to run the AI application.
In a specific embodiment, the group of functional units includes one or more of the following functional units: AI function unit, pretreatment function unit, post-treatment function unit. When the functional unit group includes an AI functional unit, a pre-processing functional unit, and a post-processing functional unit, step S105 may specifically include the following steps:
step 1: the application execution device 400 runs a preprocessing unit.
In a specific embodiment, when running the preprocessing functional unit, the application execution device 400 processes the input data of the AI application into input data that meets the requirements of the target adaptation model.
In a specific embodiment, the configuration information of each adaptation model in the model configuration file includes the format of the input data, i.e. the requirements of each adaptation model on the format of the input data. Then, the application execution device 400 may obtain the requirement of the target adaptation model for the format of the input data according to the configuration information of the target adaptation model, and then process the input data of the AI application according to the requirement of the target adaptation model for the format of the input data, so as to obtain the input data meeting the requirement of the target adaptation model.
Alternatively, the application execution device 400 may not perform step 1 when the input data of the AI application satisfies the requirement of the target adaptation model on the format of the input data.
Optionally, the preprocessing functional unit includes a plurality of preprocessing functional sub-units, each implementing one or more preprocessing steps. The application execution device 400 can therefore determine which of the target adaptation model's input-format requirements the input data does not satisfy, and run only the corresponding preprocessing sub-units, thereby processing the input data of the AI application into input data meeting the target adaptation model's requirements.
It should be understood that the configuration information of each adaptation model in the model configuration file may further include the input identifier and the output identifier of the adaptation model. For example, before performing inference with a pb model, identifiers, namely the name of the input node and the name of the output node, need to be added to the pb model's input and output nodes; so if the target adaptation model is a pb model, the application execution device 400 also runs a preprocessing functional sub-unit that adds the input and output identifiers. That is, in practical applications, the application execution device 400 runs the preprocessing functional unit according to the actual situation.
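The following sketch illustrates this dispatch, running a preprocessing sub-unit only for each unmet format requirement. The Frame type and the sub-unit bodies are placeholders assumed for illustration, not the actual functional units.

```python
# A sketch of step 1: run only the preprocessing sub-units whose format
# requirements the input does not already satisfy. The Frame type and
# sub-unit bodies are illustrative placeholders.
from dataclasses import dataclass

@dataclass
class Frame:
    color_space: str       # e.g. "NV21" or "RGB"
    width: int
    height: int
    normalized: bool = False

def convert_color(frame: Frame, target_space: str) -> Frame:
    frame.color_space = target_space           # placeholder color-space conversion sub-unit
    return frame

def resize(frame: Frame, width: int, height: int) -> Frame:
    frame.width, frame.height = width, height  # placeholder image scaling sub-unit
    return frame

def normalize(frame: Frame) -> Frame:
    frame.normalized = True                    # placeholder normalization sub-unit
    return frame

def preprocess(frame: Frame, input_format: dict) -> Frame:
    """Dispatch each preprocessing sub-unit only when its requirement is unmet."""
    if frame.color_space != input_format["color_space"]:
        frame = convert_color(frame, input_format["color_space"])
    if (frame.width, frame.height) != (input_format["width"], input_format["height"]):
        frame = resize(frame, input_format["width"], input_format["height"])
    if input_format.get("normalize") and not frame.normalized:
        frame = normalize(frame)
    return frame
```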
Step 2: the application execution device 400 runs an AI function unit.
In a specific embodiment, when the application execution device runs the AI function unit, the application execution device 400 invokes the target adaptation model to perform inference according to the model configuration file, so as to obtain an inference result.
And step 3: the application execution device 400 runs a post-processing functional unit.
In a specific embodiment, when the application execution device 400 runs the post-processing functional unit, the inference result is processed to be output data meeting the AI application requirement.
In a specific embodiment, the configuration information of each adapted model in the model configuration file further includes requirements for output data (e.g., format of the output data), that is, requirements of the AI application for the output data. Then, the application execution device 400 may obtain the requirement of the AI application on the output data according to the configuration information of the target adaptation model, and then process the output data of the target adaptation model according to the requirement of the AI application on the output data, so as to obtain the output data meeting the requirement of the AI application on the output data.
Alternatively, when the output data of the target adaptation model satisfies the requirement of the AI application for the output data, the application execution device 400 may not perform step 3.
Optionally, the post-processing functional unit includes a plurality of post-processing functional sub-units, each implementing one or more post-processing steps. The application execution device 400 can therefore determine which of the AI application's output requirements the output data of the target adaptation model does not satisfy, and run only the corresponding post-processing sub-units, thereby processing the output data into output data meeting the AI application's requirements. As with the preprocessing functional unit, in practical applications the application execution device 400 runs the post-processing functional unit according to the actual situation, which is not repeated here.
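The post-processing dispatch can be sketched symmetrically, for example resizing a 512 × 288 inference output to the 640 × 480 the AI application expects. The names are again illustrative assumptions, and the sketch reuses the Frame type and resize sub-unit from the preprocessing sketch above.

```python
# A sketch of step 3, mirroring the preprocessing dispatch: run a
# post-processing sub-unit only when the inference output misses one of the
# AI application's output requirements. Reuses the Frame/resize placeholders.

def postprocess(frame: Frame, app_output_format: dict) -> Frame:
    required = (app_output_format["width"], app_output_format["height"])
    if (frame.width, frame.height) != required:
        frame = resize(frame, *required)   # e.g. 512x288 model output -> 640x480
    return frame

# Usage: postprocess(inference_output, {"width": 640, "height": 480})
```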
For example, assume the inference hardware in the application execution device 400 is a GPU, the model configuration file in the AI application is as shown in fig. 6, and the input data of the AI application is an NV21 image of size 512 × 288. When the AI application is launched, the application execution device 400 (specifically, its internal inference framework) automatically parses the model configuration file shown in fig. 6, then determines and acquires adaptation model 3. According to the configuration information of adaptation model 3, it runs a preprocessing unit for image color space conversion, converting the NV21 image into an RGB image, and also runs normalization and mean preprocessing units so that the converted image is normalized and RGB-mean processed; the name of the input node and the name of the output node are configured for adaptation model 3. Finally, the processed image is used as the input of adaptation model 3, the AI functional unit is run to perform inference with adaptation model 3 to obtain an inference result, and after determining that the inference result meets the AI application's requirement on the output data format, the inference result is output.
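Tying the pieces together, the sketch below walks this worked example end to end under the same assumptions as the earlier sketches (a GPU device and an NV21 512 × 288 input frame); the inference step remains a placeholder for the framework's API call.

```python
# A sketch assembling the previous placeholders into the worked example's flow:
# select (S104), preprocess (step 1), infer (step 2), post-process (step 3).

def run_inference(target_entry: dict, frame: Frame) -> Frame:
    # Placeholder for the AI functional unit invoking the inference framework's
    # API on the adaptation model at target_entry["model_path"].
    return frame

def run_ai_application(model_config: dict, device_hw: str, frame: Frame) -> Frame:
    target = select_target_model(model_config, device_hw)  # S104: pick adaptation model
    frame = preprocess(frame, target["input_format"])      # step 1: e.g. NV21 -> RGB, normalize
    result = run_inference(target, frame)                  # step 2: perform inference
    output_format = target.get("output_format")            # step 3: run post-processing
    return postprocess(result, output_format) if output_format else result  # only if needed

# Example: a GPU device with an NV21 512x288 input frame.
# result = run_ai_application(model_config, "GPU", Frame("NV21", 512, 288))
```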
Based on the above solution, after the application execution device 400 acquires the AI application from the AI application management platform 100, it can automatically call the target adaptation model adapted to its inference framework to perform inference according to the model configuration file in the AI application, thereby running the AI application. In this way, application execution devices with different inference frameworks can all run the AI application. From another perspective, by generating a model configuration file and packaging it in the AI application, an AI application developer can deploy the AI application directly to different application execution devices without manually modifying inference code for different inference frameworks. This reduces the development and deployment difficulty of the AI application.
The foregoing describes in detail the method by which an application execution device runs an AI application; the application execution device provided in the present application is described below with reference to fig. 7. Fig. 7 shows a schematic structural diagram of the application execution device provided in the present application. An inference framework is installed on the application execution device 400; the inference framework is software called by the AI application. As shown in fig. 7, the application execution device 400 includes a memory 601, a processor 602, a communication interface 603, and a bus 604, where the memory 601, the processor 602, and the communication interface 603 are communicatively connected to one another through the bus 604.
The memory 601 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 601 may store programs, for example, the code of the functional unit group. The memory 601 may also store data generated by the processor 602 during execution, for example, the target adaptation model, the configuration information of the target adaptation model, and inference results.
The processor 602 may be a general-purpose CPU, a GPU, a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits. The processor may also be an integrated circuit chip having signal processing capability. In implementation, some or all of the functions of the application execution device may be completed by integrated logic circuits of hardware in the processor 602 or by instructions in the form of software. The processor 602 may also be a general-purpose processor, a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical blocks disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 601; the processor 602 reads the information in the memory 601 and, in combination with its hardware, completes the function of the application execution device 400 running the AI application. In a specific embodiment, the processor 602 may further include one or more pieces of inference hardware, for example, the inference hardware 1 shown in fig. 6, which is the hardware called by the inference framework to execute the AI application, for example, a CPU, a GPU, or artificial intelligence chips developed by various vendors.
The communication interface 603 uses a transceiver module, such as but not limited to a transceiver, to implement communication between the application execution device 400 and other devices or communication networks. For example, the AI application may be obtained through the communication interface 603.
The bus 604 may include a pathway for transferring information between the various components of the application execution device 400 (e.g., the memory 601, the processor 602, the inference hardware 130, and the communication interface 603).
In a specific embodiment, the processor 602 executes the code in the memory 601 to perform the following steps: acquiring an AI application, wherein the AI application comprises a functional unit group and a model configuration file, and the functional unit group comprises an AI functional unit; and operating the functional unit group in the AI application, wherein when the AI functional unit is operated, the target adaptation model adapted to the inference framework in the application execution device 400 is called to execute inference according to the model configuration file, and an inference result is obtained.
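For ease of understanding only, a minimal Python sketch of these two steps is given below; the object layout ('app', 'functional_units', 'model_config') is a hypothetical illustration, not a prescribed interface.

```python
def run_ai_application(app):
    """Run the functional unit group of an acquired AI application.
    Each functional unit (pre-processing, AI, post-processing, ...) is
    run in order; the AI functional unit internally calls the target
    adaptation model selected from the model configuration file."""
    data = app.read_input()
    for unit in app.functional_units:
        data = unit.run(data, app.model_config)
    return data  # the final inference result
```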
In a specific embodiment, the model configuration file includes a storage path of at least one adaptation model and configuration information of each adaptation model. The processor 602 executes the code in the memory 601 and further performs the following steps: determining the target adaptation model adapted to the inference framework in the application execution device 400 according to the configuration information of each adaptation model; and acquiring the target adaptation model according to the storage path of the target adaptation model in the model configuration file.
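For ease of understanding only, the following sketch shows one way such a model configuration file could be parsed: the adaptation model whose configuration information names the locally installed inference framework is selected and loaded from its storage path. The JSON keys ('models', 'framework', 'path') are illustrative assumptions; the present application does not prescribe a concrete file layout.

```python
import json
from pathlib import Path

def load_target_adaptation_model(config_path: str, installed_framework: str):
    """Select the adaptation model matching the locally installed inference
    framework, then load it from the storage path recorded in the file."""
    config = json.loads(Path(config_path).read_text())
    for entry in config["models"]:
        if entry["framework"] == installed_framework:
            return entry, Path(entry["path"]).read_bytes()
    raise RuntimeError(f"no adaptation model for framework {installed_framework!r}")
```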
In a specific embodiment, the processor 602 executes the code in the memory 601 and further performs the following steps: establishing a connection with the AI application management platform 100; and acquiring the AI application from the AI application management platform 100, where the AI application is generated by the AI application management platform 100.
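For ease of understanding only, the following sketch shows how the application execution device could pull the packaged AI application from the management platform; the URL layout and zip packaging are illustrative assumptions.

```python
import io
import urllib.request
import zipfile

def fetch_ai_application(platform_url: str, app_id: str, dest_dir: str) -> None:
    """Download the AI application package (functional unit group plus
    model configuration file) from the management platform and unpack it."""
    url = f"{platform_url}/applications/{app_id}/package"
    with urllib.request.urlopen(url) as resp:
        zipfile.ZipFile(io.BytesIO(resp.read())).extractall(dest_dir)
```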
In a specific embodiment, the functional unit group further includes a preprocessing functional unit, and the processor 602 executes the code in the memory 601 and further performs the following step: operating the preprocessing functional unit, where the input data of the AI application is processed when the preprocessing functional unit is operated.
In a specific embodiment, the configuration information of the target adaptation model includes a format of input data, and the processor 602 executes the code in the memory 601 and further performs the following step: operating the preprocessing functional unit when the input data of the AI application does not conform to the format of the input data.
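For ease of understanding only, this conditional can be sketched as follows; the format dictionaries (for example {'color_space': 'RGB', 'width': 512, 'height': 288}) are illustrative assumptions.

```python
from typing import Any, Callable, Dict

def maybe_preprocess(
    input_data: Any,
    actual_format: Dict[str, Any],
    expected_format: Dict[str, Any],
    preprocess: Callable[[Any], Any],
) -> Any:
    """Run the pre-processing functional unit only when the AI application's
    input does not already match the input-data format declared in the
    target adaptation model's configuration information."""
    if actual_format != expected_format:
        return preprocess(input_data)
    return input_data
```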
In a specific embodiment, the functional unit group further includes a post-processing functional unit, and the processor 602 executes the code in the memory 601 and further performs the following step: operating the post-processing functional unit, where the inference result is processed when the post-processing functional unit is operated.
In a specific embodiment, a deployment agent provided by the AI application management platform 100 is installed on the application execution device 400, and the deployment agent is used to communicate with the AI application management platform 100.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, the implementation may take the form, in whole or in part, of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computing device (such as the application execution device shown in fig. 7), the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computing device may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any usable medium accessible by a computing device, or a data storage device, such as a server or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., an SSD). In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
While the present application has been described with reference to specific embodiments, its protection scope is not limited thereto, and those skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed herein. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. An artificial intelligence (AI) application running method, applied to an application execution device, the method comprising:
acquiring an AI application, wherein the AI application comprises a functional unit group and a model configuration file, and the functional unit group comprises an AI functional unit;
and operating the functional unit group in the AI application, wherein when the AI functional unit is operated, a target adaptation model adapted to an inference framework in the application execution device is called according to the model configuration file to execute inference, so as to obtain an inference result.
2. The method of claim 1, wherein the model configuration file comprises a storage path of at least one adaptation model, and configuration information of each adaptation model;
before the calling the target adaptation model adapted to the inference framework in the application execution device to perform inference, the method further comprises:
determining the target adaptation model adapted to the inference framework in the application execution device according to the configuration information of each adaptation model;
and acquiring the target adaptation model according to the storage path of the target adaptation model in the model configuration file.
3. The method according to claim 1 or 2, wherein the acquiring an AI application specifically comprises:
establishing connection with an AI application management platform;
and acquiring the AI application from the AI application management platform, wherein the AI application is generated by the AI application management platform.
4. The method according to any one of claims 1-3, wherein the functional unit group further comprises a pre-processing functional unit, and before the calling a target adaptation model adapted to an inference framework in the application execution device to execute inference, the method further comprises:
and operating the preprocessing function unit, wherein when the preprocessing function unit is operated, the input data of the AI application is processed.
5. The method of claim 4, wherein the configuration information of the target adaptation model comprises a format of input data, and the operating the pre-processing functional unit comprises:
and when the input data of the AI application does not conform to the format of the input data, operating the pre-processing functional unit.
6. The method according to any one of claims 1-5, wherein the functional unit group further comprises a post-processing functional unit, and after the obtaining an inference result, the method further comprises:
and operating the post-processing functional unit, wherein when the post-processing functional unit is operated, the inference result is processed.
7. The method according to any of claims 1-6, wherein prior to performing the method, the method further comprises:
installing a deployment agent provided by the AI application management platform, the deployment agent for communicating with the AI application management platform.
8. An application execution device, comprising a processor and a memory, wherein the processor executes code in the memory to perform the following:
acquiring an AI application, wherein the AI application comprises a functional unit group and a model configuration file, and the functional unit group comprises an AI functional unit;
and operating the functional unit group in the AI application, wherein when the AI functional unit is operated, a target adaptation model adapted to an inference framework in the application execution device is called according to the model configuration file to execute inference, so as to obtain an inference result.
9. The apparatus of claim 8, wherein the model configuration file comprises a storage path of at least one adaptation model, and configuration information of each adaptation model;
the processor executes the code in the memory and further performs:
determining the target adaptation model adapted to the inference framework in the application execution device according to the configuration information of each adaptation model;
and acquiring the target adaptation model according to the storage path of the target adaptation model in the model configuration file.
10. The apparatus of claim 8 or 9, wherein the processor executes code in the memory and further performs:
establishing connection with an AI application management platform;
and acquiring the AI application from the AI application management platform, wherein the AI application is generated by the AI application management platform.
11. The apparatus of any of claims 8-10, wherein the functional unit group further comprises a pre-processing functional unit, and wherein the processor executes code in the memory and further performs:
and operating the preprocessing function unit, wherein when the preprocessing function unit is operated, the input data of the AI application is processed.
12. The apparatus of claim 11, wherein the configuration information of the target adaptation model comprises a format of input data, and wherein the processor executes code in the memory and further performs:
and when the input data of the AI application does not conform to the format of the input data, operating the pre-processing functional unit.
13. The apparatus of any of claims 8-12, wherein the functional unit group further comprises a post-processing functional unit, and wherein the processor executes code in the memory and further performs:
and operating the post-processing functional unit, wherein when the post-processing functional unit is operated, the inference result is processed.
14. The apparatus of any of claims 8-13, wherein the processor executes code in the memory and further performs:
installing a deployment agent provided by the AI application management platform, the deployment agent for communicating with the AI application management platform.
15. A computer readable storage medium having computer instructions stored thereon which, when executed by a computing device, cause the computing device to perform the method of any of claims 1 to 7.
16. A computer program product comprising computer instructions which, when executed by a computing device, cause the computing device to perform the method of any of claims 1 to 7.
CN202110445015.9A 2021-04-24 2021-04-24 AI application operation method and related product Pending CN115237457A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110445015.9A CN115237457A (en) 2021-04-24 2021-04-24 AI application operation method and related product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110445015.9A CN115237457A (en) 2021-04-24 2021-04-24 AI application operation method and related product

Publications (1)

Publication Number Publication Date
CN115237457A true CN115237457A (en) 2022-10-25

Family

ID=83666972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110445015.9A Pending CN115237457A (en) 2021-04-24 2021-04-24 AI application operation method and related product

Country Status (1)

Country Link
CN (1) CN115237457A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117971251A (en) * 2024-04-01 2024-05-03 深圳市卓驭科技有限公司 Software deployment method, device, storage medium and product


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination