CN114489963A - Management method, system, equipment and storage medium of artificial intelligence application task - Google Patents

Management method, system, equipment and storage medium of artificial intelligence application task

Info

Publication number
CN114489963A
CN114489963A
Authority
CN
China
Prior art keywords
computing
task
computing device
tasks
target
Prior art date
Legal status
Pending
Application number
CN202110172700.9A
Other languages
Chinese (zh)
Inventor
陈普
Current Assignee
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co Ltd filed Critical Huawei Cloud Computing Technologies Co Ltd
Priority to EP21890892.9A priority Critical patent/EP4239476A4/en
Priority to PCT/CN2021/124253 priority patent/WO2022100365A1/en
Publication of CN114489963A publication Critical patent/CN114489963A/en
Priority to US18/316,818 priority patent/US20230281056A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5033Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering data affinity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5066Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/501Performance criteria

Abstract

The application discloses a method, a system, a device, and a storage medium for managing artificial intelligence (AI) application tasks, and belongs to the field of AI technology. An AI application task involves a plurality of computing tasks. The method includes: during execution of the plurality of computing tasks by a first computing device, a management apparatus determines, among the plurality of computing tasks, a target computing task to be executed by a second computing device, the target computing task having a data association with at least one other computing task involved in the AI application task; the management apparatus sends a first instruction to the first computing device; the first computing device sends data obtained by executing the at least one other computing task to the second computing device according to the first instruction; and the second computing device uses the data as input data of the target computing task and executes the target computing task based on the input data. The method helps improve the resource utilization of the computing devices executing the AI application task and/or the processing efficiency of the computing tasks.

Description

Management method, system, equipment and storage medium of artificial intelligence application task
The present disclosure claims priority to Chinese Patent Application No. 202011262475.X, entitled "A Method and Apparatus for Dynamic Scheduling of AI System Resources", filed on November 12, 2020, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of Artificial Intelligence (AI), and in particular, to a method, a system, a device, and a storage medium for managing an AI application task.
Background
With the rapid development of artificial intelligence technology, AI is used to solve problems in more and more application scenarios. One application scenario typically requires solving multiple problems. For example, in the analysis and recognition of traffic-intersection video, problems such as vehicle detection, vehicle tracking, vehicle type recognition, color recognition, traffic light detection, traffic light state recognition, human detection and tracking, and non-motor-vehicle detection and tracking all need to be solved.
In the related art, when the solution for an application scenario involves multiple problems to be solved, the AI application task for that scenario may include multiple computing tasks. Each computing task solves part of the problems in the scenario, and the computing tasks may interact with one another to jointly form the solution for the scenario. The multiple computing tasks may be deployed centrally on a single computing device or distributed across multiple computing devices.
Currently, when computing devices execute the multiple computing tasks involved in an AI application task, the computing tasks can only be executed according to the pre-deployed arrangement. In an actual application scenario, when the computing tasks are executed on a computing device in this fixed manner, the resource utilization of the computing device may be low or the processing efficiency of the computing tasks may be low.
Disclosure of Invention
The application provides a management method, system, device, and storage medium for AI application tasks, which help improve the resource utilization of the computing devices executing an AI application task and/or the processing efficiency of the computing tasks, thereby improving the operating performance of the overall system implementing the AI application task. The technical solutions provided by the application are as follows:
In a first aspect, the present application provides a method for managing an AI application task. An AI application task may include multiple computing tasks, each implementing part of the functionality of a solution. The method includes the following steps: during execution of the plurality of computing tasks by a first computing device, a management apparatus determines a target computing task among the plurality of computing tasks, where the target computing task is a computing task to be executed by a second computing device and has a data association with at least one other computing task involved in the AI application task; the management apparatus sends a first instruction to the first computing device; the first computing device sends data obtained by executing the at least one other computing task to the second computing device according to the first instruction; and the second computing device uses the data as input data of the target computing task and executes the target computing task based on the input data. The other computing tasks are the computing tasks in the AI application task other than the target computing task. A data association between computing tasks means that the input data of one computing task is the output data of one or more other computing tasks that have been invoked.
Because the management apparatus determines, during execution of the plurality of computing tasks included in the AI application task by the first computing device, a target computing task to be executed on the second computing device, the first computing device can then send the data required for executing the target computing task to the second computing device, and the second computing device uses that data as the input data of the target computing task and executes the target computing task based on it. In this way, the target computing task can be moved to the second computing device while the first computing device is executing the AI application task; that is, computing tasks can be flexibly scheduled during execution of the AI application task, which improves the resource utilization and/or computing-task processing efficiency of the first computing device and the second computing device and improves the operating performance of the overall system implementing the AI application task.
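Purely as an illustration of the interaction just described, the following minimal Python sketch shows one way the flow could be organized. The class and method names (ComputingDevice, ManagementApparatus, schedule, and so on) are invented for this sketch and are not defined by the application; the selection rule inside determine_target_task is likewise only a placeholder.

    # Minimal sketch (assumed names): the management apparatus picks a target
    # computing task while the first computing device is running, the first device
    # forwards the upstream results, and the second device executes the target task.
    class ComputingDevice:
        def __init__(self, name):
            self.name = name
            self.results = {}  # task name -> output data of tasks already executed here

        def execute(self, task_name, input_data):
            # Placeholder for running the task's program on this device.
            output = f"{task_name}({input_data})"
            self.results[task_name] = output
            return output

    class ManagementApparatus:
        def determine_target_task(self, tasks, usage_info):
            # Placeholder rule: pick the task reported as most resource-hungry.
            return max(tasks, key=lambda t: usage_info.get(t, 0.0))

        def schedule(self, first_dev, second_dev, tasks, deps, usage_info):
            target = self.determine_target_task(tasks, usage_info)
            # "First instruction": the first device sends the outputs of the tasks
            # that the target task has a data association with.
            upstream_outputs = [first_dev.results[d] for d in deps[target]]
            # The second device uses the forwarded data as input of the target task.
            return second_dev.execute(target, upstream_outputs)

    first, second = ComputingDevice("first"), ComputingDevice("second")
    first.execute("decode", "video_stream")
    first.execute("detect", first.results["decode"])
    mgr = ManagementApparatus()
    print(mgr.schedule(first, second,
                       tasks=["detect", "track"],
                       deps={"decode": [], "detect": ["decode"], "track": ["detect"]},
                       usage_info={"detect": 0.2, "track": 0.9}))
    # "track" is selected as the target task and now runs on the second device.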
In one implementation, before the management apparatus determines the target computing task among the plurality of computing tasks, the method further includes: the management apparatus acquires first resource usage information of the plurality of computing tasks when executed on the first computing device. Accordingly, the management apparatus determining the target computing task among the plurality of computing tasks includes: the management apparatus determines the target computing task among the plurality of computing tasks based on the first resource usage information.
The first resource usage information reflects the resources used by the computing tasks in the first AI application task and the resource condition of the first computing device. By acquiring the first resource usage information of the plurality of computing tasks when executed on the first computing device and determining the target computing task among them based on this information, the target computing task can be selected from the plurality of computing tasks according to the actual resource usage of the computing tasks in the first AI application task and of the first computing device, and then scheduled, which helps improve the resource utilization and/or computing-task processing efficiency of the first computing device.
Optionally, when the running program of the target computing task is not deployed on the second computing device, the running program needs to be deployed on the second computing device before the second computing device executes the target computing task. Depending on how the second computing device obtains the running program of the target computing task, the method further includes: the management apparatus sends the running program of the target computing task to the second computing device, or the management apparatus sends a second instruction to the second computing device to instruct the second computing device to obtain the running program of the target computing task.
In one implementation, to ensure the operating performance of the second computing device when executing the target computing task, the method further includes: the management apparatus sends a third instruction to the second computing device to instruct the second computing device to prepare the computing resources for executing the target computing task.
In one implementation of determining the target computing task among the plurality of computing tasks, the management apparatus may determine the target computing task when the operating efficiency of the plurality of computing tasks on the first computing device does not satisfy a preset first condition and/or the resource utilization of the plurality of computing tasks on the first computing device does not satisfy a preset second condition.
By determining, among the plurality of computing tasks in the first AI application task, the target computing task to be executed by the second computing device when the operating efficiency of the computing tasks in the first AI application task and/or their resource utilization does not satisfy the corresponding condition, the operating efficiency and/or resource utilization of the first AI application task can be effectively improved.
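The application does not define the first and second conditions; purely for illustration, the sketch below assumes simple threshold conditions (a minimum frames-per-second target for operating efficiency and a maximum memory-utilization ratio), with invented names and values.

    # Hypothetical trigger check: the target computing task is only determined when
    # the preset first condition (operating efficiency) and/or the preset second
    # condition (resource utilization) is not satisfied.
    FIRST_CONDITION_MIN_FPS = 25.0        # assumed efficiency threshold
    SECOND_CONDITION_MAX_MEM_RATIO = 0.9  # assumed utilization threshold

    def should_determine_target_task(measured_fps: float, mem_ratio: float) -> bool:
        efficiency_ok = measured_fps >= FIRST_CONDITION_MIN_FPS
        utilization_ok = mem_ratio <= SECOND_CONDITION_MAX_MEM_RATIO
        # Trigger when either condition (or both) is violated.
        return not (efficiency_ok and utilization_ok)

    print(should_determine_target_task(measured_fps=18.0, mem_ratio=0.95))  # True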
The management apparatus may also manage the computing tasks executed on the computing devices according to the resource usage of the multiple computing devices it manages, so as to improve the operating performance of the overall system formed by those computing devices. In this case, the method further includes: the management apparatus acquires second resource usage information of the computing tasks executed on the second computing device. Accordingly, the management apparatus determining the target computing task among the plurality of computing tasks includes: the management apparatus determines the target computing task among the plurality of computing tasks based on the first resource usage information and the second resource usage information.
By determining, according to the first resource usage information and the second resource usage information, that the second computing device is to execute the target computing task, the operating efficiency and resource utilization of the AI application tasks running on the multiple computing devices managed by the management apparatus can be considered globally. The computing tasks in the first AI application task can be managed when the overall operating efficiency and/or resource utilization of the first AI application task is poor, so that, at the cost of performing the scheduling operation, the weakest-link (short-board) effect of the overall system is reduced and the operating performance of the overall system of computing devices managed by the management apparatus is improved.
In one implementation, the first resource usage information includes: operation information of at least one of the plurality of computing tasks and resource information of the first computing device.
Optionally, the operation information of the at least one computing task is obtained from one or more of the following operating parameters: the number of times the computing task is invoked per unit time, the amount of input data of the computing task, the amount of output data of the computing task, the running duration of the computing task when invoked by the first computing device, the processor consumption when the first computing device invokes the computing task, and the memory consumption when the first computing device invokes the computing task.
The resource information of the first computing device is obtained from one or more of the following parameters: the rated value and total consumption of the memory of the first computing device, the bandwidth and rated bandwidth for transmitting data from the processor memory of the first computing device to the AI computing memory of the first computing device, and the bandwidth and rated bandwidth for transmitting data from the AI computing memory of the first computing device to the processor memory of the first computing device.
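For concreteness only, the parameters listed above could be grouped into data structures such as the following; the field names are illustrative and are not defined by the application.

    # Illustrative containers for the first resource usage information.
    from dataclasses import dataclass
    from typing import Dict

    @dataclass
    class TaskOperationInfo:             # operation information of one computing task
        calls_per_second: float          # invocation count per unit time
        input_bytes: int                 # input data volume
        output_bytes: int                # output data volume
        run_time_ms: float               # running duration per invocation
        processor_core_ms: float         # processor consumption per invocation
        memory_bytes: int                # memory consumption per invocation

    @dataclass
    class DeviceResourceInfo:            # resource information of the first computing device
        memory_rated_bytes: int
        memory_used_bytes: int
        cpu_to_ai_bandwidth: float       # measured, bytes per second
        cpu_to_ai_bandwidth_rated: float
        ai_to_cpu_bandwidth: float
        ai_to_cpu_bandwidth_rated: float

    @dataclass
    class FirstResourceUsageInfo:
        tasks: Dict[str, TaskOperationInfo]
        device: DeviceResourceInfo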
In one implementation, the management apparatus is disposed in the first computing device, the second computing device, or a third computing device, wherein the third computing device is connected with the first computing device and the second computing device through a communication path.
Optionally, the first computing device is a graphics card, an AI computing chip, or a server; the second computing device is a graphics card, an AI computing chip, or a server; and the third computing device is a graphics card, an AI computing chip, or a server.
In a second aspect, the present application provides an AI system including a first computing device, a second computing device, and a management apparatus. The management apparatus is configured to determine a target computing task among a plurality of computing tasks, where the target computing task is a computing task to be executed by the second computing device and has a data association with at least one other computing task involved in an AI application task; the management apparatus is further configured to send a first instruction to the first computing device; the first computing device is configured to send data obtained by executing the at least one other computing task to the second computing device according to the first instruction; and the second computing device is configured to use the data as input data of the target computing task and execute the target computing task based on the input data.
Optionally, the management device is further configured to: acquiring first resource usage information of a plurality of computing tasks when executed in a first computing device; a target computing task of the plurality of computing tasks is determined based on the first resource usage information.
Optionally, when the running program of the target computing task is not deployed on the second computing device, the management apparatus is further configured to: send the running program of the target computing task to the second computing device, or send a second instruction to the second computing device to instruct the second computing device to obtain the running program of the target computing task.
Optionally, the management apparatus is further configured to send a third instruction to the second computing device to instruct the second computing device to prepare the computing resources for executing the target computing task.
Optionally, the management apparatus is specifically configured to determine the target computing task among the plurality of computing tasks when the operating efficiency of the plurality of computing tasks on the first computing device does not satisfy a preset first condition and/or the resource utilization of the plurality of computing tasks on the first computing device does not satisfy a preset second condition.
Optionally, the management apparatus is further configured to obtain second resource usage information of the computing tasks executed on the second computing device; in this case the management apparatus is specifically configured to determine the target computing task among the plurality of computing tasks based on the first resource usage information and the second resource usage information.
Optionally, the first resource usage information includes: operation information of at least one of the plurality of computing tasks and resource information of the first computing device.
Optionally, the operation information of the at least one computing task is obtained from one or more of the following operating parameters: the number of times the computing task is invoked per unit time, the amount of input data of the computing task, the amount of output data of the computing task, the running duration of the computing task when invoked by the first computing device, the processor consumption when the first computing device invokes the computing task, and the memory consumption when the first computing device invokes the computing task.
The resource information of the first computing device is obtained from one or more of the following parameters: the rated value and total consumption of the memory of the first computing device, the bandwidth and rated bandwidth for transmitting data from the processor memory of the first computing device to the AI computing memory of the first computing device, and the bandwidth and rated bandwidth for transmitting data from the AI computing memory of the first computing device to the processor memory of the first computing device.
Optionally, the management apparatus is disposed in the first computing device, the second computing device, or a third computing device, wherein the third computing device is connected to the first computing device and the second computing device through a communication path.
Optionally, the first computing device is a graphics card, an AI computing chip, or a server; the second computing device is a graphics card, an AI computing chip, or a server; and the third computing device is a graphics card, an AI computing chip, or a server.
In a third aspect, the present application provides a management apparatus. The management apparatus includes a control module and a scheduling module. The control module is configured to determine, during execution of a plurality of computing tasks by a first computing device, a target computing task among the plurality of computing tasks, where the target computing task is a computing task to be executed by a second computing device and has a data association with at least one other computing task involved in an AI application task. The scheduling module is configured to send a first instruction to the first computing device, where the first instruction instructs the first computing device to send data obtained by executing the at least one other computing task to the target computing task on the second computing device.
Optionally, the control module is further configured to: acquiring first resource usage information of a plurality of computing tasks when executed in a first computing device; a target computing task of the plurality of computing tasks is determined based on the first resource usage information.
Optionally, when the running program of the target computing task is not deployed in the second computing device, the scheduling module is further configured to send the running program of the target computing task to the second computing device, or send a second instruction to the second computing device to instruct the second computing device to acquire the running program of the target computing task.
Optionally, the scheduling module is further configured to send a third instruction to the second computing device to instruct the second computing device to prepare the computing resource for executing the target computing task.
Optionally, the control module is specifically configured to determine the target computing task among the plurality of computing tasks when the operating efficiency of the plurality of computing tasks on the first computing device does not satisfy a preset first condition and/or the resource utilization of the plurality of computing tasks on the first computing device does not satisfy a preset second condition.
Optionally, the control module is further configured to obtain second resource usage information of a computing task executed in the second computing device; the control module is specifically configured to determine a target computing task of the plurality of computing tasks based on the first resource usage information and the second resource usage information.
Optionally, the first resource usage information includes: operation information of at least one of the plurality of computing tasks and resource information of the first computing device.
Optionally, the operation information of the at least one computing task is obtained from one or more of the following operating parameters: the number of times the computing task is invoked per unit time, the amount of input data of the computing task, the amount of output data of the computing task, the running duration of the computing task when invoked by the first computing device, the processor consumption when the first computing device invokes the computing task, and the memory consumption when the first computing device invokes the computing task.
The resource information of the first computing device is obtained from one or more of the following parameters: the rated value and total consumption of the memory of the first computing device, the bandwidth and rated bandwidth for transmitting data from the processor memory of the first computing device to the AI computing memory of the first computing device, and the bandwidth and rated bandwidth for transmitting data from the AI computing memory of the first computing device to the processor memory of the first computing device.
Optionally, the management apparatus is disposed in the first computing device, the second computing device, or a third computing device, wherein the third computing device is connected to the first computing device and the second computing device through a communication path.
Optionally, the first computing device is a graphics card, an AI computing chip, or a server; the second computing device is a graphics card, an AI computing chip, or a server; and the third computing device is a graphics card, an AI computing chip, or a server.
In a fourth aspect, the present application provides an electronic device, where the electronic device includes a memory and a processor, and when the processor executes computer instructions stored in the memory, the electronic device implements the functions of the apparatus provided in the third aspect.
In a fifth aspect, the present application provides a computer-readable storage medium, which is a non-volatile computer-readable storage medium, and the computer-readable storage medium stores program instructions therein, and when the program instructions are executed by an electronic device, the electronic device implements the functions of the apparatus provided in the third aspect.
In a sixth aspect, the present application provides a computer program product containing instructions for causing a computer to implement the functions of the apparatus provided in the third aspect when the computer program product runs on the computer.
Drawings
Fig. 1 is a schematic diagram of an execution logic of an AI application task according to an embodiment of the present application;
fig. 2 is a schematic diagram of an AI system related to a management method for an AI application task according to an embodiment of the present application;
fig. 3 is a schematic diagram of an AI system related to another management method for an AI application task according to an embodiment of the present application;
fig. 4 is a schematic diagram illustrating that the functions of a management device provided in the embodiment of the present application can be abstracted into a management cloud service on a cloud platform by a cloud service provider;
fig. 5 is a schematic diagram of an AI system related to another management method for an AI application task according to an embodiment of the present application;
fig. 6 is a flowchart of a method for managing an AI application task according to an embodiment of the present application;
fig. 7 is a flowchart of a method for managing an AI application task, which is implemented by using a functional module according to an embodiment of the present application;
fig. 8 is a schematic diagram of the execution logic of an AI application task according to an embodiment of the present application;
fig. 9 is a schematic diagram of the execution logic of a first AI application task and a second AI application task after a target computing task in the first AI application task is scheduled to be executed in a second computing device according to an embodiment of the present application;
fig. 10 is a schematic diagram of another execution logic of a first AI application task and a second AI application task after a target computing task in the first AI application task is scheduled to be executed in a second computing device according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
With the development of AI technology, AI is used to solve problems in more and more application scenarios. The solution for a specific goal of an application scenario may be implemented by an AI application task. For example, for a traffic scenario, a set of real-time video analysis solutions can be provided to obtain the vehicle trajectories and vehicle attributes in the traffic scenario and the real-time traffic-light state in the traffic scenario. An AI application task represents a solution provided for a specific goal of an application scenario and may be executed by a computing device to implement a specific function. Generally, an AI application task involves multiple computing tasks, each implementing part of the functionality of the solution; therefore, an AI application task may also be regarded as including multiple computing tasks. There may be data associations between some of the computing tasks in an AI application task, for example: the input data of one computing task is the output data of one or more other computing tasks that have been invoked. Therefore, there is usually a precedence order among the execution times of some of the computing tasks in an AI application task.
Still taking video-stream analysis of a traffic scenario as an example, as shown in fig. 1, the traffic video stream can be analyzed by an AI application task. The AI application task includes multiple computing tasks, which respectively implement video decoding, vehicle target detection, vehicle target tracking, vehicle attribute detection, traffic-light state detection, and data output. Each computing task is represented by a circle in fig. 1, and the arrows in fig. 1 represent the data flow between computing tasks. When executing the AI application task, the computing device sequentially executes the computing tasks for the same video stream along the path indicated by the arrows to obtain the analysis result for the video stream.
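Purely as an illustration of the data associations and execution order just described, the fig. 1 pipeline could be represented as a small dependency graph, as in the Python sketch below. The exact wiring between the tasks is assumed here (fig. 1 is not reproduced), and the dictionary form and helper function are only a sketch.

    # Assumed data associations: each task lists the tasks whose output it consumes,
    # which also fixes the precedence order of execution.
    AI_APPLICATION_TASK = {
        "video_decoding": [],
        "vehicle_target_detection": ["video_decoding"],
        "traffic_light_state_detection": ["video_decoding"],
        "vehicle_target_tracking": ["vehicle_target_detection"],
        "vehicle_attribute_detection": ["vehicle_target_detection"],
        "data_output": ["vehicle_target_tracking",
                        "vehicle_attribute_detection",
                        "traffic_light_state_detection"],
    }

    def execution_order(graph):
        # Simple topological sort: a task runs once every task it depends on has run.
        done, order = set(), []
        while len(done) < len(graph):
            for task, deps in graph.items():
                if task not in done and all(d in done for d in deps):
                    done.add(task)
                    order.append(task)
        return order

    print(execution_order(AI_APPLICATION_TASK))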
In one implementation, the plurality of computing tasks are deployed in a centralized manner, that is, the same computing device is used to execute the plurality of computing tasks included in the AI application task.
In another implementation, multiple computing tasks are deployed in a distributed manner. For example, a plurality of computing tasks are respectively deployed on a plurality of computing devices, each computing device implements a corresponding function by executing the computing task deployed thereon, and the plurality of computing devices cooperatively implement a function of an AI application task including the plurality of computing tasks.
It should be understood that the computing device in this application may be an AI chip (e.g., a chip including a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a Field Programmable Gate Array (FPGA) or an application-specific integrated circuit (ASIC)), a graphics card, a server, a virtual machine, or the like.
However, in the above two implementations, after the AI application task is deployed, the related art can only execute the corresponding computing tasks according to the fixed deployment, which results in poor resource utilization of the overall system in which the AI application task is deployed. For example, when an AI application task is deployed on a first computing device, during execution of the computing tasks in the AI application task by the first computing device, some computing tasks may require little computing power during execution but occupy a large amount of computing memory; because insufficient memory can then be provided for other computing tasks, the remaining computing power of the computing device cannot be used to execute other computing tasks while this type of computing task is executing, and the resource utilization of the computing device is insufficient. For another example, during execution of the computing tasks in the AI application task, some computing tasks may require a large amount of computing power but occupy little computing memory; because insufficient computing power can then be provided for other computing tasks, computing memory in the computing device is wasted while this type of computing task is executing, and the resource utilization of the computing device is again insufficient.
The resources of a computing device are the computing resources required for executing computing tasks, and include computing power resources, memory resources, and the like. For example, computing power resources are provided by a general-purpose processor (e.g., a CPU) and an AI processor (e.g., a GPU), and memory resources are provided by processor memory (e.g., CPU memory) and AI computing memory (e.g., GPU memory). The processor memory is the memory allocated to the general-purpose processor, and the AI computing memory is the memory allocated to the AI processor; when a computing task is executed by the AI processor, the AI computing memory is occupied.
It should be understood that, in this application, the computing power required by a computing task during execution refers to the amount or proportion of computing resources in the computing device that the computing task needs to use per unit time, or the time or proportion of time per unit time for which the computing task uses the computing resources. For example, when the computing device is a graphics card including a GPU and a memory, the computing power required during execution of the computing task may be expressed by the actual time the computing task occupies the GPU computing resources per second. For instance, if each invocation of the computing task occupies the GPU for 10 ms and the computing task is invoked once per second (1 s = 1000 ms), the computing power required by the computing task can be expressed as (10 ms / 1000 ms) × 1 × 100% = 1%.
It should also be understood that, in this application, the computing memory required during execution of a computing task refers to the amount or proportion of memory in the computing device occupied when the computing task is executed. Still taking a computing device that is a graphics card including a GPU and a memory as an example, the computing memory required during execution of the computing task may be expressed as the ratio of the memory (also called video memory) of the graphics card occupied by the running computing task to the rated memory of the graphics card.
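The two ratios defined above amount to a few lines of arithmetic; as a small check of the 1% example, a sketch with invented function names is given below.

    # Computing power required by a task: fraction of GPU time it occupies per second.
    def computing_power_ratio(gpu_ms_per_call: float, calls_per_second: float) -> float:
        return (gpu_ms_per_call / 1000.0) * calls_per_second  # 1 s = 1000 ms

    # Computing memory required by a task: fraction of the card's rated memory it occupies.
    def computing_memory_ratio(memory_used_gb: float, rated_memory_gb: float) -> float:
        return memory_used_gb / rated_memory_gb

    print(computing_power_ratio(10, 1))   # 0.01 -> 1%, as in the example above
    print(computing_memory_ratio(2, 16))  # e.g. 2 GB used on a 16 GB card -> 12.5%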
An embodiment of the present application provides a method for managing AI application tasks. The method includes the following steps: during execution, by a first computing device, of a plurality of computing tasks involved in one AI application task, a management apparatus determines, among the plurality of computing tasks, a target computing task to be executed on a second computing device; the first computing device then sends the data required for executing the target computing task to the second computing device; and the second computing device uses the data sent by the first computing device as input data of the target computing task and executes the target computing task based on the input data. The data required for executing the target computing task includes: the data of at least one other computing task involved in the AI application task that has a data association with the target computing task. The other computing tasks are the computing tasks in the AI application task other than the target computing task. A data association between computing tasks means that the input data of one computing task is the output data of one or more other computing tasks that have been invoked.
With this management method for AI application tasks, the target computing task can be moved to the second computing device while the first computing device is executing the AI application task; that is, the method can flexibly schedule computing tasks during execution of the AI application task, which improves the resource utilization and/or computing-task processing efficiency of the first computing device and the second computing device and improves the operating performance of the overall system implementing the AI application task.
Fig. 2 is a schematic diagram of an AI system related to a method for managing an AI application task according to an embodiment of the present application. As shown in fig. 2, the AI system includes: a first computing device 10, a second computing device 20, and a third computing device 30, the third computing device 30 being connected to the first computing device 10 and the second computing device 20 by a communications pathway. Optionally, the AI system may also include more computing devices. The following describes the operating principle of the AI system according to the method for managing an AI application task provided in the embodiment of the present application, by taking the AI system shown in fig. 2 as an example.
As shown in fig. 2, a first AI application task 101 is deployed on the first computing device 10, a second AI application task 201 is deployed on the second computing device 20, and a management apparatus 301 is deployed on the third computing device 30. The management apparatus 301 is configured to determine, among the plurality of computing tasks included in the first AI application task 101, a target computing task to be executed by the second computing device 20 during execution of the first AI application task, and to send a first instruction to the first computing device 10. Accordingly, the first computing device 10 is further configured to send data obtained by executing at least one other computing task to the second computing device 20 according to the first instruction, where the at least one other computing task has a data association with the target computing task. The second computing device 20 is further configured to execute the target computing task based on the data obtained by executing the at least one other computing task. As can be seen from fig. 2, in the AI system shown in fig. 2, the management apparatus may be deployed separately on the third computing device 30.
Alternatively, the third computing device 30, the first computing device 10, and the second computing device 20 may each be a graphics card, an AI computing chip, a physical machine, a bare-metal server, or a cloud server. For example, the third computing device 30, the first computing device 10, and the second computing device 20 may each be an AI computing chip. When the first computing device 10 and the second computing device 20 are graphics cards or AI computing chips, they may be deployed on different hosts or on the same host. When the third computing device 30, the first computing device 10, and the second computing device 20 are graphics cards or AI computing chips, the third computing device 30 may be deployed on a separate host from the first computing device 10 and the second computing device 20, or the third computing device 30 may be deployed on the same host as some or all of the first computing device 10 and the second computing device 20.
Fig. 3 is a schematic diagram of an AI system related to another management method for an AI application task according to an embodiment of the present application. As shown in fig. 3, the AI system includes a first computing device 10 and a second computing device 20, and the first computing device 10 and the second computing device 20 are connected by a communication path. Optionally, the AI system may also include more computing devices. The following describes, by taking the AI system shown in fig. 3 as an example, an operating principle of the AI system related to another method for managing an AI application task provided in the embodiment of the present application.
As shown in fig. 3, a first AI application task 101 is deployed on the first computing device 10, and a second AI application task 201 is deployed on the second computing device 20. A management apparatus 103 is deployed on the first computing device 10. Alternatively, the management apparatus 103 may be deployed on the second computing device 20. For the functions of the first computing device 10, the second computing device 20, and the management apparatus 103, refer to the corresponding description of the AI system shown in fig. 2. As can be seen from fig. 3, in the AI system shown in fig. 3, the management apparatus 103 is deployed in a computing device that executes an AI application task. When the management apparatus is deployed in the computing device that executes the AI application task, the communication efficiency between the management apparatus and the components executing the AI application task in that computing device can be improved, and the influence of external factors such as the network is reduced.
Alternatively, the first computing device 10 and the second computing device 20 may each be a graphics card, an AI computing chip, a physical machine, a bare-metal server, or a cloud server. When the first computing device 10 and the second computing device 20 are graphics cards or AI computing chips, they may be deployed on different hosts, or the first computing device 10 and the second computing device 20 may be deployed on the same host.
In one implementation, the AI system may be a system on the cloud that uses computing resources on the cloud to provide cloud services for users. Accordingly, the first computing device 10, the second computing device 20, and the third computing device 30 may be computing devices in a cloud platform, for example, a graphics card or AI computing chip of a server in the cloud platform, or a host (e.g., a cloud server). The AI application tasks executed by the first computing device 10 and the second computing device 20 may be deployed on computing devices in the cloud platform by a cloud service provider that owns the cloud platform resources and provided to users, or deployed on computing devices in the cloud platform by an AI algorithm provider and provided to users.
In this case, as shown in fig. 4, the functions of the management apparatus 301 can be abstracted by the cloud service provider into a management cloud service on the cloud platform 1 and provided to users. The management cloud service can determine, during execution of the AI application task by the first computing device, a target computing task to be executed by the second computing device among the plurality of computing tasks, and send a first instruction to the first computing device, thereby managing the target computing task. Fig. 4 is drawn with the management apparatus 301 deployed on the third computing device 30.
Optionally, as shown in fig. 4, the functions of the first AI application task 101 and/or the second AI application task 201 can also be abstracted by the cloud service provider on the cloud platform 1 into an AI business cloud service, which implements a user's AI business by executing the AI application task. The AI business cloud service and the management cloud service can be used together. Fig. 4 is drawn with the first computing device 10 executing the first AI application task and the second computing device 20 executing the second AI application task.
For example, after a user purchases the AI business cloud service, the cloud platform may automatically provide the management cloud service for the AI business cloud service purchased by the user. That is, the cloud platform monitors the service quality while providing the AI business cloud service to the user; when the service quality is poor, it schedules computing tasks in the AI application task by running the management method for AI application tasks provided in the embodiments of the present application, thereby providing the management cloud service for the AI business cloud service purchased by the user and ensuring its service quality.
For another example, the user may choose whether to purchase the management cloud service when purchasing the AI business cloud service. When the user purchases both the AI business cloud service and the management cloud service, the management cloud service monitors the service quality of the AI business cloud service; when the service quality is poor, it schedules computing tasks in the AI application task by running the management method for AI application tasks provided in the embodiments of the present application, thereby ensuring the service quality of the AI business purchased by the user.
In another possible implementation, the management cloud service may be a cloud service independent of the other cloud services provided by the cloud platform. That is, the user can purchase the management cloud service on the cloud platform independently. When the user runs AI application tasks using resources provided by another platform, the user can purchase only the management cloud service on the cloud platform, so that the management cloud service schedules the computing tasks in the AI application tasks running on those other resources.
It should be noted that, in the embodiments of the present application, the cloud platform 1 may be the cloud platform of a central cloud, the cloud platform of an edge cloud, or a cloud platform including both a central cloud and an edge cloud, which is not specifically limited. Moreover, when the computing device on which the management apparatus is deployed and the computing devices used to execute the AI application task are all deployed in the cloud platform, they may be deployed on the same cloud or on different clouds. For example, the computing devices for executing the AI application task are deployed on the central cloud, and the computing device on which the management apparatus is deployed is on an edge cloud.
In an implementation manner, the management method for the AI application task provided in the embodiment of the present application may be cooperatively implemented by a plurality of functional modules. The following describes the implementation of the method procedure by a plurality of functional modules, taking the AI system shown in fig. 2 as an example.
As shown in fig. 5, in the method for managing AI application tasks provided in this embodiment, the functions of the management apparatus 301 may be implemented by a scheduling module 3011 and a control module 3012, the functions of the first computing device 10 are implemented by a first collection module 102, a first task scheduling execution module 103, and a first resource scheduling execution module 104, and the functions of the second computing device 20 are implemented by a second collection module 202, a second task scheduling execution module 203, and a second resource scheduling execution module 204. Fig. 5 is drawn with an AI application task including 5 computing tasks, and the black dots in fig. 5 represent the computing tasks. The functions of the modules are as follows:
The first collection module 102 and the second collection module 202 are configured to, during execution of an AI application task, collect the operating parameters of the AI application task and send them to the scheduling module 3011, or process the operating parameters to obtain resource usage information and send the resource usage information to the scheduling module 3011. The operating parameters are basic parameters reflecting the resources used by each computing task in the AI application task and the resources used by the computing device executing the AI application task. The resource usage information is obtained by processing the operating parameters and reflects the resource usage of the computing tasks in the AI application task and of the computing device executing the AI application task. For example, the operating parameters may be the time information of the CPU invocations made by each computing task in the AI application task during execution and the time information of the first computing device's CPU usage; the resource usage information may then be the invocation frequency, invocation duration, and consumption of the CPU by each computing task during execution, and the invocation frequency, invocation duration, and consumption of the CPU by the first computing device, all derived from the operating parameters.
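As a rough illustration of the collection modules' role, the sketch below aggregates raw per-invocation CPU time records into invocation frequency, total duration, and CPU consumption share; the record format and field names are assumptions made for this sketch.

    # Assumed raw operating parameters: one (task_name, cpu_start_s, cpu_end_s) record
    # per invocation, collected over a 10-second observation window.
    records = [
        ("vehicle_target_detection", 0.00, 0.03),
        ("vehicle_target_detection", 1.00, 1.03),
        ("traffic_light_state_detection", 0.50, 0.51),
    ]
    WINDOW_S = 10.0

    def to_resource_usage(records, window_s):
        usage = {}
        for task, start, end in records:
            entry = usage.setdefault(task, {"calls": 0, "cpu_s": 0.0})
            entry["calls"] += 1
            entry["cpu_s"] += end - start
        for entry in usage.values():
            entry["calls_per_s"] = entry["calls"] / window_s  # invocation frequency
            entry["cpu_share"] = entry["cpu_s"] / window_s    # CPU consumption share
        return usage

    print(to_resource_usage(records, WINDOW_S))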
The scheduling module 3011 is configured to provide resource usage information to the control module 3012 based on the received operating parameters and/or resource usage information.
The control module 3012 is configured to determine, based on the resource usage information, a target computing task to be executed by the second computing device among the plurality of computing tasks included in the AI application task, send a notification indicating that the target computing task is to be executed by the second computing device to the scheduling module 3011, and send a first instruction to the first task scheduling execution module 103.
Accordingly, the scheduling module 3011 is further configured to generate a resource scheduling policy based on the notification indicating that the target computing task is to be executed by the second computing device, and to provide the resource scheduling policy to the second resource scheduling execution module 204. This resource scheduling policy instructs the second resource scheduling execution module 204 to prepare the computing resources for executing the target computing task. Optionally, the scheduling module 3011 is further configured to send a resource scheduling policy to the first resource scheduling execution module 104, instructing it to reclaim the resources that were used for executing the target computing task.
The first task scheduling execution module 103 is configured to send data obtained by executing the at least one other computing task to the second computing device based on the received first instruction, so that the second computing device executes the target computing task using that data. For the function of the second task scheduling execution module 203, refer to the function of the first task scheduling execution module 103.
The first resource scheduling execution module 104 and the second resource scheduling execution module 204 are configured to schedule resources based on the resource scheduling policy, so that the corresponding computing device executes computing tasks using the scheduled resources.
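To make the division of labour between the scheduling module and the resource scheduling execution modules concrete, the sketch below shows one possible shape of a resource scheduling policy and how each side might apply it; the policy fields and quantities are assumptions, not part of the application.

    # Hypothetical resource scheduling policy produced by the scheduling module 3011.
    policy = {
        "target_task": "vehicle_attribute_detection",
        "prepare_on_second_device": {"ai_memory_gb": 4, "cpu_cores": 2},
        "reclaim_on_first_device": {"ai_memory_gb": 4, "cpu_cores": 2},
    }

    def second_resource_scheduling_execute(policy):
        # Prepare computing resources for the target task on the second computing device.
        need = policy["prepare_on_second_device"]
        print(f"reserving {need['ai_memory_gb']} GB AI memory and {need['cpu_cores']} cores")

    def first_resource_scheduling_execute(policy):
        # Reclaim the resources the target task used on the first computing device.
        freed = policy["reclaim_on_first_device"]
        print(f"releasing {freed['ai_memory_gb']} GB AI memory and {freed['cpu_cores']} cores")

    second_resource_scheduling_execute(policy)
    first_resource_scheduling_execute(policy)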
The following describes an implementation process of a management method for an AI application task provided in an embodiment of the present application. As shown in fig. 6, the implementation process of the management method for the AI application task includes the following steps:
Step 601: the management apparatus acquires first resource usage information of the plurality of computing tasks in the first AI application task when executed on the first computing device.
During execution of the first AI application task by the first computing device (for ease of distinction, the AI application task executed by the first computing device is hereinafter referred to as the first AI application task), the first computing device may collect the operating parameters and provide them to the management apparatus, so that the management apparatus obtains the first resource usage information from the operating parameters. Alternatively, after collecting the operating parameters, the first computing device may process them to obtain the first resource usage information and provide the first resource usage information to the management apparatus. The first AI application task includes one or more computing tasks, each implementing part of the functionality of the solution, and the functionality of each computing task may be implemented by executing one or more algorithms.
In an implementation manner, when the method for managing an AI application task provided in this embodiment is implemented by the AI system shown in fig. 5, the first acquisition module 102 is disposed in the first computing device, and the scheduling module 3011 is disposed in the third computing device. As shown in fig. 7, while the first computing device executes the first AI application task, the first acquisition module 102 can acquire operation parameters and provide them to the scheduling module 3011, so that the scheduling module 3011 obtains the first resource usage information based on the operation parameters. The operation parameters are basic parameters that reflect the resources used by the respective computing tasks in the first AI application task and the resources used by the first computing device. The first resource usage information is obtained by processing the operation parameters and reflects the resource usage of the computing tasks in the first AI application task and the resource usage of the first computing device. For example, the operation parameters may be the time information of CPU calls made during the execution of each computing task in the AI application task and the time information of CPU calls made by the first computing device, while the first resource usage information may be statistics derived from those operation parameters, such as the call frequency, call duration, and consumption of the CPU during the execution of each computing task, and the call frequency, call duration, and consumption of the CPU by the first computing device.
Optionally, the first resource usage information includes: operation information of at least one computing task in the first AI application task and resource information of the first computing device.
The operation information of the at least one computing task may be obtained from one or more of the following operation parameters: the number of times the computing task is invoked per unit time; the input data volume of the computing task; the output data volume of the computing task; the running duration for which the first computing device invokes the computing task; the processor consumption when the first computing device invokes the computing task; and the memory consumption when the first computing device invokes the computing task. The memory consumption when the first computing device invokes the computing task includes the processor memory consumption and/or the AI computing memory consumption of the computing task.
The resource information of the first computing device may be obtained from one or more of the following parameters: the rated value and total consumption of the memory of the first computing device; the bandwidth at which the processor memory of the first computing device transmits data to the AI computing memory of the first computing device, and the rated value of that bandwidth; and the bandwidth at which the AI computing memory of the first computing device transmits data to the processor memory of the first computing device, and the rated value of that bandwidth. The rated value and total consumption of the memory of the first computing device include: the rated value and total consumption of the processor memory of the first computing device, and/or the rated value and total consumption of the AI computing memory of the first computing device.
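For illustration only, the following is a minimal sketch, not part of the claimed method, of how the operation parameters and device resource information listed above could be organized in code; all class and field names are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TaskOperationParameters:
    """Illustrative container for the per-task operation parameters listed above."""
    calls_per_unit_time: float        # number of invocations per unit time
    input_bytes_per_call: int         # input data volume per invocation
    output_bytes_per_call: int        # output data volume per invocation
    run_millis_per_call: float        # running duration of one invocation on the first computing device
    processor_core_millis: float      # processor consumption, as single-core occupation time per invocation
    processor_memory_mb: Optional[float] = None  # processor memory consumed per invocation
    ai_memory_mb: Optional[float] = None         # AI computing memory consumed per invocation

@dataclass
class DeviceResourceInfo:
    """Illustrative container for the device-level resource information listed above."""
    processor_memory_rated_gb: float
    processor_memory_used_gb: float
    ai_memory_rated_gb: float
    ai_memory_used_gb: float
    cpu_to_ai_bandwidth_gbps: float        # processor memory -> AI computing memory, measured
    cpu_to_ai_bandwidth_rated_gbps: float  # processor memory -> AI computing memory, rated
    ai_to_cpu_bandwidth_gbps: float        # AI computing memory -> processor memory, measured
    ai_to_cpu_bandwidth_rated_gbps: float  # AI computing memory -> processor memory, rated
```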
The processor consumption of invoking the computing task is represented by the number of processor cores used for the invocation and the duration for which they are occupied. For example, a processor consumption of 10 milliseconds on one core indicates that a single core is occupied for 10 milliseconds when the invocation is executed. The memory consumption is represented by the amount of memory used by the computing task when it is invoked. For example, when the computing task is invoked, the AI computing memory consumption may be 10 gigabytes (GB). Correspondingly, the first resource usage information may be expressed as a memory occupancy ratio and/or a computing power occupancy ratio. The management apparatus may calculate the resource usage information from the operation parameters by using a preset algorithm.
For example, assume that while the first computing device executes the first AI application task, the operation parameters acquired by the first acquisition module 102 include: when a certain computing task is executed on one data stream, the computing task is invoked 0.5 times per second on average; each time the computing task is invoked, its input data volume is 100 kilobytes (KB) and its output data volume is 1 KB; each time the computing task is invoked, the running duration for which the first computing device invokes the computing task is 10 milliseconds; each time the computing task is invoked, the AI computing memory consumption is 500 to 600 megabytes (MB); the total consumption of processor memory used by all general-purpose processors in the computing device is 7.5 GB; the total consumption of AI computing memory used by all computing tasks in the computing device is 7.5 GB; the bandwidth at which the processor memory of the computing device transmits data to the AI computing memory of the computing device is 1 gigabyte per second (GB/s), and the bandwidth at which the AI computing memory of the computing device transmits data to the processor memory of the computing device is 1 GB/s.
After the first acquisition module 102 sends the operation parameters to the scheduling module 3011, the scheduling module 3011 may calculate the first resource usage information from the operation parameters by using a preset algorithm. When the computing power required during execution of the computing task is expressed as the actual time the computing task occupies GPU computing resources per second, then, since the computing task is invoked 0.5 times per second on average and each invocation runs for 10 milliseconds (1 second = 1000 milliseconds), the computing power required by the computing task can be expressed as (10 ms / 1000 ms) × 0.5 × 100% = 0.5%. When the AI computing memory required during execution of the computing task is expressed as the ratio of the AI computing memory occupied by the computing task to the total consumption of AI computing memory used by all computing tasks in the first computing device, then, since each invocation consumes 500 to 600 MB of AI computing memory, the total consumption of AI computing memory used by all computing tasks in the computing device is 7.5 GB, and 1 GB = 1024 MB, the required AI computing memory, calculated with the 600 MB consumption, can be expressed as (600 MB / (7.5 GB × 1024)) × 100% ≈ 7.8%. That is, the first resource usage information includes: within 1 second, the computing task occupies 0.5% of the AI computing power and 7.8% of the AI computing memory.
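The arithmetic above can be checked with the short sketch below, which reproduces the 0.5% AI computing power figure and the approximately 7.8% AI computing memory figure from the example operation parameters; the function names are illustrative assumptions.

```python
def ai_computing_power_pct(calls_per_second: float, run_millis_per_call: float) -> float:
    """AI computing power occupied per second, as a percentage of one second."""
    return (run_millis_per_call / 1000.0) * calls_per_second * 100.0

def ai_computing_memory_pct(task_ai_memory_mb: float, total_ai_memory_gb: float) -> float:
    """AI computing memory occupied by the task, as a percentage of the total consumption."""
    return (task_ai_memory_mb / (total_ai_memory_gb * 1024.0)) * 100.0

# Example values: invoked 0.5 times per second, 10 ms per invocation,
# 600 MB AI computing memory, 7.5 GB total AI computing memory consumption.
print(ai_computing_power_pct(0.5, 10.0))              # 0.5 (%)
print(round(ai_computing_memory_pct(600.0, 7.5), 1))  # 7.8 (%)
```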
Optionally, implementation manners in which the first computing device acquires the operation parameters include: collecting them through the framework used for AI computing development, obtaining them by querying or statistics through an application program interface (API), or having the application program that implements the first AI application task collect and report them. Moreover, for some functional modules whose consumption is fixed, the operation parameters can be fixed values: when the operation parameters need to be acquired, the preset fixed values can be used directly, without collecting them through a collection mechanism. For example, when the object analyzed by the first AI application task is a video stream whose resolution is a specified value, the consumption required by the video decoding computing task is fixed, so the values of the operation parameters corresponding to video decoding may be uniformly set to fixed consumption values; when the operation parameters need to be acquired, the fixed consumption values are taken as the values of the operation parameters corresponding to video decoding. The fixed consumption values may be determined statistically.
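As a non-authoritative illustration of the fixed-consumption shortcut described above, the following sketch returns preset values for a hypothetical fixed-resolution video decoding task and falls back to live collection otherwise; the table contents, names, and the collector callback are all assumptions.

```python
# Hypothetical table of fixed consumption values, determined statistically in advance.
FIXED_CONSUMPTION = {
    ("video_decoding", "1080p"): {"run_millis_per_call": 4.0, "ai_memory_mb": 300.0},
}

def get_operation_parameters(task_name: str, resolution: str, collector):
    """Return preset fixed values for fixed-consumption tasks; otherwise collect live parameters."""
    fixed = FIXED_CONSUMPTION.get((task_name, resolution))
    if fixed is not None:
        return fixed                 # no live collection needed for fixed-consumption modules
    return collector(task_name)      # e.g. framework hooks or API-based statistics

# Usage: a fixed-consumption task hits the table, anything else uses the live collector.
print(get_operation_parameters("video_decoding", "1080p", collector=lambda name: {}))
```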
It should be noted that the operation of acquiring the first resource usage information may be performed automatically by the management apparatus, or may be performed by the management apparatus under a manual trigger. For example, the management apparatus may, according to a preset time period, periodically and automatically acquire the first resource usage information of the first AI application task executed in the first computing device during the execution of the first AI application task. For another example, a system maintainer or a user of the management apparatus may send a trigger instruction to the management apparatus when there is a scheduling need, so that the management apparatus acquires, as instructed by the trigger instruction, the first resource usage information of the first AI application task executed in the first computing device.
In addition, when the management method for the AI application task provided in the embodiments of the present application is abstracted into a management cloud service and provided to users, a user can configure, when purchasing the management cloud service, whether the operation of acquiring the first resource usage information is executed automatically or is executed upon a manual trigger.
Step 602: the management apparatus acquires second resource usage information of a computing task executed in the second computing device.
The management apparatus may manage the computing tasks executed in the computing devices according to the resource usage of the plurality of computing devices it manages, so as to improve the operating performance of the overall system formed by those computing devices. An AI application task may also be executed in a computing device other than the first computing device among the plurality of computing devices. Therefore, the method for managing an AI application task provided in the embodiment of the present application may further include: the management apparatus acquires second resource usage information of a computing task executed in the second computing device. The second computing device is any computing device, other than the first computing device, among the plurality of computing devices managed by the management apparatus. For the implementation process of step 602, refer to the implementation process of step 601; details are not described herein again.
It should be noted that step 602 is optional. When the management method for the AI application task is performed, whether to perform step 602 may be determined according to application requirements. When step 602 is performed, the computing tasks can be managed with the resource usage of the plurality of computing devices managed by the management apparatus taken into account, which effectively ensures the operating performance of the overall system.
It should be understood that, in some implementations, the second AI application task executed by the second computing device may be the same as the first AI application task executed by the first computing device, that is, the computing tasks included in the first AI application task and in the second AI application task are the same. For example, fig. 1 is a logic schematic diagram of the first AI application task and the second AI application task, where both are used to analyze a video stream of a traffic scene, and both include a plurality of computing tasks that are respectively used to perform video decoding, vehicle target detection, vehicle target tracking, vehicle attribute detection, traffic light state detection, and data output on the video stream.
In other implementations, the second AI application task executed by the second computing device may also be different from the first AI application task executed by the first computing device. The difference may mean that the plurality of computing tasks included in the two AI application tasks are partially or entirely different. For example, the first AI application task executed by the first computing device includes 5 computing tasks, namely video decoding, vehicle target detection, vehicle target tracking, vehicle attribute detection, and data output, while the second AI application task executed by the second computing device includes 3 computing tasks, namely traffic light detection, traffic light state detection, and data output; in this case the computing tasks in the first AI application task are partially the same as the computing tasks in the second AI application task. For another example, the first AI application task executed by the first computing device includes 4 computing tasks, namely video decoding, vehicle target detection, vehicle target tracking, and vehicle attribute detection, while the second AI application task executed by the second computing device includes 3 computing tasks, namely traffic light detection, traffic light state detection, and data output; in this case the computing tasks in the first AI application task are entirely different from the computing tasks in the second AI application task.
Step 603: while the first computing device executes the plurality of computing tasks, the management apparatus determines a target computing task among the plurality of computing tasks included in the first AI application task based on the first resource usage information and the second resource usage information.
After the management apparatus acquires the first resource usage information and the second resource usage information, it may make a management decision for the computing tasks in the AI application task according to the first resource usage information and the second resource usage information, and determine a target computing task among the plurality of computing tasks included in the AI application task. The target computing task is a computing task to be executed by the second computing device, and it has a data association with at least one other computing task involved in the AI application task. Optionally, the result of determining the target computing task may be represented using a graph structure. It should be understood that, in other implementations, when step 602 is not performed, the implementation process of step 603 is: the management apparatus determines a target computing task among the plurality of computing tasks included in the first AI application task based on the first resource usage information.
In an implementation manner of determining the target computing task according to the first resource usage information and the second resource usage information, the management apparatus may determine, according to the first resource usage information, the operating efficiency of the plurality of computing tasks in the first AI application task and/or the resource utilization of the first computing device when running the plurality of computing tasks. When the operating efficiency of the plurality of computing tasks in the first computing device does not satisfy a preset first condition, and/or the resource utilization for running the plurality of computing tasks in the first computing device does not satisfy a preset second condition, the management apparatus determines the target computing task among the plurality of computing tasks.
The first condition and the second condition can be set according to application requirements. For example, the first condition may be that the operating efficiency of the computing tasks in the first AI application task reaches a reference efficiency threshold, and the second condition may be that the difference between the utilization rate of a first resource and the utilization rate of a second resource by the computing tasks in the first AI application task is smaller than a reference difference threshold. For instance, the second condition may be that the difference between the AI computing power occupancy ratio and the AI computing memory occupancy ratio of the computing tasks in the first AI application task is smaller than the reference difference threshold. When determining, from the analysis result, the target computing task to be scheduled in the first AI application task, the determination may likewise be made according to the operating efficiency and/or resource utilization of each computing task in the first AI application task. For example, when the difference between the AI computing power occupancy ratio and the AI computing memory occupancy ratio of a certain computing task in the first AI application task is greater than a specified difference threshold, that computing task is determined as the target computing task. In addition, other factors may also be considered when determining the target computing task. For example, when the difference between the AI computing power occupancy ratio and the AI computing memory occupancy ratio of a certain computing task in the first AI application task is greater than the specified difference threshold, and resources such as the memory copy bandwidth and network bandwidth of the first computing device can support scheduling the computing task to be executed by the second computing device, that computing task is determined as the target computing task.
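For illustration, the sketch below applies the second condition in the simplified form described above: a computing task whose AI computing power occupancy ratio and AI computing memory occupancy ratio differ by more than a reference difference threshold is selected as a candidate target computing task. The threshold value and data layout are assumptions, not part of the claimed method.

```python
REF_DIFF_THRESHOLD = 5.0   # assumed reference difference threshold, in percentage points

def pick_target_task_candidates(tasks):
    """Return the computing tasks whose AI computing power occupancy ratio and AI computing
    memory occupancy ratio differ by more than the reference difference threshold."""
    candidates = []
    for task in tasks:
        # each task is a dict such as {"name": "task5", "power_pct": 0.5, "memory_pct": 7.8}
        if abs(task["power_pct"] - task["memory_pct"]) > REF_DIFF_THRESHOLD:
            candidates.append(task["name"])
    return candidates

# With the example figures used later in this embodiment, the 5th computing task qualifies.
print(pick_target_task_candidates([{"name": "task5", "power_pct": 0.5, "memory_pct": 7.8}]))
```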
When the operating efficiency of the computing tasks in the first AI application task and/or the resource utilization of the computing tasks in the first AI application task do not satisfy the corresponding conditions, determining among the plurality of computing tasks a target computing task to be executed by the second computing device can effectively improve the operating efficiency and/or resource utilization of the first AI application task.
Optionally, the process by which the management apparatus determines that the target computing task is to be executed by the second computing device may also take into account the operating efficiency and resource utilization of both the first computing device and the second computing device when executing computing tasks. For example, the management apparatus determines that the target computing task is to be executed by the second computing device when the operating efficiency of the plurality of computing tasks in the second computing device also does not satisfy the first condition, and/or the resource utilization for running the plurality of computing tasks in the second computing device also does not satisfy the second condition. For another example, the management apparatus may determine that the target computing task is to be executed by the second computing device when the target computing task's utilization rate of the first resource is lower than its utilization rate of the second resource, while the computing tasks in the second AI application task have a lower utilization rate of the second resource than of the first resource.
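The complementary-resource rule in the preceding paragraph can be sketched as follows; the figures in the usage example come from the embodiment discussed later, and the function name and boolean formulation are assumptions rather than the claimed decision logic.

```python
def should_offload_to_second_device(task_power_pct, task_memory_pct,
                                    second_dev_power_pct, second_dev_memory_pct):
    """Offload when the target task is heavier on the resource for which the second computing
    device still has headroom (here: a memory-heavy task and a compute-heavy second device)."""
    task_is_memory_heavy = task_memory_pct > task_power_pct
    second_device_is_compute_heavy = second_dev_memory_pct < second_dev_power_pct
    return task_is_memory_heavy and second_device_is_compute_heavy

# Example figures from this embodiment: the 5th task uses 0.5% power and 7.8% memory,
# while the second device's tasks use about 8% power and 0.5% memory.
print(should_offload_to_second_device(0.5, 7.8, 8.0, 0.5))  # True
```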
By determining, according to the first resource usage information and the second resource usage information, that the second computing device is to execute the target computing task, the operating efficiency and resource utilization of the AI application tasks running on the multiple computing devices managed by the management apparatus can be considered globally. The computing tasks in the first AI application task can be managed when the overall operating efficiency and/or resource utilization of the first AI application task is poor, so that, at the cost of performing a scheduling operation, the bottleneck (short-board) effect of the overall system is reduced and the operating performance of the overall system formed by the computing devices managed by the management apparatus is improved.
In an implementation manner, when the method for managing an AI application task provided in this embodiment is implemented by the AI system shown in fig. 5, the scheduling module 3011 and the control module 3012 are disposed in the third computing device. As shown in fig. 7, after the scheduling module 3011 acquires the first resource usage information and the second resource usage information, it sends them to the control module 3012, and the control module 3012 determines, according to the first resource usage information and the second resource usage information, the target computing task to be executed by the second computing device among the plurality of computing tasks in the first AI application task.
For example, suppose the first computing device executes the first AI application task, whose execution logic is shown in fig. 8 and which includes the computing tasks 1 to 5 shown in fig. 8, and the first resource usage information obtained while the first AI application task runs on the first computing device includes: within 1 second, the 5th computing task in the first AI application task occupies 0.5% of the AI computing power and 7.8% of the AI computing memory. The second computing device executes the second AI application task, whose logic diagram is also shown in fig. 8 and which includes the computing tasks 1 to 5 shown in fig. 8, and the second resource usage information obtained while the second AI application task runs on the second computing device includes: within 1 second, the 5 computing tasks in the second AI application task occupy about 8% of the AI computing power and about 0.5% of the AI computing memory.
From the first resource usage information and the second resource usage information, the control module 3012 can learn that the AI computing memory occupancy ratio of the 5th computing task executed by the first computing device is greater than its AI computing power occupancy ratio, with the difference exceeding the reference difference threshold, and that the AI computing power occupancy ratio of the second AI application task executed by the second computing device is greater than its AI computing memory occupancy ratio, with the difference also exceeding the reference difference threshold; neither the first computing device nor the second computing device is utilizing its resources well. Considering that the amount of data output each time the 5th computing task completes is relatively small compared with the total copy bandwidth, the data of the 5th computing task can be distributed to the second computing device, so the control module 3012 decides, according to this analysis result, to schedule the 5th computing task in the first AI application task to the second computing device for execution.
Step 604: the management apparatus sends a first instruction to the first computing device, where the first instruction is used to instruct the first computing device to send data obtained by executing at least one other computing task to the second computing device, and the target computing task has a data association with the at least one other computing task.
After determining the target computing task, the management apparatus may send the first instruction to the first computing device, so that the first computing device sends the data required for executing the target computing task to the second computing device, and the second computing device can then execute the target computing task according to that data. The data required for executing the target computing task includes the data obtained by executing the at least one other computing task.
When the management apparatus sends the first instruction to the first computing device, it sends the first instruction to the functional module, in the first computing device, that is responsible for sending the data required for executing the target computing task to the second computing device. When the management apparatus is deployed in the first computing device, the management apparatus may be a virtual apparatus in the first computing device that has the aforementioned functions, and the sending action is a sending action between different functional modules within the first computing device. When the management apparatus is deployed in the third computing device, the sending action is a sending action between different computing devices.
In an implementation manner, when the management method for the AI application task provided in the embodiment of the present application is implemented by the AI system shown in fig. 5, the control module 3012 is disposed in the third computing device, and the first task scheduling execution module 103 is disposed in the first computing device. As shown in fig. 7, after determining the target computing task, the control module 3012 may send the first instruction to the first task scheduling execution module 103, so that the first task scheduling execution module 103 sends the data required for executing the target computing task to the second computing device according to the first instruction. In this case, the management apparatus sending the first instruction to the first computing device is a sending operation between different devices.
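The following is a minimal, hypothetical sketch of how the first computing device might package the data obtained from the upstream computing task(s) before sending it to the second computing device; the wire format (a JSON body with a length prefix) is an assumption and is not specified by this application.

```python
import json

def build_forward_payload(target_task: str, upstream_outputs: list) -> bytes:
    """Package the data obtained from the upstream computing task(s) for transmission to
    the second computing device, which uses it as the input of the target computing task."""
    body = json.dumps({"target_task": target_task, "inputs": upstream_outputs}).encode("utf-8")
    return len(body).to_bytes(8, "big") + body   # simple length-prefixed framing

payload = build_forward_payload("task5", ["vehicle-attributes-frame-1"])
print(len(payload), payload[:12])
```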
Step 605: when the running program of the target computing task is not deployed in the second computing device, the management apparatus sends the running program of the target computing task to the second computing device, or sends a second instruction to the second computing device to instruct the second computing device to acquire the running program of the target computing task.
The running program of the target computing task may or may not already be deployed in the second computing device. When it is not, the running program of the target computing task needs to be deployed in the second computing device before the second computing device executes the target computing task.
Optionally, the management apparatus may acquire the running program of the target computing task from a computing device that stores it, and send the running program of the target computing task to the second computing device. For example, when some interactions between different computing devices need to be relayed through the management apparatus, assuming that the first computing device stores the running program of the target computing task, the management apparatus may first acquire the running program of the target computing task from the first computing device and then send it to the second computing device. The management apparatus acquiring the running program of the target computing task from the first computing device may include: the management apparatus actively reads the running program of the target computing task from the first computing device, or receives the running program of the target computing task sent by the first computing device.
Alternatively, the management apparatus may send a second instruction to the second computing device to instruct the second computing device to acquire the running program of the target computing task from a computing device that stores it. Correspondingly, after receiving the second instruction, the second computing device can directly acquire the running program of the target computing task, according to the second instruction, from the computing device in which it is stored. The process of the second computing device acquiring the running program of the target computing task from that computing device may include: the second computing device sends an acquisition request to that computing device to request that the running program of the target computing task be sent to the second computing device, or the second computing device reads the running program of the target computing task from that computing device.
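A minimal sketch of step 605 under simplifying assumptions: the programs deployed on the second computing device and the program store are represented as plain dictionaries, and the running program is deployed only when it is not already present; all names are illustrative.

```python
def ensure_program_deployed(deployed_programs: dict, target_task: str, program_store: dict) -> str:
    """Deploy the running program of the target computing task only when the second
    computing device (modelled by deployed_programs) does not already have it."""
    if target_task in deployed_programs:
        return "already deployed"
    # Push the program from the device that stores it (or, equivalently, have the second
    # computing device fetch it after receiving the second instruction).
    deployed_programs[target_task] = program_store[target_task]
    return "deployed"

program_store = {"task5": b"\x00compiled-operator\x00"}  # hypothetical stored running program
second_device_programs = {}
print(ensure_program_deployed(second_device_programs, "task5", program_store))  # deployed
print(ensure_program_deployed(second_device_programs, "task5", program_store))  # already deployed
```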
It should be noted that step 605 is optional: it needs to be performed when the running program of the target computing task is not deployed in the second computing device, and it can be skipped when the running program is already deployed, in which case step 606 may be performed directly after step 604 is completed.
Step 606: the management apparatus sends a third instruction to the second computing device to instruct the second computing device to prepare computing resources for executing the target computing task.
In order to ensure the running performance of the second computing device for executing the target computing task, after the management apparatus determines the target computing task, the management apparatus further needs to send a third instruction to the second computing device to instruct the second computing device to prepare computing resources for executing the target computing task.
In an implementation manner, the management apparatus may obtain the computing requirements of the target computing task, determine, based on those requirements, the computing resources needed to run the target computing task, generate a resource scheduling policy for the target computing task in combination with the resources in the second computing device, and send a third instruction carrying the resource scheduling policy to the second computing device, so as to instruct the second computing device to prepare the computing resources for executing the target computing task according to the resource scheduling policy. Optionally, the resource scheduling policy further indicates how resources in the first computing device are to be scheduled, for example by instructing the first computing device to reclaim the resources used for executing the target computing task. It should be understood that when the resource scheduling policy also indicates that resources in the first computing device are to be scheduled, the management apparatus may also send the resource scheduling policy to the first computing device, so that the first computing device schedules its resources according to the resource scheduling policy. The computing requirements of the target computing task can be determined according to the input data volume and output data volume of the target computing task and the algorithms implemented by the target computing task.
For example, assume that the resource scheduling policy carried by the third instruction instructs the second computing device to allocate the computing resources for executing the target computing task, and that the resources allocated to the target computing task include: the bandwidth at which the processor memory of the computing device transmits data to the AI computing memory of the computing device is adjusted to 1 GB/s, and the bandwidth at which the AI computing memory of the computing device transmits data to the processor memory of the computing device is adjusted to 1 GB/s.
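For illustration only, the resource scheduling policy carried by the third instruction could be represented as a structure like the following; the keys, units, and numeric values are assumptions based on the examples in this embodiment, not a prescribed format.

```python
# Hypothetical dictionary representation of a resource scheduling policy carried by the
# third instruction; keys, units, and values are illustrative only.
resource_scheduling_policy = {
    "second_computing_device": {
        "allocate": {
            "ai_computing_power_pct": 0.5,            # computing power reserved for the target task
            "ai_computing_memory_mb": 600,            # AI computing memory reserved for the target task
            "cpu_to_ai_memory_bandwidth_gbps": 1.0,   # processor memory -> AI computing memory
            "ai_to_cpu_memory_bandwidth_gbps": 1.0,   # AI computing memory -> processor memory
        },
    },
    "first_computing_device": {
        "reclaim": {"task": "task5"},                 # optional: reclaim resources of the offloaded task
    },
}
print(resource_scheduling_policy["second_computing_device"]["allocate"])
```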
The management apparatus can make the scheduling decision according to preset resource scheduling rules to generate the resource scheduling policy. By making scheduling decisions according to the resource scheduling rules, the bottleneck (short-board) effect of the overall system can be reduced, at the cost of performing the scheduling operation, and the operating performance of the overall system is improved.
In an implementation manner, when the method for managing an AI application task provided in the embodiment of the present application is implemented by the AI system shown in fig. 5, the third computing device is provided with a scheduling module 3011 and a control module 3012, the first computing device is provided with the first resource scheduling execution module 104, and the second computing device is provided with the second resource scheduling execution module 204. As shown in fig. 7, after the control module 3012 determines the target computing task, it sends a notification indicating that the target computing task is executed by the second computing device to the scheduling module 3011, and the scheduling module 3011 may generate a corresponding resource scheduling policy according to the notification and send the resource scheduling policy to the second resource scheduling execution module 204 and the first resource scheduling execution module 104, so that the first resource scheduling execution module 104 and the second resource scheduling execution module 204 perform resource scheduling according to the resource scheduling policy.
It should be noted that the process of generating the resource scheduling policy may also be performed by the second computing device. For example, after the management apparatus sends the third instruction to the second computing device, the second computing device may obtain the computation demand of the target computation task according to the third instruction, determine the computation resource required for executing the target computation task based on the computation demand, and generate the resource scheduling policy for the target computation task in combination with the resource in the second computing device.
Step 607: the second computing device prepares computing resources for executing the target computing task based on the third instruction.
After receiving the resource scheduling policy, the second computing device may schedule its resources in the manner indicated by the resource scheduling policy, so as to prepare the computing resources for executing the target computing task. As shown in fig. 7, step 607 may be performed by the second resource scheduling execution module 204. It should be understood that, when the resource scheduling policy also indicates that resources in the first computing device are to be scheduled, the method for managing the AI application task provided in the embodiment of the present application further includes: the first computing device schedules its resources according to the resource scheduling policy, for example by reclaiming the resources used for executing the target computing task. Before a resource is reclaimed, it must be confirmed that all computing tasks that used the resource have been offloaded. The first computing device scheduling its resources according to the resource scheduling policy may be performed by the first resource scheduling execution module 104.
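The reclamation guard mentioned above can be sketched as follows, assuming the tasks using a resource and the offloaded tasks are tracked as sets; the function name is illustrative.

```python
def can_reclaim_resource(tasks_using_resource: set, offloaded_tasks: set) -> bool:
    """A resource on the first computing device may be reclaimed only after every
    computing task that used it has been offloaded."""
    return tasks_using_resource.issubset(offloaded_tasks)

print(can_reclaim_resource({"task5"}, {"task5"}))            # True: safe to reclaim
print(can_reclaim_resource({"task4", "task5"}, {"task5"}))   # False: task4 still uses the resource
```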
Step 608: the first computing device sends the data obtained by executing at least one other computing task to the second computing device according to the first instruction.
After it is determined that the target computing task is to be executed by the second computing device, the first computing device needs to send the data required for executing the target computing task to the second computing device. The data required for executing the target computing task includes the data obtained by executing the at least one other computing task, and the at least one other computing task has a data association with the target computing task.
In one implementation, as shown in fig. 7, the operation of sending the data required for executing the target computing task to the second computing device may be performed by the first task scheduling execution module 103. Moreover, since there may be a delay in sending the data required for executing the target computing task, the first task scheduling execution module 103 is also responsible for cache management of that data, for example temporarily caching the data required for executing the target computing task.
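A minimal sketch, under assumed names, of the temporary caching described above: upstream outputs are buffered until they can be forwarded to the second computing device; the bounded-buffer policy is an assumption rather than a requirement of the method.

```python
from collections import deque

class ForwardBuffer:
    """Temporarily caches the data required for executing the target computing task until
    it can be sent to the second computing device; bounded to limit memory use."""
    def __init__(self, max_items: int = 128):
        self._items = deque(maxlen=max_items)  # oldest entries are dropped when full

    def put(self, data: bytes) -> None:
        self._items.append(data)

    def drain(self):
        while self._items:
            yield self._items.popleft()

buf = ForwardBuffer()
buf.put(b"vehicle-attributes-frame-1")
print(list(buf.drain()))  # [b'vehicle-attributes-frame-1']
```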
Step 609: the second computing device takes the data as input data of the target computing task and executes the target computing task based on the input data.
After the computing resources are prepared for the target computing task, the second computing device can run the running program of the target computing task using those computing resources, taking the data obtained from the at least one other computing task as the input data of the target computing task, and thereby execute the target computing task.
For example, assume that the target computing task is the 5th computing task in the first AI application task, that the management apparatus instructs the first computing device to reclaim the resources used for running the target computing task, and that, before the target computing task is scheduled to the second computing device, the first computing device is configured to execute the first AI application task, the second computing device is configured to execute the second AI application task, and both the first AI application task and the second AI application task include the target computing task. The execution logic of the first AI application task and the second AI application task after the target computing task is scheduled to the second computing device for execution is shown in fig. 9. That is, the 5th computing task is no longer executed in the first computing device, and the operations originally performed by the 5th computing task deployed in the first computing device are performed by the 5th computing task executed in the second computing device.
For another example, assume that the target computing task is the 5th computing task in the first AI application task, that the management apparatus instructs the first computing device to reclaim the resources used for executing the target computing task, and that no running program of any computing task is deployed in the second computing device before the target computing task is scheduled to it. The execution logic of the first AI application task and the second AI application task after the target computing task is scheduled to the second computing device for execution is shown in fig. 10. That is, the 5th computing task is no longer executed in the first computing device and is instead executed by the second computing device.
For another example, assume that the target computing task is the 5th computing task in the first AI application task, that the management apparatus instructs the first computing device to reclaim the resources used for executing the target computing task, and that, before the target computing task is scheduled to the second computing device, the running program of the 5th computing task in the first AI application task is not deployed in the second computing device. The execution logic of the first AI application task and the second AI application task after the target computing task is scheduled to the second computing device for execution is shown in fig. 10. That is, the 5th computing task is no longer executed in the first computing device and is instead executed by the second computing device. For ease of viewing, the computing tasks that were deployed in the second computing device before the scheduling are not shown in fig. 10.
In summary, in the management method for AI application tasks provided in the embodiments of the present application, the management apparatus determines, among the plurality of computing tasks included in the AI application task executed by the first computing device, a target computing task to be executed in the second computing device; the first computing device then sends the data required for executing the target computing task to the second computing device, and the second computing device uses that data as the input data of the target computing task and executes the target computing task based on it. In this way, the target computing task can be scheduled to the second computing device while the first computing device is executing the AI application task, so computing tasks can be scheduled flexibly during the execution of the AI application task. This helps improve the resource utilization and/or computing-task processing efficiency of the first computing device and the second computing device, and improves the operating performance of the overall system that implements the AI application task.
It should be noted that, in the management method for an AI application task provided in the embodiment of the present application, the order of steps may be appropriately adjusted, or the steps may be correspondingly increased or decreased according to the situation, for example, whether to execute step 602 or whether to execute step 605 may be determined according to an application requirement. Those skilled in the art can easily conceive of various methods within the technical scope of the present disclosure, and therefore, the detailed description is omitted.
The embodiment of the application also provides an AI system. The AI system includes a first computing device, a second computing device, and a management apparatus. Please refer to fig. 2, fig. 3, or fig. 5 for the architecture of the AI system. The first computing device, the second computing device and the management device have the following functions:
The management apparatus is configured to determine a target computing task among the plurality of computing tasks, where the target computing task is a computing task to be executed by the second computing device and has a data association with at least one other computing task involved in the AI application task.
The management apparatus is further configured to send a first instruction to the first computing device.
The first computing device is configured to send the data obtained by executing the at least one other computing task to the second computing device according to the first instruction.
The second computing device is configured to take the data as the input data of the target computing task and execute the target computing task based on the input data.
Optionally, the management device is further configured to: first resource usage information is obtained for a plurality of computing tasks when executed in a first computing device, and a target computing task of the plurality of computing tasks is determined based on the first resource usage information.
Optionally, when the running program of the target computing task is not deployed in the second computing device, the management apparatus is further configured to: and sending the running program of the target computing task to the second computing device, or sending a second instruction to the second computing device to instruct the second computing device to acquire the running program of the target computing task.
Optionally, the management device is further configured to: sending a third instruction to the second computing device to instruct the second computing device to prepare computing resources for executing the target computing task.
Optionally, the management apparatus is specifically configured to: when the operating efficiency of a plurality of computing tasks in the first computing device does not meet a preset first condition and/or the resource utilization condition for operating the plurality of computing tasks in the first computing device does not meet a preset second condition, determining a target computing task in the plurality of computing tasks.
Optionally, the management device is further configured to: second resource usage information for a computing task executing in a second computing device is obtained.
Correspondingly, the management device is specifically configured to: a target computing task of the plurality of computing tasks is determined based on the first resource usage information and the second resource usage information.
Optionally, the first resource usage information includes: execution information of at least one of the plurality of computing tasks and resource information of the first computing device.
Optionally, the operation information of the at least one computing task is obtained from one or more of the following operation parameters: the number of times the computing task is invoked per unit time, the input data volume of the computing task, the output data volume of the computing task, the running duration for which the first computing device invokes the computing task, the processor consumption when the first computing device invokes the computing task, and the memory consumption when the first computing device invokes the computing task.
Optionally, the resource information of the first computing device is obtained from one or more of the following parameters: the rated value and total consumption of the memory of the first computing device, the bandwidth at which the processor memory of the first computing device transmits data to the AI computing memory of the first computing device and the rated value of that bandwidth, and the bandwidth at which the AI computing memory of the first computing device transmits data to the processor memory of the first computing device and the rated value of that bandwidth.
Optionally, the management apparatus is disposed in the first computing device, the second computing device, or a third computing device, wherein the third computing device is connected with the first computing device and the second computing device through a communication path.
Optionally, the first computing device is a graphics card, an AI computing chip, or a server; the second computing device is a graphics card, an AI computing chip, or a server; and the third computing device is a graphics card, an AI computing chip, or a server.
In summary, in the AI system provided in the embodiment of the present application, the management apparatus determines, among the plurality of computing tasks included in the AI application task, a target computing task to be executed in the second computing device; the first computing device then sends the data required for executing the target computing task to the second computing device, and the second computing device uses that data as the input data of the target computing task and executes the target computing task based on it. In this way, the target computing task can be scheduled to the second computing device while the first computing device is executing the AI application task, so computing tasks can be scheduled flexibly during the execution of the AI application task. This helps improve the resource utilization and/or computing-task processing efficiency of the first computing device and the second computing device, and improves the operating performance of the overall system that implements the AI application task.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the management apparatus, the first computing device, and the second computing device described above may refer to corresponding contents in the foregoing method embodiments, and are not described herein again.
The embodiment of the application further provides a management device. Referring to fig. 5, as shown in fig. 5, the management apparatus 301 includes a control module 3012 and a scheduling module 3011.
The control module 3012 is configured to, in a process that a first computing device executes a plurality of computing tasks, determine a target computing task in the plurality of computing tasks, where the target computing task is a computing task to be executed by a second computing device, and the target computing task has data association with at least one other computing task related to an AI application task;
the scheduling module 3011 is configured to send a first instruction to the first computing device, where the first instruction is used to instruct the first computing device to send the data obtained by executing the at least one other computing task to the second computing device, so that the second computing device executes the target computing task.
Optionally, the control module 3012 is further configured to: acquiring first resource usage information of a plurality of computing tasks when executed in a first computing device; a target computing task of the plurality of computing tasks is determined based on the first resource usage information.
Optionally, when the running program of the target computing task is not deployed in the second computing device, the scheduling module 3011 is further configured to send the running program of the target computing task to the second computing device, or send a second instruction to the second computing device to instruct the second computing device to obtain the running program of the target computing task.
Optionally, the scheduling module 3011 is further configured to send a third instruction to the second computing device to instruct the second computing device to prepare computing resources for executing the target computing task.
Optionally, the control module 3012 is specifically configured to: when the operating efficiency of a plurality of computing tasks in the first computing device does not meet a preset first condition and/or the resource utilization condition for operating the plurality of computing tasks in the first computing device does not meet a preset second condition, determining a target computing task in the plurality of computing tasks.
Optionally, the control module 3012 is further configured to obtain second resource usage information of a computing task executed in a second computing device. Accordingly, the control module 3012 is specifically configured to determine a target computing task of the plurality of computing tasks based on the first resource usage information and the second resource usage information.
Optionally, the first resource usage information includes: execution information of at least one of the plurality of computing tasks and resource information of the first computing device.
Optionally, the operation information of the at least one computing task is obtained from one or more of the following operation parameters: the number of times the computing task is invoked per unit time, the input data volume of the computing task, the output data volume of the computing task, the running duration for which the first computing device invokes the computing task, the processor consumption when the first computing device invokes the computing task, and the memory consumption when the first computing device invokes the computing task.
Optionally, the resource information of the first computing device is obtained from one or more of the following parameters: the rated value and total consumption of the memory of the first computing device, the bandwidth at which the processor memory of the first computing device transmits data to the AI computing memory of the first computing device and the rated value of that bandwidth, and the bandwidth at which the AI computing memory of the first computing device transmits data to the processor memory of the first computing device and the rated value of that bandwidth.
Optionally, the management apparatus is disposed in the first computing device, the second computing device, or a third computing device, wherein the third computing device is connected to the first computing device and the second computing device through a communication path.
Optionally, the first computing device is a graphics card, an AI computing chip, or a server; the second computing device is a graphics card, an AI computing chip, or a server; and the third computing device is a graphics card, an AI computing chip, or a server.
In summary, with the management apparatus provided in the embodiment of the present application, the management apparatus determines, among the plurality of computing tasks included in the AI application task, a target computing task to be executed in the second computing device, and then sends a first instruction to the first computing device so that the first computing device sends the data required for executing the target computing task to the second computing device; the second computing device uses that data as the input data of the target computing task and executes the target computing task based on it. In this way, the target computing task can be scheduled to the second computing device while the first computing device is executing the AI application task, so computing tasks can be scheduled flexibly during the execution of the AI application task. This helps improve the resource utilization and/or computing-task processing efficiency of the first computing device and the second computing device, and improves the operating performance of the overall system that implements the AI application task.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the management apparatus, i.e., the module, described above may refer to the corresponding content in the foregoing method embodiment, and is not described herein again.
The embodiment of the present application further provides an electronic device, which implements the functions of the management apparatus by running instructions. In the scenario shown in fig. 2, the electronic device may be the third computing device 30, or may be an electronic device that includes the third computing device 30; for example, if the third computing device 30 is a graphics card, the electronic device may be a server that includes the graphics card. In the scenario shown in fig. 3, the electronic device may be the first computing device 10 that implements the functions of the management apparatus, or may be an electronic device that includes the first computing device 10. As shown in fig. 11, the electronic device 110 includes a bus 1101, a processor 1102, a communication interface 1103, and a memory 1104. The processor 1102, the memory 1104, and the communication interface 1103 communicate with one another via the bus 1101.
The processor 1102 may be an integrated circuit chip having signal processing capabilities. In an implementation process, the functions of the management apparatus in the method for managing an AI application task provided in the embodiments of the present application may be implemented by an integrated logic circuit of hardware in the processor 1102 or by instructions in the form of software. The processor 1102 may also be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination of some or all of the above. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
The processor 1102 may also be a general-purpose processor, which may be a microprocessor or any conventional processor, for example, a central processing unit (CPU), a graphics processing unit (GPU), a network processor (NP), or a combination of some or all of a CPU, a GPU, and an NP. The steps of the methods disclosed in the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an erasable programmable read-only memory, or a register. The storage medium is located in the memory 1104; the processor 1102 reads the information and computer instructions in the memory 1104 and, in combination with its hardware, implements the functions of the management apparatus in the method for managing an AI application task provided in the embodiments of the present application.
The memory 1104 stores computer instructions and data. For example, the memory 1104 stores executable code for the functions of the management apparatus in the method for managing an AI application task, and the processor 1102 reads the executable code in the memory 1104 to perform the method for managing an AI application task provided in the embodiments of the present application. The memory 1104 may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD). The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchronous link dynamic random access memory (SLDRAM), and a direct rambus random access memory (DR RAM). The memory 1104 may further store other software modules required for running a process, such as an operating system. The operating system may be LINUX™, UNIX™, WINDOWS™, or the like.
The communication interface 1103 uses a transceiver module, for example but not limited to a transceiver, to implement communication between the electronic device 110 and other devices or communication networks.
The bus 1101 may include a path for transferring information among the components of the electronic device 110, such as the processor 1102, the communication interface 1103, and the memory 1104.
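For readability only, the composition of the electronic device 110 described above (bus 1101, processor 1102, communication interface 1103, memory 1104) can be mirrored by a small structural sketch. The field names below are hypothetical and merely restate fig. 11; they are not part of the embodiment.

```python
# Structural sketch of electronic device 110; all field names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Memory:                      # corresponds to memory 1104
    executable_code: bytes = b""   # code of the management-apparatus functions
    data: dict = field(default_factory=dict)


@dataclass
class ElectronicDevice:            # corresponds to electronic device 110
    processor: str = "CPU/GPU/NPU"                  # processor 1102
    communication_interface: str = "transceiver"    # communication interface 1103
    memory: Memory = field(default_factory=Memory)  # connected via bus 1101

    def run_management_function(self) -> str:
        # The processor reads the executable code from the memory and,
        # together with its hardware, performs the management method.
        return f"{self.processor} runs {len(self.memory.executable_code)} bytes of code"
```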
The present application further provides a computer-readable storage medium, which may be a non-transitory readable storage medium. When the program instructions in the computer-readable storage medium are executed by an electronic device, the electronic device implements the functions of the management apparatus in the method for managing an AI application task provided in the present application. The computer-readable storage medium includes, but is not limited to, a volatile memory such as a random access memory, and a nonvolatile memory such as a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
An embodiment of the present application further provides a computer program product including instructions. When the computer program product runs on an electronic device, the electronic device implements the functions of the management apparatus in the method for managing an AI application task provided in the embodiments of the present application.
An embodiment of the present application further provides a chip. The chip includes a programmable logic circuit and/or program instructions, and when running, the chip is configured to implement the functions of the management apparatus in the method for managing an AI application task provided in the embodiments of the present application.
Those skilled in the art will understand that all or some of the steps for implementing the foregoing embodiments may be implemented by hardware, or by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the embodiments of the present application, the terms "first", "second", and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Unless expressly defined otherwise, the term "at least one" refers to one or more, and the term "plurality" refers to two or more.
The term "and/or" in this application is only one kind of association relationship describing the associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
The present application is intended to cover various modifications, equivalent replacements, and improvements of the embodiments described above that fall within the spirit and scope of the present application.

Claims (29)

1. A method for managing an artificial intelligence (AI) application task, wherein the AI application task involves a plurality of computing tasks, the method comprising:
while a first computing device executes the plurality of computing tasks, a management device determines a target computing task among the plurality of computing tasks, wherein the target computing task is a computing task to be executed by a second computing device, and the target computing task is associated with data of at least one other computing task related to the AI application task;
the management device sends a first instruction to the first computing device;
the first computing device sends, according to the first instruction, data obtained by executing the at least one other computing task to the second computing device; and
the second computing device uses the data as input data of the target computing task, and executes the target computing task based on the input data.
2. The method of claim 1, wherein prior to the management device determining a target computing task of the plurality of computing tasks, the method further comprises:
the management device acquires first resource usage information of the plurality of computing tasks when executed in the first computing device;
the management device determining a target computing task of the plurality of computing tasks, comprising:
the management device determines a target computing task of the plurality of computing tasks based on the first resource usage information.
3. The method of claim 1 or 2, wherein when the running program of the target computing task is not deployed in the second computing device, the method further comprises:
the management device sends the running program of the target computing task to the second computing device, or the management device sends a second instruction to the second computing device to instruct the second computing device to acquire the running program of the target computing task.
4. The method according to any one of claims 1-3, further comprising:
the management device sends a third instruction to the second computing device to instruct the second computing device to prepare computing resources for executing the target computing task.
5. The method of any of claims 1-4, wherein the managing device determining a target computing task of the plurality of computing tasks comprises:
when the operating efficiency of the plurality of computing tasks in the first computing device does not meet a preset first condition, and/or the resource utilization condition for operating the plurality of computing tasks in the first computing device does not meet a preset second condition, the management device determines the target computing task among the plurality of computing tasks.
6. The method of claim 2, further comprising: the management device acquires second resource usage information of a computing task executed in the second computing device;
the management device determining a target computing task of the plurality of computing tasks, comprising:
the management device determines a target computing task of the plurality of computing tasks based on the first resource usage information and the second resource usage information.
7. The method according to claim 2 or 6, wherein the first resource usage information comprises: execution information for at least one of the plurality of computing tasks and resource information for the first computing device.
8. The method of claim 7, wherein the execution information of the at least one computing task is obtained from one or more of the following operational parameters:
a number of times the computing task is invoked per unit time, an input data volume of the computing task, an output data volume of the computing task, a running duration of the computing task when invoked by the first computing device, processor consumption for invoking the computing task, and memory consumption for invoking the computing task;
the resource information of the first computing device is obtained from one or more of the following parameters: data transmission from the memory of the first computing device to the AI computing memory of the first computing device, and data transmission from the AI computing memory of the first computing device to the processor memory of the first computing device.
9. The method of any one of claims 1 to 8, wherein the management apparatus is deployed in the first computing device, the second computing device, or a third computing device, wherein the third computing device is connected to the first computing device and the second computing device via a communication pathway.
10. The method of claim 9, wherein
the first computing device is a display card, an AI computing chip, or a server;
the second computing device is a display card, an AI computing chip, or a server; and
the third computing device is a display card, an AI computing chip, or a server.
11. An artificial intelligence (AI) system, wherein the AI system comprises a first computing device, a second computing device, and a management device, wherein:
the management device is configured to determine, during execution of a plurality of computing tasks of an AI application task by the first computing device, a target computing task among the plurality of computing tasks, wherein the target computing task is a computing task to be executed by the second computing device, and the target computing task is associated with data of at least one other computing task related to the AI application task;
the management device is further configured to send a first instruction to the first computing device;
the first computing device is configured to send, according to the first instruction, data obtained by executing the at least one other computing task to the second computing device; and
the second computing device is configured to use the data as input data of the target computing task, and execute the target computing task based on the input data.
12. The system of claim 11, wherein the management device is further configured to:
obtain first resource usage information of the plurality of computing tasks when executed in the first computing device; and
determine the target computing task among the plurality of computing tasks based on the first resource usage information.
13. The system according to claim 11 or 12, wherein when the running program of the target computing task is not deployed in the second computing device, the management device is further configured to:
send the running program of the target computing task to the second computing device, or send a second instruction to the second computing device to instruct the second computing device to acquire the running program of the target computing task.
14. The system according to any of claims 11-13, wherein the management device is further configured to:
send a third instruction to the second computing device to instruct the second computing device to prepare computing resources for executing the target computing task.
15. The system according to any one of claims 11 to 14, wherein the management device is specifically configured to:
when the operating efficiency of the plurality of computing tasks in the first computing device does not meet a preset first condition and/or the resource utilization condition for operating the plurality of computing tasks in the first computing device does not meet a preset second condition, determine the target computing task among the plurality of computing tasks.
16. The system of claim 12, wherein the management device is further configured to:
obtain second resource usage information of a computing task executed in the second computing device;
the management device is specifically configured to:
determine the target computing task among the plurality of computing tasks based on the first resource usage information and the second resource usage information.
17. The system according to claim 12 or 16, wherein the first resource usage information comprises: execution information of at least one of the plurality of computing tasks and resource information of the first computing device.
18. The system of claim 17, wherein the execution information of the at least one computing task is obtained from one or more of the following operational parameters:
a number of times the computing task is invoked per unit time, an input data volume of the computing task, an output data volume of the computing task, a running duration of the computing task when invoked by the first computing device, processor consumption for invoking the computing task, and memory consumption for invoking the computing task;
the resource information of the first computing device is obtained from one or more of the following parameters: data transmission from the memory of the first computing device to the AI computing memory of the first computing device, and data transmission from the AI computing memory of the first computing device to the processor memory of the first computing device.
19. The system according to any one of claims 11 to 18, wherein the management apparatus is deployed in the first computing device, the second computing device, or a third computing device, wherein the third computing device is connected to the first computing device and the second computing device via a communication pathway.
20. The system of claim 19, wherein
the first computing device is a display card, an AI computing chip, or a server;
the second computing device is a display card, an AI computing chip, or a server; and
the third computing device is a display card, an AI computing chip, or a server.
21. A management apparatus, wherein the management apparatus comprises a control module and a scheduling module, wherein:
the control module is configured to determine, during execution of a plurality of computing tasks of an AI application task by a first computing device, a target computing task among the plurality of computing tasks, wherein the target computing task is a computing task to be executed by a second computing device, and the target computing task is associated with data of at least one other computing task related to the AI application task; and
the scheduling module is configured to send a first instruction to the first computing device, wherein the first instruction is used to instruct the first computing device to send data obtained by executing the at least one other computing task to the target computing task in the second computing device.
22. The apparatus of claim 21, wherein the control module is further configured to:
obtain first resource usage information of the plurality of computing tasks when executed in the first computing device; and determine the target computing task among the plurality of computing tasks based on the first resource usage information.
23. The apparatus according to claim 21 or 22, wherein the scheduling module is further configured to send the running program of the target computing task to the second computing device or send a second instruction to the second computing device to instruct the second computing device to obtain the running program of the target computing task when the running program of the target computing task is not deployed in the second computing device.
24. The apparatus of any of claims 21-23, wherein the scheduling module is further configured to send a third instruction to the second computing device to instruct the second computing device to prepare computing resources for executing the target computing task.
25. The apparatus according to any one of claims 21-24, wherein the control module is specifically configured to:
when the operating efficiency of the plurality of computing tasks in the first computing device does not meet a preset first condition and/or the resource utilization condition for operating the plurality of computing tasks in the first computing device does not meet a preset second condition, determine the target computing task among the plurality of computing tasks.
26. The apparatus of claim 22, wherein the control module is further configured to obtain second resource usage information for a computing task executing in the second computing device;
the control module is specifically configured to determine a target computing task of the plurality of computing tasks based on the first resource usage information and the second resource usage information.
27. An electronic device, comprising a memory and a processor, wherein the memory stores computer instructions, and when the processor executes the computer instructions, the electronic device implements the functions of the apparatus according to any one of claims 21-26.
28. A computer-readable storage medium, wherein the computer-readable storage medium stores program instructions, and when the program instructions are executed by an electronic device, the electronic device implements the functions of the apparatus according to any one of claims 21-26.
29. A computer program product, wherein the computer program product comprises program instructions, and when the program instructions are executed by an electronic device, the electronic device implements the functions of the apparatus according to any one of claims 21-26.
CN202110172700.9A 2020-11-12 2021-02-08 Management method, system, equipment and storage medium of artificial intelligence application task Pending CN114489963A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP21890892.9A EP4239476A4 (en) 2020-11-12 2021-10-16 Method, system, and device for managing artificial intelligence application task, and storage medium
PCT/CN2021/124253 WO2022100365A1 (en) 2020-11-12 2021-10-16 Method, system, and device for managing artificial intelligence application task, and storage medium
US18/316,818 US20230281056A1 (en) 2020-11-12 2023-05-12 Artificial Intelligence Application Task Management Method, System, Device, and Storage Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011262475X 2020-11-12
CN202011262475 2020-11-12

Publications (1)

Publication Number Publication Date
CN114489963A true CN114489963A (en) 2022-05-13

Family

ID=81491714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110172700.9A Pending CN114489963A (en) 2020-11-12 2021-02-08 Management method, system, equipment and storage medium of artificial intelligence application task

Country Status (4)

Country Link
US (1) US20230281056A1 (en)
EP (1) EP4239476A4 (en)
CN (1) CN114489963A (en)
WO (1) WO2022100365A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117312388B (en) * 2023-10-08 2024-03-19 江苏泰赋星信息技术有限公司 Artificial intelligence model control system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5773554B2 (en) * 2012-02-27 2015-09-02 株式会社日立製作所 Task management method and task management apparatus
US9122524B2 (en) * 2013-01-08 2015-09-01 Microsoft Technology Licensing, Llc Identifying and throttling tasks based on task interactivity
CN111381946B (en) * 2018-12-29 2022-12-09 上海寒武纪信息科技有限公司 Task processing method and device and related products
KR20200084707A (en) * 2019-01-03 2020-07-13 삼성전자주식회사 Master device for managing distributed processing of task, task processing device for processing task and method for operating thereof
CN111400008B (en) * 2020-03-13 2023-06-02 北京旷视科技有限公司 Computing resource scheduling method and device and electronic equipment
CN111651253B (en) * 2020-05-28 2023-03-14 中国联合网络通信集团有限公司 Computing resource scheduling method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115543862A (en) * 2022-09-27 2022-12-30 超聚变数字技术有限公司 Memory management method and related device
CN115543862B (en) * 2022-09-27 2023-09-01 超聚变数字技术有限公司 Memory management method and related device

Also Published As

Publication number Publication date
WO2022100365A1 (en) 2022-05-19
US20230281056A1 (en) 2023-09-07
EP4239476A4 (en) 2024-01-03
EP4239476A1 (en) 2023-09-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination