CN113791906A - Scheduling system and optimization algorithm based on GPU resources in artificial intelligence and engineering fields - Google Patents

Scheduling system and optimization algorithm based on GPU resources in artificial intelligence and engineering fields Download PDF

Info

Publication number
CN113791906A
CN113791906A (application CN202111081087.6A)
Authority
CN
China
Prior art keywords
gpu
task
priority
tasks
resources
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111081087.6A
Other languages
Chinese (zh)
Inventor
唐维昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Daisy Shanghai Software Co ltd
Original Assignee
Daisy Shanghai Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Daisy Shanghai Software Co ltd filed Critical Daisy Shanghai Software Co ltd
Publication of CN113791906A publication Critical patent/CN113791906A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 - Allocation of resources to service a request
    • G06F9/5027 - Allocation of resources, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F2209/00 - Indexing scheme relating to G06F9/00
    • G06F2209/50 - Indexing scheme relating to G06F9/50
    • G06F2209/5011 - Pool
    • G06F2209/5021 - Priority

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a scheduling system and an optimization algorithm based on GPU resources in the fields of artificial intelligence and engineering. The scheduling system comprises: a human-computer interaction module, which lets an operator work visually; a resource management module, which organizes the usable GPU resources into a GPU resource pool composed of a plurality of nodes, each node corresponding to one device, where the multi-card ID relationships inside each node's GPUs and changes in each node's available GPU resources are fed back to the pool in real time for dynamic management; each GPU periodically reports its use state, including idle state, resource utilization rate, and task completion, and the resource management module releases GPU resources once it receives a task-completion signal so that subsequent tasks can use them; and a scheduling module, which budgets the computation amount of the tasks input through the human-computer interaction module and allocates GPU resources according to the computed amount, balancing computation speed against reasonable utilization of GPU resources.

Description

Scheduling system and optimization algorithm based on GPU resources in artificial intelligence and engineering fields
Technical Field
The invention relates to a computer technology, in particular to a scheduling optimization algorithm based on GPU resources in the fields of artificial intelligence and engineering.
Background
Regarding GPU resource requirements on computing platforms in the fields of artificial intelligence and engineering, the GPU differs from the CPU in a key way: the number of CPUs on a server of a given specification is fixed, while the number of GPUs on each server may well vary. A traditional CPU-based scheduler can schedule over a fixed number of CPU resources but cannot schedule an uncertain number of GPU resources, so traditional CPU resource scheduling technology is not suitable for GPU resource scheduling.
At present, the main technologies for GPU resource scheduling are Nvidia GPU Docker and Slurm. Nvidia GPU Docker can realize batch scheduling, but cannot realize interactive high-performance scheduling and has no priority scheduling algorithm. Slurm is an open-source technology with a weak scheduling algorithm and no interaction or priority support. Both technologies were developed abroad and are widely used with broad customer bases, but they are limited to scheduling CPUs, cannot meet the professional-field requirement of scheduling different numbers of GPUs on different nodes, cannot schedule GPUs across nodes, and, being developed abroad, carry potential security risks.
Disclosure of Invention
In view of the above defects in the prior art, the technical problem to be solved by the present invention is to provide a scheduling system and an optimization algorithm in the artificial intelligence and engineering field based on GPU resources, which can realize efficient scheduling of GPU resources.
In order to achieve the above object, the present invention provides a scheduling system based on GPU resources in the field of artificial intelligence and engineering, comprising:
the human-computer interaction module, used to let an operator work visually: it transmits the client's instruction for operating the 3D model to the server, and after the server computes the operation amount of the 3D model it feeds the result back to the client, so that, in cloud-computing mode, operating the 3D model at the client is no different from operating it at the server;
the resource management module, used to organize the usable GPU resources into a GPU resource pool composed of a plurality of nodes, each node corresponding to one device; the multi-card ID relationships inside each node's GPUs and changes in each node's available GPU resources are fed back to the pool in real time for dynamic management; each GPU periodically reports its use state, including idle state, resource utilization rate, and task completion, and the resource management module releases GPU resources once it receives a task-completion signal so that subsequent tasks can use them;
and the scheduling module, used to budget the computation amount of the tasks input through the human-computer interaction module and allocate GPU resources according to the computed amount, balancing computation speed against reasonable utilization of GPU resources.
Further, the scheduling mode of the scheduling module includes:
1) autonomous model-scale recognition and matched scheduling: budget the computing power required by the input task and allocate GPU resources for the operation accordingly;
2) first-in first-out: after budgeting the GPU resources required by the task, match the task against each node's GPU resources, preferentially assigning it to a single node and to idle GPUs; when one GPU corresponds to multiple tasks, the tasks queue and run in order;
3) optimal solution: according to the task amount and the current GPU resource state, select the fastest-running GPU resources on the premise of not affecting the computation of other tasks;
4) priority: priority levels are divided by urgency, with preset thresholds as the dividing basis; when a task must be assigned a priority, it is given a priority parameter directly; the scheduling module compares the task's parameter against each priority threshold to convert it to a priority level, and GPU resources are then used in order of priority; by default, priority tasks are allocated GPU resources in the optimal-solution mode, and higher-priority tasks preferentially occupy the best GPU resources under a shortest-time principle;
when one GPU corresponds to multiple priority tasks, they are queued and computed in order of their priority parameter values; tasks with equal priority parameters queue and run first-in first-out.
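The threshold-based priority conversion and per-GPU queuing of mode 4 can be sketched as follows; the threshold values, level names, and class shape are illustrative assumptions, since the patent does not specify them:

```python
import heapq
import itertools

# Hypothetical thresholds: a task's raw priority parameter is compared
# against each threshold to obtain a priority level.
PRIORITY_THRESHOLDS = [(100, "urgent"), (50, "high"), (0, "normal")]

def to_level(priority_param):
    """Convert a raw priority parameter into a priority level."""
    for threshold, level in PRIORITY_THRESHOLDS:
        if priority_param >= threshold:
            return level
    return "normal"

class GpuQueue:
    """Tasks queued on one GPU: larger priority parameter runs first;
    equal parameters run first-in first-out."""
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # FIFO tie-breaker

    def push(self, task_name, priority_param=0):
        # Negate the priority so the largest parameter pops first;
        # the counter preserves insertion order on ties.
        heapq.heappush(self._heap,
                       (-priority_param, next(self._counter), task_name))

    def pop(self):
        return heapq.heappop(self._heap)[2]
```

A task submitted with parameter 99 would therefore preempt the queue ahead of two earlier tasks submitted with parameter 10, which then run in submission order.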
Furthermore, the budgeting of the model mainly comprises:
model-size budget: using the data volume of the 3D model as the standard, budget the computation required to complete the specified task;
input-value budget: take the operation amount input by the operator at the client as the budgeted operation amount;
algorithm budget: establish an operation-amount estimation model and train it with a large amount of operation data, so that the budgeting algorithm can automatically estimate the required operation amount from the characteristics of the current task;
dynamic adjustment: monitor operation progress during the computation; if progress is below expectation, increase GPU resources, and if above expectation, appropriately reduce them, so that each operation task proceeds quickly and smoothly.
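A minimal Python sketch of the model-size budget and the dynamic-adjustment rule above; the calibration constant and the GPU-count limits are hypothetical, since the patent gives no concrete values:

```python
def budget_from_model_size(data_volume_mb, flops_per_mb=1e9):
    """Model-size budget: estimate required computation from the 3D
    model's data volume. flops_per_mb is a hypothetical calibration
    factor that a real system would fit from historical runs."""
    return data_volume_mb * flops_per_mb

def adjust_gpus(current_gpus, actual_progress, expected_progress,
                min_gpus=1, max_gpus=8):
    """Dynamic adjustment: add a GPU when progress lags expectation,
    release one when it runs ahead, staying within pool limits."""
    if actual_progress < expected_progress:
        return min(current_gpus + 1, max_gpus)
    if actual_progress > expected_progress:
        return max(current_gpus - 1, min_gpus)
    return current_gpus
```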
The invention also discloses a scheduling optimization algorithm in the fields of artificial intelligence and engineering based on GPU resources, which comprises the following steps:
s100, dynamically updating GPU resource pool information by a resource management module;
s200, an operator inputs a task needing to be operated at a client through a man-machine interaction module, the task needing to be operated is uploaded to a server, and the server forwards the task to a scheduling module for processing;
s300, the scheduling module firstly checks whether the task contains the priority parameter, if so, the task enters priority scheduling, and if not, the task is scheduled in a common scheduling mode.
Further, S100 further includes:
s110, periodically requesting information to each GPU, feeding back a signal by each GPU according to the request information, and judging whether the GPU can be normally used or not by the GPU resource pool according to the fed-back signal, so that the GPU which cannot be used is found in time, and the GPU is prevented from being used in the subsequent task allocation;
s120, when the number of GPUs of each node is increased or decreased, the node sends updating information to a GPU resource pool, and the GPU resource pool updates GPU resources of the node according to the information;
and S130, in the operation process, each node collects state information of the GPU therein and feeds the state information back to the resource management module periodically, the state information of the GPU comprises an idle state of the GPU, a resource utilization rate when the GPU runs and a task completion state, and the resource management module releases the computing power of the GPU after acquiring the task completion state information, so that the operation of the next task is started, and task assignment is carried out on each GPU according to the state of the GPU.
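Steps S110-S130 can be sketched as a small in-memory pool; the data layout and the 30-second staleness cutoff are illustrative assumptions not stated in the patent:

```python
import time

class GpuResourcePool:
    """Sketch of S110-S130: track per-node GPU status from periodic
    heartbeats and release a GPU when its task completes."""
    def __init__(self):
        self.nodes = {}  # node_id -> {gpu_id: status dict}

    def update_node(self, node_id, gpu_ids):
        """S120: a node reports its current (added/removed) GPU set."""
        current = self.nodes.setdefault(node_id, {})
        for gpu_id in gpu_ids:
            current.setdefault(gpu_id, {"idle": True, "util": 0.0,
                                        "last_seen": time.time()})
        for gpu_id in list(current):
            if gpu_id not in gpu_ids:
                del current[gpu_id]

    def heartbeat(self, node_id, gpu_id, idle, util, task_done=False):
        """S110/S130: periodic status feedback; release on completion."""
        status = self.nodes[node_id][gpu_id]
        status.update(idle=idle, util=util, last_seen=time.time())
        if task_done:
            status["idle"] = True  # free the GPU for the next task

    def usable_gpus(self, max_age=30.0):
        """GPUs whose heartbeat is recent enough to count as usable."""
        now = time.time()
        return [(n, g) for n, gpus in self.nodes.items()
                for g, s in gpus.items() if now - s["last_seen"] <= max_age]
```

A GPU that stops answering the periodic request simply ages out of `usable_gpus`, which matches S110's goal of excluding unusable GPUs from subsequent allocation.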
Further, S300 further includes:
s310, dividing priority levels according to emergency situations, wherein a preset threshold is adopted as a dividing basis; when the priority needs to be distributed in the task, priority parameters are directly given to the task, the scheduling module compares the priority parameters carried by the task with various priority threshold values, the priority level of the task is converted, and then GPU resources are preferentially used according to the priority level;
allocating GPU resources according to an optimal solution mode by default according to the priority, and preferentially occupying the optimal GPU resources for tasks with higher priorities by adopting a shortest time principle;
in the priority operation priority task with priority corresponding to the GPU, a plurality of priorities corresponding to the GPU are queued and sequentially calculated according to priority parameter values, and a plurality of priorities corresponding to the GPU with equal priority parameters are queued and operated according to a first-in first-out principle.
Further, S300 further includes: s320, the common scheduling method includes:
s321, autonomously identifying model scale and matching scheduling, and allocating GPU resources for operation according to the calculation power required by the input task budget and the required calculation power;
s322, firstly, after the GPU resources required by the task budget are calculated, matching the tasks according to the GPU resources of each node, preferentially distributing the tasks to the same node, preferentially distributing the tasks to idle GPU operation, and when the GPU corresponds to a plurality of tasks, sequentially queuing the tasks for operation;
and S323, selecting the GPU resource which is operated fastest on the premise of not influencing the calculation of other tasks according to the task amount and the current GPU resource state by the optimal solution.
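A sketch of the placement logic in S322/S323; treating the lowest-utilization GPU as the "fastest" candidate is an assumption, since the patent does not define how the fastest GPU resource is identified:

```python
def place_task(required_gpus, nodes):
    """S322 sketch: prefer a single node with enough idle GPUs; among
    candidates pick the node with the most idle GPUs.
    `nodes` maps node_id -> list of (gpu_id, is_idle)."""
    best = None
    for node_id, gpus in nodes.items():
        idle = [g for g, is_idle in gpus if is_idle]
        if len(idle) >= required_gpus:
            if best is None or len(idle) > len(best[1]):
                best = (node_id, idle)
    if best:
        return best[0], best[1][:required_gpus]
    return None  # no single node fits; the caller falls back to queuing

def fastest_free_gpu(gpus):
    """S323 sketch: among GPUs not computing other tasks, pick the one
    with the lowest utilization. `gpus` is a list of
    (gpu_id, utilization, busy) tuples."""
    free = [(g, u) for g, u, busy in gpus if not busy]
    return min(free, key=lambda x: x[1])[0] if free else None
```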
The invention also discloses an interaction method of the 3D model, which applies the optimization method; the method specifically comprises the following steps:
s1, the client side obtains the interactive operation instruction input by the user and then transmits the interactive operation instruction to the server side;
s2, the server side completes graph operation according to the obtained interactive operation instruction to obtain the graph variable quantity in the operation process;
s3, the server side forms an image set according to the image variable quantity obtained in the S2 and the frame extraction information, then encodes and compresses the image set according to the video stream, and transmits the compressed image set to the client side;
and S4, decoding the compressed image set received by the client, and continuously playing according to the information of the video stream to realize the feedback of the interactive operation instruction input by the user.
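The server-encode/client-decode round trip of S3 and S4 can be illustrated with generic compression standing in for real video-stream encoding (the patent names no codec); `zlib` and `pickle` here are illustrative stand-ins only:

```python
import pickle
import zlib

def server_encode(frames):
    """S3 sketch: pack a set of rendered frame records and compress
    them, standing in for video-stream encoding (e.g. H.264)."""
    return zlib.compress(pickle.dumps(frames))

def client_decode(payload):
    """S4 sketch: decompress and unpack so the client can replay the
    frames in order as feedback for the user's operation."""
    return pickle.loads(zlib.decompress(payload))
```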
The invention has the beneficial effects that:
1. With the automatic discovery and arrangement method, after a task request is received, a flexible temporary GPU resource pool is dynamically established according to the amount of resources the task requires, responding to different artificial-intelligence and engineering-field computing demands; when the task's computation finishes, the GPU resource group is released and the GPUs return to the shared resource pool, meeting large-scale artificial-intelligence computing-power requirements.
2. The scheduling module's algorithm supports multiple modes, including optimal-solution scheduling, first-in first-out scheduling, and autonomous model-scale recognition with matched scheduling; by analyzing the platform's computing power, it finds idle resources matching the task's conditions, runs the simulation task, completes the computation, releases the resources autonomously, and accepts the next task.
3. The invention optimizes the GPU resource scheduling algorithm: the scheduling system analyzes GPU resource utilization and feeds it back to the user in real time, and tasks submitted by users are automatically allocated to idle GPU resources for computation, so neither CPUs nor additional servers are occupied and performance improves rapidly.
4. The method automatically analyzes the priority of task data, finds an optimal solution by comparing the parameters submitted by the user against a standard library, and then performs priority-algorithm scheduling, meeting the requirement that important tasks execute first.
5. The invention suits all industrial design and simulation fields with professional graphics requirements, covering film and television design, biology and genetics, aerospace, automobile manufacturing and parts, molds, chips and semiconductors, the nuclear industry, industrial and civil electrical appliances, meteorology, and more.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Referring to fig. 1, a GPU resource based scheduling system in the artificial intelligence and engineering fields includes:
the human-computer interaction module, used to let an operator work visually: it transmits the client's instruction for operating the 3D model to the server, and after the server computes the operation amount of the 3D model it feeds the result back to the client, so that, in cloud-computing mode, operating the 3D model at the client is no different from operating it at the server;
the resource management module, used to organize the usable GPU resources into a GPU resource pool composed of a plurality of nodes, each node corresponding to one device; the multi-card ID relationships inside each node's GPUs and changes in each node's available GPU resources are fed back to the pool in real time for dynamic management: if the number of GPUs increases or decreases, the corresponding information, combined with the node information, is fed back to the pool so that it updates that node's GPU count, resources, and related information. Each GPU periodically reports its use state to the resource management module, mainly its idle state, resource utilization rate, and task completion; after receiving a task-completion signal, the resource management module releases the GPU resources so that subsequent tasks can use them.
And the scheduling module is used for carrying out calculation amount budget on the tasks input by the human-computer interaction module and distributing GPU resources according to the calculated calculation amount so as to realize balance between improvement of calculation speed and reasonable utilization of the GPU resources.
The scheduling mode of the scheduling module comprises the following steps:
1. and (4) autonomously identifying the model scale and matching scheduling, and allocating GPU resources for operation according to the input calculation power required by the task budget and the required calculation power. The model budgeting method mainly comprises the following steps:
and according to the model quantity budget, the data quantity of the 3D model is used as a standard, and the calculation quantity required for completing the specified task is budgeted.
And according to the input value budget, taking the calculation amount input by the operator at the client as the budget calculation amount.
And establishing an operand evaluation model according to the operand budget algorithm budget, and training by using a large amount of operation data, so that the operand budget algorithm can automatically estimate the required operand according to the characteristics of the current task.
And dynamically adjusting, namely monitoring the operation progress in the operation process, increasing GPU resources if the operation progress is lower than expected, and properly reducing the GPU resources if the operation progress is higher than expected so as to ensure that each operation task is performed quickly and smoothly.
2. And performing first-in first-out, after the GPU resources required by the task budget are calculated, matching the tasks according to the GPU resources of each node, preferentially distributing the tasks to the same node, preferentially distributing the tasks to idle GPU operation, and when the GPU corresponds to a plurality of tasks, sequentially queuing the tasks for operation.
3. And (3) selecting the GPU resource which is operated fastest on the premise of not influencing the calculation of other tasks according to the task amount and the current GPU resource state.
4. And the priority level is divided according to the emergency, and a preset threshold value is adopted as the dividing basis. When the priorities need to be allocated in the tasks, a priority parameter is directly given to the tasks, the scheduling module compares the priority parameter carried by the tasks with each priority threshold value, the priority level of the tasks is converted, then GPU resources are preferentially used according to the priority level, the priority defaults to allocate the GPU resources according to an optimal solution mode, and for the tasks with higher priorities, the shortest time principle is adopted, and the optimal GPU resources are preferentially occupied. In the priority operation priority task with priority corresponding to the GPU, a plurality of priorities corresponding to the GPU are queued and sequentially calculated according to priority parameter values, and a plurality of priorities corresponding to the GPU with equal priority parameters are queued and operated according to a first-in first-out principle.
The scheduling module of this embodiment utilizes GPU resources to the maximum; in this embodiment it is integrated as JSS FOR GPU. Scheduling in GPU mode differs from traditional CPU scheduling: the module first analyzes the idle states of the GPU resources on the different nodes and the relationships among the multi-card IDs inside each GPU, records them into the JSS GPU resource pool, and feeds them back in real time; then, combining the task's GPU requirements with the resource pool, it can schedule up to 8000 GPU resources across 10000 server nodes at high performance and high speed, realizing orderly task execution, automatic queuing, priority scheduling, on-demand resource allocation, and so on.
A scheduling optimization algorithm in the fields of artificial intelligence and engineering based on GPU resources comprises the following steps:
s100, the resource management module dynamically updates GPU resource pool information, the updating mode is mainly that information is periodically requested for each GPU, each GPU feeds back signals according to the request information, and the GPU resource pool judges whether the GPU can be normally used or not according to the fed-back signals, so that the GPU which cannot be used is found in time, and the GPU is prevented from being used in subsequent task allocation. In addition, when the number of GPUs of each node is increased or decreased, the node sends updating information to the GPU resource pool, and the GPU resource pool updates the GPU resources of the node according to the information. In the operation process, each node collects the state information of the GPU therein and periodically feeds the state information back to the resource management module, the state information of the GPU comprises the idle state of the GPU, the resource utilization rate when the GPU runs and the task completion state, and the resource management module releases the computing power of the GPU after acquiring the task completion state information, so that the operation of the next task is started, and task assignment is carried out on each GPU according to the state of the GPU.
S200, an operator inputs a task needing to be operated at a client through a man-machine interaction module, the task needing to be operated is uploaded to a server, and the server forwards the task to a scheduling module for processing; in this embodiment, the human-computer interaction module transmits an instruction for operating the 3D model by the client to the server, and the server calculates the operand of the 3D model and then feeds the operand back to the client, so that the client operates the large 3D model without the difference from the server.
S300, the scheduling module first checks whether the task carries a priority parameter; if so, the task enters priority scheduling, and if not, it is scheduled in the common scheduling mode.
S310, priority levels are divided by urgency, with preset thresholds as the dividing basis; when a task must be assigned a priority, it is given a priority parameter directly; the scheduling module compares the task's parameter against each priority threshold to convert it to a priority level, and GPU resources are then used in order of priority;
by default, priority tasks are allocated GPU resources in the optimal-solution mode, and higher-priority tasks preferentially occupy the best GPU resources under a shortest-time principle;
when one GPU corresponds to multiple priority tasks, they are queued and computed in order of their priority parameter values; tasks with equal priority parameters queue and run first-in first-out.
S320, the common scheduling modes mainly comprise:
S321, autonomous model-scale recognition and matched scheduling: budget the computing power required by the input task and allocate GPU resources for the operation accordingly. The model budgeting mainly comprises:
model-size budget: using the data volume of the 3D model as the standard, budget the computation required to complete the specified task;
input-value budget: take the operation amount input by the operator at the client as the budgeted operation amount;
algorithm budget: establish an operation-amount estimation model and train it with a large amount of operation data, so that the budgeting algorithm can automatically estimate the required operation amount from the characteristics of the current task;
dynamic adjustment: monitor operation progress during the computation; if progress is below expectation, increase GPU resources, and if above expectation, appropriately reduce them, so that each operation task proceeds quickly and smoothly.
S322, first-in first-out: after budgeting the GPU resources required by the task, match the task against each node's GPU resources, preferentially assigning it to a single node and to idle GPUs; when one GPU corresponds to multiple tasks, the tasks queue and run in order.
S323, optimal solution: according to the task amount and the current GPU resource state, select the fastest-running GPU resources on the premise of not affecting the computation of other tasks.
The invention also discloses an interaction method of the 3D model, which applies the optimization method; the method specifically comprises the following steps:
s1, the client side obtains the interactive operation instruction input by the user and then transmits the interactive operation instruction to the server side;
s2, the server side completes graph operation according to the obtained interactive operation instruction to obtain the graph variable quantity in the operation process;
s3, the server side forms an image set according to the image variable quantity obtained in the S2 and the frame extraction information, then encodes and compresses the image set according to the video stream, and transmits the compressed image set to the client side;
and S4, decoding the compressed image set received by the client, and continuously playing according to the information of the video stream to realize the feedback of the interactive operation instruction input by the user.
Matters not described in detail herein are well known to those skilled in the art.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (9)

1. A scheduling system based on GPU resources in the fields of artificial intelligence and engineering is characterized by comprising:
the human-computer interaction module, used to let an operator work visually: it transmits the client's instruction for operating the 3D model to the server, and after the server computes the operation amount of the 3D model it feeds the result back to the client, so that, in cloud-computing mode, operating the 3D model at the client is no different from operating it at the server;
the resource management module, used to organize the usable GPU resources into a GPU resource pool composed of a plurality of nodes, each node corresponding to one device; the multi-card ID relationships inside each node's GPUs and changes in each node's available GPU resources are fed back to the pool in real time for dynamic management; each GPU periodically reports its use state, including idle state, resource utilization rate, and task completion, and the resource management module releases GPU resources once it receives a task-completion signal so that subsequent tasks can use them;
and the scheduling module, used to budget the computation amount of the tasks input through the human-computer interaction module and allocate GPU resources according to the computed amount, balancing computation speed against reasonable utilization of GPU resources.
2. The GPU-resource-based scheduling system for the artificial intelligence and engineering fields of claim 1, wherein the scheduling modes of the scheduling module comprise:
autonomous model-scale identification and matched scheduling: estimating the computing power required by the input task and allocating GPU resources for the computation according to the required computing power;
node matching: after the GPU resources required by the task have been estimated, matching the task against the GPU resources of each node, preferentially assigning the task to a single node and preferentially to an idle GPU; when one GPU corresponds to a plurality of tasks, the tasks are queued and computed in sequence;
optimal solution: according to the task amount and the current GPU resource state, selecting the fastest-running GPU resources on the premise of not affecting the computation of other tasks;
priority: priority levels are divided according to urgency, with preset thresholds as the dividing basis; when a priority needs to be assigned to a task, a priority parameter is attached to the task directly; the scheduling module compares the priority parameter carried by the task against each priority threshold to convert it into the task's priority level, and GPU resources are used preferentially according to that level; by default each priority allocates GPU resources in the optimal-solution mode, and under the shortest-time principle tasks of higher priority preferentially occupy the optimal GPU resources;
when a GPU preferentially runs priority tasks, the plurality of priority tasks corresponding to that GPU are queued and computed in sequence according to their priority parameter values, and priority tasks with equal priority parameters are queued and run on a first-in-first-out basis.
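The priority mechanism described above can be illustrated with a minimal sketch: preset thresholds map a task's raw priority parameter onto a level, and the tasks queued on one GPU run in descending priority-parameter order, first-in-first-out among equal parameters. The threshold values and all class and method names here are illustrative assumptions, not part of the claim.

```python
from itertools import count

# Hypothetical thresholds dividing the raw priority parameter into levels
# (the claim leaves the concrete values unspecified).
PRIORITY_THRESHOLDS = [(100, "urgent"), (50, "high"), (0, "normal")]

def priority_level(param):
    """Map a task's raw priority parameter onto a level via the thresholds."""
    for threshold, level in PRIORITY_THRESHOLDS:
        if param >= threshold:
            return level
    return "normal"

class GpuQueue:
    """Per-GPU queue: higher priority parameter first, FIFO among equals."""
    def __init__(self):
        self._seq = count()   # arrival order, breaks ties first-in-first-out
        self._tasks = []

    def submit(self, task, priority_param=0):
        self._tasks.append((-priority_param, next(self._seq), task))
        self._tasks.sort()    # order: priority descending, arrival ascending

    def next_task(self):
        return self._tasks.pop(0)[2] if self._tasks else None
```

Sorting on `(-priority_param, arrival_sequence)` realizes both rules of the claim in one comparison key.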
3. The system of claim 2, wherein the methods for estimating the computation amount mainly comprise:
model-quantity estimation: estimating the computation required to complete the specified task using the data volume of the 3D model as the standard;
input-value estimation: taking the operation amount input by the operator at the client as the estimated operation amount;
algorithmic estimation: establishing a computation-amount estimation model and training it with a large amount of operation data, so that the estimation algorithm can automatically estimate the required computation amount from the characteristics of the current task;
and dynamic adjustment: monitoring the operation progress during computation, increasing the GPU resources if the progress is lower than expected and appropriately reducing them if it is higher than expected, so as to ensure that every operation task proceeds quickly and smoothly.
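The "dynamic adjustment" item above amounts to a simple feedback rule: grow the allocation when the task runs behind schedule, shrink it when it runs ahead. A minimal sketch, with `tolerance` and `max_gpus` as assumed parameters that the claim does not specify:

```python
def adjust_gpu_allocation(allocated_gpus, progress, expected_progress,
                          tolerance=0.05, max_gpus=8):
    """Return a new GPU count for a running task.

    Behind schedule by more than `tolerance`  -> add one GPU (up to max_gpus).
    Ahead of schedule by more than `tolerance` -> release one GPU (keep >= 1).
    Otherwise leave the allocation unchanged.
    """
    if progress < expected_progress - tolerance:
        return min(allocated_gpus + 1, max_gpus)   # behind: grow
    if progress > expected_progress + tolerance and allocated_gpus > 1:
        return allocated_gpus - 1                  # ahead: shrink
    return allocated_gpus
```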
4. A scheduling optimization algorithm based on GPU resources for the artificial intelligence and engineering fields, characterized by comprising the following steps:
S100, the resource management module dynamically updates the GPU resource pool information;
S200, an operator inputs a task to be computed at the client through the human-computer interaction module; the task is uploaded to the server, and the server forwards it to the scheduling module for processing;
S300, the scheduling module first checks whether the task carries a priority parameter; if it does, the task enters priority scheduling, and if not, the task is scheduled in the common scheduling mode.
5. The GPU-resource-based scheduling optimization algorithm for the artificial intelligence and engineering fields of claim 4, wherein S100 further comprises:
S110, periodically requesting information from each GPU, each GPU feeding back a signal according to the request; the GPU resource pool judges from the returned signal whether the GPU can be used normally, so that unusable GPUs are discovered in time and excluded from subsequent task allocation;
S120, when the number of GPUs in a node increases or decreases, the node sends update information to the GPU resource pool, and the GPU resource pool updates that node's GPU resources according to the information;
and S130, during computation each node collects the state information of its GPUs and periodically feeds it back to the resource management module; the state information of a GPU includes its idle state, its resource utilization while running, and its task completion state; after receiving task-completion information the resource management module releases that GPU's computing power so that the next task can start, and tasks are assigned to each GPU according to its state.
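Steps S110 to S130 can be sketched as a resource pool that records periodic status reports, treats GPUs that have not answered within a timeout as unusable, and frees a GPU when its task completes. All field names and the timeout value are illustrative assumptions:

```python
import time

class GpuResourcePool:
    """Sketch of the pool of S110-S130: nodes report GPU status periodically,
    unresponsive GPUs drop out of the usable set, completed tasks release
    their GPU for the next task."""
    def __init__(self, heartbeat_timeout=30.0):
        self.heartbeat_timeout = heartbeat_timeout
        self.gpus = {}   # (node_id, gpu_id) -> status dict

    def report(self, node_id, gpu_id, idle, utilization, task_done):
        """Record one periodic status report from a node (S130)."""
        key = (node_id, gpu_id)
        self.gpus[key] = {"idle": idle, "utilization": utilization,
                          "last_seen": time.monotonic()}
        if task_done:
            self.release(key)

    def release(self, key):
        self.gpus[key]["idle"] = True   # free the GPU for subsequent tasks

    def usable(self):
        """GPUs that answered a status request within the timeout (S110)."""
        now = time.monotonic()
        return [k for k, s in self.gpus.items()
                if now - s["last_seen"] < self.heartbeat_timeout]
```

Adding or removing a `(node_id, gpu_id)` entry covers the node-level updates of S120.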
6. The GPU-resource-based scheduling optimization algorithm for the artificial intelligence and engineering fields of claim 4, wherein S300 further comprises:
S310, dividing priority levels according to urgency, with preset thresholds as the dividing basis; when a priority needs to be assigned to a task, a priority parameter is attached to the task directly; the scheduling module compares the priority parameter carried by the task against each priority threshold to convert it into the task's priority level, and GPU resources are then used preferentially according to that level;
by default each priority allocates GPU resources in the optimal-solution mode, and under the shortest-time principle tasks of higher priority preferentially occupy the optimal GPU resources;
when a GPU preferentially runs priority tasks, the plurality of priority tasks corresponding to that GPU are queued and computed in sequence according to their priority parameter values, and priority tasks with equal priority parameters are queued and run on a first-in-first-out basis.
7. The GPU-resource-based scheduling optimization algorithm for the artificial intelligence and engineering fields of claim 4, wherein S300 further comprises: S320, the common scheduling mode comprising:
S321, autonomous model-scale identification and matched scheduling: estimating the computing power required by the input task and allocating GPU resources for the computation according to the required computing power;
S322, after the GPU resources required by the task have been estimated, matching the task against the GPU resources of each node, preferentially assigning the task to a single node and preferentially to an idle GPU; when one GPU corresponds to a plurality of tasks, the tasks are queued and computed in sequence;
and S323, optimal solution: according to the task amount and the current GPU resource state, selecting the fastest-running GPU resources on the premise of not affecting the computation of other tasks.
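The common scheduling of S321 to S323 — estimate the compute a task needs, match it against each node's resources, prefer idle GPUs, and pick the fastest candidate — can be sketched as follows. The `nodes` layout and the GFLOPS unit are assumptions made for illustration:

```python
def schedule_task(required_gflops, nodes):
    """Common-scheduling sketch (S321-S323).

    `nodes` maps node id -> list of (gpu_id, free_gflops, idle, speed)
    tuples (an assumed layout). A GPU qualifies if it can hold the task;
    among qualifying GPUs, idle ones win first, then the fastest one
    (the shortest-time rule of the optimal solution).
    Returns (node_id, gpu_id) or None if no GPU fits.
    """
    best = None
    for node_id, gpus in nodes.items():
        for gpu_id, free, idle, speed in gpus:
            if free < required_gflops:
                continue                  # this GPU cannot fit the task
            key = (idle, speed)           # rank: idle first, then speed
            if best is None or key > best[0]:
                best = (key, node_id, gpu_id)
    return (best[1], best[2]) if best else None
```

Because each candidate keeps its node id, a task that fits on one node is never split across nodes, matching the single-node preference of S322.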
8. A method for interacting with 3D models, characterized in that the optimization algorithm of any one of claims 4 to 7 is applied.
9. The method for interacting with 3D models of claim 8, characterized by comprising the following steps:
S1, the client obtains an interactive operation instruction input by a user and transmits it to the server;
S2, the server completes the graphics computation according to the received interactive operation instruction and obtains the graphics variation produced during the operation;
S3, the server forms an image set from the graphics variation obtained in S2 and the frame-extraction information, then encodes and compresses the image set as a video stream and transmits the compressed image set to the client;
and S4, the client decodes the received compressed image set and plays it continuously according to the video-stream information, thereby presenting feedback for the interactive operation instruction input by the user.
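One round of the S1 to S4 interaction loop of claim 9 can be sketched as a client-server exchange in which the real video codec is abstracted behind `encode`/`decode`; every method name on `server` and `client` here is hypothetical, not taken from the patent:

```python
def interaction_round(instruction, server, client):
    """Run one S1-S4 round of the 3D-model interaction method.

    `server` and `client` are assumed to expose the named methods; in a
    real system `encode`/`decode` would wrap a video-stream codec."""
    deltas = server.compute_graphics(instruction)   # S2: graphics variation
    frames = server.extract_frames(deltas)          # S3: form the image set
    stream = server.encode(frames)                  # S3: compress as a stream
    images = client.decode(stream)                  # S4: decode at the client
    client.play(images)                             # S4: continuous playback
    return images
```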
CN202111081087.6A 2021-08-09 2021-09-15 Scheduling system and optimization algorithm based on GPU resources in artificial intelligence and engineering fields Pending CN113791906A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110910166 2021-08-09
CN2021109101667 2021-08-09

Publications (1)

Publication Number Publication Date
CN113791906A true CN113791906A (en) 2021-12-14

Family

ID=79183549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111081087.6A Pending CN113791906A (en) 2021-08-09 2021-09-15 Scheduling system and optimization algorithm based on GPU resources in artificial intelligence and engineering fields

Country Status (1)

Country Link
CN (1) CN113791906A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114416381A (en) * 2022-03-28 2022-04-29 维塔科技(北京)有限公司 Processing resource over-partitioning method, device, equipment and storage medium
CN114820279A (en) * 2022-05-18 2022-07-29 北京百度网讯科技有限公司 Distributed deep learning method and device based on multiple GPUs and electronic equipment
CN115220921A (en) * 2022-09-19 2022-10-21 浙江大华技术股份有限公司 Resource scheduling method, image processor, image pickup device, and medium
CN115269159A (en) * 2022-09-27 2022-11-01 苏州美集供应链管理股份有限公司 Scheduling system and method based on artificial intelligence and edge calculation support
CN115658311A (en) * 2022-10-31 2023-01-31 北京百度网讯科技有限公司 Resource scheduling method, device, equipment and medium
CN117170878A (en) * 2023-10-31 2023-12-05 北京蓝耘科技股份有限公司 Method for dynamically adjusting CPU and GPU caches
WO2024055168A1 (en) * 2022-09-13 2024-03-21 华为技术有限公司 Resource allocation method, processor, and computing platform
CN117785491A (en) * 2024-02-28 2024-03-29 北京蓝耘科技股份有限公司 GPU cloud computing resource management method, system and storage medium
CN117785491B (en) * 2024-02-28 2024-05-28 北京蓝耘科技股份有限公司 GPU cloud computing resource management method, system and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110198244A (en) * 2019-06-19 2019-09-03 北京百度网讯科技有限公司 Resource allocation method and device towards isomery cloud service
CN111176852A (en) * 2020-01-15 2020-05-19 上海依图网络科技有限公司 Resource allocation method, device, chip and computer readable storage medium
CN111258767A (en) * 2020-01-22 2020-06-09 中国人民解放军国防科技大学 Intelligent cloud computing resource allocation method and device for complex system simulation application
CN111367679A (en) * 2020-03-31 2020-07-03 中国建设银行股份有限公司 Artificial intelligence computing power resource multiplexing method and device
CN111399976A (en) * 2020-03-02 2020-07-10 上海交通大学 GPU virtualization implementation system and method based on API redirection technology
CN111966504A (en) * 2020-10-23 2020-11-20 腾讯科技(深圳)有限公司 Task processing method in graphics processor and related equipment
CN112416585A (en) * 2020-11-20 2021-02-26 南京大学 GPU resource management and intelligent scheduling method for deep learning
CN113157413A (en) * 2021-04-16 2021-07-23 上海交通大学 Deep learning task resource optimization configuration method and system based on service quality requirement

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Yunquan et al., "The Troika of Artificial Intelligence: Big Data, Computing Power and Algorithms", 31 July 2021, Scientific and Technical Documentation Press, pages 108-109 *


Similar Documents

Publication Publication Date Title
CN113791906A (en) Scheduling system and optimization algorithm based on GPU resources in artificial intelligence and engineering fields
CN112465129B (en) On-chip heterogeneous artificial intelligent processor
CN107888669B (en) Deep learning neural network-based large-scale resource scheduling system and method
CN105718479B (en) Execution strategy generation method and device under cross-IDC big data processing architecture
US9946563B2 (en) Batch scheduler management of virtual machines
CN104794194B (en) A kind of distributed heterogeneous concurrent computational system towards large scale multimedia retrieval
CN112463709A (en) Configurable heterogeneous artificial intelligence processor
CN111679904B (en) Task scheduling method and device based on edge computing network
CN112181613B (en) Heterogeneous resource distributed computing platform batch task scheduling method and storage medium
US9104491B2 (en) Batch scheduler management of speculative and non-speculative tasks based on conditions of tasks and compute resources
WO2021136512A1 (en) Method and device for scheduling on basis of deep learning node computation, and storage medium
CN105378668B (en) The interruption of operating system management in multicomputer system guides
CN104123182A (en) Map Reduce task data-center-across scheduling system and method based on master-slave framework
CN111209077A (en) Deep learning framework design method
CN110389816A (en) Method, apparatus and computer program product for scheduling of resource
CN108132840B (en) Resource scheduling method and device in distributed system
CN113867907A (en) CPU resource-based scheduling system and optimization algorithm in engineering field
CN111506434A (en) Task processing method and device and computer readable storage medium
CN113316116A (en) Vehicle calculation task unloading method based on multi-arm gambling machine
CN116048721A (en) Task allocation method and device for GPU cluster, electronic equipment and medium
CN115292046A (en) Calculation force distribution method and device, storage medium and electronic equipment
CN115543624A (en) Heterogeneous computing power arrangement scheduling method, system, equipment and storage medium
CN115586961A (en) AI platform computing resource task scheduling method, device and medium
CN111309488A (en) Method and system for sharing computing resources of unmanned aerial vehicle cluster and computer storage medium
CN114327894A (en) Resource allocation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination