CN118170515A - Processor task package scheduling device and method - Google Patents

Info

Publication number
CN118170515A
Authority
CN
China
Prior art keywords
task package
dependency
package
execution
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410288271.5A
Other languages
Chinese (zh)
Inventor
Name withheld at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Bi Ren Technology Co ltd
Original Assignee
Shanghai Bi Ren Technology Co ltd
Filing date
Publication date
Application filed by Shanghai Bi Ren Technology Co ltd

Abstract

The application provides a processor task package scheduling device and method, relating to the field of chip design verification. The device comprises: a task package grabbing module, used for receiving a current task package; and a task package distribution module, used for determining, based on execution dependency information of the current task package and a dependency table, a target task package that has an execution dependency relationship with the current task package and an execution result of the target task package, and for distributing the current task package based on the execution result of the target task package. The dependency table is used for recording the execution result of each task package. The device and method provided by the application improve the execution efficiency of task packages and the performance of the processor.

Description

Processor task package scheduling device and method
Technical Field
The application relates to the field of chip design verification, in particular to a processor task package scheduling device and method.
Background
With the rapid development of artificial intelligence (Artificial Intelligence, AI) technology, graphics processors (Graphics Processing Unit, GPU) have become an indispensable tool for accelerating complex computations. Compared with conventional central processing units (Central Processing Unit, CPU), graphics processors possess a large number of cores. This abundance of cores enables a graphics processor to process large amounts of data simultaneously, making it well suited for parallel computing such as image and video processing, deep learning, and scientific simulation.
Graphics processors typically execute multiple task packages concurrently. These task packages may have a sequential dependency in their execution order (for example, the execution of a later task package depends on the execution of an earlier one). If the task packages cannot be scheduled reasonably, they are executed out of order, and data disorder, resource contention, system errors and the like occur, which reduces the execution efficiency of the task packages and the performance of the graphics processor.
Therefore, how to improve the execution efficiency of task packages and improve the performance of processors is a technical problem to be solved in the industry.
Disclosure of Invention
The application provides a task package scheduling device and method for a processor, which are used for solving the technical problem of how to improve the execution efficiency of task packages and the performance of the processor.
The application provides a processor task package scheduling device, which comprises:
The task packet grabbing module is used for receiving the current task packet;
The task package distribution module is used for determining a target task package with an execution dependency relationship with the current task package and an execution result of the target task package based on the execution dependency information of the current task package and the dependency table; distributing the current task package based on the execution result of the target task package;
The dependency table is used for recording the execution result of each task package.
In some embodiments, the execution dependency information of the current task package includes a dependency identification bit, a current task package number, and a target task package number.
In some embodiments, the task package distribution module includes:
a dependency table storage sub-module for storing the dependency table;
The dependency comparison sub-module is used for, under the condition that the dependency identification bit of the current task package is a preset valid value, setting the identification bit corresponding to the current task package number in the dependency table to a first preset value, and determining the execution result of the target task package based on the value of the identification bit corresponding to the target task package number in the dependency table; and distributing the current task package based on the execution result of the target task package;
Wherein the dependency table comprises a plurality of identification bits; the identification bits correspond one-to-one to the task package numbers of the task packages; the value of an identification bit is used for representing the execution result of the corresponding task package; the first preset value indicates that execution of the task package is not completed.
In some embodiments, the task package distribution module further includes a task package cache sub-module; correspondingly, the dependency comparison submodule is specifically configured to:
Storing the current task package to the task package caching submodule under the condition that the execution result of the target task package is incomplete, continuously detecting the identification bit corresponding to the target task package number in the dependency table, and determining the execution result of the target task package based on the value of the identification bit corresponding to the target task package number;
And distributing the current task package under the condition that the execution result of the target task package is that the execution is completed.
In some embodiments, the dependency comparison sub-module is further to:
Monitoring the current task package; under the condition that the execution of the current task package is completed, setting a corresponding identification bit of a task package number of the current task package in the dependency table as a second preset value;
the second preset value is used for indicating that the task package execution is completed.
In some embodiments, the dependency comparison sub-module is further to:
under the condition that the dependency identification bit of the current task package is a preset invalid value, determining that there is no target task package with which the current task package has an execution dependency relationship; and distributing the current task package.
The application provides a command processor, which comprises the processor task packet scheduling device.
The application provides a processor, comprising the command processor.
The application provides a method for scheduling a task packet of a processor, which comprises the following steps:
Receiving a current task package;
Determining a target task package with an execution dependency relationship with the current task package and an execution result of the target task package based on the execution dependency information of the current task package and a dependency table; distributing the current task package based on the execution result of the target task package;
The dependency table is used for recording the execution result of each task package.
The application provides an electronic device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the task package scheduling method of the processor when executing the program.
The application provides a processor task package scheduling device and method. The device comprises a task package grabbing module and a task package distribution module. Execution dependency information is written in the current task package, and a dependency table that records the execution result of each task package is provided in the task package distribution module. Using the execution dependency information and the dependency table, the target task package that has an execution dependency relationship with the current task package, and its execution result, can be determined, and the current task package is distributed according to that execution result. Task packages can thus be distributed reasonably according to the execution dependency relationships among them during concurrent processing, which avoids data disorder, resource contention, system errors and the like, and improves both the execution efficiency of the task packages and the performance of the processor.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the application or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the application, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a task packet scheduler of a processor according to the present application;
FIG. 2 is a schematic diagram of a task package according to the present application;
FIG. 3 is a schematic diagram of a task package distribution module according to the present application;
FIG. 4 is a second schematic diagram of a task package distribution module according to the present application;
FIG. 5 is a schematic diagram of a command processor according to the present application;
FIG. 6 is a schematic diagram of a processor according to the present application;
FIG. 7 is a flow chart of a method for scheduling task packages of a processor according to the present application;
FIG. 8 is a task package scheduling effect diagram provided by the present application;
FIG. 9 is a second flowchart of a task packet scheduling method according to the present application;
FIG. 10 is a schematic structural diagram of an electronic device provided by the present application.
Detailed Description
In order that those skilled in the art may better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like herein are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the present application, a processor refers to a device having processing capabilities. For example, the processor may be any one of a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and a GPGPU (General-Purpose computing on Graphics Processing Unit). The following embodiments are illustrated with a graphics processor (GPU) merely to facilitate understanding of the processor to which the present application relates, and are not intended to unnecessarily limit the type of processor.
Graphics processors are processors designed specifically for processing large volumes of data such as graphics, and their design goal is to provide efficient computing power when processing large-scale, highly concurrent graphics data. A GPU has more processing cores than a CPU, and each core has relatively weak computing power, but together they can process a large amount of data at the same time, thereby providing higher computing efficiency.
Structurally, GPUs typically comprise:
A processor unit (Processor Unit, PU), also called a stream processor (Stream Processor, SP), which is the core computing unit of the GPU and performs computing operations;
A video memory (Graphics Memory), for storing graphics data or graphics-related data such as textures, as well as intermediate results required during GPU computation;
A memory controller (Memory Controller), for controlling read and write operations on the video memory;
A command processor (Command Processor, CP), for controlling the processor units inside the GPU and coordinating data transmission and communication between the GPU and the CPU;
A display output (Display Output), for outputting the image data processed by the GPU to a display.
Among these, the command processor plays a very important role in the scheduling of task packages.
FIG. 1 is a schematic structural diagram of the processor task package scheduling device provided by the present application. As shown in FIG. 1, the processor task package scheduling device 100 includes a task package grabbing module 110 and a task package distribution module 120.
The task package grabbing module is used for receiving the current task package;
The task package distribution module is used for determining a target task package with an execution dependency relationship with the current task package and an execution result of the target task package based on the execution dependency information of the current task package and the dependency table; distributing the current task package based on the execution result of the target task package;
The dependency table is used for recording the execution result of each task package.
Specifically, the processor task packet scheduling device provided by the embodiment of the application can be understood as an internal component of a command processor in a graphics processor, and can also be understood as the command processor itself.
In terms of functional structure, the processor task package scheduling device comprises a task package grabbing module and a task package distribution module that are connected to each other.
The task package grabbing module is mainly used for sending task package grabbing commands and receiving task packages. A task packet refers to a data packet containing instructions or a combination of instructions. The instructions in the task package are used to define operations such as parallel computation, data transfer, and graphics rendering performed in the graphics processor. Instructions are typically generated by a driver of the graphics processor or a central processing unit and sent to the graphics processor by way of communication.
The task package distribution module is mainly used for distributing task packages, namely, sending the task packages received by the task package grabbing module to each processor unit (task package execution unit) in the graphics processor for execution.
Concurrently executing task packages may have execution dependencies. An execution dependency is an ordering constraint between instructions: the execution of one instruction depends on the execution results of other instructions. The execution dependencies determine the order and concurrency of instruction execution. More specifically, the types of execution dependencies may include data dependencies, control dependencies, resource dependencies, and the like.
A data dependency means that there is a data dependence between instructions, so that a subsequent instruction needs to wait for a previous instruction to complete and produce its result before executing. A control dependency means that there is a control-flow dependence between instructions, so that the execution order is influenced by control structures such as conditional branches. A resource dependency means that instructions compete for access to shared resources (such as memories and registers), which requires reasonable resource allocation and management.
For example, the first task package functions to load data and the second task package functions to read data. The execution dependency relationship exists between the first task package and the second task package, that is, the second task package needs to wait for the first task package to finish executing before the correct data can be read. If the second task package is executed in advance under the condition that the first task package is not executed, the second task package reads the wrong data.
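The hazard described in this example can be illustrated with a minimal sketch. The shared location `memory["buf"]`, the value 42, and the function names are hypothetical and chosen only for illustration; they do not come from the application.

```python
# Hypothetical two-package example: names and values are illustrative only.
memory = {"buf": None}  # shared location touched by both task packages


def load_task():
    """First task package: loads data into the shared location."""
    memory["buf"] = 42


def read_task():
    """Second task package: reads whatever is currently stored."""
    return memory["buf"]


def run(order):
    """Run the task packages in the given order; return what the reader saw."""
    memory["buf"] = None  # reset to the "not yet loaded" state
    seen = None
    for task in order:
        out = task()
        if task is read_task:
            seen = out
    return seen


in_order = run([load_task, read_task])      # dependency respected
out_of_order = run([read_task, load_task])  # reader executed too early
```

When the reader runs first it observes the unloaded placeholder rather than the loaded data, which is the "wrong data" case the text describes.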
The execution dependency information is information describing the execution dependency relationships among task packages. It may specifically include whether the task package has a dependency relationship with other task packages, identification information of the task package itself, identification information of the other task packages, and the like, and may also include other information such as the type of the execution dependency relationship.
The execution dependency information may be pre-written or recorded in the task package. If the task package is transmitted in the form of a data message, a plurality of fields may be selected in the data message for recording the execution dependency information. For example, 3 fields may be selected and used to record, respectively, whether there is a dependency relationship with other task packages, identification information of the task package itself, and identification information of the other task packages.
A dependency table may also be provided in the task package distribution module. The dependency table may be implemented by providing registers or memory in the task package distribution module. The memory may be a static random-access memory (Static Random-Access Memory, SRAM).
The dependency table is used for recording the execution result of each task package. For example, the dependency table may include a plurality of identification bits, each of which corresponds one-to-one to a task package and identifies the execution result of that task package. The execution result may be either execution completed or execution not completed. Execution completed may be represented by 0 and execution not completed by 1. Writing 0 to the address corresponding to an identification bit in the dependency table stores 0 at that address and indicates that execution of the corresponding task package is completed; writing 1 to that address stores 1 and indicates that execution of the corresponding task package is not completed. In an embodiment of the present application, execution not completed may mean that the task package is executing or that the task package has not yet been distributed.
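As a minimal software sketch of such a table, the following models one identification bit per address. The class name `DependencyTable`, its methods, the table size, and the initial value of each bit are assumptions made for illustration; the application does not specify them.

```python
INCOMPLETE = 1  # value the text uses for "execution not completed"
COMPLETE = 0    # value the text uses for "execution completed"


class DependencyTable:
    """One identification bit per task package, indexed by address."""

    def __init__(self, size):
        # Assumption: every entry starts as COMPLETE; the source does not
        # state an initial value.
        self.bits = [COMPLETE] * size

    def write(self, address, value):
        """Store 0 or 1 at the address of an identification bit."""
        if value not in (COMPLETE, INCOMPLETE):
            raise ValueError("identification bit must be 0 or 1")
        self.bits[address] = value

    def is_complete(self, address):
        """Return True if the task package at this address has completed."""
        return self.bits[address] == COMPLETE


table = DependencyTable(size=8)
table.write(3, INCOMPLETE)           # package 3 is not distributed or is executing
while_running = table.is_complete(3)
table.write(3, COMPLETE)             # package 3 has finished
after_finish = table.is_complete(3)
```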
The processor task packet scheduling device continuously schedules the received task packets. The current task package is the task package which needs to be scheduled currently. The operation mechanism of the task packet scheduling device of the processor will be described below by taking the current task packet as an example.
The task package grabbing module sends the received current task package to the task package distributing module. The task package distribution module can analyze the current task package to obtain the execution dependency information of the current task package. According to the execution dependency information, the task package distribution module can determine whether the current task package has a dependency relationship with other task packages. The target task package is a task package with an execution dependency relationship with the current task package.
If it is determined that the current task package has a dependency relationship with other task packages, that is, it is determined that the target task package exists, the task package distribution module may further obtain, according to the execution dependency information, identification information of the current task package and identification information of the target task package.
The execution of the current task package depends on the execution result of the target task package; that is, the task package distribution module can distribute the current task package only when the execution of the target task package is completed. Therefore, the task package distribution module can read the dependency table to determine the execution result of the target task package, for example by reading the value of the identification bit corresponding to the target task package in the dependency table.
If the execution result of the target task package is that execution is completed, the task package distribution module can distribute the current task package immediately; if the execution result of the target task package is that execution is not completed, the task package distribution module can suspend distribution of the current task package and distribute it after the execution of the target task package is completed.
In addition, considering that a subsequent task package may depend on the execution result of the current task package, the task package distribution module may further rewrite the dependency table according to the execution status of the current task package. For example, while the current task package has not been distributed or is being executed (that is, its execution is not completed), the task package distribution module can write 1 to the corresponding identification bit in the dependency table; when the execution of the current task package is completed, the task package distribution module can write 0 to the corresponding identification bit in the dependency table.
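The mapping from a task package's status to the bit written into the dependency table can be sketched as follows. The `State` enumeration, the dictionary-based table, and the package number 5 are hypothetical names and values used only for illustration.

```python
from enum import Enum


class State(Enum):
    NOT_DISTRIBUTED = "not distributed"
    EXECUTING = "executing"
    COMPLETED = "completed"


def identification_bit(state):
    """Map a task package state to the bit written into the dependency table."""
    # Both "not distributed" and "executing" count as execution not completed.
    return 0 if state is State.COMPLETED else 1


table = {}  # hypothetical table keyed by task package number


def record(package_number, state):
    """Rewrite the table entry for one task package."""
    table[package_number] = identification_bit(state)


history = []
for state in (State.NOT_DISTRIBUTED, State.EXECUTING, State.COMPLETED):
    record(5, state)
    history.append(table[5])
```

The bit only changes to 0 on the final transition, so a later task package that depends on package 5 sees it as not completed for the whole time before that.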
The task scheduling device provided by the embodiment of the application comprises a task package grabbing module and a task package distribution module. Execution dependency information is written in the current task package, and a dependency table that records the execution result of each task package is provided in the task package distribution module. Using the execution dependency information and the dependency table, the target task package that has an execution dependency relationship with the current task package, and its execution result, can be determined, and the current task package is distributed according to that execution result. Task packages can thus be distributed reasonably according to the execution dependency relationships among them during concurrent processing, which avoids data disorder, resource contention, system errors and the like, and improves both the execution efficiency of the task packages and the performance of the processor.
In some embodiments, the execution dependency information for the current task package includes a dependency identification bit, a current task package number, and a target task package number.
Specifically, FIG. 2 is a schematic structural diagram of a task package provided by the present application. As shown in FIG. 2, the task package includes at least a dependency identification bit, a current task package number, a target task package number, and an instruction portion. The instruction portion is used for storing the instructions to be distributed.
The dependency identification bit is used for indicating the dependency state of the current task package, namely whether it has a dependency relationship with other task packages. The identification bit may be represented by a single bit or by a combination of bits. For example, the dependency identification bit field in the task package may be named Valid bit and represented by 1 bit, with 1 as the preset valid value and 0 as the preset invalid value. When the value of the dependency identification bit is 1, the current task package has a dependency relationship with another task package; when the value is 0, the current task package has no dependency relationship with other task packages.
The current task package number is used for identifying the current task package in the concurrently executed task packages, and can be represented by adopting a combination of a plurality of bits, and the number of the bits can be determined according to the number of the concurrently executed task packages. The field of the current task package number in the task package may be named ID Num.
The target task package number is used for identifying the target task package corresponding to the current task package, and can also be represented by adopting a combination of a plurality of bits, and the number of bits is consistent with that of the current task package number. The field of the target task package number in the task package may be named info.
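The three dependency fields can be sketched as a packed header. The field names follow the text (Valid bit, ID Num, info), but the 4-bit width of the two numbers and the field order within the word are assumptions; the application leaves both open.

```python
from dataclasses import dataclass

ID_BITS = 4  # assumed width: enough for 16 concurrently executing task packages
MASK = (1 << ID_BITS) - 1


@dataclass(frozen=True)
class DependencyHeader:
    valid: int   # dependency identification bit ("Valid bit")
    id_num: int  # current task package number ("ID Num")
    info: int    # target task package number ("info")


def pack(header):
    """Pack the fields into one integer laid out as [valid | id_num | info]."""
    if header.valid not in (0, 1):
        raise ValueError("valid must be 0 or 1")
    if not 0 <= header.id_num <= MASK or not 0 <= header.info <= MASK:
        raise ValueError("task package number out of range")
    return (header.valid << (2 * ID_BITS)) | (header.id_num << ID_BITS) | header.info


def unpack(word):
    """Recover the three fields from a packed integer."""
    return DependencyHeader(
        valid=(word >> (2 * ID_BITS)) & 1,
        id_num=(word >> ID_BITS) & MASK,
        info=word & MASK,
    )


header = DependencyHeader(valid=1, id_num=5, info=2)
word = pack(header)      # 0b1_0101_0010
decoded = unpack(word)
```

Both numbers share the same width here, matching the statement that the target task package number uses the same number of bits as the current task package number.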
The task scheduling device provided by the embodiment of the application specifies the execution dependency information of a task package as the dependency identification bit, the current task package number and the target task package number. This indicates whether an execution dependency relationship exists and also makes it easy to identify the corresponding task packages. Because the execution dependency information is recorded in the task package itself, it is easy to parse, which makes the execution dependency information more convenient to determine, helps speed up task package scheduling, and improves the execution efficiency of the task packages.
In some embodiments, the task package distribution module includes:
The dependency table storage submodule is used for storing the dependency table;
The dependency comparison sub-module is used for, under the condition that the dependency identification bit of the current task package is a preset valid value, setting the identification bit corresponding to the current task package number in the dependency table to a first preset value, and determining the execution result of the target task package based on the value of the identification bit corresponding to the target task package number in the dependency table; and distributing the current task package based on the execution result of the target task package;
wherein the dependency table comprises a plurality of identification bits; the identification bits correspond one-to-one to the task package numbers of the task packages; the value of an identification bit is used for representing the execution result of the corresponding task package; the first preset value indicates that execution of the task package is not completed.
Specifically, FIG. 3 is one of the schematic structural diagrams of the task package distribution module provided in the present application. As shown in FIG. 3, the task package distribution module 120 may include a dependency table storage sub-module 121 and a dependency comparison sub-module 122.
The dependency table may include a plurality of identification bits. The identification bits correspond one-to-one to the task package numbers of the task packages. The value of an identification bit may be used to represent the execution result of the corresponding task package. The first preset value may be selected to be 1, which indicates that execution of the task package is not completed; this covers the cases in which the task package has not yet been distributed or is being executed.
The dependency table may be stored in the dependency table storage sub-module. The dependency table storage sub-module may be implemented by a register or memory. For example, when the dependency table storage sub-module is implemented by a register, the bits in the register correspond one-to-one to the task packages, and the address of each bit can be matched with the task package number (ID Num).
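When the table is held in a single register, matching the bit address to ID Num reduces each table access to a shift and a mask. The following sketch models the register as a Python integer; the helper names are hypothetical.

```python
def set_bit(register, id_num):
    """Write 1 (execution not completed) at bit position id_num."""
    return register | (1 << id_num)


def clear_bit(register, id_num):
    """Write 0 (execution completed) at bit position id_num."""
    return register & ~(1 << id_num)


def read_bit(register, id_num):
    """Read the identification bit for task package id_num."""
    return (register >> id_num) & 1


register = 0                       # assumed reset value: all packages completed
register = set_bit(register, 2)    # package 2 received, not yet completed
register = set_bit(register, 5)    # package 5 received, not yet completed
after_set = register               # 0b100100
register = clear_bit(register, 2)  # package 2 finishes
```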
After analyzing the current task package, the dependency comparison submodule firstly acquires the dependency identification bit. If the dependency identification bit of the current task package is a preset effective value, the current task package number and the target task package number need to be further acquired.
The identification bit corresponding to the current task package number is then set to the first preset value in the dependency table. The first preset value may be 1, which indicates that the task package is in a state of incomplete execution (in practice, the current task package has not yet been distributed and is waiting for execution). This is done because a subsequent task package may depend on the execution result of the current task package, that is, distributing the subsequent task package may require the execution result of the current task package.
Meanwhile, the execution result of the target task package is determined according to the value of the identification bit corresponding to the target task package number in the dependency table.
If the value of the identification bit corresponding to the target task package number is the first preset value, the target task package is in a state of incomplete execution. In this case the target task package constrains the distribution of the current task package, and the current task package can be distributed only after the target task package has finished executing. In other words, the execution dependency relationship between the current task package and the target task package has not yet been released.
If the value of the identification bit corresponding to the target task package number is the second preset value, which may be 0, the target task package is in a state of completed execution. In this case the target task package has finished executing and no longer constrains the distribution of the current task package, so the current task package can be distributed immediately without waiting. In other words, the execution dependency relationship between the current task package and the target task package has been released.
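The comparison steps described above can be sketched as a single function. The function name `compare`, the string return values, and the list-based table are assumptions made for illustration. For a task package whose dependency identification bit is invalid, this section does not say whether its own identification bit is written, so the sketch leaves the table untouched in that case.

```python
FIRST_PRESET = 1   # execution not completed
SECOND_PRESET = 0  # execution completed
VALID = 1          # preset valid value of the dependency identification bit


def compare(valid, id_num, info, table):
    """Return "distribute" or "wait" for the current task package."""
    if valid != VALID:
        # No target task package exists: distribute directly.
        return "distribute"
    # Mark the current task package as not completed for later dependents.
    table[id_num] = FIRST_PRESET
    # Check the identification bit of the target task package.
    if table[info] == FIRST_PRESET:
        return "wait"        # dependency not yet released
    return "distribute"      # dependency released


table = [SECOND_PRESET] * 8
table[2] = FIRST_PRESET                 # target package 2 is still executing
blocked = compare(1, 5, 2, table)
table[2] = SECOND_PRESET                # target package 2 completes
released = compare(1, 5, 2, table)
independent = compare(0, 6, 0, table)   # dependency bit invalid
```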
According to the task scheduling device provided by the embodiment of the application, the dependency comparison sub-module compares the execution dependency information of the current task package against the dependency table, determines the execution result of the target task package, and distributes the current task package according to that result. Distributing the current task package in this way avoids data disorder, resource contention, system errors and the like, and improves both the execution efficiency of the task package and the performance of the processor.
In some embodiments, the task package distribution module further includes a task package cache sub-module; accordingly, the dependency comparison submodule is specifically configured to:
storing the current task package into the task package cache sub-module under the condition that the execution result of the target task package is that the execution is not completed, continuously detecting the identification bit corresponding to the target task package number in the dependency table, and determining the execution result of the target task package based on the value of the identification bit corresponding to the target task package number;
and distributing the current task package under the condition that the execution result of the target task package is that the execution is completed.
Specifically, fig. 4 is a second schematic structural diagram of the task package distribution module provided in the present application. As shown in fig. 4, the task package distribution module 120 further includes a task package cache sub-module 123.
Because the task package scheduling device schedules received task packages continuously, several task packages may be waiting for distribution at the same time. A task package cache sub-module may therefore be provided. This sub-module is connected to the dependency comparison sub-module, and its storage capacity can be set as required.
And the dependency comparison submodule distributes the current task package according to the execution result of the target task package.
If the execution result of the target task package is that the execution is completed, the dependency comparison submodule directly distributes the current task package.
If the execution result of the target task package is that the execution is not completed, the dependency comparison sub-module can store the current task package in the task package cache sub-module and continuously detect the identification bit corresponding to the target task package number in the dependency table. Once that identification bit is detected to be the second preset value, the sub-module determines that the target task package has finished executing and distributes the current task package.
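The cache-and-detect behaviour can be illustrated with the following sketch. It assumes a software model in which the dependency table is a plain dictionary and the task package cache sub-module is a FIFO queue; `dispatch_or_cache` and `poll_cache` are hypothetical names, and a real device would perform the detection in hardware rather than through explicit polling calls.

```python
from collections import deque

PENDING, DONE = 1, 0


def dispatch_or_cache(packet, table, cache, dispatched):
    """Distribute the packet if its target is done; otherwise cache it."""
    if table.get(packet["target"], DONE) == DONE:
        dispatched.append(packet["number"])
    else:
        cache.append(packet)


def poll_cache(table, cache, dispatched):
    """One detection pass: release cached packets whose target bit is now DONE."""
    for _ in range(len(cache)):
        packet = cache.popleft()
        if table.get(packet["target"], DONE) == DONE:
            dispatched.append(packet["number"])
        else:
            cache.append(packet)  # still blocked, keep it waiting


table = {1: PENDING}
cache, dispatched = deque(), []
dispatch_or_cache({"number": 2, "target": 1}, table, cache, dispatched)
poll_cache(table, cache, dispatched)   # package 1 unfinished: nothing is released
blocked_snapshot = list(dispatched)
table[1] = DONE                        # package 1 completes
poll_cache(table, cache, dispatched)   # package 2 is released
```

Each call to `poll_cache` visits every cached packet exactly once, so packets that remain blocked keep their relative order in the queue.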
According to the task scheduling device provided by the embodiment of the application, the task package caching submodule is arranged to cache the current task package, so that the data transmission pressure of the dependency comparison submodule is reduced, and the execution efficiency of the task package is improved.
In some embodiments, the dependency comparison sub-module is further to:
Monitoring the current task package; under the condition that the execution of the current task package is completed, setting the identification bit corresponding to the task package number of the current task package in the dependency table to a second preset value;
The second preset value is used for indicating that the task package execution is completed.
Specifically, it is considered that the subsequent task package may depend on the execution result of the current task package, that is, the distribution of the subsequent task package may need to obtain the execution result of the current task package.
The dependency comparison sub-module may also monitor the execution of the current task package. For example, the dependency comparison sub-module may communicate with a processor unit corresponding to the current task package to obtain an execution result of the current task package. If the execution of the current task package is completed, the dependency comparison submodule modifies the corresponding identification bit of the task package number of the current task package in the dependency table and rewrites the identification bit into a second preset value.
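One way to picture this monitoring is a completion callback from the processor unit back to the table owner. The sketch below is an assumption about how the communication could be modelled in software; `ProcessorUnit`, `DependencyMonitor`, `track`, and `mark_done` are hypothetical names that do not appear in the application.

```python
PENDING, DONE = 1, 0


class ProcessorUnit:
    """Stand-in for a processor unit that reports completion (hypothetical)."""

    def __init__(self, on_complete):
        self._on_complete = on_complete

    def run(self, package_no):
        # Real hardware would execute the task package here.
        self._on_complete(package_no)


class DependencyMonitor:
    """Owns the dependency table and rewrites bits when packages complete."""

    def __init__(self):
        self.table = {}

    def track(self, package_no):
        self.table[package_no] = PENDING

    def mark_done(self, package_no):
        self.table[package_no] = DONE


monitor = DependencyMonitor()
unit = ProcessorUnit(on_complete=monitor.mark_done)
monitor.track(7)             # package 7 is registered as not completed
before = monitor.table[7]
unit.run(7)                  # the unit reports completion
after = monitor.table[7]
```

The bit for package 7 moves from the first preset value to the second as soon as the unit reports completion, which is what lets a waiting package be released without further delay.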
By monitoring the current task package, the task scheduling device provided by the embodiment of the application can update the dependency table in real time, thereby reducing waiting delay and improving the execution efficiency of the task package.
In some embodiments, the dependency comparison sub-module is further to:
Under the condition that the dependency identification bit of the current task package is a preset invalid value, determining that there is no target task package with which the current task package has an execution dependency relationship; and distributing the current task package.
Specifically, if the dependency comparison submodule parses the current task package and finds that the dependency identification bit is the preset invalid value, it can determine that there is no target task package with which the current task package has an execution dependency relationship, and the current task package can be distributed directly.
According to the task scheduling device provided by the embodiment of the application, the current task package is distributed directly when the dependency identification bit is the preset invalid value, without obtaining the current task package number and the target task package number for further judgment, so that the scheduling time of the task package is shortened and the execution efficiency of the task package is improved.
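The fast path can be sketched as a header parse followed by a single flag test. The application does not specify how the execution dependency information is encoded, so the 17-bit layout below (1 flag bit, 8-bit current number, 8-bit target number), the constants `DEP_VALID` and `DEP_INVALID`, and the function names are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

DEP_VALID = 1    # assumed encoding of the preset effective value
DEP_INVALID = 0  # assumed encoding of the preset invalid value


@dataclass
class DependencyInfo:
    dep_flag: int
    current_no: int
    target_no: int


def parse_header(word):
    """Unpack a hypothetical 17-bit header: [flag:1][current:8][target:8]."""
    return DependencyInfo(
        dep_flag=(word >> 16) & 0x1,
        current_no=(word >> 8) & 0xFF,
        target_no=word & 0xFF,
    )


def needs_table_lookup(info):
    """Only packets with a valid dependency flag consult the dependency table."""
    return info.dep_flag == DEP_VALID


independent = parse_header((0 << 16) | (3 << 8) | 0)  # package 3, no dependency
dependent = parse_header((1 << 16) | (2 << 8) | 1)    # package 2 depends on package 1
```

For `independent` the flag test alone decides that the packet can be distributed, so the two package-number fields never need to be compared against the table.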
Fig. 5 is a schematic structural diagram of a command processor provided in the present application, and as shown in fig. 5, a command processor 500 includes the task scheduling device 100 in the above embodiment.
Specifically, the command processor is used to control a plurality of processor units within the graphics processor and to coordinate data transfer and communication between the graphics processor and the central processor.
The command processor provided by the embodiment of the application can reasonably distribute the task packets according to the execution dependency relationship among the task packets in the concurrent processing process due to the internal arrangement of the task scheduling device in the embodiment, thereby avoiding data disorder, resource competition, system error and the like, improving the execution efficiency of the task packets and improving the performance of the graphics processor.
Fig. 6 is a schematic structural diagram of a processor provided in the present application, and as shown in fig. 6, a processor 600 includes the command processor 500 in the above embodiment.
Specifically, the processor in the embodiment of the application may include a traditional graphics processor mainly used for graphics data processing, and may also include a general-purpose graphics processor supporting general-purpose computing tasks and deep learning tasks, such as various neural network accelerators, and the like. The embodiment of the application does not particularly limit the category, the purpose and the architecture of the processor.
The processor provided by the embodiment of the application comprises the command processor in the embodiment, so that the task packets can be reasonably distributed according to the execution dependency relationship among the task packets in the concurrent processing process, thereby avoiding data disorder, resource competition, system error and the like, improving the execution efficiency of the task packets and improving the performance of the processor.
Fig. 7 is a schematic flow chart of the processor task package scheduling method provided in the present application. As shown in fig. 7, the method is applicable to the processor in the foregoing embodiment, which includes a command processor and a plurality of processor units, and the method specifically includes:
Step 710, the command processor receives the current task package, and parses the current task package to obtain execution dependency information carried in the current task package.
Step 720, the command processor makes a determination based on the dependency identification bit. If the dependency identification bit is the preset invalid value, the current task package is distributed directly; if the dependency identification bit is the preset effective value, the dependency table is read according to the current task package number and the target task package number, and the execution result of the target task package is determined from the identification bit corresponding to the target task package number.
If the target task package has finished executing, the dependency relationship between the target task package and the current task package can be regarded as released, and the current task package can be distributed directly; if the target task package has not finished executing, the dependency relationship can be regarded as not yet released, and the current task package needs to be cached until the target task package finishes executing.
Step 730, detecting an identification bit corresponding to a target task package number in the dependency table, determining an execution result of the target task package based on a value of the identification bit corresponding to the target task package number, and distributing the current task package when the execution result of the target task package is that the execution is completed; otherwise, the dependency table is continuously detected.
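Steps 710 to 730 can be put together in a short sketch. It is a software approximation under stated assumptions: the packet is modelled as a tuple `(flag, current_no, target_no)`, the dependency table as a dictionary, and the hypothetical callback `wait_one_cycle` stands in for whatever time passes between two detections of the table; the `max_cycles` guard is an addition of this sketch and is not part of the described method.

```python
PENDING, DONE = 1, 0


def schedule_packet(packet, table, wait_one_cycle, max_cycles=1000):
    """Return the number of detection cycles spent before distribution."""
    # Step 710: parse the execution dependency information.
    flag, current_no, target_no = packet
    # Step 720: an invalid flag means the packet is distributed immediately.
    if flag == 0:
        return 0
    table[current_no] = PENDING
    # Step 730: keep detecting the target's identification bit.
    cycles = 0
    while table.get(target_no, DONE) != DONE:
        if cycles >= max_cycles:
            raise TimeoutError("target task package never completed")
        wait_one_cycle()
        cycles += 1
    return cycles


table = {1: PENDING}
remaining = [3]  # package 1 finishes after three detection cycles


def wait_one_cycle():
    remaining[0] -= 1
    if remaining[0] == 0:
        table[1] = DONE


cycles_independent = schedule_packet((0, 3, 0), table, wait_one_cycle)
cycles_dependent = schedule_packet((1, 2, 1), table, wait_one_cycle)
```

The independent packet returns without touching the table, while the dependent packet is held for exactly as many detection cycles as package 1 needs to complete.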
Fig. 8 is a diagram of a task packet scheduling effect provided by the present application, as shown in fig. 8, where a processor includes 2 processor units (task packet execution units) and 3 task packets need to be scheduled, where task packet 2 depends on the execution result of task packet 1.
In the related art, the processor task package scheduling device either distributes the 3 received task packages directly to the 2 processor units without considering the execution dependency relationship, which may cause data disorder, or distributes task package 1 to processor unit 1 and then, when distributing task package 2, waits for task package 1 to finish executing before distributing task package 2 to processor unit 2. The latter forces task package 3 to wait as well. If the execution time of each task package is 100 nanoseconds, a total of 300 nanoseconds is required.
If the processor task package scheduling device instead follows the method provided by the application, task package 1 is distributed to processor unit 1 and, at the same time, task package 3 is distributed directly to processor unit 2. After task package 1 has finished executing, task package 2 is distributed directly to processor unit 2, so only 200 nanoseconds are needed.
By the method, the idle state of the processor unit 2 is utilized to execute the subsequent task package 3 without the dependency relationship, so that the scheduling efficiency of the task package and the performance of the processor are improved.
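The 200-nanosecond figure can be reproduced with a small timing model. The sketch below is an assumption-laden illustration, not the scheduler itself: `makespan` is a hypothetical helper that always starts whichever ready package can begin earliest, and the related-art baseline is modelled simply as the three packages running one after another, which matches the 300-nanosecond figure quoted above. Which of the two idle units receives task package 2 is arbitrary in this model and does not affect the total.

```python
def makespan(durations, deps, num_units):
    """Finish time when ready packages may be distributed to idle units out of order.

    Assumes the dependencies contain no cycles.
    """
    finish = {}                   # package number -> completion time
    unit_free = [0] * num_units   # time at which each unit becomes idle
    waiting = list(durations)     # packages in arrival order
    while waiting:
        candidates = []
        for pkg in waiting:
            target = deps.get(pkg)
            if target is not None and target not in finish:
                continue          # the target has not been scheduled yet
            ready_at = finish.get(target, 0)
            unit = min(range(num_units),
                       key=lambda u: max(unit_free[u], ready_at))
            start = max(unit_free[unit], ready_at)
            candidates.append((start, pkg, unit))
        start, pkg, unit = min(candidates)  # earliest possible start wins
        finish[pkg] = start + durations[pkg]
        unit_free[unit] = finish[pkg]
        waiting.remove(pkg)
    return max(finish.values())


durations = {1: 100, 2: 100, 3: 100}   # nanoseconds per task package
deps = {2: 1}                          # task package 2 depends on task package 1

dependency_aware_ns = makespan(durations, deps, num_units=2)
serialized_ns = sum(durations.values())  # assumed model of the related-art case
```

In this model task packages 1 and 3 both start at time 0 on different units, and task package 2 starts at 100 nanoseconds once its target has completed.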
The method provided by the embodiment of the application is described below, and the method described below and the device described above can be referred to correspondingly.
Fig. 9 is a second schematic flow chart of the processor task package scheduling method provided in the present application. As shown in fig. 9, the method is executed by the processor task package scheduling device in the above embodiment and includes step 910, step 920, and step 930.
Step 910, receiving a current task package;
Step 920, determining a target task package with an execution dependency relationship with the current task package and an execution result of the target task package based on the execution dependency information of the current task package and the dependency table;
Step 930, distributing the current task package based on the execution result of the target task package;
The dependency table is used for recording the execution result of each task package.
According to the task scheduling method provided by the embodiment of the application, the execution dependency information is written in the current task package, the dependency table is set in the task package distribution module, the execution results of all task packages are recorded, the target task package and the execution result thereof which have the execution dependency relationship with the current task package can be determined through the execution dependency information and the dependency table, and the current task package is distributed according to the execution result of the target task package; the task packages can be reasonably distributed according to the execution dependency relationship among the task packages in the concurrent processing process, so that data disorder, resource competition, system errors and the like are avoided, the execution efficiency of the task packages is improved, and the performance of the processor is also improved.
In some embodiments, the execution dependency information for the current task package includes a dependency identification bit, a current task package number, and a target task package number.
In some embodiments, determining a target task package having an execution dependency relationship with the current task package and an execution result of the target task package based on the execution dependency information of the current task package and the dependency table includes:
Setting an identification bit corresponding to a task package number as a first preset value in a dependency table under the condition that the dependency identification bit of the current task package is a preset effective value, and determining an execution result of a target task package based on the value of the identification bit corresponding to the target task package number in the dependency table;
wherein the dependency table comprises a plurality of identification bits; each identification bit corresponds to the task package number of each task package one by one; the value of the identification bit is used for representing the execution result of the task package; the first preset value indicates that the task package execution is not completed.
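Because each task package number maps to exactly one identification bit, the dependency table can be pictured as a bitmap. The following sketch stores all bits in a single integer; the class name `BitmapDependencyTable`, its fixed `size`, and the choice that bit i belongs to package number i are assumptions of this illustration rather than details given in the application.

```python
class BitmapDependencyTable:
    """Dependency table stored as one integer; bit i belongs to package number i."""

    def __init__(self, size):
        self.size = size
        self.bits = 0  # every bit starts at 0, the 'execution completed' value

    def _check(self, package_no):
        if not 0 <= package_no < self.size:
            raise IndexError("task package number out of range")

    def mark_pending(self, package_no):
        """Write the first preset value (1) for this package."""
        self._check(package_no)
        self.bits |= 1 << package_no

    def mark_done(self, package_no):
        """Write the second preset value (0) for this package."""
        self._check(package_no)
        self.bits &= ~(1 << package_no)

    def is_pending(self, package_no):
        self._check(package_no)
        return bool((self.bits >> package_no) & 1)


table = BitmapDependencyTable(size=8)
table.mark_pending(1)
table.mark_pending(2)
table.mark_done(1)
```

A single-bit read or write per package keeps both the lookup and the update constant-time, which is the property the one-to-one correspondence is meant to provide.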
In some embodiments, distributing the current task package based on the execution result of the target task package includes:
under the condition that the execution result of the target task package is that the execution is not completed, caching the current task package, continuously detecting the identification bit corresponding to the target task package number in the dependency table, and determining the execution result of the target task package based on the value of the identification bit corresponding to the target task package number;
and distributing the current task package under the condition that the execution result of the target task package is that the execution is completed.
In some embodiments, the method further comprises:
Monitoring the current task package; under the condition that the execution of the current task package is completed, setting the identification bit corresponding to the task package number of the current task package in the dependency table to a second preset value;
The second preset value is used for indicating that the task package execution is completed.
In some embodiments, the method further comprises:
Under the condition that the dependency identification bit of the current task package is a preset invalid value, determining that there is no target task package with which the current task package has an execution dependency relationship; and distributing the current task package.
Fig. 10 is a schematic structural diagram of an electronic device according to the present application, and as shown in fig. 10, the electronic device may include: processor (Processor) 1010, communication interface (Communications Interface) 1020, memory (Memory) 1030, and communication bus (Communications Bus) 1040, wherein Processor 1010, communication interface 1020, memory 1030 communicate with each other via communication bus 1040. Processor 1010 may invoke logic commands in memory 1030 to perform the methods described in the embodiments above, such as:
Receiving a current task package; determining a target task package with an execution dependency relationship with the current task package and an execution result of the target task package based on the execution dependency information of the current task package and the dependency table; distributing the current task package based on the execution result of the target task package; the dependency table is used for recording the execution result of each task package.
In addition, the logic commands in the memory described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several commands for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The processor in the electronic device provided by the embodiment of the application can call the logic instruction in the memory to realize the method, and the specific implementation mode is consistent with the implementation mode of the method, and the same beneficial effects can be achieved, and the detailed description is omitted here.
The embodiments of the present application also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the methods provided by the above embodiments.
The specific embodiment is consistent with the foregoing method embodiment, and the same beneficial effects can be achieved, and will not be described herein.
The embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements a method as described above.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A processor task package scheduling apparatus, comprising:
The task packet grabbing module is used for receiving the current task packet;
The task package distribution module is used for determining a target task package with an execution dependency relationship with the current task package and an execution result of the target task package based on the execution dependency information of the current task package and the dependency table; distributing the current task package based on the execution result of the target task package;
The dependency table is used for recording the execution result of each task package.
2. The processor task package scheduling apparatus of claim 1, wherein the execution dependency information of the current task package includes a dependency identification bit, a current task package number, and a target task package number.
3. The processor task package scheduling apparatus according to claim 2, wherein the task package distribution module includes:
a dependency table storage sub-module for storing the dependency table;
The dependency comparison sub-module is used for setting the identification bit corresponding to the task package number as a first preset value in the dependency table and determining an execution result of the target task package based on the value of the identification bit corresponding to the target task package number in the dependency table under the condition that the dependency identification bit of the current task package is a preset effective value; distributing the current task package based on the execution result of the target task package;
Wherein the dependency table comprises a plurality of identification bits; each identification bit corresponds to the task package number of each task package one by one; the value of the identification bit is used for representing the execution result of the task package; the first preset value indicates that the task package execution is not completed.
4. The processor task package scheduling apparatus of claim 3, wherein the task package distribution module further comprises a task package cache sub-module; correspondingly, the dependency comparison sub-module is specifically configured to:
Storing the current task package to the task package caching submodule under the condition that the execution result of the target task package is incomplete, continuously detecting the identification bit corresponding to the target task package number in the dependency table, and determining the execution result of the target task package based on the value of the identification bit corresponding to the target task package number;
And distributing the current task package under the condition that the execution result of the target task package is that the execution is completed.
5. The processor task package scheduling apparatus according to claim 3, wherein the dependency comparison sub-module is further configured to:
Monitoring the current task package; under the condition that the execution of the current task package is completed, setting a corresponding identification bit of a task package number of the current task package in the dependency table as a second preset value;
the second preset value is used for indicating that the task package execution is completed.
6. The processor task package scheduling apparatus according to claim 3, wherein the dependency comparison sub-module is further configured to:
under the condition that the dependency identification bit of the current task package is a preset invalid value, determining that there is no target task package with which the current task package has an execution dependency relationship; and distributing the current task package.
7. A command processor comprising the processor task package scheduling apparatus of any one of claims 1 to 6.
8. A processor comprising the command processor of claim 7.
9. A method for scheduling a task package of a processor, comprising:
Receiving a current task package;
Determining a target task package with an execution dependency relationship with the current task package and an execution result of the target task package based on the execution dependency information of the current task package and a dependency table; distributing the current task package based on the execution result of the target task package;
The dependency table is used for recording the execution result of each task package.
10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the processor task package scheduling method of claim 9 when executing the program.
CN202410288271.5A 2024-03-13 Processor task package scheduling device and method Pending CN118170515A (en)

Publications (1)

Publication Number Publication Date
CN118170515A true CN118170515A (en) 2024-06-11

Legal Events

Date Code Title Description
PB01 Publication