CN113835852B - Task data scheduling method and device

Task data scheduling method and device

Info

Publication number
CN113835852B
CN113835852B (application CN202110991440.8A)
Authority
CN
China
Prior art keywords
equipment
task data
execution unit
target
bus
Prior art date
Legal status
Active
Application number
CN202110991440.8A
Other languages
Chinese (zh)
Other versions
CN113835852A (en)
Inventor
刘长坤
郑�硕
Current Assignee
Neusoft Medical Systems Co Ltd
Original Assignee
Neusoft Medical Systems Co Ltd
Priority date
Filing date
Publication date
Application filed by Neusoft Medical Systems Co Ltd
Priority to CN202110991440.8A
Publication of CN113835852A
Application granted
Publication of CN113835852B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources to service a request
    • G06F9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources to service a request, the resource being a machine, considering the load
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/48 Indexing scheme relating to G06F9/48
    • G06F2209/484 Precedence

Abstract

The invention discloses a task data scheduling method and device, relates to the technical field of data processing, and mainly aims to solve the problems of low scheduling efficiency and poor accuracy in existing task data scheduling. The method includes: acquiring task data to be scheduled, and determining a target operation device based on the processing complexity of the task data; parsing the device function type of the target operation device, and calling a device operation load marking sequence matched with the target operation device based on the device function type; and if a target execution unit matched with the task data is found, based on the device operation load marking sequence, to carry no occupation identifier, executing the task data on the target execution unit and marking the target execution unit with an occupation identifier.

Description

Task data scheduling method and device
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a task data scheduling method and device.
Background
In the processing of computation-intensive tasks in data-heavy fields, such as scientific computing tasks in computational fluid dynamics, signal and image processing, and deep learning, the computing power and data transmission capability of computing devices become the main processing bottlenecks. As the central processing unit (CPU), the traditional core processing device, has become a technical bottleneck, large-scale computing tasks are increasingly transferred to graphics processing units (GPUs) for processing. Different computing devices, such as multiple GPUs, are therefore added alongside a CPU device, and a computing architecture in which these devices work cooperatively, namely heterogeneous computing, is realized through a suitable scheduling method.
At present, existing heterogeneous computing platforms generally execute the scheduling flow of data reading, computing and data writing in a linear manner, and provide an instruction-queue-based scheduling method on top of the GPU computing interface. However, because the multiple command queues are controlled by the GPU driver, tasks cannot easily be adjusted manually, and when the computing load is unbalanced, computing resources are wasted. Such platforms therefore cannot meet the requirement of efficient scheduling across different devices and different computing tasks, which greatly reduces the scheduling efficiency and accuracy for different task data.
Disclosure of Invention
In view of the above, the present invention provides a task data scheduling method and device, and mainly aims to solve the problems of low scheduling efficiency and poor accuracy in existing task data scheduling.
According to one aspect of the present invention, there is provided a task data scheduling method, including:
acquiring task data to be scheduled, and determining a target operation device based on the processing complexity of the task data;
parsing the device function type of the target operation device, and calling a device operation load marking sequence matched with the target operation device based on the device function type, wherein the device operation load marking sequence is used for representing the occupancy of different execution units, and occupied execution units are marked with corresponding occupation identifiers;
and if a target execution unit matched with the task data is found, based on the device operation load marking sequence, to carry no occupation identifier, executing the task data on the target execution unit and marking the target execution unit with an occupation identifier.
Further, before the task data to be scheduled is obtained and the target running device is determined based on the processing complexity of the task data, the method further includes:
determining the equipment function type of each operation equipment, and establishing an abstract model of the operation equipment based on the equipment function type, wherein the abstract model comprises at least two execution units;
defining a device running idle state and a bus running idle state based on each execution unit in the abstract model;
and generating a device operation load marking sequence based on the device operation idle state and the bus operation idle state, wherein the device operation load marking sequence records whether the device execution units are marked with occupation identifiers and whether the bus execution units are marked with occupation identifiers.
Further, the device function type includes a first device type and a second device type, and the building the abstract model of the running device based on the device function type includes:
determining the number of hardware device abstractions corresponding to each operation device belonging to the first device type and the second device type;
dividing each operation device to obtain device execution units, covering a storage function and a calculation function, and a bus execution unit;
and establishing a first device abstract model and a second device abstract model matched with the number of hardware device abstractions based on the device execution units and the bus execution unit;
wherein the first device type includes a CPU and the second device type includes a GPU, an FPGA, or a DSP.
Further, if a target execution unit matched with the task data is found, based on the device operation load marking sequence, to carry no occupation identifier, then before the task data is executed on the target execution unit and the target execution unit is marked with an occupation identifier, the method further includes:
if the device operation load marking sequence called for the target operation device is the device operation load marking sequence matched with the first device type, judging whether all calculation execution units in the device operation load marking sequence are in an occupied state, and when all calculation execution units are in the occupied state, calling the device operation load marking sequence matched with the second device type to search for a calculation execution unit; or,
if the device operation load marking sequence called for the target operation device is the device operation load marking sequence matched with the second device type, judging whether all bus execution units and/or all storage execution units in the device operation load marking sequence are in an occupied state, and when all bus execution units and/or all storage execution units are in the occupied state, calling the device operation load marking sequence matched with the first device type to search for a bus execution unit and/or a storage execution unit.
Further, after the generating the device running load flag sequence based on the device running idle state and the bus running idle state, the method further includes:
and when detecting that the device execution unit and/or the bus execution unit in the device operation load mark sequence execute task data, configuring an occupation mark of the device execution unit and/or the bus execution unit in the device operation load mark sequence based on the task data.
Further, the determining the target running device based on the processing complexity of the task data includes:
determining processing complexity based on data processing characteristics of the task data, wherein the data processing characteristics are used for representing at least one of time consumption, occupied space and processing mode of the task data for data processing;
Selecting a scheduling strategy according to the processing complexity of the task data, and determining target operation equipment according to the selected scheduling strategy, wherein the scheduling strategy comprises a static scheduling strategy and a dynamic scheduling strategy, the static scheduling strategy is a scheduling mode of the target operation equipment determined by pre-analyzing based on the task data, and the dynamic scheduling strategy is a scheduling mode of the target operation equipment determined based on an occupation identifier in an equipment operation load mark sequence.
Further, the obtaining task data to be scheduled includes:
responding to a task data processing request, analyzing and splitting the task data, and determining at least one piece of the task data obtained by the analysis and splitting as the task data to be scheduled, wherein the analysis and splitting refers to decomposing the calculation process of the task data.
According to another aspect of the present invention, there is provided a task data scheduling apparatus including:
the acquisition module is used for acquiring task data to be scheduled and determining target operation equipment based on the processing complexity of the task data;
the calling module is used for parsing the device function type of the target operation device and calling a device operation load marking sequence matched with the target operation device based on the device function type, wherein the device operation load marking sequence is used for representing the occupancy of different execution units, and occupied execution units are marked with corresponding occupation identifiers;
and the execution module is used for, if a target execution unit matched with the task data is found, based on the device operation load marking sequence, to carry no occupation identifier, executing the task data on the target execution unit and marking the target execution unit with an occupation identifier.
Further, the apparatus further comprises:
the system comprises an establishing module, a processing module and a processing module, wherein the establishing module is used for determining the equipment function type of each operation equipment and establishing an abstract model of the operation equipment based on the equipment function type, and the abstract model comprises at least two execution units;
the definition module is used for defining an equipment operation idle state and a bus operation idle state based on each execution unit in the abstract model;
the generating module is used for generating a device operation load marking sequence based on the device operation idle state and the bus operation idle state, wherein the device operation load marking sequence comprises content of whether the device execution unit is marked with occupied marks or not and content of whether the bus execution unit is marked with occupied marks or not.
Further, the device function type includes a first device type and a second device type, and the establishing module includes:
The determining unit is used for determining the hardware device abstract number corresponding to each operation device belonging to the first device type and the second device type;
the dividing unit is used for dividing the execution area of the running equipment to obtain an equipment execution unit comprising a storage function and a calculation function and a bus execution unit;
the building unit is used for building a first device abstract model and a second device abstract model matched with the hardware device abstract number based on the device executing unit and the bus executing unit.
Further, the generating module is specifically configured to configure, when it is detected that the device execution unit and/or the bus execution unit in the device running load flag sequence execute task data, an occupation identifier of the device execution unit and/or the bus execution unit in the device running load flag sequence based on the task data.
Further, the calling module includes:
the first calling unit is used for, if the device operation load marking sequence called for the target operation device is the device operation load marking sequence matched with the first device type, judging whether all calculation execution units in the device operation load marking sequence are in an occupied state, so that when all calculation execution units are in the occupied state, the device operation load marking sequence matched with the second device type is called to search for a calculation execution unit; or,
and the second calling unit is used for, if the device operation load marking sequence called for the target operation device is the device operation load marking sequence matched with the second device type, judging whether all bus execution units and/or all storage execution units in the device operation load marking sequence are in an occupied state, so that when all bus execution units and/or all storage execution units are in the occupied state, the device operation load marking sequence matched with the first device type is called to search for a bus execution unit and/or a storage execution unit.
Further, the acquisition module includes:
the computing unit is used for determining processing complexity based on data processing characteristics of the task data, wherein the data processing characteristics are used for representing at least one of time consumption, occupied space and processing mode of the task data for data processing;
the determining unit is used for selecting a scheduling strategy according to the processing complexity of the task data and determining target operation equipment according to the selected scheduling strategy, wherein the scheduling strategy comprises a static scheduling strategy and a dynamic scheduling strategy, the static scheduling strategy is used for determining the scheduling mode of the target operation equipment based on the pre-analysis of the task data, and the dynamic scheduling strategy is used for determining the scheduling mode of the target operation equipment based on the occupation identification in the equipment operation load mark sequence.
Further, the acquisition module further includes:
the splitting unit is used for responding to the task data processing request, analyzing and splitting the task data, and determining at least one piece of the task data obtained by the analysis and splitting as the task data to be scheduled, wherein the analysis and splitting refers to decomposing the calculation process of the task data.
According to still another aspect of the present invention, there is provided a storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the scheduling method of task data as described above.
According to still another aspect of the present invention, there is provided a terminal including: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the task data scheduling method.
By means of the technical scheme, the technical scheme provided by the embodiment of the invention has at least the following advantages:
Compared with the prior art, the embodiment of the invention acquires task data to be scheduled and determines a target operation device based on the processing complexity of the task data; parses the device function type of the target operation device and calls a device operation load marking sequence matched with the target operation device based on the device function type, wherein the device operation load marking sequence represents the occupancy of different execution units and occupied execution units are marked with corresponding occupation identifiers; and, if a target execution unit matched with the task data is found, based on the device operation load marking sequence, to carry no occupation identifier, executes the task data on the target execution unit and marks it with an occupation identifier. In other words, by detecting device occupancy, task data is allocated in a non-queued manner: a task is dispatched as soon as the conditions for executing it are met, without waiting in line. This keeps the devices continuously executing tasks and satisfies the requirement of accurately scheduling tasks while different operation devices are under load. Compared with the instruction-queue scheduling method in the prior art, the scheme avoids wasting task scheduling resources when the computing load is unbalanced, thereby improving both the accuracy and the efficiency of task data scheduling.
The foregoing description is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be more clearly understood and implemented in accordance with the content of the specification, and in order to make the above and other objects, features and advantages of the present invention more apparent, preferred embodiments of the invention are described in detail below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 shows a flow chart of a task data scheduling method provided by an embodiment of the invention;
FIG. 2 is a flowchart of another task data scheduling method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for scheduling task data according to an embodiment of the present invention;
FIG. 4 shows a block diagram of a task data scheduler according to an embodiment of the present invention;
fig. 5 shows a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the invention provides a task data scheduling method, as shown in fig. 1, which comprises the following steps:
101. and acquiring task data to be scheduled, and determining target operation equipment based on the processing complexity of the task data.
In the embodiment of the invention, after the task data to be processed is determined, the heterogeneous computing platform serving as the current execution body performs the task data scheduling method of the present application. Heterogeneous computing is a computing mode in which a system is formed by multiple functional units with different instruction sets and architectures to process task data; in this process, the functional units that execute the corresponding processing functions must be determined by scheduling the task data. In the embodiment of the invention, the heterogeneous computing platform serving as the current execution body may be deployed on a cloud, a server, a client or other devices, so the task data needs to be scheduled accurately during heterogeneous computing in order to process the data efficiently. The task data is the task content on which heterogeneous computation is to be performed, and it may or may not be divisible. The processing complexity is the complexity of performing data processing on the task data; in the embodiment of the invention, it may be determined from conditions such as the data size of the task data, the processing mode of the task data and the transmission time of the task data. In general, the processing complexity can be classified as constant task complexity or varying task complexity, and the target running device is determined accordingly.
It should be noted that, in the embodiment of the present invention, in order to schedule different task data accurately, the heterogeneous computing platform serving as the current execution end is divided into a plurality of running devices, including but not limited to CPUs, GPUs, field programmable gate arrays (FPGA) and digital signal processing (DSP) devices. For example, a heterogeneous computing platform consisting of CPUs and GPUs may be divided into multiple running devices, such as CPU1, CPU2, CPU3, ..., GPU1, GPU2, GPU3, ..., so that one target running device can be determined from them based on the processing complexity.
102. Analyzing the equipment function type of the target operation equipment, and calling an equipment operation load marking sequence matched with the target operation equipment based on the equipment function type.
In the embodiment of the present invention, since the data processing of the task data is performed by the target running device, the device function type must be parsed after the target running device is determined. The device function type represents the type of data processing function performed by the device and includes a first device type (for example, a CPU) and a second device type (for example, a GPU, an FPGA or a DSP), and it is used to call the device running load flag sequence matched with that type. The device running load flag sequence represents the occupancy of the different execution units and may be generated and stored in the form of a table; the position of an occupied execution unit in the sequence is marked with a corresponding occupation identifier. An execution unit is a functional unit with which running devices of different device function types process task data, and each device includes at least two execution units, such as a bus execution unit and device execution units, where the device execution units include but are not limited to compute execution units and storage execution units (input data storage execution units and output data storage execution units). Because the device running load flag sequence records whether each execution unit is marked with an occupation identifier generated by executing task data, the execution unit of the target running device that can execute the task data can be determined from it.
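For illustration only (not part of the original disclosure; the dictionary layout and names such as new_load_flag_table are assumptions), a minimal Python sketch of how such a device running load flag sequence could be held, with 0 denoting the idle state and a non-zero value recording the serial number of the occupying task:

```python
# Sketch of a device running load flag sequence: one row per running device,
# one entry per execution unit; 0 means idle, a non-zero value is the serial
# number of the task currently occupying that unit.
from typing import Dict

LoadFlagTable = Dict[str, Dict[str, int]]

def new_load_flag_table(devices: Dict[str, list]) -> LoadFlagTable:
    """Create an all-idle flag sequence for the given devices and their units."""
    return {dev: {unit: 0 for unit in units} for dev, units in devices.items()}

# Example: two GPU running devices, each with input/output storage units,
# one compute unit and the bus columns Direction and Buffer (0 meaning idle).
table = new_load_flag_table({
    "GPU:b1": ["Input1", "Input2", "Cmpt1", "Output1", "Output2", "Direction", "Buffer"],
    "GPU:b2": ["Input1", "Input2", "Cmpt1", "Output1", "Output2", "Direction", "Buffer"],
})
print(table["GPU:b1"]["Cmpt1"])  # -> 0, i.e. the compute execution unit is idle
```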
It should be noted that, since each target operation device relies on hardware and a bus to perform functions of data transmission and data processing, the target operation devices with different device function types each include a bus, so as to complete data transmission and data processing in cooperation with a CPU, a GPU, and the like. Thus, the device running load flag sequence also has recorded therein an occupancy flag indicating whether the bus as the execution unit is occupied by task data.
103. And if the target execution unit matched with the task data is found out based on the equipment operation load marking sequence and the occupation mark is not marked, executing the task data based on the target execution unit and marking the occupation mark for the target execution unit.
In the embodiment of the invention, because the occupation identifiers of task data on the different execution units are recorded in the device running load flag sequence, a target execution unit matched with the task data is first determined; the target execution unit is one execution unit of the target running device, such as a compute unit or a storage unit. It is then judged whether the target execution unit is marked with an occupation identifier: if it is, the target execution unit is occupied; if it is not, the target execution unit is idle. Specifically, when the target execution unit carries no occupation identifier, the task data can be executed on it to complete the scheduling. In other words, as long as the conditions for executing a task are met, an execution unit can be assigned to it; this is an "executable means allocatable" scheduling mode rather than a mechanical queue-type allocation in which tasks wait their turn, which greatly improves device utilization and task processing efficiency. Meanwhile, to ensure that other task data cannot be processed on the same target execution unit, the target execution unit is marked with an occupation identifier while it executes the task data, so that it is shown as occupied when other task data is scheduled.
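A minimal sketch, under the same assumed table layout as the sketch above, of this check-then-mark step; try_dispatch and release are illustrative names, not part of the original disclosure:

```python
# Sketch of step 103: look up a matching execution unit in the flag sequence,
# and only if it carries no occupation identifier, dispatch the task on it and
# mark it as occupied.
def try_dispatch(table, device, unit, task_id):
    """Return True if the task was dispatched on (device, unit)."""
    if table[device][unit] != 0:      # occupation identifier present -> unit busy
        return False                   # caller may look for another unit or wait
    table[device][unit] = task_id      # mark the occupation identifier
    # ... execute the task data on the target execution unit here ...
    return True

def release(table, device, unit):
    """Clear the occupation identifier once the unit's work is finished."""
    table[device][unit] = 0
```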
In an embodiment of the present invention, for further limitation and illustration, as shown in fig. 2, before step 101 obtains task data to be scheduled and determines a target running device based on a processing complexity of the task data, the method further includes: 201. determining the equipment function type of each operation equipment, and establishing an abstract model of the operation equipment based on the equipment function type; 202. defining a device running idle state and a bus running idle state based on each execution unit in the abstract model; 203. and generating a device running load mark sequence based on the device running idle state and the bus running idle state.
In the embodiment of the invention, after the task data is determined, the target running device must be determined according to the processing complexity, and the target execution unit is one of the multiple execution units in running devices of different device function types. In order to accurately locate the execution objects that can process the task data, and to keep the execution units independent of one another, an abstract model of each running device is established based on its device function type. The abstract models of running devices of different device function types may be the same or different; for example, the abstract models established for CPU devices of the first device type may be denoted a1, a2, and so on. The abstract model is further divided into execution units and contains at least two of them, for example at least a bus execution unit and device execution units, where the device execution units include but are not limited to compute execution units and storage execution units (input data storage execution units and output data storage execution units). For running devices of the non-first device type, their own input storage areas, output storage areas and the like may be divided further; the embodiment of the present invention is not specifically limited in this respect. For example, for device GPU:b1, an abstract model is built by dividing out compute units 1, 2, ..., n, denoted b1-Cmpt1, b1-Cmpt2, ..., b1-Cmptn, dividing out input storage areas 1, 2, ..., m, denoted b1-Input1, b1-Input2, ..., b1-Inputm, and dividing out output data storage areas 1 and 2, denoted b1-Output1 and b1-Output2. The process of inputting and outputting data occupies the bus, and the computation process occupies a compute execution unit of the device.
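As an illustrative sketch only (the dataclass and helper names are assumptions), the abstract model of a running device with n compute units, m input storage areas and two output storage areas, using the naming pattern of the example above, could be represented as follows:

```python
# Sketch of the abstract model of one running device, named after the pattern
# used in the example (b1-Cmpt1, b1-Input1, ...).
from dataclasses import dataclass, field
from typing import List

@dataclass
class DeviceAbstractModel:
    name: str                       # e.g. "GPU:b1"
    compute_units: List[str] = field(default_factory=list)
    input_units: List[str] = field(default_factory=list)
    output_units: List[str] = field(default_factory=list)
    has_bus_unit: bool = True       # data input/output occupies the bus execution unit

def build_model(name: str, n_compute: int, m_input: int, n_output: int = 2):
    prefix = name.split(":")[-1]
    return DeviceAbstractModel(
        name=name,
        compute_units=[f"{prefix}-Cmpt{i + 1}" for i in range(n_compute)],
        input_units=[f"{prefix}-Input{i + 1}" for i in range(m_input)],
        output_units=[f"{prefix}-Output{i + 1}" for i in range(n_output)],
    )

b1 = build_model("GPU:b1", n_compute=1, m_input=2)   # matches the Table 3 layout
print(b1.compute_units, b1.input_units, b1.output_units)
```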
In addition, since the abstract model is an abstract structure that divides each running device into execution units, after the abstract model is built, a device running idle state and a bus running idle state are defined for each execution unit in it, such as the device running idle state shown in Table 1 and the bus running idle state shown in Table 2, where 0 represents the idle state. A device running load flag sequence is then generated from the device running idle state and the bus running idle state, that is, a mapping indicating that the different execution units are not occupied by any task data in the idle state, so that the device running load flag sequence records whether the device execution units are marked with occupation identifiers and whether the bus execution units are marked with occupation identifiers.
TABLE 1
TABLE 2
Bus input/output Direction Buffer area
GPU:b1 0 -
GPU:b2 0 -
In Table 2, the device running load flag sequence is organized as a list in the form of an unsigned integer matrix, with each row corresponding to one GPU running device. If the element in the bus input/output Direction column, which represents the bus execution unit, is 0, the bus is in the idle state; if it is 1, the bus is currently inputting data; if it is 2, the bus is currently outputting data. The element in the Buffer column records the serial number of the execution unit into which data is currently being input or from which data is being output, for example 1 when execution unit Input1 is used and 2 when execution unit Input2 is used, matching the numbers of the execution units that were divided out. When the bus is idle, the value in the Buffer column has no meaning. When the device running load flag sequence indicates that a compute execution unit, data input execution unit or data output execution unit of a running device is occupied, that execution unit or area cannot be reused. When a running device occupies the bus, it cannot apply for the bus again, and when the accumulated bus occupancy of all devices reaches or exceeds the bus capacity, no device can apply for the bus. In the initial state, all elements are defined as 0. When a piece of task data needs to be scheduled, the scheduling system of the current execution body determines the processing complexity according to the nature of the task data and the like and selects a usable running device; there is usually a preferentially used device, and when all devices of that type are unavailable, devices of another type are used.
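A separate sketch, for illustration only, of the bus columns and the bus-capacity rule described in this paragraph; the bus capacity value, the None placeholder for an idle Buffer and all names are assumptions, not part of the original disclosure:

```python
# Direction is 0 (idle), 1 (inputting) or 2 (outputting); Buffer records which
# Input/Output unit the transfer uses. At most BUS_CAPACITY devices may read
# or write at the same time.
BUS_IDLE, BUS_INPUT, BUS_OUTPUT = 0, 1, 2
BUS_CAPACITY = 2                       # illustrative value, not from the patent

bus_state = {                          # one row per GPU running device
    "GPU:b1": {"Direction": BUS_IDLE, "Buffer": None},
    "GPU:b2": {"Direction": BUS_IDLE, "Buffer": None},
}

def bus_in_use() -> int:
    """Number of devices currently occupying the bus."""
    return sum(1 for row in bus_state.values() if row["Direction"] != BUS_IDLE)

def start_transfer(device: str, direction: int, buffer_no: int) -> bool:
    """Occupy the bus for an input (1) or output (2) transfer, if capacity allows."""
    if bus_state[device]["Direction"] != BUS_IDLE or bus_in_use() >= BUS_CAPACITY:
        return False                   # bus occupied or capacity reached -> wait
    bus_state[device] = {"Direction": direction, "Buffer": buffer_no}
    return True

def finish_transfer(device: str) -> None:
    """Release the bus once the transfer completes."""
    bus_state[device] = {"Direction": BUS_IDLE, "Buffer": None}
```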
It should be noted that, while task data is being read, the execution unit into which it is read should be assigned the task number. While the task data is being processed, the corresponding compute execution unit and the execution units holding the read and written data should be assigned the task number. After the task data processing is completed, the execution unit holding the written data should keep the task number until the data has been written back to main memory. When the resources in an execution unit are no longer used, the execution unit should be assigned the value 0. While a data transfer is in progress, the corresponding bus execution unit columns are assigned values according to the transfer direction and the area number, and they are set to 0 when the transfer completes.
In one embodiment of the present invention, for further definition and explanation, the device function types include a first device type and a second device type, and the step 201 of creating an abstract model of the running device based on the device function types specifically includes: determining the abstract number of hardware devices corresponding to each operation device belonging to the first device type and the second device type; dividing the running equipment to obtain an equipment execution unit comprising a storage function and a calculation function and a bus execution unit; and establishing a first device abstraction model and a second device abstraction model matched with the hardware device abstraction number based on the device execution unit and the bus execution unit.
In the embodiment of the invention, in order to improve the accuracy, flexibility and efficiency of task data scheduling, the abstract models are built according to the number of concrete hardware devices of the first device type and the second device type, so that a matching number of abstract models is established. The first device type is the CPU device type, and the second device type includes other devices such as the GPU, FPGA and DSP device types. Meanwhile, because the heterogeneous computing process accesses input storage areas and output storage areas, each running device should divide out at least two input storage areas and two output storage areas so that reading and writing can alternate without interfering with other simultaneous data reads and writes; the execution areas in each running device are therefore divided to obtain device execution units covering the storage and computation functions. The device execution units include compute execution units and storage execution units (input data storage execution units and output data storage execution units), and the compute function may be divided into several units or a single unit; the embodiment of the invention is not specifically limited in this respect. The first device abstract model is the CPU device abstract model, and the second device abstract model covers the other device types such as the GPU, FPGA and DSP device types.
In one embodiment of the present invention, for further definition and explanation, step 203, after generating a device running load flag sequence based on the device running idle state and the bus running idle state, further comprises: and when detecting that the device execution unit and/or the bus execution unit in the device operation load mark sequence execute task data, configuring an occupation mark of the device execution unit and/or the bus execution unit in the device operation load mark sequence based on the task data.
In the embodiment of the invention, in order to realize the intelligence and the accuracy of scheduling the task data and avoid the situation that multiple tasks occupy the same execution unit, the device execution unit and/or the bus execution unit detect whether the task data is executed in real time, if the device execution unit and the bus execution unit execute the task data, the occupation identifiers of the device execution unit and the bus execution unit in the device operation load mark sequence are configured based on the task data, wherein the occupation identifiers can be any type of identifiers.
It should be noted that every device and every bus is defined as idle when the device running load flag sequence is first generated; that is, when no task data is being computed, the corresponding compute execution unit is defined as idle, and when an input or output storage execution unit is neither serving as the input or output of a compute execution unit, nor waiting for data to be input from main memory, nor waiting for data to be output to main memory, the corresponding storage area of the running device is idle. A CPU device can access main memory directly at any time, so the input and output storage execution units of a CPU device can always be regarded as idle. In addition, because the bus bandwidth is limited, the computation process of any device may be idle or busy at any moment, but only a limited number of GPU devices, referred to as the bus capacity, can perform read or write processes simultaneously.
In an embodiment of the present invention, for further defining and explaining, if the target execution unit matching with the task data is found to have no occupation identifier marked based on the device operation load marking sequence, the method further includes, before executing the task data based on the target execution unit and marking the occupation identifier for the target execution unit: if the equipment operation load mark sequence of the target operation equipment is the equipment operation load mark sequence matched with the first equipment type, judging whether all calculation execution units in the equipment operation load mark sequence are in an occupied state or not, and calling the equipment operation load mark sequence matched with the second equipment type to search the calculation execution units when all calculation execution units are in the occupied state; or if the equipment operation load mark sequence of the target operation equipment is called as the second equipment type, judging whether all bus execution units and/or all storage execution units in the equipment operation load mark sequence are in an occupied state, and calling the equipment operation load mark sequence matched with the second equipment type to search the bus execution units and/or the storage execution units when all bus execution units and/or all storage execution units are in the occupied state.
In the embodiment of the present invention, the running device is further defined to include a first device type and a second device type, so that after the running load flag sequence of the device matching the first device type is called based on the device function type, or the running load flag sequence of the device matching the second device type is called, the scheduling is performed according to different device types.
Specifically, if the device running load flag sequence called for the target running device is the one matched with the first device type (for example, the CPU device type), the scheduling judgment is made for the CPU device preferentially: it is judged whether the columns corresponding to the compute execution units in the flag sequence table contain an unoccupied entry, that is, a 0 representing the idle state. If so, an idle compute execution unit can be selected at random and occupied for the computation. Otherwise, if all columns corresponding to the compute execution units are marked with non-zero values, the compute execution units of the CPU device are all occupied by tasks, and the GPU devices are used instead to find a compute execution unit, that is, the device running load flag sequence matched with the second device type (for example, the GPU device type) is called.
Specifically, if the device running load flag sequence called for the target running device is the one matched with the second device type (for example, the GPU device type), the GPU device is scheduled preferentially: it is judged whether all bus execution units and/or all storage execution units are in an occupied state. If the capacity in the column corresponding to the bus execution unit is not full and an input data execution unit is idle, the bus is available, so an execution unit is selected at random, preferring devices in which neither the compute execution unit nor the output data area is fully occupied, and the data-reading flow is executed using the idle input data execution unit. Otherwise, if the bus occupation identifier is present, the bus is occupied or no input data execution unit is available, and the CPU device should be used instead, that is, the device running load flag sequence matched with the first device type (such as the CPU device type) is called to find a bus execution unit and/or a storage execution unit.
In the embodiment of the invention, the CPU-preferred scheduling judgment is referred to as flow a and the GPU-preferred scheduling judgment as flow b. If, during the scheduling judgment of flow b, the input data execution units of the GPU devices are all occupied at the same time, the GPU devices are determined to be blocked and the task waits for the corresponding resources to become idle. Because a CPU device can generally obtain the task data directly, its data-reading and data-writing processes, and the bus occupancy problem, need not be considered; when scheduling task data in heterogeneous computation, the compute execution unit in the corresponding device running load flag sequence table should therefore be assigned the task number, and after the task data processing is finished, the task number serving as the occupation identifier is reset to 0 to indicate the idle state. Correspondingly, if, during the scheduling judgment of flow a, the compute execution units of the CPU devices are all occupied at the same time, the CPU devices are determined to be blocked and the task waits for the corresponding resources to become idle. When reading of the data is finished, if the CPU device has an idle compute execution unit and an idle output data execution unit, the compute execution unit can be scheduled to perform the computation and the result is stored in the output data execution unit. After the task data processing is completed, the current execution end decides whether to write the data back to main memory according to the bus occupancy. Whenever any execution step cannot proceed because the resources it needs are occupied, it is determined to be blocked and waits until the resources become idle.
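The following sketch, under the assumed table layout of the earlier sketches, illustrates the two scheduling-judgment flows described above (flow a: CPU preferred with GPU fallback; flow b: GPU preferred with CPU fallback); the function names and the random choice among idle units are assumptions, not part of the original disclosure:

```python
import random

def idle_units(table, prefix):
    """All (device, unit) pairs whose name starts with `prefix` and whose
    occupation identifier is 0, i.e. idle."""
    return [(dev, unit) for dev, units in table.items()
            for unit, flag in units.items()
            if unit.startswith(prefix) and flag == 0]

def schedule_flow_a(cpu_table, gpu_table):
    """Flow a (CPU preferred): pick an idle CPU compute unit at random; if every
    CPU compute unit carries an occupation identifier, fall back to the GPUs."""
    for table in (cpu_table, gpu_table):
        candidates = idle_units(table, "Cmpt")
        if candidates:
            return random.choice(candidates)
    return None     # blocked: wait until a compute execution unit becomes idle

def schedule_flow_b(gpu_table, cpu_table):
    """Flow b (GPU preferred): a GPU device is usable only if its bus is idle
    (Direction == 0) and it has an idle Input storage unit; otherwise the CPU
    device is used instead."""
    for dev, units in gpu_table.items():
        bus_idle = units.get("Direction", 0) == 0
        input_idle = any(flag == 0 for unit, flag in units.items()
                         if unit.startswith("Input"))
        if bus_idle and input_idle:
            return dev, "read"         # start the data-reading flow on this GPU
    candidates = idle_units(cpu_table, "Cmpt")
    return random.choice(candidates) if candidates else None   # CPU fallback or wait
```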
In one embodiment of the present invention, for further definition and explanation, step 102 of parsing the device function type of the target running device and calling a device running load flag sequence matched with the target running device based on the device function type includes: if the compute execution unit found through the device running load flag sequence matched with the second device type is in an occupied state, instructing the task data to enter a waiting operation; or, if the bus execution unit and/or the storage execution unit found through the device running load flag sequence matched with the second device type is in an occupied state, instructing the task data to enter a waiting operation.
In the embodiment of the invention, in order to let each execution unit process task data better, during the allocation of compute execution units, bus execution units and storage execution units, if the compute execution unit found through the device running load flag sequence matched with the second device type is occupied, no compute execution unit is available for execution, so the current execution body instructs the task data to wait. Correspondingly, if the bus execution unit and/or storage execution unit found through the device running load flag sequence matched with the second device type is occupied, no bus execution unit and/or storage execution unit is available, so the current execution body instructs the task data to wait.
In one embodiment of the present invention, for further limitation and illustration, as shown in fig. 3, determining the target running device based on the processing complexity of the task data in step 101 includes: 1011. determining processing complexity based on data processing characteristics of the task data; 1012. and selecting a scheduling strategy according to the processing complexity of the task data, and determining target operation equipment according to the selected scheduling strategy.
Because task data may be divisible, in order to schedule it accurately to the data processing of the corresponding execution unit, in the embodiment of the invention the processing complexity is determined from the data processing characteristics of the task data, and the scheduling policy is then selected based on that complexity. The data processing characteristics represent at least one of the time consumption, the occupied space and the processing mode of the data processing, and are used to determine how the task complexity changes. For example, the complexity of data transmission can be estimated as the amount of data transferred divided by the data bus bandwidth, giving the transmission time, and the complexity of data computation can be estimated as the number of operations of the task, such as the four arithmetic operations, divided by the performance index of the device (a short sketch of these estimates follows the list below). Meanwhile, since the processing complexity can be classified as constant task complexity or varying task complexity, when determining the target running device, a scheduling policy is first determined from the processing complexity; the scheduling policy includes a static scheduling policy and a dynamic scheduling policy, so that running devices of different device function types are determined under different policies. The static scheduling policy is a scheduling mode in which the target running device is determined by analysing the task data in advance, and the dynamic scheduling policy is a scheduling mode in which the target running device is determined from the occupation identifiers in the device running load flag sequence. Combining the different scheduling modes with the different processing complexities gives running devices of different device function types, in specifically four combinations:
1. If the task complexity does not change, a static scheduling policy may be selected to determine the target running device. The static policy is a scheduling mode in which the target running device is determined by analysis in advance, so the target running device can be selected beforehand by combining the task data with the properties, running state and running instances of each running device. Alternatively, if certain task data runs in much less time on device GPU:b1 than on the other devices, such task data may be fixed in advance to run on GPU:b1; the embodiment of the invention is not specifically limited in this respect.
2. If the task complexity does not change, a dynamic scheduling policy may be selected to determine the target running device. The dynamic scheduling policy determines the target running device from the occupation identifiers in the device running load flag sequence, so before the scheduling process of the embodiment of the invention is executed, the task data can be scheduled at random onto different running devices and the running times recorded.
3. If the task complexity changes, a static scheduling policy may be selected to determine the target running device. Static scheduling is a scheduling mode in which the target running device is determined by analysis in advance, so historical processing information for task data of the same type can be collected beforehand, a machine learning model trained in advance on that historical information predicts the processing complexity or a processing result for the current task data, and the prediction is used to choose a suitable running device during actual operation.
4. If the task complexity changes, a dynamic scheduling policy may be selected to determine the target running device. The dynamic scheduling policy determines the target running device from the occupation identifiers in the device running load flag sequence; meanwhile, because the processing complexity of most task data does not change drastically across several consecutive pieces of task data, the current execution end can schedule the task data onto different running devices once every stage, so as to estimate its running efficiency and provide a basis for selecting running devices for other task data.
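As mentioned before the list, the complexity estimates can be expressed very simply; the following sketch (the function names and the example numbers are illustrative assumptions) computes the transmission-time and computation-time estimates:

```python
def transmission_time(data_bytes: float, bus_bandwidth_bytes_per_s: float) -> float:
    """Transmission complexity: amount of data transferred / data bus bandwidth."""
    return data_bytes / bus_bandwidth_bytes_per_s

def computation_time(op_count: float, device_ops_per_s: float) -> float:
    """Computation complexity: number of operations / device performance index."""
    return op_count / device_ops_per_s

# Example: a 256 MB transfer over a 16 GB/s bus, and 2 GFLOP of work on a
# device rated at 10 TFLOPS (illustrative numbers only).
print(transmission_time(256e6, 16e9))   # 0.016 s
print(computation_time(2e9, 10e12))     # 0.0002 s
```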
It should be noted that, in the embodiment of the present invention, when selecting the scheduling policy according to the processing complexity of the task data, either the static or the dynamic scheduling policy may be chosen on its own, or the dynamic scheduling policy may be combined with the static one, so that the running device on which the task data runs best is selected as the target running device and the task data is not scheduled onto a non-optimal device. For example, after the dynamic scheduling policy has been determined from the processing complexity, target running devices 1, 3 and 5 may be selected, and these may then be screened again using the static complexity so that only one or a few optimal devices remain for scheduling.
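A minimal sketch of this combined screening, assuming the dynamic step yields a set of currently available devices and the static pre-analysis yields a ranking; the device names and the keep parameter are illustrative only:

```python
def combine_policies(dynamic_candidates, static_ranking, keep=1):
    """Keep the `keep` statically best devices among those the dynamic policy
    found available (i.e. not fully occupied in the load flag sequence)."""
    available = set(dynamic_candidates)
    ranked = [dev for dev in static_ranking if dev in available]
    return ranked[:keep]

# Devices 1, 3 and 5 are dynamically available; the static analysis prefers 3.
print(combine_policies(["dev1", "dev3", "dev5"], ["dev3", "dev5", "dev1", "dev2"]))
# -> ['dev3']
```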
In one embodiment of the present invention, for further limitation and explanation, the step 101 of obtaining task data to be scheduled specifically includes: and responding to the task data processing request, analyzing and splitting the task data, and determining at least one piece of task data after analyzing and splitting as task data to be scheduled.
In the embodiment of the invention, in order to process task data accurately and improve the accuracy and effectiveness of scheduling, when a task data processing request is received, the task data is analysed and split. Analysing and splitting refers to decomposing the computation process of the task data: different types of task data can be split in different ways, for example image task data can be split into image blocks, video task data can be split into individual frames, and fluid-mesh task data can be split into region data; the embodiment of the invention is not specifically limited in this respect.
It should be noted that, after the task data is split, each resulting sub-task can be scheduled as an individual piece of task data, so that it is dynamically scheduled onto a running device whenever that device is idle. Meanwhile, for the task data it is preferable to provide compilable code suitable for each system that performs the scheduling, so that the splitting can be carried out efficiently; a sketch of such a split is given below.
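A minimal sketch of such an analysis-and-split step for image task data (the block size and helper name are assumptions; video or fluid-mesh data would be split analogously):

```python
# Split a 2-D image (list of rows) into independently schedulable tiles, each
# of which can then be dispatched to an idle execution unit as one sub-task.
def split_image(image, block_h, block_w):
    tiles = []
    for top in range(0, len(image), block_h):
        for left in range(0, len(image[0]), block_w):
            tiles.append([row[left:left + block_w]
                          for row in image[top:top + block_h]])
    return tiles

image = [[y * 4 + x for x in range(4)] for y in range(4)]   # toy 4x4 image
subtasks = split_image(image, 2, 2)      # 4 tiles, each scheduled as one task
print(len(subtasks), subtasks[0])
```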
In one application scenario of the embodiment of the present invention, the device running idle state and the bus running idle state are defined for each execution unit in the abstract models of GPU:b1 and GPU:b2, and the device running load flag sequence tables corresponding to GPU:b1 and GPU:b2 are generated from these states, giving the initial state shown in Table 3. The table contains, for each of GPU:b1 and GPU:b2, input storage execution units 1 and 2, output storage execution units 1 and 2 and a compute execution unit, together with the bus execution unit columns Direction and Buffer.
TABLE 3 Table 3
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 0 0 0 0 0 0 -
GPU:b2 0 0 0 0 0 0 -
When task data 1 and task data 2 are transferred from the host CPU:a1 to the Input1 execution units of GPU:b1 and GPU:b2 respectively, the elements of the Input1 execution units of GPU:b1 and GPU:b2 are configured with the occupation identifiers, namely the serial numbers of task 1 and task 2, as shown in Table 4; at the same time the bus execution units are occupied, with the Direction columns set to 1 and the Buffer columns recording 1 for Input1 (a sketch of this transition follows Table 4).
TABLE 4 Table 4
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 1 0 0 0 0 1 1
GPU:b2 2 0 0 0 0 1 1
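For illustration only, the transition from Table 3 to Table 4 can be reproduced with the assumed dictionary layout of the earlier sketches; start_input and the field names are assumptions, not part of the original disclosure:

```python
# Task 1 and task 2 are transferred from the host into Input1 of GPU:b1 and
# GPU:b2, which sets the Input1 occupation identifiers and the bus columns
# (Direction = 1 for input, Buffer = 1 for the Input1 unit).
initial = {
    dev: {"Input1": 0, "Input2": 0, "Cmpt1": 0,
          "Output1": 0, "Output2": 0, "Direction": 0, "Buffer": 0}
    for dev in ("GPU:b1", "GPU:b2")
}                                          # Table 3: everything idle

def start_input(table, device, unit, task_id, buffer_no):
    table[device][unit] = task_id          # occupy the input storage unit
    table[device]["Direction"] = 1         # bus is inputting data
    table[device]["Buffer"] = buffer_no    # which Input unit the transfer uses

start_input(initial, "GPU:b1", "Input1", task_id=1, buffer_no=1)
start_input(initial, "GPU:b2", "Input1", task_id=2, buffer_no=1)
print(initial["GPU:b1"])   # matches the GPU:b1 row of Table 4
```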
The data transfer of task data 1 completes, task data 1 enters compute execution unit Cmpt1, and the result is stored in output storage execution unit Output1. Meanwhile, task data 3 is transferred from the host CPU:a1 into input storage execution unit Input2 of GPU:b1; the occupation identifiers are configured as shown in Table 5, and the element in the bus Buffer column records 2, the serial number of the execution unit into which data is currently being input.
TABLE 5
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 1 3 1 1 0 1 2
GPU:b2 2 0 0 0 0 1 1
The data transmission of task data 2 is completed; it enters the calculation execution unit Cmpt1, and its result will be stored in the output storage execution unit Output1. Meanwhile, task data 4 is transmitted from the host CPU:a1 to the input storage execution unit Input2 of GPU:b2; the configured occupation identifiers are shown in Table 6, and the element of the bus execution unit Buffer records the serial number 2 of the execution unit whose data is currently being input or output.
TABLE 6
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 1 3 1 1 0 1 2
GPU:b2 2 4 2 2 0 1 2
The calculation of task data 2 is completed. After the data transmission of task data 4 is completed, it enters the calculation execution unit, and its result will be stored in the output storage execution unit Output2. Task data 5 is transmitted from the host CPU:a1 to an idle input storage execution unit of GPU:b2; the configured occupation identifiers are shown in Table 7.
TABLE 7
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 1 3 1 1 0 1 2
GPU:b2 5 4 4 2 4 1 1
The data transmission of task data 5 is completed. The calculation result of task data 2 starts to be output. The calculation of task data 1 is completed. Task data 3 enters the calculation execution unit, and its result will be stored in the Output2 area. Task data 6 is transmitted from the host CPU:a1 to the input storage execution unit Input1 of GPU:b1; the configured occupation identifiers are shown in Table 8.
TABLE 8
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 6 3 6 1 6 1 6
GPU:b2 5 4 4 2 4 6 1
The calculations of task data 3 and 4 are completed. The output transmission of task data 2 is completed. Task data 5 enters the calculation execution unit, and its result will be stored in the data output execution unit Output1. Task data 7 is transmitted from the host CPU:a1 to the input storage execution unit Input2 of GPU:b2; the configured occupation identifiers are shown in Table 9.
TABLE 9
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 6 0 0 1 3 1 1
GPU:b2 5 7 5 5 4 1 2
The transmission of task data 6 is completed. The calculation result of task data 1 starts to be output; the configured occupation identifiers are shown in Table 10.
Table 10
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 6 0 0 1 3 2 1
GPU:b2 5 7 5 5 4 1 2
The transmissions of task data 1 and 7 are completed. Task data 6 enters the calculation execution unit, and its result will be stored in the data storage execution unit Output1. The calculation results of task data 3 and 4 start to be output; the configured occupation identifiers are shown in Table 11.
TABLE 11
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 6 0 0 6 3 2 2
GPU:b2 5 7 5 5 4 2 2
The calculations of task data 5 and 6 are completed; the configured occupation identifiers are shown in Table 12.
Table 12
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 0 0 0 6 3 2 2
GPU:b2 0 7 0 5 4 2 2
The output transmissions of task data 3 and 4 are completed, and the calculation results of task data 5 and 6 start to be output. Task data 7 enters the calculation execution unit, and its calculation result will be stored in the data output execution unit Output2; the configured occupation identifiers are shown in Table 13.
TABLE 13
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 0 0 0 6 0 2 1
GPU:b2 0 7 7 5 7 2 1
The output transmissions of task data 5 and 6 are completed, the calculation of task data 7 is completed, and its calculation result starts to be output; the configured occupation identifiers are shown in Table 14.
TABLE 14
The transmission of task data 7 is completed and its calculation is completed; the configured occupation identifiers are shown in Table 15. When task data arrives again, whether an execution unit in an idle state exists is determined according to the occupation identifiers recorded in the device operation load marking sequence table, so that an idle execution unit can be occupied and the occupied execution unit configured with the corresponding occupation identifier.
TABLE 15
Input1 Input2 Cmpt1 Output1 Output2 Direction Buffer
GPU:b1 0 0 0 6 0 0 -
GPU:b2 0 0 0 0 0 0 -
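The state transitions walked through in Tables 3 to 15 follow a fixed pattern: a task occupies an idle input storage execution unit while the bus transfers it in, then the calculation execution unit, then an output storage execution unit while the bus transfers the result out, and each identifier is cleared when its stage finishes. The following Python sketch of that pattern is only illustrative; the helper names and the exact encoding of Direction and Buffer are assumptions rather than the embodiment's implementation.

```python
# Hypothetical sketch of the marking pattern behind Tables 3-15: each
# stage occupies a unit with the task's serial number, and the bus
# elements record the transfer direction and the unit using the bus.

def start_input(table, dev, task_id):
    for unit in ("Input1", "Input2"):
        if table[dev][unit] == 0:            # idle input storage unit found
            table[dev][unit] = task_id       # occupy it with the task number
            table[dev]["Direction"] = 1      # bus busy: host to device
            table[dev]["Buffer"] = int(unit[-1])  # which unit uses the bus
            return unit
    return None                              # device cannot accept the task

def start_compute(table, dev, task_id):
    if table[dev]["Cmpt1"] == 0:             # calculation execution unit idle
        table[dev]["Cmpt1"] = task_id
        return True
    return False

def start_output(table, dev, task_id):
    for unit in ("Output1", "Output2"):
        if table[dev][unit] == 0:            # idle output storage unit found
            table[dev][unit] = task_id
            table[dev]["Direction"] = 2      # bus busy: device to host
            table[dev]["Buffer"] = int(unit[-1])
            return unit
    return None

def finish(table, dev, unit):
    table[dev][unit] = 0                     # stage done: unit idle again
```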
Compared with the prior art, in the task data scheduling method provided by the embodiment of the present invention, the task data to be scheduled is obtained and the target operation equipment is determined based on the processing complexity of the task data; the equipment function type of the target operation equipment is parsed, and an equipment operation load marking sequence matched with the target operation equipment is called based on the equipment function type, wherein the equipment operation load marking sequence is used for representing whether different execution units are occupied and an occupied execution unit is marked with a corresponding occupation identifier; and if a target execution unit which matches the task data and is not marked with an occupation identifier is found based on the equipment operation load marking sequence, the task data is executed on that target execution unit and the occupation identifier is marked for it. This satisfies the requirement of accurately scheduling tasks while different operation equipment is in a load state, avoids wasting task-scheduling resources when the computation load is uneven, and improves both the accuracy and the efficiency of task data scheduling.
Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides a task data scheduling device, as shown in fig. 4, where the device includes:
an obtaining module 31, configured to obtain task data to be scheduled, and determine a target operation device based on processing complexity of the task data;
the retrieving module 32 is configured to parse a device function type of the target operation device, and retrieve a device operation load tag sequence matched with the target operation device based on the device function type, where the device operation load tag sequence is used to characterize a situation that different execution units are occupied, and the occupied execution units are marked by corresponding occupancy identifiers;
and the execution module 33 is configured to execute the task data based on the target execution unit and tag the occupation identifier for the target execution unit if the target execution unit matched with the task data is found to be unmarked for occupation based on the device operation load tag sequence.
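A minimal sketch of the execution module's core operation, assuming the dictionary-style load table used earlier, might look as follows; try_execute and the dispatch hook are hypothetical names, not the embodiment's API.

```python
# Hypothetical sketch of the execution module: execute the task only if
# the matched target execution unit carries no occupation identifier,
# and mark the identifier when it does execute.

def try_execute(load_tables, target_dev, target_unit, task_id, dispatch=print):
    units = load_tables[target_dev]
    if units.get(target_unit) == 0:        # unit exists and is not marked
        units[target_unit] = task_id       # tag the occupation identifier
        dispatch(target_dev, target_unit, task_id)  # hand the task over
        return True
    return False                           # occupied: caller must reschedule
```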
Further, the apparatus further comprises:
the establishing module is used for determining the equipment function type of each operation equipment and establishing an abstract model of the operation equipment based on the equipment function type, wherein the abstract model comprises at least two execution units;
The definition module is used for defining an equipment operation idle state and a bus operation idle state based on each execution unit in the abstract model;
the generating module is used for generating a device operation load marking sequence based on the device operation idle state and the bus operation idle state, wherein the device operation load marking sequence comprises content of whether the device execution unit is marked with occupied marks or not and content of whether the bus execution unit is marked with occupied marks or not.
Further, the device function type includes a first device type and a second device type, and the establishing module includes:
the determining unit is used for determining the hardware device abstract number corresponding to each operation device belonging to the first device type and the second device type;
the dividing unit is used for dividing the execution area of the running equipment to obtain an equipment execution unit comprising a storage function and a calculation function and a bus execution unit;
the building unit is used for building a first device abstract model and a second device abstract model matched with the hardware device abstract number based on the device executing unit and the bus executing unit.
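As an illustrative sketch only, the abstract models built by these units could be represented as follows, assuming the execution-unit layout of the GPU example above also applies to first-type devices; the class and function names are hypothetical.

```python
# Hypothetical sketch: build per-device abstract models containing device
# execution units (storage and calculation) plus bus execution units,
# keyed by the hardware device abstract number.

from dataclasses import dataclass, field

@dataclass
class AbstractModel:
    device_id: str      # hardware device abstract number, e.g. "GPU:b1"
    device_type: str    # "first" (e.g. CPU) or "second" (e.g. GPU/FPGA/DSP)
    storage_units: list = field(default_factory=lambda: ["Input1", "Input2",
                                                         "Output1", "Output2"])
    compute_units: list = field(default_factory=lambda: ["Cmpt1"])
    bus_units: list = field(default_factory=lambda: ["Direction", "Buffer"])

def build_models(first_type_ids, second_type_ids):
    models = [AbstractModel(dev, "first") for dev in first_type_ids]
    models += [AbstractModel(dev, "second") for dev in second_type_ids]
    return models

models = build_models(["CPU:a1"], ["GPU:b1", "GPU:b2"])
```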
Further, the generating module is specifically configured to, when it is detected that a device execution unit and/or a bus execution unit in the device operation load marking sequence executes task data, configure the occupation identifier of that device execution unit and/or bus execution unit in the device operation load marking sequence based on the task data.
Further, the calling module includes:
the first calling unit is used for, if the equipment operation load mark sequence of the target operation equipment is the equipment operation load mark sequence matched with the first equipment type, judging whether all calculation execution units in the equipment operation load mark sequence are in an occupied state, so that when all calculation execution units are in the occupied state, the equipment operation load mark sequence matched with the second equipment type is called to search for a calculation execution unit; or,
and the second calling unit is used for, if the equipment operation load mark sequence of the target operation equipment is the equipment operation load mark sequence matched with the second equipment type, judging whether all bus execution units and/or all storage execution units in the equipment operation load mark sequence are in an occupied state, so that when all bus execution units and/or all storage execution units are in the occupied state, the equipment operation load mark sequence matched with the first equipment type is called to search for a bus execution unit and/or a storage execution unit.
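A hedged sketch of this fallback between the two device-type sequences, consistent with claim 3 and assuming the dictionary-style tables used earlier, is given below; pick_sequence and the unit-name prefixes are hypothetical.

```python
# Hypothetical sketch of the calling-module fallback: when the target
# device type's own mark sequence is fully occupied for the units the
# task needs, consult the sequence of the other device type instead.

def pick_sequence(target_type, sequences):
    """sequences: {"first": table, "second": table}, where each table maps
    device id -> {execution unit name: occupation identifier}."""
    own = sequences[target_type]
    other = sequences["second" if target_type == "first" else "first"]

    def all_occupied(table, prefix):
        values = [v for dev in table.values()
                  for unit, v in dev.items() if unit.startswith(prefix)]
        return bool(values) and all(v != 0 for v in values)

    if target_type == "first":
        # all calculation execution units busy: use the second-type sequence
        return other if all_occupied(own, "Cmpt") else own
    # second type: all bus and/or storage execution units busy: first type
    busy = (all_occupied(own, "Direction")
            or all_occupied(own, "Input") or all_occupied(own, "Output"))
    return other if busy else own
```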
Further, the acquisition module includes:
the computing unit is used for determining processing complexity based on data processing characteristics of the task data, wherein the data processing characteristics are used for representing at least one of time consumption, occupied space and processing mode of the task data for data processing;
the determining unit is used for selecting a scheduling strategy according to the processing complexity of the task data and determining target operation equipment according to the selected scheduling strategy, wherein the scheduling strategy comprises a static scheduling strategy and a dynamic scheduling strategy, the static scheduling strategy is used for determining the scheduling mode of the target operation equipment based on the pre-analysis of the task data, and the dynamic scheduling strategy is used for determining the scheduling mode of the target operation equipment based on the occupation identification in the equipment operation load mark sequence.
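The scoring itself is not prescribed by the embodiment, so the following Python sketch only illustrates one assumed way of turning the data processing characteristics into a processing-complexity value and choosing between the static and dynamic policies; the weights, threshold and mode labels are invented for the example.

```python
# Hypothetical sketch: turn the data processing characteristics (time
# consumption, occupied space, processing mode) into a complexity score
# and choose a scheduling policy. Weights and thresholds are invented.

WEIGHTS = {"time": 0.5, "space": 0.3, "mode": 0.2}
MODE_COST = {"streaming": 0.2, "batch": 0.5, "iterative": 0.9}

def processing_complexity(features):
    """features: {"time_cost": seconds, "space_cost": bytes, "mode": str}"""
    time_score = min(features["time_cost"] / 10.0, 1.0)     # clamp to [0, 1]
    space_score = min(features["space_cost"] / 2**30, 1.0)  # relative to 1 GiB
    mode_score = MODE_COST.get(features["mode"], 0.5)
    return (WEIGHTS["time"] * time_score
            + WEIGHTS["space"] * space_score
            + WEIGHTS["mode"] * mode_score)

def choose_policy(complexity, threshold=0.5):
    # low complexity: a static, pre-analysed assignment is sufficient;
    # high complexity: consult the occupation identifiers dynamically
    return "static" if complexity < threshold else "dynamic"
```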
Further, the acquisition module further includes:
the splitting unit is used for responding to a task data processing request, parsing and splitting the task data, and determining at least one piece of the parsed and split task data as the task data to be scheduled, wherein parsing and splitting refers to the process of analysing the calculation process of the task data and dividing it accordingly.
Compared with the prior art, in the task data scheduling device provided by the embodiment of the present invention, the task data to be scheduled is obtained and the target operation equipment is determined based on the processing complexity of the task data; the equipment function type of the target operation equipment is parsed, and an equipment operation load marking sequence matched with the target operation equipment is called based on the equipment function type, wherein the equipment operation load marking sequence is used for representing whether different execution units are occupied and an occupied execution unit is marked with a corresponding occupation identifier; and if a target execution unit which matches the task data and is not marked with an occupation identifier is found based on the equipment operation load marking sequence, the task data is executed on that target execution unit and the occupation identifier is marked for it. This satisfies the requirement of accurately scheduling tasks while different operation equipment is in a load state, avoids wasting task-scheduling resources when the computation load is uneven, and improves both the accuracy and the efficiency of task data scheduling.
According to one embodiment of the present invention, there is provided a storage medium storing at least one executable instruction for performing the task data scheduling method in any of the above-described method embodiments.
Fig. 5 is a schematic structural diagram of a terminal according to an embodiment of the present invention, and the specific embodiment of the present invention is not limited to the specific implementation of the terminal.
As shown in fig. 5, the terminal may include: a processor 402, a communication interface (Communications Interface) 404, a memory 406, and a communication bus 408.
Wherein: processor 402, communication interface 404, and memory 406 communicate with each other via communication bus 408.
A communication interface 404 for communicating with network elements of other devices, such as clients or other servers.
The processor 402 is configured to execute the program 410, and may specifically perform relevant steps in the above-described task data scheduling method embodiment.
In particular, program 410 may include program code including computer-operating instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the terminal may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
The memory 406 is used for storing the program 410. The memory 406 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
Program 410 may be specifically operable to cause processor 402 to:
acquiring task data to be scheduled, and determining target operation equipment based on the processing complexity of the task data;
analyzing the equipment function type of the target operation equipment, and calling an equipment operation load marking sequence matched with the target operation equipment based on the equipment function type, wherein the equipment operation load marking sequence is used for representing the occupied condition of different execution units, and the occupied execution units are marked by corresponding occupied identifiers;
and if the target execution unit matched with the task data is found out based on the equipment operation load marking sequence and the occupation mark is not marked, executing the task data based on the target execution unit and marking the occupation mark for the target execution unit.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of computing devices. Optionally, they may be implemented as program code executable by computing devices, so that they can be stored in a storage device and executed by the computing devices, and in some cases the steps shown or described may be performed in an order different from that described here. Alternatively, they may be fabricated separately as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A method for scheduling task data, comprising:
acquiring task data to be scheduled, and determining target operation equipment based on the processing complexity of the task data;
analyzing the equipment function type of the target operation equipment, and calling an equipment operation load marking sequence matched with the target operation equipment based on the equipment function type, wherein the equipment operation load marking sequence is used for representing the occupied condition of different execution units, and the occupied execution units are marked by corresponding occupied identifiers;
if the target execution unit matched with the task data is found out based on the equipment operation load marking sequence and does not mark the occupied identification, executing the task data based on the target execution unit and marking the occupied identification for the target execution unit;
Before the task data to be scheduled is obtained and the target running device is determined based on the processing complexity of the task data, the method further comprises:
determining the equipment function type of each operation equipment, and establishing an abstract model of the operation equipment based on the equipment function type, wherein the abstract model comprises at least two execution units;
defining a device running idle state and a bus running idle state based on each execution unit in the abstract model, wherein the execution unit for running the device comprises a bus execution unit and a device execution unit;
and generating a device operation load marking sequence based on the device operation idle state and the bus operation idle state, wherein the device operation load marking sequence comprises the content of whether the device execution unit is marked with the occupied mark or not and the content of whether the bus execution unit is marked with the occupied mark or not.
2. The method of claim 1, wherein the device function types comprise a first device type, a second device type, and wherein building the abstract model of the running device based on the device function types comprises:
determining the abstract number of hardware devices corresponding to each operation device belonging to the first device type and the second device type;
Dividing the running equipment to obtain an equipment execution unit comprising a storage function and a calculation function and a bus execution unit;
based on the device executing unit and the bus executing unit, a first device abstract model and a second device abstract model which are matched with the hardware device abstract number are established,
wherein the first device type includes a CPU and the second device type includes a GPU, an FPGA, or a DSP.
3. The method of claim 2, wherein, before the task data is executed based on the target execution unit and the occupation identifier is marked for the target execution unit in response to finding, based on the equipment operation load mark sequence, a target execution unit which is matched with the task data and is not marked with an occupation identifier, the method further comprises:
if the equipment operation load mark sequence of the target operation equipment is the equipment operation load mark sequence matched with the first equipment type, judging whether all calculation execution units in the equipment operation load mark sequence are in an occupied state, and calling the equipment operation load mark sequence matched with the second equipment type to search the calculation execution units when all calculation execution units are in the occupied state; or,
if the equipment operation load mark sequence of the target operation equipment is the equipment operation load mark sequence matched with the second equipment type, judging whether all bus execution units and/or all storage execution units in the equipment operation load mark sequence are in an occupied state, and calling the equipment operation load mark sequence matched with the first equipment type to search the bus execution units and/or the storage execution units when all bus execution units and/or all storage execution units are in the occupied state.
4. The method of claim 1, wherein the generating a device running load flag sequence based on the device running idle state and the bus running idle state comprises:
and when detecting that the device execution unit and/or the bus execution unit in the device operation load mark sequence execute task data, configuring an occupation mark of the device execution unit and/or the bus execution unit in the device operation load mark sequence based on the task data.
5. The method of any of claims 1-4, wherein the determining a target operating device based on the processing complexity of the task data comprises:
Determining processing complexity based on data processing characteristics of the task data, wherein the data processing characteristics are used for representing at least one of time consumption, occupied space and processing mode of the task data for data processing;
selecting a scheduling strategy according to the processing complexity of the task data, and determining target operation equipment according to the selected scheduling strategy, wherein the scheduling strategy comprises a static scheduling strategy and a dynamic scheduling strategy, the static scheduling strategy is a scheduling mode of the target operation equipment determined by pre-analyzing based on the task data, and the dynamic scheduling strategy is a scheduling mode of the target operation equipment determined based on an occupation identifier in an equipment operation load mark sequence.
6. The method according to any one of claims 1-5, wherein the obtaining task data to be scheduled comprises:
responding to a task data processing request, analyzing and splitting the task data, determining at least one piece of task data after analyzing and splitting as task data to be scheduled, wherein the analyzing and splitting is used for representing the process of analyzing and splitting the calculation process of the task data.
7. A task data scheduling apparatus, comprising:
The acquisition module is used for acquiring task data to be scheduled and determining target operation equipment based on the processing complexity of the task data;
the retrieving module is used for parsing an equipment function type of the target operation equipment and calling an equipment operation load marking sequence matched with the target operation equipment based on the equipment function type, wherein the equipment operation load marking sequence is used for representing the occupied condition of different execution units, and the occupied execution units are marked by corresponding occupation identifiers;
the execution module is used for, if a target execution unit which is matched with the task data and is not marked with an occupation identifier is found based on the equipment operation load marking sequence, executing the task data based on the target execution unit and marking the occupation identifier for the target execution unit;
the apparatus further comprises:
the establishing module is used for determining the equipment function type of each operation equipment and establishing an abstract model of the operation equipment based on the equipment function type, wherein the abstract model comprises at least two execution units;
the definition module is used for defining an equipment operation idle state and a bus operation idle state based on each execution unit in the abstract model;
The generating module is used for generating a device operation load marking sequence based on the device operation idle state and the bus operation idle state, wherein the device operation load marking sequence comprises content of whether the device execution unit is marked with occupied marks or not and content of whether the bus execution unit is marked with occupied marks or not.
8. A storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the scheduling method of task data according to any one of claims 1-5.
9. A terminal, comprising: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is configured to store at least one executable instruction, where the executable instruction causes the processor to perform operations corresponding to the task data scheduling method according to any one of claims 1 to 5.
CN202110991440.8A 2021-08-26 2021-08-26 Task data scheduling method and device Active CN113835852B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110991440.8A CN113835852B (en) 2021-08-26 2021-08-26 Task data scheduling method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110991440.8A CN113835852B (en) 2021-08-26 2021-08-26 Task data scheduling method and device

Publications (2)

Publication Number Publication Date
CN113835852A CN113835852A (en) 2021-12-24
CN113835852B true CN113835852B (en) 2024-04-12

Family

ID=78961442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110991440.8A Active CN113835852B (en) 2021-08-26 2021-08-26 Task data scheduling method and device

Country Status (1)

Country Link
CN (1) CN113835852B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018076238A1 (en) * 2016-10-27 2018-05-03 华为技术有限公司 Heterogeneous system, computation task assignment method and device
CN110018893A (en) * 2019-03-12 2019-07-16 平安普惠企业管理有限公司 A kind of method for scheduling task and relevant device based on data processing
CN113127160A (en) * 2019-12-30 2021-07-16 阿里巴巴集团控股有限公司 Task scheduling method, system and equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6562093B2 (en) * 2018-01-23 2019-08-21 日本電気株式会社 System management device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018076238A1 (en) * 2016-10-27 2018-05-03 华为技术有限公司 Heterogeneous system, computation task assignment method and device
CN110018893A (en) * 2019-03-12 2019-07-16 平安普惠企业管理有限公司 A kind of method for scheduling task and relevant device based on data processing
CN113127160A (en) * 2019-12-30 2021-07-16 阿里巴巴集团控股有限公司 Task scheduling method, system and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Feature-aware task scheduling on CPU-FPGA heterogeneous platforms; Peilun Du, et al.; 2019 IEEE 21st International Conference on High Performance Computing and Communications; full text *
Research on clustering-based scheduling algorithms for distributed computing tasks in heterogeneous clusters; Xu Mingrui; Information Science and Technology (No. 2); full text *

Also Published As

Publication number Publication date
CN113835852A (en) 2021-12-24

Similar Documents

Publication Publication Date Title
US10114682B2 (en) Method and system for operating a data center by reducing an amount of data to be processed
US9495206B2 (en) Scheduling and execution of tasks based on resource availability
CN109144710A (en) Resource regulating method, device and computer readable storage medium
CN106557369A (en) A kind of management method and system of multithreading
US20060161720A1 (en) Image data transmission method and system with DMAC
CN111078394B (en) GPU thread load balancing method and device
CN106569892B (en) Resource scheduling method and equipment
CN111506434B (en) Task processing method and device and computer readable storage medium
CN111104210A (en) Task processing method and device and computer system
CN111708639A (en) Task scheduling system and method, storage medium and electronic device
CN116263701A (en) Computing power network task scheduling method and device, computer equipment and storage medium
CN107977275B (en) Task processing method based on message queue and related equipment
CN114610472B (en) Multi-process management method in heterogeneous computing and computing equipment
CN113835852B (en) Task data scheduling method and device
CN116089477B (en) Distributed training method and system
CN110825502B (en) Neural network processor and task scheduling method for neural network processor
CN105912394B (en) Thread processing method and system
CN114741166A (en) Distributed task processing method, distributed system and first equipment
CN112114967A (en) GPU resource reservation method based on service priority
CN111782688A (en) Request processing method, device and equipment based on big data analysis and storage medium
CN110955461A (en) Processing method, device and system of computing task, server and storage medium
CN115292053B (en) CPU, GPU and NPU unified scheduling method of mobile terminal CNN
CN111782482B (en) Interface pressure testing method and related equipment
CN112905351B (en) GPU and CPU load scheduling method, device, equipment and medium
CN114201306B (en) Multi-dimensional geographic space entity distribution method and system based on load balancing technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant