CN111381941A - Method and device for providing QoS in concurrent task processing system


Info

Publication number: CN111381941A
Authority: CN (China)
Prior art keywords: node, task, processing, command, data path
Legal status: Pending
Application number: CN201811611942.8A
Other languages: Chinese (zh)
Inventors: 路向峰, 孙清涛
Current Assignee: Beijing Memblaze Technology Co Ltd
Original Assignee: Beijing Memblaze Technology Co Ltd
Application filed by Beijing Memblaze Technology Co Ltd
Priority/filing date: 2018-12-27 (CN201811611942.8A)
Publication date: 2020-07-07 (CN111381941A)

Classifications

    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/5011: Allocation of resources to service a request, the resources being hardware resources other than CPUs, servers and terminals
    • G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 2209/506: Constraint (indexing scheme relating to G06F 9/50)

Landscapes

  • Engineering & Computer Science
  • Software Systems
  • Theoretical Computer Science
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Debugging and Monitoring

Abstract

A method and apparatus for providing QoS in a concurrent task processing system are provided. The task scheduling method includes: identifying that a first node is in a busy state; sending a throttling notification to a second node; and, in response to receiving the throttling notification, the second node slowing down its processing of tasks.

Description

Method and device for providing QoS in concurrent task processing system
Technical Field
The present application relates to task scheduling, and in particular to providing QoS (Quality of Service) in task processing systems that process large-scale concurrent tasks.
Background
In some applications, a processor handles large-scale concurrent tasks. For example, an embedded processor in a network device or a storage device processes multiple network packets or IO commands concurrently.
Desktop and server CPUs run an operating system, and the operating system schedules the processes and/or threads running on the CPU to process tasks. Users need not intervene much in process/thread switching; the operating system selects an appropriate process/thread to schedule so as to fully utilize the computing capacity of the CPU.
However, in large-scale concurrent task scheduling systems, QoS metrics are often difficult to keep at a high level. In the pursuit of concurrency, some tasks are not processed in time owing to interference from various factors, and QoS suffers severely.
Disclosure of Invention
The QoS of a large-scale concurrent task scheduling system needs to be improved.
According to a first aspect of the present application, there is provided a first task scheduling method according to the first aspect of the present application, comprising: identifying that a first node is in a busy state; sending a throttling notification to a second node; and, in response to receiving the throttling notification, the second node slowing down its processing of tasks.
According to the first task scheduling method of the first aspect of the present application, there is provided a second task scheduling method of the first aspect of the present application, wherein the first node and the second node are not adjacent on the data path, and the second node does not send tasks to the first node.
According to the first or second task scheduling method of the first aspect of the present application, there is provided a third task scheduling method of the first aspect of the present application, wherein the first node identifies that it is in a busy state and sends a throttling notification to the second node.
According to one of the first to third task scheduling methods of the first aspect of the present application, there is provided a fourth task scheduling method of the first aspect of the present application, wherein the first node identifies that it is in a busy state based on its subsequent node being unable to receive tasks, or based on performing an operation that is relatively time-consuming among a plurality of operations.
According to the fourth task scheduling method of the first aspect of the present application, there is provided a fifth task scheduling method of the first aspect of the present application, wherein the relatively time-consuming operation among the plurality of operations is issuing an erase command to nonvolatile memory.
According to one of the first to fifth task scheduling methods of the first aspect of the present application, there is provided a sixth task scheduling method of the first aspect of the present application, wherein the second node slows down its processing of tasks by adding a delay in the gaps between processing tasks.
According to one of the first to sixth task scheduling methods of the first aspect of the present application, there is provided a seventh task scheduling method of the first aspect of the present application, wherein the second node slows down its processing of tasks by adding a delay between fetching tasks from the inbound queue and/or adding tasks to the outbound queue.
According to one of the first to seventh task scheduling methods of the first aspect of the present application, there is provided an eighth task scheduling method of the first aspect of the present application, wherein the second node slows down its processing of tasks by adding tasks to a staging table.
According to one of the first to eighth task scheduling methods of the first aspect of the present application, there is provided a ninth task scheduling method of the first aspect of the present application, further comprising: the second node adding a task to the staging table in response to its subsequent node being unable to receive the task.
According to the eighth or ninth task scheduling method of the first aspect of the present application, there is provided a tenth task scheduling method of the first aspect of the present application, wherein each task in the staging table has a timestamp indicating the time at which the task was first processed by the task processing system, or the tasks in the staging table are ordered by the time at which they were first processed by the task processing system.
According to the tenth task scheduling method of the first aspect of the present application, there is provided an eleventh task scheduling method of the first aspect of the present application, further comprising: the second node fetching from the staging table the task with the earliest time of first being processed by the task processing system, and processing it.
According to one of the first to eleventh task scheduling methods of the first aspect of the present application, there is provided a twelfth task scheduling method of the first aspect of the present application, further comprising: in response to the first node being in a busy state, sending a throttling notification to a third node, wherein the first node and the third node are not adjacent on the data path.
According to one of the first to twelfth task scheduling methods of the first aspect of the present application, there is provided a thirteenth task scheduling method of the first aspect of the present application, wherein the second node is the foremost node on the data path.
According to one of the first to eleventh task scheduling methods of the first aspect of the present application, there is provided a fourteenth task scheduling method of the first aspect of the present application, further comprising: in response to the first node being in a busy state, the first node indicating to a coordinating node that it is busy; and the coordinating node, in response to identifying that the first node is busy, sending a throttling notification to the second node.
According to one of the first to eleventh task scheduling methods of the first aspect of the present application, or the fourteenth task scheduling method, there is provided a fifteenth task scheduling method of the first aspect of the present application, further comprising: the coordinating node sending a throttling notification to a third node according to the working state of the first node, wherein the third node and the first node belong to different data paths.
According to the fifteenth task scheduling method of the first aspect of the present application, there is provided a sixteenth task scheduling method of the first aspect of the present application, wherein the first node belongs to a data path for processing host IO commands, the third node belongs to a data path for processing garbage collection IO commands, and the third node is the foremost node of the data path for processing garbage collection IO commands.
According to one of the fourteenth to sixteenth task scheduling methods of the first aspect of the present application, there is provided a seventeenth task scheduling method of the first aspect of the present application, wherein the coordinating node acquires the working state of one or more nodes and sends a throttling notification to one, more, or all of the nodes.
According to one of the fourteenth to seventeenth task scheduling methods of the first aspect of the present application, there is provided an eighteenth task scheduling method of the first aspect of the present application, wherein the coordinating node is provided by a processor different from the processor providing the first node or the second node.
According to one of the first to eighteenth task scheduling methods of the first aspect of the present application, there is provided a nineteenth task scheduling method of the first aspect of the present application, wherein the first node is the last-stage node of the data path.
According to the fourteenth task scheduling method of the first aspect of the present application, there is provided a twentieth task scheduling method of the first aspect of the present application, wherein the first node and the second node belong to different data paths.
According to the twentieth task scheduling method of the first aspect of the present application, there is provided a twenty-first task scheduling method of the first aspect of the present application, further comprising: the coordinating node, in response to identifying that the first node is in a busy state, sending a throttling notification to a third node, wherein the third node belongs to the same data path as the first node.
According to the twenty-first task scheduling method of the first aspect of the present application, there is provided a twenty-second task scheduling method of the first aspect of the present application, wherein the throttling notification sent by the coordinating node to the third node and the throttling notification sent to the second node are coordinated so that the performance of the respective processing tasks on the first data path and the second data path has a specified ratio or ratio range, wherein the first node belongs to the first data path and the second node belongs to the second data path.
According to one of the first to twenty-second task scheduling methods of the first aspect of the present application, there is provided a twenty-third task scheduling method of the first aspect of the present application, wherein a node is a processor or a task processing unit.
According to a second aspect of the present application, there is provided a first task scheduling apparatus, information processing device, storage device, or computer according to the second aspect of the present application, comprising a memory, a processor, and a program stored on the memory and executable on the processor, wherein the processor implements one of the task scheduling methods according to the first aspect of the present application when executing the program.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are obviously only some embodiments described in the present application, and those skilled in the art can derive other drawings from them.
FIG. 1A is a schematic diagram of task scheduling according to an embodiment of the present application;
FIG. 1B is a block diagram of a task processing system according to an embodiment of the present application;
FIG. 1C is a schematic diagram of a data path of a task processing system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a data path of a task processing system according to yet another embodiment of the present application; and
FIG. 3 is a schematic diagram of a data path of a task processing system according to yet another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings in the embodiments. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments given herein without creative effort fall within the protection scope of the present invention.
FIG. 1A is a schematic diagram of task scheduling according to an embodiment of the present application.
In FIG. 1A, time elapses from left to right. A plurality of tasks being processed are shown (1-1, 2-1, 3-1, 1-2, 2-2, and 3-2), where in a reference numeral of the form "a-b", the first symbol a identifies a task and the second symbol b identifies a subtask of that task. FIG. 1A thus illustrates 3 tasks processed over time, each comprising 2 subtasks.
The solid arrows indicate the temporal order in which the tasks are processed, and the dashed arrows indicate the logical order of processing within a task. Taking task 1 as an example, its subtask 1-1 (task 1-1) must be processed before its subtask 1-2 (task 1-2). Referring to FIG. 1A, after task 1-1 is processed, task 1-2 cannot be processed immediately (because the resources it requires are not ready), so task 2-1 is scheduled for execution, followed by task 3-1; the resources required by task 1-2 are then identified as ready, and after task 3-1 is processed, task 1-2 is scheduled for execution.
On a processor, tasks are processed by executing code segments. A single CPU (or CPU core) processes only a single task at any one time. Illustratively, as shown in FIG. 1A, for the multiple tasks to be processed, the code segment for task 1-1 is executed first, then the code segment for task 2-1, then task 3-1, then task 1-2, then task 2-2, and finally task 3-2. The logical order of processing is indicated in the code segments of the respective tasks. For example, the logical order includes that task 1-2 is to be processed after task 1-1. As yet another example, the code segment for task 1-1 indicates that the code segment to be processed next in logical order is that of task 1-2.
The code segments that process the tasks communicate with each other through signals, queues, shared memory, and the like. For example, a logically earlier task generates data as a producer, and a logically later task receives the producer's data as a consumer.
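For illustration only (this sketch is not part of the original disclosure; the queue size and function names such as process_subtask_1 are hypothetical), the producer/consumer relationship between two code segments can be modeled with a bounded queue:

```python
import queue
import threading

# Bounded queue carrying results from the logically earlier code segment
# (producer) to the logically later one (consumer).
stage_queue = queue.Queue(maxsize=4)

def process_subtask_1(task_id):
    # Code segment for subtask "a-1": produce an intermediate result.
    result = f"task {task_id}: stage-1 result"
    stage_queue.put(result)  # blocks if the consumer lags behind

def process_subtask_2():
    # Code segment for subtask "a-2": consume the producer's result.
    result = stage_queue.get()
    print(f"processing {result} in stage 2")
    stage_queue.task_done()

producer = threading.Thread(target=lambda: [process_subtask_1(t) for t in (1, 2, 3)])
consumer = threading.Thread(target=lambda: [process_subtask_2() for _ in (1, 2, 3)])
producer.start(); consumer.start()
producer.join(); consumer.join()
```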
FIG. 1B is a block diagram of a task processing system according to an embodiment of the present application.
Referring to FIG. 1B, the task processing system includes two parts: software and hardware. The hardware includes, for example, one or more CPUs that run the software, and other hardware resources associated with processing the related tasks (e.g., memory, codecs, interfaces, accelerators, interrupt controllers).
A code segment of the software running on a CPU is referred to as a task processing unit. The task processing system includes a plurality of task processing units, each processing the same or different tasks. For example, task processing unit 0 processes the first subtask of each task (e.g., task 1-1, task 2-1, and task 3-1), while task processing units 1, 2, and 3 process the second subtask (e.g., task 1-2, task 2-2, and task 3-2).
The software further comprises a task management unit for scheduling one of the task processing units to run on the hardware.
FIG. 1C is a schematic diagram of a data path of a task processing system according to an embodiment of the present application.
Taking a storage device that processes IO commands as an example of a task processing system, the processing of an IO command is divided into, for example, 2 stages. Node 160 processes stage 1 of the IO command, while node 165 processes stage 2 based on the results of stage 1. Node 160 retrieves a pending IO command from inbound queue 180 and adds the stage-1 processing result to outbound queue 185 for delivery to node 165. Node 165 retrieves the IO command from queue 185 (or from its own inbound queue coupled to queue 185) and continues with stage 2. Queue 180, node 160, queue 185, and node 165 thus form a data path for processing IO commands. The stages on the data path process the IO commands in sequence.
Alternatively, the nodes communicate with each other in various ways, such as through queues, shared memory, interrupts, and signals.
A node is, for example, a processor or a task processing unit.
By way of example, node 165 provides a "backpressure" mechanism to node 160 through queue 185 to reconcile the difference in IO command processing speed between node 160 and node 165. Owing to the complexity of the tasks, it is difficult to ensure that stage 1 and stage 2 of IO command processing take the same time, so the speed at which node 160 processes stage 1 may differ from the speed at which node 165 processes stage 2. For example, node 165 processes stage 2 more slowly than node 160 processes stage 1, so that over time pending IO commands gradually pile up in queue 185. Node 165 fetches IO commands from queue 185 according to its processing capacity; when queue 185 is filled with pending IO commands, node 160 finds that queue 185 is full and temporarily cannot add stage-1 processing results to it.
Optionally, a node further comprises a staging queue. For example, node 160 adds the stage-1 processing result of an IO command to the staging queue in response to identifying that queue 185 is full, and moves results from the staging queue into queue 185 once free entries appear in queue 185. Still optionally, node 160 further suspends fetching IO commands from queue 180 in response to identifying that queue 185 is full.
In this way, the processing speed of each node is adaptively coordinated across the data path.
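A minimal sketch of this adaptive coordination, assuming Python-style bounded queues and hypothetical names (node_160_step, the queue sizes), is shown below; it illustrates the mechanism described above, not the patented implementation:

```python
import collections
import queue

inbound = queue.Queue()             # queue 180: pending IO commands
outbound = queue.Queue(maxsize=8)   # queue 185: stage-1 results for node 165
staging = collections.deque()       # staging queue for undeliverable results

def node_160_step():
    """One scheduling step of node 160 (stage 1)."""
    # First try to drain previously staged results into the outbound queue.
    while staging and not outbound.full():
        outbound.put_nowait(staging.popleft())
    if staging:
        return  # outbound still full: suspend fetching from the inbound queue
    try:
        cmd = inbound.get_nowait()
    except queue.Empty:
        return
    result = ("stage-1 result", cmd)
    try:
        outbound.put_nowait(result)
    except queue.Full:
        staging.append(result)  # park the result instead of blocking
```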
FIG. 2 is a schematic diagram of a data path of a task processing system according to yet another embodiment of the present application.
The data path of the task processing system includes nodes 260, 264, and 268. The nodes are ordered: node 260 processes stage 1 of an IO command, node 264 processes stage 2, and node 268 processes stage 3. Each subsequent node continues processing an IO command using the processing result of the previous node. The nodes' IO command processing results are transferred between nodes through queues (e.g., queue 280, queue 284, and queue 288). For simplicity, the processing result transferred between nodes is in some cases referred to simply as an IO command.
It will be appreciated that IO commands may be transmitted between nodes in ways other than queues. The data path of a task processing system may have a different number of nodes.
As an example, the preceding node 260 adds its processing result for an IO command to queue 284, and the subsequent node 264 takes the preceding node's result out of queue 284 and continues processing. In response to queue 284 being full, the preceding node 260 temporarily stops adding its own results to queue 284 and instead, optionally, adds them to staging table 240. For simplicity, the processing result or intermediate result of an IO command added to the staging table is also referred to simply as an IO command.
According to an embodiment of the present application, staging table 240 includes a plurality of entries, each recording a node's processing result or intermediate result for an IO command. The entries of staging table 240 are sorted by timestamp. The timestamp derives, for example, from the time the IO command was first processed by the task processing system: the foremost node of the data path, upon obtaining a pending IO command, labels it with the current time or a sequence number as its timestamp.
In FIG. 2, although only node 260 is shown with a staging table 240, it is understood that each node may include one or more staging tables to store the IO commands it processes.
According to an embodiment of the present application, node 260, for example, identifies that queue 284 is full, or that the free entries of queue 284 are fewer than a threshold, and adds the IO commands it has processed to staging table 240. As another example, node 260 may also add an IO command to staging table 240 (or another staging table) when the command cannot be processed because the resources required to process it are insufficient. In response to queue 284 becoming available, or the resources needed to process the IO commands becoming ready, node 260 fetches IO commands from staging table 240 in the chronological order indicated by their timestamps and continues processing them. Preferably, the IO command with the earlier timestamp is fetched first, so that no IO command spends too long in the task processing system. In other words, when node 260 retrieves an IO command from staging table 240, the command's timestamp determines its processing priority: IO commands with earlier timestamps are processed earlier.
It will be appreciated that since each IO command in staging table 240 carries a timestamp, the entries of the staging table need not be kept sorted; instead, the IO command with the earliest timestamp can be searched for when fetching from the staging table.
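As an illustrative sketch (the names StagingTable and fetch_earliest are hypothetical, not from the disclosure), a staging table whose entries are fetched earliest-timestamp-first can be kept as a min-heap:

```python
import heapq
import itertools
import time

class StagingTable:
    """Staging table whose entries are retrieved earliest-timestamp-first."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal timestamps

    def add(self, timestamp, io_command):
        # timestamp: the time the command was first processed by the system,
        # assigned by the foremost node of the data path.
        heapq.heappush(self._heap, (timestamp, next(self._seq), io_command))

    def fetch_earliest(self):
        # The command with the earliest timestamp has the highest priority.
        timestamp, _, io_command = heapq.heappop(self._heap)
        return io_command

table = StagingTable()
table.add(time.monotonic(), "cmd A")
table.add(time.monotonic(), "cmd B")
assert table.fetch_earliest() == "cmd A"  # earlier timestamp is processed first
```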
According to an embodiment of the present application, node 268 also provides a throttling notification to node 260. Node 268 is not adjacent to node 260 on the data path, so node 260 cannot sense the working (busy) state of node 268 through its own outbound queue. Likewise, since node 260 is not adjacent to node 268 on the data path, node 260 does not send IO commands directly to node 268. Instead, node 268 informs the non-adjacent node 260 of its state by sending the throttling notification.
By way of example, node 268 sends a throttling notification upon identifying that it is busy. For example, node 268 identifies a busy state in response to its outbound queue 288 being full, or the free entries of outbound queue 288 being fewer than a threshold. As yet another example, node 268 identifies a busy state based on handling relatively time-consuming operations. For instance, an erase command to non-volatile memory handled by node 268 is relatively time-consuming compared with a program command or a read command, and a program command to non-volatile memory is relatively time-consuming compared with a read command.
Still by way of example, node 268 identifies a busy state based on processing a second type of task different from a first type of task. For example, the first type of task is processing IO commands from the host, while the second type is processing garbage collection commands.
Alternatively or additionally, the subsequent node 268 on the data path, upon identifying a busy state, sends a throttling notification to the foremost node of the data path. Still optionally, the subsequent node 268 sends a throttling notification to one, more, or all of the preceding nodes of the data path.
By sending a throttling notification to preceding nodes, the busy state is made known to the front of the data path more quickly, thereby reducing the workload of the subsequent node and of the entire data path.
In response to receiving the throttling notification, node 260, for example, slows its processing of IO commands. For example, node 260 adds a delay, which may be specified or adjustable, during the processing of IO commands; the delay increases the time taken to process each IO command and thereby slows processing. As yet another example, node 260 adds a delay between fetching IO commands from inbound queue 280 and/or adding IO commands to outbound queue 284. As yet another example, node 260 slows processing by adding one or more IO commands to staging table 240.
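The following sketch illustrates, under assumed names and a simplified notification format, how a preceding node might apply and cancel such throttling by inserting an adjustable delay:

```python
import queue
import time

inbound = queue.Queue()    # queue 280
outbound = queue.Queue()   # queue 284

throttle_delay = 0.0  # seconds added per command; 0 means no throttling

def on_throttle_notification(delay_seconds):
    """Handle a throttling notification from a non-adjacent subsequent node."""
    global throttle_delay
    throttle_delay = delay_seconds  # the notification's value sets the degree

def node_260_step():
    try:
        cmd = inbound.get_nowait()
    except queue.Empty:
        return
    if throttle_delay:
        time.sleep(throttle_delay)  # delay in the gap between fetch and forward
    outbound.put(("stage-1 result", cmd))

on_throttle_notification(0.001)  # e.g., 1 ms per command while node 268 is busy
on_throttle_notification(0.0)    # cancellation notification restores full speed
```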
When one or more nodes are busy, IO commands they receive may not be processed immediately (e.g., they are added to staging table 240), so those commands experience long waits at the busy nodes, measured from when each command is delivered to the node. In response to the throttling notification, the preceding node 260 slows its processing so that IO commands are delivered to the busy node evenly over, for example, a time period T. For comparison, take N commands sent to a busy node: if the preceding node does not slow down, all N commands arrive near the beginning of period T, so each command waits close to T during that period, and the average waiting time is also close to T. If the preceding node slows down so that the N commands arrive, for example, evenly distributed over period T, the average waiting time becomes T/2. Throttling thus reduces the average processing delay of IO commands at busy nodes and improves the Quality of Service (QoS) with which the task processing system processes IO commands. By way of example, quality of service is evaluated as a maximum processing delay for a specified percentage of IO commands, e.g., 99.999% of IO commands complete within 100 microseconds.
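This averaging argument can be checked numerically. The sketch below uses assumed values for N and T and the simplified model implied above, in which the busy node frees up at the end of the period:

```python
# Model: a busy node frees up at the end of a period T, so a command
# arriving at time t waits roughly T - t.
N, T = 100, 1.0

# Burst: all N commands arrive near the beginning of the period.
burst_waits = [T - 0.0 for _ in range(N)]

# Throttled: arrivals are spread evenly across the period.
even_waits = [T - i * T / N for i in range(N)]

print(sum(burst_waits) / N)  # ~T:   average wait without throttling
print(sum(even_waits) / N)   # ~T/2: average wait with throttling
```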
By slowing the processing of IO commands at preceding nodes, IO commands are processed more evenly by the nodes along the data path, single-node congestion is relieved, and the busy node is encouraged to release the resources occupied by pending IO commands as soon as possible, which benefits the processing of subsequent IO commands and further reduces congestion.
Further, when the nodes at each stage fetch IO commands from staging table 240, taking out and prioritizing the commands that entered processing earlier, in timestamp order, also helps further improve the quality of service of the data processing system.
Still further, when the subsequent node 268 on the data path identifies that it has exited the busy state, it sends a throttling cancellation notification to the foremost node of the data path. Still optionally, the subsequent node 268 sends the cancellation notification to one, more, or all preceding nodes of the data path. In response to receiving the cancellation notification, each node revokes one or more of the measures taken to slow IO command processing, or gradually reduces the delay added during processing.
Still optionally, the subsequent node 268 also provides a throttling notification to node 264, immediately preceding it on the data path, so that node 264 throttles earlier, without having to wait until its own outbound queue fills up to realize that node 268 is busy.
FIG. 3 is a schematic diagram of a data path of a task processing system according to yet another embodiment of the present application.
The task processing system includes a plurality of data paths. The data path for processing host IO commands includes nodes 340, 344, and 348; the data path for processing garbage collection IO commands includes nodes 360, 364, and 368.
Nodes 340, 344, and 348 are ordered: node 340 processes stage 1 of an IO command, node 344 processes stage 2, and node 348 processes stage 3. Each subsequent node continues processing using the previous node's result. Processing results are transferred between nodes through, for example, queues; for simplicity, the result transferred between nodes is in some cases referred to simply as an IO command. Nodes 360, 364, and 368 are ordered likewise.
It will be appreciated that the data paths of the task processing system may have a different number of nodes, and the system may have other data paths as well.
According to the embodiment illustrated in FIG. 3, the task processing system further comprises a coordinating node 370. Although a separate coordinating node 370 is shown in FIG. 3, other nodes of the data processing system (340, 344, 348, 360, 364, and 368) may alternatively act as the coordinating node. Still alternatively, nodes belonging to different data paths, e.g., node 340 and node 360, may be provided by the same or different processors, e.g., by the code segments of two processing tasks running on the same processor.
Optionally, each node is further coupled to one or more staging tables, whose entries are sorted by timestamp or from which the entry with the earliest timestamp can be fetched.
Using a separate coordinating node 370 helps reduce interference with the data path. For example, if the nodes are provided by code segments that process tasks, a separate coordinating node 370 reduces the chance that the other nodes' code is swapped out of the processor's cache, thereby improving operating efficiency.
The coordinating node distributes the throttling information in the data processing system, thereby reducing the impact of that distribution on the data path.
Referring to FIG. 3, the subsequent node 348 of the data path processing host IO commands is coupled to the coordinating node 370 and provides it with its own working state, e.g., indicating to coordinating node 370 that it is busy. The coordinating node 370, in response to receiving the working state of the subsequent node 348, provides a throttling notification to the preceding node 340 and/or node 344 belonging to the same data path as node 348. In response to receiving the throttling notification, node 340 and/or node 344, for example, slow their processing of IO commands.
The subsequent node 368 of the data path handling garbage collection IO commands is likewise coupled to the coordinating node 370 and provides it with its own working state, e.g., indicating that it is busy. The coordinating node 370, in response to receiving the working state of the subsequent node 368, provides a throttling notification to the preceding node 360 and/or node 364 belonging to the same data path as node 368. In response to receiving the throttling notification, node 360 and/or node 364, for example, slow their processing of garbage collection IO commands. Alternatively or additionally, coordinating node 370 may, in response to receiving the working state of node 368, provide a throttling notification to the preceding node 340 and/or node 344 belonging to a different data path than node 368. Alternatively or additionally, coordinating node 370 may, in response to receiving the working state of node 348, provide a throttling notification to the preceding node 360 and/or node 364 belonging to a different data path than node 348.
In one example, the data path that handles host IO commands has a higher priority. Thus, the coordinating node 370 provides a throttling notification to the preceding nodes 360 and/or 364 of the garbage collection data path based on the working state of node 348 from the host IO data path, but does not provide a throttling notification to the preceding nodes 340 and/or 344 of the host IO data path based on the working state of node 368 from the garbage collection data path.
By issuing throttling notifications, the coordinating node 370 can also control the bandwidth provided to host IO commands and the bandwidth provided to garbage collection IO commands by limiting the speed at which nodes process the IO commands. For example, the bandwidth provided to host IO commands and the bandwidth provided to garbage collection IO commands can be made to satisfy a specified ratio, or the bandwidth provided to garbage collection IO commands can be kept from exceeding a specified value.
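As one illustrative sketch of such coordination (the node names, ratio, and policy below are assumptions, not the patented implementation), the coordinating node can map reported working states to per-path throttling delays:

```python
class CoordinatingNode:
    """Sketch of a coordinator enforcing a host : GC bandwidth ratio."""
    def __init__(self, host_gc_ratio=4.0, base_delay=0.001):
        self.host_gc_ratio = host_gc_ratio  # e.g., host path gets ~4x GC bandwidth
        self.base_delay = base_delay
        self.throttles = {}  # node name -> delay carried in the throttling notification

    def on_node_state(self, node, busy):
        if node == "node_348" and busy:
            # Host path is busy: throttle the foremost GC node harder than
            # the host path's own preceding node, preserving host priority.
            self.throttles["node_360"] = self.base_delay * self.host_gc_ratio
            self.throttles["node_340"] = self.base_delay
        elif node == "node_348":
            self.throttles.pop("node_360", None)  # cancellation notifications
            self.throttles.pop("node_340", None)
        # A busy GC node (node_368) does not throttle the host path here,
        # since the host IO data path has the higher priority.

coordinator = CoordinatingNode()
coordinator.on_node_state("node_348", busy=True)
print(coordinator.throttles)  # notifications the coordinator would send
```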
Alternatively or additionally, one, more, or all nodes of the task processing system provide their working state to the coordinating node, and the coordinating node, in response to receiving a node's working state, provides throttling notifications to one, more, or all nodes of the task processing system.
A throttling notification indicates, for example, that throttling is to be turned on or canceled. As yet another example, the degree of throttling is indicated by a numerical value in the notification. The throttling notification may be variable, with the coordinating node 370 determining the value indicated in the notification provided to one, more, or all of the nodes.
Alternatively or additionally, the throttling notification includes a plurality of parameters instructing the node to control different aspects of IO command processing. For example, the notification includes two parameters, respectively indicating a time interval for fetching IO commands from the inbound queue and a delay interval for adding IO commands to the outbound queue.
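A multi-parameter throttling notification could be sketched as a small message type; the field names below are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class ThrottleNotification:
    """Sketch of a multi-parameter throttling notification (assumed fields)."""
    enable: bool                 # turn throttling on, or cancel it
    inbound_fetch_delay: float   # delay between fetches from the inbound queue (s)
    outbound_add_delay: float    # delay before adding a result to the outbound queue (s)

def apply_notification(node_state, note: ThrottleNotification):
    # The node uses each parameter to control a different aspect of processing.
    if note.enable:
        node_state["fetch_delay"] = note.inbound_fetch_delay
        node_state["add_delay"] = note.outbound_add_delay
    else:
        node_state["fetch_delay"] = node_state["add_delay"] = 0.0

state = {"fetch_delay": 0.0, "add_delay": 0.0}
apply_notification(state, ThrottleNotification(True, 0.0005, 0.001))
```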
In addition to being applied to storage devices, the embodiments according to the present application are also applicable to task scheduling in computers, servers, network devices and other electronic devices.
Embodiments of the present application also provide a program comprising program code, which, when loaded into and executed on an electronic device, causes the electronic device to perform the method described above.
It will be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by various means including program instructions. These program instructions may be loaded onto a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions which execute on the computer or other programmable apparatus create means for implementing the functions specified in the flowchart block or blocks.
Accordingly, blocks of the block diagrams and flowchart illustrations support combinations of means for performing the specified functions, combinations of operations for performing the specified functions and program instruction means for performing the specified functions. It will also be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based computer systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
At least a portion of the various blocks, operations, and techniques described above may be performed using hardware, by a control device executing firmware instructions, by a control device executing software instructions, or any combination thereof. When implemented with a control device executing firmware or software instructions, the instructions may be stored on any computer-readable storage medium, such as RAM, ROM, or flash memory, or a hard disk, optical disk, magnetic disk, or the like. Likewise, the software and firmware instructions may be delivered to a user or a system via any known or desired transmission means. The software or firmware instructions may include machine-readable instructions that, when executed by the control device, cause the control device to perform various actions.
When implemented in hardware, the hardware may include one or more discrete components, integrated circuits, Application Specific Integrated Circuits (ASICs), and the like.
It should be understood that the present application may be implemented in software, hardware, firmware, or a combination thereof. The hardware may be, for example, a control device, an application specific integrated circuit, a large scale integrated circuit, or the like.
Although the present application has been described with reference to examples, which are intended to be illustrative only and not to be limiting of the application, changes, additions and/or deletions may be made to the embodiments without departing from the scope of the application.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any changes or substitutions that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A task scheduling method, comprising:
identifying that a first node is in a busy state;
sending a throttling notification to a second node; and
in response to receiving the throttling notification, the second node slowing down its processing of tasks.
2. The method of claim 1, wherein
the first node and the second node are not adjacent on the data path, and the second node does not send tasks to the first node.
3. The method of claim 1 or 2, wherein
the second node slows down its processing of tasks by adding a delay in the gaps between processing tasks.
4. The method of one of claims 1-3, further comprising:
the second node adding a task to the staging table in response to its subsequent node being unable to receive the task.
5. The method of claim 4, further comprising:
the second node fetching from the staging table the task with the earliest time of first being processed by the task processing system, and processing it.
6. The method of one of claims 1-5, further comprising:
in response to the first node being in a busy state, the first node indicating to a coordinating node that it is busy;
the coordinating node, in response to identifying that the first node is busy, sending a throttling notification to the second node.
7. The method of claim 6,
wherein the first node and the second node belong to different data paths.
8. The method of claim 7, further comprising:
the coordinating node, in response to identifying that the first node is in a busy state, sending a throttling notification to a third node, wherein the third node belongs to the same data path as the first node.
9. The method of claim 8, wherein
the throttling notification sent by the coordinating node to the third node and the throttling notification sent to the second node are coordinated so that the performance of the respective processing tasks on the first data path and the second data path has a specified ratio or ratio range; wherein
the first node belongs to the first data path and the second node belongs to the second data path.
10. A task scheduler comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor when executing the program is configured to perform the method according to any of claims 1-9.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811611942.8A 2018-12-27 2018-12-27 Method and device for providing QoS in concurrent task processing system

Publications (1)

Publication Number Publication Date
CN111381941A 2020-07-07

Family

ID=71216351




Legal Events

PB01: Publication
CB02: Change of applicant information; applicant changed from BEIJING MEMBLAZE TECHNOLOGY Co., Ltd. to Beijing yihengchuangyuan Technology Co., Ltd.; address (unchanged): room A302, building B-2, Dongsheng Science Park, Zhongguancun, 66 xixiaokou Road, Haidian District, Beijing, 100192
SE01: Entry into force of request for substantive examination