CN116627507A - Queue control method, apparatus, electronic device, and computer-readable storage medium - Google Patents

Queue control method, apparatus, electronic device, and computer-readable storage medium

Info

Publication number
CN116627507A
Authority
CN
China
Prior art keywords
queue
segment
instruction
instructions
positions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310731287.4A
Other languages
Chinese (zh)
Other versions
CN116627507B (en
Inventor
邢宇飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haiguang Information Technology Co Ltd
Original Assignee
Haiguang Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Haiguang Information Technology Co Ltd filed Critical Haiguang Information Technology Co Ltd
Priority to CN202310731287.4A priority Critical patent/CN116627507B/en
Publication of CN116627507A publication Critical patent/CN116627507A/en
Application granted granted Critical
Publication of CN116627507B publication Critical patent/CN116627507B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
    • G06F9/3869 Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/3243 Power saving in microcontroller unit
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

A queue control method, a queue control device, an electronic apparatus, and a computer-readable storage medium. The queue control method comprises the following steps: acquiring a first number of empty positions in a current clock cycle in a first segment of a queue, wherein the first segment is N positions nearest to a writing end of the queue; obtaining a second number of instructions that need to be written to the queue in a next clock cycle of the current clock cycle; and under the condition that the first number is larger than or equal to the second number, in the next clock cycle, writing the instructions needing to be written into the queue into the first segment, and keeping the positions of other instructions except the read instructions in the second segment of the queue in the queue unchanged, wherein the second segment is a part except the first segment in the queue. The method can reduce the frequency of instruction movement and reduce the dynamic power consumption of the queue.

Description

Queue control method, apparatus, electronic device, and computer-readable storage medium
Technical Field
Embodiments of the present disclosure relate to a queue control method, a queue control apparatus, an electronic device, and a computer-readable storage medium.
Background
At various stages inside a processor, buffering queues sit between components. A queue is a special linear list that allows deletion only at the front of the list and insertion only at the back; the end where insertion is performed may be referred to as the tail, and the end where deletion is performed may be referred to as the head. Queues typically come in two types: sequential write with sequential read, and sequential write with out-of-order read. The scheduling queue (SQ) is the key component by which an out-of-order issue processor supports out-of-order reading of sequentially input instructions, and it is also the starting point of out-of-order execution. When the scheduling queue is not full, the instruction stream is dispatched into it and placed at empty entry positions. Instructions are selected from the scheduling queue for execution, or for speculative execution, not strictly in the order in which they entered the queue but according to their ready state: among all ready instructions, the one that entered the scheduling queue earliest, that is, the oldest instruction, is selected. The selected instruction is issued into the pipeline, and the source operands of all instructions in the scheduling queue that depend on it are marked as ready. Once all source operands of an instruction in the scheduling queue are marked as ready, the instruction itself is ready and can take part in the pre-issue selection process.
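The selection and wake-up behaviour described above can be sketched in a few lines of Python. This is only an illustrative model: the `Entry` fields, the explicit `age` counter, and the register names are assumptions made for the example and do not come from the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    name: str
    age: int                 # smaller = entered the queue earlier (older); assumed encoding
    dest: str                # register this instruction produces (hypothetical)
    srcs: dict = field(default_factory=dict)  # source register -> ready flag

    def is_ready(self):
        # An instruction is ready once every source operand is marked ready.
        return all(self.srcs.values())

def issue_oldest_ready(queue):
    """Pick the oldest ready entry, free its position, and wake up dependents."""
    ready = [(e.age, i) for i, e in enumerate(queue) if e is not None and e.is_ready()]
    if not ready:
        return None
    _, idx = min(ready)
    picked = queue[idx]
    queue[idx] = None                        # the read position becomes empty
    for e in queue:
        if e is not None and picked.dest in e.srcs:
            e.srcs[picked.dest] = True       # mark the dependent operand as ready
    return picked.name

sq = [
    Entry("A1", 0, "r1", {"r9": False}),    # oldest, but still waiting on r9
    Entry("A2", 1, "r2"),                   # no pending sources, so ready
    Entry("A3", 2, "r3", {"r2": False}),    # depends on A2
]
order = [issue_oldest_ready(sq) for _ in range(3)]
print(order)  # ['A2', 'A3', None]
```

In this run A1 is the oldest entry but is never issued because its operand never becomes ready, while A3 becomes ready only after A2 has been issued.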
Disclosure of Invention
At least one embodiment of the present disclosure provides a queue control method, including: acquiring a first number of empty positions in a current clock cycle in a first segment of a queue, wherein the first segment is N positions nearest to a writing end of the queue; obtaining a second number of instructions that need to be written to the queue in a next clock cycle to the current clock cycle; writing the instruction to be written into the queue into the first segment in the next clock cycle when the first number is greater than or equal to the second number, and keeping the positions of other instructions except the read instruction in the second segment of the queue in the queue unchanged, wherein the second segment is a part of the queue except the first segment; wherein N is a positive integer.
For example, the queue control method provided by at least one example of the above-described embodiment of the present disclosure further includes:
in the case where the first number is less than the second number, if the second segment has an empty position, then in the next clock cycle: moving the instructions in the second segment that are located on the write-end side of the empty position toward the read end of the queue, moving the instructions of the first segment toward the second segment, and writing at least part of the instructions that need to be written to the queue into the first segment.
For example, in the queue control method provided by at least one example of the above-described embodiment of the present disclosure, in the next clock cycle, the instructions of the first segment and the second segment move by at most N positions; the order of the instructions of the first segment and the second segment in the queue is maintained before and after the instructions of the first segment and the second segment are moved.
For example, in the queue control method provided by at least one example of the above embodiment of the present disclosure, in the case where the first number is greater than or equal to the second number, writing the instructions that need to be written to the queue into the first segment in the next clock cycle includes: in the case where the first number is greater than or equal to the second number and the first number is equal to N, sequentially writing, in the next clock cycle, the instructions that need to be written to the queue, starting from the position in the first segment closest to the read end and proceeding away from the read end.
For example, in the queue control method provided by at least one example of the above embodiment of the present disclosure, in the case where the first number is greater than or equal to the second number, writing the instructions that need to be written to the queue into the first segment in the next clock cycle includes: when the first number is greater than or equal to the second number, the second number is P, and the P positions of the first segment closest to the write end are not all empty, moving, in the next clock cycle, the instructions in the P positions toward the read end within the first segment so that the P positions become empty; and sequentially writing the P instructions that need to be written to the queue, starting from the position among the P positions closest to the read end and proceeding away from the read end; wherein P is a positive integer less than N.
For example, in the queue control method provided by at least one example of the above embodiment of the present disclosure, in the case where the first number is equal to or greater than the second number, writing the instruction to be written to the queue to the first segment in the next clock cycle includes: under the condition that K continuous positions closest to the writing end in the first segment are all empty and the second number is less than or equal to K, starting from the position closest to the reading end in the K positions and sequentially writing the instruction needing to be written into the queue in the direction far away from the reading end; wherein K is a positive integer less than N.
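The three placement cases above (first segment entirely empty, K empty positions adjacent to the write end, and P target positions that must first be vacated) can be sketched as a single function over the first segment. This is a minimal sketch under stated assumptions: the segment is a Python list whose index 0 is the position closest to the read end, `None` marks an empty position, and the minimum-distance compaction used for the P case is one possible choice that the text does not prescribe. The function name `write_first_segment` is hypothetical.

```python
def write_first_segment(seg, new_instrs):
    """Return (next_segment, moved_count) for a write that fits in the first segment."""
    n, need = len(seg), len(new_instrs)
    if seg.count(None) < need:
        raise ValueError("first number < second number: not handled in this sketch")
    # K: length of the run of empty positions adjacent to the write end.
    k = 0
    while k < n and seg[n - 1 - k] is None:
        k += 1
    # If the run is long enough, write at its lowest position (no movement needed);
    # otherwise the top `need` positions must be vacated first (the P case).
    start = n - k if need <= k else n - need
    out = [None] * n
    moved = 0
    limit = start - 1                      # highest position an old entry may occupy
    for pos in range(n - 1, -1, -1):       # walk from the write end toward the read end
        if seg[pos] is None:
            continue
        new_pos = min(pos, limit)          # move only as far as necessary, keeping order
        out[new_pos] = seg[pos]
        moved += new_pos != pos
        limit = new_pos - 1
    for i, instr in enumerate(new_instrs):
        out[start + i] = instr             # older new instruction gets the lower position
    return out, moved

print(write_first_segment([None, None, None, None], ["X", "Y"]))  # (['X', 'Y', None, None], 0)
print(write_first_segment(["A", None, None, None], ["X"]))        # (['A', 'X', None, None], 0)
print(write_first_segment([None, None, "A", None], ["X", "Y"]))   # ([None, 'A', 'X', 'Y'], 1)
```

Only the third call moves an existing instruction, and in every case the second segment is untouched because the function never sees it.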
For example, in a queue control method provided by at least one example of the above-described embodiment of the present disclosure, moving the instruction of the first segment toward the second segment, and writing at least part of the instruction to be written to the queue to the first segment includes: moving the instruction of the first segment to the second segment by at least one position, so that R positions closest to the writing end in the first segment are empty; if the second number is greater than R, sequentially writing R instructions in the instructions needing to be written into the queue from the position closest to the reading end in the R positions and in the direction far away from the reading end; wherein R is a positive integer less than or equal to N.
For example, in the queue control method provided by at least one example of the foregoing embodiment of the present disclosure, moving the instruction of the first segment toward the second segment, and writing at least part of the instruction that needs to be written to the queue to the first segment, further includes: and if the second number is less than or equal to R, sequentially writing the instructions needing to be written into the queue from the position closest to the reading end among the R positions and in the direction away from the reading end.
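The branch taken when the first number is less than the second number can be sketched as follows. The model is an assumption-laden illustration rather than the disclosed circuit: the queue is a list with index 0 at the read end, each instruction moves toward the read end by the number of empty positions below it capped at N, the position freed by the instruction read in the same cycle is counted as empty, and the name `shift_then_write` is hypothetical.

```python
def shift_then_write(queue, n, new_instrs, read_name=None):
    """Return (next_queue, leftover) when the first segment cannot hold all new instructions."""
    size = len(queue)
    first_number = sum(1 for e in queue[size - n:] if e is None)
    if first_number >= len(new_instrs):
        raise ValueError("first number >= second number: no shift is needed")
    # The instruction read out this cycle leaves its position empty.
    work = [None if e == read_name else e for e in queue]
    if all(e is not None for e in work[:size - n]):
        raise ValueError("second segment has no empty position")
    nxt = [None] * size
    bubbles = 0
    for i, e in enumerate(work):
        if e is None:
            bubbles += 1
        else:
            nxt[i - min(bubbles, n)] = e   # move at most N positions, order preserved
    # R: empty positions adjacent to the write end after the move.
    r = 0
    while r < n and nxt[size - 1 - r] is None:
        r += 1
    for k, instr in enumerate(new_instrs[:r]):
        nxt[size - r + k] = instr          # write min(second number, R) instructions
    return nxt, new_instrs[r:]

q = ["A1", "A2", "A3", None, "A4", "A5", "A6", "A7"]
print(shift_then_write(q, 2, ["A8", "A9"]))
# (['A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8'], ['A9'])
```

Here only one position opens up at the write end (R = 1), so only A8 is written and A9 is returned as not yet written, which corresponds to writing "at least part" of the pending instructions.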
At least one embodiment of the present disclosure provides a queue control device, including a first obtaining unit, a second obtaining unit, and a first control unit. The first obtaining unit is configured to obtain a first number of empty positions in a current clock cycle in a first segment of a queue, where the first segment is the N positions nearest to a write end of the queue; the second obtaining unit is configured to obtain a second number of instructions that need to be written to the queue in the clock cycle following the current clock cycle; the first control unit is configured to, when the first number is greater than or equal to the second number, write the instructions that need to be written to the queue into the first segment in the next clock cycle and keep unchanged the positions in the queue of the instructions in the second segment other than the read instruction, where the second segment is the part of the queue other than the first segment; and N is a positive integer.
For example, the queue control device provided by at least one example of the above embodiment of the present disclosure further includes a second control unit configured to, in the case where the first number is less than the second number and the second segment has an empty position, in the next clock cycle move the instructions in the second segment that are located on the write-end side of the empty position toward the read end of the queue, move the instructions of the first segment toward the second segment, and write at least part of the instructions that need to be written to the queue into the first segment.
For example, in the queue control device provided by at least one example of the above-described embodiment of the present disclosure, in the next clock cycle, the instructions of the first segment and the second segment move by at most N positions; the order of the instructions of the first segment and the second segment in the queue is maintained before and after the instructions of the first segment and the second segment are moved.
For example, in the queue control device provided by at least one example of the above embodiment of the present disclosure, the first control unit is further configured to: in the case where the first number is greater than or equal to the second number and the first number is equal to N, sequentially write, in the next clock cycle, the instructions that need to be written to the queue, starting from the position in the first segment closest to the read end and proceeding away from the read end.
For example, in the queue control device provided by at least one example of the above embodiment of the present disclosure, the first control unit is further configured to: when the first number is greater than or equal to the second number, the second number is P, and the P positions of the first segment closest to the write end are not all empty, move, in the next clock cycle, the instructions in the P positions toward the read end within the first segment so that the P positions become empty; and sequentially write the P instructions that need to be written to the queue, starting from the position among the P positions closest to the read end and proceeding away from the read end; wherein P is a positive integer less than N.
At least one embodiment of the present disclosure provides an electronic device comprising a processor; a memory storing one or more computer program modules; wherein the one or more computer program modules are configured to be executed by the processor for implementing the queue control method provided by any one of the embodiments of the disclosure.
At least one embodiment of the present disclosure provides a computer-readable storage medium storing non-transitory computer-readable instructions that, when executed by a computer, implement a queue control method provided by any one embodiment of the present disclosure.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings of the embodiments will be briefly described below, and it is apparent that the drawings in the following description relate only to some embodiments of the present disclosure, not to limit the present disclosure.
FIG. 1 illustrates a schematic diagram of instruction movement in a queue;
FIG. 2 illustrates a flow chart of a method of queue control provided by at least one embodiment of the present disclosure;
FIG. 3 illustrates a schematic diagram of a queue provided in accordance with at least one embodiment of the present disclosure;
FIG. 4 illustrates a flow chart of another queue control method provided by at least one embodiment of the present disclosure;
FIG. 5 illustrates a schematic diagram of another queue provided in accordance with at least one embodiment of the present disclosure;
FIG. 6 illustrates a schematic diagram of another queue provided by at least one embodiment of the present disclosure;
FIG. 7 illustrates a schematic diagram of another queue provided in accordance with at least one embodiment of the present disclosure;
FIG. 8 illustrates a schematic diagram of another queue provided by at least one embodiment of the present disclosure;
FIG. 9 illustrates a schematic diagram of another queue provided by at least one embodiment of the present disclosure;
FIG. 10 illustrates a schematic diagram of queue movement logic provided by at least one embodiment of the present disclosure;
FIG. 11 illustrates a schematic block diagram of a queue control device provided by at least one embodiment of the present disclosure;
FIG. 12 illustrates a schematic block diagram of an electronic device provided by at least one embodiment of the present disclosure;
FIG. 13 illustrates a schematic block diagram of another electronic device provided by at least one embodiment of the present disclosure; and
FIG. 14 illustrates a schematic diagram of a computer-readable storage medium provided by at least one embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments that one of ordinary skill in the art can obtain from the described embodiments without inventive effort fall within the scope of the present disclosure.
Unless defined otherwise, technical or scientific terms used in this disclosure have the ordinary meaning understood by one of ordinary skill in the art to which this disclosure belongs. The terms "first," "second," and the like, as used in this disclosure, do not denote any order, quantity, or importance, but are used only to distinguish one element from another. Likewise, the terms "a," "an," "the," and similar terms do not denote a limitation of quantity, but rather denote the presence of at least one. The word "comprising" or "comprises," and the like, means that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected," "coupled," and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper," "lower," "left," "right," and the like are used only to indicate relative positional relationships, which may change accordingly when the absolute position of the described object changes.
The shift-collapse method is one way of maintaining age information in a scheduling queue. In this implementation, instructions are stored in the queue in order of age. If addresses are assigned sequentially to the positions of the scheduling queue and instructions enter from the high-address end (entry addresses run from low to high from the read end to the write end), then under the shift-collapse mechanism an instruction stored at a lower address entered the scheduling queue earlier than, i.e., is older than, an instruction stored at a higher address. The age relationship among instructions in the queue is determined by the order in which they entered: an instruction that entered earlier is older, and one that entered later is newer. To maintain this relationship, when new instructions enter the scheduling queue, all or some of the existing instructions may be moved toward lower addresses, so that empty positions in the queue are filled by valid entries from higher addresses and new empty positions form at the high-address end to accept the new instructions. Maintaining age information simplifies the pre-issue selection process in the scheduling queue, but this implementation consumes more power. To carry out scheduling accurately, each valid entry in the queue holds many bits, and all of those bits are rewritten at a lower address whenever a shift-collapse occurs. In principle, a shift-collapse occurs each time new instructions enter the scheduling queue, in order to keep the age relationship of the instructions correct. Such frequent refreshing of the stored values in the scheduling queue leads to high dynamic power consumption.
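Because this mechanism keeps entries in age order by address, choosing the oldest ready instruction reduces to finding the first valid, ready entry when scanning from the low-address end. A minimal sketch, assuming index 0 is the lowest address and using hypothetical `valid` and `ready` bit lists:

```python
def pick_oldest_ready(valid, ready):
    """Return the lowest address whose entry is both valid and ready, else None."""
    for addr, (v, r) in enumerate(zip(valid, ready)):
        if v and r:
            return addr
    return None

valid = [True, False, True, True]    # address 1 is an empty position
ready = [False, False, True, True]   # the oldest entry (address 0) is not ready yet
print(pick_oldest_ready(valid, ready))  # 2
```

Without the ordering guarantee, the same selection would need an explicit age comparison among all ready entries.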
Instruction age is an intuitive description of the order of instructions in an instruction stream: an instruction earlier in program order (lower address) is older than one later in program order (higher address). Instructions need not be issued into the pipeline in age order, but they must retire in their original order in the instruction stream, to ensure correct results for data-dependent instructions.
An arithmetic logic unit (ALU) is a processor component that performs addition, subtraction, multiplication, division, shifting, and various logical operations. Most instructions are implemented through computations in arithmetic logic units, so increasing the number of arithmetic logic units lets the processor complete more operations per clock cycle and thereby increases its computing capability. With multiple arithmetic logic units, the associated scheduling queue must issue more instructions per clock cycle to match the computing capability of the arithmetic units.
Multiple issue is one way to increase computing capability: by adding execution units, a processor can support multiple operations of the same or different kinds per clock cycle, which means the scheduling queue can issue multiple instructions into the pipeline per clock cycle. For example, if two arithmetic logic units are introduced into a fixed-point execution unit, the processor can perform two fixed-point arithmetic or logic operations per clock cycle. If a unified scheduling queue is chosen to hold all arithmetic logic unit operations, then under multiple issue the scheduling queue must support writing multiple instructions per clock cycle to match the issue width. Under the shift-collapse mechanism, the scheduling queue must vacate multiple positions at the high-address end in each clock cycle to hold the newly entered instructions.
With multiple issue, the scheduling queue must support multiple instruction writes per clock cycle. The number of instructions written and the number read per clock cycle may be equal; the larger the number written per clock cycle, the more hardware resources are consumed, and the value may be chosen according to the actual situation. Assuming the number is 2, the scheduling queue supports writing at most two instructions per cycle and, correspondingly, an existing instruction in the scheduling queue moves down by at most two positions when there is space at the low-address end. If only one instruction is written to the scheduling queue in the current cycle, it is placed at the lower address of the two possible receiving positions; this lets the written instruction reach the lowest position of the scheduling queue as soon as possible and avoids additional moves. In this case an empty position, which may be referred to as a bubble, remains at the high-address end of the scheduling queue. Similarly, so that entries already in the scheduling queue reach its lowest positions as soon as possible and free enough space at the high-address end, each entry moves by the maximum distance that conditions allow: if the maximum distance per move is 2 and a valid entry that needs to move has at least two bubbles below it, the entry moves by 2 rather than 1. If two instructions are written to the scheduling queue in the current cycle, they are placed in the two receiving positions in order of age, with the older instruction at the lower address.
FIG. 1 illustrates a schematic diagram of instruction movement in a queue.
As shown in FIG. 1, a scheduling queue with a capacity of 8 entries (storage positions) is shown, giving the instructions it holds in three consecutive cycles and the start and end points of data movement at each entry position. For example, in the first clock cycle (labeled "1" in the drawing), the queue stores 5 instructions A1 to A5: instruction A5 newly enters the queue in the first clock cycle, instructions A1 to A4 were already in the queue before the first clock cycle, and there is an empty position between instructions A2 and A3. Instruction A1 is the instruction to be read out in the first clock cycle; once it is read out, its position becomes empty. In the second clock cycle (labeled "2"), instruction A1 has been read out, instruction A2 moves 1 position toward the read end, and instructions A3, A4, and A5 each move 2 positions toward the read end. Instruction A6 newly enters the queue in the second clock cycle, and instruction A2 is the instruction to be read out in the second clock cycle; once it is read out, its position becomes empty. In the third clock cycle (labeled "3"), instructions A3 and A4 each move 1 position toward the read end, and instructions A5 and A6 each move 2 positions toward the read end. Instructions A7 and A8 newly enter the queue in the third clock cycle, and instruction A4 is the instruction to be read out in the third clock cycle; once it is read out, its position becomes empty.
In each clock cycle in which an instruction is read out, an empty position appears at a low address. If K new instructions need to be written into the scheduling queue in the next clock cycle and the K positions closest to the write end are not all empty, all existing entries in the scheduling queue that have a bubble on their low-address side need to move down. Because each instruction entry stores many bits, on the order of tens of bits, frequent move operations mean the values stored in the scheduling queue are refreshed frequently, resulting in high dynamic power consumption.
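The three cycles of FIG. 1 can be reproduced with a small simulation of shift-collapse with a maximum move distance of 2. The figure itself is not reproduced here, so the slot layout of the first cycle is inferred from the movement distances given in the text and should be treated as an assumption, as should the function name `shift_collapse_step` and the list model (index 0 at the read end, `None` for an empty position).

```python
MAX_SHIFT = 2  # assumed: at most two writes, hence at most two positions of movement, per cycle

def shift_collapse_step(queue, read_name, new_instrs):
    """Advance one cycle; return (next_queue, {instruction: distance moved})."""
    size = len(queue)
    if len(new_instrs) > MAX_SHIFT:
        raise ValueError("too many instructions for one cycle")
    # The instruction read out in the previous cycle leaves an empty position.
    work = [None if e == read_name else e for e in queue]
    nxt = [None] * size
    moves = {}
    bubbles = 0                            # empty positions seen so far below the entry
    for i, e in enumerate(work):
        if e is None:
            bubbles += 1
            continue
        dist = min(bubbles, MAX_SHIFT)     # move as far as allowed toward the read end
        nxt[i - dist] = e
        moves[e] = dist
    base = size - MAX_SHIFT                # receiving positions at the write end
    for k, instr in enumerate(new_instrs):
        if nxt[base + k] is not None:
            raise ValueError("receiving position is still occupied")
        nxt[base + k] = instr              # older instruction takes the lower address
    return nxt, moves

cycle1 = ["A1", "A2", None, "A3", "A4", None, "A5", None]   # inferred layout
cycle2, moves2 = shift_collapse_step(cycle1, "A1", ["A6"])
cycle3, moves3 = shift_collapse_step(cycle2, "A2", ["A7", "A8"])
print(moves2)  # {'A2': 1, 'A3': 2, 'A4': 2, 'A5': 2}
print(moves3)  # {'A3': 1, 'A4': 1, 'A5': 2, 'A6': 2}
print(cycle3)  # ['A3', 'A4', 'A5', None, 'A6', None, 'A7', 'A8']
```

Every resident instruction is rewritten in both transitions, which is the repeated refreshing of stored values that drives the dynamic power consumption.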
At least one embodiment of the present disclosure provides a queue control method, a queue control apparatus, an electronic device, and a computer-readable storage medium.
The queue control method comprises the following steps: acquiring a first number of empty positions in a current clock cycle in a first segment of a queue, wherein the first segment is N positions nearest to a writing end of the queue; obtaining a second number of instructions that need to be written to the queue in a next clock cycle of the current clock cycle; under the condition that the first number is larger than or equal to the second number, in the next clock cycle, writing the instruction needing to be written into the queue into the first segment, and keeping the positions of other instructions except the read instruction in the second segment of the queue in the queue unchanged, wherein the second segment is a part except the first segment in the queue; wherein N is a positive integer.
According to this queue control method, when the first segment of the queue can accommodate the instructions to be written in the next cycle, those instructions are written into empty positions of the first segment. Even if the second segment has empty positions, the instructions stored in the second segment do not need to be moved, which reduces the number of instruction moves and lowers the dynamic power consumption of the queue.
Fig. 2 illustrates a flow chart of a queue control method provided in at least one embodiment of the present disclosure. As shown in fig. 2, the method may include steps S110 to S130.
Step S110: obtain a first number of empty positions in a first segment of the queue in a current clock cycle.
The queue has a write end and a read end, and the first segment is the N positions nearest to the write end of the queue, where N is a positive integer.
Step S120: a second number of instructions that need to be written to the queue in a next clock cycle of the current clock cycle is obtained.
Step S130: in the case where the first number is greater than or equal to the second number, write the instructions that need to be written to the queue into the first segment in the next clock cycle, and keep unchanged the positions in the queue of the instructions in the second segment other than the read instruction, where the second segment is the part of the queue other than the first segment.
For example, in step S110, the write end of the queue may refer to the end that receives newly written instructions, and may also be referred to as the tail or the high-address end. The read end of the queue may refer to the end from which instructions are read, and may also be referred to as the head or the low-address end.
For example, a queue may include a plurality of storage locations/entry locations (also referred to herein simply as "locations") arranged in sequence, each of which may store, for example, an instruction. In the embodiment of the disclosure, the multiple data storage locations of the queue may be divided into two parts, a part near the writing end may be referred to as a first segment, and another part may be referred to as a second segment.
In the embodiment of the present disclosure, the total number of storage positions included in the queue, the number N of storage positions included in the first segment, and the ratio of the number of positions of the first segment to the number of positions of the second segment may all be determined according to actual situations, which is not limited by the present disclosure. For example, the queue may be a scheduling queue SQ.
For example, an empty position is a position in which no instruction is stored, i.e., a position whose value is null, and which can therefore receive and store an instruction.
Fig. 3 illustrates a schematic diagram of a queue provided in accordance with at least one embodiment of the present disclosure.
As shown in fig. 3, a scheduling queue with 8 storage positions is shown over three consecutive cycles. For example, the 2 positions nearest the write end are the first segment (i.e., N=2), and the remaining 6 positions are the second segment. Taking the first clock cycle as the current clock cycle, the first segment stores instruction A5 and the second segment stores instructions A1 to A4, where instruction A5 is a new instruction that entered the queue in the current clock cycle and instruction A1 is the instruction to be read out in the current clock cycle. The first segment contains 1 empty position (the position closest to the write end), i.e., the first number is 1. If 1 instruction (e.g., instruction A6) needs to be written in the second clock cycle, the second number is also 1, so the first number equals the second number. In the second clock cycle, the new instruction A6 is therefore written into the first segment, for example at the position on the write-end side of instruction A5, and the positions of instructions A2 to A4 in the second segment, which are not read out, remain unchanged. In the second clock cycle, instruction A1 is read out, leaving 3 empty positions in the second segment; even so, instructions A2 to A4 are not moved.
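The transition from the first to the second clock cycle described above can be sketched as follows. The exact slot layout of fig. 3 is not reproduced in the text, so the layout used here (A1 to A4 at the four lowest addresses, A5 at the lower position of the first segment) is an assumption, as are the list model and the function name `next_cycle_in_place`.

```python
from typing import List, Optional, Set

Queue = List[Optional[str]]  # index 0 = read end, last index = write end, None = empty


def next_cycle_in_place(queue: Queue, n: int, incoming: List[str],
                        read_out: Set[str]) -> Queue:
    """Write incoming instructions into the first segment without moving anything."""
    start = len(queue) - n
    # Empty positions of the first segment in the current clock cycle (the "first number").
    free = [i for i in range(start, len(queue)) if queue[i] is None]
    if len(free) < len(incoming):
        raise ValueError("first number is smaller than second number")
    # Reading an instruction only vacates its slot; the other slots keep their content.
    nxt = [None if slot in read_out else slot for slot in queue]
    # Sketch assumption: the free slots sit on the write-end side, as in fig. 3.
    for pos, instr in zip(free, incoming):
        nxt[pos] = instr
    return nxt


cycle1 = ["A1", "A2", "A3", "A4", None, None, "A5", None]
cycle2 = next_cycle_in_place(cycle1, 2, ["A6"], {"A1"})
print(cycle2)  # [None, 'A2', 'A3', 'A4', None, None, 'A5', 'A6']
```

In this sketch A2 to A4 keep their indices even though three positions of the second segment are empty, which is the behaviour the paragraph describes.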
In another example, N is equal to 4 and the total number of positions in the queue is 8, so that the 4 positions nearest the write end form the first segment and the remaining 4 positions form the second segment. If the first number of empty positions in the first segment is 2 in the first clock cycle and the second number of instructions to be written in the second clock cycle is 1, the first number is greater than the second number, so the new instruction can be written into the first segment in the second clock cycle and no instruction in the second segment needs to be moved.
According to the queue control method of at least one embodiment of the present disclosure, when the first segment of the queue can accommodate the instructions to be written in the next cycle, the instructions are written into the empty positions of the first segment. Even if the second segment has empty positions, the instructions stored in the second segment do not need to be moved, which reduces the number of instruction moves and thus the dynamic power consumption of the queue.
Fig. 4 is a flow chart illustrating another queue control method provided by at least one embodiment of the present disclosure.
As shown in fig. 4, in some embodiments, compared with the embodiment of fig. 2, the queue control method may further include step S140: when the first number is smaller than the second number, in the next clock cycle, if the second segment has an empty position, the instructions in the second segment located on the write-end side of that empty position are moved toward the read end of the queue, the instructions of the first segment are moved toward the second segment, and at least part of the instructions that need to be written to the queue are written into the first segment.
For example, taking the scheduling queue shown in fig. 3 as an example, the 2 positions nearest the write end are the first segment (i.e., N=2), and the remaining 6 positions are the second segment. Taking the second clock cycle as the current clock cycle, the first segment stores instructions A5 and A6 and the second segment stores instructions A2 to A4, where instruction A6 is a new instruction that entered the queue in the current clock cycle and instruction A2 is the instruction to be read out in the current clock cycle. The first segment contains no empty position, i.e., the first number is 0. If 2 instructions (e.g., instructions A7 and A8) need to be written in the third clock cycle, the second number is 2 and the first number is smaller than the second number. The second segment includes 2 empty positions, so after instruction A2 is read out, instructions A3 and A4, which are on the higher-address side of the empty positions in the second segment, can be moved toward the read end, and instructions A5 and A6 in the first segment can be moved into the second segment to free two positions in the first segment, into which instructions A7 and A8 are written. For example, in other embodiments, if moving the instructions of the first and second segments frees only one position in the first segment in the third clock cycle, the older instruction A7 may be written into the first segment first, and the newer instruction A8 written in a later clock cycle.
For example, in the next clock cycle of the current clock cycle, each instruction of the first segment and the second segment moves by at most N positions. The maximum distance an instruction can move in one clock cycle corresponds to the maximum number of instructions that the queue can receive in each clock cycle, and the number N of storage positions in the first segment corresponds to that maximum moving distance. For example, if the queue can receive at most 2 new instructions per cycle, the 2 positions in the queue closest to the write end can be used as the first segment, and an instruction can move toward the read end by at most 2 positions per clock cycle. Referring to fig. 3, in the third clock cycle, instructions A5 and A6 each move 2 positions toward the read end, which empties the 2 positions of the first segment and leaves exactly enough room for the two newly entered instructions (A7 and A8). The newly entered instructions are thus placed adjacent to the instructions already in the queue, which improves the utilization of the queue.
For example, the maximum number of instructions that the queue can receive per cycle may be consistent with the number of instructions that the queue reads per cycle, e.g., 2 instructions may be read from the queue per clock cycle, and the queue may receive up to 2 new instructions per clock cycle.
For example, the order of the instructions of the first and second segments in the queue remains unchanged before and after those instructions are moved. For example, referring to fig. 3, in the second clock cycle, instructions A2 to A6 are placed in order along the first direction (the direction from the read end to the write end). In the third clock cycle, instruction A2 is read out, the remaining instructions A3 to A6 are still placed in order along the first direction, and their relative order is unchanged. This ensures that, among all instructions whose operands are ready, older instructions are read out in preference to younger ones.
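The two properties stated above (no instruction moves more than N positions, and the relative order is preserved) can be checked on a pair of queue states with a short sketch. The two states below are an assumed rendering of the second and third clock cycles of fig. 3, and the helper names `max_move` and `order_preserved` are hypothetical.

```python
from typing import Dict, List, Optional

Queue = List[Optional[str]]  # index 0 = read end, last index = write end, None = empty


def positions(queue: Queue) -> Dict[str, int]:
    """Map each stored instruction to its index."""
    return {instr: i for i, instr in enumerate(queue) if instr is not None}


def max_move(before: Queue, after: Queue) -> int:
    """Largest distance toward the read end travelled by any surviving instruction."""
    b, a = positions(before), positions(after)
    return max((b[x] - a[x] for x in b if x in a), default=0)


def order_preserved(before: Queue, after: Queue) -> bool:
    """True if the instructions present in both states appear in the same order."""
    kept_before = [x for x in before if x is not None and x in after]
    kept_after = [x for x in after if x is not None and x in before]
    return kept_before == kept_after


cycle2 = [None, "A2", "A3", "A4", None, None, "A5", "A6"]  # assumed layout
cycle3 = ["A3", "A4", None, None, "A5", "A6", "A7", "A8"]  # assumed layout
print(max_move(cycle2, cycle3))         # 2
print(order_preserved(cycle2, cycle3))  # True
```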
For example, when the instructions that need to be written to the queue are written into the first segment, they may be written sequentially in order of their time information, from earliest to latest. The time information can be understood as age information: the earlier the time information, the older the instruction, and the later the time information, the newer the instruction. The time information may be a specific time point or sequence information representing a time order, and may be determined according to the time or order in which the instructions were generated, or according to the time or order in which the instructions reached a certain node. For example, referring to fig. 3, in the third clock cycle, the time information of instruction A7 is earlier than that of instruction A8, i.e., instruction A7 is older than instruction A8. In this case, when instructions A7 and A8 are written to the queue, instruction A7 is written before instruction A8. This helps older instructions to be read out first.
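A minimal sketch of writing in age order is given below. Representing the time information as an integer stamp (a smaller value meaning an older instruction) is an assumption, as are the list model and the function name `write_by_age`.

```python
from typing import List, Optional, Tuple

Queue = List[Optional[str]]  # index 0 = read end, last index = write end, None = empty


def write_by_age(queue: Queue, free_positions: List[int],
                 incoming: List[Tuple[int, str]]) -> Queue:
    """Write incoming (age_stamp, name) pairs so that older ones get lower addresses."""
    if len(incoming) > len(free_positions):
        raise ValueError("not enough free positions")
    nxt = list(queue)
    oldest_first = sorted(incoming, key=lambda item: item[0])
    for pos, (_, name) in zip(sorted(free_positions), oldest_first):
        nxt[pos] = name
    return nxt


queue = ["A3", "A4", None, None, "A5", "A6", None, None]
# A8 is listed first on purpose; it still lands behind the older A7.
print(write_by_age(queue, [6, 7], [(8, "A8"), (7, "A7")]))
# ['A3', 'A4', None, None, 'A5', 'A6', 'A7', 'A8']
```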
For example, in step S130, when the first number is greater than or equal to the second number, in the next clock cycle, the instruction to be written into the queue is written into the first segment, including: in the case where the first number is equal to or greater than the second number and the first number is equal to N, in the next clock cycle, instructions that need to be written to the queue are sequentially written from the position closest to the readout end in the first segment and in a direction away from the readout end.
For example, if in the current clock cycle the N positions of the first segment are all empty and the number of new instructions to be written in the next clock cycle is N or less, then in the next clock cycle the new instructions may be written sequentially toward the high address, starting from the position closest to the read end among the N positions, with the oldest instruction written into the position closest to the read end. This helps the newly entered instructions reach the read end sooner.
Fig. 5 illustrates a schematic diagram of another queue provided in accordance with at least one embodiment of the present disclosure.
As shown in fig. 5, taking n=3 as an example, in the current clock cycle, the 3 positions included in the first segment are all free positions, and in the next clock cycle, 2 new instructions B7 and B8 need to be written, in this case, in the next clock cycle, 2 new instructions B7 and B8 may be written in sequence in two free positions closer to the readout end among the 3 free positions. In writing, older instruction B7 of the 2 new instructions may be written to a lower address of the two empty locations, and newer instruction B8 may be written to a higher address.
For example, taking N=2 as an example, if the 2 positions of the first segment are both empty and 2 new instructions need to be written in the next clock cycle, then in the next clock cycle the 2 newly entered instructions may be written into the two empty positions of the first segment, with the older instruction written into the lower-address position of the two.
For example, still taking N=2 as an example, if the 2 positions of the first segment are both empty and 1 new instruction needs to be written in the next clock cycle, then in the next clock cycle the newly entered instruction may be written into the lower-address position of the two empty positions of the first segment.
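The case in which the first segment is entirely empty (the first number equals N) can be sketched as follows. The 9-position queue holding B1 to B6 in the second segment is a hypothetical layout chosen for illustration, since the text does not give the size of the queue in fig. 5; the function name is likewise an assumption.

```python
from typing import List, Optional

Queue = List[Optional[str]]  # index 0 = read end, last index = write end, None = empty


def write_into_empty_first_segment(queue: Queue, n: int, incoming: List[str]) -> Queue:
    """Fill an all-empty first segment starting from its position nearest the read end.

    `incoming` is expected oldest first, so the oldest gets the lowest address.
    """
    start = len(queue) - n
    if any(slot is not None for slot in queue[start:]):
        raise ValueError("first segment is not entirely empty")
    if len(incoming) > n:
        raise ValueError("more incoming instructions than first-segment positions")
    nxt = list(queue)
    for offset, instr in enumerate(incoming):
        nxt[start + offset] = instr
    return nxt


queue = ["B1", "B2", "B3", "B4", "B5", "B6", None, None, None]  # hypothetical, N = 3
print(write_into_empty_first_segment(queue, 3, ["B7", "B8"]))
# ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B8', None]
```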
For example, in step S130, when the first number is greater than or equal to the second number, in the next clock cycle, the instruction to be written into the queue is written into the first segment, including: when the first number is greater than or equal to the second number, the second number is P, and the P positions of the first segment closest to the writing end are not all free positions, moving an instruction in the P positions to a direction close to the reading end in the first segment range in the next clock cycle so that the P positions are free positions; sequentially writing P instructions needing to be written into the queue from the position closest to the reading end in the P positions and in the direction far away from the reading end; wherein P is a positive integer less than N.
For example, in the case that a plurality of consecutive empty positions of the first segment closest to the writing end cannot accommodate a new instruction, the instruction stored in the first segment may be moved to the reading end within the range of the first segment, so that the new instruction is written after the stored instruction, the order of the instructions is prevented from being disturbed, and in this way, only a small number of instructions in the first segment need to be moved in a small range, with less power consumption.
Fig. 6 illustrates a schematic diagram of another queue provided in accordance with at least one embodiment of the present disclosure.
As shown in fig. 6, taking N=3 as an example, in the current clock cycle 2 of the 3 positions of the first segment are empty, namely the positions on either side of instruction B7 as shown in the figure. If 2 new instructions B8 and B9 need to be written in the next clock cycle (i.e., P=2), and the 2 positions of the first segment closest to the write end are not both empty in the current clock cycle, then in the next clock cycle instruction B7 may be moved by one position toward the read end (without leaving the first segment), so that the 2 positions of the first segment closest to the write end are both empty and the new instructions B8 and B9 can be written in sequence into the two empty positions after instruction B7. When writing, the older instruction B8 of the 2 new instructions may be written into the lower-address empty position, and the newer instruction B9 into the higher-address one.
For example, in another embodiment, if the first segment holds 2 instructions in the current clock cycle and, in the next clock cycle, moving those 2 instructions within the first segment can free only 1 position at the highest address, the new instruction B8 may be written into that empty position first, and instruction B9 written in a later clock cycle.
For example, taking N=2 as an example, suppose the first segment contains 1 empty position in the current clock cycle and 1 new instruction needs to be written in the next cycle. If the empty position is the higher-address one of the 2 positions of the first segment, the new instruction may be written directly into it in the next clock cycle. If the empty position is the lower-address one of the 2 positions, then in the next clock cycle the instruction in the highest-address position is moved into this empty position, freeing the highest-address position, and the new instruction is written into the highest-address position. This preserves the age ordering of the entries (instructions) in the queue.
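The small-range move inside the first segment can be sketched as follows. The sketch covers only the case described above, where the P positions nearest the write end have to be freed or are exactly the free ones; the situation with more trailing empty positions than incoming instructions is a separate case and is not handled here. Moving each stored instruction only as far as needed is an assumption of this sketch (the examples in the text leave no other choice), as are the list model and the function name.

```python
from typing import List, Optional

Queue = List[Optional[str]]  # index 0 = read end, last index = write end, None = empty


def write_after_local_move(queue: Queue, n: int, incoming: List[str]) -> Queue:
    """Free the P positions nearest the write end by moving first-segment
    instructions toward the read end, then write the P incoming instructions."""
    p = len(incoming)
    if p > n:
        raise ValueError("more incoming instructions than first-segment positions")
    start = len(queue) - n
    segment = queue[start:]
    head: List[Optional[str]] = [None] * (n - p)  # part of the segment that keeps old entries
    limit = n - p - 1                             # highest index an old entry may occupy
    for i in range(n - 1, -1, -1):                # walk from the write end to the read end
        instr = segment[i]
        if instr is None:
            continue
        target = min(i, limit)                    # never move toward the write end
        if target < 0:
            raise ValueError("first segment cannot hold the incoming instructions")
        head[target] = instr
        limit = target - 1
    # The second segment (queue[:start]) is returned untouched.
    return queue[:start] + head + list(incoming)


print(write_after_local_move(["B5", "B6", None, "B7", None], 3, ["B8", "B9"]))
# ['B5', 'B6', 'B7', 'B8', 'B9']
print(write_after_local_move(["C1", None, "C2"], 2, ["C3"]))
# ['C1', 'C2', 'C3']
```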
For example, in step S130, when the first number is greater than or equal to the second number, in the next clock cycle, the writing of the instruction to be written into the queue into the first segment includes: and when the K continuous positions closest to the writing end in the first segment are all empty and the second number is less than or equal to K, sequentially writing the instructions needing to be written into the queue from the position closest to the reading end among the K positions to the direction far away from the reading end, wherein K is a positive integer less than N.
FIG. 7 illustrates a schematic diagram of another queue provided in accordance with at least one embodiment of the present disclosure.
As shown in fig. 7, taking N=3 as an example, in the current clock cycle the 2 consecutive positions of the first segment nearest the write end are both empty (i.e., K=2); these are the two positions following instruction B7 as shown in the figure. If 1 new instruction B8 needs to be written in the next clock cycle (i.e., the second number is 1), then in the next clock cycle instruction B8 can be written directly into the position immediately after instruction B7, without moving the instructions already in the queue.
For example, taking n=2 as an example, the first segment in the current clock cycle includes 2 positions in total, and if 1 position closest to the writing end is a null position and another position has an instruction, and 1 new instruction needs to be written in the next cycle, then in the next clock cycle, the 1 new instruction may be directly written at the null position of the highest address.
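The direct-write case (the second number does not exceed the K consecutive empty positions nearest the write end) can be sketched as follows. The list model and the function name `write_after_last_instruction` are assumptions made for illustration.

```python
from typing import List, Optional

Queue = List[Optional[str]]  # index 0 = read end, last index = write end, None = empty


def write_after_last_instruction(queue: Queue, n: int, incoming: List[str]) -> Queue:
    """Write into the K trailing empty positions of the first segment, lowest first."""
    start = len(queue) - n
    k = 0
    for slot in reversed(queue[start:]):  # count consecutive empties from the write end
        if slot is not None:
            break
        k += 1
    if len(incoming) > k:
        raise ValueError("second number exceeds K; a move would be required")
    nxt = list(queue)
    first_free = len(queue) - k           # the empty position closest to the read end
    for offset, instr in enumerate(incoming):
        nxt[first_free + offset] = instr
    return nxt


print(write_after_last_instruction(["B5", "B6", "B7", None, None], 3, ["B8"]))
# ['B5', 'B6', 'B7', 'B8', None]
```

No stored instruction changes position in this case, so the only data written in the cycle is the new entry itself.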
For example, in step S140, moving the instruction of the first segment toward the second segment and writing at least part of the instruction that needs to be written to the queue to the first segment includes: moving the instruction of the first segment to the second segment by at least one position, so that R positions closest to the writing end in the first segment are empty; and if the second number is greater than R, sequentially writing R instructions in the instructions needing to be written into the queue from the position closest to the reading end in the R positions and in the direction far away from the reading end, wherein R is a positive integer less than or equal to N.
Fig. 8 illustrates a schematic diagram of another queue provided in accordance with at least one embodiment of the present disclosure.
As shown in fig. 8, taking N=2 as an example, the number of empty positions of the first segment in the current clock cycle is 0 (i.e., the first number is 0), and 2 new instructions C9 and C10 need to be written in the next clock cycle (i.e., the second number is 2). In this case the instructions in the first segment and the second segment need to be moved. In the next cycle, instruction C1 is read out and instructions C2 to C8 each move 1 position toward the read end, leaving one empty position (R=1) on the side closest to the write end. One of instructions C9 and C10 can be written into this empty position after instruction C8; for example, the older instruction C9 is written into it, and instruction C10 is written in a later clock cycle. For example, in some embodiments, 2 instructions may be read out of the queue in each clock cycle, freeing 2 positions to accommodate the 2 new instructions C9 and C10.
For example, in step S140, moving the instruction of the first segment to the second segment and writing at least part of the instruction that needs to be written to the queue to the first segment, further includes: moving the instruction of the first segment to the second segment by at least one position, so that R positions closest to the writing end in the first segment are empty; if the second number is less than or equal to R, sequentially writing the instructions needing to be written into the queue from the position closest to the reading end among the R positions to the direction far away from the reading end.
Fig. 9 illustrates a schematic diagram of another queue provided in accordance with at least one embodiment of the present disclosure.
As shown in fig. 9, taking N=2 as an example, the number of empty positions of the first segment in the current clock cycle is 0 (i.e., the first number is 0), and 2 new instructions C8 and C9 need to be written in the next clock cycle (i.e., the second number is 2). In this case the instructions in the first segment and the second segment need to be moved. In the next cycle, instruction C1 is read out, instructions C2 and C3 each move 1 position toward the read end, and instructions C4 to C7 each move 2 positions toward the read end, leaving 2 empty positions (R=2) on the side closest to the write end. Instructions C8 and C9 can be written into these 2 empty positions: the older instruction C8 into the empty position after instruction C7, and instruction C9 into the empty position after instruction C8. For example, in another embodiment, only 1 instruction C8 needs to be written to the queue in the next clock cycle; after the instructions of the first and second segments are moved, 2 empty positions (R=2) are left on the side closest to the write end, and instruction C8 may be written into the lower-address one of the 2 empty positions.
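The full-queue move of step S140 can be sketched as a bounded shift-collapse followed by a partial write. The rule used here (each instruction moves toward the read end by at most N positions and never passes the instruction ahead of it) is one possible reading that reproduces the movements described for fig. 8 and fig. 9. The slot layouts, in particular the single gap between C3 and C4, and the function name are assumptions.

```python
from typing import List, Optional, Set, Tuple

Queue = List[Optional[str]]  # index 0 = read end, last index = write end, None = empty


def full_shift_and_write(queue: Queue, n: int, incoming: List[str],
                         read_out: Set[str]) -> Tuple[Queue, List[str]]:
    """Shift the whole queue toward the read end, then write as many of the
    incoming instructions (oldest first) as the freed positions allow."""
    shifted: Queue = [None] * len(queue)
    next_free = 0
    for i, instr in enumerate(queue):
        if instr is None or instr in read_out:
            continue
        target = max(i - n, next_free)   # at most N positions, order preserved
        shifted[target] = instr
        next_free = target + 1
    # R = consecutive empty positions of the first segment nearest the write end.
    r = 0
    for slot in reversed(shifted[len(queue) - n:]):
        if slot is not None:
            break
        r += 1
    written, deferred = incoming[:r], incoming[r:]
    first_free = len(queue) - r
    for offset, instr in enumerate(written):
        shifted[first_free + offset] = instr
    return shifted, deferred


# Fig. 8 style: full queue, R = 1, so C10 waits for a later cycle.
fig8 = ["C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8"]
print(full_shift_and_write(fig8, 2, ["C9", "C10"], {"C1"}))
# (['C2', 'C3', 'C4', 'C5', 'C6', 'C7', 'C8', 'C9'], ['C10'])

# Fig. 9 style: one gap in the second segment, R = 2, both instructions fit.
fig9 = ["C1", "C2", "C3", None, "C4", "C5", "C6", "C7"]
print(full_shift_and_write(fig9, 2, ["C8", "C9"], {"C1"}))
# (['C2', 'C3', 'C4', 'C5', 'C6', 'C7', 'C8', 'C9'], [])
```

Instructions that do not fit are returned in the second element so that a caller can present them again in a later cycle.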
FIG. 10 illustrates a schematic diagram of queue movement logic provided by at least one embodiment of the present disclosure.
As shown in fig. 10, the queue is divided into two parts, and the N positions at the high address end are used as a sub-queue with smaller capacity, called a first segment, and the other positions are used as another large sub-queue, called a second segment. If the first segment can accommodate the new entry instruction, the corresponding entry of the first segment can be moved in a shift-collapse manner to place the new instruction content, and the second segment does not need to do any operation. Only when the first segment cannot accommodate the new incoming instruction, the complete queue moves in a shift-collapse manner, and a storage space is reserved for the content of the new incoming instruction.
For example, part (a) of fig. 10 shows the movement logic and movement enable of the first segment. As can be seen from part (a), the entries of the first segment perform either a small-range shift-collapse operation or the complete shift-collapse operation, and the selection condition is whether NumFree >= NumDisp is satisfied (NumFree denotes the first number and NumDisp denotes the second number). If NumFree >= NumDisp, the multiplexer selects the first segment movement logic, under which the instructions of the first segment are moved within a small range, or not moved, according to the situations described in the embodiments above. If NumFree < NumDisp, the multiplexer selects the complete queue movement logic, under which the instructions of the first segment are moved into the second segment (i.e., moved over a wide range) and the instructions of the second segment are moved toward the read end, according to the situations described in the embodiments above. The complete queue movement enable may trigger the first segment movement enable, after which the processor may operate on the first segment according to the movement logic selected by the multiplexer. The first segment movement logic and the complete queue movement logic can be computed in parallel, and only one two-to-one multiplexer is added on the path, so the delay is not significantly affected.
For example, part (b) of fig. 10 shows the movement logic and movement enable of the second segment. As can be seen from part (a), the complete queue movement logic includes the movement logic for the second segment. After the complete queue movement enable, if NumFree < NumDisp, the second segment movement enable may be triggered, after which the second segment is operated on according to its movement logic.
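The selection described for fig. 10 can be sketched behaviourally as a two-to-one multiplexer driven by the comparison of NumFree and NumDisp. The two candidate next states are assumed to have been computed in parallel beforehand and are passed in as arguments; the function name `select_first_segment` and the returned tuple are assumptions of this sketch, not signal names from the disclosure.

```python
from typing import List, Optional, Tuple

Segment = List[Optional[str]]


def select_first_segment(num_free: int, num_disp: int,
                         local_candidate: Segment,
                         full_candidate: Segment) -> Tuple[Segment, bool]:
    """Return the selected next state of the first segment and whether the
    second segment is enabled to move in this cycle."""
    in_place = num_free >= num_disp          # NumFree >= NumDisp
    # Two-to-one multiplexer: both candidates already exist, only one is chosen.
    selected = local_candidate if in_place else full_candidate
    second_segment_move_enable = not in_place
    return selected, second_segment_move_enable


# NumFree = 1, NumDisp = 1: the first-segment logic is used, the second segment is idle.
print(select_first_segment(1, 1, ["A5", "A6"], ["A7", "A8"]))  # (['A5', 'A6'], False)
# NumFree = 0, NumDisp = 2: the complete queue logic is used, the second segment moves.
print(select_first_segment(0, 2, ["A5", "A6"], ["A7", "A8"]))  # (['A7', 'A8'], True)
```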
According to at least one embodiment of the present disclosure, if there are already some empty locations at the high address end, and these empty locations are enough to accommodate the newly entered instruction, then the instruction stored in the queue does not need to be moved to a lower address, and the newly entered instruction is directly written to the high address location in the queue without affecting the stored instruction.
According to at least one embodiment of the present disclosure, in many cases the decoded instructions do not contain the maximum number of operations of the same type to be written into the queue of the corresponding type, so the number of instructions received by the queue in a cycle usually does not reach the maximum. If the number of instructions written in two consecutive cycles does not reach the maximum receivable number, the empty positions (bubbles) at the high-address end of the queue can be used to store the new instruction contents, and there is no need to make room by performing a shift-collapse operation on the low-address part. The advantage is that the new instructions only need to be written into the empty high-address positions and no shift-collapse operation is needed in the current clock cycle, which saves the dynamic power consumed by data movement.
According to at least one embodiment of the present disclosure, a judgment condition may be added to decide whether to move and by what distance, so as to achieve finer movement control. The judgment is based on whether the empty positions already present at the high-address end can accommodate the newly entered instructions. Therefore, when new instructions are about to enter, the number of empty positions at the high-address end and the number of entering instructions are both counted and compared to determine whether the condition for in-place storage is met.
According to at least one embodiment of the present disclosure, in a multi-issue processor, the scheme of at least one embodiment of the present disclosure makes full use of the queue's capability to receive multiple instructions per cycle: when the number of instructions issued to the queue over several consecutive clock cycles does not reach the maximum, the contents of the newly entered instructions are placed in situ, which reduces the number of shift-collapse operations. Because each new entry entering the queue contains many bits, a shift-collapse operation consumes considerable dynamic power, so reducing the number of shift-collapse operations can significantly reduce the dynamic power consumption of the queue.
According to at least one embodiment of the present disclosure, after the number of instructions to be issued is determined according to the number of execution units, the queue needs to receive a corresponding number of new instructions, and the maximum moving distance of each entry in the queue at each clock cycle may also be set according to the maximum number of received instructions in the queue. On the premise that the number of the execution units is limited by hardware resources, the maximum moving distance of each entry in the queue can be reasonably increased, the receiving capacity of each cycle of the queue can be increased, the condition that the instruction written into the queue in each cycle does not reach the maximum value can be more easily achieved, the occurrence frequency of shift-collapse is reduced in a larger range, and the dynamic power consumption of the queue is reduced.
Fig. 11 illustrates a schematic block diagram of a queue control device 200 provided in at least one embodiment of the present disclosure.
For example, as shown in fig. 11, the queue control device 200 includes a first acquisition unit 210, a second acquisition unit 220, and a first control unit 230. These components are interconnected by a bus system and/or other forms of connection mechanism (not shown). For example, these units may be implemented by hardware (e.g., circuit) modules, software modules, or any combination of the two; the same applies to the following embodiments and is not repeated. For example, these units may be implemented by a central processing unit (CPU), a graphics processing unit (GPU), a tensor processing unit (TPU), a field programmable gate array (FPGA), or another form of processing unit having data processing and/or instruction execution capabilities, together with corresponding computer instructions. It should be noted that the components and structures of the queue control device 200 shown in fig. 11 are exemplary only and not limiting, and the queue control device 200 may have other components and structures as desired.
The first acquisition unit 210 is configured to acquire a first number of empty positions, in the current clock cycle, in a first segment of a queue, where the first segment is the N positions closest to the write end of the queue and N is a positive integer. The first acquisition unit 210 may perform, for example, step S110 described in fig. 2 or fig. 4.
The second acquisition unit 220 is configured to acquire a second number of instructions that need to be written to the queue in the clock cycle following the current clock cycle. The second acquisition unit 220 may perform, for example, step S120 described in fig. 2 or fig. 4.
The first control unit 230 is configured to write the instruction to be written to the queue to the first segment in the next clock cycle, and to keep the position of the other instructions except the read instruction in the second segment of the queue in the queue unchanged, wherein the second segment is a part of the queue except the first segment, when the first number is greater than or equal to the second number. The first control unit 230 may perform, for example, step S130 described in fig. 2 or 4.
For example, the first acquisition unit 210, the second acquisition unit 220, and the first control unit 230 may be hardware, software, firmware, or any feasible combination thereof. For example, the first acquisition unit 210, the second acquisition unit 220, and the first control unit 230 may be dedicated or general-purpose circuits, chips, devices, or the like, or may be a combination of a processor and a memory. The embodiments of the present disclosure do not limit the specific implementation forms of these units.
For example, the first acquisition unit 210, the second acquisition unit 220, and the first control unit 230 may include codes and programs stored in a memory; the processor may execute the code and the program to implement some or all of the functions of the first acquisition unit 210, the second acquisition unit 220, and the first control unit 230 as described above. For example, the first acquisition unit 210, the second acquisition unit 220, and the first control unit 230 may be dedicated hardware devices for implementing some or all of the functions of the first acquisition unit 210, the second acquisition unit 220, and the first control unit 230 as described above. For example, the first acquisition unit 210, the second acquisition unit 220, and the first control unit 230 may be one circuit board or a combination of circuit boards for realizing the functions as described above. In an embodiment of the present disclosure, the circuit board or the combination of circuit boards may include: (1) one or more processors; (2) One or more non-transitory memories coupled to the processor; and (3) firmware stored in the memory that is executable by the processor.
It should be noted that, in the embodiment of the present disclosure, each unit of the queue control apparatus 200 corresponds to each step of the foregoing queue control method, and the detailed description of the queue control apparatus 200 may be referred to as related description of the queue control method, which is not repeated herein. The components and structures of the queue control device 200 shown in fig. 11 are exemplary only and not limiting, and the queue control device 200 may include other components and structures as desired. The queue control device 200 may include more or less circuits or units, and the connection relationship between the respective circuits or units is not limited, and may be determined according to actual requirements. The specific configuration of each circuit or unit is not limited, and may be constituted by an analog device according to the circuit principle, a digital chip, or other applicable means.
For example, the queue control device provided in at least one example of the foregoing embodiment of the present disclosure may further include a second control unit. The second control unit is configured to, in a case where the first number is smaller than the second number and the second segment has an empty position, perform the following in the next clock cycle: move the instructions in the second segment that are located on the side of the empty position near the writing end toward the reading end of the queue, move the instructions of the first segment toward the second segment, and write at least part of the instructions to be written into the queue into the first segment.
For example, in the queue control device provided by at least one example of the above-described embodiment of the present disclosure, in the next clock cycle, the instructions of the first segment and the second segment move by at most N positions; the order of the instructions of the first segment and the second segment in the queue is maintained before and after the instructions of the first segment and the second segment are moved.
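For example, one possible shift rule that respects both constraints (each instruction moves by at most N positions, and relative order is unchanged) may be sketched as follows. The rule used here, moving each instruction toward the reading end by the number of empty positions in front of it, capped at N, is an assumption chosen for illustration and is not stated by the disclosure; the list encoding (index 0 at the reading end, `None` for an empty position) is likewise assumed.

```python
from typing import List, Optional

Slot = Optional[str]  # None marks an empty position (assumed encoding)


def shift_toward_read_end(queue: List[Slot], n: int) -> List[Slot]:
    """Move every instruction toward the reading end by at most n positions.

    Each instruction moves by min(empty positions ahead of it, n). Because
    that displacement never decreases from the reading end to the writing
    end, and grows by at most the number of gaps between two instructions,
    no two instructions collide and their order is preserved.
    """
    shifted: List[Slot] = [None] * len(queue)
    holes_seen = 0  # empty positions between the reading end and this slot
    for index, slot in enumerate(queue):
        if slot is None:
            holes_seen += 1
        else:
            shifted[index - min(holes_seen, n)] = slot
    return shifted
```

With N = 2, the queue `[None, None, None, "a", None, "b"]` becomes `[None, "a", None, "b", None, None]`: both instructions move exactly two positions and "a" still precedes "b".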
For example, in the queue control device provided by at least one example of the above-described embodiment of the present disclosure, the first control unit is further configured to: in a case where the first number is greater than or equal to the second number and the first number is equal to N, in the next clock cycle, sequentially write the instructions to be written into the queue, starting from the position in the first segment closest to the reading end and proceeding in the direction away from the reading end.
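For example, the case in which the first segment is entirely empty (the first number equals N) may be sketched as follows, under the assumed encoding of the queue as a list with index 0 at the reading end, the last index at the writing end, and `None` for an empty position. The function name is hypothetical.

```python
from typing import List, Optional

Slot = Optional[str]  # None marks an empty position (assumed encoding)


def write_into_empty_first_segment(queue: List[Slot], n: int,
                                   pending: List[str]) -> List[Slot]:
    """Write pending instructions into a fully empty first segment.

    Writing starts at the first-segment position closest to the reading
    end and proceeds toward the writing end. The second segment is left
    untouched.
    """
    if any(slot is not None for slot in queue[-n:]):
        raise ValueError("first segment is not entirely empty")
    if len(pending) > n:
        raise ValueError("more pending instructions than empty positions")
    start = len(queue) - n  # first-segment position closest to the reading end
    new_queue = list(queue)
    new_queue[start:start + len(pending)] = pending
    return new_queue
```

With N = 3, writing `["c", "d"]` into `["a", "b", None, None, None]` yields `["a", "b", "c", "d", None]`, so the new instructions sit directly behind the older ones.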
For example, in the queue control device provided by at least one example of the above-described embodiment of the present disclosure, the first control unit is further configured to: in a case where the first number is greater than or equal to the second number, the second number is P, and the P positions of the first segment closest to the writing end are not all empty positions, move, in the next clock cycle, the instructions in the P positions toward the reading end within the range of the first segment so that the P positions become empty positions; and sequentially write the P instructions to be written into the queue, starting from the position among the P positions closest to the reading end and proceeding in the direction away from the reading end; wherein P is a positive integer less than N.
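For example, the step of freeing the P positions nearest the writing end may be sketched as follows. As a simplifying assumption, the sketch packs all instructions of the first segment toward the reading end (rather than moving only those inside the P positions), which keeps their order intact; the list encoding (index 0 at the reading end, `None` for an empty position) and the function name are also assumptions.

```python
from typing import List, Optional

Slot = Optional[str]  # None marks an empty position (assumed encoding)


def free_tail_and_write(queue: List[Slot], n: int,
                        pending: List[str]) -> List[Slot]:
    """Free the last P first-segment positions, then write P instructions.

    P is len(pending). Instructions move only inside the first segment;
    the second segment (everything before `boundary`) is not modified.
    """
    p = len(pending)
    boundary = len(queue) - n
    first = list(queue[boundary:])
    occupied = [slot for slot in first if slot is not None]
    if n - len(occupied) < p:
        raise ValueError("first number is smaller than second number")
    # Pack only when the P positions nearest the writing end are not all empty.
    if any(slot is not None for slot in first[n - p:]):
        first = occupied + [None] * (n - len(occupied))
    # Write into the P positions nearest the writing end, oldest first.
    first[n - p:] = pending
    return list(queue[:boundary]) + first
```

With N = 4, the queue `["a", "b", None, "c", None, "d"]` and pending `["e", "f"]` give `["a", "b", "c", "d", "e", "f"]`: "c" and "d" are packed toward the reading end inside the first segment, and "a" and "b" do not move.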
For example, in the queue control device provided by at least one example of the above-described embodiment of the present disclosure, the first control unit is further configured to: in a case where the K consecutive positions in the first segment closest to the writing end are all empty and the second number is less than or equal to K, sequentially write the instructions to be written into the queue, starting from the position among the K positions closest to the reading end and proceeding in the direction away from the reading end; wherein K is a positive integer less than N.
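For example, this case, in which no instruction needs to move at all, may be sketched as follows. The sketch takes K to be the longest run of empty positions at the writing end, which is an assumption; the list encoding (index 0 at the reading end, `None` for an empty position) and the function name are also assumptions.

```python
from typing import List, Optional

Slot = Optional[str]  # None marks an empty position (assumed encoding)


def write_into_trailing_run(queue: List[Slot], n: int,
                            pending: List[str]) -> List[Slot]:
    """Write pending instructions into the run of K empty positions at the
    writing end, starting at the run's position closest to the reading end.
    """
    size = len(queue)
    k = 0  # length of the empty run at the writing end, capped at n
    while k < n and queue[size - 1 - k] is None:
        k += 1
    if len(pending) > k:
        raise ValueError("second number exceeds K")
    start = size - k  # position in the run closest to the reading end
    new_queue = list(queue)
    new_queue[start:start + len(pending)] = pending
    return new_queue
```

With N = 3, the queue `["a", "b", "c", None, None]` has K = 2, and writing `["d"]` yields `["a", "b", "c", "d", None]` without moving any existing instruction.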
For example, in the queue control device provided by at least one example of the above-described embodiment of the present disclosure, the second control unit is further configured to: move the instructions of the first segment toward the second segment by at least one position, so that the R positions in the first segment closest to the writing end are empty; and, if the second number is greater than R, sequentially write R of the instructions to be written into the queue, starting from the position among the R positions closest to the reading end and proceeding in the direction away from the reading end; wherein R is a positive integer less than or equal to N.
For example, in the queue control device provided by at least one example of the above-described embodiment of the present disclosure, the second control unit is further configured to: if the second number is less than or equal to R, sequentially write all of the instructions to be written into the queue, starting from the position among the R positions closest to the reading end and proceeding in the direction away from the reading end.
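For example, the two cases handled by the second control unit (second number greater than R, and second number less than or equal to R) may be sketched together as follows. The sketch assumes a one-position shift into an empty second-segment position adjacent to the first segment; this shift distance, the list encoding (index 0 at the reading end, `None` for an empty position), and the function name are assumptions for illustration.

```python
from typing import List, Optional, Tuple

Slot = Optional[str]  # None marks an empty position (assumed encoding)


def shift_first_segment_and_write(
        queue: List[Slot], n: int,
        pending: List[str]) -> Tuple[List[Slot], List[str]]:
    """Shift the first segment one position toward the second segment,
    then write min(R, second number) instructions into the R empty
    positions at the writing end. Returns (new queue, instructions left over).
    """
    size = len(queue)
    boundary = size - n  # index of the first-segment slot nearest the reading end
    if boundary < 1 or queue[boundary - 1] is not None:
        raise ValueError("no empty second-segment position next to the first segment")
    # Drop the empty slot; everything behind it slides one position forward.
    new_queue = list(queue[:boundary - 1]) + list(queue[boundary:]) + [None]
    r = 0  # empty positions at the writing end after the shift, capped at n
    while r < n and new_queue[size - 1 - r] is None:
        r += 1
    written = min(r, len(pending))
    start = size - r
    new_queue[start:start + written] = pending[:written]
    return new_queue, pending[written:]
```

With N = 3, the queue `["a", None, "b", "c", "d"]` and pending `["e", "f"]` give R = 1 after the shift, so only "e" is written and "f" remains for a later cycle.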
For example, in the queue control device provided in at least one example of the above embodiment of the present disclosure, when the instructions to be written into the queue are written into the first segment, they are written sequentially from earliest to latest according to the time information of the instructions.
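For example, ordering the pending instructions by their time information before writing may be sketched as follows. The `Instruction` record and its integer `time` field (a smaller value meaning an earlier instruction) are hypothetical stand-ins for whatever time information an implementation carries.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instruction:
    name: str
    time: int  # hypothetical age stamp; smaller means earlier


def order_for_writing(pending: List[Instruction]) -> List[Instruction]:
    """Return the pending instructions from earliest to latest.

    The earliest instruction is written first, i.e. into the free position
    closest to the reading end. sorted() is stable, so instructions with
    equal time stamps keep their incoming order.
    """
    return sorted(pending, key=lambda ins: ins.time)
```

For pending instructions with time stamps 7, 3 and 5, the write order is the instruction stamped 3, then 5, then 7.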
At least one embodiment of the present disclosure also provides an electronic device comprising a processor and a memory storing one or more computer program modules. One or more computer program modules are configured to be executed by a processor for implementing the queue control method described above.
Fig. 12 is a schematic block diagram of an electronic device provided by some embodiments of the present disclosure. As shown in fig. 12, the electronic device 300 includes a processor 310 and a memory 320. Memory 320 stores non-transitory computer-readable instructions (e.g., one or more computer program modules). The processor 310 is configured to execute non-transitory computer readable instructions that, when executed by the processor 310, perform one or more steps of the queue control method described above. The memory 320 and the processor 310 may be interconnected by a bus system and/or other forms of connection mechanisms (not shown). For specific implementation of each step of the queue control method and related explanation content, reference may be made to the above embodiment of the queue control method, and the details are not repeated here.
It should be noted that the components of the electronic device 300 shown in fig. 12 are exemplary only and not limiting, and that the electronic device 300 may have other components as desired for practical applications.
For example, the processor 310 and the memory 320 may communicate with each other directly or indirectly.
For example, the processor 310 and the memory 320 may communicate over a network. The network may include a wireless network, a wired network, and/or any combination of wireless and wired networks. Communication between the processor 310 and the memory 320 may also be implemented via a system bus; the present disclosure is not limited in this respect.
For example, the processor 310 may control other components in the electronic device 300 to perform desired functions. For example, the processor 310 may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or other form of processing unit having data processing capabilities and/or program execution capabilities. For example, the Central Processing Unit (CPU) may be an X86 or ARM architecture, or the like. The processor 310 may be a general-purpose processor or a special-purpose processor that may control other components in the electronic device 300 to perform the desired functions.
For example, the memory 320 may comprise any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory. Non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, flash memory, and the like. One or more computer program modules may be stored on the computer-readable storage medium and executed by the processor 310 to implement various functions of the electronic device 300. Various applications, as well as various data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
It should be noted that, in the embodiments of the present disclosure, specific functions and technical effects of the electronic device 300 may refer to the above description about the queue control method, which is not repeated herein.
Fig. 13 is a schematic block diagram of another electronic device provided by some embodiments of the present disclosure. The electronic device 400 is suitable, for example, for implementing the queue control method provided by the embodiments of the present disclosure. The electronic device 400 may be a terminal device or the like. It should be noted that the electronic device 400 shown in fig. 13 is merely an example, and does not impose any limitation on the functionality and scope of use of the embodiments of the present disclosure.
As shown in fig. 13, the electronic device 400 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 410, which may perform various suitable actions and processes according to a program stored in a read-only memory (ROM) 420 or a program loaded from a storage device 480 into a random access memory (RAM) 430. The RAM 430 also stores various programs and data required for the operation of the electronic device 400. The processing device 410, the ROM 420, and the RAM 430 are connected to each other by a bus 440. An input/output (I/O) interface 450 is also connected to the bus 440.
In general, the following devices may be connected to the I/O interface 450: an input device 460 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 470 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 480 including, for example, magnetic tape, hard disk, etc.; and a communication device 490. The communication device 490 may allow the electronic device 400 to communicate wirelessly or by wire with other electronic devices to exchange data. While fig. 13 shows an electronic device 400 having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided, and that the electronic device 400 may alternatively be implemented or provided with more or fewer devices.
For example, according to embodiments of the present disclosure, the above-described queue control method may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer-readable medium, the computer program comprising program code for performing the above-described queue control method. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 490, or installed from the storage device 480 or from the ROM 420. When the computer program is executed by the processing device 410, the functions defined in the queue control method provided by the embodiments of the present disclosure may be implemented.
At least one embodiment of the present disclosure also provides a computer-readable storage medium storing non-transitory computer-readable instructions that, when executed by a computer, implement the above-described queue control method.
Fig. 14 is a schematic diagram of a storage medium according to some embodiments of the present disclosure. As shown in fig. 14, the storage medium 500 stores non-transitory computer readable instructions 510. For example, non-transitory computer readable instructions 510, when executed by a computer, perform one or more steps in accordance with the queue control method described above.
For example, the storage medium 500 may be applied to the electronic device 300 described above. For example, the storage medium 500 may be the memory 320 in the electronic device 300 shown in fig. 12. For example, the relevant description of the storage medium 500 may refer to the corresponding description of the memory 320 in the electronic device 300 shown in fig. 12, and will not be repeated here.
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to the specific combinations of features described above, but also covers other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure, for example, embodiments formed by substituting the features described above with (but not limited to) technical features having similar functions disclosed in the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.
For the purposes of this disclosure, the following points are also noted:
(1) The drawings of the embodiments of the present disclosure relate only to the structures involved in the embodiments of the present disclosure; for other structures, reference may be made to conventional designs.
(2) The embodiments of the present disclosure and features in the embodiments may be combined with each other to arrive at a new embodiment without conflict.
The foregoing is merely specific embodiments of the disclosure, but the scope of the disclosure is not limited thereto, and the scope of the disclosure should be determined by the claims.

Claims (15)

1. A queue control method, comprising:
acquiring a first number of empty positions in a current clock cycle in a first segment of a queue, wherein the first segment is N positions nearest to a writing end of the queue;
acquiring a second number of instructions that need to be written to the queue in a next clock cycle after the current clock cycle;
in a case where the first number is greater than or equal to the second number, writing the instructions that need to be written to the queue into the first segment in the next clock cycle, and keeping unchanged the positions in the queue of the instructions in a second segment of the queue other than any instruction that is read out, wherein the second segment is the part of the queue other than the first segment;
wherein N is a positive integer.
2. The queue control method of claim 1, further comprising:
in a case where the first number is smaller than the second number, if the second segment has an empty position, then in the next clock cycle: moving the instructions in the second segment that are located on the side of the empty position near the writing end toward a reading end of the queue, moving the instructions of the first segment toward the second segment, and writing at least part of the instructions that need to be written to the queue into the first segment.
3. The queue control method of claim 2, wherein,
in the next clock cycle, instructions of the first segment and the second segment move by at most N positions;
the order of the instructions of the first segment and the second segment in the queue is maintained before and after the instructions of the first segment and the second segment are moved.
4. The queue control method according to any one of claims 1 to 3, wherein, in the case where the first number is greater than or equal to the second number, writing the instructions that need to be written to the queue into the first segment in the next clock cycle comprises:
in a case where the first number is greater than or equal to the second number and the first number is equal to N, in the next clock cycle, sequentially writing the instructions that need to be written to the queue, starting from the position in the first segment closest to the reading end and proceeding in the direction away from the reading end.
5. The queue control method according to any one of claims 1 to 3, wherein, in the case where the first number is greater than or equal to the second number, writing the instructions that need to be written to the queue into the first segment in the next clock cycle comprises:
in a case where the first number is greater than or equal to the second number, the second number is P, and the P positions of the first segment closest to the writing end are not all empty positions, moving, in the next clock cycle, the instructions in the P positions toward the reading end within the range of the first segment so that the P positions become empty positions;
sequentially writing the P instructions that need to be written to the queue, starting from the position among the P positions closest to the reading end and proceeding in the direction away from the reading end;
wherein P is a positive integer less than N.
6. The queue control method according to any one of claims 1 to 3, wherein, in the case where the first number is greater than or equal to the second number, writing the instructions that need to be written to the queue into the first segment in the next clock cycle comprises:
in a case where the K consecutive positions in the first segment closest to the writing end are all empty and the second number is less than or equal to K, sequentially writing the instructions that need to be written to the queue, starting from the position among the K positions closest to the reading end and proceeding in the direction away from the reading end;
wherein K is a positive integer less than N.
7. The queue control method of claim 2, wherein moving the instruction of the first segment toward the second segment and writing at least a portion of the instruction that needs to be written to the queue to the first segment comprises:
moving the instructions of the first segment toward the second segment by at least one position, so that the R positions in the first segment closest to the writing end are empty;
if the second number is greater than R, sequentially writing R of the instructions that need to be written to the queue, starting from the position among the R positions closest to the reading end and proceeding in the direction away from the reading end;
wherein R is a positive integer less than or equal to N.
8. The queue control method of claim 7, wherein moving the instruction of the first segment toward the second segment and writing at least a portion of the instruction that needs to be written to the queue to the first segment further comprises:
if the second number is less than or equal to R, sequentially writing the instructions that need to be written to the queue, starting from the position among the R positions closest to the reading end and proceeding in the direction away from the reading end.
9. A queue control device comprising:
a first acquisition unit configured to acquire a first number of empty positions in a current clock cycle in a first segment of a queue, wherein the first segment is N positions nearest to a writing end of the queue;
a second acquisition unit configured to acquire a second number of instructions that need to be written to the queue in a next clock cycle after the current clock cycle;
a first control unit configured to, in a case where the first number is greater than or equal to the second number, write the instructions that need to be written to the queue into the first segment in the next clock cycle, and keep unchanged the positions in the queue of the instructions in a second segment of the queue other than any instruction that is read out, wherein the second segment is the part of the queue other than the first segment;
wherein N is a positive integer.
10. The queue control device of claim 9, further comprising:
a second control unit configured to, in a case where the first number is smaller than the second number and the second segment has an empty position, in the next clock cycle: move the instructions in the second segment that are located on the side of the empty position near the writing end toward a reading end of the queue, move the instructions of the first segment toward the second segment, and write at least part of the instructions that need to be written to the queue into the first segment.
11. The queue control device of claim 10, wherein,
in the next clock cycle, instructions of the first segment and the second segment move by at most N positions;
the order of the instructions of the first segment and the second segment in the queue is maintained before and after the instructions of the first segment and the second segment are moved.
12. The queue control device according to one of claims 9 to 11, wherein the first control unit is further configured to:
in a case where the first number is greater than or equal to the second number and the first number is equal to N, in the next clock cycle, sequentially write the instructions that need to be written to the queue, starting from the position in the first segment closest to the reading end and proceeding in the direction away from the reading end.
13. The queue control device according to one of claims 9 to 11, wherein the first control unit is further configured to:
in a case where the first number is greater than or equal to the second number, the second number is P, and the P positions of the first segment closest to the writing end are not all empty positions, move, in the next clock cycle, the instructions in the P positions toward the reading end within the range of the first segment so that the P positions become empty positions;
sequentially write the P instructions that need to be written to the queue, starting from the position among the P positions closest to the reading end and proceeding in the direction away from the reading end;
wherein P is a positive integer less than N.
14. An electronic device, comprising:
a processor;
a memory storing one or more computer program modules;
wherein the one or more computer program modules are configured to be executed by the processor for implementing the queue control method of any one of claims 1-9.
15. A computer readable storage medium storing non-transitory computer readable instructions which, when executed by a computer, implement the queue control method of any one of claims 1-9.
CN202310731287.4A 2023-06-19 2023-06-19 Queue control method, apparatus, electronic device, and computer-readable storage medium Active CN116627507B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310731287.4A CN116627507B (en) 2023-06-19 2023-06-19 Queue control method, apparatus, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310731287.4A CN116627507B (en) 2023-06-19 2023-06-19 Queue control method, apparatus, electronic device, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN116627507A true CN116627507A (en) 2023-08-22
CN116627507B CN116627507B (en) 2024-04-12

Family

ID=87636653

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310731287.4A Active CN116627507B (en) 2023-06-19 2023-06-19 Queue control method, apparatus, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN116627507B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6112019A (en) * 1995-06-12 2000-08-29 Georgia Tech Research Corp. Distributed instruction queue
US20060095750A1 (en) * 2004-08-30 2006-05-04 Nye Jeffrey L Processes, circuits, devices, and systems for branch prediction and other processor improvements
WO2012138950A2 (en) * 2011-04-07 2012-10-11 Via Technologies, Inc. Conditional load instructions in an out-of-order execution microprocessor
US20140122843A1 (en) * 2011-04-07 2014-05-01 G. Glenn Henry Conditional store instructions in an out-of-order execution microprocessor
CN110362348A (en) * 2018-04-09 2019-10-22 武汉斗鱼网络科技有限公司 A kind of method, apparatus and electronic equipment of queue access data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
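Liu Bingtao et al.: "Spatial instruction scheduling method based on dataflow blocks", Journal of Computer Research and Development, 14 April 2017 (2017-04-14), pages 750 - 763 *
Huang Libo et al.: "Research on processor value prediction techniques", Acta Electronica Sinica, vol. 51, no. 12, 31 March 2023 (2023-03-31), pages 1 - 28 *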

Also Published As

Publication number Publication date
CN116627507B (en) 2024-04-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant