WO2019105029A1 - Deallocation command processing method and storage device thereof - Google Patents

Deallocation command processing method and storage device thereof

Info

Publication number
WO2019105029A1
WO2019105029A1 (PCT/CN2018/093483)
Authority
WO
WIPO (PCT)
Prior art keywords
allocation
command
allocation table
cpu
present application
Prior art date
Application number
PCT/CN2018/093483
Other languages
English (en)
French (fr)
Inventor
居颖轶
袁戎
孙宝勇
郭志红
高会娟
蔡述楠
Original Assignee
北京忆恒创源科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201711222238.9A external-priority patent/CN109840048B/zh
Priority claimed from CN201810594487.9A external-priority patent/CN110580228A/zh
Application filed by 北京忆恒创源科技有限公司 filed Critical 北京忆恒创源科技有限公司
Priority to US17/044,457 priority Critical patent/US11397672B2/en
Publication of WO2019105029A1 publication Critical patent/WO2019105029A1/zh
Priority to US17/846,524 priority patent/US20220327049A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • G06F12/0238Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • G06F12/0253Garbage collection, i.e. reclamation of unreferenced memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/0292User address space allocation, e.g. contiguous or non contiguous base addressing using tables or multilevel address translation means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022Mechanisms to release resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1041Resource optimization
    • G06F2212/1044Space efficiency improvement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72Details relating to flash memory management
    • G06F2212/7201Logical to physical mapping or translation of blocks or pages

Definitions

  • The present application relates to the field of storage technologies, and in particular to a method for processing deallocation commands and a storage device thereof.
  • FIG. 1 shows a block diagram of a solid state storage device.
  • the solid state storage device 102 is coupled to the host for providing storage capabilities to the host.
  • The host and the solid-state storage device 102 can be coupled in various manners, including but not limited to SATA (Serial Advanced Technology Attachment), SCSI (Small Computer System Interface), SAS (Serial Attached SCSI), IDE (Integrated Drive Electronics), USB (Universal Serial Bus), PCIe (Peripheral Component Interconnect Express), NVMe (NVM Express), Ethernet, Fibre Channel, wireless communication networks, and other ways of connecting the host to the solid-state storage device 102.
  • the host may be an information processing device capable of communicating with the storage device in the manner described above, such as a personal computer, tablet, server, portable computer, network switch, router, cellular telephone, personal digital assistant, and the like.
  • the storage device 102 includes an interface 103, a control unit 104, one or more NVM chips 105, and a DRAM (Dynamic Random Access Memory) 110.
  • a NAND flash memory, a phase change memory, a FeRAM (Ferroelectric RAM), an MRAM (Magnetic Random Access Memory), an RRAM (Resistive Random Access Memory), and the like are common NVMs.
  • the interface 103 can be adapted to exchange data with the host via, for example, SATA, IDE, USB, PCIE, NVMe, SAS, Ethernet, Fibre Channel, and the like.
  • Control component 104 is used to control data transfers between interface 103, NVM chip 105, and DRAM 110, as well as for storage management, host logical address to flash physical address mapping, erase equalization, bad block management, and the like.
  • the control unit 104 can be implemented in various manners of software, hardware, firmware, or a combination thereof.
  • The control unit 104 can be an FPGA (Field-Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), or a combination thereof.
  • Control component 104 may also include a processor or controller that executes software in the processor or controller to manipulate the hardware of control component 104 to process IO (Input/Output) commands.
  • Control component 104 can also be coupled to DRAM 110 and can access data from DRAM 110.
  • the DRAM can store data for FTL tables and/or cached IO commands.
  • the control component 104 includes a flash interface controller (or media interface controller, flash channel controller) coupled to the NVM chip 105 and issuing commands to the NVM chip 105 in a manner that follows the interface protocol of the NVM chip 105.
  • Known NVM chip interface protocols include "Toggle”, "ONFI”, and the like.
  • A memory target is one or more logical units (LUNs, Logic UNits) sharing a CE (Chip Enable) signal within a NAND flash package.
  • the logic unit corresponds to a single die.
  • The logic unit can include a plurality of planes. Multiple planes within a logical unit can be accessed in parallel, while multiple logical units within a NAND flash chip can execute commands and report status independently of each other.
  • a block (also called a physical block) contains multiple pages.
  • a page on a storage medium (referred to as a physical page) has a fixed size, such as 17,664 bytes. Physical pages can also have other sizes.
  • A large block (also called a chunk) includes physical blocks from each of a plurality of logical units (LUNs); the plurality of logical units is also referred to as a logical unit group. Each logical unit can provide one physical block for a large block. For example, in the schematic diagram of the large block shown in FIG. 2, a large block is constructed on every 16 logical units (LUNs). Each chunk consists of 16 physical blocks from the 16 logical units (LUNs). In the example of FIG. 2, chunk 0 includes physical block 0 from each of the 16 logical units (LUNs), while chunk 1 includes physical block 1 from each logical unit (LUN). There are many other ways to construct chunks.
  • the physical pages of the same physical address within each logical unit (LUN) constitute a "page strip.”
  • The physical page P0-0, the physical page P0-1, ... and the physical page P0-x constitute page stripe 0, wherein the physical pages P0-0, P0-1, ... P0-14 are used to store user data, while the physical page P0-x is used to store parity data calculated from all user data within the stripe.
  • the physical page P2-0, the physical page P2-1, ... and the physical page P2-x constitute the page strip 2.
  • the physical page used to store the checksum data can be located anywhere in the page strip.
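  • As an illustrative sketch (not the patent's implementation), the parity page of a page stripe can be computed by XOR-ing the user-data pages of the stripe; the page count and page size below are example values taken from the description above.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGES_PER_STRIPE 16          /* assumed: 15 user pages + 1 parity page */
#define PAGE_SIZE        17664       /* physical page size used as an example above */

/* XOR-accumulate the 15 user pages of a stripe into the parity page. */
void build_stripe_parity(const uint8_t user[PAGES_PER_STRIPE - 1][PAGE_SIZE],
                         uint8_t parity[PAGE_SIZE])
{
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        uint8_t acc = 0;
        for (size_t p = 0; p < PAGES_PER_STRIPE - 1; p++)
            acc ^= user[p][i];
        parity[i] = acc;
    }
}
```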
  • A further construction of a large block is provided in FIG. 3A of Chinese patent application No. 201710752321.0 and its specification.
  • the logical address constitutes the storage space of the solid-state storage device perceived by the upper layer software such as the operating system.
  • the physical address is the address of the physical storage unit used to access the solid state storage device.
  • the address mapping can also be implemented in the related art using the intermediate address form. For example, a logical address is mapped to an intermediate address, and the intermediate address is further mapped to a physical address.
  • a table structure that stores mapping information from a logical address to a physical address is called an FTL table.
  • FTL tables are important metadata in solid state storage devices.
  • the data items of the FTL table record the address mapping relationship in units of data units in the solid state storage device.
  • the logical page in the FTL table corresponds to 4KB of storage space, while the physical page has 4KB of storage space (including additional out-of-band storage space).
  • the FTL table provides a record for each 4KB data unit to record its logical address to physical address mapping.
  • The size of the storage space corresponding to a data unit may also differ from the storage space of a physical page, and a physical page can accommodate multiple data units; for example, a data unit corresponds to 4 KB of storage space while the physical page can accommodate multiple data units (for example, 4).
  • The FTL table includes a plurality of FTL table entries (or table items).
  • a correspondence between a logical page address and a physical page is recorded in each FTL table entry.
  • a correspondence between consecutive multiple logical page addresses and consecutive multiple physical pages is recorded in each FTL table entry.
  • the correspondence between the logical block address and the physical block address is recorded in each FTL table entry.
  • the mapping relationship between the logical block address and the physical block address, and/or the mapping relationship between the logical page address and the physical page address is recorded in the FTL table.
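  • As a minimal sketch of an FTL table as described above, assuming one entry per 4 KB logical page stored in a flat array indexed by logical page address (names such as ftl_lookup are illustrative, not from the patent):

```c
#include <stdint.h>
#include <stdbool.h>

#define FTL_DEALLOCATED 0u   /* special mark: entry carries no valid physical address */

typedef uint32_t lpa_t;      /* logical page address (4 KB granularity, assumed) */
typedef uint32_t ppa_t;      /* physical page address (block/page packed, assumed) */

typedef struct {
    ppa_t *entries;          /* entries[lpa] = physical address, or FTL_DEALLOCATED */
    lpa_t  num_entries;
} ftl_table_t;

/* Return true and the mapped physical address if the logical page is mapped. */
bool ftl_lookup(const ftl_table_t *ftl, lpa_t lpa, ppa_t *out)
{
    if (lpa >= ftl->num_entries || ftl->entries[lpa] == FTL_DEALLOCATED)
        return false;
    *out = ftl->entries[lpa];
    return true;
}

/* Record a new logical-to-physical mapping, e.g. when servicing a write. */
void ftl_update(ftl_table_t *ftl, lpa_t lpa, ppa_t ppa)
{
    ftl->entries[lpa] = ppa;
}
```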
  • When processing a read command from the host, the solid-state storage device obtains the corresponding physical address from the FTL table using the logical address carried in the read command, issues a read request to the NVM chip according to the physical address, and receives the data output by the NVM chip in response to the read request.
  • When processing a write command from the host, the solid-state storage device allocates a physical address for the write command, records the correspondence between the logical address of the write command and the allocated physical address in the FTL table, and issues a write request to the NVM chip according to the allocated physical address.
  • The Trim command is defined in ATA8-ACS2. Commands with the same or similar meaning are called UNMAP (Unmap) in the SCSI (Small Computer System Interface) specification and Deallocate in the NVMe specification.
  • In the present application, "deallocation" is used to denote a data set management command having the same or similar function as the Trim command of ATA8-ACS2, the UNMAP command of SCSI, or the Deallocate command of NVMe, and also denotes commands with the same or similar function appearing in other or future protocols, specifications, or technologies.
  • A deallocation command describes a logical address range. Executing the deallocation command can have different effects. For example: (1) after the deallocation command is executed, reading the logical address range indicated by the command (before any other write to that range) returns a deterministic value; (2) after the deallocation command is executed, reading the logical address range indicated by the command (before any other write to that range) returns all zeros; (3) after the deallocation command is executed, reading the logical address range indicated by the command (before any other write to that range) may return an arbitrary value. The execution effect of the deallocation command can be set or selected in the deallocation command or in another command.
  • the host can tell the SSD which logical address space no longer stores valid data, so that the SSD does not have to move the expired data when reclaiming storage space.
  • As the size of the FTL table increases, a large number of memory access operations occur during the execution of a deallocation command, which seriously prolongs the processing of the deallocation command.
  • The processing delay of deallocation commands needs to be reduced. Further, there is a need to reduce the delay that the processing of deallocation commands introduces into IO command processing, and to mitigate the effect of deallocation command processing on IO command processing bandwidth.
  • A method performed by a first storage device, comprising: receiving a read command; if no entry in the deallocation table is marked as "deallocated", querying the FTL table to obtain the physical address corresponding to the logical address accessed by the read command; and obtaining data from the physical address as a response to the read command.
  • A method performed by a second storage device, further comprising: if at least one entry in the deallocation table is marked as "deallocated", querying the deallocation table to determine whether the logical address accessed by the read command has been deallocated; and, if the logical address accessed by the read command is marked as "deallocated" in the deallocation table, using a specified value as the response to the read command.
  • A method performed by a third storage device, further comprising: if the logical address accessed by the read command is not marked as "deallocated" in the deallocation table, obtaining data according to the physical address obtained by querying the FTL table as the response to the read command.
  • The state of the deallocation table indicates whether any entry in the deallocation table is marked as "deallocated".
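  • As an illustrative sketch of the read path of the first to fourth methods above (function names and the all-zero specified value are assumptions, not the patent's implementation):

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

typedef uint32_t lpa_t;
typedef uint32_t ppa_t;

/* Illustrative helpers assumed to exist elsewhere in the firmware. */
extern bool dealloc_any_marked(void);              /* state of the deallocation table       */
extern bool dealloc_is_marked(lpa_t lpa);          /* is this logical address deallocated?  */
extern bool ftl_lookup(lpa_t lpa, ppa_t *ppa);     /* FTL query                             */
extern void nvm_read(ppa_t ppa, void *buf);        /* read one data unit from the NVM       */

#define DATA_UNIT 4096
#define SPECIFIED_VALUE 0x00    /* e.g. "read after deallocate returns all zeros" */

void handle_read(lpa_t lpa, uint8_t buf[DATA_UNIT])
{
    /* Only consult the deallocation table when at least one entry is marked. */
    if (dealloc_any_marked() && dealloc_is_marked(lpa)) {
        memset(buf, SPECIFIED_VALUE, DATA_UNIT);   /* respond with the specified value */
        return;
    }
    ppa_t ppa;
    if (ftl_lookup(lpa, &ppa))
        nvm_read(ppa, buf);                        /* respond with data from the NVM */
    else
        memset(buf, SPECIFIED_VALUE, DATA_UNIT);   /* FTL entry carries the special mark */
}
```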
  • A method performed by a fifth storage device, further comprising: in response to receiving a deallocation command, setting the state of the deallocation table to indicate that at least one entry in the deallocation table is marked as "deallocated".
  • A method performed by a sixth storage device, further comprising: scanning the deallocation table, and, according to the logical address corresponding to an entry marked as "deallocated" in the deallocation table, updating the entry of the FTL table corresponding to that logical address to indicate "deallocated".
  • A method performed by a seventh storage device according to the first aspect of the present application, further comprising: in response to completion of scanning the deallocation table, setting the state of the deallocation table to indicate that no entry in the deallocation table is marked as "deallocated".
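  • A sketch of the scan described in the preceding methods: entries marked "deallocated" are propagated into the FTL table, and the deallocation table state is reset once the scan completes; all function names are assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t lpa_t;

extern lpa_t dealloc_table_size(void);
extern bool  dealloc_is_marked(lpa_t lpa);
extern void  dealloc_clear(lpa_t lpa);
extern void  ftl_mark_deallocated(lpa_t lpa);       /* write the special mark into the FTL entry */
extern void  dealloc_set_state_no_record(void);     /* "no entry is marked as deallocated"       */

void scan_dealloc_table(void)
{
    for (lpa_t lpa = 0; lpa < dealloc_table_size(); lpa++) {
        if (dealloc_is_marked(lpa)) {
            ftl_mark_deallocated(lpa);   /* update the FTL entry to indicate "deallocated" */
            dealloc_clear(lpa);
        }
    }
    /* Scan complete: no entry in the deallocation table is marked any more. */
    dealloc_set_state_no_record();
}
```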
  • A method performed by an eighth storage device, further comprising: in response to receiving a write command, allocating a physical address for the write command; if no entry in the deallocation table is marked as "deallocated", updating the FTL table with the logical address of the write command and the allocated physical address; and writing data according to the physical address.
  • A method performed by a ninth storage device, further comprising: if at least one entry in the deallocation table is marked as "deallocated", clearing the "deallocated" flag from the entry of the deallocation table corresponding to the logical address of the write command.
  • A method performed by a tenth storage device according to the first aspect of the present application, further comprising: before writing data according to the physical address, indicating to the issuer of the write command that the write command has been processed.
  • According to the eighth storage device of the first aspect of the present application, there is provided a method performed by an eleventh storage device according to the first aspect of the present application, wherein, in response to confirming that no entry in the deallocation table is marked as "deallocated", the issuer of the write command is informed that the write command has been processed.
  • According to the eighth or ninth storage device of the first aspect of the present application, there is provided a method performed by a twelfth storage device according to the first aspect of the present application, wherein, in response to learning that at least one entry in the deallocation table is marked as "deallocated", the issuer of the write command is informed that the write command has been processed after the "deallocated" flag is cleared from the entry of the deallocation table corresponding to the logical address of the write command.
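  • The write path of the eighth to twelfth methods might look like the following sketch, with the early completion to the issuer and the clearing of the "deallocated" flag shown in the order described above; all names are illustrative.

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t lpa_t;
typedef uint32_t ppa_t;

extern ppa_t alloc_physical_address(void);
extern bool  dealloc_any_marked(void);
extern void  dealloc_clear(lpa_t lpa);
extern void  ftl_update(lpa_t lpa, ppa_t ppa);
extern void  nvm_write(ppa_t ppa, const void *buf);
extern void  complete_to_host(void);     /* tell the issuer the write command is processed */

void handle_write(lpa_t lpa, const void *buf)
{
    ppa_t ppa = alloc_physical_address();          /* allocate a physical address */

    if (dealloc_any_marked())
        dealloc_clear(lpa);                        /* the address is being rewritten, so it is
                                                      no longer "deallocated" */
    ftl_update(lpa, ppa);                          /* record the logical-to-physical mapping */

    complete_to_host();                            /* completion may be indicated before the
                                                      data actually reaches the NVM */
    nvm_write(ppa, buf);
}
```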
  • the de-allocation table comprises a first de-allocation table and a second de-allocation table corresponding to the first FTL table and the second FTL table respectively;
  • The state of the first deallocation table indicates whether any entry in the deallocation tables, comprising the first deallocation table and the second deallocation table, is marked as "deallocated";
  • The state of the second deallocation table indicates whether any entry in the deallocation tables, comprising the first deallocation table and the second deallocation table, is marked as "deallocated".
  • A method performed by a fourteenth storage device according to the first aspect of the present application, further comprising: in response to receiving a deallocation command, setting the state of the first deallocation table and the state of the second deallocation table to indicate that at least one entry in the deallocation tables is marked as "deallocated".
  • A method performed by a fifteenth storage device, further comprising: in response to completion of scanning the first deallocation table, setting the state of the first deallocation table to "scan complete"; and, in response to completion of scanning the second deallocation table, setting the state of the second deallocation table to "scan complete".
  • A method performed by a sixteenth storage device, further comprising: in response to the state of the first deallocation table being set to "scan complete", if the state of the second deallocation table is "scan complete" or "no record", setting the state of the first deallocation table to "no record", wherein the "no record" state indicates that no entry in the deallocation tables, comprising the first deallocation table and the second deallocation table, is marked as "deallocated".
  • A method performed by a seventeenth storage device, further comprising: in response to the state of the second deallocation table being set to "scan complete", if the state of the first deallocation table is "scan complete" or "no record", setting the state of the second deallocation table to "no record", wherein the "no record" state indicates that no entry in the deallocation tables, comprising the first deallocation table and the second deallocation table, is marked as "deallocated".
  • The first CPU processes deallocation commands accessing the first deallocation table, and the second CPU processes deallocation commands accessing the second deallocation table.
  • According to the method performed by the eighteenth storage device of the first aspect of the present application, there is provided a method performed by a nineteenth storage device according to the first aspect of the present application, further comprising: in response to the state of the first deallocation table being updated, the first CPU notifying the second CPU that the state of the first deallocation table has been updated.
  • A method performed by a twentieth storage device, further comprising: in response to the state of the second deallocation table being updated, the second CPU notifying the first CPU that the state of the second deallocation table has been updated.
  • A method performed by a twenty-first storage device according to the first aspect of the present application, further comprising: acquiring an update log of the deallocation table; and, if the update log of the deallocation table is compressible, caching the acquired update log of the deallocation table.
  • A method performed by a twenty-second storage device according to the first aspect of the present application, further comprising: if the update log of the deallocation table is incompressible, writing the update log of the deallocation table to the NVM chip.
  • A method performed by a twenty-third storage device according to the first aspect of the present application, wherein whether the update log of the deallocation table is compressible is determined according to whether the acquired update log of the deallocation table is contiguous with the cached update log of the deallocation table.
  • A method performed by a twenty-fourth storage device according to the first aspect of the present application, wherein whether the update log of the deallocation table is compressible is determined according to whether the acquired update log of the deallocation table is contiguous with the cached update log of the deallocation table, and whether the acquired update log of the deallocation table has the same source as the cached update log of the deallocation table.
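  • A sketch of the compressibility test described in the preceding paragraphs, assuming an update-log entry records a source identifier and a contiguous address range; the struct and function names are illustrative, not taken from the patent.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative update-log entry for the deallocation table. */
typedef struct {
    uint32_t source;       /* which CPU / deallocation table produced the entry (assumed) */
    uint32_t start_lba;    /* first logical address covered by the entry */
    uint32_t count;        /* number of logical addresses covered */
} dealloc_log_t;

extern void write_log_to_nvm(const dealloc_log_t *e);

static dealloc_log_t cached;      /* the currently cached (not yet persisted) log entry */
static bool          have_cached; /* whether 'cached' holds a valid entry */

/* A new entry is compressible if it comes from the same source and is
 * contiguous with the cached entry. */
static bool compressible(const dealloc_log_t *e)
{
    return have_cached &&
           e->source == cached.source &&
           e->start_lba == cached.start_lba + cached.count;
}

void push_dealloc_log(const dealloc_log_t *e)
{
    if (compressible(e)) {
        cached.count += e->count;   /* merge: just extend the cached range */
        return;
    }
    if (have_cached)
        write_log_to_nvm(&cached);  /* incompressible: persist the cached entry */
    cached = *e;
    have_cached = true;
}
```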
  • A first storage device comprising a control unit, a memory and an NVM chip, wherein the memory stores a deallocation table and an FTL table, and the control unit performs one of the methods performed by the first to twenty-fourth storage devices according to the first aspect of the present application.
  • A first storage device comprising a control unit, a memory and an NVM chip, wherein the memory stores a deallocation table and an FTL table, and the control unit comprises a first CPU and a second CPU; the first CPU and the second CPU each perform one of the methods performed by the first to twenty-fourth storage devices according to the first aspect of the present application.
  • a second storage device according to the third aspect of the present application, wherein the memory stores the first deallocation table, the second deallocation table, the first FTL table, and the second FTL table
  • the first de-allocation table is for the first FTL table
  • the second de-allocation table is for the second FTL table
  • the first CPU maintains the state of the first de-allocation table
  • the second CPU maintains the state of the second de-allocation table.
  • A third storage device according to the third aspect of the present application, further comprising a distributor for assigning a received command to one of the first CPU or the second CPU.
  • A fourth storage device, wherein the distributor assigns a deallocation command to one of the first or second CPUs according to the logical address accessed by the deallocation command, so that deallocation commands accessing the same logical address are assigned to the same CPU.
  • A fifth storage device according to the third aspect of the present application, wherein the distributor randomly or alternately assigns a read command or a write command to one of the first CPU or the second CPU.
  • A method for processing a deallocation command according to a first processing of the fourth aspect of the present application, comprising: in response to receiving a deallocation command, obtaining the address range indicated by the deallocation command; and updating entries of the deallocation table according to the indicated address range.
  • According to the method for processing a deallocation command of the first processing of the fourth aspect of the present application, there is provided a method for processing a deallocation command according to a second processing of the fourth aspect of the present application, further comprising: updating entries of the FTL table according to the address range indicated by the deallocation command; wherein the FTL table records the physical address corresponding to the logical address.
  • According to the method for processing a deallocation command of the first or second processing of the fourth aspect of the present application, there is provided a method for processing a deallocation command according to a third processing of the fourth aspect of the present application, wherein the FTL entries indicated by the logical address range described by the deallocation command are set to a special tag in the FTL table.
  • The deallocation table stores, for each address, information indicating whether that address has been allocated.
  • In response to an address being allocated, the address is marked as allocated in the deallocation table.
  • A method for processing a deallocation command according to a sixth processing of the fourth aspect of the present application, wherein, when an address has not been allocated or has been targeted by a deallocation command, the address is marked as deallocated in the deallocation table.
  • There is provided a method for processing a deallocation command according to a seventh processing of the fourth aspect of the present application, wherein the method, after updating the deallocation table, informs the host that execution of the deallocation command is complete.
  • A method for processing a deallocation command according to a ninth processing of the fourth aspect of the present application, wherein, before updating the FTL table, the entries of the FTL table corresponding to the one or more addresses are locked.
  • A first storage device, comprising: a control unit that performs one of the methods for processing a deallocation command according to the fourth aspect of the present application; and an external memory and a non-volatile memory coupled to the control unit; wherein the deallocation table is stored in an internal memory of the control unit of the solid-state storage device or in the external memory.
  • According to the first storage device of the fifth aspect of the present application, there is provided a second storage device according to the fifth aspect of the present application, wherein the deallocation table is written to the non-volatile memory when the solid-state storage device is powered off.
  • A system for processing a deallocation command according to a first processing of the sixth aspect of the present application, comprising: a control unit and an external memory; the control unit comprising a distributor and a plurality of CPUs; the distributor receives IO commands and assigns the IO commands to the plurality of CPUs; the plurality of CPUs are configured to process the received IO commands in parallel; and the external memory stores the deallocation table.
  • According to the system for processing a deallocation command of the first processing of the sixth aspect of the present application, there is provided a system for processing a deallocation command according to a second processing of the sixth aspect of the present application, wherein the memory also stores the FTL table, and the FTL table records the physical address corresponding to the logical address.
  • According to the system for processing a deallocation command of the first processing of the sixth aspect of the present application, there is provided a system for processing a deallocation command according to a third processing of the sixth aspect of the present application, wherein the deallocation table is divided into a plurality of sections, each section being maintained by one of the plurality of CPUs.
  • A system for processing a deallocation command according to a fourth processing of the sixth aspect of the present application, wherein the distributor provides the deallocation command to the plurality of CPUs simultaneously, and each CPU processes the portion of the deallocation command that is associated with the deallocation table it maintains.
  • A system for processing a deallocation command according to a fifth processing of the sixth aspect of the present application, wherein an IO command is assigned to a CPU according to the address accessed by the IO command.
  • A system for processing a deallocation command according to a sixth processing of the sixth aspect of the present application, wherein the distributor assigns IO commands to the plurality of CPUs according to the logical addresses accessed by the IO commands.
  • A system for processing a deallocation command according to an eighth processing of the sixth aspect of the present application, wherein the corresponding entry of the FTL table is updated according to the deallocation state recorded in the deallocation table entry, and the deallocation state is recorded in the FTL entry.
  • A system for processing a deallocation command according to a ninth processing of the sixth aspect of the present application, wherein the deallocation table is updated in response to processing the deallocation command, and one or more CPUs check the deallocation tables they maintain when idle or periodically.
  • According to the system for processing a deallocation command of a tenth processing of the sixth aspect of the present application, there is provided a system for processing a deallocation command according to an eleventh processing of the sixth aspect of the present application, wherein the check mark indicates that at least one entry in the deallocation table maintained by the CPU is marked as deallocated.
  • According to the system for processing a deallocation command of the eleventh processing of the sixth aspect of the present application, there is provided a system for processing a deallocation command according to a twelfth processing of the sixth aspect of the present application, wherein the check mark further indicates the progress of checking the deallocation table.
  • A system for processing a deallocation command according to a fourteenth processing of the sixth aspect of the present application, wherein the address space accessed by IO commands is divided into multiple regions, each of which is mapped to one of a plurality of deallocation tables.
  • According to the system for processing a deallocation command of the fourteenth processing of the sixth aspect of the present application, there is provided a system for processing a deallocation command according to a fifteenth processing of the sixth aspect of the present application, wherein the logical address space is mapped to the deallocation tables in such a way that the address space accessed by deallocation commands from the host is mapped as evenly as possible across the deallocation tables.
  • According to the system for processing a deallocation command of the fifteenth processing of the sixth aspect of the present application, there is provided a system for processing a deallocation command according to a sixteenth processing of the sixth aspect of the present application, wherein the size of the address region indicated by each entry of the deallocation table is configurable.
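  • One possible mapping, shown as a sketch under assumed parameters (NUM_DEALLOC_TABLES and LBAS_PER_ENTRY are illustrative), of a logical address to a deallocation table and an entry within it: interleaving regions of configurable size round-robin across the tables spreads the address space evenly, as described above.

```c
#include <stdint.h>

#define NUM_DEALLOC_TABLES   2u      /* assumed: one deallocation table per CPU */
#define LBAS_PER_ENTRY       8u      /* configurable size of the region covered by one entry */

typedef struct {
    uint32_t table;   /* which deallocation table (and hence which CPU) owns the address */
    uint32_t entry;   /* entry index within that table */
} dealloc_slot_t;

/* Interleave fixed-size regions round-robin across the deallocation tables,
 * so deallocation commands from the host are spread as evenly as possible. */
dealloc_slot_t map_lba(uint64_t lba)
{
    uint64_t region = lba / LBAS_PER_ENTRY;
    dealloc_slot_t s = {
        .table = (uint32_t)(region % NUM_DEALLOC_TABLES),
        .entry = (uint32_t)(region / NUM_DEALLOC_TABLES),
    };
    return s;
}
```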
  • A method for processing a deallocation command according to a first processing of the seventh aspect of the present application, comprising: simultaneously transmitting a received deallocation command to a plurality of CPUs; each CPU obtaining, from the address range indicated by the deallocation command, one or more addresses belonging to the deallocation table it maintains, and updating the deallocation table it maintains according to the obtained one or more addresses so as to record in the deallocation table that the one or more addresses have been deallocated.
  • According to the method for processing a deallocation command of the first processing of the seventh aspect of the present application, there is provided a further method for processing a deallocation command according to the seventh aspect of the present application, further comprising: updating the FTL table according to the address range indicated by the deallocation command.
  • According to the method for processing a deallocation command of the first or second processing of the seventh aspect of the present application, there is provided a method for processing a deallocation command according to a third processing of the seventh aspect of the present application, wherein the deallocation table is checked when idle or periodically, a first entry marked as deallocated is found, the corresponding logical address is recorded as deallocated in the FTL table according to the first entry, and the deallocation flag of the first entry in the deallocation table is cleared.
  • A method for processing a deallocation command according to a fourth processing of the seventh aspect of the present application, further comprising: updating the amount of valid data recorded in the large-block descriptor according to the address range indicated by the deallocation command.
  • According to the method for processing a deallocation command of the fourth processing of the seventh aspect of the present application, there is provided a method for processing a deallocation command according to a fifth processing of the seventh aspect of the present application, wherein, when no entry in the deallocation table is marked as deallocated, the check mark corresponding to the deallocation table is cleared or reset, wherein the check mark indicates whether at least one entry in the deallocation table is marked as deallocated.
  • According to the method for processing a deallocation command of the fifth processing of the seventh aspect of the present application, there is provided a method for processing a deallocation command according to a sixth processing of the seventh aspect of the present application, wherein, if the check mark is cleared, the deallocation table does not need to be accessed when processing a read command or a write command.
  • According to the method for processing a deallocation command of the fifth processing of the seventh aspect of the present application, there is provided a method for processing a deallocation command according to a seventh processing of the seventh aspect of the present application, wherein, if the check mark is set, the deallocation table needs to be accessed when processing a read command or a write command.
  • A first garbage collection method for a storage device, comprising the steps of: selecting a large block to be reclaimed according to a large-block descriptor table; obtaining, according to the large block to be reclaimed, the address of data to be reclaimed; and, if the deallocation table needs to be checked, accessing the deallocation table according to the address of the data to be reclaimed, and, if the corresponding entry of the deallocation table records deallocation, obtaining the address of the next data to be reclaimed from the large block to be reclaimed.
  • A second garbage collection method for a storage device is provided, wherein, if the corresponding entry of the deallocation table does not record deallocation, the FTL table is queried according to the address of the data to be reclaimed to identify whether the data to be reclaimed is valid; if the data to be reclaimed is valid, the data to be reclaimed is written into a new large block, and the FTL table is updated.
  • A third garbage collection method for a storage device according to the eighth aspect of the present application, wherein whether the deallocation table needs to be checked is identified according to the check mark, wherein the check mark indicates whether at least one entry in the deallocation table is marked as deallocated.
  • A fourth garbage collection method for a storage device is provided, wherein, if the deallocation table does not need to be checked, the FTL table is queried by the address to obtain the recorded physical address, and whether the data to be reclaimed is valid is identified according to whether the recorded physical address is consistent with the physical address of the data to be reclaimed.
  • A fifth garbage collection method for a storage device is provided, wherein valid data to be reclaimed is written into a new large block, and the FTL table is also updated with the physical address in the new large block to record the new valid storage location of the data in the FTL table.
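  • A sketch of the garbage-collection flow described above, under assumed helper functions for the large-block descriptor table, the check mark, the deallocation table, and the FTL table; it is illustrative only, not the patent's implementation.

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t lpa_t;
typedef uint32_t ppa_t;

extern int   select_block_to_reclaim(void);                     /* via the large-block descriptor table */
extern bool  next_candidate(int blk, lpa_t *lpa, ppa_t *ppa);   /* next data unit in the block */
extern bool  dealloc_check_needed(void);                        /* derived from the check mark */
extern bool  dealloc_is_marked(lpa_t lpa);
extern bool  ftl_lookup(lpa_t lpa, ppa_t *ppa);
extern ppa_t relocate(lpa_t lpa, ppa_t old);                    /* write into a new large block */
extern void  ftl_update(lpa_t lpa, ppa_t ppa);

void garbage_collect(void)
{
    int blk = select_block_to_reclaim();
    lpa_t lpa;
    ppa_t old;

    while (next_candidate(blk, &lpa, &old)) {
        /* Deallocated data does not have to be moved. */
        if (dealloc_check_needed() && dealloc_is_marked(lpa))
            continue;

        /* Data is valid only if the FTL still maps the logical address here. */
        ppa_t cur;
        if (!ftl_lookup(lpa, &cur) || cur != old)
            continue;

        ppa_t fresh = relocate(lpa, old);   /* copy into a new large block */
        ftl_update(lpa, fresh);             /* record the new valid location */
    }
}
```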
  • A method of processing a deallocation command, comprising the steps of: transmitting a received deallocation command to a plurality of CPUs; each CPU obtaining, from the address range indicated by the command, one or more addresses belonging to the deallocation table it maintains, and updating the deallocation table it maintains according to the one or more addresses to record in the deallocation table that the one or more addresses have been deallocated.
  • The CPU updates the check mark it maintains in response to receiving the deallocation command, wherein the check mark indicates whether at least one entry in the deallocation table is marked as deallocated.
  • The CPU updates the entries of the FTL table according to the address range indicated by the deallocation command, wherein the FTL table records the physical address corresponding to the logical address.
  • According to the method for processing a deallocation command of the first processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a fourth processing of the ninth aspect of the present application, wherein updating the entries of the FTL table according to the address range indicated by the deallocation command comprises: the CPU determining that at least one entry in the deallocation table is marked as deallocated, obtaining a first entry marked as deallocated, obtaining the second entry of the FTL table corresponding to the first entry, and marking the address recorded by the second entry as deallocated.
  • According to the method for processing a deallocation command of the third or fourth processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a fifth processing of the ninth aspect of the present application, wherein, after the address recorded by the entry is marked as deallocated in the FTL table, the deallocation flag of the entry in the deallocation table is cleared.
  • A sixth method for processing a deallocation command according to the ninth aspect of the present application, wherein, before updating the FTL table, the entries of the FTL table corresponding to the one or more addresses are locked.
  • According to the ninth aspect of the present application, there is provided a method of processing a deallocation command according to a seventh processing of the ninth aspect of the present application, wherein the deallocation command is simultaneously provided to a plurality of CPUs, and the plurality of CPUs process the command in parallel.
  • According to a third method for processing a deallocation command of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to an eighth processing of the ninth aspect of the present application, wherein each CPU processes the portion of the deallocation command related to the deallocation table it maintains.
  • A method for processing a deallocation command according to a ninth processing of the ninth aspect of the present application, wherein IO commands associated with different parts of the FTL table are assigned to different CPUs for processing according to the addresses accessed by the IO commands.
  • A method for processing a deallocation command according to an eleventh processing of the ninth aspect of the present application, wherein, in response to receiving a read command, if the check mark of the deallocation table is set, the CPU accesses the deallocation table and checks whether the address accessed by the read command has been deallocated.
  • A method for processing a deallocation command according to a twelfth processing of the ninth aspect of the present application, wherein, in response to receiving a read command, if the check mark of the deallocation table is not set, the CPU queries the FTL table to obtain an address and reads data from the address as a response to the read command.
  • A method for processing a deallocation command according to a thirteenth processing of the ninth aspect of the present application, further comprising: in response to garbage collection, selecting a large block to be reclaimed; obtaining, according to the large block to be reclaimed, the address of data to be reclaimed; and, if the deallocation table needs to be checked, accessing the deallocation table according to the address of the data to be reclaimed, and, if the corresponding entry of the deallocation table records deallocation, obtaining the next data to be reclaimed from the large block to be reclaimed.
  • According to the fourteenth processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command, wherein the FTL table is queried according to the address of the data to be reclaimed to identify whether the data to be reclaimed is valid; if the data to be reclaimed is valid, the data to be reclaimed is written into a new large block, and the FTL table is updated.
  • There is provided a method for processing a deallocation command according to a fifteenth processing of the ninth aspect of the present application, wherein whether the deallocation table needs to be checked is identified according to the check mark, wherein the check mark indicates whether at least one entry in the deallocation table is marked as deallocated.
  • According to the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a sixteenth processing of the ninth aspect of the present application, wherein, if the deallocation table does not need to be checked, the FTL table is queried by the address to obtain the recorded physical address, and whether the data to be reclaimed is valid is identified according to whether the recorded physical address is consistent with the physical address of the data to be reclaimed.
  • There is provided a method for processing a deallocation command according to a seventeenth processing of the ninth aspect of the present application, wherein valid data to be reclaimed is written into a new large block, and the FTL table is also updated with the physical address in the new large block to record the new valid storage location of the data to be reclaimed in the FTL table.
  • The check mark further records the start position and the end position of the next check of the deallocation table.
  • A method for processing a deallocation command according to a twentieth processing of the ninth aspect of the present application, further comprising: cleaning up the deallocation table according to the check mark.
  • A method for processing a deallocation command according to a twenty-first processing of the ninth aspect of the present application, wherein cleaning up the deallocation table comprises: checking the entries of the deallocation table one by one from the recorded start position of the deallocation table to its end position, and, if an entry is marked as deallocated, updating the corresponding entry of the FTL table to record the deallocation state and clearing the deallocation state of that entry in the deallocation table.
  • According to the twentieth or twenty-first processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a twenty-second processing of the ninth aspect of the present application, wherein, during the cleaning of the deallocation table, if a new deallocation command is received, the start position, current position, and end position of the deallocation table recorded in the check mark are updated according to the new deallocation command.
  • A method for processing a deallocation command according to a twenty-third processing of the ninth aspect of the present application, wherein updating the check mark comprises: if the start position and the end position of the new deallocation command are both after the end position recorded by the check mark, updating the end position in the check mark to the end position of the new deallocation command.
  • According to the twenty-third processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a twenty-fourth processing of the ninth aspect of the present application, wherein updating the check mark comprises: if the start position of the new deallocation command is after the start position recorded by the check mark, its end position is before the end position recorded by the check mark, and the current position is before the start position of the new deallocation command, the check mark does not need to be updated.
  • According to the twenty-third processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a twenty-fifth processing of the ninth aspect of the present application, wherein updating the check mark comprises: if the start position of the new deallocation command is after the start position recorded by the check mark, its end position is before the end position recorded by the check mark, and the current position is after the start position of the new deallocation command, recording in the check mark the start position of the next scan as the start position of the new deallocation command and the end position of the next scan as the current position.
  • According to the twenty-third processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a twenty-sixth processing of the ninth aspect of the present application, wherein updating the check mark comprises: if the start position and the end position of the new deallocation command are both before the start position recorded by the check mark, and the current position is after the end position of the new deallocation command, recording in the check mark the start position of the next scan as the start position of the new deallocation command and the end position of the next scan as the end position of the new deallocation command.
  • According to the twenty-third processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a twenty-seventh processing of the ninth aspect of the present application, wherein updating the check mark comprises: if the start position of the new deallocation command is before the start position recorded by the check mark, its end position is after the start position recorded by the check mark, and the current position is after the end position of the new deallocation command, recording in the check mark the start position of the next scan as the start position of the new deallocation command and the end position of the next scan as the end position of the new deallocation command.
  • According to the fifteenth processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a twenty-eighth processing of the ninth aspect of the present application, wherein, during the checking or cleaning of the deallocation table, if one or more new deallocation commands are received, the start position and the end position of the next check of the deallocation table recorded in the check mark are updated according to the new deallocation commands.
  • According to the twenty-eighth processing of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a twenty-ninth processing of the ninth aspect of the present application, wherein whether the cleaning of the deallocation table is complete is determined by comparing the end position recorded in the check mark with the current position of cleaning the deallocation table.
  • There is provided a method for processing a deallocation command according to a thirtieth processing of the ninth aspect of the present application, wherein, when the current position of cleaning the deallocation table has not reached the end position recorded in the check mark, the cleaning of the deallocation table is not complete.
  • According to the twenty-eighth to thirtieth processings of the ninth aspect of the present application, there is provided a method for processing a deallocation command according to a thirty-first processing of the ninth aspect of the present application, wherein, if the current position of cleaning the deallocation table has reached the end position, it is further checked whether the deallocation table needs to be cleaned again.
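  • The check-mark adjustment cases enumerated above might be sketched as follows; the struct layout and field names are assumptions, and combinations not listed in the embodiments are left unchanged.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative check mark: the window of the deallocation table still to be scanned. */
typedef struct {
    uint32_t start;        /* start position recorded for the current scan    */
    uint32_t end;          /* end position recorded for the current scan      */
    uint32_t current;      /* current position of the ongoing cleaning        */
    uint32_t next_start;   /* start position of the scan after this one       */
    uint32_t next_end;     /* end position of the scan after this one         */
    bool     rescan;       /* whether another scan is needed afterwards       */
} check_mark_t;

/* Apply the update rules for a new deallocation command covering [cmd_start, cmd_end]. */
void update_check_mark(check_mark_t *m, uint32_t cmd_start, uint32_t cmd_end)
{
    if (cmd_start > m->end && cmd_end > m->end) {
        /* Entirely after the recorded end: just extend the current scan. */
        m->end = cmd_end;
    } else if (cmd_start > m->start && cmd_end < m->end) {
        if (m->current < cmd_start) {
            /* Still ahead of the scan position: the ongoing scan will cover it. */
        } else {
            /* Already passed: rescan from the command's start up to the current position. */
            m->next_start = cmd_start;
            m->next_end   = m->current;
            m->rescan     = true;
        }
    } else if (cmd_end < m->start && m->current > cmd_end) {
        /* Entirely before the recorded start: schedule a scan over the command's range. */
        m->next_start = cmd_start;
        m->next_end   = cmd_end;
        m->rescan     = true;
    } else if (cmd_start < m->start && cmd_end > m->start && m->current > cmd_end) {
        /* Overlaps the recorded start and already passed: rescan the command's range. */
        m->next_start = cmd_start;
        m->next_end   = cmd_end;
        m->rescan     = true;
    }
    /* Other combinations are handled analogously in the embodiments above. */
}
```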
  • A tenth aspect of the present application provides a computer program comprising computer program code which, when loaded onto a storage device and executed on a control component of the storage device, causes the control component to perform one of the methods of the first to ninth aspects of the present application.
  • FIG. 1 is a block diagram of a prior art solid state storage device
  • Figure 2 shows a schematic view of a large block
  • FIG. 3A is a schematic diagram of a portion of an FTL table before processing a deallocation command according to an embodiment of the present application
  • FIG. 3B is a schematic diagram of a portion of an FTL table after processing a deallocation command according to an embodiment of the present application.
  • 4A is a schematic diagram of a de-allocation table before processing a deallocation command in the embodiment of the present application
  • 4B is a schematic diagram of a de-allocation table after processing a deallocation command in the embodiment of the present application
  • FIG. 5A is a flowchart of a method for processing a deallocation command according to an embodiment of the present application.
  • FIG. 5B is a flowchart of a method for responding to a read command according to an embodiment of the present application.
  • FIG. 5C is a flowchart of a method for responding to a write command according to an embodiment of the present application.
  • 6A is a block diagram of a control component in accordance with yet another embodiment of the present application.
  • FIG. 6B is a flowchart of a method of responding to a read command in accordance with the embodiment of FIG. 6A;
  • 6C is a flowchart of a method of responding to a write command in accordance with the embodiment of FIG. 6A;
  • FIG. 7A is a state transition diagram showing the states of a de-allocation table according to still another embodiment of the present application;
  • FIG. 7B is a flowchart of processing a deallocation command in accordance with the embodiment of FIG. 6A of the present application;
  • Figure 7C is a flow chart of a scan de-allocation table in accordance with the embodiment of Figure 6A of the present application.
  • FIG. 8 is a block diagram of a control component in accordance with another embodiment of the present application.
  • FIG. 9 is a schematic diagram of a system for processing logs according to an embodiment of the present application.
  • FIG. 10A is a schematic diagram of a log entry cache according to an embodiment of the present application.
  • FIG. 10B shows a flowchart of compressing a de-allocation table log in accordance with an embodiment of the present application;
  • FIG. 10C is a schematic diagram of a log entry cache according to still another embodiment of the present application.
  • FIG. 11 is a block diagram of a control unit in accordance with still another embodiment of the present application.
  • FIG. 12 is a schematic diagram of mapping of a logical address and a de-allocation table of an IO command access according to still another embodiment of the present application;
  • FIG. 13 is a flowchart of processing a deallocation command according to still another embodiment of the present application.
  • FIG. 14 is a flowchart of updating an FTL table according to a de-allocation table according to still another embodiment of the present application.
  • 16A-16E are schematic diagrams showing a correspondence relationship between a de-allocation table and an inspection mark according to another embodiment of the present application.
  • FIG. 17 is a schematic diagram of processing a deallocation command according to another embodiment of the present application.
  • FIG. 18 is a flowchart of updating an FTL table according to a deallocation table according to another embodiment of the present application.
  • 19 is a block diagram of a control component in accordance with another embodiment of the present application.
  • FIG. 3A is a schematic diagram of a portion of an FTL table before processing a deallocation command, in accordance with an embodiment of the present application.
  • FIG. 3B is a schematic diagram of a portion of an FTL table after processing a deallocation command according to an embodiment of the present application.
  • In the FTL table, the physical addresses (denoted PPA a-b) corresponding to the logical address range 0-7 (denoted LBA 0 to LBA 7) are recorded, where "PPA" indicates a physical address, "a" indicates the physical block, and "b" indicates the physical page. For example, the physical page whose physical address is "PPA 1-4" stores the data of logical address "LBA 0", and the physical page whose physical address is "PPA 1-10" stores the data of another logical address.
  • To process the deallocation command, the FTL entries corresponding to the logical address range indicated by the deallocation command are set to a special flag (for example, 0 or another value). In this example, the logical address ranges indicated by the de-allocation command include 0-7 and 100-103, so the contents of the FTL entries recording logical addresses 0-7 and 100-103 are set to 0. Referring to FIG. 3B, the physical address corresponding to these logical addresses in the FTL table is 0 (meaning a special mark), so that a result conforming to the specified effect of the deallocation command (for example, all 0s) is returned as the response to a read command addressing them.
  • Note that the logical address range indicated by the de-allocation command may have a different unit size than the entries of the FTL table. For example, in the de-allocation command one logical address corresponds to 512 bytes of storage space, while in the FTL table one entry corresponds to 4 KB of storage space.
  • In an embodiment of the present application, a de-allocation table is also maintained for efficient processing of de-allocation commands.
  • FIG. 4A is a schematic diagram of a de-allocation table before processing a deallocation command in the embodiment of the present application.
  • FIG. 4B is a schematic diagram of the de-allocation table after processing the deallocation command in the embodiment of the present application.
  • The de-allocation table stores information indicating whether each logical address of the FTL table has been de-allocated. By way of example, one bit of storage space is provided in the de-allocation table for each logical address of the FTL table. When a logical address has been allocated (i.e., the logical address has a valid physical address in the FTL table, see also FIG. 3A), the logical address is marked as "allocated" in the de-allocation table (for example, the corresponding 1-bit storage space is set to 0). When a logical address has been de-allocated, the logical address is marked as "de-allocated" in the de-allocation table (for example, the corresponding 1-bit storage space is set to 1).
  • Referring to FIG. 3A and FIG. 4A, the logical addresses LBA 0-LBA 7 in the FTL table have all been assigned valid physical addresses, so in the de-allocation table LBA 0-LBA 7 are all marked as "allocated" (the corresponding 1-bit storage space is set to 0). After a de-allocation command is executed on the logical address range LBA 0-LBA 3, the corresponding portion of the FTL table is as shown in FIG. 3B and the corresponding portion of the de-allocation table is as shown in FIG. 4B: LBA 0-LBA 3 are marked as "de-allocated" (the corresponding 1-bit storage space is set to 1), while LBA 4-LBA 7 are still marked as "allocated" (the corresponding 1-bit storage space is set to 0).
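  • A minimal sketch of such a one-bit-per-logical-address de-allocation table is shown below. It is not the patent's implementation; the array size, bit layout and function names (dealloc_set, dealloc_clear, dealloc_test) are hypothetical and chosen only to mirror the FIG. 4A/4B example.

```c
/* Sketch of a de-allocation table with one bit per FTL entry:
 * 0 = "allocated", 1 = "de-allocated". Names are hypothetical. */
#include <stdint.h>
#include <string.h>

#define NUM_LBAS 1024u

static uint8_t dealloc_table[NUM_LBAS / 8];   /* 1 bit per logical address */

static void dealloc_set(uint32_t lba)   { dealloc_table[lba / 8] |=  (uint8_t)(1u << (lba % 8)); }
static void dealloc_clear(uint32_t lba) { dealloc_table[lba / 8] &= (uint8_t)~(1u << (lba % 8)); }
static int  dealloc_test(uint32_t lba)  { return (dealloc_table[lba / 8] >> (lba % 8)) & 1; }

/* After a de-allocation command covering LBA 0..3 (as in FIG. 4B): */
static void example(void)
{
    memset(dealloc_table, 0, sizeof dealloc_table);  /* all "allocated" (FIG. 4A) */
    for (uint32_t lba = 0; lba <= 3; lba++)
        dealloc_set(lba);                            /* LBA 0-3 marked "de-allocated" */
}
```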
  • The de-allocation table in accordance with embodiments of the present application is stored in internal memory of the control component 104 or in the DRAM 110 (see FIG. 1). Optionally, the de-allocation table is updated by a DMA operation. The FTL table and the de-allocation table are also written to the NVM, so that when power is restored after an abnormal power failure, the FTL table and the de-allocation table at the time of power-down can be recovered from the NVM.
  • FIG. 5A is a flowchart of a method for processing a deallocation command according to an embodiment of the present application.
  • In response to receiving a de-allocation command, the logical address range indicated by the de-allocation command is obtained (512). For example, the de-allocation command indicates that de-allocation is to be performed on the logical address range LBA 0-LBA 3.
  • The entries of the de-allocation table are updated according to the logical address range indicated by the de-allocation command (514); for example, the entries corresponding to logical addresses LBA 0-LBA 3 in the de-allocation table shown in FIG. 4A are marked as "de-allocated" (set to 1, as shown in FIG. 4B). At this point the host can already be informed that the de-allocation command has completed, so the perceived execution speed of the de-allocation command is greatly improved.
  • The entries of the FTL table are also updated according to the logical address range indicated by the de-allocation command (518); for example, the FTL entries corresponding to the one or more logical addresses indicated by the de-allocation command are cleared or set to a specified value (see FIG. 3B). Optionally, the FTL entries corresponding to the one or more logical addresses being updated are locked, so that these FTL entries are not read by other tasks while they are being updated; after the FTL table is updated, the corresponding FTL entries are unlocked.
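  • The sketch below puts the FIG. 5A steps in code form under the assumptions above. It is illustrative only; complete_to_host-style helpers, the lock functions and the FTL_DEALLOCATED flag value are hypothetical placeholders, not names from the patent.

```c
/* Sketch of the FIG. 5A flow: step 514 marks the range in the de-allocation
 * table, the host is answered immediately, and step 518 later clears the FTL
 * entries under a lock. All names are hypothetical. */
#include <stdint.h>
#include <stdbool.h>

#define NUM_LBAS        1024u
#define FTL_DEALLOCATED 0u            /* special flag meaning "no physical address" */

static uint32_t ftl[NUM_LBAS];        /* logical -> physical address */
static bool     dealloc[NUM_LBAS];    /* true = "de-allocated" */

static void lock_ftl_range(uint32_t first, uint32_t last)   { (void)first; (void)last; }
static void unlock_ftl_range(uint32_t first, uint32_t last) { (void)first; (void)last; }
static void report_completion_to_host(void)                 { /* e.g. post a completion */ }

void process_deallocate(uint32_t first_lba, uint32_t last_lba)
{
    /* Step 514: record the range in the de-allocation table. */
    for (uint32_t lba = first_lba; lba <= last_lba; lba++)
        dealloc[lba] = true;

    /* The command can already be reported complete to the host. */
    report_completion_to_host();

    /* Step 518: update the FTL entries, locked so they are not read meanwhile. */
    lock_ftl_range(first_lba, last_lba);
    for (uint32_t lba = first_lba; lba <= last_lba; lba++)
        ftl[lba] = FTL_DEALLOCATED;
    unlock_ftl_range(first_lba, last_lba);
}
```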
  • FIG. 5B is a flowchart of a method for responding to a read command according to an embodiment of the present application.
  • When the host reads a de-allocated logical address, it should receive a specified result such as all 0s. In response to receiving a read command, the de-allocation table (see also FIG. 4B) is queried to determine whether the logical address accessed by the read command has been de-allocated (532). If the de-allocation table indicates that the read logical address is in the de-allocated state, all zeros or another specified result is used as the response to the read command (534). At step 532, if the de-allocation table indicates that the read logical address has been allocated, the FTL table is queried to obtain the physical address corresponding to the logical address to be read (536), and data is read from the obtained physical address as the response to the read command (538).
  • Optionally, while a de-allocation command is being executed, the storage device is marked as executing a de-allocation command, and the de-allocation table is queried first when processing read commands (see 532 in FIG. 5B). After execution of the de-allocation command is completed, for example after step 518 shown in FIG. 5A has finished, the storage device is marked as having completed execution of the de-allocation command; in that case step 532 of FIG. 5B need not be performed, and step 536 of FIG. 5B is executed directly.
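  • The following sketch illustrates the FIG. 5B read path under the same assumptions; the buffer size, nvm_read helper and other names are hypothetical stand-ins.

```c
/* Sketch of the FIG. 5B read path: query the de-allocation table (532),
 * return all zeros for a de-allocated address (534), otherwise look up the
 * FTL table (536) and read from the physical address (538). */
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define NUM_LBAS     1024u
#define SECTOR_BYTES 4096u

static uint32_t ftl[NUM_LBAS];
static bool     dealloc[NUM_LBAS];

/* Placeholder for a flash read; here it only fills the buffer. */
static void nvm_read(uint32_t ppa, uint8_t *buf, size_t len) { (void)ppa; memset(buf, 0xA5, len); }

void handle_read(uint32_t lba, uint8_t *buf)
{
    if (dealloc[lba]) {                   /* 532: logical address de-allocated? */
        memset(buf, 0, SECTOR_BYTES);     /* 534: specified result, e.g. all 0s */
        return;
    }
    uint32_t ppa = ftl[lba];              /* 536: query FTL table */
    nvm_read(ppa, buf, SECTOR_BYTES);     /* 538: read data from physical address */
}
```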
  • FIG. 5C is a flowchart of a method for responding to a write command according to an embodiment of the present application.
  • For a solid-state storage device that has not yet been written with data, its de-allocation table indicates that all logical addresses are in the de-allocated state. In response to data being written to a logical address, the entry of the de-allocation table corresponding to that logical address is modified to the allocated state; and in response to executing a de-allocation command, the de-allocated logical addresses are again marked as de-allocated in the de-allocation table.
  • In response to receiving a write command, a physical address is assigned to the write command and the FTL table is updated with the logical address indicated by the write command and the assigned physical address (542). Data is written to the assigned physical address and completion of the write command is returned to the host (544). Optionally, completion of the write command is returned to the host before the data is written to the physical address. The de-allocation table is also updated to set the entry of the logical address being written to the allocated state (548). The order of steps 542, 544, and 548 may be adjusted, or the steps may be performed in parallel or simultaneously; preferably, steps 544 and 548 occur after step 542.
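  • A corresponding sketch of the FIG. 5C write path is given below; the physical-page allocator, nvm_write helper and completion function are hypothetical placeholders.

```c
/* Sketch of the FIG. 5C write path: allocate a physical address and update
 * the FTL table (542), write the data and acknowledge the host (544), and
 * mark the logical address as allocated in the de-allocation table (548). */
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LBAS 1024u

static uint32_t ftl[NUM_LBAS];
static bool     dealloc[NUM_LBAS];

static uint32_t allocate_physical_page(void)                        { static uint32_t next = 1; return next++; }
static void     nvm_write(uint32_t ppa, const uint8_t *b, size_t n) { (void)ppa; (void)b; (void)n; }
static void     report_completion_to_host(void)                     { }

void handle_write(uint32_t lba, const uint8_t *buf, size_t len)
{
    uint32_t ppa = allocate_physical_page();
    ftl[lba] = ppa;                 /* 542: record logical -> physical mapping */
    nvm_write(ppa, buf, len);       /* the data may also be buffered and written later */
    report_completion_to_host();    /* 544 */
    dealloc[lba] = false;           /* 548: entry set back to "allocated" */
}
```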
  • FIG. 6A is a block diagram of a control component in accordance with yet another embodiment of the present application.
  • The control component 104 shown in FIG. 6A includes a host interface 610, an allocator 630, a plurality of CPUs (CPU 0 and CPU 1) for FTL tasks, and a media interface 620 for accessing the NVM chip 105.
  • the host interface 610 is used to exchange commands and data with the host.
  • For example, the host communicates with the storage device through the NVMe/PCIe protocol; the host interface 610 processes PCIe protocol data packets, extracts NVMe protocol commands, and returns the processing results of the NVMe protocol commands to the host.
  • The allocator 630 is coupled to the host interface 610 for receiving IO commands sent by the host to the storage device and assigning each IO command to one of the plurality of CPUs for processing FTL tasks. The allocator 630 can be implemented by a CPU or by dedicated hardware.
  • Control component 104 is also coupled to an external memory (e.g., DRAM) 110.
  • the external memory 110 stores FTL Table 0, FTL Table 1, De-allocation Table 0, and De-allocation Table 1.
  • Multiple CPUs for processing FTL tasks process FTL tasks by using FTL tables and de-allocation tables.
  • In response to a write command, a new physical address is assigned to the logical address to which data is to be written, the mapping relationship between the logical address and the physical address is recorded in the FTL table, and the de-allocation table is also updated. In response to a read command, the CPU accesses the FTL table and/or the de-allocation table, obtains the physical address corresponding to the logical address of the read command, and reads data from that physical address.
  • The FTL table is divided into two parts (FTL table 0 and FTL table 1); CPU 0 corresponds to FTL table 0 and CPU 1 corresponds to FTL table 1. For example, depending on whether a logical address is odd or even, it is determined whether the logical address belongs to FTL table 0 or FTL table 1. As another example, the first half of the logical address space is located in FTL table 0 and the second half is located in FTL table 1.
  • De-allocation table 0 corresponds to FTL table 0 and records whether the logical addresses in FTL table 0 are de-allocated; de-allocation table 1 corresponds to FTL table 1 and records whether the logical addresses in FTL table 1 are de-allocated. CPU 0 corresponds to de-allocation table 0 and CPU 1 corresponds to de-allocation table 1.
  • In this way, IO commands associated with different parts of the FTL table can be processed by different CPUs: the allocator 630 assigns each IO command to one of CPU 0 and CPU 1 according to the logical address accessed by the IO command, and CPU 0 and CPU 1 process IO commands in parallel. Optionally, each of CPU 0 and CPU 1 has access to all FTL tables and/or de-allocation tables, and the allocator can assign a command to any CPU for processing.
  • Each CPU also maintains a state for its corresponding de-allocation table: CPU 0 maintains state 0 and CPU 1 maintains state 1, where state 0 indicates the state of de-allocation table 0 and state 1 indicates the state of de-allocation table 1. The state of a de-allocation table at least indicates whether at least one entry in that de-allocation table is marked as "de-allocated". For example, state 0 uses only one bit of information, indicating whether at least one entry in de-allocation table 0 is marked as "de-allocated"; likewise, state 1 uses only one bit of information, indicating whether at least one entry in de-allocation table 1 is marked as "de-allocated". The memory space required for state 0 and state 1 is extremely small, and they can be stored in a register or in memory inside the CPU.
  • FIG. 6B is a flow chart of a method of responding to a read command in accordance with the embodiment of FIG. 6A.
  • Take as an example a read command that the allocator assigns to CPU 0 (see also FIG. 6A) for processing. CPU 0 first queries state 0 to determine whether at least one record in de-allocation table 0 is marked as "de-allocated" (631). If state 0 indicates that no record in de-allocation table 0 is marked as "de-allocated", CPU 0 queries FTL table 0 to obtain the physical address corresponding to the logical address to be read (636) and reads data from the obtained physical address as the response to the read command (638), without having to access de-allocation table 0.
  • If state 0 indicates that at least one record in de-allocation table 0 is marked as "de-allocated" (631), de-allocation table 0 is queried to determine whether the logical address read by the read command has been de-allocated (632). If de-allocation table 0 indicates that the read logical address is in the de-allocated state, all zeros or another specified result is used as the response to the read command (634). At step 632, if the de-allocation table indicates that the read logical address has been allocated, FTL table 0 is queried to obtain the physical address corresponding to the logical address to be read (636), and data is read from the obtained physical address as the response to the read command (638).
  • CPU 0 also maintains state 0: by scanning de-allocation table 0, if it finds that no record in de-allocation table 0 is marked as "de-allocated", state 0 is set accordingly; and in response to executing a de-allocation command and updating de-allocation table 0, state 0 is set to indicate that at least one record in de-allocation table 0 is marked as "de-allocated". Similarly, CPU 1 maintains state 1.
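  • A sketch of this fast path is given below, assuming the one-bit "state 0" described above; the variable and helper names are hypothetical.

```c
/* Sketch of the FIG. 6B optimization: when state 0 says that no entry of
 * de-allocation table 0 is marked "de-allocated", the read skips the
 * de-allocation table entirely (631 -> 636/638). */
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define NUM_LBAS     1024u
#define SECTOR_BYTES 4096u

static uint32_t ftl0[NUM_LBAS];
static bool     dealloc0[NUM_LBAS];
static bool     state0_has_dealloc_entries;   /* "state 0", maintained by CPU 0 */

static void nvm_read(uint32_t ppa, uint8_t *buf, size_t len) { (void)ppa; memset(buf, 0, len); }

void cpu0_handle_read(uint32_t lba, uint8_t *buf)
{
    if (state0_has_dealloc_entries && dealloc0[lba]) {   /* 631 then 632 */
        memset(buf, 0, SECTOR_BYTES);                    /* 634 */
        return;
    }
    nvm_read(ftl0[lba], buf, SECTOR_BYTES);              /* 636 + 638; table 0 is not
                                                            consulted when state 0 says
                                                            "no record" */
}
```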
  • 6C is a flow chart of a method of responding to a write command in accordance with the embodiment of FIG. 6A.
  • the dispatcher assigns it to CPU 0 (see also Figure 6A) for processing as an example.
  • CPU 0 assigns a physical address to the write command for carrying the data to be written by the write command (665). CPU 0 then queries state 0 to determine whether at least one record in de-allocation table 0 is marked as "de-allocated" (670). If state 0 indicates that no record in de-allocation table 0 is marked as "de-allocated", CPU 0 updates the FTL table (e.g., FTL table 0) according to the logical address of the write command and the physical address assigned in step 665, so that the correspondence between the logical address of the write command and the physical address assigned to it is recorded in the FTL table; the NVM chip is also instructed to write the data of the write command to the assigned physical address (690), and an indication of completion of the write command is generated and sent to the issuer of the write command.
  • At step 670, if state 0 indicates that at least one entry in de-allocation table 0 is marked as "de-allocated", the entry of de-allocation table 0 corresponding to the logical address of the write command is also updated (675), so that the entry is set to "allocated" or its "de-allocation" flag is cleared. If step 675 needs to be performed, the order in which it is executed relative to the other steps is not limited.
  • FIG. 7A shows a state transition diagram of the states of a de-allocation table according to still another embodiment of the present application. The state of a de-allocation table includes a "to be scanned" state, a "scanning" state, a "scan completed" state, and a "no record" state. Accordingly, state 0 and/or state 1 includes at least 2 bits to indicate the four states of the de-allocation table.
  • The allocator (see FIG. 6A) assigns de-allocation commands that access the logical address space corresponding to FTL table 0 to CPU 0 for processing, and assigns de-allocation commands that access the logical address space corresponding to FTL table 1 to CPU 1 for processing.
  • In response to processing a de-allocation command for de-allocation table 0, CPU 0 sets state 0 (see also FIG. 6A) to the "to be scanned" state, regardless of the previous value of state 0. Since the de-allocation command has been executed, some entries of de-allocation table 0 have been set to "de-allocated"; state 0 being in the "to be scanned" state therefore also indicates that at least one record in the de-allocation table is marked as "de-allocated" (see also FIG. 6B, step 631). In general, when state 0 is in the "to be scanned", "scanning" or "scan completed" state, at least one record in de-allocation table 0 is marked as "de-allocated", whereas when state 0 is in the "no record" state, no record in de-allocation table 0 is marked as "de-allocated" (see also FIG. 6B, step 631).
  • Optionally, in response to processing a de-allocation command for de-allocation table 0, CPU 0 also sets state 1 (see also FIG. 6A) to the "to be scanned" state, or requests CPU 1 to set state 1 to the "to be scanned" state. Similarly, in response to a de-allocation command being processed for de-allocation table 1 or state 1 being set to the "to be scanned" state, state 0 is set to the "to be scanned" state.
  • CPU 0 starts to clean up de-allocation table 0, setting the physical address of the corresponding entries in FTL table 0 to the specified value (for example, 0) according to the entries of de-allocation table 0 that are set to "de-allocated". For example, CPU 0 cleans up de-allocation table 0 during idle periods when there are no IO commands to be processed; alternatively, CPU 0 begins to clean up de-allocation table 0 in response to an indication from the host. In response to the cleanup of de-allocation table 0 starting, CPU 0 also sets state 0 from the "to be scanned" state to the "scanning" state.
  • In response to CPU 0 completing the cleanup of de-allocation table 0, CPU 0 sets state 0 from the "scanning" state to the "scan completed" state. To complete the cleanup, CPU 0 sets the physical address of the FTL table 0 entries corresponding to the de-allocation table 0 entries that are set to "de-allocated" to the specified value, and modifies those de-allocation table 0 entries to clear their "de-allocation" flag, indicating that the authoritative mapping between the corresponding logical address and its physical address is now recorded in the corresponding entry of FTL table 0.
  • If state 0 is in the "scan completed" state, CPU 0 also acquires the state indicated by state 1; if state 1 is in the "scan completed" or "no record" state, CPU 0 sets state 0 to the "no record" state. To do so, CPU 0 accesses or requests the value of state 1 from CPU 1. In still another embodiment, if state 0 enters the "scan completed" state, CPU 0 notifies CPU 1 of the state change; in still another embodiment, whenever state 0 changes, CPU 0 notifies CPU 1 of the state change. In response to state 0 becoming the "no record" state, if state 1 is in the "scan completed" state, state 1 is also set to the "no record" state. CPU 1 maintains state 1 in a manner similar to CPU 0.
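  • The FIG. 7A states and the transition rules described above can be sketched as follows; the enum and function names are hypothetical, and the peer-state rule is shown only for the two-CPU case of FIG. 6A.

```c
/* Sketch of the FIG. 7A state machine for one de-allocation table. */
typedef enum {
    STATE_NO_RECORD,      /* no entry is marked "de-allocated"         */
    STATE_TO_BE_SCANNED,  /* a de-allocation command was processed     */
    STATE_SCANNING,       /* cleanup of the table is in progress       */
    STATE_SCAN_COMPLETE   /* cleanup finished, peer not yet known done */
} dealloc_state_t;

static dealloc_state_t state0 = STATE_NO_RECORD;

void on_dealloc_command_processed(void) { state0 = STATE_TO_BE_SCANNED; }  /* any prior state */
void on_scan_started(void)              { if (state0 == STATE_TO_BE_SCANNED) state0 = STATE_SCANNING; }
void on_scan_finished(void)             { if (state0 == STATE_SCANNING) state0 = STATE_SCAN_COMPLETE; }

/* "no record" is entered only when this table has been cleaned up and the
 * peer table's state is "scan completed" or "no record". */
void on_peer_state(dealloc_state_t peer)
{
    if (state0 == STATE_SCAN_COMPLETE &&
        (peer == STATE_SCAN_COMPLETE || peer == STATE_NO_RECORD))
        state0 = STATE_NO_RECORD;
}
```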
  • Figure 7B is a flow diagram of a process de-allocation command in accordance with the embodiment of Figure 6A of the present application.
  • For a de-allocation command, the allocator 630 assigns it to one of the CPUs (e.g., CPU 0). The allocator 630 selects, according to the logical address range indicated by the de-allocation command, the CPU responsible for the corresponding logical address range (720) and transmits the de-allocation command to the selected CPU (e.g., CPU 0). The selected CPU (CPU 0) sets one or more corresponding entries of its de-allocation table to "de-allocated" according to the logical address range indicated by the de-allocation command (730); at this point, the host is informed that the de-allocation command has completed, and state 0 is set to the "to be scanned" state.
  • Figure 7C is a flow diagram of a scan de-allocation table in accordance with the embodiment of Figure 6A of the present application.
  • When state 0 is in the "to be scanned" state, and either there is no IO command to be processed or an indication is received from the host, the CPU (taking CPU 0 as an example) initiates a scan of de-allocation table 0 (740). CPU 0 traverses de-allocation table 0 and, for each entry indicating "de-allocated", obtains its logical address and modifies the physical address of the corresponding entry in the corresponding FTL table (FTL table 0) to the specified value indicating that the logical address is "de-allocated" (750). CPU 0 also updates the entries of de-allocation table 0 that indicate "de-allocated" to clear their "de-allocation" flag, and sets state 0 to the "scanning" state.
  • When CPU 0 has cleared all the "de-allocation" flags in de-allocation table 0, the scan of de-allocation table 0 is complete; in response to the scan completing, CPU 0 sets state 0 to the "scan completed" state (760). If CPU 0 receives a de-allocation command during the scan of de-allocation table 0, the scan is stopped and state 0 is updated to the "to be scanned" state.
  • In response to state 0 being set to the "scan completed" state, CPU 0 also acquires the state indicated by state 1. If state 1 is in the "scan completed" or "no record" state at this time, CPU 0 sets state 0 to the "no record" state (770); and if state 1 is in the "scan completed" state, setting state 0 to the "no record" state also causes state 1 to be set to the "no record" state, the maintenance of state 1 being performed by, for example, CPU 1.
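  • The scan itself can be sketched as below, continuing the hypothetical names used earlier; the interruption by a new de-allocation command is modeled with a simple flag.

```c
/* Sketch of the FIG. 7C scan: walk de-allocation table 0, push each
 * "de-allocated" mark into FTL table 0 (750), clear the mark, and report
 * scan completion (760). */
#include <stdint.h>
#include <stdbool.h>

#define NUM_LBAS        1024u
#define FTL_DEALLOCATED 0u

typedef enum { TO_BE_SCANNED, SCANNING, SCAN_COMPLETE, NO_RECORD } dealloc_state_t;

static uint32_t        ftl0[NUM_LBAS];
static bool            dealloc0[NUM_LBAS];
static dealloc_state_t state0 = TO_BE_SCANNED;
static volatile bool   new_dealloc_command_pending;

void cpu0_scan_dealloc_table(void)
{
    state0 = SCANNING;                          /* 740: scan started */
    for (uint32_t lba = 0; lba < NUM_LBAS; lba++) {
        if (new_dealloc_command_pending) {      /* stop; rescan later */
            state0 = TO_BE_SCANNED;
            return;
        }
        if (dealloc0[lba]) {
            ftl0[lba]     = FTL_DEALLOCATED;    /* 750: move mark into FTL table 0 */
            dealloc0[lba] = false;              /* clear the "de-allocation" flag  */
        }
    }
    state0 = SCAN_COMPLETE;                     /* 760; step 770 then checks state 1 */
}
```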
  • FIG. 8 is a block diagram of a control component in accordance with another embodiment of the present application.
  • The control component 104 shown in FIG. 8 includes a host interface 810, an allocator, a plurality of CPUs for FTL tasks (CPU 0, CPU 1, CPU 2, and CPU 3), and a media interface 820 for accessing the NVM chip 105.
  • the allocator is configured to receive a command sent by the host to the storage device and assign the command to one of a plurality of CPUs for processing the FTL task.
  • Control component 104 is also coupled to an external memory (e.g., DRAM) 110.
  • The external memory 110 stores FTL table 0, FTL table 1, FTL table 2, FTL table 3, de-allocation table 0, de-allocation table 1, de-allocation table 2, and de-allocation table 3.
  • Multiple CPUs for processing FTL tasks process FTL tasks by using FTL tables and de-allocation tables.
  • The FTL table is divided into four parts (FTL table 0, FTL table 1, FTL table 2, and FTL table 3); CPU 0 corresponds to FTL table 0, CPU 1 corresponds to FTL table 1, CPU 2 corresponds to FTL table 2, and CPU 3 corresponds to FTL table 3. For example, the remainder of dividing the logical address by 4 determines which FTL table the logical address belongs to. As another example, the first 1/4 of the logical address space is located in FTL table 0, the next 1/4 in FTL table 1, and so on.
  • De-allocation table 0 corresponds to FTL table 0, de-allocation table 1 corresponds to FTL table 1, de-allocation table 2 corresponds to FTL table 2, and de-allocation table 3 corresponds to FTL table 3, each recording whether the logical addresses in the corresponding FTL table are de-allocated. CPU 0 corresponds to de-allocation table 0, CPU 1 corresponds to de-allocation table 1, CPU 2 corresponds to de-allocation table 2, and CPU 3 corresponds to de-allocation table 3.
  • In this way, commands associated with different portions of the FTL table can be processed by different CPUs: the allocator assigns each command to one of CPU 0, CPU 1, CPU 2, and CPU 3 according to the logical address accessed by the command, and the CPUs process commands in parallel. Optionally, each CPU has access to all FTL tables and/or de-allocation tables, and the allocator can assign a command to any CPU for processing.
  • Each CPU also maintains a state for its corresponding de-allocation table: CPU 0 maintains state 0, CPU 1 maintains state 1, CPU 2 maintains state 2, and CPU 3 maintains state 3, where state 0 indicates the state of de-allocation table 0, state 1 indicates the state of de-allocation table 1, state 2 indicates the state of de-allocation table 2, and state 3 indicates the state of de-allocation table 3. The state of each de-allocation table at least indicates whether at least one entry in the corresponding de-allocation table is marked as "de-allocated".
  • Each CPU maintains and uses state 0, state 1, state 2, and state 3 in the manner illustrated in FIGS. 6A, 7A, and 7B. Updates of a state are broadcast or otherwise notified to the other CPUs, so that any CPU can know the latest value of the states maintained by the other CPUs; each CPU also updates the state it maintains based on the received updates of the states maintained by the other CPUs. For example, when state 0 is in the "scan completed" state, CPU 0 sets its own state 0 to the "no record" state only when it recognizes that the states maintained by all other CPUs are in the "scan completed" or "no record" state.
  • FIG. 9 is a schematic diagram of a system for processing logs in accordance with an embodiment of the present application.
  • Solid-state storage devices log any updates to their important metadata (e.g., FTL tables, de-allocation tables, etc.). Any one of the plurality of CPUs generates a log entry (referred to as an FTL table log or a de-allocation table log) in response to any update to, for example, the FTL table and/or the de-allocation table.
  • For a de-allocation table log, the log entry records the index of the updated de-allocation table entry and the updated content of that entry. Optionally, the producer or source of the log (called the log identifier) is also recorded in the log entry; the log identifier indicates, for example, CPU 0. CPU 0 sends the log entry to the log service of the solid-state storage device.
  • Logging services may be provided by one or more of the plurality of CPUs of the control component. The log service generates multiple data blocks and writes them to the NVM chip.
  • A de-allocation command typically accesses a large segment of the logical address space and thus produces a large number of log entries, which affects the performance of the solid-state storage device. In an embodiment of the present application, the de-allocation table log is therefore compressed to reduce the amount of log data written to the NVM chip.
  • FIG. 10A is a schematic diagram of a log entry cache according to an embodiment of the present application.
  • The log service component maintains multiple log entry caches; for example, the number of log entry caches is the same as the number of possible log identifier values. In FIG. 10A, four log entry caches are shown, corresponding to the four CPUs of the control component that generate de-allocation table logs. Each log entry cache indicates a log identifier (used to indicate the source of the log, for example, a CPU) and also records a de-allocation table index and, optionally, a count value. The de-allocation table index indicates the de-allocation table entry from which the log entry originated, and thus also indicates the logical address corresponding to that de-allocation table entry. The log service component compresses de-allocation table logs with the help of the log entry caches.
  • FIG. 10B illustrates a flow chart of a compressed de-allocation table log in accordance with an embodiment of the present application.
  • The log service component obtains a de-allocation table log from a CPU (1010) and, according to the obtained log, identifies its source or log identifier (for example, CPU 0) and the de-allocation table index it indicates (1020). It then identifies whether the de-allocation table log is compressible (1030): for example, the log entry cache is accessed according to the log identifier, and the de-allocation table index recorded in the log entry cache is compared with the de-allocation table index obtained from the received de-allocation table log to see whether they are consecutive.
  • If the cached de-allocation table index is consecutive with the received de-allocation table index, the log is recognized as compressible, and in the accessed log entry cache the de-allocation table index is updated to the newly received value. Optionally, the count value of the log entry cache is also incremented; and if the count value is greater than a threshold, log data to be written to the NVM chip is generated from the log entry cache to avoid accumulating too many cached logs. If the cached de-allocation table index is not consecutive with the received de-allocation table index, the log is recognized as not compressible, log data to be written to the NVM chip is generated from the log entry cache, and the log entry cache is updated according to the received de-allocation table log.
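  • The sketch below illustrates this run-length-style compression under the assumptions above; the slot layout, threshold value and function names are hypothetical, not the patent's data structures.

```c
/* Sketch of the FIG. 10B compression: one cache slot per log identifier
 * records the last de-allocation table index and a run count; consecutive
 * indices only bump the count, while a gap (or a full run) flushes a single
 * compressed log record to the NVM. */
#include <stdint.h>
#include <stdbool.h>

#define NUM_LOG_SOURCES 4u       /* e.g. one per CPU that produces logs */
#define RUN_THRESHOLD   256u

struct log_cache_slot {
    bool     valid;
    uint32_t first_index;        /* first de-allocation table index of the run */
    uint32_t last_index;         /* last index seen so far                     */
    uint32_t count;
};

static struct log_cache_slot cache[NUM_LOG_SOURCES];

static void flush_run_to_nvm(uint32_t src, const struct log_cache_slot *s) { (void)src; (void)s; }

void on_dealloc_table_log(uint32_t log_id, uint32_t table_index)   /* 1010/1020 */
{
    struct log_cache_slot *s = &cache[log_id];

    if (s->valid && table_index == s->last_index + 1) {            /* 1030: compressible */
        s->last_index = table_index;
        if (++s->count >= RUN_THRESHOLD) {                         /* avoid over-long runs */
            flush_run_to_nvm(log_id, s);
            s->valid = false;
        }
        return;
    }
    if (s->valid)                                                   /* not consecutive: flush */
        flush_run_to_nvm(log_id, s);
    s->valid       = true;                                          /* start a new run */
    s->first_index = s->last_index = table_index;
    s->count       = 1;
}
```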
  • FIG. 10C is a schematic diagram of a log entry cache according to still another embodiment of the present application.
  • In the embodiment of FIG. 10C, the log entry cache has multiple entries, each of which records a de-allocation table index and a count value, and the log service component uses the log entry cache to identify consecutive de-allocation table logs. The log service component identifies whether a received de-allocation table log hits the log entry cache: a hit means that the de-allocation table index corresponding to the received de-allocation table log is consecutive with or identical to the de-allocation table index recorded in some entry of the log entry cache. In that case, the count value of the hit log entry cache entry is updated; if the count value is greater than a threshold, log data to be written to the NVM chip is generated from the log entry cache to avoid accumulating too many cached logs.
  • If the received de-allocation table log misses the log entry cache and there is a spare entry in the log entry cache, the received de-allocation table log is recorded in the spare entry without writing data to the NVM chip. If the received de-allocation table log misses the log entry cache and there are no free entries, log data to be written to the NVM chip is generated based on the log entry cache and/or the received de-allocation table log, and the log entry cache is optionally updated based on the received de-allocation table log.
  • The control component 104 shown in FIG. 11 includes a host interface 1110, an allocator 1130, a plurality of CPUs (CPU 0 and CPU 1), and a media interface 1120 for accessing the NVM chip 105.
  • the host interface 1110 is used to exchange commands and data with the host.
  • the allocator 1130 is coupled to the host interface 1110 for receiving IO commands sent by the host to the storage device and assigning the IO commands to one of the plurality of CPUs. For the de-allocation command, the allocator 1130 sends the de-allocation command to each of the plurality of CPUs simultaneously, so that the plurality of CPUs cooperatively process the same de-allocation command to further speed up the processing of the de-allocation command.
  • Control component 104 is also coupled to an external memory (e.g., DRAM) 110.
  • the external memory 110 stores the FTL table, the de-allocation table 0, and the de-allocation table 1.
  • Multiple CPUs handle FTL tasks by using FTL tables and de-allocation tables.
  • IO commands are distributed across the plurality of CPUs so that the CPUs process multiple IO commands in parallel. For example, depending on whether the logical address accessed by an IO command is odd or even, the IO command is assigned to CPU 0 or CPU 1. As another example, an IO command accessing the first half of the logical address space is assigned to CPU 0, while an IO command accessing the second half of the logical address space is assigned to CPU 1. As still another example, IO commands are assigned randomly or alternately to CPU 0 or CPU 1, regardless of the logical address accessed by the IO command.
  • The de-allocation table is divided into two parts (de-allocation table 0 and de-allocation table 1): CPU 0 maintains de-allocation table 0 and CPU 1 maintains de-allocation table 1. Optionally, the control component has a greater number of CPUs to handle FTL tasks, and each CPU maintains its own de-allocation table. For example, logical addresses are distributed in turn across the de-allocation tables for maintenance: with n de-allocation tables (n being a positive integer), the result of the logical address modulo n is used as the index of the de-allocation table that maintains that logical address.
  • The allocator 1130 provides a de-allocation command to both CPU 0 and CPU 1: CPU 0 processes the portion of the de-allocation command whose logical addresses are maintained by de-allocation table 0, and CPU 1 processes the portion whose logical addresses are maintained by de-allocation table 1. CPU 0 and CPU 1 thus process the same de-allocation command simultaneously, which speeds up the processing of the de-allocation command. For example, de-allocation table 0 maintains the de-allocation table entries of even-valued logical addresses, and de-allocation table 1 maintains the de-allocation table entries of odd-valued logical addresses.
  • IO commands associated with different parts of the FTL table are processed by different CPUs: the allocator 1130 assigns each IO command to one of CPU 0 and CPU 1 according to the logical address accessed by the IO command, and CPU 0 and CPU 1 process multiple IO commands in parallel. The allocator 1130 assigns a write command to CPU 0 or CPU 1 according to the logical address accessed by the write command; for example, if de-allocation table 0 corresponding to CPU 0 maintains the de-allocation table entries of even-numbered logical addresses, a read command or write command accessing an even-numbered logical address is also assigned to CPU 0. As still other examples, a read command or write command is randomly assigned to CPU 0 or CPU 1, alternately assigned to CPU 0 or CPU 1, or assigned to CPU 0 or CPU 1 according to the load of CPU 0 and CPU 1.
  • In response to a write command, a new physical address is assigned to the logical address of the data to be written indicated by the write command, and the mapping relationship between the logical address and the physical address is recorded in the FTL table. The allocator 1130 assigns a read command to CPU 0 or CPU 1; in response to the read command, the FTL table is accessed, the physical address corresponding to the logical address of the read command is obtained, and data is read from that physical address.
  • The de-allocation table temporarily records that a logical address is in the "de-allocated" state. CPU 0 or CPU 1 also checks the de-allocation table, updates the corresponding entries of the FTL table according to the de-allocation table entries in which the "de-allocated" state is recorded, and records the "de-allocated" state in those FTL entries. The de-allocation table is updated in response to processing a de-allocation command, and CPU 0 or CPU 1 checks the de-allocation table when idle or periodically.
  • A check mark indicating that the de-allocation table is to be checked, or that checking has not been completed, is also recorded. Referring to FIG. 11, check mark 0 maintained by CPU 0 indicates whether de-allocation table 0 needs to be checked or its check is not yet complete, and check mark 1 maintained by CPU 1 indicates whether de-allocation table 1 needs to be checked or its check is not yet complete. Check mark 0 and check mark 1 at least indicate whether at least one entry in the respective de-allocation table is marked as "de-allocated". For example, check mark 0 uses only one bit of information, indicating whether at least one entry in de-allocation table 0 is marked as "de-allocated"; likewise, check mark 1 uses only one bit of information, indicating whether at least one entry in de-allocation table 1 is marked as "de-allocated". The memory space required for check mark 0 and check mark 1 is extremely small, and they can be stored in a register or in memory inside the CPU.
  • Optionally, a descriptor of the de-allocation table is also stored for indicating the progress of checking the de-allocation table; this will be described in detail later. CPU 0 may update de-allocation table 0 but not de-allocation table 1, and CPU 1 may update de-allocation table 1, while both CPU 0 and CPU 1 can read de-allocation table 0 and de-allocation table 1.
  • A large-block descriptor table is used to describe the various large blocks of the storage device. Each entry of the large-block descriptor table describes one large block, for example the number of the large block, the physical addresses of the physical blocks constituting the large block, the amount of valid data of the large block, the number of times the large block has been erased, and the like. For example, the large-block descriptor table is stored in the DRAM 110.
  • FIG. 12 is a diagram showing a mapping of a logical address and a de-allocation table of an IO command access according to still another embodiment of the present application.
  • In the direction in which logical addresses increase, the logical address space is divided into a plurality of regions (1202, 1204, ..., 1224, etc.), and each region is mapped to one of the plurality of de-allocation tables (de-allocation table 0 and de-allocation table 1). For example, consecutive logical address regions are alternately mapped to the de-allocation tables: regions 1202, 1206, 1210, 1214 and 1218 are mapped to de-allocation table 0, while regions 1204, 1208, 1212, 1216 and 1220 are mapped to de-allocation table 1. In this way, the logical address space accessed by a de-allocation command from the host is mapped to the de-allocation tables as evenly as possible.
  • The size of each logical address region is configurable; for example, each logical address region has the same size as the logical address range indicated by each FTL entry, for example 4 KB. The logical address space may also be divided in other ways: for example, the logical address space is divided into the same number of regions as there are de-allocation tables, and each region is mapped to one de-allocation table. As another example, with four de-allocation tables, the logical address regions are alternately mapped to one of the four de-allocation tables, for example using the logical address modulo 4 as the index of the de-allocation table to which the logical address is mapped.
  • In the example of FIG. 12, the deallocation command indicates that de-allocation is to be performed on logical address regions 1210, 1212, and 1214. Logical address regions 1210 and 1214 are mapped to de-allocation table 0, while logical address region 1212 is mapped to de-allocation table 1. The de-allocation command is supplied to both CPU 0 and CPU 1: CPU 0 accesses de-allocation table 0 to perform de-allocation on logical address regions 1210 and 1214, and CPU 1 accesses de-allocation table 1 to perform de-allocation on logical address region 1212.
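  • A small sketch of such a round-robin region-to-table mapping is given below; the region size and table count are hypothetical constants chosen to match the two-table example of FIG. 11/FIG. 12.

```c
/* Sketch of the FIG. 12 mapping: the logical address space is cut into
 * fixed-size regions assigned to the de-allocation tables in round-robin
 * fashion, so a de-allocation command is spread over the tables (and the
 * CPUs that own them) as evenly as possible. */
#include <stdint.h>

#define REGION_LBAS        8u    /* e.g. one 4 KB FTL-entry granule per region */
#define NUM_DEALLOC_TABLES 2u    /* de-allocation table 0 and 1 in FIG. 11     */

/* Which de-allocation table (and hence which CPU) owns this logical address. */
static inline uint32_t owning_dealloc_table(uint64_t lba)
{
    return (uint32_t)((lba / REGION_LBAS) % NUM_DEALLOC_TABLES);
}
```

  • Under this mapping, consecutive regions alternate between table 0 and table 1, matching the assignment of regions 1202, 1206, ... to de-allocation table 0 and regions 1204, 1208, ... to de-allocation table 1 described above.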
  • FIG. 13 illustrates a flow chart of processing a deallocation command in accordance with yet another embodiment of the present application.
  • The allocator 1130 (see also FIG. 11) transmits a received de-allocation command to CPU 0 and CPU 1 substantially simultaneously (1310). The de-allocation command indicates the logical addresses to be de-allocated. CPU 0 obtains, from the logical addresses indicated by the de-allocation command, the one or more logical addresses belonging to the de-allocation table 0 that it maintains (1320), and updates de-allocation table 0 according to these logical addresses so as to record in de-allocation table 0 that they are de-allocated (1330). CPU 1 obtains, from the logical addresses indicated by the same de-allocation command, the one or more logical addresses belonging to the de-allocation table 1 that it maintains (1340), and updates de-allocation table 1 according to these logical addresses so as to record in de-allocation table 1 that they are de-allocated (1350).
  • CPU 0 and CPU 1 thus cooperatively process different logical addresses of the same de-allocation command, which speeds up the processing of de-allocation commands. If there are more CPUs, the same de-allocation command is sent to the multiple CPUs, and each CPU marks, in the de-allocation table it maintains, the logical addresses belonging to that de-allocation table as de-allocated.
  • FIG. 14 illustrates a flow chart for updating an FTL table according to a de-allocation table according to still another embodiment of the present application.
  • The de-allocation table is used to temporarily record that logical addresses are "de-allocated" so as to speed up the processing of de-allocation commands; the "de-allocated" marks recorded in the de-allocation table still need to be moved into the FTL table. Still taking CPU 0 as an example, it is determined (for example from check mark 0) whether at least one entry in de-allocation table 0 is marked as "de-allocated"; if so, CPU 0 checks the de-allocation table in due time to find the entries marked as "de-allocated", records in the FTL table, according to the found entries, that the corresponding logical addresses are "de-allocated", and clears the "de-allocation" flag of those entries in the de-allocation table.
  • In detail, CPU 0 traverses de-allocation table 0, finds an entry that records the "de-allocation" flag, obtains the corresponding logical address from the location of the entry, and records that logical address as a logical address to be cleaned (1410). Optionally, CPU 0 obtains a plurality of logical addresses to be cleaned from de-allocation table 0 at a time.
  • CPU 0 also updates one or more entries of the large-block descriptor table (1420): the FTL table is accessed according to the one or more logical addresses to be cleaned to obtain the physical addresses corresponding to them, thereby identifying the large blocks to which these physical addresses belong, and the amount of valid data recorded in each large-block descriptor is updated based on the number of de-allocated physical addresses in that large block. For example, if the five physical addresses corresponding to five of the ten logical addresses to be cleaned belong to large block 1, the data recorded at those five physical addresses of large block 1 is no longer valid.
  • CPU 0 also updates the entries of the FTL table corresponding to the logical addresses to be cleaned, recording in those FTL entries that they are "de-allocated" (1430), and clears the "de-allocation" flag in the corresponding entries of the de-allocation table (1440). CPU 0 repeats the flow shown in FIG. 14 until no entry in de-allocation table 0 is marked as "de-allocated", and then also clears the corresponding check mark 0. It will be understood that CPU 1 performs a similar process according to de-allocation table 1.
  • With the check flag cleared, there is no need to access the de-allocation table when processing a read command or a write command, which speeds up their processing; if the check flag is set, access to the de-allocation table may be required when processing a read command or a write command, to check whether the accessed logical address has been de-allocated.
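  • The FIG. 14 cleanup can be sketched as follows; the chunk-index mapping, table sizes and function names are hypothetical placeholders.

```c
/* Sketch of the FIG. 14 cleanup: for each entry of de-allocation table 0
 * still marked "de-allocated" (1410), decrement the valid-data count of the
 * large block holding the old physical address (1420), mark the FTL entry
 * de-allocated (1430), and clear the table entry (1440). */
#include <stdint.h>
#include <stdbool.h>

#define NUM_LBAS        1024u
#define NUM_CHUNKS      64u
#define FTL_DEALLOCATED 0u

static uint32_t ftl0[NUM_LBAS];
static bool     dealloc0[NUM_LBAS];
static uint32_t chunk_valid_data[NUM_CHUNKS];   /* part of the large-block descriptor table */

static uint32_t chunk_of(uint32_t ppa) { return ppa % NUM_CHUNKS; }   /* placeholder mapping */

void cpu0_clean_dealloc_table(void)
{
    for (uint32_t lba = 0; lba < NUM_LBAS; lba++) {
        if (!dealloc0[lba])
            continue;                             /* 1410: find "de-allocated" entries */
        uint32_t ppa = ftl0[lba];
        if (ppa != FTL_DEALLOCATED)
            chunk_valid_data[chunk_of(ppa)]--;    /* 1420: data at ppa no longer valid */
        ftl0[lba]     = FTL_DEALLOCATED;          /* 1430: record "de-allocated" in FTL */
        dealloc0[lba] = false;                    /* 1440: clear the table entry        */
    }
}
```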
  • The control component also performs a garbage collection (GC) process to free the storage space occupied by invalid data. Referring to FIG. 3B, after the deallocation command is processed, the data stored at physical address 1-4, physical address 3-6, physical address 1-9, and physical address 1-10, which corresponded to logical address 0 to logical address 3, becomes invalid. To reclaim physical block 1, the invalid data recorded at physical address 1-4, physical address 1-9, and physical address 1-10 is discarded, and the valid data on physical block 1 is moved to a new physical block, thereby releasing the storage space occupied by the invalid data of physical block 1. One or more large blocks are selected according to the large-block descriptor table as the large blocks to be reclaimed by the garbage collection process; for example, the large block with the lowest amount of valid data recorded in its entry of the large-block descriptor table is selected as the large block to be reclaimed.
  • FIG. 15 shows a flow chart of a garbage collection process in accordance with another embodiment of the present application.
  • First, the large block to be reclaimed is selected according to the large-block descriptor table (1510); for example, the large block with the smallest amount of valid data is selected. It will be appreciated that the amount of valid data recorded for a large block in the large-block descriptor table may not be the true amount of valid data, since the large-block descriptor table may not yet have been updated by, for example, the flow of the embodiment of FIG. 14.
  • Next, the physical address and logical address of a piece of data to be reclaimed are obtained (1520): the physical address of the data to be reclaimed is known, and its logical address is obtained from the correspondence between physical addresses and logical addresses recorded for the large block to be reclaimed. Whether the data to be reclaimed is valid can be identified by querying the FTL table directly with the logical address to obtain the recorded physical address and checking whether it is consistent with the physical address of the data to be reclaimed; only valid data is relocated, and invalid data is discarded.
  • In the flow of FIG. 15, it is first determined whether the de-allocation table needs to be checked (1530). If the de-allocation table does not need to be checked, the FTL table is queried with the logical address to obtain the recorded physical address, and whether the data to be reclaimed is valid is identified according to whether the recorded physical address is consistent with the physical address of the data to be reclaimed (1550). Valid data to be reclaimed is written into a new large block (1560), and the FTL table is updated with the physical address in the new large block (1570) so that the new storage location of the valid data is recorded in the FTL table.
  • At step 1530, if the de-allocation table needs to be checked, the de-allocation table is accessed according to the logical address of the data to be reclaimed. If the corresponding entry of the de-allocation table records "de-allocated" (1540), meaning that no valid data is stored at that logical address, the flow returns to step 1520 to obtain the logical address and physical address of the next piece of data to be reclaimed from the large block. If the corresponding entry of the de-allocation table does not record "de-allocated", the FTL table is queried according to the logical address of the data to be reclaimed to identify whether the data is valid (1550).
  • At step 1550, if the data to be reclaimed is invalid, it is discarded without being relocated, and the flow returns to step 1520 to obtain the next piece of data to be reclaimed from the large block; if the data is valid, the data recorded at its physical address is written into a new large block (1560), and the FTL table is updated accordingly (1570).
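  • The validity test for one GC candidate can be sketched as below; the relocate helper, the check flag and other names are hypothetical.

```c
/* Sketch of the FIG. 15 validity test for one candidate during garbage
 * collection: a logical address marked "de-allocated" is skipped (1540);
 * otherwise the FTL table decides whether the data is still valid (1550)
 * and valid data is relocated (1560/1570). */
#include <stdint.h>
#include <stdbool.h>

#define NUM_LBAS 1024u

static uint32_t ftl[NUM_LBAS];
static bool     dealloc[NUM_LBAS];
static bool     check_dealloc_table;       /* the "check mark" of the owning CPU */

static uint32_t relocate(uint32_t old_ppa) { return old_ppa + 1u; }   /* write into a new chunk */

void gc_process_candidate(uint32_t lba, uint32_t ppa_in_victim_chunk)
{
    if (check_dealloc_table && dealloc[lba])       /* 1530/1540: nothing valid here   */
        return;
    if (ftl[lba] != ppa_in_victim_chunk)           /* 1550: data already superseded   */
        return;
    uint32_t new_ppa = relocate(ppa_in_victim_chunk);  /* 1560 */
    ftl[lba] = new_ppa;                                /* 1570: record the new location */
}
```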
  • FIGS. 16A-16E illustrate a de-allocation table and check marks in accordance with another embodiment of the present application. In the check mark, the start position (S), current position (C), and end position (E) of the current check of the de-allocation table are also recorded, and optionally the start position (NS) and end position (NE) of the next check of the de-allocation table.
  • In FIG. 16A, de-allocation table 0 is shown as an example. Since CPU 0 has processed a de-allocation command (TR1), the entries of the shaded portion of the de-allocation table are set to "de-allocated", and de-allocation table 0 is to be checked or cleaned so as to move the de-allocation marks in the de-allocation table into the FTL table. S indicates the start position of the check to be performed, E indicates its end position, and C indicates the position currently being checked. It can be understood that the start position S and the end position E are obtained from the range indicated by the de-allocation command (TR1). From the start position S to the end position E, the entries of the de-allocation table are checked or cleaned one by one, for example by the flow shown in FIG. 14, and C indicates the position currently being checked.
  • CPU 0 then receives another de-allocation command TR2. The shaded portion corresponding to TR2 in FIG. 16B indicates the portion of the range indicated by de-allocation command TR2 that belongs to de-allocation table 0. The start of this portion lies within the range currently being checked (from the start position S to the end position E), so only the end position recorded in the check mark is updated, from E to E1 (the end position of the portion of the range indicated by TR2 that belongs to de-allocation table 0). The position currently being checked is C1 and, as shown in FIG. 16B, the check of de-allocation table 0 continues from the current position C1 to the end position E1; the start position S is not updated.
  • CPU 0 then receives another de-allocation command TR3. The horizontally shaded portion corresponding to TR3 in FIG. 16C indicates the portion of the range indicated by de-allocation command TR3 that belongs to de-allocation table 0. The start of this portion lies within the range indicated by de-allocation command TR1 (the portion belonging to de-allocation table 0), and its end lies after that range. The position currently being checked is C2; since both the start and the end of the portion of the range indicated by TR3 that belongs to de-allocation table 0 lie between the current position C2 and the end position E1, neither the start position S nor the end position E1 needs to be updated on receiving the TR3 command.
  • By contrast, the start of the portion of the range indicated by de-allocation command TR3' that belongs to de-allocation table 0 lies after the start position S, its end lies before the end position E1, and the range covers the current position C2, which means that de-allocation table entries that have already been checked or cleaned are marked as "de-allocated" again. In this case, the start position NS of the next check is recorded in the check mark as the start position of the portion of the range indicated by TR3' that belongs to de-allocation table 0, and the end position NE of the next check is the current position.
  • CPU 0 then receives another de-allocation command TR4. The shaded portion corresponding to TR4 in FIG. 16D indicates the portion of the range indicated by de-allocation command TR4 that belongs to de-allocation table 0. The start of this portion lies before the start of the range indicated by de-allocation command TR1, and its end also lies before the range indicated by de-allocation command TR1. The position currently being checked is C3, and both the start and the end of the portion of the range indicated by TR4 that belongs to de-allocation table 0 lie before the current position C3. Consequently, the start position of the next check is recorded in the check mark as the start position NS1 of the TR4 portion, and the end position of the next check is recorded as the end position NE1 of the TR4 portion.
  • CPU 0 then receives another de-allocation command TR5. The shaded portion corresponding to TR5 in FIG. 16E indicates the portion of the range indicated by de-allocation command TR5 that belongs to de-allocation table 0. The start of this portion lies before the start of the range indicated by de-allocation command TR1, while its end lies within the range indicated by de-allocation command TR1. The position currently being checked is C3, and both the start and the end of the TR5 portion lie before the current position C3. Consequently, the start position of the next check is recorded in the check mark as the start position NS2 of the TR5 portion, and the end position of the next check is recorded as the end position NE2 of the TR5 portion.
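  • The check-mark bookkeeping of FIGS. 16A-16E can be sketched roughly as below. This is a loose illustration only: the struct fields are hypothetical, and the merge rule collapses the several cases above into two ("extend the current window" versus "record a next-scan window for already-checked entries").

```c
/* Rough sketch of a check mark recording the current scan window (S, C, E)
 * and an optional next-scan window (NS, NE). */
#include <stdint.h>
#include <stdbool.h>

struct check_mark {
    bool     needs_check;            /* at least one "de-allocated" entry exists */
    uint32_t start, cur, end;        /* current scan: S, C, E                    */
    bool     has_next;               /* a next-scan window was recorded          */
    uint32_t next_start, next_end;   /* NS, NE                                   */
};

/* A newly processed de-allocation command covers table entries [s, e]. */
void on_new_dealloc_range(struct check_mark *m, uint32_t s, uint32_t e)
{
    m->needs_check = true;
    if (s >= m->cur) {                       /* not yet checked: current window suffices */
        if (e > m->end) m->end = e;          /* FIG. 16B: push the end position out      */
        return;
    }
    /* Part of [s, e] was already checked: remember that part for the next round. */
    uint32_t covered_end = (e < m->cur) ? e : m->cur;
    if (!m->has_next) {
        m->has_next   = true;
        m->next_start = s;
        m->next_end   = covered_end;
    } else {
        if (s < m->next_start)         m->next_start = s;
        if (covered_end > m->next_end) m->next_end   = covered_end;
    }
    if (e > m->end) m->end = e;              /* any unchecked tail still extends the window */
}
```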
  • FIG. 17 is a schematic diagram of processing a deallocation command according to another embodiment of the present application.
  • The allocator 1130 receives a de-allocation command and sends it to CPU 0 and CPU 1 substantially simultaneously (1710). Optionally, in response to receiving the de-allocation command, CPU 0 and CPU 1 suspend their own checking or cleaning of the de-allocation tables, if such checking or cleaning is in progress (1720). When CPU 0 and CPU 1 are idle, they check or clean the de-allocation tables they are responsible for maintaining; when a new de-allocation command is received, the ongoing check or cleanup of the de-allocation table is suspended and the de-allocation command is processed immediately, so as to speed up the processing of the de-allocation command and reduce its processing latency.
  • CPU 0 and CPU 1 also update the check marks they maintain (check mark 0 and check mark 1) (1730). The check marks are updated in the manner shown in FIGS. 16A-16E, so that the end position of the current check or cleanup of the de-allocation table is recorded in the check mark and, optionally, the start and end positions of the next round of checking or cleaning of the de-allocation table.
  • CPU 0 obtains, from the logical addresses indicated by the de-allocation command, the one or more logical addresses belonging to the de-allocation table 0 that it maintains (1740), and updates de-allocation table 0 according to these logical addresses so as to record in de-allocation table 0 that they are de-allocated (1750). CPU 1 obtains, from the logical addresses indicated by the same de-allocation command, the one or more logical addresses belonging to the de-allocation table 1 that it maintains (1760), and updates de-allocation table 1 according to these logical addresses so as to record in de-allocation table 1 that they are de-allocated (1770).
  • FIG. 18 illustrates a flow chart for updating an FTL table according to a de-allocation table according to another embodiment of the present application.
  • Taking CPU 0 as an example again, to determine whether at least one entry in de-allocation table 0 is marked as "de-allocated", CPU 0 checks the de-allocation table in due course to find entries marked "de-allocated" (1810), records in the FTL table, according to each entry found, that the corresponding logical address is "de-allocated" (1820), and clears the "de-allocated" flag of that entry in the de-allocation table (1840).
  • CPU 0 also updates the chunk descriptor table (1830). The order of execution of step 1830 and step 1840 is not limited.
  • For example, CPU 0 traverses de-allocation table 0, finds an entry recording the "de-allocated" flag, derives the corresponding logical address from the position of that entry, and records the logical address as a logical address to be cleaned.
  • Optionally, CPU 0 obtains multiple logical addresses to be cleaned from de-allocation table 0 at a time.
  • When updating the FTL table, the physical addresses corresponding to the logical addresses to be cleaned are obtained from the FTL table, thereby identifying the chunks to which those physical addresses belong.
  • The amount of valid data recorded in the chunk descriptor is updated according to the number of physical addresses in the chunk that have been de-allocated.
  • CPU 0 also updates the FTL table entries corresponding to the logical addresses to be cleaned, recording in those FTL entries that they are "de-allocated", and clears the "de-allocated" flag in the de-allocation table entries corresponding to those logical addresses.
  • CPU 0 identifies whether the check or cleanup of de-allocation table 0 is complete (1850). This is determined by comparing the end position recorded in the check mark with the current position of the check or cleanup. If the current position has not reached the end position, entries in the de-allocation table are still waiting to be checked or cleaned, and processing returns to step 1810 to find further entries marked "de-allocated".
  • If the check or cleanup of the de-allocation table is complete (the current position has reached the end position), it is further checked whether the de-allocation table needs to be checked or cleaned again (1860). This is determined by identifying whether a next-scan start position and a next-scan end position are recorded in the check mark. If they are recorded, processing goes to step 1810 to start a new round of checking or cleaning from the next-scan start position. If they are not recorded, the CPU's check or cleanup of the de-allocation table is complete.
  • Each CPU performs the flow shown in Fig. 18 on the de-allocation table it is responsible for maintaining.
  • The CPUs update the FTL table according to their de-allocation tables in parallel, which speeds up processing.
  • The control unit 104 shown in Fig. 19 has a structure similar to that of the control unit 104 shown in Fig. 11.
  • The illustrative control unit 104 includes a plurality of CPUs, and the distributor 1930 assigns IO commands to each of the plurality of CPUs.
  • The control unit of Fig. 19 includes four CPUs (CPU 0, CPU 1, CPU 2 and CPU 3).
  • The processing of IO commands by the control unit 104 is similar to that of the control unit 104 of Fig. 11; IO commands are assigned among the multiple CPUs.
  • The de-allocation table is divided into four parts (de-allocation table 0, de-allocation table 1, de-allocation table 2 and de-allocation table 3).
  • CPU 0 maintains de-allocation table 0, CPU 1 maintains de-allocation table 1, CPU 2 maintains de-allocation table 2, and CPU 3 maintains de-allocation table 3.
  • For example, the result of the logical address modulo 4 is used as the index of the de-allocation table that maintains that logical address.
  • The distributor 1930 provides the de-allocation command to CPU 0, CPU 1, CPU 2 and CPU 3 simultaneously (a minimal C sketch of this dispatch appears after the list).
  • CPU 0 processes the portion of the de-allocation command maintained by de-allocation table 0, CPU 1 processes the portion maintained by de-allocation table 1, CPU 2 processes the portion maintained by de-allocation table 2, and CPU 3 processes the portion maintained by de-allocation table 3.
  • CPU 0, CPU 1, CPU 2 and CPU 3 thus process the same de-allocation command simultaneously, which speeds up the processing of the de-allocation command.
  • For IO commands, commands associated with different parts of the FTL table are processed by different CPUs.
  • The distributor 1930 assigns each IO command to one of CPU 0, CPU 1, CPU 2 or CPU 3 according to the logical address the IO command accesses.
  • CPU 0, CPU 1, CPU 2 and CPU 3 process a plurality of IO commands in parallel.
  • The de-allocation table temporarily records that a logical address is in the "de-allocated" state.
  • CPU 0, CPU 1, CPU 2 or CPU 3 also checks its de-allocation table and, according to the de-allocation table entries in which the "de-allocated" state is recorded, updates the corresponding entries of the FTL table, recording the "de-allocated" state in those FTL entries.
  • A check mark indicating that the de-allocation table is to be checked, or that the check has not yet been completed, is also recorded.
  • Check mark 0 maintained by CPU 0 indicates whether de-allocation table 0 needs to be checked or the check has not been completed, and check mark 1 maintained by CPU 1 indicates the same for de-allocation table 1.
  • Check mark 2 maintained by CPU 2 indicates whether de-allocation table 2 needs to be checked or the check has not been completed, and check mark 3 maintained by CPU 3 indicates the same for de-allocation table 3.
  • CPU 0 may update de-allocation table 0 but may not update de-allocation tables 1, 2 and 3; likewise, CPU 1 may update only de-allocation table 1, CPU 2 only de-allocation table 2, and CPU 3 only de-allocation table 3.
  • CPU 0, CPU 1, CPU 2 and CPU 3 can all read every de-allocation table.
  • The check mark maintained by each of the plurality of CPUs also indicates the start position (S), current position (C) and end position (E) of the current check of the de-allocation table maintained by that CPU, and optionally also records the start position (NS) and end position (NE) of the next check.
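
The bullets above describe how the distributor fans a single de-allocation command out to all four CPUs, each of which keeps only the logical addresses that map to its own de-allocation table (logical address modulo 4). The following C sketch illustrates that dispatch under those assumptions; `mark_deallocated_in_table`, `NUM_CPUS` and the function names are illustrative and do not come from the application.

```c
#include <stdint.h>

#define NUM_CPUS 4

/* Assumed helper: sets the "de-allocated" bit for lba in de-allocation table `table`. */
void mark_deallocated_in_table(unsigned table, uint64_t lba);

/* Executed by CPU cpu_id for a command covering [first_lba, first_lba + count). */
void cpu_handle_deallocate(unsigned cpu_id, uint64_t first_lba, uint64_t count)
{
    for (uint64_t lba = first_lba; lba < first_lba + count; lba++) {
        if (lba % NUM_CPUS == cpu_id)            /* belongs to this CPU's table */
            mark_deallocated_in_table(cpu_id, lba);
    }
}

void distribute_deallocate(uint64_t first_lba, uint64_t count)
{
    /* The distributor hands the same command to every CPU at roughly the same
     * time; the fan-out is shown here as a plain loop for simplicity. */
    for (unsigned cpu = 0; cpu < NUM_CPUS; cpu++)
        cpu_handle_deallocate(cpu, first_lba, count);
}
```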

Abstract

The present application discloses a method for processing de-allocation commands and a storage device therefor. The disclosed method comprises the following steps: in response to receiving a de-allocation command, obtaining the address range indicated by the de-allocation command; and updating entries of a de-allocation table according to the address range indicated by the de-allocation command. Embodiments of the present application can reduce the latency of processing de-allocation commands and mitigate the impact of de-allocation command processing on IO command processing bandwidth.

Description

去分配命令处理方法及其存储设备
相关申请的交叉引用
本申请要求2017年11月29日提交的中国专利申请2017112222389(发明名称为存储命令处理方法及其存储设备)以及2018年6月11日提交的中国专利申请2018105944879(发明名称为去分配命令处理方法及其)的优先权,其全部内容通过引用合并于此。
技术领域
本申请涉及存储技术领域,尤其涉及去分配命令的方法及其存储设备。
背景技术
图1展示了固态存储设备的框图。固态存储设备102同主机相耦合,用于为主机提供存储能力。主机同固态存储设备102之间可通过多种方式相耦合,耦合方式包括但不限于通过例如SATA(Serial Advanced Technology Attachment,串行高级技术附件)、SCSI(Small C omputer System Interface,小型计算机***接口)、SAS(Serial Attached SCSI,串行连接S CSI)、IDE(Integrated Drive Electronics,集成驱动器电子)、USB(Universal Serial Bus,通用串行总线)、PCIE(Peripheral Component Interconnect Express,PCIe,高速***组件互联)、NVMe(NVM Express,高速非易失存储)、以太网、光纤通道、无线通信网络等连接主机与固态存储设备102。主机可以是能够通过上述方式同存储设备相通信的信息处理设备,例如,个人计算机、平板电脑、服务器、便携式计算机、网络交换机、路由器、蜂窝电话、个人数字助理等。存储设备102包括接口103、控制部件104、一个或多个NVM芯片105以及DRAM(Dynamic Random Access Memory,动态随机访问存储器)110。
NAND闪存、相变存储器、FeRAM(Ferroelectric RAM,铁电存储器)、MRAM(Magn etic Random Access Memory,磁阻存储器)、RRAM(Resistive Random Access Memory,阻变存储器)等是常见的NVM。
接口103可适配于通过例如SATA、IDE、USB、PCIE、NVMe、SAS、以太网、光纤通道等方式与主机交换数据。
控制部件104用于控制在接口103、NVM芯片105以及DRAM 110之间的数据传输,还用于存储管理、主机逻辑地址到闪存物理地址映射、擦除均衡、坏块管理等。控制部件104可通过软件、硬件、固件或其组合的多种方式实现,例如,控制部件104可以是FPGA(Field-programmable gate array,现场可编程门阵列)、ASIC(Application Specific Integrated Circuit,应用专用集成电路)或者其组合的形式。控制部件104也可以包括处理器或者控制器,在处理器或控制器中执行软件来操纵控制部件104的硬件来处理IO(Input/Output)命令。控制部件104还可以耦合到DRAM 110,并可访问DRAM 110的数据。在DRAM可存储FTL表和/或缓存的IO命令的数据。
控制部件104包括闪存接口控制器(或称为介质接口控制器、闪存通道控制器),闪存接口控制器耦合到NVM芯片105,并以遵循NVM芯片105的接口协议的方式向NVM芯片105发出命令,以操作NVM芯片105,并接收从NVM芯片105输出的命令执行结果。已知的NVM芯片接口协议包括“Toggle”、“ONFI”等。
存储器目标(Target)是NAND闪存封装内的共享CE(,Chip Enable,芯片使能)信号的一个或多个逻辑单元(LUN,Logic UNit)。NAND闪存封装内可包括一个或多个管芯(Die)。典型地,逻辑单元对应于单一的管芯。逻辑单元可包括多个平面(Plane)。逻辑单元内的多个平面可以并行存取,而NAND闪存芯片内的多个逻辑单元可以彼此独立地执行命令和 报告状态。
存储介质上通常按页来存储和读取数据。而按块来擦除数据。块(也称物理块)包含多个页。块包含多个页。存储介质上的页(称为物理页)具有固定的尺寸,例如17664字节。物理页也可以具有其他的尺寸。
大块包括来自多个逻辑单元(LUN),也称为逻辑单元组的每个的物理块。每个逻辑单元可以为大块提供一个物理块。例如,在图2所示出的大块的示意图中,在每16个逻辑单元(LUN)上构造大块。每个大块包括16个分别来自16个逻辑单元(LUN)的物理块。在图2的例子中,大块0包括来自16个逻辑单元(LUN)中的每个逻辑单元的物理块0,而大块1包括来自每个逻辑单元(LUN)的物理块1。也可以用多种其他方式来构造大块。
例如,在大块中构造页条带,每个逻辑单元(LUN)内相同物理地址的物理页构成了“页条带”。图2中,物理页P0-0、物理页P0-1……与物理页P0-x构成了页条带0,其中,物理页P0-0、物理页P0-1……物理页P0-14用于存储用户数据,而物理页P0-x用于存储根据条带内的所有用户数据计算得到的校验数据。类似地,图2中,物理页P2-0、物理页P2-1……与物理页P2-x构成了页条带2。用于存储校验数据的物理页可以位于页条带中的任意位置。作为又一个例子,在申请号为201710752321.0的中国专利申请的图3A及其说明书中对图3A的相关描述中,提供了大块的又一种构造方式。
在固态存储设备中,利用FTL(Flash Translation Layer,闪存转换层)来维护从逻辑地址到物理地址的映射信息。逻辑地址构成了操作***等上层软件所感知到的固态存储设备的存储空间。物理地址是用于访问固态存储设备的物理存储单元的地址。在相关技术中还可利用中间地址形态实施地址映射。例如将逻辑地址映射为中间地址,进而将中间地址进一步映射为物理地址。
存储了从逻辑地址到物理地址的映射信息的表结构被称为FTL表。FTL表是固态存储设备中的重要元数据。FTL表的数据项记录了固态存储设备中以数据单元为单位的地址映射关系。在一个例子中,FTL表中的逻辑页对应4KB存储空间,而物理页的存储空间也为4KB(还包括附加的带外存储空间)。FTL表为每个4KB的数据单元提供一条记录,以记录其逻辑地址到物理地址的映射。在另一个例子中,数据单元对应的存储空间大小和物理页的存储空间大小不同,例如物理页可容纳多个数据单元,数据单元对应4KB的存储空间,而物理页的存储空间能够容纳多个数据单元(例如4个)
FTL表包括多个FTL表条目(或称表项)。在一种情况下,每个FTL表条目中记录了一个逻辑页地址与一个物理页的对应关系。在另一种情况下,每个FTL表条目中记录了连续的多个逻辑页地址与连续的多个物理页的对应关系。在又一种情况下,每个FTL表条目中记录了逻辑块地址与物理块地址的对应关系。在依然又一种情况下,FTL表中记录逻辑块地址与物理块地址的映射关系,和/或逻辑页地址与物理页地址的映射关系。
在处理来自主机的读命令时,固态存储设备利用读命令中携带的逻辑地址从FTL表中获得对应的物理地址,并依据物理地址向NVM芯片发出读请求,并接收NVM芯片响应于读请求输出的数据。在处理来自主机的写命令时,固态存储设备为写命令分配物理地址,在FTL表中记录写命令的逻辑地址与分配的物理地址的对应关系,并依据分配的物理地址向NVM芯片发出写请求。
在ATA8-ACS2中定义了Trim(修剪)命令,具有相同或类似含义的命令在SCSI(Small Computer System Interface,小型计算机***接口)规范中被称为UNMAP(解除映射),而在NVMe规范中被称为Deallocate(去分配)。下文中,用“去分配”来指示具有同ATA8-ACS2的“修剪”、SCSI的“解除映射”、NVMe的“去分配”具有相同或相似功能的数据集管理命令,以及也指示在其他或将来的协议、规范或技术中出现的具有相同或相似功能的命令。
在去分配命令中,描述了逻辑地址范围。在执行了去分配命令后,可以有不同的效果。例如,(1)执行去分配命令后,(执行对该逻辑地址范围的其他写入操作之前)再读取去分配命令所指示的逻辑地址范围时,所得到的是确定的;(2)执行去分配命令后,(执行对该逻 辑地址范围的其他写入操作之前)再读取去分配命令所指示的逻辑地址范围时,所得到的结果是全0;(3)执行去分配命令后,(执行对该逻辑地址范围的其他写入操作之前)再读取去分配命令所指示的逻辑地址范围时,所得到的结果可以是任意值。在去分配命令或者其他命令中可设置或选择去分配命令的执行效果。
发明内容
通过使用去分配命令,主机能够告知固态硬盘哪些逻辑地址空间不再存储有效数据,从而固态硬盘在回收存储空间时,不必搬移已经失效的数据。然而,现有技术中,随着固态存储设备的存储容量变大,FTL表的尺寸随之增大,导致在执行去分配命令过程中,需要大量的内存访问操作,这严重延长了对去分配命令进行处理的时间,并影响固态硬盘的性能,进而影响对同时发生的IO命令的执行。
需要降低去分配命令的处理延迟。进一步地,需要降低处理去分配命令对IO命令处理造成的延迟。也减轻降低处理去分配命令对IO命令处理带宽的影响。
根据本申请的第一方面,提供了根据本申请第一方面的第一存储设备执行的方法,包括:接收读命令;若去分配表中没有任何条目被标记为“去分配”,查询FTL表获得读命令访问的逻辑地址对应的物理地址;以及从物理地址获取数据作为对读命令的响应。
根据本申请第一方面的第一存储设备执行的方法,提供了根据本申请第一方面的第二存储设备执行的方法,还包括:若去分配表中有至少一个条目被标记为“去分配”,查询去分配表以确定读命令访问的逻辑地址是否被去分配;若读命令访问的逻辑地址在去分配表中被标记为“去分配”,以指定值作为对读命令的响应。
根据本申请第一方面的第二存储设备执行的方法,提供了根据本申请第一方面的第三存储设备执行的方法,还包括:若读命令访问的逻辑地址在去分配表中未被标记为“去分配”,依据查询FTL表获得的物理地址获取数据作为对读命令的响应。
根据本申请第一方面的第一至第三存储设备执行的方法之一,提供了根据本申请第一方面的第四存储设备执行的方法,其中通过去分配表的状态指示去分配表中是否有任何条目被标记为“去分配”。
根据本申请第一方面的第一至第四存储设备执行的方法之一,提供了根据本申请第一方面的第五存储设备执行的方法,还包括:响应于接收去分配命令,将去分配表的状态设置为指示去分配表中有至少一个条目被标记为“去分配”。
根据本申请第一方面的第一至第五存储设备执行的方法之一,提供了根据本申请第一方面的第六存储设备执行的方法,还包括:扫描去分配表,根据去分配表中被标记为“去分配”的条目对应的逻辑地址,将FTL表的同逻辑地址对应的条目更新为指示“去分配”。
根据本申请第一方面的第六存储设备执行的方法,提供了根据本申请第一方面的第七存储设备执行的方法,还包括:响应于扫描去分配表完成,将去分配表的状态设置为指示去分配表中没有任何条目被标记为“去分配”。
根据本申请第一方面的第一至第七存储设备执行的方法之一,提供了根据本申请第一方面的第八存储设备执行的方法,还包括:响应于接收写命令,为写命令分配物理地址;若去分配表中没有任何条目被标记为“去分配”,用写命令的逻辑地址与分配的物理地址更新FTL表;以及根据物理地址写入数据。
根据本申请第一方面的第八存储设备执行的方法,提供了根据本申请第一方面的第九存储设备执行的方法,还包括:若去分配表中有至少一个条目被标记为“去分配”,将去分配表中同写命令的逻辑地址对应的条目清除“去分配”标记。
根据本申请第一方面的第八或第九存储设备执行的方法,提供了根据本申请第一方面的第十存储设备执行的方法,还包括:在根据物理地址写入数据之前,向写命令的发出方指示写命令已处理完成。
根据本申请第一方面的第八存储设备执行的方法,提供了根据本申请第一方面的第十一存储设备执行的方法,其中响应于获知去分配表中没有任何条目被标记为“去分配”,向写命令的发出方指示写命令已处理完成。
根据本申请第一方面的第八或第九存储设备执行的方法,提供了根据本申请第一方面的第十二存储设备执行的方法,其中响应于获知去分配表中有至少一个条目被标记为“去分配”,在将去分配表中同写命令的逻辑地址对应的条目清除“去分配”标记之后,向写命令的发出方指示写命令已处理完成。
根据本申请第一方面的第一至第十二存储设备执行的方法之一,提供了根据本申请第一方面的第十三存储设备执行的方法,其中去分配表包括第一去分配表与第二去分配表,分别对应于第一FTL表与第二FTL表;第一去分配表的状态指示包括第一去分配表与第二去分配表的去分配表中是否有任何条目被标记为“去分配”;第二去分配表的状态指示包括第一去分配表与第二去分配表的去分配表中是否有任何条目被标记为“去分配”。
根据本申请第一方面的第十三存储设备执行的方法,提供了根据本申请第一方面的第十四存储设备执行的方法,还包括:响应于接收去分配命令,将第一去分配表的状态与第二去分配表的状态设置为去分配表中有至少一个条目被标记为“去分配”。
根据本申请第一方面的第十四存储设备执行的方法,提供了根据本申请第一方面的第十五存储设备执行的方法,还包括:响应于扫描第一去分配表完成,将第一去分配表的状态设置为“扫描完成”;响应于扫描第二去分配表完成,将第二去分配表的状态设置为“扫描完成”。
根据本申请第一方面的第十五存储设备执行的方法,提供了根据本申请第一方面的第十六存储设备执行的方法,还包括:响应于第一去分配表的状态被设置为“扫描完成”,若第二去分配表的状态为“扫描完成”或“无记录”,将第一去分配表的状态设置为“无记录”,其中“无记录”状态指示包括第一去分配表与第二去分配表的去分配表中没有任何条目被标记为“去分配”。
根据本申请第一方面的第十五或十六存储设备执行的方法,提供了根据本申请第一方面的第十七存储设备执行的方法,还包括:响应于第二去分配表的状态被设置为“扫描完成”,若第一去分配表的状态为“扫描完成”或“无记录”,将第二去分配表的状态设置为“无记录”,其中“无记录”状态指示包括第一去分配表与第二去分配表的去分配表中没有任何条目被标记为“去分配”。
根据本申请第一方面的第十三至第十七存储设备执行的方法之一,提供了根据本申请第一方面的第十八存储设备执行的方法,其中第一CPU处理访问第一去分配表的去分配命令;第二CPU处理访问第二去分配表的去分配命令。
根据本申请第一方面的第十八存储设备执行的方法,提供了根据本申请第一方面的第十九存储设备执行的方法,还包括:响应于第一去分配表的状态被更新,第一CPU向第二CPU通知第一去分配表的状态被更新。
根据本申请第一方面的第十八或十九存储设备执行的方法,提供了根据本申请第一方面的第二十存储设备执行的方法,还包括:响应于第二去分配表的状态被更新,第二CPU向第一CPU通知第一去分配表的状态被更新。
根据本申请第一方面的第一至第二十存储设备执行的方法之一,提供了根据本申请第一方面的第二十一存储设备执行的方法,还包括:获取去分配表的更新日志;若去分配表的更新日志可压缩,缓存所获取的去分配表的更新日志。
根据本申请第一方面的第二十一存储设备执行的方法,提供了根据本申请第一方面的第二十二存储设备执行的方法,还包括:若去分配表的更新日志不可压缩,将去分配表的更新日志写入NVM芯片。
根据本申请第一方面的第二十一或二十二存储设备执行的方法,提供了根据本申请第一方面的第二十三存储设备执行的方法,其中依据获取的去分配表的更新日志同缓存的去分配 表的更新日志是否连续,确定去分配表的更新日志是否可压缩。
根据本申请第一方面的第二十一至第二十三存储设备执行的方法之一,提供了根据本申请第一方面的第二十四存储设备执行的方法,其中依据获取的去分配表的更新日志同缓存的去分配表的更新日志是否连续,以及获取的去分配表的更新日志同缓存的去分配表的更新日志的来源是否相同,确定去分配表的更新日志是否可压缩。
根据本申请的第二方面,提供了根据本申请第二方面的第一存储设备,包括控制部件、存储器和NVM芯片,其中,存储器存储去分配表和FTL表,控制部件执行根据本申请第一方面的第一至第二十四存储设备执行的方法之一。
根据本申请的第三方面,提供了根据本申请第三方面的第一存储设备,包括控制部件、存储器和NVM芯片,其中,存储器存储去分配表和FTL表,控制部件包括第一CPU与第二CPU;第一CPU与第二CPU分别执行根据本申请第一方面的第一至第二十四存储设备执行的方法之一。
根据本申请第三方面的第一存储设备,提供了根据本申请第三方面的第二存储设备,其中存储器存储第一去分配表、第二去分配表、第一FTL表和第二FTL表,第一去分配表用于第一FTL表,而第二去分配表用于第二FTL表;第一CPU维护第一去分配表的状态,第二CPU维护第二去分配表的状态。
根据本申请第三方面的第一或第二存储设备,提供了根据本申请第三方面的第三存储设备,还包括分配器,用于将接收的命令分配给第一CPU或第二CPU之一。
根据本申请第三方面的第三存储设备,提供了根据本申请第三方面的第四存储设备,其中所述分配器根据去分配命令访问的逻辑地址,将去分配命令分配给同逻辑地址对应的第一或第二CPU之一。
根据本申请第三方面的第三或第四存储设备,提供了根据本申请第三方面的第五存储设备,其中所述分配器将读命令或写命令随机或轮流分配给第一CPU或第二CPU之一。
根据本申请的第四方面,提供了根据本申请第四方面的第一处理去分配命令的方法,包括:响应于接收到去分配命令,获取去分配命令所指示的地址范围;根据去分配命令所指示的地址范围,更新去分配表的表项。
根据本申请第四方面的第一处理去分配命令的方法,提供了根据本申请第四方面的第二处理去分配命令的方法,还包括:根据去分配命令所指示的地址范围,更新FTL表的表项;其中FTL表记录了同逻辑地址对应的物理地址。
根据本申请第四方面的第一或第二处理去分配命令的方法,提供了根据本申请第四方面的第三处理去分配命令的方法,其中FTL表中由去分配命令所描述的逻辑地址范围所指示的FTL表项被设置为特殊标记。
根据本申请第四方面的第一至第三处理去分配命令的方法之一,提供了根据本申请第四方面的第四处理去分配命令的方法,其中去分配表中,存储对应于每个地址是否被去分配的信息。
根据本申请第四方面的第一至第四处理去分配命令的方法之一,提供了根据本申请第四方面的第五处理去分配命令的方法,其中响应于地址被分配,在去分配表中,将该地址标记为被分配。
根据本申请第四方面的第一至第五处理去分配命令的方法之一,提供了根据本申请第四方面的第六处理去分配命令的方法,其中当地址未被分配或者已经被应用了去分配命令时,该地址在去分配表中被标记为去分配。
根据本申请第四方面的第一至第六处理去分配命令的方法之一,提供了根据本申请第四方面的第七处理去分配命令的方法,根据本申请第一方面的处理去分配命令的方法,其中在更新了去分配表后,向主机指示去分配命令执行完成。
根据本申请第四方面的第一至第七处理去分配命令的方法之一,提供了根据本申请第四方面的第八处理去分配命令的方法,其中更新FTL表包括将去分配命令所指示的一个或多个 逻辑地址对应的FTL表项设置为指定值。
根据本申请第四方面的第一至第八处理去分配命令的方法之一,提供了根据本申请第四方面的第九处理去分配命令的方法,其中在更新FTL表前,对要更新的一个或多个地址对应的FTL表的表项加锁。
根据本申请第四方面的第一至第九处理去分配命令的方法之一,提供了根据本申请第四方面的第十处理去分配命令的方法,其中在更新FTL表后,对被更新的一个或多个地址对应的FTL表的表项解锁。
根据本申请的第五方面,提供了根据本申请第五方面的第一存储设备,其特征在于,包括:控制部件,执行根据本申请第四方面的处理去分配命令的方法之一;与控制部件连接的外部存储器以及非易失性存储器;其中去分配表存储在固态存储设备的控制部件的内部存储器或者存储在外部存储器中。
根据本申请第五方面的第一存储设备,提供了根据本申请第五方面的第二存储设备,其中当固态存储设备断电时,将去分配表写入非易失性存储器。
根据本申请的第六方面,提供了根据本申请的第六方面的第一处理去分配命令的***,包括:控制部件与外部存储器;控制部件包括分配器与多个CPU,所述分配器,接收IO命令,并将IO命令分配给多个CPU中的每一个;所述多个CPU,用于并行处理接收到的IO命令;外部存储器,存储去分配表。
根据本申请的第六方面的第一处理去分配命令的***,提供了根据本申请的第六方面的第二处理去分配命令的***,其中存储器中还存储了FTL表,FTL表记录了同逻辑地址对应的物理地址。
根据本申请的第六方面的第一处理去分配命令的***,提供了根据本申请的第六方面的第三处理去分配命令的***,其中去分配表被分为多个部分,每个部分由多个CPU中的一个维护。
根据本申请的第六方面的第一至第三处理去分配命令的***之一,提供了根据本申请的第六方面的第四处理去分配命令的***,其中分配器将去分配命令同时提供给多个CPU中的一个或多个,CPU对去分配命令中与自己维护的去分配表相关的部分进行处理。
根据本申请的第六方面的第一至第四处理去分配命令的***之一,提供了根据本申请的第六方面的第五处理去分配命令的***,其中依据IO命令访问的地址,将IO命令分配给CPU。
根据本申请的第六方面的第一至第五处理去分配命令的***之一,提供了根据本申请的第六方面的第六处理去分配命令的***,其中由分配器根据IO命令访问的逻辑地址将IO命令分配给多个CPU。
根据本申请的第六方面的第一至第六处理去分配命令的***之一,提供了根据本申请的第六方面的第七处理去分配命令的***,其中去分配表记录地址处于去分配状态。
根据本申请的第六方面的第一至第七处理去分配命令的***之一,提供了根据本申请的第六方面的第八处理去分配命令的***,根据记录了去分配状态的去分配表表项,更新FTL表的对应表项,在FTL表项中记录去分配状态。
根据本申请的第六方面的第一至第八处理去分配命令的***之一,提供了根据本申请的第六方面的第九处理去分配命令的***,其中响应于处理去分配命令而更新了去分配表,一个或多个CPU在空闲时或周期性地检查其维护的去分配表。
根据本申请的第六方面的第一至第九处理去分配命令的***之一,提供了根据本申请的第六方面的第十处理去分配命令的***,其中一个或多个CPU保存检查标记。
根据本申请的第六方面的第十处理去分配命令的***,提供了根据本申请的第六方面的第十一处理去分配命令的***,其中检查标记至少指示了CPU所维护的去分配表中存在至少一项条目被标记为去分配。
根据本申请的第六方面的第十一处理去分配命令的***,提供了根据本申请的第六方面 的第十二处理去分配命令的***,其中检查标记还指示对去分配表的检查的进展。
根据本申请的第六方面的第一至第十二处理去分配命令的***之一,提供了根据本申请的第六方面的第十三处理去分配命令的***,其中CPU仅可更新其维护的去分配表,但可读取全部的去分配表。
根据本申请的第六方面的第一至第十三处理去分配命令的***之一,提供了根据本申请的第六方面的第十四处理去分配命令的***,其中IO命令访问的地址空间被分为多个区域,每个区域被映射到多个去分配表之一。
根据本申请的第六方面的第十四处理去分配命令的***,提供了根据本申请的第六方面的第十五处理去分配命令的***,其中使得来自主机的去分配命令所访问的地址空间被尽量均匀地映射给各个去分配表的方式将逻辑地址空间映射到去分配表。
根据本申请的第六方面的第十五处理去分配命令的***,提供了根据本申请的第六方面的第十六处理去分配命令的***,其中去分配表的每个表项所指示的地址区域的大小可配置。
根据本申请的第七方面,提供了根据本申请第七方面的第一处理去分配命令的方法,包括:将接收的去分配命令同时发送给多个CPU;收到去分配命令的CPU根据去分配命令指示的地址范围,获取去分配命令指示的地址范围中属于自己的去分配表的一个或多个地址,并根据获取的所述一个或多个地址更新自己维护的去分配表,以在该去分配表中记录的所述一个或多个地址被去分配。
根据本申请第七方面的第一处理去分配命令的方法,提供了根据本申请第七方面的第一处理去分配命令的方法,其中还包括:根据去分配命令指示的地址范围更新FTL表。
根据本申请第七方面的第一或第二处理去分配命令的方法,提供了根据本申请第七方面的第三处理去分配命令的方法,其中定时或周期性的对去分配表进行检查,找出被标记为去分配的第一表项,根据第一表项在FTL表中记录对应的逻辑地址被去分配,并清除去分配表中第一表项的去分配标记。
根据本申请第七方面的第一至第三处理去分配命令的方法之一,提供了根据本申请第七方面的第四处理去分配命令的方法,其中还包括根据去分配命令指示的地址范围更新大块描述符中记录的有效数据量。
根据本申请第七方面的第四处理去分配命令的方法,提供了根据本申请第七方面的第五处理去分配命令的方法,其特征在于,其中去分配表中已不存在被标记为去分配的表项时清除或重新设置与该去分配表对应的检查标记,其中检查标记指示该去分配表中是否存在至少一项条目被标记为去分配。
根据本申请第七方面的第五处理去分配命令的方法,提供了根据本申请第七方面的第六处理去分配命令的方法,其中若检查标记被清除,在处理读命令或写命令时,无须访问去分配表。
根据本申请第七方面的第五处理去分配命令的方法,提供了根据本申请第七方面的第七处理去分配命令的方法,其中若检查标记被设置,在处理读命令或写命令时,需要访问去分配表。
根据本申请的第八方面,提供了根据本申请第八方面的第一存储设备垃圾回收方法,包括如下步骤:根据大块描述符表选择待回收的大块;根据待回收大块,获取待回收的数据的地址;若去分配表须检查,依据待回收数据的地址访问去分配表,若去分配表的对应表项记录了去分配,从待回收大块中获取下一待回收数据的地址。
根据本申请第八方面的第一存储设备垃圾回收方法,提供了根据本申请第八方面的第二存储设备垃圾回收方法,若去分配表的对应表项未记录去分配,则根据待回收数据的地址查询FTL表,以识别待回收数据是否有效;若待回收数据有效,则将待回收数据写入新大块,并更新FTL表。
根据本申请第八方面的第一或第二存储设备垃圾回收方法,提供了根据本申请第八方面 的第三存储设备垃圾回收方法,根据检查标记识别去分配表是否须检查,其中检查标记指示该去分配表中是否存在至少一项条目被标记为去分配。
根据本申请第八方面的第一至第三存储设备垃圾回收方法之一,提供了根据本申请第八方面的第四存储设备垃圾回收方法,若去分配表无须检查,则通过该地址查询FTL表,得到记录的物理地址,依据记录的物理地址与待回收数据的物理地址是否一致来识别待回收数据是否有效。
根据本申请第八方面的第四存储设备垃圾回收方法,提供了根据本申请第八方面的第五存储设备垃圾回收方法,对于有效的待回收数据,将有效的待回收数据写入新的大块,以及还用新大块的物理地址更新FTL表,以在FTL表中记录有效的待回收数据的新存储位置。
根据本申请的第九方面,提供了根据本申请第九方面的第一处理去分配命令的方法,包括如下步骤:将接收的去分配命令发送给多个CPU;各个CPU从去分配命令指示的地址范围中获取属于自己维护的去分配表的一个或多个地址,并根据所述一个或多个地址更新自己维护的去分配表,以在该去分配表中记录所述一个或多个地址被去分配。
根据本申请第九方面的第一处理去分配命令的方法,提供了根据本申请第九方面的第二处理去分配命令的方法,根据接收了去分配命令,CPU更新其维护的检查标记,其中检查标记指示该去分配表中是否存在至少一项条目被标记为去分配。
根据本申请第九方面的第一或第二处理去分配命令的方法,提供了根据本申请第九方面的第三处理去分配命令的方法,CPU根据去分配命令所指示的地址范围,更新FTL表的表项;其中FTL表记录了同逻辑地址对应的物理地址。
根据本申请第九方面的第一处理去分配命令的方法,提供了根据本申请第九方面的第四处理去分配命令的方法,根据本申请第六方面的处理去分配命令的方法,为根据去分配命令所指示的地址范围更新FTL表的表项:CPU判断去分配表中是否存在至少一项条目被标记为去分配,将同获得的标记为去分配的第一表项对应的FTL表的第二表项所记录的地址标记为被去分配。
根据本申请第九方面的第三或第四处理去分配命令的方法,提供了根据本申请第九方面的第五处理去分配命令的方法,在FTL表中标记表项所记录的地址被去分配后,将去分配表的表项的去分配标记清除。
根据本申请第九方面的第三至第五处理去分配命令的方法之一,提供了根据本申请第九方面的第六处理去分配命令的方法,其中在更新FTL表前,对要更新的一个或多个地址对应的FTL表的表项加锁。
根据本申请第九方面的第三至第五处理去分配命令的方法之一,提供了根据本申请第九方面的第七处理去分配命令的方法,其中将去分配命令同时提供给多个CPU中的每一个或多个,多个CPU并行处理去分配命令。
根据本申请第九方面的第三处理去分配命令的方法,提供了根据本申请第九方面的第八处理去分配命令的方法,CPU对去分配命令中与自己维护的去分配表相关的部分进行处理。
根据本申请第九方面的第三至第七处理去分配命令的方法之一,提供了根据本申请第九方面的第九处理去分配命令的方法,其中依据IO命令访问的地址,将关联于FTL表的不同部分的IO命令分配给不同CPU处理。
根据本申请第九方面的第三至第七处理去分配命令的方法之一,提供了根据本申请第九方面的第十处理去分配命令的方法,其中将IO命令随机分配给不同CPU处理。
根据本申请第九方面的第九或第十处理去分配命令的方法,提供了根据本申请第九方面的第十一处理去分配命令的方法,其中响应于接收读命令,若去分配表的检查标记被设置,CPU访问去分配表,检查读命令所访问的地址是否被去分配。
根据本申请第九方面的第九至第十一处理去分配命令的方法之一,提供了根据本申请第九方面的第十二处理去分配命令的方法,其中响应于接收读命令,若去分配表的检查标记未被设置,CPU查询FTL表获取地址,从该地址读取数据作为对读命令的响应。
根据本申请第九方面的第一至第十二处理去分配命令的方法之一,提供了根据本申请第九方面的第十三处理去分配命令的方法,还包括:响应于要进行垃圾回收,选择待回收的大块;根据待回收大块,获取待回收的数据的地址;若去分配表须检查,依据待回收数据的地址访问去分配表,若去分配表的对应表项记录了去分配,从待回收大块中获取下一待回收数据。
根据本申请第九方面的第十三处理去分配命令的方法,提供了根据本申请第九方面的第十四处理去分配命令的方法,若去分配表的对应表项未记录去分配,则根据待回收数据的地址查询FTL表,以识别待回收数据是否有效;若待回收数据有效,则将待回收数据写入新大块,并更新FTL表。
根据本申请第九方面的第十三或第十四处理去分配命令的方法,提供了根据本申请第九方面的第十五处理去分配命令的方法,根据检查标记识别去分配表是否待检查,其中检查标记指示该去分配表中是否存在至少一项条目被标记为去分配。
根据本申请第九方面的第十五处理去分配命令的方法,提供了根据本申请第九方面的第十六处理去分配命令的方法,若去分配表无须检查,则通过该地址查询FTL表,得到记录的物理地址,依据记录的物理地址与待回收数据的物理地址是否一致来识别待回收数据是否有效。
根据本申请第九方面的第十五或第十六处理去分配命令的方法,提供了根据本申请第九方面的第十七处理去分配命令的方法,对于有效的待回收数据,将有效的待回收数据写入新的大块,以及还用新大块的物理地址更新FTL表,以在FTL表中记录有效的待回收数据的新存储位置。
根据本申请第九方面的第一至第十七处理去分配命令的方法之一,提供了根据本申请第九方面的第十八处理去分配命令的方法,其中检查标记记录检查去分配表的开始位置、当前位置与结尾位置。
根据本申请第九方面的第十八处理去分配命令的方法,提供了根据本申请第九方面的第十九处理去分配命令的方法,其中检查标记还记录下次检查去分配表的开始位置与结尾位置。
根据本申请第九方面的第一至第十九处理去分配命令的方法之一,提供了根据本申请第九方面的第二十处理去分配命令的方法,还包括:依据检查标记对去分配表进行清理。
根据本申请第九方面的第二十处理去分配命令的方法,提供了根据本申请第九方面的第二十一处理去分配命令的方法,对去分配表表进行清理包括:从检查标记中记录的去分配表的开始位置到去分配表的结尾位置,逐个检查去分配表的表项,如果表项被标记为去分配,则据此更新FTL表的对应表项,在FTL表的表项中记录去分配状态,并清除去分配表中该表项的去分配状态。
根据本申请第九方面的第二十或二十一处理去分配命令的方法,提供了根据本申请第九方面的第二十二处理去分配命令的方法,其中对去分配表进行清理期间,若接收到新的去分配命令,根据新的去分配命令更新检查标记的去分配表的开始位置、当前位置与结尾位置。
根据本申请第九方面的第二十二处理去分配命令的方法,提供了根据本申请第九方面的第二十三处理去分配命令的方法,其中更新检查标记包括:如果新的去分配命令的开始位置和结尾位置均在检查标记所记录的结尾位置之后,则将检查标记中的结尾位置更新为新的去分配命令的结尾位置。
根据本申请第九方面的第二十三处理去分配命令的方法,提供了根据本申请第九方面的第二十四处理去分配命令的方法,其中更新检查标记包括:如果新的去分配命令的开始位置在检查标记所记录的开始位置之后,结尾位置在检查标记所记录的结尾位置之前,且当前位置在新的去分配命令的开始位置之前,则无需更新检查标记。
根据本申请第九方面的第二十三处理去分配命令的方法,提供了根据本申请第九方面的第二十五处理去分配命令的方法,其中更新检查标记包括:如果新的去分配命令的开始位置 在检查标记所记录的开始位置之后,结尾位置在检查标记所记录的结尾位置之前,且当前位置在新的去分配命令的开始位置之后,则在检查标记中记录下次扫描的开始位置为新的去分配命令的开始位置,下次扫描的结尾位置为当前位置。
根据本申请第九方面的第二十三处理去分配命令的方法,提供了根据本申请第九方面的第二十六处理去分配命令的方法,其中更新检查标记包括:如果新的去分配命令的开始位置和结尾位置均在检查标记所记录的开始位置之前,且当前位置在新的去分配命令的结尾位置之后,则在检查标记中记录下次扫描的开始位置为新的去分配命令的开始位置,下次扫描的结尾位置为新的去分配命令的结尾位置。
根据本申请第九方面的第二十三处理去分配命令的方法,提供了根据本申请第九方面的第二十七处理去分配命令的方法,其中更新检查标记包括:如果新的去分配命令的开始位置在检查标记所记录的开始位置之前,结尾位置在检查标记所记录的开始位置之后,且当前位置在新的去分配命令的结尾位置之后,则在检查标记中记录下次扫描的开始位置为新的去分配命令的开始位置,下次扫描的结尾位置为新的去分配命令的结尾位置。
根据本申请第九方面的第十五处理去分配命令的方法,提供了根据本申请第九方面的第二十八处理去分配命令的方法,其中对去分配表进行检查或清理期间,若接收到一个或多个新的去分配命令,根据新的去分配命令更新检查标记的下次检查去分配表的开始位置与结尾位置。
根据本申请第九方面的第二十八处理去分配命令的方法,提供了根据本申请第九方面的第二十九处理去分配命令的方法,通过比较检查标记中记录的结尾位置与清理去分配表的当前位置来判断对去分配表的清理是否完成。
根据本申请第九方面的第二十八或二十九处理去分配命令的方法,提供了根据本申请第九方面的第三十处理去分配命令的方法,若清理去分配表的当前位置未达到检查标记的结尾位置,则对去分配表的清理未完成。
根据本申请第九方面的第二十八至三十处理去分配命令的方法之一,提供了根据本申请第九方面的第三十一处理去分配命令的方法,若对去分配表的清理的当前位置已达到结尾位置,还检查是否需要对去分配表进行再次清理。
本申请的第十方面提供一种计算机程序,当被载入存储设备并在存储设备的控制部件上执行时,计算机程序包括的计算机程序代码使控制部件执行根据本申请第一方面至第九方面的方法之一。
附图说明
为了更清楚地说明本申请实施例或现有技术中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为现有技术的固态存储设备的框图;
图2示出了大块的示意图;
图3A是根据本申请实施例的处理去分配命令前的FTL表的部分的示意图;
图3B是根据本申请实施例的处理去分配命令后的FTL表的部分的示意图。
图4A为本申请实施例中处理去分配命令前的去分配表的示意图;
图4B为本申请实施例中处理去分配命令后的去分配表的示意图;
图5A为本申请实施例的处理去分配命令的方法的流程图;
图5B为本申请实施例的响应读命令的方法的流程图;
图5C为本申请实施例的响应写命令的方法的流程图;
图6A是根据本申请又一实施例的控制部件的框图;
图6B为根据图6A的实施例的响应读命令的方法的流程图;
图6C为根据图6A的实施例的响应写命令的方法的流程图;
图7A展示了根据本申请又一实施例的去分表的状态的状态转换图;
图7B是根据本申请图6A所示的实施例的处理去分配命令的流程图;
图7C是根据本申请图6A所示的实施例的扫描去分配表的流程图;
图8是根据本申请另一实施例的控制部件的框图;
图9是根据本申请实施例的处理日志的***的示意图;
图10A是根据本申请实施例的日志条目缓存的示意图;
图10B展示了根据本申请实施例的压缩去分配表日志的流程图;
图10C是根据本申请又一实施例的日志条目缓存的示意图;
图11为根据本申请再一实施例的控制部件的框图;
图12为根据本申请再一实施例的IO命令访问的逻辑地址与去分配表的映射的示意图;
图13为根据本申请再一实施例的处理去分配命令的流程图;
图14为根据本申请再一实施例的根据去分配表更新FTL表的流程图;
图15为根据本申请另一实施例的垃圾回收过程的流程图;
图16A-图16E为根据本申请另一实施例的去分配表与检查标记的对应关系的示意图;
图17为根据本申请另一实施例的处理去分配命令的示意图;
图18为根据本申请另一实施例的根据去分配表更新FTL表的流程图;以及
图19为根据本申请另一实施例的控制部件的框图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
图3A是根据本申请实施例的处理去分配命令前的FTL表的部分的示意图。图3B是根据本申请实施例的处理去分配命令后的FTL表的部分的示意图。
参看图3A,FTL表中记录了同逻辑地址范围0-7(分别记为LBA 0到LBA 7)对应的物理地址(记为PPA a-b),其中“PPA”指示物理地址,“a”指示物理块,而“b”指示物理页。以物理块1为例,其物理地址为“PPA 1-4”的物理页中存储了逻辑地址为“LBA 0”的数据,而物理地址为“PPA 1-10”的物理页中存储了逻辑地址为“LBA”的数据。依然作为举例,
根据本申请的实施例,为执行去分配操作,将FTL表中,由去分配命令所描述的逻辑地址范围所指示的FTL表项设置为特殊标记(例如,0或其他值)。例如,去分配命令指示的逻辑地址范围包括0-7与100-103。为执行该去分配命令,将FTL表中记录了逻辑地址0-7以及100-103的条目的内容设置为0。参看图3B,FTL表中同逻辑地址范围0-3(分别记为LBA 0到LBA 3)对应的物理地址变为0,而同逻辑地址范围4-7(分别记为LBA 4到LBA 7)对应的逻辑地址相对于图3A未被改变。
从而接下来要读取逻辑地址LBA 0-7或LBA 100-103中的一个或多个时,在FTL表中查询到这些逻辑地址对应的物理地址为0(含义为特殊标记),从而以符合去分配命令的指定效果的结果(例如,全0)作为对读命令的响应。可以理解的是,去分配命令所指示的逻辑地址范围可以与FTL表的表项的具有不同的单位大小。例如,去分配命令中,一个逻辑地址对应512字节的存储空间,而在FTL表中,一个表项对应4KB的存储空间。
根据本申请的实施例,为高效处理去分配命令,还维护去分配表。图4A为本申请实施例中处理去分配命令前的去分配表的示意图;图4B为本申请实施例中处理去分配命令后的去分配表的示意图。去分配表中,存储对应于FTL表中的每个逻辑地址是否被去分配的信 息。作为举例,在去分配表中为FTL表的每个逻辑地址提供1比特的存储空间。
如图4A所示,当逻辑地址已经被分配(即在FTL表中该逻辑地址具有有效的物理地址)(也参看图3A),在去分配表中,将该逻辑地址标记为“被分配”(例如,将对应的1比特存储空间设置为0)。当逻辑地址未被分配或者已经被应用了去分配命令时(也参看图3B),该逻辑地址在去分配表中被标记为“去分配”(例如,将对应的1比特存储空间设置为1)。
示例性地,在与图3A所示的处理去分配命令前的FTL表对应的图4A所示的去分配表中,FTL表中逻辑地址LBA 0-LBA 7都被分配了有效的物理地址。因而在如图4A所示的去分配表中,LBA 0-LBA 7都被标记为“被分配”(对应的1比特存储空间均设置为0)。响应于收到去分配命令,对LBA 0-LBA 3的逻辑地址范围执行去分配命令,FTL表的部分如图3B所示,而去分配表的部分如图4B所示。去分配表中LBA 0-LBA 3都被标记为“去分配”(对应的1比特存储空间均设置为1),而去分配表中LBA 4-LBA 7依然被标记为“被分配”(对应的1比特存储空间均设置为0)。
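
The paragraphs above describe a de-allocation table that spends one bit per logical address (0 = allocated, 1 = de-allocated). A minimal C sketch of such a bitmap is shown below; the array size and helper names are illustrative assumptions, not part of the application.

```c
#include <stdint.h>

#define LOGICAL_PAGES (1u << 20)   /* illustrative capacity: 1M logical pages */

/* The de-allocation table: one bit per logical page, 0 = allocated, 1 = de-allocated. */
static uint8_t dealloc_bits[LOGICAL_PAGES / 8];

static inline void mark_deallocated(uint32_t lba) { dealloc_bits[lba >> 3] |=  (uint8_t)(1u << (lba & 7)); }
static inline void mark_allocated(uint32_t lba)   { dealloc_bits[lba >> 3] &= (uint8_t)~(1u << (lba & 7)); }
static inline int  is_deallocated(uint32_t lba)   { return (dealloc_bits[lba >> 3] >> (lba & 7)) & 1; }
```
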
示例性地,根据本申请实施例的去分配表存储在控制部件104(参看图1)的内部存储器或DRAM 110中。可选地,通过DMA操作更新去分配表。
进一步地,固态存储断电时,还将去FTL表与分配表写入NVM,从而在异常掉电后重新启动时,能够从NVM中恢复掉电时的FTL表与去分配表。
图5A为本申请实施例的处理去分配命令的方法的流程图。响应于收到去分配命令(510),获取去分配命令所指示的逻辑地址范围(512)。例如,去分配命令指示要对LBA 0–LBA 3的逻辑地址范围执行去分配。根据去分配命令所指示的逻辑地址范围,更新去分配表(参看图4A与图4B)的表项(514),例如将图4A所示的去分配表中与逻辑地址LBA 0-LBA 3对应的表项标记为“去分配”(具有1值)(如图4B所示)。在更新了去分配表后,即可向主机指示去分配命令执行完成。从而去分配命令的执行速度得到了极大的提升。还依据去分配命令所指示的逻辑地址范围,更新FTL表的表项(518)。例如,将去分配命令所指示的一个或多个逻辑地址对应的FTL表项清零,或设置为指定值(参看图3B)。可选地,在更新FTL表前,还对要更新的一个或多个逻辑地址对应的FTL表的表项加锁,从而避免FTL表项被更新期间,其他任务读取这些FTL表项。以及在更新FTL表后,还对被更新的一个或多个逻辑地址对应的FTL表的表项解锁。
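
The flow of Fig. 5A can be condensed into a short sketch: the de-allocation table is updated first, the command is then completed to the host, and the affected FTL entries are afterwards set to the special marker under a per-entry lock. The code below is a minimal sketch under those assumptions; `ftl_lock`, `ftl_unlock`, `ack_host` and the 32-bit FTL entry type are illustrative, and `mark_deallocated` is the bitmap helper sketched earlier.

```c
#include <stdint.h>

#define FTL_DEALLOCATED 0u        /* special marker meaning "no valid physical address" */

extern uint32_t ftl[];            /* FTL table: logical address -> physical address */
void mark_deallocated(uint32_t lba);  /* as in the bitmap sketch above */
void ftl_lock(uint32_t lba);          /* assumed per-entry locking primitives */
void ftl_unlock(uint32_t lba);
void ack_host(void);                  /* report command completion to the host */

void handle_deallocate(uint32_t first_lba, uint32_t count)
{
    /* Steps 512/514: record the range in the de-allocation table. */
    for (uint32_t lba = first_lba; lba < first_lba + count; lba++)
        mark_deallocated(lba);

    /* The command can be reported complete once the de-allocation table is updated. */
    ack_host();

    /* Step 518: point the affected FTL entries at the special marker. */
    for (uint32_t lba = first_lba; lba < first_lba + count; lba++) {
        ftl_lock(lba);
        ftl[lba] = FTL_DEALLOCATED;
        ftl_unlock(lba);
    }
}
```
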
图5B为本申请实施例的响应读命令的方法的流程图。主机读取被去分配的逻辑地址时,应当收到诸如全0的指定指示。
参看图5B,响应于收到读命令(530),查询去分配表(也参看图4B),判断读命令所读取的逻辑地址是否被去分配(532)。若去分配表指示所读取的逻辑地址处于去分配状态,则以全0或者其他指定的结果作为对读命令的响应(534)。在步骤532,若去分配表指示所读取的逻辑地址已被分配时,查询FTL表获得要读取的逻辑地址对应的物理地址(536),并从获得的物理地址读取数据作为对读命令的响应(538)。
可选地,查询去分配表将增加读操作的延迟,对于访问已被分配的逻辑地址的读命令,查询去分配表是没有积极意义的。为此,还记录固态存储设备正在执行去分配命令的状态。响应于收到去分配命令(510),标记存储设备正在执行去分配命令。在此情况下,若收到读命令,则首先查询去分配表(参看图5B中532)。而当去分配命令执行完成后,例如在图5A所示的步骤518执行完成后,标记存储设备已经完成对去分配命令的执行。在此情况下,若收到读命令,则不必执行图5B中步骤532,而直接执行图5B中步骤536。
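
A corresponding sketch of the read path of Fig. 5B, including the optimization of skipping the de-allocation table lookup while no de-allocation command is outstanding, might look as follows; `dealloc_pending`, `read_flash` and `fill_zero` are illustrative names, not from the application.

```c
#include <stdbool.h>
#include <stdint.h>

extern bool     dealloc_pending;      /* set while a de-allocation command is being executed */
extern uint32_t ftl[];                /* logical address -> physical address */
int  is_deallocated(uint32_t lba);    /* bitmap helper as sketched earlier */
void read_flash(uint32_t ppa, void *buf);
void fill_zero(void *buf);

void handle_read(uint32_t lba, void *buf)
{
    if (dealloc_pending && is_deallocated(lba)) {
        fill_zero(buf);               /* step 534: all zeros (or another specified result) */
        return;
    }
    read_flash(ftl[lba], buf);        /* steps 536/538: FTL lookup, then read from NAND */
}
```
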
图5C为本申请实施例的响应写命令的方法的流程图。
对于未被写入数据的固态存储设备,其去分配表中指示所有的逻辑地址均处于去分配状态。响应于逻辑地址被写入数据,去分配表的与被写入数据的逻辑地址对应的表项被修改为已分配状态。而响应于执行去分配命令,被去分配的逻辑地址在去分配表中的表项被再次修改为去分配状态。
参看图5C,当收到写命令(540),为写命令分配物理地址,并用写命令所指示的逻辑地 址与分配的物理地址更新FTL表(542)。向被分配的物理地址写入数据,并向主机反馈写命令处理完成(544)。可选地,为降低写命令处理延迟,在步骤542之后,在将数据写入物理地址之前,即向主机反馈写命令处理完成。还更新去分配表,将被写入的逻辑地址在去分配表中的表项设置为已分配(548)。图12C中,步骤542、步骤544与步骤548的顺序可以调整,也可以并行或同时被执行。优选地,步骤544与步骤548发生在步骤542之后。
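
The write path of Fig. 5C can be sketched in the same style: allocate a physical address, update the FTL table and the de-allocation table, and optionally complete the command to the host before the data actually reaches the NAND. `allocate_ppa`, `program_flash` and `ack_host` are assumed helpers.

```c
#include <stdbool.h>
#include <stdint.h>

extern bool     dealloc_pending;
extern uint32_t ftl[];
uint32_t allocate_ppa(void);               /* pick a new physical page for the data */
void     program_flash(uint32_t ppa, const void *buf);
void     mark_allocated(uint32_t lba);     /* clear the "de-allocated" bit */
void     ack_host(void);

void handle_write(uint32_t lba, const void *buf)
{
    uint32_t ppa = allocate_ppa();
    ftl[lba] = ppa;                        /* step 542: record the new mapping */
    if (dealloc_pending)
        mark_allocated(lba);               /* step 548: the address is allocated again */
    ack_host();                            /* optionally completed before the NAND write */
    program_flash(ppa, buf);               /* step 544: write the data */
}
```
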
图6A是根据本申请又一实施例的控制部件的框图。图6A中示出的控制部件104包括主机接口610、分配器630、用于FTL任务的多个CPU(CPU0与CPU1)和用于访问NVM芯片105的介质接口620。
主机接口610用于同主机交换命令与数据。在一个例子中,主机与存储设备通过NVMe/PCIe协议通信,主机接口610处理PCIe协议数据包,提取出NVMe协议命令,并向主机返回NVMe协议命令的处理结果。
分配器630耦合到主机接口610,用于接收主机发送给存储设备的IO命令,并将IO命令分配给用于处理FTL任务的多个CPU之一。分配器630可由CPU或专用硬件实现。
控制部件104还耦合到外部存储器(例如,DRAM)110。参看图6A,外部存储器110存储FTL表0、FTL表1、去分配表0与去分配表1。用于处理FTL任务的多个CPU通过使用FTL表与去分配表来处理FTL任务。
对于写命令,在CPU的指示下,为要写入数据的逻辑地址分配新的物理地址,在FTL中记录逻辑地址与物理地址的映射关系,还更新去分配表。对于读命令,CPU访问FTL表和/或去分配表,获得同读命令的逻辑地址对应的物理地址,从物理地址读出数据。
根据图6A的实施例,FTL表被分为两部分(FTL表1与FTL表2)。CPU 0对应于FTL表0,而CPU 1对应于FTL表1。例如,按逻辑地址为奇数或偶数,确定逻辑地址位于FTL表0还是FTL 1。又例如,逻辑地址空间的前一半逻辑地址位于FTL表0,而后一半地址位于FTL表1。也有多种其他方式将FTL表划分为FTL表0与FTL表1。去分配表0同FTL表0对应,记录了FTL表0中的逻辑地址是否被去分配。去分配表1同FTL表1对应,记录了FTL表1中的逻辑地址是否被去分配。CPU 0对应于去分配表0,而CPU 1对应于去分配表1。从而,关联于FTL表的不同部分的IO命令,可由不同CPU处理。由分配器630根据IO命令访问的逻辑地址将IO命令分配给CPU 0与CPU 1之一。CPU 0与CPU 1并行处理IO命令。
可选地,CPU 0与CPU 1的每个,都可访问所有的FTL表和/或去分配表。从而,对于任何IO命令,分配器可将其分配给任何一个CPU处理。
根据图6A的实施例,CPU还为其对应的去分配表维护状态。CPU 0维护状态0,而CPU维护状态1。状态0指示去分配表0的状态,而状态1指示去分配表1的状态。去分配表的状态,至少指示了去分配表中是否存在至少一项条目被标记为“去分配”。从而,作为举例,对于去分配表0,状态0仅用一比特信息,指示去分配表中是否存在至少一项条目被标记为“去分配”;对于去分配表1,状态1仅用一比特信息,指示去分配表中是否存在至少一项条目被标记为“去分配”。从而状态0与状态1所需的存储空间极小,可存储在CPU内部的寄存器或存储器中。
图6B为根据图6A的实施例的响应读命令的方法的流程图。
参看图6B,响应于收到读命令(630),作为举例,分配器将其分配给CPU 0(也参看图6A)处理。CPU 0首先查询状态0以确定去分配表0中是否有至少一项记录被标记为“去分配”(631)。若状态0指示去分配表0中没有任何的记录被标记为“去分配”,CPU 0查询FTL表0获得要读取的逻辑地址对应的物理地址(636),并从获得的物理地址读取数据作为对读命令的响应(538),而无须访问去分配表0。若状态0指示去分配表0中有至少一个记录被标记为“去分配”(631),CPU 0查询去分配表0(也参看图4B),判断读命令所读取的逻辑地址是否被去分配(632)。若去分配表0指示所读取的逻辑地址处于去分配状态,则以全0或者其他指定的结果作为对读命令的响应(634)。在步骤632,若去分配表指示所 读取的逻辑地址已被分配时,查询FTL 0表获得要读取的逻辑地址对应的物理地址(636),并从获得的物理地址读取数据作为对读命令的响应(638)。
CPU 0还维护状态0。CPU 0通过扫描去分配表0,若发现去分配表0中没有任何记录被标记为“去分配”,则相应地设置状态0。以及响应于执行去分配命令而更新去分配表0,将状态0设置为指示去分配表0中有至少一项记录被标记为“去分配”。类似地,CPU 1维护状态1。
图6C为根据图6A的实施例的响应写命令的方法的流程图。
参看图6C,响应于收到读命令(660),作为举例,分配器将其分配给CPU 0(也参看图6A)处理。CPU 0为写命令分配物理地址(665),用于承载写命令要写入的数据。CPU0查询状态0以确定去分配表0中是否有至少一项记录被标记为“去分配”(670)。若状态0指示去分配表0中没有任何的记录被标记为“去分配”,CPU 0根据写命令的逻辑地址与步骤665为写命令分配的物理地址,更新FTL表(例如,FTL表0)(680),在FTL表中记录写命令的逻辑地址与为其分配的物理地址的对应关系。以及还指示NVM芯片将写命令的数据写入所分配的物理地址(690)。可选地,在步骤690之前,即生成对写命令处理完成的指示,并发送给写命令的发出方。
在步骤670,若状态0指示去分配表0中存在被标记为“去分配”的条目,则还根据写命令的逻辑地址更新去分配表0的条目(675),使得去分配表中同写命令的逻辑地址对应的条目被设置为“已分配”或清除该条目的“去分配”标识。可选地,若步骤670需要被执行,则步骤670与步骤675的执行顺序不做限定。
图7A展示了根据本申请又一实施例的去分表的状态的状态转换图。
去分配表的状态包括“待扫描”状态、“扫描中”状态、“扫描完成”状态与“无记录”状态。因而,状态0和/或状态1包括至少2比特指示去分配表的四种状态。
分配器(参看图6A)将访问FTL 0对应的逻辑地址空间的去分配命令,分配给CPU 0处理,而将访问FTL 1对应的逻辑地址空间的去分配命令,分配给CPU 1处理。
以CPU 0为例,响应于为去分配表0处理了去分配命令,CPU 0将状态0(也参看图6A)设置为“待扫描”状态,而无论状态0之前处于何种状态。由于执行了去分配命令,去分配表0中的部分条目被设置为“去分配”。而状态0被设置为“待扫描”状态,也指示了去分配表中存在至少一项记录被标记为“去分配”(也参看图6,步骤631)。可选地,在状态0处于“待扫描”、“扫描中”或“扫描完成”状态时,都指示了去分配表0中存在至少一项记录被标记为“去分配”(也参看图6,步骤631),而状态0处于“无记录”状态时,指示去分配表0中没有任何的记录被标记为“去分配”(也参看图6,步骤631)。
可选地或进一步地,响应于为去分配表0处理了去分配命令,CPU 0还将状态1(也参看图6A)设置为“待扫描”状态,或者请求CPU 1将状态1(也参看图6A)设置为“待扫描”状态。以及,类似地,响应于为去分配表1处理了去分配命令或状态1被设置为“待扫描”状态,将状态0设置为“待扫描”状态。
在一些条件下,CPU 0开始清理去分配表0,根据去分配表0中的被设置为“去分配”的条目,将FTL表0中对应的条目的物理地址设置为指定值(例如,0)。例如,CPU 0在没有要处理的IO命令的空闲期间,清理去分配表0。或者,响应于主机的指示,CPU 0开始清理去分配表0。响应于开始清理去分配表0,CPU 0还将处于“待扫描”状态的状态0设置为“扫描中”状态。
响应于CPU 0完成对去分配表0的清理,CPU 0将处于“扫描中”状态的状态0设置为“扫描完成”状态。为完成对去分配表的清理,CPU 0将去分配表0中被设置为“去分配”的条目对应的FTL表0的条目的物理地址都设置为指定值,以及相应地将去分配表0中被设置为“去分配”的条目修改为清除“去分配”标记,以指示在对应的FTL表0的条目中记录的是对应的逻辑地址同物理地址的真实映射关系。
若状态0处于“扫描完成”状态,CPU 0还获取状态1所指示的状态。若状态1处于 “扫描完成”或“无记录”状态,则CPU 0将状态0设置为“无记录”状态。
在一种实施方式中,若状态0处于“扫描完成”状态,CPU 0访问或向CPU 1请求状态1的值。在又一种实施方式中,若状态0进入“扫描完成”状态,CPU 0向CPU 1通知状态的状态变化结果。在依然又一种实施方式中,每当状态0发生状态变化,CPU 0向CPU1通知状态的状态变化结果。
进一步地,响应于状态0变为“无记录”状态,若状态1处于“扫描完成”状态,还将状态1设置为“无记录”状态。
根据本申请的实施例,CPU 1以同CPU 0类似的方式维护状态1。
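
The four states of Fig. 7A (to-be-scanned, scanning, scan-complete, no-record) and the transitions described above can be captured in a small sketch. The enum and function names are illustrative; `self` and `peer` index the two states maintained by CPU 0 and CPU 1.

```c
typedef enum { TO_SCAN, SCANNING, SCAN_DONE, NO_RECORD } dealloc_state_t;

/* state[0] is maintained by CPU 0 for de-allocation table 0,
 * state[1] by CPU 1 for de-allocation table 1. */
static dealloc_state_t state[2] = { NO_RECORD, NO_RECORD };

/* A de-allocation command for this table always forces the to-be-scanned state. */
void on_dealloc_command(int self) { state[self] = TO_SCAN; }

/* Cleaning of the table has started. */
void on_scan_started(int self) { state[self] = SCANNING; }

/* Cleaning of the table has finished; drop to no-record only when the peer
 * table is also clean, and pull a scan-complete peer down to no-record as well. */
void on_scan_finished(int self, int peer)
{
    state[self] = SCAN_DONE;
    if (state[peer] == SCAN_DONE || state[peer] == NO_RECORD) {
        state[self] = NO_RECORD;
        if (state[peer] == SCAN_DONE)
            state[peer] = NO_RECORD;
    }
}
```
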
图7B是根据本申请图6A所示的实施例的处理去分配命令的流程图。
响应于收到去分配命令(710),分配器630将去分配命令分配给CPU之一(例如,CPU 0)。分配器630根据去分配命令指示的逻辑地址范围,选择负责相应逻辑地址范围的CPU处理去分配命令(720),将去分配命令发送给选择的CPU(例如,CPU 0)。被选择的CPU(CPU 0)根据去分配命令指示的逻辑地址范围,将去分配表对应的一个或多个条目设置为“去分配”(730)。至此向主机指示去分配命令处理完成。以及将状态0设置为处于“待扫描”状态。
图7C是根据本申请图6A所示的实施例的扫描去分配表的流程图。
当状态0处于“待扫描”状态,当没有待处理的IO命令,或者响应于主机的指示,CPU(以CPU 0为例)发起对所负责的去分配表0的扫描(740)。CPU 0遍历去分配表0,对于去分配表中每个指示“去分配”的条目,获取其逻辑地址,将对应FTL表(FTL 0)中同逻辑地址对应的表项的物理地址修改为指示逻辑地址被“去分配”的指定值(750)。CPU 0还将去分配表0中指示“去分配”的条目更新为清除“去分配”标记。可选地,响应于开始扫描去分配表0,CPU 0还将状态0设置为“扫描中”状态。
当CPU 0将去分配表0中的所有“去分配”标记都被清除,意味着去分配表0扫描完成。响应于扫描去分配表0完成,CPU 0将状态0设置为“扫描完成”状态(760)。若在扫描去分配表0的过程中,CPU 0收到了去分配命令,则停止扫描,以及将状态0更新为“待扫描”状态。
响应于状态0被设置为“扫描完成”状态,CPU 0还获取状态0所指示的状态。若此时状态1处于“扫描完成”或“无记录”状态,则CPU 0将状态0设置为“无记录”状态(770)。以及若状态1处于“扫描完成”,CPU 0将状态0设置为“无记录”状态,还使得状态被设置为“无记录”状态。对状态1的维护,由例如CPU 1进行。
图8是根据本申请另一实施例的控制部件的框图。图8中示出的控制部件104包括主机接口810、分配器、用于FTL任务的多个CPU(CPU0、CPU1、CPU2与CPU3)和用于访问NVM芯片105的介质接口820。
分配器用于接收主机发送给存储设备的命令,并将命令分配给用于处理FTL任务的多个CPU之一。
控制部件104还耦合到外部存储器(例如,DRAM)110。参看图8,外部存储器110存储FTL表0、FTL表1、FTL表2、FTL表3、去分配表0、去分配表1、去分配表2与分配表3。用于处理FTL任务的多个CPU通过使用FTL表与去分配表来处理FTL任务。
根据图8的实施例,FTL表被分为四部分(FTL表1、FTL表2、FTL表3与FTL表4)。CPU 0对应于FTL表0,CPU 1对应于FTL表1,CPU 2对应于FTL表2,而CPU 3对应于FTL表3。例如,按逻辑地址被4除的余数确定逻辑地址位于哪个FTL表。又例如,逻辑地址空间的前1/4逻辑地址位于FTL表0,接下来的1/4逻辑地址位于FTL表1,以此类推。也有多种其他方式划分FTL表。去分配表0同FTL表0对应,记录了FTL表0中的逻辑地址是否被去分配。去分配表1同FTL表1对应,记录了FTL表1中的逻辑地址是否被去分配。去分配表2同FTL表2对应,记录了FTL表2中的逻辑地址是否被去分配。去分配表3同FTL表3对应,记录了FTL表3中的逻辑地址是否被去分配。
CPU 0对应于去分配表0,CPU 1对应于去分配表1,CPU 2对应于去分配表2,CPU3对应于去分配表3。从而,关联于FTL表的不同部分的命令,可由不同CPU处理。由分配器根据命令访问的逻辑地址将命令分配给CPU 0、CPU1、CPU2与CPU 3之一。多个CPU并行处理命令。
可选地,每个CPU,都可访问所有的FTL表和/或去分配表。从而,对于任何IO命令,分配器可将其分配给任何一个CPU处理。
根据图8的实施例,CPU还为其对应的去分配表维护状态。CPU 0维护状态0,CPU维护状态1,CPU 2维护状态2,CPU 3维护状态3。状态0指示去分配表0的状态,状态1指示去分配表1的状态,状态2指示去分配表2的状态,状态3指示去分配表3的状态。去分配表的状态,至少指示了所有去分配表中是否存在至少一项条目被标记为“去分配”。
根据图8的实施例,各个CPU按照图6A、图7A与图7B所展示的方式,维护和使用状态0、状态1、状态2与状态3。
进一步地,CPU每次更新自己所维护的状态,将对状态的更新广播或通知给其他CPU,使得任何CPU能够知晓其他CPU所维护的状态的最新值。CPU还根据收到的其他CPU维护的状态的更新,更新自身维护的状态。以及在状态0处于“扫描完成”状态时,只有CPU 0识别出所有其他CPU维护的状态均处于“扫描完成”或“无记录”状态,才将自己维护的状态0设置为“无记录”状态。以及,若状态0处于“无记录”状态,CPU 0收到去分配命令,或识别出任何其他CPU维护的状态进入“待扫描”或“扫描中”状态,则将状态0设置为“待扫描”状态。
图9是根据本申请实施例的处理日志的***的示意图。
为提高固态存储设备的可靠性,固态存储设备对其重要元数据(例如,FTL表、去分配表等)的任何更新记录日志。参看图9,多个CPU的任何一个,响应于对例如FTL表和/或去分配表的任何更新,生成日志(称为FTL表日志或去分配表日志)条目。以去分配表日志为例,日志条目记录了被更新的去分配表条目的索引,以及该去分配表条目的更新内容。可选地,在日志条目中还记录该日志的生成者或来源(称为日志标识符)。日志标识符指示例如CPU 0。CPU 0将日志条目发送给固态存储设备的日志服务。可以由控制部件的多个CPU的一个或多个提供日志服务。日志服务将多个日志条目生成数据块并写入NVM芯片。
去分配命令通常访问大段的逻辑地址空间,并导致产生大量的日志,影响固态存储设备的性能。根据本申请的实施例,对去分配表日志进行压缩,以降低写入NVM芯片的日志数据量。
图10A是根据本申请实施例的日志条目缓存的示意图。
日志服务部件维护多个日志条目缓存。例如,日志条目缓存的数量同日志标识符的取值集合大小一致。图10A中,展示了4个日志条目缓存,分别对应于控制部件的4个会产生去分配表日志的CPU。日志条目缓存指示日志标识符(用于指示日志来源,例如,CPU)。日志条目缓存还记录去分配表索引与可选地计数值。去分配表索引指示了该日志条目来源于去分配表的那个条目,也指示了作为去日志条目来源的分配表条目所对应的逻辑地址。日志服务部件根据日志条目缓存压缩去分配表日志。
图10B展示了根据本申请实施例的压缩去分配表日志的流程图。
日志服务部件获取来自CPU的去分配表日志(1010),获取的去分配表日志的来源或日志标识符(例如,CPU 0),以及根据获取的去分配表日志指示的去分配表索引(1020)。识别去分配表日志是否可压缩(1030)。例如,根据日志标识符访问日志条目缓存,并比较日志条目缓存记录的去分配表索引,与从收到的去分配表日志中获取的去分配表索引是否连续。若缓存的去分配表索引与收到的去分配表索引连续,则识别为日志可被压缩,并在访问的日志条目缓存中,将去分配表索引更新为新接收的值。可选地,还使日志条目缓存的计数值递增。依然可选地,若计数值大于阈值,根据日志条目缓存生成要写入NVM芯片的日志数据,以避免被缓存的日志过多。若缓存的去分配表索引与收到的去分配表索引不连续,则 识别为日志不可被压缩,则根据日志条目缓存生成要写入NVM芯片的日志数据,以及根据收到的去分配表日志更新日志条目缓存。
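
The compression flow of Figs. 10A/10B can be sketched as follows: one cache slot per log source holds the last de-allocation table index and a run count; a contiguous index extends the run, a non-contiguous one flushes it. The cache layout, threshold and function names are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOG_SOURCES     4
#define FLUSH_THRESHOLD 1024

typedef struct {
    bool     valid;
    uint32_t last_index;   /* de-allocation table index last seen from this source */
    uint32_t count;        /* how many contiguous entries have been merged */
} log_cache_t;

static log_cache_t cache[LOG_SOURCES];
void flush_log(int src, uint32_t first_index, uint32_t count);  /* write log data to NVM */

void on_dealloc_log(int src, uint32_t index)
{
    log_cache_t *c = &cache[src];
    if (c->valid && index == c->last_index + 1) {        /* contiguous: compressible */
        c->last_index = index;
        if (++c->count >= FLUSH_THRESHOLD) {             /* avoid caching too many logs */
            flush_log(src, c->last_index - c->count + 1, c->count);
            c->valid = false;
        }
        return;
    }
    if (c->valid)                                        /* not compressible: flush the run */
        flush_log(src, c->last_index - c->count + 1, c->count);
    c->valid = true;                                     /* start a new run with this entry */
    c->last_index = index;
    c->count = 1;
}
```
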
图10C是根据本申请又一实施例的日志条目缓存的示意图。
日志条目缓存有多个条目,每个条目记录了去分配表索引与计数值。日志服务部件利用日志条目缓存识别连续的去分配表日志。响应于接收到去分配表日志,日志服务部件识别收到的去分配表日志是否命中日志条目缓存。命中日志条目缓存,意味着接收的去分配表日志对应的去分配表索引同日志条目缓存的某条目记录的去分配表索引相连续或相同。在此情况下,更新被命中的日志条目缓存条目的计数值。若计数值大于阈值,根据日志条目缓存生成要写入NVM芯片的日志数据,以避免被缓存的日志过多。若接收的去分配表日志未命中日志条目缓存,且日志条目缓存中还有空余条目,则在空余条目中记录接收的去分配表日志,而无须向NVM芯片写入数据。若接收的去分配表日志未命中日志条目缓存,且日志条目缓存中没有空余条目,则根据日志条目缓存和/或接收的去分配表条目生成要写入NVM芯片的日志数据。以及可选地根据收到的去分配表日志更新日志条目缓存。
图11是根据本申请再一实施例的控制部件的框图。图11中示出的控制部件104包括主机接口1110、分配器1130、多个CPU(CPU0与CPU1)和用于访问NVM芯片105的介质接口1120。
主机接口1110用于同主机交换命令与数据。分配器1130耦合到主机接口1110,用于接收主机发送给存储设备的IO命令,并将IO命令分配给多个CPU之一。对于去分配命令,分配器1130将去分配命令同时发送给多个CPU的每个,使得多个CPU协同处理同一去分配命令,以进一步加快对去分配命令的处理。
控制部件104还耦合到外部存储器(例如,DRAM)110。外部存储器110存储FTL表、去分配表0与去分配表1。多个CPU通过使用FTL表与去分配表来处理FTL任务。
根据图11的实施例,将IO命令分配个多个CPU的每个,从而多个CPU并行处理多个IO命令。例如,根据IO命令访问的逻辑地址为奇数或偶数,将IO命令分配给CPU 0或CPU 1。又例如,将访问前一半逻辑地址空间的IO命令分配给CPU 0,而将访问后一半逻辑地址空间的IO命令分配给CPU 1。再又一个例子中,将IO命令随机或轮流地分配给CPU 0或CPU 1,而不考虑IO命令所访问的逻辑地址。
去分配表被分为两部分(去分配表0与去分配表1)。CPU 0维护去分配表0而CPU 1维护去分配表1。例如,按逻辑地址为奇数或偶数,确定逻辑地址对应的条目位于去分配表0还是去分配表1。可选地,控制部件有更多数量的CPU来处理FTL任务,每个CPU维护各自的去分配表。例如,将逻辑地址按大小依次分配给各去分配表来维护。例如,有n个去分配表(n为正整数),将逻辑地址对n取模的结果,作为维护该逻辑地址的去分配表的索引。
返回参看图11,对于去分配命令,分配器1130将去分配命令同时提供给CPU 0与CPU 1。CPU 0对去分配命令中由去分配表0维护的部分进行处理,而CPU 1对去分配命令中由去分配表1维护的部分进行处理。从而CPU 0与CPU 1同时处理同一去分配命令,加快了去分配命令的处理过程。例如,去分配表0中维护值为偶数的逻辑地址的去分配表条目,去分配表1中维护值为奇数的逻辑地址的去分配表条目。
关联于FTL表的不同部分的IO命令,由不同CPU处理。由分配器1130根据IO命令访问的逻辑地址将IO命令分配给CPU 0与CPU 1之一。CPU 0与CPU 1并行处理多个IO命令。
对于写命令,分配器1130根据写命令访问的逻辑地址将写命令分配给CPU 0或CPU 1。例如,CPU 0所对应的去分配表0维护偶数逻辑地址的去分配表条目,则将访问偶数逻辑地址的读命令或写命令也分配给CPU 0。作为又一个例子,将读命令或写命令随机地分配给CPU 0或CPU 1。作为依然又一个例子,将读命令或写命令轮流地分配给CPU 0或CPU 1。作为再一个例子,根据CPU 0或CPU 1的负载,将读命令或写命令分配给CPU 0或C PU 1。
在例如CPU 0的指示下,为写命令指示的要写入数据的逻辑地址分配新的物理地址,在FTL中记录逻辑地址与物理地址的映射关系。
对于读命令,分配器1130将读命令分配给CPU 0或CPU 1。在例如CPU 0的指示下,访问FTL表,获得同读命令的逻辑地址对应的物理地址,并从物理地址读出数据。
去分配表临时地记录逻辑地址处于“去分配”状态。CPU 0或CPU 1还对去分配表进行检查,根据记录了“去分配”状态的去分配表表项,更新FTL表的对应表项,在FTL表项中记录“去分配”状态。响应于处理去分配命令而更新了去分配表,CPU 0或CPU 1在空闲时或周期性地检查去分配表。还记录用于指示去分配表待检查或检查尚未完成的检查标记。参看图6,CPU 0维护的检查标记0指示去分配表0是否需要检查或者检查尚未完成,CPU 1维护的检查标记1指示去分配表1是否需要检查或检查尚未完成。
检查标记0与检查标记1,至少指示了各自对应的去分配表中是否存在至少一项条目被标记为“去分配”。从而,作为举例,对于去分配表0,检查标记0仅用一比特信息,指示去分配表0中是否存在至少一项条目被标记为“去分配”;对于去分配表1,检查标记1仅用一比特信息,指示去分配表1中是否存在至少一项条目被标记为“去分配”。从而检查标记0与检查标记1所需的存储空间极小,可存储在CPU内部的寄存器或存储器中。可选地,连同检查标记还存储去分配表的描述符,用于指示对去分配表的检查的进展。后面将对其进行详细介绍。
在依然可选的实施方式中,CPU 0可更新去分配表0,不可更新去分配表1,CPU 1可更新去分配表1,不可更新去分配表0。CPU 0与CPU 1都可读取去分配表0与去分配表1。
在依然可选的实施方式中,用大块描述符表描述存储设备的各个大块。大块描述符表的每个表项描述大块之一,例如记录了大块的编号、构成大块的物理块的物理地址、大块的有效数据量、大块被擦除的次数等。大块描述符表被存储在DRAM 110中。
图12展示了根据本申请再一实施例的IO命令访问的逻辑地址与去分配表的映射的示意图。在图12中,沿逻辑地址递增的方向,逻辑地址空间被分为多个区域(1202、1204……1224等),每个区域被映射到多个去分配表(去分配表0与去分配表1)之一。
在根据图12的实施例中,将各逻辑地址区域轮流映射到去分配表之一。例如,将区域1202、1206、1210、1214、1218映射到去分配表0,将区域1204、1208、1212、1216、1220映射到去分配表1。以此方式,使得来自主机的去分配命令所访问的逻辑地址空间被尽量均匀地映射给各个去分配表。每个逻辑地址区域的大小可配置。例如,每个逻辑地址区域同每个FTL表项指示的逻辑地址范围大小相同,例如4KB。
可以理解地,逻辑地址空间有其他划分方式。例如,将逻辑地址空间分为同去分配表数量相同的区域,每个区域被映射到一个去分配表。
在又一个例子中,有例如4个去分配表。将逻辑地址区域轮流映射到4个去分配表之一。例如,将逻辑地址对4取模,将结果作为该逻辑地址被映射的去分配表的索引。
继续参看图12,作为举例,去分配命令指示对逻辑地址区域1210、1212与1214执行去分配。逻辑地址区域1210与1214被映射到去分配表0,而逻辑地址区域1212被映射到去分配表1。去分配命令被提供给CPU 0与CPU 1,CPU 0访问去分配表0以对逻辑地址区域1210与1214执行去分配,而CPU 1访问去分配表1以对逻辑地址区域1212执行去分配。
图13展示了根据本申请再一实施例的处理去分配命令的流程图。
分配器630(也参看图11)将接收的去分配命令大体上同时发送给CPU 0与CPU 1(1310)。去分配命令指示了要去分配的逻辑地址。CPU 0根据去分配命令指示的逻辑地址,获取属于自己维护的去分配表0的一个或多个逻辑地址(1320),并根据这些逻辑地址更新去分配表0,以在去分配表0中记录这些逻辑地址被去分配(1330)。CPU 1根据同样的去分配命令指示的逻辑地址,获取属于自己维护的去分配表1的一个或多个逻辑地址(1340),并 根据这些逻辑地址更新去分配表1,以在去分配表1中记录这些逻辑地址被去分配(1350)。
通过CPU 0与CPU 1协同处理同一去分配命令的不同逻辑地址,加快了去分配命令的处理过程。
可以理解地,若控制部件有更多CPU来处理去分配命令,将相同的去分配命令发送给这些多个CPU,每个CPU维护各自的去分配表,并根据属于各自去分配表的逻辑地址在去分配表中标记逻辑地址被去分配。
图14展示了根据本申请再一实施例的根据去分配表更新FTL表的流程图。
去分配表用于临时地记录逻辑地址被“去分配”,以加快去分配命令的处理速度。还需要将去分配表中记录的“去分配”标记搬移到FTL表。依然以CPU 0为例,判断分配表0中是否存在至少一项条目被标记为“去分配”,CPU 0适时对去分配表进行检查,找出被标记为“去分配”的表项,根据找到的表项在FTL表中记录对应的逻辑地址被“去分配”,并清除去分配表中该表项的“去分配”标记。
参看图14,例如CPU 0(也参看图11)遍历去分配表0,找到记录了“去分配”标记的(一个)表项,根据该表项的位置得到对应的逻辑地址,将该逻辑地址记为待清理逻辑地址(1410)。可选地,CPU 0一次从去分配表0获取多个待清理的逻辑地址。
CPU 0还更新大块描述符表的一个或多个表项(1420)。根据待清理的一个或多个逻辑地址,访问FTL表,获取对应于待清理的逻辑地址的物理地址,从而识别这些物理地址所属的大块。根据大块中被去分配的物理地址的数量,更新大块描述符中记录的有效数据量。例如,若待清理的10个逻辑地址中的5个逻辑地址对应的5个物理地址属于大块1,则大块1中的这些5个物理地址所记录的数据不再有效。
CPU 0还更新FTL表中对应待清理逻辑地址的FTL表的表项,在FTL表的表项中记录其被“去分配”(1430)。以及还在去分配表中,在这些待清理的逻辑地址对应的去分配表的表项中清除“去分配”标记(1440)。
例如CPU 0重复图14所示的流程,直到去分配表0中已不存在被标记为“去分配”的表项。以及还清除相应的检查标记0。可以理解地,CPU 1根据去分配表1而执行如图14所示的类似的过程。
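
The cleanup loop of Fig. 14, which moves "de-allocated" flags from the de-allocation table into the FTL table and fixes up the chunk valid-data counts, might be sketched as below. The stride models a table that owns every Nth logical address; all names (`chunk_valid_count`, `chunk_of`, the bitmap helpers) are illustrative assumptions.

```c
#include <stdint.h>

#define FTL_DEALLOCATED 0u

extern uint32_t ftl[];                          /* logical address -> physical address */
extern uint32_t chunk_valid_count[];            /* valid-data count per chunk descriptor */
int      is_deallocated(uint32_t lba);          /* bitmap helpers as sketched earlier */
void     mark_allocated(uint32_t lba);
uint32_t chunk_of(uint32_t ppa);                /* which chunk a physical address belongs to */

void clean_dealloc_table(uint32_t first_lba, uint32_t last_lba, uint32_t stride)
{
    for (uint32_t lba = first_lba; lba <= last_lba; lba += stride) {
        if (!is_deallocated(lba))
            continue;                           /* step 1810: only flagged entries */
        uint32_t ppa = ftl[lba];
        if (ppa != FTL_DEALLOCATED)
            chunk_valid_count[chunk_of(ppa)]--; /* steps 1420/1830: chunk loses valid data */
        ftl[lba] = FTL_DEALLOCATED;             /* steps 1430/1820: record in the FTL table */
        mark_allocated(lba);                    /* steps 1440/1840: clear the table flag */
    }
}
```
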
若检查标记被清除,在处理读命令或写命令时,无须访问去分配表,以加快对读命令或写命令的处理。若检查标记被设置,在处理读命令或写命令时,可能需要访问去分配表,以检查被访问的逻辑地址是否被去分配。
控制部件还执行垃圾回收(GC,Garbage Collection)过程,以释放被无效数据占据的存储空间。作为举例,也参看图3A与图3B,在执行去分配命令后,对应逻辑地址0到逻辑地址3的物理地址1-4、物理地址3-6、物理地址1-9与物理地址1-10上存储的数据变为无效。可回收物理块1,将其上的物理地址1-4、物理地址1-9与物理地址1-10所记录的无效数据丢弃,将物理块1上的有效数据搬移到新物理块,从而释放物理块1的无效数据占据的存储空间。
根据本申请的实施例,根据大块描述符表选择一个或多个大块作为垃圾回收过程的被回收大块。例如,选择大块描述符表的表项中记录的具有最低有效数据量的大块作为被回收大块。为获取大块的有效数据量,在处理了去分配命令后,通过执行图14的实施例所展示的流程来更新大块描述符表的表项中记录的有效数据量是有利的。其使得大块描述符表的表项能够体现大块的真实的有效数据量。
图15展示了根据本申请另一实施例的垃圾回收过程的流程图。
为进行垃圾回收,根据大块描述符表选择待回收的大块(1510),例如,选择拥有最小有效数据量的大块。可以理解地,大块描述符表所记录的大块的有效数据量,可能不是真实的有效数据量,因为尚未通过例如图14的实施例的流程来更新大块描述符表。
根据待回收大块,获取待回收的数据的物理地址与逻辑地址(1520)。其中,获取待回收数据时已知晓其物理地址,而通过待回收大块记录的物理地址同逻辑地址的对应关系,得 到待回收数据的逻辑地址。在可选的实施方式中(不同于图15),接下来直接通过逻辑地址查询FTL表,得到记录的物理地址,依据记录的物理地址与从待回收数据的物理地址是否一致来识别待回收数据是否有效,并且仅回收有效数据,而丢弃无效数据。
继续参看图15,根据检查标记(参看图11,例如检查标记0与检查标记1),识别去分配表是否待检查(1530)。若去分配表无须检查,接下来通过逻辑地址查询FTL表,得到记录的物理地址,依据记录的物理地址与从待回收数据的物理地址是否一致来识别待回收数据是否有效(1550)。对于有效的待回收数据,将有效的待回收数据写入新的大块(1560),以及还用新大块的物理地址更新FTL表(1570),以在FTL表中记录有效的待回收数据的新存储位置。
在步骤1530,若去分配表须检查,依据待回收数据的逻辑地址访问去分配表。若去分配表的对应表项记录了“去分配”(1540),其意味着该逻辑地址存储的不是有效数据,返回步骤1520从待回收大块中获取下一待回收数据的逻辑地址与物理地址。若去分配表的对应表项未记录“去分配”,则还根据待回收数据的逻辑地址查询FTL表,以识别待回收数据是否有效(1550)。
在步骤1550,若待回收数据无效,丢弃无效数据而不进行回收,以及返回步骤1520从待回收大块中获取下一待回收数据的逻辑地址与物理地址。若待回收数据有效,则将待回收数据的物理地址所记录的数据写入新大块(1560),并据此更新FTL表(1570)
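
The per-data-unit decision of the garbage-collection flow in Fig. 15 can be sketched as a single check function; `dealloc_pending` stands for the check mark, and `relocate` for copying a valid unit into the new chunk. All names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

extern bool     dealloc_pending;       /* check mark: de-allocation table must be consulted */
extern uint32_t ftl[];
int      is_deallocated(uint32_t lba); /* bitmap helper as sketched earlier */
uint32_t relocate(uint32_t ppa);       /* copy the data unit into a new chunk, return new PPA */

/* Returns true if the data unit (lba, ppa) taken from the victim chunk was moved. */
bool gc_handle_unit(uint32_t lba, uint32_t ppa)
{
    if (dealloc_pending && is_deallocated(lba))
        return false;                  /* step 1540: already de-allocated, skip it */
    if (ftl[lba] != ppa)
        return false;                  /* step 1550: stale copy, not valid data */
    uint32_t new_ppa = relocate(ppa);  /* step 1560: move the valid data */
    ftl[lba] = new_ppa;                /* step 1570: record the new location */
    return true;
}
```
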
图16A-图16E展示了根据本申请另一实施例的去分配表与检查标记。
在检查标记中,除了记录对应的去分配表中是否存在至少一项条目被标记为“去分配”,还记录当前检查去分配表的开始位置(S)、当前位置(C)与结尾位置(E),以及可选地还记录下次检查去分配表的开始位置(NS)与结尾位置(NE)。
参看图16A,展示了例如去分配表0。由于CPU 0处理了去分配命令(TR1),去分配表的阴影部分的表项被设置为“去分配”。以及要对去分配表0进行检查或清理,以将去分配表中的去分配标记搬移到FTL表。其中S指示当前要进行的检查的起始位置,E指示当前要进行的检查的结尾位置,C指示当前正在进行检查的位置。可以理解地,根据去分配命令(TR1)所指示的范围,得到起始位置S与结尾位置E。从起始位置S到结尾位置S通过例如图9所示的流程图逐个对去分配表的条目进行检查或清理,而C指示了当前正在检查的位置。
参看图16B,在对图16A展示的从起始位置S到结尾位置E进行检查期间,CPU 0又收到了去分配命令TR2。图16B中的与TR2对应的阴影部分指示了去分配命令TR2所指示的范围中属于去分配表0的部分。根据本申请的实施例,由于去分配命令TR2所指示的范围属于去分配表0的部分,完全在当前正在检查的范围(从起始位置S到结尾位置E)在逻辑地址空间上的后部,则仅将检查标记中记录的结尾位置从E更新为E1(TR2所指示的范围属于去分配表0的部分的结尾位置)。当前正在检查的位置为C1,根据图16B所示,对去分配表0的检查要继续从当前位置C1进行到结尾位置E1才结束。而开始位置S不更新。虽然在图16B中,在去分配命令TR1与TR2指示的阴影区分之间有部分未被标记为“去分配”的区域,出于简便的目的,对这些区域也要进行检查。在检查时发现这些区分未被标记为“去分配”,则直接略过而检查分配表0的下一表项。
参看图16C,在对图16B展示的从起始位置S到结尾位置E进行检查期间,CPU 0又收到了去分配命令TR3。图16C中的与TR3对应的横线阴影部分指示了去分配命令TR3所指示的范围属于去分配表0的部分。根据本申请的实施例,由于去分配命令TR3所指示的范围属于去分配表0的部分,同去分配命令TR1所指示的范围属于去分配表0的部分重叠,去分配命令TR3所指示的区域属于去分配表0的部分的起点在去分配命令TR1所指示的范围内,而去分配命令TR3所指示的区域属于去分配表0的部分的结尾在去分配命令TR1所指示的范围属于去分配表0的部分之后。当前正在检查的位置为C2,由于去分配命令TR3所指示的区域属于去分配表0的部分的起点与末尾均在当前位置C2与结尾位置E1之间,从而 无须因收到TR3命令而更新起始位置S与结尾位置E1。
作为另一个例子,若去分配命令TR3’(未示出)所指示的范围属于去分配表0的部分的起点在开始位置S之后,该范围的末尾在结尾位置E1之前,且该范围覆盖了当前位置C2,这意味着在去分配命令TR3’所指示的范围属于去分配表0的部分的起点与当前位置C2之间的部分已被检查或清理的去分配表表项被再次更新。在此情况下,在检查标记中还记录下次扫描的开始位置NS为去分配命令TR3’所指示的范围属于去分配表0的部分的开始位置,而下次检查的结束位置NE为当前位置C2。
参看图16D,在对图16C展示的从起始位置S到结尾位置E1进行检查期间,CPU 0又收到了去分配命令TR4。图16D中的与TR4对应的阴影部分指示了去分配命令TR4所指示的范围属于去分配表0的部分。根据本申请的实施例,去分配命令TR4所指示的区域属于去分配表0的部分的起点在去分配命令TR1所指示的范围的起点之前,而去分配命令TR4所指示的区域属于去分配表0的部分的结尾在去分配命令TR1所指示的范围之前。当前正在检查的位置为C3,由于去分配命令TR4所指示的区域属于去分配表0的部分的起点与末尾均在当前位置C3之前。这意味着在去分配命令TR4所指示的范围的去分配表表项已被更新,因而需要被检查或清理。在此情况下,在检查标记中还记录下次扫描的开始位置为去分配命令TR4的开始位置NS1,而下次检查的结束位置为去分配命令TR4的结束位置NE1。
参看图16E,在对图16C展示的从起始位置S到结尾位置E进行检查期间,CPU 0又收到了去分配命令TR5。图16E中的与TR5对应的阴影部分指示了去分配命令TR5所指示的范围属于去分配表0的部分。根据本申请的实施例,去分配命令TR5所指示的区域属于去分配表0的部分的起点在去分配命令TR1所指示的范围的起点之前,而去分配命令TR5所指示的区域属于去分配表0的部分的结尾在去分配命令TR1所指示的范围之内。当前正在检查的位置为C3,由于去分配命令TR5所指示的区域的起点与末尾均在当前位置C3之前。这意味着在去分配命令TR5所指示的范围属于去分配表0的部分的去分配表表项已被更新,因而而需要被检查或清理。在此情况下,在检查标记中还记录下次扫描的开始位置为去分配命令TR5的开始位置NS2,而下次检查的结束位置为去分配命令TR5的结束位置NE2。
在根据图16D与图16E的例子中,由于下次检查的开始位置与结束位置被设置,对去分配表的从开始位置S到结束位置E1的检查或清理完成后,还要继续根据下次检查的开始位置NS1或NS2与结束位置NE1或NE2对去分配表进行检查或清理。
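
The rules just described for adjusting the check mark when a new de-allocation command arrives mid-scan can be condensed into one function. The sketch below uses an illustrative `check_mark_t` with the S/C/E and NS/NE fields mentioned above; it is one reading of Figs. 16B-16E under stated assumptions, not code from the application, and it leaves any case not covered by those figures unchanged.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t start, current, end;      /* S, C, E of the check in progress */
    bool     next_valid;               /* whether a second pass is scheduled */
    uint32_t next_start, next_end;     /* NS, NE of the next check */
} check_mark_t;

/* [ns, ne] is the part of the new de-allocation command that belongs to this table. */
void update_check_mark(check_mark_t *m, uint32_t ns, uint32_t ne)
{
    if (ns > m->end) {
        /* Fig. 16B: the new range lies entirely behind the current end E;
         * the running pass simply continues up to the new end. */
        m->end = ne;
    } else if (ns >= m->start && ne <= m->end && m->current < ns) {
        /* Fig. 16C: the new range lies inside [S, E] and has not been
         * reached yet, so the check mark needs no update. */
    } else if (ns >= m->start && ne <= m->end) {
        /* Variant of Fig. 16C (TR3'): part of the new range was already checked;
         * revisit it in a second pass from ns up to the current position C. */
        m->next_valid = true;
        m->next_start = ns;
        m->next_end   = m->current;
    } else if (ne < m->current) {
        /* Figs. 16D/16E: the whole new range lies before the current position C;
         * schedule a second pass over exactly that range. */
        m->next_valid = true;
        m->next_start = ns;
        m->next_end   = ne;
    }
}
```
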
图17是根据本申请另一实施例的处理去分配命令的示意图。
分配器1130(也参看图11)接收去分配命令,并将接收的去分配命令大体上同时发送给CPU 0与CPU 1(1710)。可选地,响应于收到去分配命令,CPU 0与CPU 1终止自身正在进行的对去分配表的检查或清理工作(如果对去分配表的检查或清理在进行)(1720)。
根据本申请的实施例,CPU 0与CPU 1在空闲时,对各自负责维护的去分配表进行检查或清理。而当收到新的去分配命令时,暂停正在进行的对去分配表的检查或清理,而立即处理去分配命令,以加快对去分配命令的处理,降低去分配命令的处理延迟。
依然可选地,根据所接收的去分配命令,(如果需要)CPU 0与CPU 1还更新各自维护的检查标记(检查标记0与检查标记1)(1730)。根据图16A-图16E所示的方式来更新检查标记,从而在检查标记中记录当前对分配表的检查或清理的结束位置,以及可选地,下一轮检查或清理去配表的开始位置与结束位置。
接下来CPU 0根据去分配命令指示的逻辑地址,获取属于自己维护的去分配表0的一个或多个逻辑地址(1740),并根据这些逻辑地址更新去分配表0,以在去分配表0中记录这些逻辑地址被去分配(1750)。CPU 1根据同样的去分配命令指示的逻辑地址,获取属于自己维护的去分配表1的一个或多个逻辑地址(1760),并根据这些逻辑地址更新去分配表1,以在去分配表1中记录这些逻辑地址被去分配(1770)。
图18展示了根据本申请另一实施例的根据去分配表更新FTL表的流程图。
依然以CPU 0为例,判去分配表0中是否存在至少一项条目被标记为“去分配”,CPU 0适时对去分配表进行检查,找出被标记为“去分配”的表项(1810),根据找到的表项在FTL表中记录对应的逻辑地址被“去分配”(1820),并清除去分配表中该表项的“去分配”标记(1840)。CPU 0还更新大块描述符表(1830)。步骤1830与步骤1840的执行顺序不被限制。例如CPU 0遍历去分配表0,找到记录了“去分配”标记的(一个)表项,根据该表项的位置得到对应的逻辑地址,将该逻辑地址记为待清理逻辑地址。可选地,CPU 0一次从去分配表0获取多个待清理的逻辑地址。
在更新FTL表时,根据待清理的一个或多个逻辑地址,从FTL表获取对应于待清理的逻辑地址的物理地址,从而识别这些物理地址所属的大块。根据大块中被去分配的物理地址的数量,更新大块描述符中记录的有效数据量。CPU 0还更新FTL表中对应待清理逻辑地址的FTL表的表项,在FTL表的表项中记录其被“去分配”。以及还在去分配表中,在这些待清理的逻辑地址对应的去分配表的表项中清除“去分配”标记。
CPU 0识别对去分配表0的检查或清理是否完成(1850)。通过比较检查标记中记录的结尾位置与检查或清理去分配表的当前位置来是否对去分配表的检查或清理是否完成。若当前位置未达到结尾位置,意味着去分配表表中还有表项等待被检查或清理。若对分配表0的检查或清理未完成,返回步骤1810,对去分配表进行检查,找出被标记为“去分配”的表项。
若对区分配表的检查或清理已完成(当前位置达到结尾位置),还检查是否需要对去分配表进行再次检查或清理(1860)。通过识别检查标记中是否记录了下次扫描开始位置与下次扫描结尾位置来确定是否还要对去分配表进行扫描或清理。若检查标记中记录了下次扫描开始位置与下次扫描结尾位置,则转向步骤1810,从下次扫描开始位置开始新一轮的对去分配表的检查或清理。若检查标记中未记录下次扫描开始位置与下次扫描结尾位置,CPU对去分配表的检查或清理完成。
根据本申请的实施例,各个CPU对所负责维护的去分配表执行如图18所示的流程。各CPU并行处理根据去分配表对FTL表的更新,以加快处理过程。
图19是根据本申请另一实施例的控制部件的框图。图19中示出的控制部件104,其结构与图11所示的控制部件104类似。示意性的控制部件104包括多个CPU,分配器1930将IO命令分配给多个CPU的每个。图19的控制部件包括四个CPU(CPU 0、CPU 1、CPU 2与CPU 3)。控制部件104对IO命令的处理过程同图11中控制部件104类似。将IO命令分配个多个CPU。
去分配表被分为四部分(去分配表0、去分配表1、去分配表2与去分配表3)。CPU 0维护去分配表0、CPU 1维护去分配表1、CPU 2维护去分配表2而CPU 3维护去分配表3。例如,将逻辑地址对4取模的结果,作为维护该逻辑地址的去分配表的索引。
继续参看图19,对于去分配命令,分配器1930将去分配命令同时提供给CPU 0、CPU1、CPU 2与CPU 3。CPU 0对去分配命令中由去分配表0维护的部分进行处理,而CPU 1对去分配命令中由去分配表1维护的部分进行处理,CPU 2对去分配命令中由去分配表2维护的部分进行处理,而CPU 3对去分配命令中由去分配表3维护的部分进行处理。从而CPU 0、CPU 1、CPU 2与CPU 3同时处理同一去分配命令,加快了去分配命令的处理过程。
对于IO命令,关联于FTL表的不同部分的IO命令,由不同CPU处理。由分配器1930根据IO命令访问的逻辑地址将IO命令分配给CPU 0、CPU 1、CPU 2或CPU 3之一。CPU 0、CPU 1、CPU 2与CPU 3并行处理多个IO命令。
去分配表临时地记录逻辑地址处于“去分配”状态。CPU 0、CPU 1、CPU 2或CPU 3还对去分配表进行检查,根据记录了“去分配”状态的去分配表表项,更新FTL表的对应表项,在FTL表项中记录“去分配”状态。还记录用于指示去分配表待检查或检查尚未完成的检查标记。CPU 0维护的检查标记0指示去分配表0是否需要检查或者检查尚未完成,CPU 1维护的检查标记1指示去分配表1是否需要检查或检查尚未完成。CPU 2维护的检查标记2指示去分配表2是否需要检查或者检查尚未完成,CPU 3维护的检查标记3指示去分配表 3是否需要检查或检查尚未完成。
检查标记0、检查标记1、检查标记2或检查标记3,至少指示了各自对应的去分配表中是否存在至少一项条目被标记为“去分配”。
在依然可选的实施方式中,CPU 0可更新去分配表0,不可更新去分配表1、分配表2与分配表3。CPU 1可更新去分配表1,不可更新去分配表0、分配表2与分配表3。CPU 2可更新去分配表2,不可更新去分配表0、分配表1与分配表3。CPU 3可更新去分配表3,不可更新去分配表0、分配表1与分配表2。CPU 0、CPU 1、CPU 2与CPU 3都可读取所有的去分配表。
依然作为举例,多个CPU的每个维护的检查标记,还指示该CPU维护的去分配表的当前检查的开始位置(S)、当前位置(C)与结尾位置(E),以及可选地还记录下次检查的开始位置(NS)与结尾位置(NE)。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (15)

  1. 一种处理去分配命令的***,其特征在于,包括:控制部件与外部存储器;
    控制部件包括分配器与多个CPU,
    所述分配器,接收IO命令,并将IO命令分配给多个CPU中的每一个;
    所述多个CPU,用于并行处理接收到的IO命令;
    外部存储器,存储去分配表。
  2. 如权利要求1所述的处理去分配命令的***,其特征在于,其中去分配表被分为多个部分,每个部分由多个CPU中的一个维护。
  3. 如权利要求1或2所述的处理去分配命令的***,其特征在于,其中依据IO命令访问的地址,将IO命令分配给CPU。
  4. 一种处理去分配命令的方法,其特征在于,包括如下步骤:
    将接收的去分配命令同时发送给多个CPU;
    收到去分配命令的CPU根据去分配命令指示的地址范围,获取去分配命令指示的地址范围中属于自己的去分配表的一个或多个地址,并根据获取的所述一个或多个地址更新自己维护的去分配表,以在该去分配表中记录的所述一个或多个地址被去分配。
  5. 如权利要求4所述的处理去分配命令的方法,其特征在于,其中定时或周期性的对去分配表进行检查,找出被标记为去分配的第一表项,根据第一表项在FTL表中记录对应的逻辑地址被去分配,并清除去分配表中第一表项的去分配标记。
  6. 如权利要求4或5所述的处理去分配命令的方法,其特征在于,其中还包括根据去分配命令指示的地址范围更新大块描述符中记录的有效数据量。
  7. 一种处理去分配命令的方法,其特征在于,包括如下步骤:
    将接收的去分配命令发送给多个CPU;
    各个CPU从去分配命令指示的地址范围中获取属于自己维护的去分配表的一个或多个地址,并根据所述一个或多个地址更新自己维护的去分配表,以在该去分配表中记录所述一个或多个地址被去分配。
  8. 如权利要求7所述的处理去分配命令的方法,其特征在于,还包括:
    响应于要进行垃圾回收,选择待回收的大块;
    根据待回收大块,获取待回收的数据的地址;
    若去分配表须检查,依据待回收数据的地址访问去分配表,若去分配表的对应表项记录了去分配,从待回收大块中获取下一待回收数据。
  9. 一种存储设备执行的方法,包括:
    接收读命令;
    若去分配表中没有任何条目被标记为“去分配”,查询FTL表获得读命令访问的逻辑地址对应的物理地址;以及
    从物理地址获取数据作为对读命令的响应。
  10. 根据权利要求9所述的方法,还包括:
    若去分配表中有至少一个条目被标记为“去分配”,查询去分配表以确定读命令访问的逻辑地址是否被去分配;
    若读命令访问的逻辑地址在去分配表中被标记为“去分配”,以指定值作为对读命令的响应。
  11. 根据权利要求10所述的方法,还包括:
    若读命令访问的逻辑地址在去分配表中未被标记为“去分配”,依据查询 FTL表获得的物理地址获取数据作为对读命令的响应。
  12. 根据权利要求9-11之一所述的方法,还包括:
    响应于接收写命令,为写命令分配物理地址;
    若去分配表中没有任何条目被标记为“去分配”,用写命令的逻辑地址与分配的物理地址更新FTL表;以及
    根据物理地址写入数据。
  13. 根据权利要求12所述的方法,还包括:
    若去分配表中有至少一个条目被标记为“去分配”,将去分配表中同写命令的逻辑地址对应的条目清除“去分配”标记。
  14. 根据权利要求9-13之一所述的方法,其中
    去分配表包括第一去分配表与第二去分配表,分别对应于第一FTL表与第二FTL表;
    第一去分配表的状态指示包括第一去分配表与第二去分配表的去分配表中是否有任何条目被标记为“去分配”;
    第二去分配表的状态指示包括第一去分配表与第二去分配表的去分配表中是否有任何条目被标记为“去分配”。
  15. 一种存储设备,包括控制部件、存储器和NVM芯片,其中,存储器存储去分配表和FTL表,控制部件包括第一CPU与第二CPU;
    第一CPU与第二CPU分别执行根据权利要求9-14之一所述的方法。
PCT/CN2018/093483 2017-11-29 2018-06-28 去分配命令处理方法及其存储设备 WO2019105029A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/044,457 US11397672B2 (en) 2017-11-29 2018-06-28 Deallocating command processing method and storage device having multiple CPUs thereof
US17/846,524 US20220327049A1 (en) 2017-11-29 2022-06-22 Method and storage device for parallelly processing the deallocation command

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201711222238.9 2017-11-29
CN201711222238.9A CN109840048B (zh) 2017-11-29 2017-11-29 存储命令处理方法及其存储设备
CN201810594487.9A CN110580228A (zh) 2018-06-11 2018-06-11 去分配命令处理方法及其存储设备
CN201810594487.9 2018-06-11

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US17/044,457 A-371-Of-International US11397672B2 (en) 2017-11-29 2018-06-28 Deallocating command processing method and storage device having multiple CPUs thereof
US17/846,524 Continuation US20220327049A1 (en) 2017-11-29 2022-06-22 Method and storage device for parallelly processing the deallocation command

Publications (1)

Publication Number Publication Date
WO2019105029A1 true WO2019105029A1 (zh) 2019-06-06

Family

ID=66664152

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/093483 WO2019105029A1 (zh) 2017-11-29 2018-06-28 去分配命令处理方法及其存储设备

Country Status (2)

Country Link
US (2) US11397672B2 (zh)
WO (1) WO2019105029A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI711048B (zh) * 2020-02-07 2020-11-21 大陸商合肥兆芯電子有限公司 快閃記憶體之資料整理方法、控制電路單元與儲存裝置
CN117891416A (zh) * 2024-03-18 2024-04-16 厦门大学 基于scsi协议的取消映射操作优化方法、装置及可读介质

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20210062476A (ko) * 2019-11-21 2021-05-31 에스케이하이닉스 주식회사 메모리 컨트롤러 및 그 동작 방법
US11042481B1 (en) * 2019-12-19 2021-06-22 Micron Technology, Inc. Efficient processing of commands in a memory sub-system
WO2022040914A1 (en) * 2020-08-25 2022-03-03 Micron Technology, Inc. Unmap backlog in a memory system
JP2022114726A (ja) * 2021-01-27 2022-08-08 キオクシア株式会社 メモリシステムおよび制御方法
US11977773B2 (en) 2021-09-30 2024-05-07 Kioxia Corporation Validity table for solid state drives
US11669254B2 (en) * 2021-09-30 2023-06-06 Kioxia Corporation SSD supporting deallocate summary bit table and associated SSD operations
CN114996023B (zh) * 2022-07-19 2022-11-22 新华三半导体技术有限公司 目标缓存装置、处理装置、网络设备及表项获取方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103019971A (zh) * 2012-11-25 2013-04-03 向志华 快速响应trim命令的方法、SSD控制器及***
US20130275660A1 (en) * 2012-04-12 2013-10-17 Violin Memory Inc. Managing trim operations in a flash memory system
CN103914409A (zh) * 2013-01-06 2014-07-09 北京忆恒创源科技有限公司 用于具有多处理器的存储设备的方法
CN107003942A (zh) * 2014-10-27 2017-08-01 桑迪士克科技有限责任公司 对用于增强存储设备的性能和持久性的解除映射命令的处理

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4403283A (en) * 1980-07-28 1983-09-06 Ncr Corporation Extended memory system and method
US6477612B1 (en) * 2000-02-08 2002-11-05 Microsoft Corporation Providing access to physical memory allocated to a process by selectively mapping pages of the physical memory with virtual memory allocated to the process
US20020016878A1 (en) * 2000-07-26 2002-02-07 Flores Jose L. Technique for guaranteeing the availability of per thread storage in a distributed computing environment
US20030212883A1 (en) * 2002-05-09 2003-11-13 International Business Machines Corporation Method and apparatus for dynamically managing input/output slots in a logical partitioned data processing system
US6931497B2 (en) * 2003-01-09 2005-08-16 Emulex Design & Manufacturing Corporation Shared memory management utilizing a free list of buffer indices
US20060069849A1 (en) * 2004-09-30 2006-03-30 Rudelic John C Methods and apparatus to update information in a memory
US8375392B2 (en) * 2010-01-12 2013-02-12 Nec Laboratories America, Inc. Data aware scheduling on heterogeneous platforms
KR20120132820A (ko) * 2011-05-30 2012-12-10 삼성전자주식회사 스토리지 디바이스, 스토리지 시스템 및 스토리지 디바이스의 가상화 방법
KR101824949B1 (ko) * 2011-11-23 2018-02-05 삼성전자주식회사 플래시 메모리를 기반으로 하는 저장 장치 및 그것을 포함한 사용자 장치
US10482009B1 (en) * 2013-03-15 2019-11-19 Google Llc Use of a logical-to-logical translation map and a logical-to-physical translation map to access a data storage device
JP6231899B2 (ja) * 2014-02-06 2017-11-15 ルネサスエレクトロニクス株式会社 半導体装置、プロセッサシステム、及びその制御方法
US9652315B1 (en) * 2015-06-18 2017-05-16 Rockwell Collins, Inc. Multi-core RAM error detection and correction (EDAC) test
CN109426436B (zh) 2017-08-28 2024-04-12 北京忆恒创源科技股份有限公司 基于可变长大块的垃圾回收方法与装置

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130275660A1 (en) * 2012-04-12 2013-10-17 Violin Memory Inc. Managing trim operations in a flash memory system
CN103019971A (zh) * 2012-11-25 2013-04-03 向志华 快速响应trim命令的方法、SSD控制器及***
CN103914409A (zh) * 2013-01-06 2014-07-09 北京忆恒创源科技有限公司 用于具有多处理器的存储设备的方法
CN107003942A (zh) * 2014-10-27 2017-08-01 桑迪士克科技有限责任公司 对用于增强存储设备的性能和持久性的解除映射命令的处理

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI711048B (zh) * 2020-02-07 2020-11-21 大陸商合肥兆芯電子有限公司 快閃記憶體之資料整理方法、控制電路單元與儲存裝置
CN117891416A (zh) * 2024-03-18 2024-04-16 厦门大学 基于scsi协议的取消映射操作优化方法、装置及可读介质
CN117891416B (zh) * 2024-03-18 2024-05-14 厦门大学 基于scsi协议的取消映射操作优化方法、装置及可读介质

Also Published As

Publication number Publication date
US11397672B2 (en) 2022-07-26
US20210103518A1 (en) 2021-04-08
US20220327049A1 (en) 2022-10-13

Similar Documents

Publication Publication Date Title
WO2019105029A1 (zh) 去分配命令处理方法及其存储设备
US10198215B2 (en) System and method for multi-stream data write
KR101769883B1 (ko) 저장부 할당 장치, 시스템, 및 방법
CN109086219B (zh) 去分配命令处理方法及其存储设备
JP2016170583A (ja) メモリシステムおよび情報処理システム
WO2012016209A2 (en) Apparatus, system, and method for redundant write caching
CN108228470B (zh) 一种处理向nvm写入数据的写命令的方法和设备
US10621085B2 (en) Storage system and system garbage collection method
CN107797934B (zh) 处理去分配命令的方法与存储设备
WO2020007030A1 (zh) 一种***控制器和***垃圾回收方法
JP2019169101A (ja) 電子機器、コンピュータシステム、および制御方法
WO2018024214A1 (zh) Io流调节方法与装置
CN107797938B (zh) 加快去分配命令处理的方法与存储设备
CN109840048B (zh) 存储命令处理方法及其存储设备
JP2019194780A (ja) 情報処理装置、データ管理プログラム及びデータ管理方法
CN116364148A (zh) 一种面向分布式全闪存储***的磨损均衡方法及***
CN108877862B (zh) 页条带的数据组织以及向页条带写入数据的方法与装置
CN110554833A (zh) 存储设备中并行处理io命令
WO2018041258A1 (zh) 去分配命令处理的方法与存储设备
CN110096452B (zh) 非易失随机访问存储器及其提供方法
CN110865945B (zh) 存储设备的扩展地址空间
CN110580228A (zh) 去分配命令处理方法及其存储设备
CN114610654A (zh) 一种固态存储设备以及向其写入数据的方法
CN109840219B (zh) 大容量固态存储设备的地址转换***与方法
CN110688056A (zh) Nvm组的存储介质替换

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18882802

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18882802

Country of ref document: EP

Kind code of ref document: A1