WO2023065654A1 - 一种数据写入方法以及相关设备 - Google Patents

一种数据写入方法以及相关设备 Download PDF

Info

Publication number
WO2023065654A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
logical
logical block
storage pool
block set
Prior art date
Application number
PCT/CN2022/093193
Other languages
English (en)
French (fr)
Inventor
李劲松
高军
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023065654A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the embodiments of the present application relate to the computer field, and in particular, to a data writing method and related equipment.
  • An abstract storage pool can be constructed based on storage devices, and the storage pool is composed of multiple logical block groups.
  • the embodiment of the present application provides a data writing method and related equipment, which are used to reduce read-write amplification during garbage collection.
  • each piece of data in the storage pool has a corresponding logical address. Before first data is written into the storage pool, the first logical address at which the first data is to be written into the storage pool is acquired. It is then determined whether second data is stored in the storage pool, where the logical address of the second data is the same as the first logical address. If the second data is stored in the storage pool, the first data is written into a first logical block set of the storage pool, and the first logical block set is used for storing hot data.
  • in the embodiment of the present application, the data written into the first logical block set is repeatedly written data, that is, updated data. There is a corresponding relationship between the attribute of data and the logical address of the data, and when data of a certain attribute needs to be updated, the possibility that the data of this attribute will continue to be updated in the future is high. Therefore, the proportion of garbage data generated in the first logical block set is relatively high, and only a small amount of valid data needs to be migrated when garbage collection is performed, so the read-write amplification generated is relatively small.
  • in a possible implementation manner, if the second data is not stored in the storage pool, the first data is written into a second logical block set of the storage pool, and the second logical block set is used for storing cold data.
  • in the embodiment of the present application, the data written into the second logical block set is not repeatedly written data, so the proportion of garbage data generated in the second logical block set is relatively low, and as a result the read-write amplification generated when garbage collection is performed is small.
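The write routing just described can be summarized with a short sketch. The following Python fragment is a minimal illustration only, not the claimed implementation; the names `StoragePool`, `hot_set`, and `cold_set` are assumptions introduced for the example, and the membership check stands in for the bitmap lookup described later.

```python
class StoragePool:
    """Minimal sketch of the hot/cold write routing described above."""

    def __init__(self):
        self.hot_set = {}      # first logical block set: hot (repeatedly written) data
        self.cold_set = {}     # second logical block set: cold (first-written) data
        self.written = set()   # logical addresses that already hold data (second data exists)

    def write(self, logical_address, data):
        if logical_address in self.written:
            # Second data with the same logical address exists: this write is an update,
            # so the data is likely to be updated again -> first logical block set.
            self.hot_set[logical_address] = data
        else:
            # First write to this logical address -> second logical block set.
            self.cold_set[logical_address] = data
            self.written.add(logical_address)
```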
  • in a possible implementation manner, if the proportion of garbage data in the first logical block set is greater than or equal to a preset threshold, the first data is migrated to a newly created logical block set, and the data in the first logical block set is released.
  • the data attributes of the first data and the second data are the same, and there is a corresponding relationship between the first logical address and the data attributes.
  • the storage pool includes multiple logical blocks, and the storage space of the logical blocks comes from a mechanical hard disk.
  • the storage device includes multiple functional modules, and the multiple functional modules interact to implement the methods in the above first aspect and various implementation manners thereof.
  • Multiple functional modules can be implemented based on software, hardware, or a combination of software and hardware, and the multiple functional modules can be combined or divided arbitrarily based on specific implementations.
  • the third aspect of the embodiments of the present application provides a storage device, including a processor coupled with a memory, where the memory is used to store instructions, and when the instructions are executed by the processor, the storage device is caused to execute the method in the aforementioned first aspect.
  • the fourth aspect of the embodiments of the present application provides a computer program product, including code, which, when run on a computer, causes the computer to execute the method of the aforementioned first aspect.
  • the fifth aspect of the embodiments of the present application provides a computer-readable storage medium on which computer programs or instructions are stored, and when the computer programs or instructions are executed, the computer is caused to execute the method in the aforementioned first aspect.
  • FIG. 1 is a schematic diagram of a system to which the data writing method in the embodiment of the present application is applied;
  • FIG. 2 is a schematic diagram of constructing a storage pool in the embodiment of the present application;
  • FIG. 3 is a schematic diagram of garbage collection in the embodiment of the present application;
  • FIG. 4 is a schematic flowchart of the data writing method in the embodiment of the present application;
  • FIG. 5a is a schematic flowchart of writing the first data in the embodiment of the present application;
  • FIG. 5b is a schematic diagram of finding the second data according to a bitmap in the embodiment of the present application;
  • FIG. 6 is a schematic diagram of the distribution of data in the first logical block set and the second logical block set in the embodiment of the present application;
  • FIG. 7 is another schematic diagram of garbage collection in the embodiment of the present application;
  • FIG. 8 is a schematic structural diagram of a storage device in the embodiment of the present application.
  • the embodiment of the present application provides a data writing method for reducing the amount of data migration during garbage collection.
  • the embodiment of the present application can be applied to the system shown in FIG. 1 , in which a user accesses data through an application program.
  • the computers running these applications are called "application servers".
  • the application server 100 may be a physical machine or a virtual machine. Physical application servers include, but are not limited to, desktops, servers, laptops, and mobile devices.
  • the application server accesses the storage device 120 through the optical fiber switch 110 to access data.
  • the switch 110 is only an optional device, and the application server 100 can also directly communicate with the storage device 120 through the network.
  • the optical fiber switch 110 can also be replaced with an Ethernet switch, an InfiniBand switch, a RoCE (RDMA over Converged Ethernet) switch, or the like.
  • the storage device 120 shown in FIG. 1 is a centralized storage system.
  • the characteristic of the centralized storage system is that there is a unified entrance, and all data from external devices must pass through this entrance, and this entrance is the engine 121 of the centralized storage system.
  • the engine 121 is the most core component in the centralized storage system, where many advanced functions of the storage system are implemented.
  • as shown in FIG. 1, there are one or more controllers in the engine 121, and FIG. 1 uses an engine that includes two controllers as an example for illustration.
  • there is a mirroring channel between controller 0 and controller 1; after controller 0 writes a piece of data into its memory 124, it can send a copy of the data to controller 1 through the mirroring channel, and controller 1 stores the copy in its own local memory 124. Therefore, controller 0 and controller 1 are mutual backups: when controller 0 fails, controller 1 can take over the services of controller 0, and when controller 1 fails, controller 0 can take over the services of controller 1, so that a hardware failure does not make the entire storage device 120 unavailable.
  • when four controllers are deployed in the engine 121, there is a mirroring channel between any two controllers, so any two controllers are mutual backups.
  • the engine 121 also includes a front-end interface 125 and a back-end interface 126 , wherein the front-end interface 125 is used to communicate with the application server 100 to provide storage services for the application server 100 .
  • the back-end interface 126 is used to communicate with the hard disk 134 to expand the capacity of the storage system. Through the back-end interface 126, the engine 121 can be connected with more hard disks 134, thereby forming a very large storage pool.
  • the controller 0 includes at least a processor 123 and a memory 124 .
  • Processor 123 is a central processing unit (central processing unit, CPU), used for processing data access requests from outside the storage system (server or other storage systems), and also used for processing requests generated inside the storage system.
  • when the processor 123 receives a write data request sent by the application server 100 through the front-end interface 125, it temporarily saves the data in the write data request in the memory 124.
  • when the total amount of data in the memory 124 reaches a certain threshold, the processor 123 sends the data stored in the memory 124 to the hard disk 134 through the back-end interface for persistent storage.
  • the memory 124 refers to an internal memory directly exchanging data with the processor. It can read and write data at any time, and the speed is very fast. It is used as a temporary data storage for an operating system or other running programs.
  • Memory includes at least two kinds of memory, for example, memory can be either random access memory or read-only memory (Read Only Memory, ROM).
  • the random access memory is dynamic random access memory (Dynamic Random Access Memory, DRAM), or storage class memory (Storage Class Memory, SCM).
  • DRAM is a semiconductor memory, which, like most Random Access Memory (RAM), is a volatile memory device.
  • SCM is a composite storage technology that combines the characteristics of traditional storage devices and memory.
  • Storage-class memory can provide faster read and write speeds than hard disks, but the access speed is slower than DRAM, and the cost is also cheaper than DRAM.
  • DRAM and SCM are only exemplary illustrations in this embodiment, and the memory may also include other random access memories, such as Static Random Access Memory (Static Random Access Memory, SRAM) and the like.
  • the read-only memory for example, it may be a programmable read-only memory (Programmable Read Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), and the like.
  • the memory 124 may also be a dual in-line memory module (Dual In-line Memory Module, DIMM), that is, a module composed of dynamic random access memory (DRAM), or a solid-state drive (Solid State Disk, SSD).
  • multiple memories 124 and different types of memories 124 may be configured in the controller 0 .
  • this embodiment does not limit the quantity and type of the memory 124.
  • the memory 124 can be configured to have a power saving function.
  • the power saving function means that the data stored in the internal memory 124 will not be lost when the system is powered off and then powered on again. Memory with a power saving function is called non-volatile memory.
  • the memory 124 stores software programs, and the processor 123 runs the software programs in the memory 124 to manage the hard disk. For example, hard disks are abstracted into storage pools.
  • the hardware components and software structure of controller 1 (and of other controllers not shown in FIG. 1) are similar to those of controller 0 and are not repeated here.
  • the storage system may include two or more engines 121 , and redundancy or load balancing is performed among the multiple engines 121 .
  • FIG. 1 shows a centralized storage system in which the disks and the controllers are integrated. In an actual implementation, the centralized storage system may also take a form in which the disks and the controllers are separated.
  • FIG. 2 is a schematic diagram of building a storage pool based on the system shown in FIG. 1 .
  • the application server 100 shown in FIG. 2 is similar to the application server 100 shown in FIG. 1, the hard disk 134 shown in FIG. 2 is similar to the hard disk 134 shown in FIG. 1, and the storage device 120 shown in FIG. 2 is similar to the storage device 120 shown in FIG. 1.
  • the hard disk 134 in this embodiment of the present application may be any type of hard disk, such as a solid state hard disk or a mechanical hard disk.
  • Each hard disk 134 is divided into several physical blocks (chunks) 202 , and the physical blocks 202 are mapped into logical blocks 203 , and the logical blocks 203 further constitute a storage pool 204 .
  • the storage pool 204 is used to provide storage space, which actually comes from the hard disk 134 included in the system. Of course, not all hard disks 134 need to provide space for the storage pool 204 .
  • the storage system may include one or more storage pools 204 , and one storage pool 204 includes part or all of the hard disks 134 .
  • Multiple logical blocks from different hard disks or different storage nodes 201 may form a logical block group 205 (plog), which is the minimum allocation unit of the storage pool 204 .
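As a rough illustration of how physical blocks are abstracted into logical blocks and grouped into a logical block group (plog), the sketch below builds one plog from one chunk per disk. The class and field names are assumptions made for the example, not terms defined by this application.

```python
from dataclasses import dataclass

@dataclass
class LogicalBlock:
    disk_id: int    # hard disk the backing physical block (chunk) comes from
    chunk_id: int   # index of the physical block on that disk

def build_plog(disks, next_free_chunk):
    """Take one free physical block from each disk, map it to a logical block,
    and collect the logical blocks into one logical block group (plog)."""
    plog = []
    for disk_id in disks:
        chunk_id = next_free_chunk[disk_id]           # next unused chunk on this disk
        next_free_chunk[disk_id] += 1
        plog.append(LogicalBlock(disk_id, chunk_id))  # logical block mapped to the chunk
    return plog

# Example: a plog spanning six disks, matching the RAID 6 example later in the text.
next_free = {d: 0 for d in range(6)}
plog0 = build_plog(range(6), next_free)
```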
  • the storage pool may provide one or more logical block groups to the storage service layer.
  • the storage service layer further virtualizes the storage space provided by the logical block group into a logical unit (logical unit, LU) 206 and provides it to the application server 100 for use.
  • Each logical unit has a unique logical unit number (logical unit number, LUN). Since the application server 100 can directly perceive the logical unit number, those skilled in the art usually directly use LUN to refer to the logical unit.
  • each LUN has a LUN ID, which is used to identify the LUN. The specific location of data within a LUN can be determined by the start address and the length of the data. For the start address, those skilled in the art usually use the term logical block address (logical block address, LBA). It can be understood that the three factors LUN ID, LBA, and length identify a determined address segment.
  • a data access request generated by the application server usually carries the LUN ID, the LBA, and the length.
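Because a LUN ID, an LBA, and a length together identify a definite address segment, such a request can be modeled as a simple record. The sketch below is only illustrative; the type and field names are assumptions, not part of the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AddressSegment:
    lun_id: int   # identifies the logical unit
    lba: int      # start address inside the LUN (logical block address)
    length: int   # length of the addressed data

# A data access request generated by the application server carries these three fields.
req = AddressSegment(lun_id=1, lba=0x2000, length=8 * 1024)
```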
  • the number of logical blocks contained in a logical block group depends on which mechanism (also known as redundancy mode) is adopted to ensure data reliability.
  • the storage system uses a multi-copy mechanism or an erasure coding (erasure coding, EC) verification mechanism to store data.
  • the multi-copy mechanism refers to storing at least two copies of the same data. When one of the data copies is lost, other data copies can be used to recover. If the multi-copy mechanism is adopted, a logical block group includes at least two logical blocks, and each logical block is located on a different hard disk 134 .
  • the EC check mechanism refers to dividing the data to be stored into at least two data fragments and calculating the check fragments of the at least two data fragments according to a certain check algorithm; when one of the data fragments is lost, the remaining data fragments and the check fragments can be used to recover the data. If the EC check mechanism is adopted, a logical block group includes at least three logical blocks, and each logical block is located on a different hard disk 134.
  • taking the EC check mechanism as an example, multiple logical blocks from different hard disks are divided into a data group and a check group according to the configured RAID type.
  • the data group includes at least two logical blocks for storing data fragments, and the check group includes at least one logical block for storing check fragments of the data fragments.
  • when the data in the memory reaches a certain size, it is divided into multiple data fragments according to the set RAID type, the check fragments are obtained by calculation, and these data fragments and check fragments are sent to multiple different hard disks to be stored in a logical block group. After storage, these data fragments and check fragments constitute a stripe.
  • a logical block group can contain one or more stripes.
  • the data slices and check slices included in the stripes can be called stripe units, and the logical blocks of the stripe units constituting each stripe correspond to the physical blocks of different hard disks.
  • the size of a stripe unit is 8KB as an example for illustration, but it is not limited to 8KB.
  • one physical block is taken from each of six mechanical hard disks to form a logical block set (a subset of the storage pool), and then the logical block set is grouped based on the set RAID type (take RAID6 as an example).
  • chunk 0, chunk 1, chunk 2, and chunk 3 are data block groups
  • chunk 4 and chunk 5 are check block groups.
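The 4+2 grouping in this example can be pictured with a small sketch that splits a 32 KB buffer into four 8 KB data fragments and two check fragments. Real RAID 6 computes two independent check fragments (for example P as XOR and Q with a Reed-Solomon code); here both check fragments are simplified to placeholders purely to show the stripe layout, so this is not a usable RAID 6 implementation.

```python
STRIPE_UNIT = 8 * 1024  # 8 KB stripe unit, as in the example above

def split_into_stripe(buffer: bytes):
    """Split a 4 x 8 KB buffer into data fragments and two placeholder check fragments.

    chunk 0..3 receive the data fragments; chunk 4 and chunk 5 receive the check
    fragments. The check fragments here are simple XOR placeholders; real RAID 6
    computes P and Q with two independent codes so any two lost fragments can be rebuilt.
    """
    assert len(buffer) == 4 * STRIPE_UNIT
    data = [buffer[i * STRIPE_UNIT:(i + 1) * STRIPE_UNIT] for i in range(4)]
    p = bytes(a ^ b ^ c ^ d for a, b, c, d in zip(*data))  # parity placeholder (P0)
    q = p                                                  # placeholder only; Q differs in real RAID 6
    return data, [p, q]

data_frags, check_frags = split_into_stripe(bytes(32 * 1024))
```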
  • before sending data to the hard disk, the processor needs to determine whether there is a logical block group that has already been allocated. If there is, and that logical block group still has enough space to accommodate the data, the processor can instruct the hard disk to write the data into the allocated logical block group. Specifically, the processor obtains a segment of unused logical addresses from the logical address range of the allocated logical block group, carries the logical addresses in the write data request, and sends the request to the hard disk.
  • in the above example, if the processor determines that no allocated logical block group exists in the system, or that all allocated logical block groups are full, a new logical block group needs to be created.
  • the creation process may be that the processor determines that the remaining space of the system is sufficient to create a new logical block group according to its own record of the available space of each hard disk.
  • the processor respectively obtains a physical block from different hard disks, and after being mapped into logical blocks, constructs these logical blocks into a new logical block group according to the set RAID type.
  • Each logical block is assigned a segment of logical addresses, and the set of these logical addresses is the logical address of a new logical block group.
  • the mapping relationship between logical blocks and physical blocks also needs to be stored in memory for easy search.
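The allocation decision just described (reuse an allocated logical block group that still has room, otherwise create a new one) can be sketched as follows. The names and the dictionary layout are assumptions for the example; `create_plog` stands for the creation procedure in the preceding paragraphs.

```python
def allocate_logical_address(plogs, data_len, create_plog):
    """Return (plog, start_offset) for a write of data_len bytes.

    plogs is a list of dicts with 'capacity' and 'used' fields; create_plog() builds
    a new logical block group from free physical blocks, as described above.
    """
    for plog in plogs:
        if plog["capacity"] - plog["used"] >= data_len:   # allocated group with enough space
            start = plog["used"]
            plog["used"] += data_len
            return plog, start
    new_plog = create_plog()            # no usable group: create a new logical block group
    plogs.append(new_plog)
    new_plog["used"] = data_len
    return new_plog, 0

# Hypothetical usage: one existing group of 1 MB, writing 4 KB.
pool = [{"capacity": 1 << 20, "used": 0}]
plog, offset = allocate_logical_address(pool, 4096, lambda: {"capacity": 1 << 20, "used": 0})
```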
  • in order to ensure that the system always has enough available space for creating logical block groups, the processor can monitor the available space of each hard disk in real time or periodically, so as to know the available space of the entire system.
  • when the available space of the system falls below a set space threshold, garbage collection can be started.
  • for example, if the capacity of one mechanical hard disk is 128 GB and the total capacity of all mechanical hard disks included in the system is 1280 GB, the space threshold can be set to 640 GB. That is to say, when the data stored in the system reaches half of the total capacity, the remaining available space has dropped to the space threshold, and garbage collection can be performed at this time.
  • 640G is just an example of the spatial threshold, and the spatial threshold can also be set to other values.
  • in addition, when the used space of the system reaches a set space threshold, garbage collection can also be triggered.
  • in another embodiment, when the amount of invalid data contained in one or more stripes reaches a set threshold, garbage collection may also be started.
  • the processor can perform system garbage collection in units of logical block groups.
  • when data is written into a logical block group in the storage pool, the system controller sends a write data request to the hard disk, and the write data request carries the logical address of the data on the hard disk. When the data is read, the system controller reads the data according to the logical address of the data on the hard disk.
  • for example, there are logical block group 1 and logical block group 2 in the storage pool. Since data is written into logical block group 1 or logical block group 2 randomly, when the written data turns into garbage data, the garbage data is also evenly distributed across logical block group 1 and logical block group 2.
  • when garbage collection needs to be performed, it is determined whether the proportion of garbage data in a logical block group has reached a preset threshold, which may be 50%, for example.
  • as shown in FIG. 3, the proportion of garbage data in both logical block group 1 and logical block group 2 has reached 50%, so the valid data in logical block group 1 and logical block group 2 needs to be migrated to the newly created logical block group 3. After that, the data in logical block group 1 and logical block group 2 is released.
  • the process of garbage collection generates read-write amplification, and the read-write amplification D satisfies the following formula (1): D = (data migration amount) / (data release amount).
  • the data migration amount in the garbage collection process shown in FIG. 3 above is 8, and the data release amount is 8, so the read-write amplification generated by the garbage collection shown in FIG. 3 is 1. The larger the read-write amplification, the larger the amount of valid data that is migrated, which has a greater impact on services and is also detrimental to the service life of the hard disks.
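For reference, formula (1) and the two garbage collection examples worked in this application (FIG. 3 here and FIG. 7 later) can be written out as:

```latex
D = \frac{\text{data migration amount}}{\text{data release amount}} \tag{1}
\qquad
D_{\mathrm{FIG.\,3}} = \frac{8}{8} = 1,
\qquad
D_{\mathrm{FIG.\,7}} = \frac{1}{7} \approx 0.14
```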
  • referring to FIG. 4, a flow of the data writing method in the embodiment of the present application is introduced below. It should be understood that this embodiment can be applied to the systems shown in FIG. 1 and FIG. 2 above.
  • the system creates a storage pool based on the storage medium.
  • the storage pool includes multiple logical blocks.
  • the storage medium can be a mechanical hard disk, and the storage space of the logical block comes from the mechanical hard disk.
  • when the system needs to write the first data into the storage pool, the system obtains the first logical address of the first data; the first logical address is carried in the data write request. After obtaining the first logical address corresponding to the first data, the system determines whether second data is stored in the storage pool, where the logical address of the second data is the first logical address. In one manner, the second data may be looked up by means of a bitmap.
  • a logical address often corresponds to an attribute of the data. For example, in a bank database system, data with different attributes may include a user's ID number, the user's deposit balance, and the user's contact information, where the logical address of the data representing the user's ID number is logical address 1, the logical address of the data representing the user's deposit balance is logical address 2, and the logical address of the data representing the user's contact information is logical address 3.
  • because the attributes of the data differ, the probability that the data of each attribute changes also differs. If the data of a certain attribute has changed, the probability that the data of this attribute will continue to change is relatively high; if the data of a certain attribute has not changed, the probability that it will continue to change is relatively small. For example, a user's deposit balance often changes frequently, while the user's ID number is usually fixed.
  • all logical blocks in the storage pool are divided into a first set of logical blocks and a second set of logical blocks, and the first set of logical blocks and the second set of logical blocks respectively include a plurality of logical blocks.
  • when the first data is written, whether the first data is written into the first logical block set or the second logical block set is determined according to whether the second data exists in the storage pool.
  • referring to FIG. 5a, determining whether the second data exists in the storage pool can be implemented by a hotspot statistics module.
  • Each interval in the bitmap represents a logical address.
  • for example, the three intervals from left to right in the bitmap represent logical address 1, logical address 2, and logical address 3 respectively.
  • when the number in an interval is 0, it means that no data has been written to the logical address corresponding to that interval; when the number in an interval is 1, it means that data has been written to the logical address corresponding to that interval.
  • for example, when the number in the interval representing logical address 2 is updated from 0 to 1, it indicates that data has been written to logical address 2 in the storage pool.
  • when the first data is written, the hotspot statistics module determines, according to the bitmap, whether the second data exists in the storage pool. If the first logical address of the first data has not been written to before, that is, there is no second data, then after the first data is written, the hotspot statistics module can also modify the bitmap, thereby marking that the first logical address of the first data has been written to.
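A minimal sketch of such a hotspot statistics module is shown below, assuming one bit per logical address; the class name and the use of a Python `bytearray` are illustrative choices, not part of the application.

```python
class HotspotStats:
    """Bitmap-based check for whether a logical address has been written before."""

    def __init__(self, num_addresses):
        self.bits = bytearray((num_addresses + 7) // 8)   # one bit per logical address

    def was_written(self, addr):
        return (self.bits[addr // 8] >> (addr % 8)) & 1 == 1

    def mark_written(self, addr):
        self.bits[addr // 8] |= 1 << (addr % 8)

stats = HotspotStats(num_addresses=1024)
addr = 2                              # logical address 2 in the example above
is_hot = stats.was_written(addr)      # False on the first write -> write to the cold set
stats.mark_written(addr)              # later writes to addr are then treated as hot
```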
  • the second data exists in the storage pool, since the logical addresses of the first data and the second data are both the first logical address, it means that the first data and the second data belong to data of the same attribute.
  • the data of this attribute is updated from the second data to the first data, and the second data will become garbage data due to the writing of the first data, so the probability of the first data becoming garbage data in the future is relatively high.
  • both the first data and the second data belong to the user's deposit balance, and after the deposit balance is updated to the first data, there is a high probability that the deposit balance will continue to change. Based on this, the first data is written into the logical blocks in the first set of logical blocks.
  • if the second data is not stored in the storage pool, it means that the probability of the first data becoming garbage data is relatively small.
  • for example, the first data is a user's ID number that is written for the first time, and it usually does not change after being written. Based on this, the first data is written into a logical block in the second logical block set.
  • the first logical block set is used to store data with a high probability of becoming garbage data, and these data are also called hot data.
  • the second logical block set is used to store data with a low probability of becoming garbage data, and these data are called cold data.
  • FIG. 6 is a schematic diagram of data distribution in this embodiment.
  • data with a high probability of becoming junk data is written into the first logical block set
  • data with a low probability of becoming garbage data is written into the second logical block set. Therefore, most of the data in the first logical block set are garbage data, and most of the data in the second logical block set are valid data.
  • in a preferred manner, the storage pool includes two types of logical block groups: the logical blocks in one type of logical block group are composed of logical blocks in the first logical block set, and the logical blocks in the other type of logical block group are composed of logical blocks in the second logical block set.
  • of course, in an actual implementation, it may also be the case that, in one of the two types of logical block groups, most of the logical blocks come from the first logical block set while a small number come from the second logical block set, and in the other type most of the logical blocks come from the second logical block set while a small number come from the first logical block set.
  • FIG. 7 is another schematic diagram of garbage collection.
  • for example, in an actual implementation, part or all of the logical blocks in the first logical block set may constitute a first logical block group, and part or all of the logical blocks in the second logical block set may constitute a second logical block group. It should be understood that since the data in the first logical block group comes from the first logical block set, the proportion of garbage data in the first logical block group is relatively high; the data in the second logical block group comes from the second logical block set, so the proportion of garbage data in the second logical block group is relatively low.
  • as shown in FIG. 7, the proportion of garbage data in the first logical block group has reached 50%, so the valid data in the first logical block group needs to be migrated to a newly created logical block group, and all the data in the first logical block group is then released.
  • the proportion of garbage data in the second logical block group has not reached 50%, so no processing is required.
  • the amount of data migration in the garbage collection process shown in FIG. 7 is 1, and the amount of data release is 7; it can be seen from formula (1) that the read-write amplification generated by this garbage collection is about 0.14. It is not difficult to see that, in this embodiment, since the first logical block group contains mostly garbage data and the second logical block group contains mostly valid data, garbage collection usually only needs to migrate the valid data in the first logical block group, thereby greatly reducing read-write amplification.
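The collection step of FIG. 7 can be sketched as follows: only logical block groups whose garbage ratio has reached the threshold are collected, and only their valid data is migrated. The function name, the dictionary layout, and `create_plog` are assumptions made for the example, not the claimed implementation.

```python
def garbage_collect(plogs, create_plog, threshold=0.5):
    """Collect logical block groups whose garbage ratio has reached the threshold.

    Each plog is modeled as {'valid': [...], 'garbage_count': int}; create_plog()
    returns an empty group. Returns the read-write amplification
    D = data migration amount / data release amount for this pass.
    """
    migrated = released = 0
    for plog in list(plogs):
        total = len(plog["valid"]) + plog["garbage_count"]
        if total and plog["garbage_count"] / total >= threshold:
            target = create_plog()                 # newly created logical block group
            target["valid"].extend(plog["valid"])  # migrate only the valid data
            migrated += len(plog["valid"])
            released += plog["garbage_count"]      # garbage space reclaimed by releasing the old group
            plogs.remove(plog)
            plogs.append(target)
    return migrated / released if released else 0.0

# Mirrors the FIG. 7 example: 1 valid unit migrated, 7 garbage units released, D ~ 0.14.
plogs = [
    {"valid": ["v0"], "garbage_count": 7},                                        # first (hot) group
    {"valid": ["v1", "v2", "v3", "v4", "v5", "v6", "v7"], "garbage_count": 1},    # second (cold) group
]
print(garbage_collect(plogs, lambda: {"valid": [], "garbage_count": 0}))
```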
  • the aggregation of garbage data is improved, so the amount of data migration and read-write amplification can be reduced during garbage collection, thereby reducing the impact of garbage collection on business.
  • a storage device 800 in this embodiment of the present application includes a processing unit 801 .
  • the processing unit 801 is configured to obtain a first logical address corresponding to the first data, where the first logical address is indicated by an LBA and a logical unit number identifier.
  • the processing unit 801 is further configured to determine whether there is second data in the storage pool, and the logical address of the second data is the first logical address.
  • the processing unit 801 is further configured to write the first data into the first logical block set if the second data exists, and the first logical block set is used to store hot data.
  • the processing unit 801 is further configured to write the first data into a second logical block set if there is no second data, and the second logical block set is used to store cold data.
  • the processing unit 801 is further configured to migrate the first data to a newly created logical block set if the proportion of junk data in the first logical block set is greater than or equal to a preset threshold.
  • the processing unit 801 is further configured to release data in the first logical block set.
  • the data attributes of the first data and the second data are the same, and there is a corresponding relationship between the first logical address and the data attributes.
  • the processing unit 801 is further configured to create a storage pool, the storage pool includes a plurality of logical blocks, and the storage space of the logical blocks comes from a mechanical hard disk.
  • the disclosed system, device and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other division manners in actual implementation.
  • for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • the integrated unit is realized in the form of a software function unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solutions of the present application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, read-only memory), a random access memory (RAM, random access memory), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a data writing method and a related device, used to reduce read-write amplification during garbage collection. The method in the embodiments of the present application includes: obtaining a first logical address of first data; determining whether second data is stored in a storage pool, where the logical address of the second data is the same as the first logical address; and, if the second data is stored in the storage pool, writing the first data into a first logical block set of the storage pool, where the first logical block set is used to store hot data.

Description

一种数据写入方法以及相关设备
本申请要求于2021年10月21日提交中国国家知识产权局、申请号202111228801.X、申请名称为“一种数据写入方法以及相关设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及计算机领域,尤其涉及一种数据写入方法以及相关设备。
背景技术
基于存储设备可以构建抽象化的存储池,该存储池由多个逻辑块组构成。在将数据写入存储池时,如果写入的数据的逻辑地址与存储池中已经存在的某个目标数据的逻辑地址一致,这种情况为重复写入,重复写入的数据不会覆盖目标数据,而是写入存储池中的另一个位置,并将目标数据标记为垃圾数据。当重复写入的次数过多时,存储池中的空间会被大量消耗,为避免这种情况,需要执行垃圾回收(garbage collection,GC),将逻辑块组中中除垃圾数据之外的数据迁移到一个新的逻辑块组中,并释放原有逻辑块组中的所有数据。
在当前的技术当中,通常在***的业务空闲的时候执行GC,然而这种方案的数据迁移量较大,并且如果***不存在业务空闲的情况,如果执行GC会对当前的业务产生影响。
发明内容
本申请实施例提供了一种数据写入方法以及相关设备,用于降低垃圾回收时的读写放大。
本申请实施例第一方面提供了一种数据写入方法:
每个在存储池中的数据都有对应的逻辑地址,在第一数据写入存储池之前获取第一数据写入存储池中的第一逻辑地址。之后确定存储池中是否存储有第二数据,该第二数据的逻辑地址与第一逻辑地址相同。若存储池中存储有第二数据,则将第一数据写入存储池的第一逻辑块集合,第一逻辑块集合用于存储热数据。
本申请实施例中,写入第一逻辑块集合中的数据为重复写入的数据,也即是更新的数据。由于数据的属性与数据的逻辑地址存在对应关系,并且当某种属性的数据需要进行更新时,这种属性的数据在后续继续进行更新的可能性较高。因此在第一逻辑块集合中产生的垃圾数据的比例较高,因此在进行垃圾回收时,所产生的读写放大较小。
在一种可能的实现方式中,若存储池中未存储有第二数据,则将第一数据写入存储池的第二逻辑块集合,第二逻辑块集合用于存储冷数据。
本申请实施例中,写入第二逻辑块集合中的数据不是重复写入的数据,因此第二逻辑块集合中产生垃圾数据的比例较低,从而使得在进行垃圾回收时,所产生的读写放大较小。
在一种可能的实现方式中,如果第一逻辑块集合中垃圾数据的比例大于或等于预设阈值,则将第一数据迁移至新建的逻辑块集合,并释放第一逻辑块集合中的数据。
在一种可能的实现方式中,第一数据以及所述第二数据的数据属性相同,第一逻辑地 址与数据属性存在对应关系。
在一种可能的实现方式中,还需要创建存储池,该存储池包括多个逻辑块,逻辑块的存储空间来自机械硬盘。
本申请实施例第二方面提供了一种存储设备:
该存储设备包括多个功能模块,所述多个功能模块相互作用,实现上述第一方面及其各实施方式中的方法。多个功能模块可以基于软件、硬件或软件和硬件的结合实现,且所述多个功能模块可以基于具体实现进行任意组合或分割。
本申请实施例第三方面提供了一种存储设备:
包括处理器,处理器与存储器耦合,存储器用于存储指令,当指令被处理器执行时,使得显示设备执行如前述第一方面中的方法。
本申请实施例第四方面提供了一种计算机程序产品,包括代码,当代码在计算机上运行时,使得计算机运行如前述第一方面的方法。
本申请实施例第五方面提供了一种计算机可读存储介质,其上存储有计算机程序或指令,其特征在于,计算机程序或指令被执行时,其上存储有计算机程序或指令,计算机程序或指令被执行时,使得计算机执行如前述第一方面的方法。
附图说明
图1为本申请实施例中数据写入方法所应用的***示意图;
图2为本申请实施例中构建存储池的一个示意图;
图3为本申请实施例中垃圾回收的一个示意图;
图4为本申请实施例中数据写入方法的一个流程示意图;
图5a为本申请实施例中第一数据写入的流程示意图;
图5b为本申请实施例中根据位图查找第二数据的一个示意图;
图6为本申请实施例中第一逻辑块集合以及第二逻辑块集合中数据的分布示意图;
图7为本申请实施例中垃圾回收的另一示意图;
图8为本申请实施例中存储设备的一个结构示意图。
具体实施方式
下面结合附图,对本申请的实施例进行描述,显然,所描述的实施例仅仅是本申请一部分的实施例,而不是全部的实施例。本领域普通技术人员可知,随着技术发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、***、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
本申请实施例提供了一种数据写入方法,用于减少垃圾回收时的数据迁移量。
本申请实施例可以应用于如图1所示的***中,在该***中,用户通过应用程序来存取数据。运行这些应用程序的计算机被称为“应用服务器”。应用服务器100可以是物理机,也可以是虚拟机。物理应用服务器包括但不限于桌面电脑、服务器、笔记本电脑以及移动设备。应用服务器通过光纤交换机110访问存储设备120以存取数据。然而,交换机110只是一个可选设备,应用服务器100也可以直接通过网络与存储设备120通信。或者,光纤交换机110也可以替换成以太网交换机、InfiniBand交换机、RoCE(RDMA over Converged Ethernet)交换机等。
图1所示的存储设备120是一个集中式存储***。集中式存储***的特点是有一个统一的入口,所有从外部设备来的数据都要经过这个入口,这个入口就是集中式存储***的引擎121。引擎121是集中式存储***中最为核心的部件,许多存储***的高级功能都在其中实现。
如图1所示,引擎121中有一个或多个控制器,图1以引擎包含两个控制器为例予以说明。控制器0与控制器1之间具有镜像通道,那么当控制器0将一份数据写入其内存124后,可以通过镜像通道将数据的副本发送给控制器1,控制器1将所述副本存储在自己本地的内存124中。由此,控制器0和控制器1互为备份,当控制器0发生故障时,控制器1可以接管控制器0的业务,当控制器1发生故障时,控制器0可以接管控制器1的业务,从而避免硬件故障导致整个存储设备120的不可用。当引擎121中部署有4个控制器时,任意两个控制器之间都具有镜像通道,因此任意两个控制器互为备份。
引擎121还包含前端接口125和后端接口126,其中前端接口125用于与应用服务器100通信,从而为应用服务器100提供存储服务。而后端接口126用于与硬盘134通信,以扩充存储***的容量。通过后端接口126,引擎121可以连接更多的硬盘134,从而形成一个非常大的存储池。
在硬件上,如图1所示,控制器0至少包括处理器123、内存124。处理器123是一个中央处理器(central processing unit,CPU),用于处理来自存储***外部(服务器或者其他存储***)的数据访问请求,也用于处理存储***内部生成的请求。示例性的,处理器123通过前端端口125接收应用服务器100发送的写数据请求时,会将这些写数据请求中的数据暂时保存在内存124中。当内存124中的数据总量达到一定阈值时,处理器123通过后端端口将内存124中存储的数据发送给硬盘134进行持久化存储。
内存124是指与处理器直接交换数据的内部存储器,它可以随时读写数据,而且速度很快,作为操作***或其他正在运行中的程序的临时数据存储器。内存包括至少两种存储器,例如内存既可以是随机存取存储器,也可以是只读存储器(Read Only Memory,ROM)。举例来说,随机存取存储器是动态随机存取存储器(Dynamic Random Access Memory,DRAM),或者存储级存储器(Storage Class Memory,SCM)。DRAM是一种半导体存储器,与大部分随机存取存储器(Random Access Memory,RAM)一样,属于一种易失性存储器(volatile memory)设备。SCM是一种同时结合传统储存装置与存储器特性的复合型储存技术,存储级存储器能够提供比硬盘更快速的读写速度,但存取速度上比DRAM慢,在成本上也比DRAM更为便宜。然而,DRAM和SCM在本实施例中只是示例性的说明,内存还可以包括其他随机 存取存储器,例如静态随机存取存储器(Static Random Access Memory,SRAM)等。而对于只读存储器,举例来说,可以是可编程只读存储器(Programmable Read Only Memory,PROM)、可抹除可编程只读存储器(Erasable Programmable Read Only Memory,EPROM)等。另外,内存124还可以是双列直插式存储器模块或双线存储器模块(Dual In-line Memory Module,简称DIMM),即由动态随机存取存储器(DRAM)组成的模块,还可以是固态硬盘(Solid State Disk,SSD)。实际应用中,控制器0中可配置多个内存124,以及不同类型的内存124。本实施例不对内存113的数量和类型进行限定。此外,可对内存124进行配置使其具有保电功能。保电功能是指***发生掉电又重新上电时,内存124中存储的数据也不会丢失。具有保电功能的内存被称为非易失性存储器。
内存124中存储有软件程序,处理器123运行内存124中的软件程序可实现对硬盘的管理。例如将硬盘抽象化为存储池。
控制器1(以及其他图1中未示出的控制器)的硬件组件和软件结构与控制器0类似,这里不再赘述。
需要说明的是,图1中只示出了一个引擎121,然而在实际应用中,存储***中可包含两个或两个以上引擎121,多个引擎121之间做冗余或者负载均衡。
图1所示的是一种盘控一体的集中式存储***,在实际的实现中,集中式存储***也可以是盘控分离的形式。
请参阅图2,图2是基于图1所示的***构建存储池的一个示意图。应理解,图2中所示的应用服务器100与图1中所示的应用服务器100类似,图2中所示的硬盘134与图1中所示的硬盘134类似,图2中所示的存储设备120与图1中所示的存储设备120类似。如图2所示,本申请实施例中的硬盘134可以是任意类型的硬盘,例如可以是固态硬盘或者是机械硬盘。每个硬盘134被划分为若干个物理块(chunk)202,物理块202映射成逻辑块203,逻辑块203进而构成一个存储池204。存储池204用于提供存储空间,该存储空间实际来源于***中所包含的硬盘134。当然,并非所有硬盘134都需要提供空间给存储池204。在实际应用中,存储***中可包含一个或多个存储池204,一个存储池204包括部分或全部硬盘134。来自不同硬盘或者不同存储节点201的多个逻辑块可以组成一个逻辑块组205(plog),该逻辑块组205是存储池204的最小分配单位。
当存储服务层向存储池204申请存储空间时,存储池可以向存储服务层提供一个或多个逻辑块组。存储服务层进一步将逻辑块组提供的存储空间虚拟化为逻辑单元(logical unit,LU)206提供给应用服务器100使用。每个逻辑单元具有唯一的逻辑单元号(logical unit number,LUN)。由于应用服务器100能直接感知到逻辑单元号,本领域技术人员通常直接用LUN代指逻辑单元。每个LUN具有LUN ID,用于标识LUN。数据位于一个LUN内的具***置可以由起始地址和该数据的长度(length)确定。对于起始地址,本领域技术人员通常称作逻辑块地址(logical block address,LBA)。可以理解的是,LUN ID、LBA和长度这三个因素标识了一个确定的地址段。应用服务器生成的数据访问请求,通常在该请求中携带LUN ID、LBA和长度。
一个逻辑块组所包含的逻辑块的数量取决于采用何种机制(又称为冗余模式)来保证 数据可靠性。通常情况下,为了保证数据的可靠性,存储***采用多副本机制或者纠删码(erasure coding,EC)校验机制来存储数据。多副本机制是指存储至少两份相同的数据副本,当其中一份数据副本丢失时,可以使用其他数据副本恢复。如果采用多副本机制,一个逻辑块组至少包含两个逻辑块,每个逻辑块位于不同硬盘134上。EC校验机制是指将待存储的数据划分为至少两个数据分片,按照一定的校验算法计算至少两个数据分片的校验分片,当其中一个数据分片丢失时,可以利用另一个数据分片以及校验分片恢复数据。如果采用EC校验机制,那么一个逻辑块组至少包含三个逻辑块,每个逻辑块位于不同硬盘134上。
以EC校验机制为例,来自不同硬盘的多个逻辑块根据设定的RAID类型被划分为数据组和校验组。数据组中包括至少两个逻辑块,用于存储数据分片,校验组中包括至少一个逻辑块,用于存储数据分片的校验分片。当数据在内存中凑满一定大小时,可以根据设定的RAID类型切分为多个数据分片,并计算获得校验分片,将这些数据分片和校验分片发送给多个不同硬盘,以保存在逻辑块组中。存储之后,这些数据分片和校验分片就构成一个分条。一个逻辑块组可包含一个或多个分条。分条所包含的数据分片和校验分片都可以被称作分条单元,构成每个分条的分条单元所属的逻辑块对应不同硬盘的物理块。本实施例中,以一个分条单元的大小为8KB为例予以说明,但不限定为8KB。举个例子,假设从6个机械硬盘中各取出一个物理块构成逻辑块集合(存储池的子集),然后对逻辑块集合基于设定的RAID类型(以RAID6为例)进行分组。其中,chunk 0、chunk 1、chunk 2和chunk 3为数据块组,chunk 4和chunk 5为校验块组。当内存中存储的数据达到8KB×4=32KB时,将数据划分为4个数据分片(分别为数据分片0、数据分片1、数据分片2、数据分片3),每个数据分片的大小为8KB,然后计算获得2个校验分片(分别是P0和Q0),每个校验分片的大小也是8KB。处理器将这些数据分片和校验分片发送给硬盘,以实现将数据存储在逻辑块组中。可以理解的是,按照RAID6的冗余保护机制,任意两个数据分片或者校验分片失效时,都可以根据剩下的数据分片或者校验分片重构出失效的单元。
另外,处理器在向硬盘发送数据之前,需要判断是否存在一个已经分配好的逻辑块组,如果有,并且该逻辑块组仍然有足够的空间容纳该数据,那么处理器可以指令硬盘将数据写入已经分配的逻辑块组中。具体的,处理器从已经分配的逻辑块组的逻辑地址区间中获取一段未使用的逻辑地址,将逻辑地址携带在写数据请求中发送给硬盘。
在上述例子中,如果处理器确定***中并不存在已经分配的逻辑块组,或者已分配的逻辑块组均已写满数据,那么就需要创建一个新的逻辑块组。其创建过程可以是,处理器根据自己对每个硬盘所拥有的可用空间的记录,确定***剩余空间足以创建一个新的逻辑块组。接下来,处理器分别从不同的硬盘中获取一个物理块,经映射为逻辑块之后,再将根据设定的RAID类型将这些逻辑块构建成一个新的逻辑块组。每个逻辑块均分配有一段逻辑地址,这些逻辑地址集合就是新的逻辑块组的逻辑地址。另外,逻辑块和物理块之间的映射关系也需要保存在内存中,方便查找。
为了保证***中始终有足够的可用空间用于创建逻辑块组,处理器可以实时地或者定期地对每个硬盘的可用空间进行监控,从而获知整个***的可用空间。当***的可用空间 低于设定的空间阈值时,可以启动垃圾回收。例如,一个机械硬盘的容量是128G,***中所包含的所有机械硬盘的总容量是1280G,所述空间阈值可以设置为640G。也就是说,当该***存储的数据达到所述总容量的一半时,剩余的可用空间也达到了所述空间阈值,此时则可以执行垃圾回收。可以理解的是,640G只是空间阈值的一个示例,所述空间阈值也可以设置为其他数值。另外,当***的已使用空间达到设定的空间阈值时,也可以触发垃圾回收。另外,在本发明另一个实施例中,当一个或多个分条所包含的无效数据的数据量达到设定阈值时,也可以启动垃圾回收。处理器可以以逻辑块组为单位执行***垃圾回收。
数据在写入存储池中的逻辑块组时,由***控制器向硬盘发送写数据请求,并在写数据请求中携带该数据在硬盘上的逻辑地址。数据在被读取时,***控制器根据数据在硬盘上的逻辑地址,对数据进行读取。
请参阅图3,下面对垃圾回收的过程进行示例性说明:
数据在写入时,需要确定数据所写入的逻辑地址,该逻辑地址通过LBA以及逻辑单元号标识(LUN ID)所指示。在一种数据写入方式中,如果需要将数据A写入逻辑地址1中,该逻辑地址1由LUN ID1以及LBA1所指示。如果在存储池中存在数据B,且数据B的逻辑地址也为逻辑地址1。则数据A不会覆盖数据B,而是写入存储池中的另一个位置,并将数据B标识为垃圾数据,数据A则为有效数据。当然,如果后续写入了数据C,且数据C的逻辑地址同样为逻辑地址1,那么数据A以及数据B都将成为垃圾数据,这时数据C为有效数据。数量过多的垃圾数据会极大地消耗存储池中的空间,为了确保存储池的空间充足,需要进行针对垃圾数据进行回收。应理解,本申请实施例可以应用于采用上述数据写入方式的***中。
示例性的,在存储池中存在逻辑块组1以及逻辑块组2,由于数据在写入时会随机写入逻辑块组1或者逻辑块组2,因此当写入的数据转变为垃圾数据时,垃圾数据也会均匀分布在逻辑块组1以及逻辑块组2中。当需要执行垃圾回收时,判断逻辑块组中的垃圾数据的占比是否达到了预设的阈值,该预设的阈值例如可以是50%。如图3所示,逻辑块组1以及逻辑块组2中垃圾数据的占比都达到了50%,因此需要将逻辑块组1以及逻辑块组2中的有效数据迁移到新建的逻辑块组3中。之后,释放逻辑块组1以及逻辑块组2中的数据。垃圾回收的过程会产生读写放大,读写放大D满足如下公式(1):
D = 数据迁移量 / 数据释放量        (1)
上述图3所示的垃圾回收过程中的数据迁移量为8,数据释放量为8,因此上述图3所示的垃圾回收所产生的读写放大为1。读写放大越大,说明有效数据的迁移量越大,从而对业务产生较大的影响,并且也不利于硬盘的寿命。
请参阅图4,下面开始对本申请实施例中的数据写入方法的一个流程进行介绍,应理解,本实施例可以应用于上述图1以及图2所示的***中。
401、获取第一数据的第一逻辑地址;
***基于存储介质创建存储池,该存储池中包括多个逻辑块,存储介质具体可以是机械硬盘,逻辑块的存储空间来自于机械硬盘。当***需要将第一数据写入存储池时,*** 获取第一数据的第一逻辑地址。第一逻辑地址携带在数据写入请求中,***可以根据数据写入请求获取第一逻辑地址。
402、确定存储池中是否存储有第二数据;
获取第一数据对应的第一逻辑地址之后,***确定存储池中是否存储有第二数据,该第二数据的逻辑地址为第一逻辑地址。在一种方式中,查找第二数据的方式可以是通过位图实现。
403、若存储池中存储有第二数据,则将第一数据写入第一逻辑块集合。
逻辑地址与数据的属性往往存在对应的关系,例如在银行的数据库***中,不同属性的数据例如可以包括用户的身份证号码、用户的存款余额以及用户的联系方式,其中表示用户的身份证号码的数据的逻辑地址为逻辑地址1,表示用户的存款余额的数据的逻辑地址为逻辑地址2,表示用户的联系方式的数据的逻辑地址为逻辑地址3。由于数据的属性不同,因此各个属性的数据发生变化的概率也不同,并且如果某种属性的数据发生过变化,则该属性的数据后续继续发生变化的概率较大,如果某种属性的数据没有发生过变化,则该属性的数据后续继续发生变化的概率较小。例如用户的存款余额往往会经常性地发生变化,而用户的身份证号码往往是固定不变的。
基于此,将存储池中的所有逻辑块分为第一逻辑块集合以及第二逻辑块集合,第一逻辑块集合以及第二逻辑块集合分别包括多个逻辑块。在第一数据写入时,根据存储池中是否存在第二数据确定将第一数据写入第一逻辑块集合或者第二逻辑块集合。
请参阅图5a,确定存储池中是否存在第二数据可以通过热点统计模块实现。示例性的,请参阅图5b,在位图中的每一个区间代表了一个逻辑地址,例如在位图中从左往右的3个区间分别代表了逻辑地址1、逻辑地址2以及逻辑地址3。当区间中的数字为0时,代表该区间对应的逻辑地址未被写入数据,当区间中的数字为1时,代表该区间对应的逻辑地址已经被写入了数据。例如当代表逻辑地址2的区间中的数字由0更新为1时,说明在存储池中的逻辑地址2已经被写入了数据。在第一数据写入时,热点统计模块根据位图确定存储池中是否存在第二数据。如果第一数据的第一逻辑地址在之前未被写入过,也即不存在第二数据,那么在第一数据写入之后,热点统计模块也可以对位图进行修改,从而标识出第一数据的第一逻辑地址已经被写入过。
如果在存储池中存在第二数据,由于第一数据与第二数据的逻辑地址都为第一逻辑地址,说明第一数据与第二数据属于相同属性的数据。该属性的数据由第二数据更新为第一数据,第二数据由于第一数据的写入将成为垃圾数据,因此第一数据在后续成为垃圾数据的概率比较大。例如第一数据与第二数据都属于用户的存款余额,存款余额更新为第一数据之后,大概率还会继续发生变化。基于此,将第一数据写入第一逻辑块集合中的逻辑块中。
如果在存储池中未存储有第二数据,则说明第一数据在后续成为垃圾数据的概率比较小,例如第一数据为初次写入的用户的身份证号码,在写入之后往往不再变化。基于此,将第一数据写入第二逻辑块集合中的逻辑块中。
基于上述描述可以得知,第一逻辑块集合用于存储成为垃圾数据的概率较大的数据, 这些数据也称之为热数据。第二逻辑块集合用于存储成为垃圾数据的概率较小的数据,这些数据称之为冷数据。
请参阅图6,图6为本实施例中数据分布的示意图。如图6所示,基于上述的数据写入方式,成为垃圾数据概率较大的数据都写入了第一逻辑块集合,成为垃圾数据概率较小的数据都写入了第二逻辑块集合。因此第一逻辑块集合中的数据大部分都为垃圾数据,而第二逻辑块集合中的数据大部分都为有效数据。应理解,在优选的方式中,在存储池中包括两类逻辑块组,其中一类逻辑块组中的逻辑块由第一逻辑块集合中的逻辑块构成,另一类逻辑块组中的逻辑块由第二逻辑块集合中的逻辑块构成。当然,在实际的实现中,也可以是两类逻辑块组中的其中一类逻辑块中的大部分逻辑块包括第一逻辑块集合中的逻辑块,而小部分逻辑块包括第二逻辑块集合中的逻辑块。另一类逻辑块中的大部分逻辑块包括第二逻辑块集合中的逻辑块,而小部分逻辑块包括第一逻辑块集合中的逻辑块。
请参阅图7,图7为垃圾回收的另一个示意图。示例性的,在实际的实现当中,第一逻辑块集合中的部分或全部逻辑块可以构成第一逻辑块组,第二逻辑块集合中的部分或全部逻辑块可以构成第二逻辑块组。应理解,由于第一逻辑块组中的数据来自于第一逻辑块集合中的数据,因此第一逻辑块组中的垃圾数据的占比较高;第二逻辑块组中的数据来自于第二逻辑块集合中的数据,因此第二逻辑块组中的垃圾数据的占比较低。如图7所示,在第一逻辑块组中垃圾数据的占比达到了50%,因此需要将第一逻辑块组中的有效数据迁移到新建的逻辑块组中,并将第一逻辑块组中的全部数据进行释放。第二逻辑块组中垃圾数据的占比还未达到50%,因此不需要进行处理。
图7中所示的垃圾回收过程的数据迁移量为1,数据释放量为7,由公式(1)可知,垃圾回收所产生的读写放大为0.14。不难看出,在本实施例中,由于第一逻辑块组中基本都是垃圾数据,第二逻辑块组中基本都是有效数据,因此在进行垃圾回收时,往往只需要对第一数据块中的有效数据进行迁移,从而大幅降低读写放大。
本申请实施例中,提高了垃圾数据的聚集性,因此在进行垃圾回收的时候可以减少数据的迁移量,减少读写放大,进而降低垃圾回收对业务的影响。
上面对本申请实施例中的数据写入方法进行了介绍,下面对本申请实施例中的存储设备进行介绍:
请参阅图8,本申请实施例中的存储设备800包括处理单元801。
处理单元801,用于获取第一数据对应的第一逻辑地址,第一逻辑地址通过LBA以及逻辑单元号标识指示。
处理单元801,还用于确定存储池中是否存在第二数据,第二数据的逻辑地址为第一逻辑地址。
处理单元801,还用于若存在第二数据,则将第一数据写入第一逻辑块集合,第一逻辑块集合用于存储热数据。
在一种可能的实现中,
处理单元801,还用于若不存在第二数据,则将第一数据写入第二逻辑块集合,第二逻辑块集合用于存储冷数据。
在一种可能的实现中,
处理单元801,还用于若第一逻辑块集合中垃圾数据的占比大于或等于预设阈值,则将第一数据迁移至新建的逻辑块集合。
处理单元801,还用于释放第一逻辑块集合中的数据。
在一种可能的实现中,
第一数据以及所述第二数据的数据属性相同,第一逻辑地址与数据属性存在对应关系。
在一种可能的实现中,
处理单元801,还用于创建存储池,存储池包括多个逻辑块,逻辑块的存储空间来自机械硬盘。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的***,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的***,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个***,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,read-only memory)、随机存取存储器(RAM,random access memory)、磁碟或者光盘等各种可以存储程序代码的介质。

Claims (12)

  1. A data writing method, characterized by comprising:
    obtaining a first logical address of first data;
    determining whether second data is stored in a storage pool, wherein the logical address of the second data is the same as the first logical address; and
    if the second data is stored in the storage pool, writing the first data into a first logical block set of the storage pool, wherein the first logical block set is used to store hot data.
  2. The method according to claim 1, characterized in that the method further comprises:
    if the second data is not stored in the storage pool, writing the first data into a second logical block set of the storage pool, wherein the second logical block set is used to store cold data.
  3. The method according to claim 1 or 2, characterized in that the method further comprises:
    if the proportion of garbage data in the first logical block set is greater than or equal to a preset threshold, migrating the first data to a newly created logical block set; and
    releasing the data in the first logical block set.
  4. The method according to claim 3, characterized in that the first data and the second data have the same data attribute, and there is a corresponding relationship between the first logical address and the data attribute.
  5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
    creating the storage pool, wherein the storage pool comprises multiple logical blocks, and the storage space of the logical blocks comes from mechanical hard disks.
  6. A storage device, characterized by comprising:
    a processing unit, configured to obtain a first logical address of first data;
    the processing unit is further configured to determine whether second data is stored in a storage pool, wherein the logical address of the second data is the same as the first logical address; and
    the processing unit is further configured to, if the second data is stored in the storage pool, write the first data into a first logical block set of the storage pool, wherein the first logical block set is used to store hot data.
  7. The device according to claim 6, characterized in that
    the processing unit is further configured to, if the second data is not stored in the storage pool, write the first data into a second logical block set of the storage pool, wherein the second logical block set is used to store cold data.
  8. The device according to claim 6 or 7, characterized in that
    the processing unit is further configured to, if the proportion of garbage data in the first logical block set is greater than or equal to a preset threshold, migrate the first data to a newly created logical block set; and
    the processing unit is further configured to release the data in the first logical block set.
  9. The device according to claim 8, characterized in that the first data and the second data have the same data attribute, and there is a corresponding relationship between the first logical address and the data attribute.
  10. The device according to any one of claims 6 to 9, characterized in that
    the processing unit is further configured to create the storage pool, wherein the storage pool comprises multiple logical blocks, and the storage space of the logical blocks comes from mechanical hard disks.
  11. A storage device, characterized by comprising a processor, wherein the processor is coupled with a memory, the memory is used to store instructions, and when the instructions are executed by the processor, the storage device is caused to perform the method according to any one of claims 1 to 5.
  12. A computer-readable storage medium having computer instructions or a program stored thereon, characterized in that, when the computer instructions or program are executed, a computer is caused to perform the method according to any one of claims 1 to 5.
PCT/CN2022/093193 2021-10-21 2022-05-17 一种数据写入方法以及相关设备 WO2023065654A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111228801.X 2021-10-21
CN202111228801.XA CN116009761A (zh) 2021-10-21 2021-10-21 一种数据写入方法以及相关设备

Publications (1)

Publication Number Publication Date
WO2023065654A1 true WO2023065654A1 (zh) 2023-04-27

Family

ID=86021656

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/093193 WO2023065654A1 (zh) 2021-10-21 2022-05-17 一种数据写入方法以及相关设备

Country Status (2)

Country Link
CN (1) CN116009761A (zh)
WO (1) WO2023065654A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116540949B (zh) * 2023-07-04 2024-01-12 苏州浪潮智能科技有限公司 一种独立冗余磁盘阵列存储空间动态分配方法和装置
CN117785070B (zh) * 2024-02-23 2024-05-24 杭州海康威视数字技术股份有限公司 数据存储控制方法及装置

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150046670A1 (en) * 2013-08-08 2015-02-12 Sangmok Kim Storage system and writing method thereof
CN105677242A (zh) * 2015-12-31 2016-06-15 杭州华为数字技术有限公司 冷热数据的分离方法和装置
US20160274802A1 (en) * 2015-03-19 2016-09-22 Samsung Electronics Co., Ltd. Method of operating memory controller, data storage device including the same, and data processing system including the same
CN106406753A (zh) * 2016-08-30 2017-02-15 深圳芯邦科技股份有限公司 一种数据存储方法及数据存储装置
CN109542358A (zh) * 2018-12-03 2019-03-29 浪潮电子信息产业股份有限公司 一种固态硬盘冷热数据分离方法、装置及设备
CN110674056A (zh) * 2019-09-02 2020-01-10 新华三大数据技术有限公司 一种垃圾回收方法及装置
CN111045598A (zh) * 2019-10-10 2020-04-21 深圳市金泰克半导体有限公司 数据存储方法、装置
US20210223994A1 (en) * 2020-01-16 2021-07-22 Kioxia Corporation Memory system and method of controlling nonvolatile memory


Also Published As

Publication number Publication date
CN116009761A (zh) 2023-04-25

Similar Documents

Publication Publication Date Title
US10031703B1 (en) Extent-based tiering for virtual storage using full LUNs
US8984221B2 (en) Method for assigning storage area and computer system using the same
JP5681413B2 (ja) 書込み可能コピーオンライト・スナップショット機能のためのi/oレイテンシーの削減
WO2016046911A1 (ja) ストレージシステム及びストレージシステムの管理方法
US8392670B2 (en) Performance management of access to flash memory in a storage device
US20070162692A1 (en) Power controlled disk array system using log storage area
WO2023065654A1 (zh) 一种数据写入方法以及相关设备
US11861204B2 (en) Storage system, memory management method, and management node
EP2302500A2 (en) Application and tier configuration management in dynamic page realloction storage system
WO2015015550A1 (ja) 計算機システム及び制御方法
US20120117320A1 (en) Latency reduction associated with a response to a request in a storage system
US8694563B1 (en) Space recovery for thin-provisioned storage volumes
JPWO2009069326A1 (ja) ネットワークブートシステム
US20110082950A1 (en) Computer system and computer system input/output method
CN111949210A (zh) 分布式存储***中元数据存储方法、***及存储介质
US9766824B2 (en) Storage device and computer system
US10884924B2 (en) Storage system and data writing control method
US11842051B2 (en) Intelligent defragmentation in a storage system
US11100008B2 (en) Efficient memory usage for snapshots
EP3889785B1 (en) Stripe reconstruction method in storage system and striping server
US20200117381A1 (en) Storage system and storage control method
US11347641B2 (en) Efficient memory usage for snapshots based on past memory usage
WO2023020136A1 (zh) 存储***中的数据存储方法及装置
CN116483263A (zh) 一种存储***的存储设备、存储***
JP5597266B2 (ja) ストレージシステム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22882279; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2022882279; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2022882279; Country of ref document: EP; Effective date: 20240503)
NENP Non-entry into the national phase (Ref country code: DE)