WO2024040857A1 - Disk array initialization method and system, electronic device, and storage medium - Google Patents

Info

Publication number
WO2024040857A1
Authority
WO
WIPO (PCT)
Prior art keywords
write
data
stripe
blocks
block
Prior art date
Application number
PCT/CN2023/070636
Other languages
French (fr)
Chinese (zh)
Inventor
夏方健
苏涛
Original Assignee
苏州元脑智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州元脑智能科技有限公司
Publication of WO2024040857A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0632 Configuration or reconfiguration of storage systems by initialisation or re-initialisation of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD

Definitions

  • the present application relates to the field of computer systems and storage, and in particular to a disk array initialization method, system, electronic equipment and non-volatile computer-readable storage media.
  • RAID (Redundant Arrays of Independent Disks, disk array)
  • hot spare disks which are used as array redundancy and fault tolerance; when a hard disk in the array is suddenly damaged, the hot spare disk will replace the damaged hard disk.
  • the system needs to temporarily suspend the IO operation and wait for the area to complete initialization before the data is written, that is, initialization before writing, which reduces the execution efficiency of the system in reading disk array data and results in low overall system efficiency.
  • this application provides a disk array initialization method, which method includes:
  • the stripe contains multiple blocks and at least one parity block
  • the type of IO write operation includes non-full stripe write operation and full stripe write operation;
  • the blocks in the stripe are divided into write blocks and write zero blocks according to the IO write operations;
  • The write blocks, write zero blocks, and check block each send a corresponding write request to the lower layer. After the lower layer returns write-success prompts, the bitmap corresponding to the stripe is updated to reflect the completion of the initialization.
  • the blocks in the stripe are divided into write blocks and write zero blocks according to the IO write operation, including:
  • dividing the blocks in the stripe into write blocks and write zero blocks according to the IO write operation also includes:
  • in response to the size of the IO data to be written being consistent with the size of the write block, the write block creates a write request based on the IO data to be written; or in response to the size of the IO data to be written being smaller than the size of the write block, the IO data to be written is padded with all-zero data up to the size of the write block, and the write block creates a corresponding write request based on the padded IO data.
  • the method further includes:
  • All-zero data is pre-stored in the fixed area, and the granularity of all-zero data is consistent with the block size.
  • the check value corresponding to the check block is determined based on the IO data issued by the IO write operation and the pre-stored all-zero data, including:
  • the verification block generates corresponding write requests based on the verification value.
  • the method further includes:
  • the IO data to be written in each block of the stripe is determined based on the IO data sent by the IO write operation;
  • the verification block generates corresponding write requests based on the verification value.
  • the method further includes: when the IO write operation received by the stripe is a full stripe write operation, the method further includes:
  • Each block and verification block send corresponding write requests to the lower layer respectively.
  • the bitmap corresponding to the strip is updated to reflect the completion of the initialization.
  • the method further includes:
  • the stripe position and bitmap determine whether the stripe corresponding to the IO read operation command has been initialized.
  • the method further includes:
  • in response to the stripe corresponding to the IO read operation command having completed initialization, the read request is sent to the lower layer to return the corresponding IO data; or in response to the stripe corresponding to the IO read operation command not having completed initialization, all-zero data is returned.
  • the method is applied to a disk array with parity blocks, and the method further includes:
  • the lost data blocks are restored.
  • determining the type of IO write operation received by the strip includes:
  • in response to the size of the delivered IO data being the same as the corresponding stripe size, the IO write operation is determined to be a full stripe write operation; or in response to the size of the delivered IO data being smaller than the corresponding stripe size, the IO write operation is determined to be a non-full stripe write operation.
  • the IO write operation includes at least writing address information and IO data
  • IO read operations include at least reading address information.
  • this application provides a disk array initialization system, which includes:
  • the data preparation module is used to create a disk array and divide it into stripes.
  • a stripe contains multiple blocks and at least one parity block;
  • the data analysis module is used to determine the type of IO write operations received by the strip.
  • the types of IO write operations include non-full strip write operations and full strip write operations;
  • the data analysis module is also used to divide the blocks in the strip into write blocks and write zero blocks based on the IO write operations when the IO write operations received by the strip are non-full strip write operations;
  • the data analysis module is also used to determine the check value corresponding to the check block based on the IO data issued by the IO write operation and the pre-stored all-zero data;
  • the data processing module is used to send corresponding write requests to the lower layer by writing blocks, writing zero blocks, and verifying blocks respectively. After the lower layer returns a successful write request prompt, it updates the bitmap corresponding to the strip to reflect the completion of the initialization.
  • this application provides an electronic device.
  • the electronic device includes:
  • one or more processors
  • the memory is used to store computer readable instructions.
  • when the program instructions are read and executed by one or more processors, the steps in the disk array initialization method provided by any of the above embodiments are performed.
  • this application also provides a non-volatile computer-readable storage medium, which stores computer-readable instructions.
  • the computer-readable instructions cause the computer to execute the steps in the disk array initialization method provided in any of the above embodiments.
  • Figure 1 is a schematic diagram of a disk array initialization process provided by some embodiments of the present application.
  • Figure 2 is a flow chart of a disk array initialization method provided by some embodiments of the present application.
  • Figure 3 is an architecture diagram of a disk array initialization system provided by some embodiments of the present application.
  • Figure 4 is a structural diagram of an electronic device provided by some embodiments of the present application.
  • background initialization means that after the disk array is created, no host IO (Input/Output) operation is sent to the stripe, and the stripe is written with zeros under the internal scheduling of the disk array; once the zero writing is completed, the initialization is completed.
  • the IO is processed directly; however, in this case, it takes a long time to wait for the initialization to be completed.
  • Initialization before writing means that when the area of the disk array accessed by the host IO operation has not been initialized, the system needs to temporarily suspend the IO operation and wait for the area to complete initialization before processing the IO operation; however, this slows down host IO processing.
  • when the disk array has not completed initialization, the above two situations will be encountered when the host IO operation is issued.
  • the storage must first write zeros to the disk area at block granularity and then process the host IO operation, which causes bandwidth congestion and IO operation queuing, further affecting performance.
  • this application provides a disk array initialization method that optimizes the algorithm of initialization before writing, reduces the waiting time for IO operations issued by the host, and changes the traditional process of writing zeros first and then writing IO data, in order to further improve the performance of the storage system.
  • embodiments of the present application provide a storage system. Specifically, the process of initializing a disk array in the storage system disclosed in this application includes:
  • the disk array is divided into strips.
  • the disk array contains multiple strips, and each strip contains multiple blocks and a parity block.
  • the host delivers the IO write operation to the disk array; the IO write operation contains at least write address information and IO data; the host determines the stripe at the corresponding location based on the write address information, and delivers the IO write operation to that specific stripe.
  • After the stripe receives the IO write operation, it first determines the type of the IO write operation: it obtains the size of the IO data issued by the IO write operation and the corresponding stripe size; if the size of the IO data issued by the host is the same as the corresponding stripe size, that is, the IO data size is consistent with the stripe width, the IO write operation is determined to be a full stripe write operation; if the size of the IO data issued by the host is smaller than the stripe size, the IO write operation is determined to be a non-full stripe write operation.
  • the above initialization operations include:
  • the IO data delivered is not necessarily an integer multiple of the block size.
  • the size of the IO data delivered may also be 250k. In this case, the IO data still needs to be written to two blocks in the stripe, but the space of one of those blocks is not completely filled (that is, it is not a full-block write).
  • whether the write block is a full-block write is determined, that is, whether the size of the IO data to be written in the write block is consistent with the size of the write block. If it is consistent, it is a full-block write; if the size of the IO data to be written in the write block is smaller than the write block, it is a non-full block write, and in this case the all-zero data stored in advance needs to be read from the disk array memory and written to the free space in the block. When the disk array is created, a fixed area is set aside in the memory to store all-zero data; the granularity of the all-zero data is consistent with the block size.
  • when the issued IO data is a full-block write, the data to be written corresponding to the write block is the directly issued IO data; when the issued IO data is not a full-block write, the data to be written corresponding to the write block is the IO data padded with all-zero data.
  • Other blocks that do not need to write IO data are recorded as write zero blocks. All-zero data is taken from the memory to replace the original old data in these write zero blocks. That is, the data to be written corresponding to the write zero blocks is all-zero data.
  • determine the data to be written in the check block (i.e., the check value).
  • Each block generates a write request based on the corresponding data to be written and sends it to the lower layer. After the lower layer returns a successful prompt for the write request of each block, the initialization of the stripe is completed.
  • the IO data to be written in each block is determined directly from the delivered IO data, divided according to the block size. An XOR operation is then performed on the IO data to be written in each block to obtain the check value corresponding to the check block. Each block then generates a write request based on its IO data to be written and sends it to the lower layer, and the check block generates a write request based on the check value and sends it to the lower layer; after the lower layer returns success prompts for all write requests, the initialization is completed. It is worth noting that there are no write zero blocks in this case.
  • this application also discloses that when the host issues an IO read operation command, the storage system first parses the read address information corresponding to the IO read operation to determine the stripe position corresponding to the IO read operation; then, based on the stripe position and the bitmap, it determines whether the corresponding stripe has been initialized; if the bitmap flag bit is 1, the corresponding stripe has been initialized; if the bitmap flag bit is 0, the corresponding stripe has not been initialized.
  • the storage system sends a read request to the lower layer and obtains the IO data corresponding to the IO read operation returned by the lower layer. If the stripe is not initialized, the storage system directly returns all-zero data.
  • this application also discloses the block recovery of lost data based on the check value and IO data in the strip.
  • the strip shown in Table 1 is used as an example, in which the strip contains 5 blocks.
  • This application differs from the traditional disk array initialization method. It removes the original background initialization task, which is time-consuming and resource-intensive, and optimizes initialization before writing: the process of writing zeros to the hard disk and reading them back is replaced by reading all-zero data from memory, and the traditional process of writing zeros block by block and then writing IO data is also changed, which greatly saves time and avoids the performance loss caused by IO queuing in the storage system.
  • embodiments of the present application also provide a disk array initialization method, as shown in Figure 2, specifically as follows:
  • the stripe contains multiple blocks and at least one parity block
  • the above method further includes:
  • All-zero data is pre-stored in the fixed area, and the granularity of all-zero data is consistent with the block size.
  • the type of IO write operation includes non-full stripe write operation and full stripe write operation;
  • determining the type of IO write operation received by the stripe includes:
  • the IO write operation is determined to be a full stripe write operation
  • the IO write operation is determined to be a non-full stripe write operation.
  • the above method further includes:
  • the IO data to be written in each block of the stripe is determined based on the IO data sent by the IO write operation;
  • the verification block generates corresponding write requests based on the verification value.
  • when the IO write operation received by the stripe is a full stripe write operation, the above method further includes:
  • Each block and verification block send a corresponding write request to the lower layer respectively. After the lower layer returns a successful write request prompt, the bitmap corresponding to the strip is updated to reflect the completion of the initialization.
  • the blocks in the stripe are divided into write blocks and write zero blocks according to the IO write operations;
  • the blocks in the stripe are divided into write blocks and write zero blocks according to the IO write operation, including:
  • dividing the blocks in the stripe into write blocks and write zero blocks according to the IO write operation also includes:
  • the write block creates a write request based on the IO data to be written
  • when the size of the IO data to be written is smaller than the size of the write block, the IO data to be written is padded with all-zero data up to the size of the write block, and the write block creates a corresponding write request based on the padded IO data.
  • the check value corresponding to the check block is determined based on the IO data issued by the IO write operation and the pre-stored all-zero data, including:
  • the verification block generates corresponding write requests based on the verification value.
  • the IO write operation at least includes writing address information and IO data
  • IO read operations include at least reading address information.
  • the above method further includes:
  • the stripe position and bitmap determine whether the stripe corresponding to the IO read operation command has been initialized.
  • the above method further includes:
  • the above method is applied to a disk array with parity blocks, and the above method further includes:
  • embodiments of the present application provide a disk array initialization system.
  • the system includes:
  • the data preparation module 310 is used to create a disk array and divide it into strips.
  • the stripes include multiple blocks and at least one parity block;
  • the data analysis module 320 is used to determine the type of IO write operations received by the stripe.
  • the types of IO write operations include non-full stripe write operations and full stripe write operations;
  • the data analysis module 320 is also configured to divide the blocks in the strip into write blocks and write zero blocks according to the IO write operations when the IO write operation received by the strip is a non-full strip write operation;
  • the data analysis module 320 is also used to determine the check value corresponding to the check block based on the IO data issued by the IO write operation and the pre-stored all-zero data;
  • the data processing module 330 is used to send corresponding write requests to the lower layer through the write blocks, write zero blocks, and check block respectively. After the lower layer returns write-success prompts, it updates the bitmap corresponding to the stripe to reflect the completion of the initialization.
  • the data analysis module 320 is also used to obtain the IO data size issued by the IO write operation and the block size in the stripe; the data analysis module 320 is also used to determine, according to the IO data size and the block size, the number of write blocks corresponding to the IO write operation; the data analysis module 320 is also used to determine that the blocks in the stripe other than the write blocks and the check block are write zero blocks.
  • the data analysis module 320 is also used to determine the IO data to be written in each write block;
  • the data analysis module 320 is also used to compare the size of the IO data to be written and the size of the written blocks;
  • the data analysis module 320 is also configured to use the write block to create a write request based on the IO data to be written when the size of the IO data to be written is consistent with the size of the write block;
  • the data analysis module 320 is also configured to, when the size of the IO data to be written is smaller than the size of the writing block, fill the IO data to be written with all zero data according to the size of the writing block, and then use the writing block. Create a corresponding write request based on the completed IO data to be written.
  • the data preparation module 310 is also used to divide a fixed area in the disk array memory; all-zero data is pre-stored in the fixed area, and the all-zero data granularity is consistent with the block size.
  • the data analysis module 320 is also used to perform an XOR operation on the IO data to be written corresponding to each write block and the all-zero data to determine the check value; the check block is generated based on the check value Corresponding write request.
  • when the IO write operation received by the stripe is a full stripe write operation, the data analysis module 320 is also used to determine the IO data to be written in each block of the stripe based on the IO data issued by the IO write operation; the data analysis module 320 is also used to perform an XOR operation on the IO data to be written in each block to determine the check value; the data analysis module 320 is also used to generate, via the check block, a corresponding write request based on the check value.
  • the data processing module 330 is also configured to send a corresponding write request to the lower layer for each block and the check block, and, after the lower layer returns write-success prompts, to update the bitmap corresponding to the stripe to reflect that the initialization is completed.
  • the data processing module 330 is also configured to determine the stripe position based on the read address information corresponding to the IO read operation command when receiving the IO read operation command; the data processing module 330 is also configured to determine, based on the stripe position and the bitmap, whether the stripe corresponding to the IO read operation command has completed initialization.
  • the data processing module 330 is also used to send a read request to the lower layer to return the corresponding IO data;
  • the data processing module 330 is also used to obtain the returned all-zero data.
  • the system is applied to a disk array with parity blocks, and the data processing module 330 is also used to recover blocks of lost data based on the parity value and IO data in the strip.
  • the data analysis module 320 is also used to obtain the IO data size delivered by the IO write operation and the corresponding stripe size; if the size of the IO data delivered is the same as the corresponding stripe size, the data analysis module 320 The IO write operation is determined to be a full stripe write operation; if the size of the delivered IO data is smaller than the corresponding stripe size, the data analysis module 320 determines that the IO write operation is a non-full stripe write operation.
  • the IO write operation at least includes writing address information and IO data; the IO read operation at least includes reading address information.
  • embodiments of the present application provide an electronic device, including: one or more processors; and a memory associated with the one or more processors, the memory being used to store computer-readable instructions which, when read and executed by the one or more processors, perform the steps in the method provided by any of the above embodiments.
  • FIG. 4 exemplarily shows the architecture of the electronic device, which may specifically include a processor 410, a video display adapter 411, a disk drive 412, an input/output interface 413, a network interface 414, and a memory 420.
  • the above-mentioned processor 410, video display adapter 411, disk drive 412, input/output interface 413, network interface 414, and the memory 420 can be communicatively connected through a bus 430.
  • the processor 410 can be implemented using a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes relevant programs to implement the technical solutions provided in this application.
  • the memory 420 can be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, etc.
  • the memory 420 may store an operating system 421 for controlling execution of the electronic device 400 and a basic input output system (BIOS) 422 for controlling low-level operations of the electronic device 400.
  • a web browser 423, a data storage management system 424, an icon font processing system 425, etc. can also be stored.
  • the above-mentioned icon font processing system 425 can be an application program that specifically implements the above-mentioned steps in the embodiment of the present application.
  • the relevant program code is stored in the memory 420 and called and executed by the processor 410.
  • the input/output interface 413 is used to connect the input/output module to realize information input and output.
  • the input/output module can be configured in the device as a component (not shown in the figure), or can be externally connected to the device to provide corresponding functions.
  • Input devices can include keyboards, mice, touch screens, microphones, various sensors, etc., and output devices can include monitors, speakers, vibrators, indicator lights, etc.
  • the network interface 414 is used to connect a communication module (not shown in the figure) to realize communication interaction between this device and other devices.
  • the communication module can realize communication through wired means (such as USB, network cable, etc.) or wireless means (such as mobile network, WIFI, Bluetooth, etc.).
  • Bus 430 includes a path that carries information between various components of the device (e.g., processor 410, video display adapter 411, disk drive 412, input/output interface 413, network interface 414, and memory 420).
  • the electronic device 400 can also obtain information on specific receiving conditions from the virtual resource object receiving condition information database for condition judgment, and so on.
  • the above device may also include other components necessary for proper execution.
  • the above-mentioned device may also include only the components necessary to implement the solution of the present application, and does not necessarily include all the components shown in the drawings.
  • embodiments of the present application also provide a non-volatile computer-readable storage medium that stores computer-readable instructions, and the computer-readable instructions cause the computer to execute the steps in the method provided in any of the above embodiments.
  • the present application can be implemented by means of software plus the necessary general hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product.
  • the computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., and includes a number of instructions to cause a computer device (which can be a personal computer, a cloud server, or a network device, etc.) to execute the methods described in various embodiments or certain parts of the embodiments of this application.
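
The read path described in the definitions above (locate the stripe from the read address, check its bitmap flag, then either forward the read to the lower layer or return all-zero data) can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the names `handle_read` and `lower_layer_read`, the tiny block size, the four-data-block layout, and the assumption that a read never crosses a stripe boundary are all hypothetical.

```python
BLOCK_SIZE = 4                      # bytes per block (tiny, for illustration only)
DATA_BLOCKS = 4                     # data blocks per stripe (hypothetical layout)
STRIPE_SIZE = BLOCK_SIZE * DATA_BLOCKS


def handle_read(address, length, bitmap, lower_layer_read):
    """Serve a read that is assumed to lie within a single stripe."""
    # Parse the read address to find which stripe it falls in.
    stripe_index = address // STRIPE_SIZE
    if bitmap[stripe_index] == 1:
        # Flag bit 1: the stripe is initialized, so forward the read.
        return lower_layer_read(address, length)
    # Flag bit 0: the stripe is not initialized, so answer with zeros
    # without issuing any request to the lower layer.
    return bytes(length)
```

Under these assumptions an uninitialized stripe costs no disk access on read, which is consistent with the text's statement that the storage system directly returns all-zero data in that case.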

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

The present application relates to the fields of computer systems and storage, and in particular to a disk array initialization method and system, an electronic device, and a storage medium. The method comprises: creating a disk array and dividing it into stripes, each stripe comprising a plurality of blocks and at least one check block; determining the type of an I/O write operation received by a stripe; when the I/O write operation received by the stripe is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the I/O write operation; determining a check value according to the I/O data delivered by the I/O write operation and pre-stored all-zero data; and sending corresponding write requests to a lower layer for the write blocks, the write-zero blocks, and the check block respectively, and, after the lower layer returns a write-success indication, updating a bitmap corresponding to the stripe to reflect that initialization is complete.

Description

Disk array initialization method, system, electronic device, and storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202211032632.7, entitled "Disk array initialization method, system, electronic device and storage medium", filed with the China Patent Office on August 26, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer systems and storage, and in particular to a disk array initialization method, system, electronic device, and non-volatile computer-readable storage medium.
Background
As global informatization continues to advance, massive amounts of high-value data are being generated at an accelerating pace, and the volume of data that enterprises produce and need to retain is rising accordingly. This exponential growth brings many challenges. In the era of big data, the explosive growth of data puts storage under increasing pressure; users no longer ask only for capacity but also for faster reads and writes so that user tasks are processed more quickly. For data safety, RAID (Redundant Arrays of Independent Disks, disk array) adds hot spare disks to provide redundancy and fault tolerance for the array. When a hard disk in the array suddenly fails, the hot spare disk replaces it, and the data of the failed disk is rebuilt onto the hot spare from the other member disks using a parity-based recovery algorithm; this is data reconstruction. Data reconstruction depends heavily on the consistency of stripe data: if the stripe data is inconsistent, erroneous data will be recovered. To prevent this, after a RAID array is created, all regions of the RAID must be initialized once by writing zeros. The zero-write operation is performed at block granularity so that the data in every stripe of the RAID remains consistent.
However, the inventors realized that during existing disk array initialization, the system may access data in the disk array. When the region being accessed has not yet been initialized, the system must temporarily suspend the access and wait for that region to finish initializing before the data can be written, i.e., initialization before writing. This lowers the efficiency with which the system reads disk array data and in turn reduces the overall working efficiency of the system.
Therefore, there is an urgent need for an initialization method that improves performance during disk array initialization, and thereby the working efficiency of the storage system, to solve the above technical problem in the prior art.
Summary of the Invention
In a first aspect, the present application provides a disk array initialization method, the method including:
creating a disk array and dividing it into stripes, each stripe containing multiple blocks and at least one check block (parity block);
determining the type of an IO write operation received by a stripe, the type being either a non-full-stripe write operation or a full-stripe write operation;
when the IO write operation received by the stripe is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation;
determining the check value for the check block from the IO data delivered by the IO write operation and pre-stored all-zero data; and
sending, for the write blocks, the write-zero blocks, and the check block respectively, corresponding write requests to the lower layer, and, after the lower layer returns a write-success indication, updating the bitmap corresponding to the stripe to reflect that initialization is complete.
In some embodiments, when the IO write operation is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation includes:
obtaining the size of the IO data delivered by the IO write operation and the block size within the stripe;
deriving the number of write blocks for the IO write operation from the IO data size and the block size; and
determining that the blocks in the stripe other than the write blocks and the check block are write-zero blocks.
In some embodiments, when the IO write operation is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation further includes:
determining the IO data to be written to each write block;
comparing the size of the IO data to be written with the size of the write block; and
in response to the size of the IO data to be written being equal to the size of the write block, creating a write request for the write block from the IO data to be written; or, in response to the size of the IO data to be written being smaller than the size of the write block, padding the IO data to be written with all-zero data up to the size of the write block, and then creating the corresponding write request for the write block from the padded IO data.
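The division into write blocks and write-zero blocks, together with the zero padding of a partially filled write block, can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the function name `plan_partial_write`, its parameters, and the assumption that the write begins at the first block of the stripe are all hypothetical, since the text does not specify how a write is positioned inside a stripe.

```python
def plan_partial_write(io_data: bytes, block_size: int, data_blocks: int) -> list[bytes]:
    """Return one payload per data block of a stripe for a non-full-stripe write.

    Assumption (not stated in the source): the write starts at block 0 of the stripe.
    """
    if block_size <= 0 or data_blocks <= 0:
        raise ValueError("block_size and data_blocks must be positive")
    stripe_size = block_size * data_blocks
    if not 0 < len(io_data) < stripe_size:
        raise ValueError("expected a non-full-stripe write")

    # Stands in for the pre-stored all-zero data, whose granularity equals the block size.
    zero_block = bytes(block_size)

    # Number of write blocks = ceil(io_size / block_size).
    write_blocks = -(-len(io_data) // block_size)

    payloads = []
    for i in range(write_blocks):
        piece = io_data[i * block_size:(i + 1) * block_size]
        # A full piece is used as is; a short piece is padded with zeros to the block size.
        payloads.append(piece + zero_block[len(piece):])

    # Every remaining data block is a write-zero block.
    payloads.extend(zero_block for _ in range(data_blocks - write_blocks))
    return payloads
```

With a block size of 128 and five data blocks, a 250-byte write yields two write blocks (the second padded with 6 zero bytes) and three write-zero blocks.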
In some embodiments, the method further includes:
reserving a fixed region in the memory of the disk array; and
pre-storing all-zero data in the fixed region, the granularity of the all-zero data being equal to the block size.
In some embodiments, determining the check value for the check block from the IO data delivered by the IO write operation and the pre-stored all-zero data includes:
performing an XOR operation on the IO data to be written to each write block and the all-zero data to determine the check value; and
generating, for the check block, the corresponding write request based on the check value.
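As a hedged illustration of the check-value step, the sketch below XORs the payloads of all data blocks of a stripe byte by byte. The helper names `xor_blocks` and `check_value` are hypothetical. Because XOR with zero leaves a value unchanged, the write-zero blocks do not alter the result, which is why the check value can be computed from the delivered IO data and the in-memory zero data without reading anything from disk.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def check_value(payloads: list[bytes]) -> bytes:
    """XOR all data-block payloads of a stripe into one check value."""
    if not payloads:
        raise ValueError("at least one block is required")
    return reduce(xor_blocks, payloads)
```

For example, XORing two write blocks with any number of all-zero blocks gives the same check value as XORing the two write blocks alone.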
In some embodiments, the method further includes:
when the IO write operation received by the stripe is a full-stripe write operation, determining, from the IO data delivered by the IO write operation, the IO data to be written to each block of the stripe;
performing an XOR operation on the IO data to be written to each block to determine the check value; and
generating, for the check block, the corresponding write request based on the check value.
In some embodiments, when the IO write operation received by the stripe is a full-stripe write operation, the method further includes:
sending, for each block and the check block respectively, the corresponding write request to the lower layer, and, after the lower layer returns a write-success indication, updating the bitmap corresponding to the stripe to reflect that initialization is complete.
In some embodiments, the method further includes:
upon receiving an IO read operation command, determining the stripe position from the read address information of the IO read operation command; and
determining, from the stripe position and the bitmap, whether the stripe targeted by the IO read operation command has completed initialization.
In some embodiments, the method further includes:
in response to the stripe targeted by the IO read operation command having completed initialization, sending a read request to the lower layer to return the corresponding IO data; or, in response to the stripe targeted by the IO read operation command not having completed initialization, returning all-zero data.
In some embodiments, the method is applied to a disk array with check blocks, and the method further includes:
recovering a block whose data has been lost, based on the check value and the IO data in the stripe.
In some embodiments, determining the type of the IO write operation received by the stripe includes:
obtaining the size of the IO data delivered by the IO write operation and the size of the corresponding stripe; and
in response to the delivered IO data size being equal to the corresponding stripe size, determining that the IO write operation is a full-stripe write operation; or, in response to the delivered IO data size being smaller than the corresponding stripe size, determining that the IO write operation is a non-full-stripe write operation.
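The size comparison that distinguishes the two write types can be sketched in a few lines. The names `WriteType` and `classify_write` are hypothetical, and raising an error for sizes outside the range (0, stripe size] is an assumption of this sketch; the text only describes the equal and smaller cases.

```python
from enum import Enum


class WriteType(Enum):
    FULL_STRIPE = "full_stripe"
    NON_FULL_STRIPE = "non_full_stripe"


def classify_write(io_size: int, stripe_size: int) -> WriteType:
    """Classify an IO write by comparing its data size with the stripe size."""
    if io_size == stripe_size:
        return WriteType.FULL_STRIPE
    if 0 < io_size < stripe_size:
        return WriteType.NON_FULL_STRIPE
    # Assumption: larger or empty writes are split or rejected before this point.
    raise ValueError("io_size must be in the range 1..stripe_size")
```

A caller would dispatch on the result: the non-full-stripe path pads and zero-fills blocks, while the full-stripe path only splits the data and computes the check value.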
In some embodiments, the IO write operation includes at least write address information and IO data;
the IO read operation includes at least read address information.
In a second aspect, the present application provides a disk array initialization system, the system including:
a data preparation module, configured to create a disk array and divide it into stripes, each stripe containing multiple blocks and at least one check block;
a data analysis module, configured to determine the type of an IO write operation received by a stripe, the type being either a non-full-stripe write operation or a full-stripe write operation;
the data analysis module being further configured to, when the IO write operation received by the stripe is a non-full-stripe write operation, divide the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation;
the data analysis module being further configured to determine the check value for the check block from the IO data delivered by the IO write operation and pre-stored all-zero data; and
a data processing module, configured to send, for the write blocks, the write-zero blocks, and the check block respectively, corresponding write requests to the lower layer, and, after the lower layer returns a write-success indication, to update the bitmap corresponding to the stripe to reflect that initialization is complete.
In a third aspect, the present application provides an electronic device, the electronic device including:
one or more processors; and
a memory associated with the one or more processors, the memory being configured to store computer-readable instructions which, when read and executed by the one or more processors, perform the steps of the disk array initialization method provided in any of the above embodiments.
In a fourth aspect, the present application further provides a non-volatile computer-readable storage medium storing computer-readable instructions, the computer-readable instructions causing a computer to execute the steps of the disk array initialization method provided in any of the above embodiments.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
Figure 1 is a schematic flow diagram of disk array initialization provided by some embodiments of the present application;
Figure 2 is a flowchart of a disk array initialization method provided by some embodiments of the present application;
Figure 3 is an architecture diagram of a disk array initialization system provided by some embodiments of the present application;
Figure 4 is a structural diagram of an electronic device provided by some embodiments of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
It should be understood that, in the description of this application, unless the context clearly requires otherwise, words such as "include" and "comprise" throughout the specification and claims are to be construed in an inclusive sense rather than an exclusive or exhaustive sense; that is, in the sense of "including but not limited to".
It should also be understood that the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of this application, unless otherwise stated, "multiple" means two or more.
Note that the terms "S1", "S2", and the like are used only to label steps. They do not denote any particular sequence or order and do not limit this application; they merely make the method easier to describe and are not to be understood as indicating the order of the steps. In addition, the technical solutions of the various embodiments can be combined with one another, provided that the combination can be realized by those of ordinary skill in the art. Where a combination of technical solutions is contradictory or cannot be realized, the combination is deemed not to exist and falls outside the scope of protection claimed by this application.
As described in the background, traditional disk array initialization takes two forms: background initialization and initialization before writing. Background initialization means that, after the disk array is created and while no host IO (Input/Output) operation is delivered to a stripe, the stripe is zeroed under the internal scheduling of the disk array. Initialization is complete once the zeros are written, and IO operations delivered afterwards are processed directly; however, completing initialization this way takes a long time. Initialization before writing means that when the region of the disk array accessed by a host IO operation has not yet been initialized, the system must temporarily suspend the IO operation, wait for the region to finish initializing, and only then process the IO operation; this slows host IO processing. When the disk array has not finished initializing, a delivered host IO operation encounters both situations. With the two combined, the storage must first write zeros to the disk region at block granularity and only then process the host IO operation, which causes bandwidth contention and IO queuing and further degrades performance.
To solve the above technical problem, this application provides a disk array initialization method that optimizes the initialization-before-writing algorithm, reduces the time that host IO operations spend waiting, and removes the traditional requirement of writing zeros before writing IO data, thereby further improving the performance of the storage system.
In some embodiments, as shown in Figure 1, an embodiment of the present application provides a storage system. Specifically, the process of initializing a disk array in the storage system disclosed in this application includes:
S10. Determine the type of the IO write operation.
Specifically, after a disk array is created in the storage system, it is divided into stripes. The disk array contains multiple stripes, and each stripe contains multiple blocks and one check block. The host delivers an IO write operation to the disk array; the IO write operation contains at least write address information and IO data. The host determines the stripe at the corresponding position from the write address information and delivers the IO write operation to that stripe. After the stripe receives the IO write operation, the type of the operation is determined first: the size of the IO data delivered by the IO write operation and the size of the corresponding stripe are obtained. If the size of the IO data delivered by the host equals the size of the corresponding stripe, i.e., the IO data size matches the stripe width, the IO write operation is determined to be a full-stripe write operation; if the size of the IO data delivered by the host is smaller than the size of the stripe, the IO write operation is determined to be a non-full-stripe write operation.
S20. When the IO write operation is a non-full-stripe write operation, initialize the stripe.
Specifically, the initialization includes:
S21. Determine whether the IO data fills whole blocks, and determine the data to be written to each block of the stripe.
First, the IO data of the IO operation and the size of the blocks in the stripe are obtained, and the number of write blocks for the IO write operation is calculated from the IO data size and the block size. For example, if the delivered IO data is 256k and the block size is 128k, the IO data is written to two blocks in the stripe, and these two blocks are recorded as write blocks. The delivered IO data is not necessarily an integer multiple of the block size, however. If the delivered IO data is 250k, it still has to be written to two blocks in the stripe (128k in the first and 122k in the second), but one of them is not completely filled (a partial-block write). Therefore, after the number of blocks to be written is determined, it must also be determined whether each write block is a full-block write, i.e., whether the size of the IO data to be written to the write block equals the size of the write block. If it does, the write is a full-block write. If the IO data to be written to the write block is smaller than the write block, the write is a partial-block write, and the pre-stored all-zero data is read and written into the remaining space of the block. When the disk array is created, a fixed region is reserved in memory for storing all-zero data; the granularity of the all-zero data equals the block size. In short, for a full-block write, the data to be written to the write block is the delivered IO data itself; for a partial-block write, the data to be written to the write block is the IO data padded with all-zero data. The other blocks, which do not need to receive IO data, are recorded as write-zero blocks; all-zero data is taken from memory to replace the old data in these blocks, i.e., the data to be written to a write-zero block is all-zero data. Finally, the data to be written to the check block (the check value) is determined. For a full-block write, the check value is determined by XORing the IO data to be written to the write blocks with the all-zero data in memory; for a partial-block write, the check value is determined by XORing the IO data to be written to the write blocks, after padding with all-zero data, with the all-zero data in memory.
S22. Each block generates a write request from its data to be written and delivers it to the lower layer. After the lower layer returns a write-success indication for every block, initialization of the stripe is complete.
S30. When the IO write operation is a full-stripe write operation, initialize the stripe.
Specifically, because the whole stripe is written, stripe data cannot be inconsistent in this case. The delivered IO data is therefore divided by block size to determine the IO data to be written to each block. The IO data to be written to each block is then XORed to obtain the check value for the check block. Each block then generates a write request from its IO data and delivers it to the lower layer, and the check block generates a write request from the check value and delivers it to the lower layer. Initialization is complete once the lower layer has returned a write-success indication for all write requests. Note that there are no all-zero blocks in this case.
S40. After stripe initialization is complete, update the bitmap corresponding to the stripe, i.e., update the flag bit in the bitmap that records whether the stripe has been initialized, so that it reflects the stripe's initialization status.
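The final part of the flow, issuing one write request per block and setting the stripe's bitmap flag only after every request succeeds, might look like the sketch below. Everything here is illustrative: `StripeBitmap`, `write_stripe`, and the `lower_write` callable (which stands in for the unspecified lower layer and returns `True` on success) are hypothetical names, and the bit-packed layout is an assumption of this sketch.

```python
from typing import Callable


class StripeBitmap:
    """One bit per stripe: 1 = initialized, 0 = not yet initialized."""

    def __init__(self, stripe_count: int) -> None:
        self._bits = bytearray((stripe_count + 7) // 8)

    def mark_initialized(self, stripe: int) -> None:
        self._bits[stripe // 8] |= 1 << (stripe % 8)

    def is_initialized(self, stripe: int) -> bool:
        return bool(self._bits[stripe // 8] & (1 << (stripe % 8)))


def write_stripe(
    stripe: int,
    payloads: list[bytes],
    check: bytes,
    bitmap: StripeBitmap,
    lower_write: Callable[[int, int, bytes], bool],
) -> bool:
    """Send one write request per data block plus one for the check block.

    The bitmap flag is set only if the lower layer reports success for all requests.
    """
    # The check block is addressed here as the block after the last data block (assumption).
    requests = [*enumerate(payloads), (len(payloads), check)]
    results = [lower_write(stripe, index, data) for index, data in requests]
    if all(results):
        bitmap.mark_initialized(stripe)
        return True
    return False
```

Leaving the flag at 0 when any request fails keeps the stripe in the uninitialized state, so a later write repeats the full initialization rather than trusting partially written data.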
In addition, this application discloses that when the host issues an IO read operation command, the storage system first parses the read address information of the IO read operation to determine the stripe position targeted by the IO read operation, and then determines from the stripe position and the bitmap whether the corresponding stripe has completed initialization. A bitmap flag of 1 indicates that the corresponding stripe has completed initialization; a flag of 0 indicates that it has not. If the stripe has completed initialization, the storage system sends a read request to the lower layer and obtains the IO data for the IO read operation returned by the lower layer; if the stripe has not completed initialization, the storage system directly returns all-zero data.
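A minimal sketch of this read path is given below. It assumes a linear mapping from address to stripe index (address divided by stripe size) and a read that stays within one stripe; neither detail is given in the text. The function name `read_io` and the `lower_read` callable are hypothetical.

```python
from typing import Callable, Sequence


def read_io(
    address: int,
    length: int,
    stripe_size: int,
    bitmap: Sequence[int],
    lower_read: Callable[[int, int], bytes],
) -> bytes:
    """Serve a read from the lower layer only if the target stripe is initialized."""
    # Assumption: stripes are laid out linearly, so the index is address // stripe_size.
    stripe = address // stripe_size
    if bitmap[stripe] == 1:
        return lower_read(address, length)
    # Uninitialized stripe: answer with zeros without touching the disks.
    return bytes(length)
```

Returning zeros for an uninitialized stripe matches what the stripe would contain after a conventional zero-fill, so the host sees the same data either way.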
Further, this application discloses recovering a block whose data has been lost, based on the check value and the IO data in the stripe. Take the stripe shown in Table 1 as an example, where the stripe contains five blocks D1-D5 and one check block P. If the IO data of the IO write operation issued by the host is to be written to D1-D5, then P = D1 XOR D2 XOR D3 XOR D4 XOR D5. If D1 is lost and needs to be recovered, its data is obtained from D1 = P XOR D2 XOR D3 XOR D4 XOR D5.
Table 1
D1  D2  D3  D4  D5  P
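The recovery identity for the stripe in Table 1 can be checked with a short sketch. The function name `recover_block` is hypothetical; the logic is only the XOR relation stated in the text, applied byte by byte to the check block and the surviving data blocks.

```python
from functools import reduce
from operator import xor


def recover_block(check: bytes, surviving: list[bytes]) -> bytes:
    """Rebuild one lost data block from the check block and the surviving data blocks."""
    blocks = [check, *surviving]
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same size")
    # XOR the bytes at each offset across all blocks.
    return bytes(reduce(xor, column) for column in zip(*blocks))
```

The same function rebuilds any single lost block, not only D1, because XOR is commutative and every block cancels itself out.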
Unlike traditional disk array initialization methods, this application removes the time-consuming and resource-intensive background initialization task and optimizes initialization before writing: the process of writing zeros to the hard disk and reading them back is replaced by reading all-zero data held in memory, and the traditional requirement that zeros be written to a block before IO data is written is eliminated. This saves considerable time and avoids the performance loss caused by IO queuing in the storage system.
在一些实施例中,对应上述实施例,本申请实施例还提供了一种磁盘阵列初始化方法,如图2所示,具体如下:In some embodiments, corresponding to the above embodiments, embodiments of the present application also provide a disk array initialization method, as shown in Figure 2, specifically as follows:
2100、创建磁盘阵列并进行条带划分,条带包含多个分块以及至少一个校验分块;2100. Create a disk array and divide it into stripes. The stripe contains multiple blocks and at least one parity block;
在一些实施方式中,上述方法还包括:In some embodiments, the above method further includes:
2110、在磁盘阵列内存中划分出固定区域;2110. Divide a fixed area in the disk array memory;
2120、固定区域中预先存储有全零数据,全零数据粒度与分块大小一致。2120. All-zero data is pre-stored in the fixed area, and the granularity of all-zero data is consistent with the block size.
2200、判断条带接收到的IO写操作的类型,IO写操作的类型包括非满条带写操作以及满条带写操作;2200. Determine the type of IO write operation received by the stripe. The type of IO write operation includes non-full stripe write operation and full stripe write operation;
在一些实施方式中,判断条带接收到的IO写操作的类型,包括:In some implementations, determining the type of IO write operation received by the stripe includes:
2210、获取IO写操作下发的IO数据大小以及对应的条带大小;2210. Obtain the IO data size issued by the IO write operation and the corresponding stripe size;
2220、若下发的IO数据大小与对应的条带大小相同,则判定IO写操作为满条带写 操作;2220. If the size of the delivered IO data is the same as the corresponding stripe size, the IO write operation is determined to be a full stripe write operation;
2230、若下发的IO数据大小小于对应的条带大小,则判定IO写操作为非满条带写操作。2230. If the size of the delivered IO data is smaller than the corresponding stripe size, the IO write operation is determined to be a non-full stripe write operation.
In some embodiments, the method further includes:
2240. When the IO write operation received by the stripe is a full-stripe write operation, determine, based on the IO data delivered by the IO write operation, the IO data to be written to each block of the stripe;
2250. XOR together the IO data to be written to each block to determine the parity value;
2260. The parity block generates a corresponding write request based on the parity value.
In some embodiments, when the IO write operation received by the stripe is a full-stripe write operation, the method further includes:
2270. Each block and the parity block send their corresponding write requests to the lower layer; after the lower layer reports that the write requests succeeded, the bitmap corresponding to the stripe is updated to reflect that initialization is complete.
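The full-stripe path in steps 2240 and 2250 can be sketched as follows. The function names `xor_blocks` and `full_stripe_payloads` are hypothetical, and the sketch assumes a single parity block computed by byte-wise XOR, with the IO data laid out block after block in order.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_stripe_payloads(io_data: bytes, block_size: int) -> tuple[list[bytes], bytes]:
    """Split a full-stripe write into per-block payloads and compute the parity value."""
    if block_size <= 0 or not io_data or len(io_data) % block_size != 0:
        raise ValueError("a full-stripe write must fill a whole number of blocks")
    blocks = [io_data[i:i + block_size] for i in range(0, len(io_data), block_size)]
    return blocks, xor_blocks(blocks)
```

No zero data is involved on this path, because every block of the stripe receives IO data.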
2300. When the IO write operation received by the stripe is a non-full-stripe write operation, divide the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation;
In some embodiments, when the IO write operation is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation includes:
2310. Obtain the size of the IO data delivered by the IO write operation and the block size within the stripe;
2320. Derive the number of write blocks for the IO write operation from the IO data size and the block size;
2330. Designate the blocks in the stripe other than the write blocks and the parity block as write-zero blocks.
In some embodiments, when the IO write operation is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation further includes:
2340. Determine the IO data to be written to each write block;
2350. Compare the size of the IO data to be written with the size of the write block;
2360. If the size of the IO data to be written equals the size of the write block, the write block creates a write request from the IO data to be written;
2370. If the size of the IO data to be written is smaller than the size of the write block, pad the IO data to be written with all-zero data up to the size of the write block, after which the write block creates the corresponding write request from the padded IO data.
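Steps 2310 to 2370 can be illustrated with the sketch below. The function name `split_non_full_write` is hypothetical, and two simplifying assumptions are made that the text does not state: the write starts at the first data block of the stripe, and the pre-stored all-zero data is modeled as an in-memory buffer of one block.

```python
def split_non_full_write(
    io_data: bytes, block_size: int, data_blocks: int
) -> tuple[int, list[bytes]]:
    """Return the number of write blocks and one payload per data block.

    Assumption: the write begins at the first data block of the stripe.
    """
    if block_size <= 0 or data_blocks <= 0:
        raise ValueError("block_size and data_blocks must be positive")
    if not 0 < len(io_data) < block_size * data_blocks:
        raise ValueError("not a non-full-stripe write")

    zero_block = bytes(block_size)                 # stands in for the pre-stored zero region
    write_count = -(-len(io_data) // block_size)   # ceiling division (step 2320)

    payloads = []
    for index in range(data_blocks):
        if index < write_count:
            piece = io_data[index * block_size:(index + 1) * block_size]
            # Step 2370: pad a short final piece with zeros up to the block size.
            payloads.append(piece + zero_block[len(piece):])
        else:
            # Step 2330: remaining data blocks become write-zero blocks.
            payloads.append(zero_block)
    return write_count, payloads
```

With 6 bytes of IO data, 4-byte blocks, and 3 data blocks, this yields two write blocks (the second padded with two zero bytes) and one write-zero block.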
2400. Determine the parity value for the parity block from the IO data delivered by the IO write operation and the pre-stored all-zero data;
In some embodiments, determining the parity value for the parity block from the IO data delivered by the IO write operation and the pre-stored all-zero data includes:
2410. XOR the IO data to be written to each write block with the all-zero data to determine the parity value;
2420. The parity block generates a corresponding write request based on the parity value.
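Step 2410 can be sketched as an XOR accumulation that starts from the pre-stored all-zero data. The function name `non_full_stripe_parity` is hypothetical and a single XOR parity block is assumed. Because XOR with zero leaves a value unchanged, the write-zero blocks contribute nothing, so nothing has to be read back from disk.

```python
def non_full_stripe_parity(write_payloads: list[bytes], zero_block: bytes) -> bytes:
    """XOR the write-block payloads with the in-memory all-zero data (step 2410)."""
    parity = bytearray(zero_block)  # start from the pre-stored zeros
    for payload in write_payloads:
        if len(payload) != len(parity):
            raise ValueError("payloads must match the block size")
        for i, value in enumerate(payload):
            parity[i] ^= value
    return bytes(parity)
```

With a single write block, the resulting parity value is simply a copy of that block's payload.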
2500. The write blocks, the write-zero blocks, and the parity block send their corresponding write requests to the lower layer; after the lower layer reports that the write requests succeeded, the bitmap corresponding to the stripe is updated to reflect that initialization is complete.
Here, an IO write operation includes at least write address information and IO data;
an IO read operation includes at least read address information.
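The bitmap update in step 2500 can be sketched as below. The class `InitBitmap`, the function `commit_stripe`, and the `submit` callback standing in for the lower layer are all hypothetical. The sketch assumes one bit per stripe, with 1 meaning initialized, which matches the flag values given in claim 15.

```python
from typing import Callable, Sequence


class InitBitmap:
    """One bit per stripe; a set bit means the stripe has been initialized."""

    def __init__(self, stripe_count: int) -> None:
        self._count = stripe_count
        self._bits = bytearray((stripe_count + 7) // 8)

    def _check(self, stripe: int) -> None:
        if not 0 <= stripe < self._count:
            raise IndexError("stripe index out of range")

    def mark_initialized(self, stripe: int) -> None:
        self._check(stripe)
        self._bits[stripe // 8] |= 1 << (stripe % 8)

    def is_initialized(self, stripe: int) -> bool:
        self._check(stripe)
        return bool(self._bits[stripe // 8] & (1 << (stripe % 8)))


def commit_stripe(
    stripe: int,
    requests: Sequence[bytes],
    submit: Callable[[bytes], bool],  # hypothetical lower-layer write; True on success
    bitmap: InitBitmap,
) -> bool:
    """Send every write request; set the stripe's bit only if all of them succeed."""
    results = [submit(request) for request in requests]
    if all(results):
        bitmap.mark_initialized(stripe)
        return True
    return False
```

Updating the bit only after every request succeeds keeps a partially written stripe marked as uninitialized.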
In some embodiments, the method further includes:
2510. On receiving an IO read command, determine the stripe position from the read address information of the IO read command;
2520. Determine from the stripe position and the bitmap whether the stripe targeted by the IO read command has been initialized.
In some embodiments, the method further includes:
2530. If the stripe targeted by the IO read command has been initialized, send a read request to the lower layer to return the corresponding IO data;
2540. If the stripe targeted by the IO read command has not been initialized, return all-zero data.
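The read path in steps 2510 to 2540 can be sketched as follows. The function `handle_read` and the `read_lower` callback are hypothetical, the bitmap is modeled as a plain list of booleans, and the sketch assumes a read never crosses a stripe boundary and that addresses map linearly onto stripes.

```python
from typing import Callable


def handle_read(
    offset: int,
    length: int,
    stripe_size: int,
    initialized: list[bool],                 # stand-in for the per-stripe bitmap
    read_lower: Callable[[int, int], bytes], # hypothetical lower-layer read
) -> bytes:
    """Serve a read from the lower layer or from zeros, depending on the bitmap."""
    if offset < 0 or length < 0 or stripe_size <= 0:
        raise ValueError("invalid read parameters")
    stripe = offset // stripe_size  # step 2510: locate the stripe from the address
    if (offset % stripe_size) + length > stripe_size:
        raise ValueError("this sketch does not handle reads that span stripes")
    if initialized[stripe]:         # step 2520: consult the bitmap
        return read_lower(offset, length)   # step 2530
    return bytes(length)            # step 2540: uninitialized stripe reads as zeros
```

A read of an uninitialized stripe never reaches the disks, so stale on-disk contents are never exposed.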
In some embodiments, the method is applied to a disk array with parity blocks, and the method further includes:
2600. Recover a block whose data has been lost, based on the parity value and the IO data in the stripe.
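Step 2600 can be sketched as an XOR of the parity value with the surviving blocks, as claim 16 describes. The function name `recover_block` is hypothetical, and the sketch assumes a single XOR parity block and exactly one lost block per stripe.

```python
def recover_block(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single lost block by XOR-ing the parity with the surviving blocks."""
    recovered = bytearray(parity)
    for block in surviving_blocks:
        if len(block) != len(recovered):
            raise ValueError("blocks must match the parity size")
        for i, value in enumerate(block):
            recovered[i] ^= value
    return bytes(recovered)
```

Recovery works for write-zero blocks as well, since the parity value was computed as if those blocks held zeros and the lower layer was asked to write zeros to them.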
Embodiment 3
As shown in Figure 3, corresponding to the above embodiments, an embodiment of the present application provides a disk array initialization system, the system including:
a data preparation module 310, configured to create a disk array and divide it into stripes, each stripe containing multiple blocks and at least one parity block;
a data analysis module 320, configured to determine the type of the IO write operation received by a stripe, the type being either a non-full-stripe write operation or a full-stripe write operation;
the data analysis module 320 being further configured to, when the IO write operation received by the stripe is a non-full-stripe write operation, divide the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation;
the data analysis module 320 being further configured to determine the parity value for the parity block from the IO data delivered by the IO write operation and the pre-stored all-zero data;
a data processing module 330, configured to send, through the write blocks, the write-zero blocks, and the parity block, their corresponding write requests to the lower layer and, after the lower layer reports that the write requests succeeded, to update the bitmap corresponding to the stripe to reflect that initialization is complete.
In some embodiments, the data analysis module 320 is further configured to obtain the size of the IO data delivered by the IO write operation and the block size within the stripe; to derive the number of write blocks for the IO write operation from the IO data size and the block size; and to designate the blocks in the stripe other than the write blocks and the parity block as write-zero blocks.
In some embodiments, the data analysis module 320 is further configured to determine the IO data to be written to each write block;
the data analysis module 320 is further configured to compare the size of the IO data to be written with the size of the write block;
the data analysis module 320 is further configured to, when the size of the IO data to be written equals the size of the write block, use the write block to create a write request from the IO data to be written;
the data analysis module 320 is further configured to, when the size of the IO data to be written is smaller than the size of the write block, pad the IO data to be written with all-zero data up to the size of the write block and then use the write block to create the corresponding write request from the padded IO data.
In some embodiments, the data preparation module 310 is further configured to allocate a fixed region in the memory of the disk array; all-zero data is pre-stored in the fixed region, and the granularity of the all-zero data matches the block size.
In some embodiments, the data analysis module 320 is further configured to XOR the IO data to be written to each write block with the all-zero data to determine the parity value; the parity block generates a corresponding write request based on the parity value.
In some embodiments, when the IO write operation received by the stripe is a full-stripe write operation, the data analysis module 320 is further configured to determine, based on the IO data delivered by the IO write operation, the IO data to be written to each block of the stripe; to XOR together the IO data to be written to each block to determine the parity value; and to use the parity block to generate a corresponding write request based on the parity value.
In some embodiments, when the IO write operation received by the stripe is a full-stripe write operation, the data processing module 330 is further configured to send, for each block and the parity block, the corresponding write request to the lower layer and, after the lower layer reports that the write requests succeeded, to update the bitmap corresponding to the stripe to reflect that initialization is complete.
In some embodiments, the data processing module 330 is further configured to, on receiving an IO read command, determine the stripe position from the read address information of the IO read command; the data processing module 330 is further configured to determine from the stripe position and the bitmap whether the stripe targeted by the IO read command has been initialized.
In some embodiments, if the stripe targeted by the IO read command has been initialized, the data processing module 330 is further configured to send a read request to the lower layer to return the corresponding IO data;
if the stripe targeted by the IO read command has not been initialized, the data processing module 330 is further configured to obtain and return all-zero data.
In some embodiments, the system is applied to a disk array with parity blocks, and the data processing module 330 is further configured to recover a block whose data has been lost, based on the parity value and the IO data in the stripe.
In some embodiments, the data analysis module 320 is further configured to obtain the size of the IO data delivered by the IO write operation and the size of the corresponding stripe; if the size of the delivered IO data equals the corresponding stripe size, the data analysis module 320 determines that the IO write operation is a full-stripe write operation; if the size of the delivered IO data is smaller than the corresponding stripe size, the data analysis module 320 determines that the IO write operation is a non-full-stripe write operation.
In some embodiments, an IO write operation includes at least write address information and IO data; an IO read operation includes at least read address information.
In some embodiments, corresponding to all of the above embodiments, an embodiment of the present application provides an electronic device, including: one or more processors; and a memory associated with the one or more processors, the memory being configured to store computer-readable instructions which, when read and executed by the one or more processors, perform the steps of the method provided by any of the above embodiments.
Figure 4 shows an example architecture of the electronic device, which may include a processor 410, a video display adapter 411, a disk drive 412, an input/output interface 413, a network interface 414, and a memory 420. The processor 410, video display adapter 411, disk drive 412, input/output interface 413, network interface 414, and memory 420 may be communicatively connected through a bus 430.
The processor 410 may be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes the relevant programs to implement the technical solutions provided in this application.
The memory 420 may be implemented as ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 420 may store an operating system 421 for controlling the operation of the electronic device 400 and a basic input/output system (BIOS) 422 for controlling low-level operations of the electronic device 400. It may also store a web browser 423, a data storage management system 424, an icon font processing system 425, and so on. The icon font processing system 425 may be the application program that implements the operations of the foregoing steps in the embodiments of the present application. In short, when the technical solutions provided in this application are implemented in software or firmware, the relevant program code is stored in the memory 420 and is called and executed by the processor 410.
The input/output interface 413 is used to connect input/output modules for information input and output. An input/output module may be built into the device as a component (not shown in the figure) or attached externally to provide the corresponding functions. Input devices may include a keyboard, mouse, touch screen, microphone, and various sensors; output devices may include a display, speaker, vibrator, and indicator lights.
The network interface 414 is used to connect a communication module (not shown in the figure) so that the device can communicate with other devices. The communication module may communicate over a wired connection (for example USB or a network cable) or a wireless connection (for example a mobile network, Wi-Fi, or Bluetooth).
The bus 430 includes a path that carries information between the components of the device (for example the processor 410, video display adapter 411, disk drive 412, input/output interface 413, network interface 414, and memory 420).
In addition, the electronic device 400 may obtain information on specific receiving conditions from a virtual resource object receiving-condition information database for use in condition checks, and so on.
It should be noted that although only the processor 410, video display adapter 411, disk drive 412, input/output interface 413, network interface 414, memory 420, bus 430, and so on are shown for the above device, in a specific implementation the device may also include other components necessary for normal operation. Those skilled in the art will also understand that the device may include only the components necessary to implement the solution of the present application, and need not include all the components shown in the figure.
In some embodiments, corresponding to all of the above embodiments, an embodiment of the present application further provides a non-volatile computer-readable storage medium storing computer-readable instructions that cause a computer to perform the steps of the method provided by any of the above embodiments.
From the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented in software together with the necessary general-purpose hardware platform. On this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied as a software product. The computer software product may be stored in a storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a cloud server, a network device, or the like) to perform the methods described in the embodiments of this application or in certain parts of them.
The embodiments in this specification are described progressively. For parts that are the same or similar across embodiments, the embodiments may be read together; each embodiment focuses on how it differs from the others. In particular, the system embodiments are described relatively briefly because they are largely similar to the method embodiments; for the relevant details, refer to the description of the method embodiments. The system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected as needed to achieve the purpose of the solution of a given embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above are only preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (20)

  1. A disk array initialization method, characterized in that the method includes:
    creating a disk array and dividing it into stripes, each stripe containing multiple blocks and at least one parity block;
    determining the type of the IO write operation received by a stripe, the type of the IO write operation being either a non-full-stripe write operation or a full-stripe write operation;
    when the IO write operation received by the stripe is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation;
    determining the parity value for the parity block from the IO data delivered by the IO write operation and pre-stored all-zero data; and
    the write blocks, the write-zero blocks, and the parity block sending their corresponding write requests to the lower layer and, after the lower layer reports that the write requests succeeded, updating the bitmap corresponding to the stripe to reflect that initialization is complete.
  2. The method according to claim 1, characterized in that, when the IO write operation is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation includes:
    obtaining the size of the IO data delivered by the IO write operation and the block size within the stripe;
    deriving the number of write blocks for the IO write operation from the IO data size and the block size; and
    designating the blocks in the stripe other than the write blocks and the parity block as the write-zero blocks.
  3. The method according to claim 2, characterized in that, when the IO write operation is a non-full-stripe write operation, dividing the blocks in the stripe into write blocks and write-zero blocks according to the IO write operation further includes:
    determining the IO data to be written to each write block;
    comparing the size of the IO data to be written with the size of the write block; and
    in response to the size of the IO data to be written equaling the size of the write block, the write block creating a write request from the IO data to be written; or, in response to the size of the IO data to be written being smaller than the size of the write block, padding the IO data to be written with all-zero data up to the size of the write block, after which the write block creates the corresponding write request from the padded IO data.
  4. The method according to claim 1, characterized in that the method further includes:
    allocating a fixed region in the memory of the disk array; and
    pre-storing all-zero data in the fixed region, the granularity of the all-zero data matching the block size.
  5. The method according to claim 3, characterized in that determining the parity value for the parity block from the IO data delivered by the IO write operation and the pre-stored all-zero data includes:
    XOR-ing the IO data to be written to each write block with the all-zero data to determine the parity value; and
    the parity block generating a corresponding write request based on the parity value.
  6. The method according to claim 4, characterized in that the method further includes:
    when the IO write operation received by the stripe is a full-stripe write operation, determining, based on the IO data delivered by the IO write operation, the IO data to be written to each block of the stripe;
    XOR-ing together the IO data to be written to each block to determine the parity value; and
    the parity block generating a corresponding write request based on the parity value.
  7. The method according to claim 6, characterized in that, when the IO write operation received by the stripe is a full-stripe write operation, the method further includes:
    each block and the parity block sending their corresponding write requests to the lower layer and, after the lower layer reports that the write requests succeeded, updating the bitmap corresponding to the stripe to reflect that initialization is complete.
  8. The method according to claim 4, characterized in that the method further includes:
    on receiving an IO read command, determining the stripe position from the read address information of the IO read command; and
    determining from the stripe position and the bitmap whether the stripe targeted by the IO read command has been initialized.
  9. The method according to claim 8, characterized in that the method further includes:
    in response to the stripe targeted by the IO read command having been initialized, sending a read request to the lower layer to return the corresponding IO data; or, in response to the stripe targeted by the IO read command not having been initialized, returning all-zero data.
  10. The method according to claim 4, characterized in that the method is applied to a disk array with parity blocks, and the method further includes:
    recovering a block whose data has been lost, based on the parity value and the IO data in the stripe.
  11. The method according to claim 2, characterized in that determining the type of the IO write operation received by the stripe includes:
    obtaining the size of the IO data delivered by the IO write operation and the size of the corresponding stripe; and
    in response to the size of the delivered IO data equaling the corresponding stripe size, determining that the IO write operation is a full-stripe write operation; or, in response to the size of the delivered IO data being smaller than the corresponding stripe size, determining that the IO write operation is a non-full-stripe write operation.
  12. The method according to claim 8, characterized in that:
    the IO write operation includes at least write address information and IO data;
    the IO read operation includes at least read address information.
  13. The method according to claim 8, characterized in that the disk array contains multiple stripes, and before determining the type of the IO write operation received by the stripe, the method further includes:
    a host issuing an IO write operation to the disk array; and
    the host determining the stripe at the corresponding position from the write address information in the IO write operation and delivering the IO write operation to that stripe.
  14. The method according to claim 3, characterized in that padding the IO data to be written with all-zero data up to the size of the write block includes:
    reading the pre-stored all-zero data from the disk array and writing it into the free space in the block.
  15. The method according to claim 8, characterized in that determining from the stripe position and the bitmap whether the stripe targeted by the IO read command has been initialized includes:
    reading the flag bit of the bitmap; and
    in response to the flag bit of the bitmap being 1, determining that the stripe targeted by the IO read command has been initialized; or, in response to the flag bit of the bitmap being 0, determining that the stripe targeted by the IO read command has not been initialized.
  16. The method according to claim 10, characterized in that recovering a block whose data has been lost, based on the parity value and the IO data in the stripe, includes:
    XOR-ing the parity value with the blocks in the stripe whose data has not been lost, to recover the block whose data has been lost.
  17. A disk array initialization system, wherein the system comprises:
    a data preparation module, configured to create a disk array and divide it into stripes, each stripe comprising a plurality of blocks and at least one parity block;
    a data analysis module, configured to determine the type of an IO write operation received by a stripe, the type of the IO write operation comprising a non-full-stripe write operation and a full-stripe write operation;
    the data analysis module being further configured to, when the IO write operation received by the stripe is a non-full-stripe write operation, divide the blocks in the stripe into write blocks and zero-write blocks according to the IO write operation;
    the data analysis module being further configured to determine the check value corresponding to the parity block according to the IO data issued by the IO write operation and pre-stored all-zero data; and
    a data processing module, configured to send corresponding write requests to the lower layer for the write blocks, the zero-write blocks, and the parity block respectively, and, after the lower layer returns a write-success indication, to update the bitmap corresponding to the stripe to reflect that initialization is complete.
  18. The system according to claim 17, wherein the data analysis module is further configured to obtain the size of the IO data issued by the IO write operation and the block size within the stripe; the data analysis module is further configured to derive, from the IO data size and the block size, the number of write blocks corresponding to the IO write operation; and the data analysis module is further configured to determine that the blocks in the stripe other than the write blocks and the parity block are the zero-write blocks.
  19. An electronic device, wherein the electronic device comprises:
    one or more processors;
    and a memory associated with the one or more processors, the memory being configured to store program instructions which, when read and executed by the one or more processors, perform the method according to any one of claims 1-16.
  20. A non-volatile computer-readable storage medium, wherein the storage medium stores computer-readable instructions, and the computer-readable instructions cause a computer to perform the method according to any one of claims 1-16.
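
The mechanics recited in claims 15 to 18 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes a single XOR parity block per stripe, a write that starts at the first block of the stripe, zero padding of the last write block, and an in-memory list standing in for the lower layer. All names (`BLOCK_SIZE`, `DATA_BLOCKS`, `plan_partial_write`, `Array`) are hypothetical and chosen only for illustration.

```python
import math
from functools import reduce

BLOCK_SIZE = 4                  # bytes per block (hypothetical, tiny for illustration)
DATA_BLOCKS = 3                 # data blocks per stripe (hypothetical)
ZERO_BLOCK = bytes(BLOCK_SIZE)  # stands in for the "pre-stored all-zero data"


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def plan_partial_write(io_data):
    """Split a non-full-stripe write into write blocks and zero-write blocks,
    and compute the check value from the IO data and the all-zero data."""
    if not 0 < len(io_data) < DATA_BLOCKS * BLOCK_SIZE:
        raise ValueError("not a non-full-stripe write")
    # Number of write blocks derived from IO data size and block size.
    n_write = math.ceil(len(io_data) / BLOCK_SIZE)
    # Assumption: the tail of the last write block is padded with zeros.
    padded = io_data.ljust(n_write * BLOCK_SIZE, b"\x00")
    write_blocks = [padded[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
                    for i in range(n_write)]
    # Every remaining data block becomes a zero-write block.
    zero_blocks = [ZERO_BLOCK] * (DATA_BLOCKS - n_write)
    check_value = xor_blocks(write_blocks + zero_blocks)
    return write_blocks, zero_blocks, check_value


class Array:
    """Toy array: one bitmap flag per stripe, lists as the lower layer."""

    def __init__(self, n_stripes):
        self.bitmap = [0] * n_stripes
        self.stripes = [None] * n_stripes

    def write_partial(self, idx, io_data):
        write_blocks, zero_blocks, check_value = plan_partial_write(io_data)
        # Assumption: the in-memory "lower layer" write always succeeds.
        self.stripes[idx] = write_blocks + zero_blocks + [check_value]
        self.bitmap[idx] = 1  # flag the stripe as initialized

    def is_initialized(self, idx):
        return self.bitmap[idx] == 1

    def recover(self, idx, lost):
        """Rebuild data block `lost` by XORing the check value with survivors."""
        blocks = self.stripes[idx]
        survivors = [b for i, b in enumerate(blocks[:-1]) if i != lost]
        return xor_blocks(survivors + [blocks[-1]])
```

For example, writing `b"abcdef"` to stripe 0 of `Array(2)` produces two write blocks and one zero-write block, sets the flag for stripe 0 while leaving stripe 1 at 0, and allows `recover(0, 0)` to rebuild the first block from the remaining blocks and the check value. Because the zero-write blocks contribute nothing to an XOR, the check value can be computed without reading the uninitialized disks first.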
PCT/CN2023/070636 2022-08-26 2023-01-05 Disk array initialization method and system, electronic device, and storage medium WO2024040857A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211032632.7A CN115098046B (en) 2022-08-26 2022-08-26 Disk array initialization method, system, electronic device and storage medium
CN202211032632.7 2022-08-26

Publications (1)

Publication Number Publication Date
WO2024040857A1 (en) 2024-02-29

Family

ID=83300095

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/070636 WO2024040857A1 (en) 2022-08-26 2023-01-05 Disk array initialization method and system, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN115098046B (en)
WO (1) WO2024040857A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115098046B (en) * 2022-08-26 2023-01-24 苏州浪潮智能科技有限公司 Disk array initialization method, system, electronic device and storage medium
CN115657960B (en) * 2022-11-11 2023-03-14 苏州浪潮智能科技有限公司 Disk array initialization method, device, equipment and readable storage medium
CN115543215B (en) * 2022-11-28 2023-03-14 苏州浪潮智能科技有限公司 Data writing operation and data reading operation method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030233596A1 (en) * 2002-06-12 2003-12-18 John Corbin Method and apparatus for fast initialization of redundant arrays of storage devices
CN103645862A (en) * 2013-12-12 2014-03-19 北京同有飞骥科技股份有限公司 Initialization performance improvement method of redundant arrays of inexpensive disks
US20150212736A1 (en) * 2014-01-24 2015-07-30 Silicon Graphics International Corporation Raid set initialization
CN107885620A (en) * 2017-11-22 2018-04-06 华中科技大学 A kind of method and system for improving Solid-state disc array Performance And Reliability
CN113849124A (en) * 2021-08-27 2021-12-28 苏州浪潮智能科技有限公司 Disk array capacity expansion method and device
CN115098046A (en) * 2022-08-26 2022-09-23 苏州浪潮智能科技有限公司 Disk array initialization method, system, electronic device and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101976177B (en) * 2010-08-19 2012-10-03 北京同有飞骥科技股份有限公司 Method for constructing vertical grouped disk array capable of being subject to parallel centralized check
JP5923964B2 (en) * 2011-12-13 2016-05-25 富士通株式会社 Disk array device, control device, and program
CN102609224B (en) * 2012-02-16 2015-03-11 浪潮(北京)电子信息产业有限公司 Redundant array of independent disk system and initializing method thereof
WO2021046693A1 (en) * 2019-09-09 2021-03-18 华为技术有限公司 Data processing method in storage system, device, and storage system

Also Published As

Publication number Publication date
CN115098046A (en) 2022-09-23
CN115098046B (en) 2023-01-24

Similar Documents

Publication Publication Date Title
US11086740B2 (en) Maintaining storage array online
WO2024040857A1 (en) Disk array initialization method and system, electronic device, and storage medium
US9733862B1 (en) Systems and methods for reverse point-in-time copy management in a storage system
US9405625B2 (en) Optimizing and enhancing performance for parity based storage
US20200264785A1 (en) Method and system for managing storage device
US10108359B2 (en) Method and system for efficient cache buffering in a system having parity arms to enable hardware acceleration
US20200241781A1 (en) Method and system for inline deduplication using erasure coding
US10095585B1 (en) Rebuilding data on flash memory in response to a storage device failure regardless of the type of storage device that fails
US11736447B2 (en) Method and system for optimizing access to data nodes of a data cluster using a data access gateway and metadata mapping based bidding in an accelerator pool
US10929229B2 (en) Decentralized RAID scheme having distributed parity computation and recovery
US10860224B2 (en) Method and system for delivering message in storage system
US7716519B2 (en) Method and system for repairing partially damaged blocks
US20190042365A1 (en) Read-optimized lazy erasure coding
US11526284B2 (en) Method and system for storing data in a multiple data cluster system
US10740189B2 (en) Distributed storage system
CN110737395A (en) I/O management method, electronic device, and computer-readable storage medium
US11442642B2 (en) Method and system for inline deduplication using erasure coding to minimize read and write operations
US20190213076A1 (en) Systems and methods for managing digital data in a fault tolerant matrix
US20210064477A1 (en) Method and system for any-point in time recovery within traditional storage system via a continuous data protection interceptor
US9830094B2 (en) Dynamic transitioning of protection information in array systems
WO2024113986A1 (en) Data write operation method, data read operation method, and apparatus
US11882098B2 (en) Method and system for optimizing access to data nodes of a data cluster using a data access gateway and metadata mapping based bidding
CN113973138B (en) Method and system for optimizing access to data nodes of a data cluster using a data access gateway
CN113973137B (en) Method and system for optimizing access to data nodes of a data cluster using a data access gateway and a bid counter
US11288005B2 (en) Method and system for generating compliance and sequence aware replication in a multiple data cluster system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23855954

Country of ref document: EP

Kind code of ref document: A1