CN117377940A - Data processing apparatus and data processing method - Google Patents


Publication number
CN117377940A
Authority
CN
China
Prior art keywords
data
preprocessing
raid controller
cache address
storage array
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180098648.5A
Other languages
Chinese (zh)
Inventor
秦军杰
常高嘉
许羡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Huawei Technologies Co Ltd
Publication of CN117377940A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the application provide a data processing apparatus and a data processing method. A RAID controller acquires first data on a target stripe in a storage array together with first index information corresponding to the first data; the first data is any piece of data on the target stripe. Based on the first index information, the RAID controller determines corresponding first entry information from a preset mapping relation; the first entry information indicates the preprocessing type and the cache address that correspond to the first data in a consistency operation. The RAID controller then performs the corresponding consistency operation on the first data according to the first entry information. This reduces the number of reads and writes of the first data in the cache unit and the number of transfers of the first data on the bus, which lowers the system bandwidth requirement and, in turn, system power consumption.

Description

Data processing apparatus and data processing method
Technical Field
The present disclosure relates to the field of information technologies, and in particular, to a data processing device and a data processing method.
Background
A redundant array of independent disks (RAID) is a high-performance, high-reliability storage technology that combines a set of independent disks in different ways to present logical disks to an application terminal or a cluster of terminals. RAID is widely used for data storage, and common RAID levels include RAID0, RAID1, RAID5, RAID6 and RAID10. RAID0 has no redundancy, and RAID1 has low disk utilization. RAID5, RAID6 and RAID10 are each built from multiple disks (for example, RAID5 requires at least 3 disks, and RAID6 and RAID10 require at least 4 disks); each writes data to the disks in the array in stripes and stores redundancy information (check information in the case of RAID5 and RAID6) on the disks in the array.
RAID5 is a storage solution that balances storage performance, data security and storage cost, and it uses disk striping. RAID5 requires at least three disks. It does not keep a backup copy of the stored data; instead, it stores the data and the corresponding parity information across the disks that make up the array, with the parity information and the data it protects placed on different disks. When the data on one RAID5 disk is damaged, it can be recovered from the remaining data and the corresponding parity information.
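The recovery property described above follows from byte-wise XOR parity. The snippet below is a minimal sketch, assuming one parity unit per stripe computed as the XOR of the data units; the helper name `xor_blocks` and the sample bytes are made up for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data stripe units and their parity, as in a four-disk RAID5 stripe.
d1 = b"\x11\x22\x33\x44"
d2 = b"\xa0\xb0\xc0\xd0"
d3 = b"\x0f\x0e\x0d\x0c"
parity = xor_blocks(d1, d2, d3)

# Suppose the disk holding d2 fails: XOR the surviving units with the parity.
recovered_d2 = xor_blocks(d1, d3, parity)
assert recovered_d2 == d2
```

Because XOR is its own inverse, the same routine both generates the parity and rebuilds any single missing unit.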
For RAID5 and RAID6, when the check information on a disk is updated or damaged disk data is recovered, the old data on the disk is first stored in a cache unit (buffer), and the RAID controller then reads the old data from the buffer to update the check information or recover the data.
In the prior art, however, updating the check information or recovering damaged disk data occupies a large amount of system space, and the processing latency is high. In addition, when the buffer is located outside the RAID controller, the bus bandwidth requirement is high and so is the system power consumption.
Disclosure of Invention
Embodiments of the application disclose a data processing apparatus and a data processing method that reduce the number of times old disk data is read from and written to the cache unit and the number of times it is transferred over the bus. This lowers the requirements on system bandwidth and storage space and, in turn, system power consumption. In addition, the calculation can be performed out of order, which reduces the latency of obtaining the result of the consistency operation.
In a first aspect, embodiments of the present application disclose a data processing apparatus comprising a redundant array of independent disks (RAID) controller and a storage array coupled to the RAID controller. The RAID controller is configured to: acquire first data on a target stripe in the storage array and first index information corresponding to the first data, wherein the first data is any piece of data on the target stripe; determine, based on the first index information, first entry information corresponding to the first data from a preset mapping relation, wherein the preset mapping relation is generated based on stripe consistency and the first entry information indicates the preprocessing type and the cache address that correspond to the first data in a consistency operation; and preprocess the first data according to the preprocessing type, and update the data at the cache address with the preprocessed first data.
It should be appreciated that the consistency operation may be recovering damaged disk data using the first data in RAID5/6, updating the disk check information, or a corresponding calculation in another scenario. The first entry information corresponding to the first index information may be determined from the preset mapping relation by table lookup or in another manner, which is not limited in this application.
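The table lookup mentioned above can be pictured as a dictionary from index information to entry information. This is only a sketch under stated assumptions: the names `Preprocess`, `Entry`, `PRESET_MAPPING` and `lookup_entry`, the preprocessing type values, the cache addresses, and the encoding of index information as a `(stripe_unit, block_number)` pair are all hypothetical, since the source does not define them.

```python
from dataclasses import dataclass
from enum import Enum


class Preprocess(Enum):
    # Hypothetical preprocessing types; the source does not enumerate them.
    NONE = "none"          # use the data unchanged
    GF_SCALE = "gf_scale"  # e.g. scale by a coefficient before accumulation


@dataclass(frozen=True)
class Entry:
    """Entry information: a preprocessing type and a cache address."""
    preprocess: Preprocess
    cache_address: int


# Preset mapping relation: index information -> entry information.
# Assumption: blocks with the same block number in different stripe units
# accumulate into the same cache address.
PRESET_MAPPING = {
    (0, 0): Entry(Preprocess.NONE, 0x000),
    (0, 1): Entry(Preprocess.NONE, 0x200),
    (1, 0): Entry(Preprocess.NONE, 0x000),
    (1, 1): Entry(Preprocess.NONE, 0x200),
}


def lookup_entry(index: tuple[int, int]) -> Entry:
    """Return the entry information for a piece of index information."""
    try:
        return PRESET_MAPPING[index]
    except KeyError:
        raise LookupError(f"no entry configured for index {index}") from None


entry = lookup_entry((1, 1))
print(entry.preprocess.value, hex(entry.cache_address))
```

Because the entry is found from the index alone, the controller needs no other context (such as arrival order) to decide how to handle a block.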
In the prior art, the first data in the storage array is first cached in a cache unit, and the first data and its physical address are then read from the cache unit into the RAID controller. In the embodiments of the present application, by contrast, the storage array sends the first data and the corresponding first index information directly to the RAID controller, and the first index information is used to determine the preprocessing type and the cache address required for the subsequent consistency operation on the first data. The step of caching the first data in the cache unit and then reading it back into the RAID controller is therefore omitted, which reduces the number of reads and writes in the cache unit and the read-write bandwidth the cache unit must provide. Because the first data is acquired without passing through the cache unit, the procedure is simpler, and the latency of subsequently recovering disk data and/or updating check information with the first data is also reduced. In addition, when the cache unit is located outside the RAID controller, the prior art must transfer the first data over the bus to cache it in the cache unit; the embodiments of the present application omit this transfer, which reduces the number of bus transfers and the bus bandwidth requirement.
In a possible implementation manner, the storage array includes M disks, the target stripe includes M stripe units, and the M stripe units are respectively located on the M disks; wherein M is an integer greater than 2; the first data is one or all of the data blocks on any of the M stripe units.
It should be appreciated that the first data may be one or all of the data blocks on any of the M stripe units. That is, when the data in the stripe unit is split into a plurality of data blocks, the first data may be one of the plurality of data blocks; when the data in a stripe unit is not split, the first data may be all data contained in any one of the M stripe units.
It can be seen that, in the embodiment of the present application, when the data in the stripe unit is split into multiple data blocks, since the first data is one of the multiple data blocks, the size of the first data is smaller, so that the first data at this time can be quickly returned to the RAID controller to perform subsequent computation, thereby reducing the delay of the subsequent processing procedure and improving the efficiency.
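The splitting of a stripe unit into data blocks can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_stripe_unit`, the block size, and the encoding of index information as a `(stripe_unit, block_number)` pair are assumptions, because the source does not specify the format of the index information.

```python
from collections.abc import Iterator


def split_stripe_unit(
    unit_id: int, data: bytes, block_size: int
) -> Iterator[tuple[tuple[int, int], bytes]]:
    """Yield (index_info, block) pairs for one stripe unit.

    index_info is the hypothetical pair (unit_id, block_number).
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for number, offset in enumerate(range(0, len(data), block_size)):
        yield (unit_id, number), data[offset:offset + block_size]


# A stripe unit split into two blocks, each tagged with its index information.
for index_info, block in split_stripe_unit(2, b"abcdefgh", 4):
    print(index_info, block)

# With a block size covering the whole unit, the unit is not split.
whole = list(split_stripe_unit(2, b"abcdefgh", 8))
assert whole == [((2, 0), b"abcdefgh")]
```

Smaller blocks can be handed to the controller sooner, which is the latency benefit the paragraph above describes.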
In a possible implementation manner, the target stripe further includes second data; the storage array is configured to send the second data and second index information corresponding to the second data to the RAID controller, wherein the second data is sent before or after the first data.
It should be appreciated that the second data described above may include at least one other data block in the target stripe in addition to the first data. When the second data includes a plurality of data blocks, the plurality of data blocks may be returned to the RAID controller in any order, and each data block may simultaneously carry index information corresponding to each data block when returned.
It can be seen that, in this embodiment of the present application, the index information corresponding to each piece of data is returned to the RAID controller together with that data. The RAID controller can therefore determine, from the index information of each returned piece of data, the preprocessing type and the cache address required for the consistency operation on that data, without waiting for all the required data on the target stripe to be returned. Consequently, when the second data includes multiple data blocks, the storage array may return the first data and the second data to the RAID controller in any order, and the recovery of damaged disk data and/or the update of the check information can be completed out of order based on the first data and/or the second data, which effectively improves system performance.
In a possible implementation manner, the first entry information includes a first preprocessing type and a first cache address; the RAID controller is specifically configured to: preprocessing the first data according to the first preprocessing type, and updating the data in the first cache address by utilizing the preprocessed first data to obtain first reference information corresponding to the first data; the RAID controller is also configured to: determining second table item information corresponding to second data from a preset mapping relation based on second index information; the second table entry information comprises a second preprocessing type and a second cache address; preprocessing second data according to a second preprocessing type, and updating data in a second cache address by utilizing the preprocessed second data to obtain second reference information corresponding to the second data; and obtaining the data to be recovered in the storage array according to the first reference information and the second reference information.
In a possible implementation manner, updating the data in the first cache address by using the preprocessed first data to obtain the first reference information corresponding to the first data specifically includes: the RAID controller reads the data in the first cache address, performs an exclusive OR (XOR) operation on the preprocessed first data and the data read from the first cache address to obtain the first reference information corresponding to the first data, and writes the first reference information back to the first cache address.
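The read, XOR and write-back sequence can be sketched as below. The cache is modelled as a plain `bytearray`, and the function name `xor_update` and the addresses are hypothetical; preprocessing is assumed to have already been applied to `data`.

```python
def xor_update(cache: bytearray, address: int, data: bytes) -> bytes:
    """Read the cache at `address`, XOR it with `data`, write the result back.

    Returns the reference information that now sits at the cache address.
    """
    end = address + len(data)
    if end > len(cache):
        raise ValueError("data does not fit in the cache at this address")
    current = cache[address:end]
    reference = bytes(a ^ b for a, b in zip(current, data))
    cache[address:end] = reference
    return reference


cache = bytearray(8)                     # zero-initialised cache unit
xor_update(cache, 4, b"\x0f\xf0")        # first block arrives
ref = xor_update(cache, 4, b"\xff\xff")  # second block, same cache address
assert ref == b"\xf0\x0f"
```

Each call folds one more block into the running result held at the cache address, so the cache never needs to hold the raw blocks themselves.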
It should be understood that the second reference information is calculated in the same way as the first reference information, so the calculation is not described again here.
It can be seen that, in this embodiment of the present application, since the data is returned to the RAID controller and carries the corresponding index information, the RAID controller determines the corresponding entry information according to the index information corresponding to each data, and further performs, according to the preprocessing type and the cache address indicated by the corresponding entry information, a corresponding consistency operation on each data, where the consistency operation includes a corresponding preprocessing and exclusive-or operation. Because the exclusive or operation logic is fixed, for the data returned to the RAID controller in any order, corresponding processing can be directly performed according to the order of each data return, so as to obtain the reference information corresponding to each data. Specifically, when the data in the same stripe unit is split into a plurality of data blocks, the plurality of data blocks can be returned to the RAID controller in an out-of-order manner, and corresponding consistency operation (namely out-of-order return and out-of-order calculation) is performed according to the returned sequence of the plurality of data blocks, so that the system performance is effectively improved.
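The reason out-of-order calculation is safe is that XOR is commutative and associative, so the accumulated result does not depend on arrival order. A minimal sketch, with a made-up helper `accumulate` and arbitrary sample blocks:

```python
import random


def accumulate(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks into a zeroed accumulator, in the order given."""
    acc = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            acc[i] ^= byte
    return bytes(acc)


blocks = [bytes([n, n * 3 % 256, 255 - n]) for n in range(1, 6)]
in_order = accumulate(blocks)

shuffled = blocks[:]
random.Random(7).shuffle(shuffled)  # simulate blocks returning out of order
assert accumulate(shuffled) == in_order
```

The same argument applies when blocks are first preprocessed, provided the preprocessing of each block depends only on that block and its entry information.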
In a possible implementation manner, the RAID controller is further configured to: receive third data to be written to the storage array and third index information corresponding to the third data; determine, based on the third index information, third entry information corresponding to the third data from the preset mapping relation, wherein the third entry information comprises a third preprocessing type and a third cache address; and preprocess the third data according to the third preprocessing type, and update the data in the third cache address with the preprocessed third data to obtain third reference information corresponding to the third data.
It should be appreciated that the third data may be data that the host sends to the RAID controller to be subsequently written to the storage array. The third data may include one or more data blocks; multiple data blocks may be sent to the RAID controller in any order, and each data block carries its own index information when it is sent.
It can be seen that, in the embodiment of the present application, the RAID controller may also perform a corresponding consistency operation on third data sent by the host to obtain third reference information corresponding to the third data. When the third data comprises a plurality of data blocks, each data block carries its own index information, so the RAID controller can perform the consistency operation on each data block in the order in which the blocks arrive, rather than in the order the blocks had before splitting (that is, out-of-order return and out-of-order calculation), which effectively improves system performance. Because the calculation can be performed out of order, the latency of the consistency operation on the third data is reduced, and so is the latency of subsequently obtaining the check information from the result of that operation.
In a possible implementation manner, the first entry information includes a fourth preprocessing type and a fourth cache address; the RAID controller is specifically configured to: preprocessing the first data according to a fourth preprocessing type, and updating the data in a fourth cache address by utilizing the preprocessed first data to obtain fourth reference information corresponding to the first data; the RAID controller is also configured to: and obtaining the check information in the storage array according to the third reference information and the fourth reference information.
It can be seen that in the embodiment of the present application, the first entry information further includes a fourth preprocessing type and a fourth cache address. The first index information corresponding to the first data is returned to the RAID controller at the same time when the first data is returned to the RAID controller, so that the RAID controller can determine corresponding first item information according to the first index information, and perform corresponding consistency operation on the first data blocks according to a fourth preprocessing type and a fourth cache address contained in each first item information, without waiting for all other data blocks in the storage array to be returned to the RAID controller, and directly perform corresponding consistency operation (namely out-of-order return and out-of-order calculation) according to the return sequence of the data, thereby effectively improving the system performance. Meanwhile, the first data returned by the disk carries the corresponding first index information, and the corresponding preprocessing type and the corresponding cache address can be determined according to the first index information and the preset mapping relation, and the first data does not need to be cached in the middle through a cache unit, so that the read-write times of the cache unit and the bus transmission times are reduced, and the delay of obtaining the verification information to be updated of the storage array according to the consistency operation result of the first data is further reduced.
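One way the third and fourth reference information could combine into updated RAID5 check information is sketched below, assuming the standard read-modify-write identity P_new = P_old XOR D_old XOR D_new. Which values accumulate into which reference information is an assumption made for illustration; the source states only that the check information is obtained from the third and fourth reference information.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized byte strings."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Stripe contents before the write (sample values).
old_d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0a\x0b"
old_p = xor_bytes(xor_bytes(old_d0, d1), d2)

# The host wants to overwrite D0.
new_d0 = b"\x7f\x80"

# Assumed roles: third reference information accumulates the new data from
# the host; fourth reference information accumulates the old data and old
# parity read back from the disks.
third_ref = new_d0
fourth_ref = xor_bytes(old_d0, old_p)

new_p = xor_bytes(third_ref, fourth_ref)

# Same result as recomputing the parity over the whole updated stripe.
assert new_p == xor_bytes(xor_bytes(new_d0, d1), d2)
```

The update touches only the unit being written and the parity unit; D1 and D2 are never read.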
It should be appreciated that the embodiments of the present application exemplarily describe that the first entry information may include a first preprocessing type and/or a fourth preprocessing type, and a first cache address and/or a fourth cache address. Those skilled in the art will appreciate that the first entry information may contain Q preprocessing types and Q cache addresses; wherein the Q preprocessing types and the Q cache addresses are in one-to-one correspondence. Each preprocessing type and the buffer address corresponding to the preprocessing type may correspond to one application scenario in RAID5/6, that is, the Q preprocessing types and the Q buffer addresses respectively correspond to Q application scenarios, where the Q application scenarios may be scenarios where first data is obtained from a disk to perform subsequent operations, for example, disk data recovery, disk verification information update, disk data recovery and disk verification information update, or other scenarios, which are not limited in this application, and Q is a positive integer.
In a possible implementation manner, the RAID controller is further configured to: before receiving the first data, initializing the data in the cache address indicated by the first entry information.
It can be seen that, in the embodiment of the present application, before receiving the first data and performing the subsequent consistency operation, the RAID controller needs to initialize the data in the cache address indicated by the first entry information, where the initialization process may be a zero clearing process; and then the RAID controller starts to perform corresponding consistency operation on the received first data block, so that the generation of correct data to be recovered and/or verification information of the disk is ensured.
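Why the zero clearing matters can be seen in a small sketch: zero is the identity for XOR, so an accumulator that starts at zero ends up holding exactly the XOR of the blocks, while stale contents are folded into the result. The helper `xor_into` and the sample bytes are hypothetical.

```python
def xor_into(cache: bytearray, data: bytes) -> None:
    """XOR `data` into the start of `cache` in place."""
    for i, byte in enumerate(data):
        cache[i] ^= byte


blocks = [b"\x12\x34", b"\xab\xcd", b"\x55\xaa"]
expected = bytes(a ^ b ^ c for a, b, c in zip(*blocks))

# Cache region still holding data from an earlier operation.
stale = bytearray(b"\xde\xad")
for block in blocks:
    xor_into(stale, block)
assert bytes(stale) != expected  # stale contents corrupt the result

# Cache region cleared to zero before the first block arrives.
clean = bytearray(2)
for block in blocks:
    xor_into(clean, block)
assert bytes(clean) == expected
```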
In a second aspect, embodiments of the present application disclose a RAID controller comprising a processor and an interface circuit, the processor being coupled to a storage array through the interface circuit. The processor is configured to: receive, through the interface circuit, first data on a target stripe in the storage array and first index information corresponding to the first data, wherein the first data is any piece of data on the target stripe; determine, based on the first index information, first entry information corresponding to the first data from a preset mapping relation, wherein the preset mapping relation is generated based on stripe consistency and the first entry information indicates the preprocessing type and the cache address that correspond to the first data in a consistency operation; and preprocess the first data according to the preprocessing type, and update the data at the cache address with the preprocessed first data.
In a possible implementation manner, the RAID controller includes a memory, and the memory is used for storing the first entry information.
In a possible implementation manner, the storage array includes M disks, the target stripe includes M stripe units, and the M stripe units are respectively located on the M disks; wherein M is an integer greater than 2; the first data is one or all of the data blocks on any of the M stripe units.
In a possible implementation manner, the target stripe further includes second data; the storage array is further configured to send the second data and second index information corresponding to the second data to the RAID controller, wherein the second data is sent before or after the first data.
It should be appreciated that the advantages of the embodiments in the second aspect may be described with reference to the advantages of the corresponding embodiments in the first aspect, which are not repeated herein.
In a third aspect, an embodiment of the present application discloses a data processing method, including: acquiring, by a RAID controller, first data on a target stripe in a storage array and first index information corresponding to the first data, wherein the first data is any piece of data on the target stripe; determining, by the RAID controller based on the first index information, first entry information corresponding to the first data from a preset mapping relation, wherein the preset mapping relation is generated based on stripe consistency and the first entry information indicates the preprocessing type and the cache address that correspond to the first data in a consistency operation; and preprocessing the first data according to the preprocessing type, and updating the data at the cache address with the preprocessed first data.
In one possible implementation, the storage array includes M disks, and the target stripe includes M stripe units, where the M stripe units are located on the M disks, respectively; wherein M is an integer greater than 2; the first data is one or all of the data blocks on any of the M stripe units.
In a possible implementation manner, the target stripe further includes second data; the method further includes: sending, by the storage array, the second data and second index information corresponding to the second data to the RAID controller, wherein the second data is sent before or after the first data.
In a possible implementation manner, the first entry information includes a first preprocessing type and a first cache address; the preprocessing of the first data according to the preprocessing type and updating the data in the cache address by using the preprocessed first data includes: and preprocessing the first data by the RAID controller according to the first preprocessing type, and updating the data in the first cache address by utilizing the preprocessed first data to obtain first reference information corresponding to the first data. The method further comprises the following steps: determining, by the RAID controller, second entry information corresponding to second data from a preset mapping relationship based on the second index information; the second table entry information comprises a second preprocessing type and a second cache address; preprocessing second data according to a second preprocessing type, and updating data in a second cache address by utilizing the preprocessed second data to obtain second reference information corresponding to the second data; and obtaining the data to be recovered in the storage array according to the first reference information and the second reference information.
In one possible embodiment, the method further comprises: receiving, by the RAID controller, third data to be written to the storage array and third index information corresponding to the third data; determining third table item information corresponding to third data from a preset mapping relation based on third index information; the third table entry information comprises a third preprocessing type and a third cache address; preprocessing a third data block according to a third preprocessing type, and updating data in a third cache address by utilizing the preprocessed third data to obtain third reference information corresponding to the third data.
In a possible implementation manner, the first entry information includes a fourth preprocessing type and a fourth cache address; corresponding preprocessing is carried out on the first data according to the preprocessing type, and the data in the cache address is updated by utilizing the preprocessed first data, which comprises the following steps: preprocessing the first data by the RAID controller according to a fourth preprocessing type, and updating the data in a fourth cache address by utilizing the preprocessed first data to obtain fourth reference information corresponding to the first data; the method further comprises the following steps: and obtaining check information in the storage array by the RAID controller according to the third reference information and the fourth reference information.
In a possible embodiment, the method further comprises: the RAID controller initializes data in the cache address indicated by the first entry information prior to receiving the first data.
In a fourth aspect, embodiments of the present application disclose a chip system comprising at least one processor, a memory and an interface circuit, wherein the memory, the interface circuit and the at least one processor are interconnected by lines, and instructions are stored in the memory; the method of any implementation of the third aspect is implemented when the instructions are executed by the processor.
In a fifth aspect, embodiments of the present application disclose a computer readable storage medium having stored therein program instructions which, when run on a processor, implement the method of any of the third aspects above.
In a sixth aspect, embodiments of the present application disclose a computer program product, which, when run on a terminal, implements the method according to any of the third aspects above.
In a seventh aspect, an embodiment of the present application provides a terminal device, including a data processing apparatus provided in any one of the embodiments of the first aspect and a discrete device coupled to the data processing apparatus.
Drawings
FIG. 1 is a schematic diagram of a storage array in RAID 5 according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a storage array in RAID 6 according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a stripe structure of a memory array according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a prior art data flow;
FIG. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of a consistency operation according to an embodiment of the present application;
FIG. 7 is a schematic diagram of corresponding cache addresses of different data blocks in a cache unit according to an embodiment of the present application;
FIG. 8 is a schematic diagram of another data processing apparatus according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a data flow provided in an embodiment of the present application;
FIG. 10 is a schematic diagram of the number of bus transfers and the number of data reads and writes according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a hardware configuration of a RAID controller according to an embodiment of the present application;
FIG. 12 is a flow chart of a data processing method according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings in the embodiments of the present application.
First, the disk structures in RAID5 and RAID6 related to the scheme and corresponding data read-write modes are introduced:
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a storage array in RAID5 according to an embodiment of the present application. As shown in FIG. 1, the RAID5 array contains four independent disks: Disk0, Disk1, Disk2 and Disk3. The four disks hold four stripes; each stripe comprises four stripe units, located one on each of the four disks. The first stripe comprises stripe units A1, A2, A3 and Ap; the second comprises B1, B2, Bp and B3; the third comprises C1, Cp, C2 and C3; and the fourth comprises Dp, D1, D2 and D3. Within a stripe, the four stripe units have the same starting position and length on their respective disks. In each stripe shown in FIG. 1, stripe units with numerical subscripts (1, 2 and 3) store disk data, and the stripe unit with the letter subscript (p) stores the parity information corresponding to that disk data (in all embodiments of the present application, the parity information in RAID5 may also be referred to as P data).
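The rotating parity placement listed for FIG. 1 can be reproduced with a few lines. This sketch assumes the parity unit moves one disk to the left on each successive stripe, which matches the four stripes named in the text; the function name `raid5_layout` is made up, and other RAID5 implementations may rotate differently.

```python
def raid5_layout(num_disks: int, stripe_names: str) -> list[list[str]]:
    """Return one row per stripe; row[i] is the stripe unit on disk i."""
    layout = []
    for stripe, name in enumerate(stripe_names):
        # Parity starts on the last disk and rotates left per stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        data_number = 1
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"{name}p")
            else:
                row.append(f"{name}{data_number}")
                data_number += 1
        layout.append(row)
    return layout


for row in raid5_layout(4, "ABCD"):
    print("  ".join(row))
```

Spreading the parity units across all disks avoids making a single disk the bottleneck for every parity update.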
Referring to FIG. 2, FIG. 2 is a schematic structural diagram of a storage array in RAID6 according to an embodiment of the present application. As shown in FIG. 2, the RAID6 storage array contains five independent disks, which may comprise five stripes, each stripe containing five stripe units; for the specific stripe units included in each stripe, refer to FIG. 2, and details are not described here. As shown in FIG. 2, in each stripe, the stripe units with numerical subscripts (1, 2 and 3) store disk data, and the stripe units with letter subscripts (P and Q) store the parity information corresponding to the disk data (in all embodiments of the present application, the two kinds of parity information in RAID6 may be referred to as P data and Q data, respectively).
It can be seen that RAID6 adds a second independent block of parity information as compared to RAID 5. The two independent parity systems use different algorithms, so that the reliability of data is very high, and the data integrity is not affected when any two disks fail simultaneously.
The application scenarios of the embodiments of the present application are described below with reference to FIG. 3. The embodiments of the present application can be applied to the following scenarios in RAID5 or RAID6: scenario one, recovering damaged data in a disk; scenario two, updating check information in a disk; scenario three, recovering damaged data in a disk and updating check information in the disk.
Referring to FIG. 3, FIG. 3 is a schematic diagram of a stripe structure of a storage array according to an embodiment of the present application. The stripe may be any of the five stripes shown in FIG. 2 (that is, RAID6). The three application scenarios above are described below with respect to the stripe shown in FIG. 3. The stripe shown in FIG. 3 comprises five stripe units: D0, D1, D2, P and Q, where D0, D1 and D2 store disk data, and P and Q store the P data and Q data (two independent pieces of check information) corresponding to the disk data, respectively.
When the embodiments of the present application are applied to scenario one, assume that the data in D1 is damaged and the disk data in D1 needs to be recovered. In this case, the data in D0, D2, P and Q needs to be read, and the data in D1 is recovered from the data that was read.
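As a minimal sketch of scenario one, assume that the P data is the byte-wise XOR of the data stripe units (the usual P parity; the source does not spell out the formula). Under that assumption a single damaged unit can be rebuilt from the surviving data units and P alone, so Q is not used here. The helper name `xor_blocks` and the sample bytes are illustrative only.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Illustrative stripe-unit contents (assumed values, not from the source).
d0 = b"\x11\x22\x33\x44"
d1 = b"\xa0\xb1\xc2\xd3"   # the unit that will be treated as damaged
d2 = b"\x0f\x1e\x2d\x3c"

# Assumed definition of the P data: XOR of all data stripe units.
p = xor_blocks(d0, d1, d2)

# Recovery of D1 from the surviving units D0, D2 and P.
recovered_d1 = xor_blocks(d0, d2, p)
assert recovered_d1 == d1
```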
When the embodiments of the present application are applied to scenario two, assume that new D1 data needs to be written into D1. The process of updating the P data and the Q data can be performed in two modes, small write and large write:
(I) Small-write mode
First, the old D1 data, the old P data and the old Q data are read from stripe unit D1, stripe unit P and stripe unit Q, respectively; then the new D1 data is received, and the new P data and the new Q data are calculated from the old D1 data, the old P data, the old Q data and the new D1 data.
(II) Large-write mode
First, the old D0 data and the old D2 data are read from stripe unit D0 and stripe unit D2, respectively; then the new D1 data is received, and the new P data and the new Q data are calculated from the old D0 data, the old D2 data and the new D1 data.
It should be understood that in scenario two under RAID5, only one kind of check information needs to be updated; the data can be read from the disk in either large-write or small-write mode, and the read data is used once. In scenario two under RAID6, the check information to be updated contains two independent parts, and whichever of the two modes is used to read data from the disk, the read data (except for the data in stripe units P and Q) is used twice.
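The two update modes above can be sketched for the P data, again assuming that P is the byte-wise XOR of the data stripe units. The Q data is left out because it depends on a second, independent algorithm that the source does not specify. All names and sample values are illustrative assumptions.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Assumed old contents of the stripe and the new data for D1.
old_d0, old_d1, old_d2 = b"\x01\x02", b"\x10\x20", b"\x0a\x0b"
old_p = xor_blocks(old_d0, old_d1, old_d2)
new_d1 = b"\x55\xaa"

# Mode (I): read old D1 and old P, then fold in the new D1.
p_mode_one = xor_blocks(old_p, old_d1, new_d1)

# Mode (II): read old D0 and old D2, then recompute P with the new D1.
p_mode_two = xor_blocks(old_d0, new_d1, old_d2)

# Both modes must produce the same new P data.
assert p_mode_one == p_mode_two
```

Which mode reads less data depends on how many stripe units are being rewritten, which is why both modes are kept.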
When the embodiments of the present application are applied to scenario three: if updating the check information in the disk does not require the disk data to be recovered, the update of the check information and the recovery of the disk data can be performed simultaneously; if updating the check information requires the disk data to be recovered, the update of the check information must be performed after the corresponding disk data has been recovered. In a RAID6 scenario in which two data disks are damaged and a large write (mode (II) above, in which the check information is recomputed from the data stripe units) is forced, the same data returned by the disk may be used four times. For example, one stripe in a disk includes six stripe units D0, D1, D2, D3, P and Q, where D0, D1, D2 and D3 are stripe units storing data, and P and Q are stripe units storing check information. Assuming that the data in D0 and D1 is damaged, when new D2 data needs to be written to the disk by a large write, the data in D2 and D3 needs to be read from the disk once, and the same data that was read is used four times: to calculate the old D0 data, the old D1 data, the new P data and the new Q data, respectively.
Referring to FIG. 4, FIG. 4 is a schematic diagram of a data flow in the prior art, and describes the data interaction among a storage array (including a plurality of independent disks), a RAID controller and a cache unit in the prior art. As shown in FIG. 4, first, according to data flow (1), the old data in the storage array is read and written into the cache unit. Then, according to data flow (2), the RAID controller reads the old data of the storage array from the cache unit. When the RAID controller needs to perform the consistency operations corresponding to different scenarios, for example recovering damaged data and updating check information at the same time, the RAID controller reads the corresponding data from the cache unit for each operation; that is, data flow (2) may include multiple independent data reading processes. In addition, the RAID controller may also receive, over the bus, new data to be written that is sent by the host (this process is not shown). After obtaining the corresponding old data of the storage array from the cache unit, the RAID controller performs the corresponding consistency operation on that data to obtain the updated check information and/or the data to be recovered in the storage array. Finally, according to data flow (3), the updated check information and/or the data to be recovered is written into the corresponding position of the cache unit. When the RAID controller needs to write the results of different consistency operations to the cache unit at the same time, data flow (3) may include multiple independent data writing processes. When the cache unit is located outside the RAID controller, data flows (1), (2) and (3) all need to be transmitted through the system bus.
Optionally, in FIG. 4, the cache unit may be a readable and writable memory, such as a register or a random access memory (RAM), for example a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), or a double data rate SDRAM (DDR SDRAM). It should be noted that the cache unit may be located inside the RAID controller (that is, on-chip, such as SRAM) or outside the RAID controller (that is, off-chip, such as DDR SDRAM). When the cache unit is located outside the RAID controller, data exchanged between the cache unit and the RAID controller is transmitted through the system bus; when the cache unit is located inside the RAID controller, data exchanged between the cache unit and the RAID controller does not need to be transmitted through the system bus.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a data processing apparatus 500 according to an embodiment of the present application. As shown in FIG. 5, the data processing apparatus 500 may include a redundant array of independent disks (RAID) controller 510 and a storage array 520 coupled to the RAID controller.
The RAID controller 510 is configured to obtain first data on a target stripe in the storage array 520 and first index information corresponding to the first data; the first data is any one piece of data on the target stripe.
Specifically, as shown in FIG. 5, the storage array 520 includes M independent disks, and the storage array 520 may be divided into N stripes. The target stripe may be any one of the N stripes, and M is an integer greater than 2.
The RAID controller 510 is further configured to: determine, based on the first index information, first entry information corresponding to the first data from a preset mapping relationship, where the preset mapping relationship is generated based on the consistency of the stripes, and the first entry information indicates the preprocessing type and the cache address that correspond to the first data in a consistency operation; preprocess the first data according to the preprocessing type; and update the data in the cache address by using the preprocessed first data.
Optionally, the first index information may include specific location information of the first data in the storage array 520 and the correspondence between the first data and the first entry information; that is, the RAID controller can determine the first entry information corresponding to the first data from the preset mapping relationship by parsing the first index information. The preset mapping relationship is generated based on the consistency of the stripes; that is, the entry information corresponding to the different data on the target stripe is generated according to the consistency of the stripe. The first entry information corresponding to the first data may be determined from the preset mapping relationship by table lookup or in another manner, which is not limited in this application. The first index information may be returned to the RAID controller 510 together with the first data, as a command header of the first data.
The specific location information may include a number of a disk to which the first data belongs in the storage array 520 and a specific location of the first data in the disk to which the first data belongs.
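The lookup described above can be pictured as a small table keyed by the index information. The sketch below is only one possible shape: the class names `IndexInfo` and `Entry`, the preprocessing-type labels and the cache addresses are all hypothetical, and the source leaves the actual lookup mechanism open ("table lookup or other manners").

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexInfo:
    disk: int     # number of the disk that holds the data
    offset: int   # position of the data within that disk

@dataclass(frozen=True)
class Entry:
    preprocess_type: str   # hypothetical label for a preprocessing function
    cache_address: int     # where the running result is accumulated

# Hypothetical preset mapping, generated in advance for one consistency
# operation on one stripe (for example, recovering the unit on disk 1).
PRESET_MAPPING = {
    IndexInfo(disk=0, offset=0): Entry("identity", 0x1000),
    IndexInfo(disk=2, offset=0): Entry("identity", 0x1000),
    IndexInfo(disk=3, offset=0): Entry("identity", 0x1000),
}

def lookup_entry(index: IndexInfo) -> Entry:
    """Resolve the entry information for data that arrived with this index."""
    return PRESET_MAPPING[index]

entry = lookup_entry(IndexInfo(disk=2, offset=0))
assert entry.cache_address == 0x1000
```

Because the index travels with the data as its command header, the controller can resolve the entry as soon as the data arrives, without consulting the cache unit first.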
It should be understood that the foregoing consistency operation may be the calculation process, in RAID5/6, of recovering damaged disk data by using the first data, of updating the check information of the disk, or of another scenario. That is, scenario one, scenario two, scenario three, and any other scenario in which data is acquired from the storage array in RAID5/6 may each include a corresponding consistency operation process, which is not limited in this application. The first data may also be referred to as consistency operation data.
In the prior art, the first data in the storage array is first cached in the cache unit, and the first data and its corresponding physical address are then read from the cache unit into the RAID controller. In the embodiments of the present application, by contrast, the storage array can send the first data and the corresponding first index information directly to the RAID controller, and the preprocessing type and the cache address required for the subsequent consistency operation on the first data are then determined from the first index information. This effectively reduces the number of data reads and writes in the cache unit and thereby lowers the requirement on the read/write bandwidth of the cache unit. Meanwhile, because the process of acquiring the first data is simpler, the delay of subsequently recovering disk data and/or updating check information by using the first data can be reduced. In addition, when the cache unit is located outside the RAID controller, caching the first data of the storage array in the cache unit in the prior art requires data transmission over the bus.
In a possible implementation manner, the storage array 520 includes M disks, and the target stripe includes M stripe units, where the M stripe units are located on the M disks, respectively; wherein M is an integer greater than 2; the first data is one or all of the data blocks on any of the M stripe units.
Alternatively, when the storage array 520 splits the data in the M stripe units into a plurality of data blocks, the first data may be one of the plurality of data blocks; when the storage array 520 does not split the data in the M stripe units, the first data may be all data blocks contained in any stripe unit of the M stripe units, that is, all data on any stripe unit.
It can be seen that, in the embodiments of the present application, when the data in a stripe unit is split into multiple data blocks, the first data is one of those data blocks and is therefore smaller, so that it can be returned quickly to the RAID controller for subsequent computation, which reduces the delay of the subsequent processing and improves efficiency.
In a possible implementation, the target stripe further includes second data, and the storage array is further configured to: send the second data and second index information corresponding to the second data to the RAID controller, where the second data is sent before or after the first data.
Alternatively, the second data may include at least one other data block in the target stripe in addition to the first data. When the second data includes a plurality of data blocks, the plurality of data blocks may be returned to the RAID controller in any order, and each data block may simultaneously carry index information corresponding to each data block when returned.
It can be seen that, in the embodiments of the present application, the index information corresponding to a piece of data can be returned to the RAID controller together with that data. The RAID controller can therefore determine, from the index information of each piece of returned data, the preprocessing type and the cache address required for performing the consistency operation on that data, without waiting for all the required data on the target stripe to be returned. Consequently, when the second data includes a plurality of data blocks, the storage array can return the first data and the second data to the RAID controller in any order, and the recovery of the damaged data and/or the update of the check information of the disk can be completed out of order based on the first data and/or the second data, which effectively improves system performance.
Specifically, when the second data includes a plurality of data blocks, the storage array 520 is configured to send the first data and the plurality of data blocks to the RAID controller in any order, and simultaneously send index information corresponding to the data when sending different data.
Further, when the storage array 520 does not split the data in the M stripe units, the first data may be all the data in any one of the M stripe units, and the second data may include the data in at least one of the M stripe units; the specific content of the second data is determined by the application scenario to which the consistency operation belongs. The storage array 520 may send the first data and the second data in any order. When the storage array 520 splits the data in the M stripe units, the first data may be one data block in any stripe unit, and the second data may include at least one remaining data block other than the first data; when there are a plurality of remaining data blocks, they may be located in the same stripe unit or in different stripe units. In this case, the storage array 520 may send the first data and the second data to the RAID controller in either of two ways:
(1) First mode
The storage array 520 first sends all the data blocks belonging to one stripe unit in the target stripe, then sends all the data blocks belonging to another stripe unit, and continues in this way until the first data and the second data have been sent. The storage array 520 may determine, according to the specific application scenario, which stripe unit's data blocks are sent first and which are sent later, which is not specifically limited in this application. For example, assume that the data in stripe unit 3 is split, from data header to data trailer, into data block 1, data block 2 and data block 3, and that the data in stripe unit 4 is split, from data header to data trailer, into data block 4, data block 5 and data block 6. The storage array 520 may then send all the data blocks in stripe unit 3 first and then all the data blocks in stripe unit 4, or send all the data blocks in stripe unit 4 first and then all the data blocks in stripe unit 3.
Further, the storage array 520 may send the multiple data blocks split from the same stripe unit to the RAID controller either in sequence or out of order. For example, assume that the data in stripe unit 3 is split into data block 1, data block 2 and data block 3, where data block 1 is the data header and data block 3 is the data trailer. When sending the three data blocks in stripe unit 3, the storage array 520 may send them in the order data block 1, data block 2, data block 3, or may send them to the RAID controller in another order.
(2) Second mode
The storage array 520 may alternately send data blocks belonging to different stripe units, in any order. Specifically, the storage array 520 may first send one or more data blocks (not all data blocks) of any stripe unit, and then send one or more data blocks (not all data blocks) of another stripe unit, until the first data and the second data have been sent. The storage array 520 may determine, according to the specific application scenario, which stripe unit's partial data blocks are sent first and which are sent later, which is not specifically limited in this application. For example, when the data in stripe unit 3 is split, from data header to data trailer, into data block 1, data block 2 and data block 3, and the data in stripe unit 4 is split, from data header to data trailer, into data block 4, data block 5 and data block 6, the storage array 520 may send, in sequence, data block 2 in stripe unit 3, data block 5 and data block 6 in stripe unit 4, data block 1 and data block 3 in stripe unit 3, and data block 4 in stripe unit 4.
It should be noted that the RAID controller 510 can determine the entry information corresponding to a piece of returned data from its index information, and then perform the subsequent exclusive-or (XOR) operation based on the preprocessing type and the cache address included in the entry information. Because the logic of the exclusive-or operation is fixed, the first data and the second data can be returned to the RAID controller out of order, and the calculation of the data to be recovered and/or the check information of the disk can likewise be completed out of order. The embodiments of the present application can therefore effectively improve system performance.
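The reason out-of-order return is safe is that XOR is commutative and associative, so the accumulated result does not depend on arrival order. A minimal sketch with made-up block contents (the `accumulate` helper and the six sample blocks are illustrative, loosely following the data block 1 to data block 6 example above):

```python
import random

def accumulate(blocks):
    """XOR all blocks into one running result, in the order given."""
    acc = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            acc[i] ^= byte
    return bytes(acc)

# Six illustrative data blocks of equal size.
blocks = [bytes([i, (i * 3) % 256, 255 - i]) for i in range(1, 7)]

in_order = accumulate(blocks)

# Simulate an arbitrary return order from the storage array.
shuffled = blocks[:]
random.Random(0).shuffle(shuffled)

assert accumulate(shuffled) == in_order
```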
The process by which the RAID controller 510 calculates the data to be recovered (that is, recovers damaged disk data) by using the first data and the second data received from the storage array 520, namely scenario one, is described in detail below.
In a possible implementation manner, the first entry information includes a first preprocessing type and a first cache address; the RAID controller is specifically configured to: preprocessing the first data according to the first preprocessing type, and updating the data in the first cache address by utilizing the preprocessed first data to obtain first reference information corresponding to the first data; the RAID controller is also configured to: determining second table item information corresponding to second data from a preset mapping relation based on second index information; the second table entry information comprises a second preprocessing type and a second cache address; preprocessing second data according to a second preprocessing type, and updating data in a second cache address by utilizing the preprocessed second data to obtain second reference information corresponding to the second data; and obtaining the data to be recovered in the storage array according to the first reference information and the second reference information.
Optionally, preprocessing the first data according to the first preprocessing type, and updating data in the first cache address by using the preprocessed first data to obtain first reference information corresponding to the first data, which specifically includes: the RAID controller 510 obtains data in the first cache address, performs an exclusive-or XOR operation on the preprocessed first data and the data in the first cache address, obtains first reference information (i.e. a result of the exclusive-or operation) corresponding to the first data, and writes the first reference information into the first cache address. Similarly, the calculation process of the second reference information may refer to the corresponding process in the first reference information, which is not described herein.
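The update just described is a read, XOR, write-back cycle on one cache region. The sketch below models the cache unit as a `bytearray` and assumes the region starts out zeroed; the function name `update_cache` and the addresses are illustrative, not taken from the source.

```python
def update_cache(cache: bytearray, address: int, preprocessed: bytes) -> bytes:
    """Read the cache region, XOR in the preprocessed data, write the result back.

    Returns the reference information, i.e. the XOR result now stored in the cache.
    """
    end = address + len(preprocessed)
    current = cache[address:end]                       # data in the cache address
    reference = bytes(a ^ b for a, b in zip(current, preprocessed))
    cache[address:end] = reference                     # write the result back
    return reference

cache = bytearray(16)                                  # assumed to start zeroed
first_ref = update_cache(cache, 4, b"\x0f\xf0")        # first data
second_ref = update_cache(cache, 4, b"\xff\xff")       # second data, same address

assert first_ref == b"\x0f\xf0"
assert second_ref == b"\xf0\x0f"
```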
The first data corresponds to first index information, the first index information corresponds to first table item information, and the first table item information is used for indicating a preprocessing type and a cache address of the first data for consistency operation. Optionally, the first entry information may include K preprocessing types and K cache addresses; the K preprocessing types and the K cache addresses are respectively in one-to-one correspondence, each preprocessing type and the corresponding cache address are respectively used for the consistency operation of the first data in different scenes, and K is a positive integer. Similarly, the second data corresponds to second index information, the second index information corresponds to second entry information, and the second entry information is used for indicating a preprocessing type and a cache address of the second data for consistency operation. It should be noted that when the second data includes a plurality of data blocks, each data block corresponds to one index information and entry information, that is, when the second index information includes index information respectively corresponding to the plurality of data blocks, the second entry information includes entry information respectively corresponding to the plurality of data blocks.
The preprocessing type is a data processing manner in RAID5/6, that is, the manner in which the RAID controller 510 processes data with a preprocessing function after receiving the data from the storage array 520. For example, when the data received from the storage array 520 is 50, the data 50 may be processed with the corresponding preprocessing function to obtain preprocessed data 89; an exclusive-or operation is then performed on the data 89 and the data stored in the cache address corresponding to the data 50, to obtain the reference information corresponding to the data 50, namely the result of the exclusive-or operation. In summary, preprocessing is the data processing performed after the data on the storage array 520 is returned and before the exclusive-or operation.
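The source does not say which preprocessing functions are used. As one hedged illustration, a RAID6-style design might offer an identity type (for P-like XOR) and a type that multiplies each byte by a coefficient in GF(2^8) (for Q-like parity). Both type names, the reduction polynomial 0x11d and the coefficient below are assumptions for this sketch, and it does not attempt to reproduce the 50 to 89 example above.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result

def preprocess(data: bytes, preprocess_type: str, coefficient: int = 1) -> bytes:
    """Apply a hypothetical preprocessing type before the XOR step."""
    if preprocess_type == "identity":
        return data
    if preprocess_type == "gf_mul":
        return bytes(gf_mul(byte, coefficient) for byte in data)
    raise ValueError(f"unknown preprocessing type: {preprocess_type}")

assert preprocess(b"\x32", "identity") == b"\x32"
assert preprocess(b"\x80", "gf_mul", 2) == b"\x1d"
```

Keeping the preprocessing separate from the XOR step means the accumulation logic stays the same for every consistency operation, and only the entry information changes.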
The process of generating data to be restored in a target stripe on the storage array 520 using the first data and the second data will be described below with reference to fig. 6 and 7. As shown in fig. 6, fig. 6 is a flow chart 600 of a coherency operation, including steps 610, 620, 630, and 640.
Step 610, the RAID controller receives the currently returned ith data block, obtains the corresponding entry information from the preset mapping relationship according to the index information corresponding to the ith data block, and preprocesses the ith data block according to the entry information. When the second data comprises a plurality of data blocks, the ith data block is the first data or one of the plurality of data blocks; when the second data is one data block, the ith data block is the first data or the second data; i is a positive integer.
Specifically, the process of preprocessing the ith data block by the RAID controller is the same as the process of preprocessing the first data by the RAID controller, which is not described herein.
It should be understood that the preprocessing type corresponding to each returned data block is determined according to a specific application scenario, and may be the same or different, which is not limited in this application. The different preprocessing types correspond to different preprocessing functions, and the RAID controller performs preprocessing on the ith data block according to a preprocessing algorithm corresponding to the ith data block, for example, when data in the ith data block is 67, the data in the ith data block may be preprocessed according to the corresponding preprocessing algorithm, so that the preprocessed ith data block is a value different from the data in the ith data block, such as 34.
Step 620, determining whether the cache address corresponding to the ith data block overlaps the cache address corresponding to the data block on which the RAID controller is currently performing exclusive-or processing (also referred to as the current data block).
Overlap here means that the cache address corresponding to the ith data block and the cache address corresponding to the current data block coincide completely or partially.
Step 630, when the determination result in step 620 is yes, the RAID controller waits for the data in the cache address corresponding to the current data block to be updated, and then performs a corresponding consistency operation on the ith data block to update the data in the cache address corresponding to the ith data block, that is, serial update.
Specifically, when the cache address corresponding to the ith data block overlaps the cache address corresponding to the current data block, the RAID controller waits until the data in the cache address corresponding to the current data block has been updated, then obtains the data in the cache address corresponding to the ith data block, performs an exclusive-or (XOR) operation on the preprocessed ith data block and the data in that cache address to obtain the reference information corresponding to the ith data block, and writes the reference information into the cache address corresponding to the ith data block.
Step 640, when the determination result in step 620 is "no", the RAID controller may immediately start updating the data in the cache address corresponding to the ith data block, that is, update in parallel. For the specific updating process, refer to the corresponding process in the foregoing embodiment; details are not repeated here.
In summary, because updating the data in the cache address corresponding to the ith data block requires the data already stored in that cache address, the update must be serial when the cache address corresponding to the ith data block partially or completely overlaps the cache address corresponding to the current data block; when the two cache addresses do not overlap, the updates can be performed in parallel.
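The decision in steps 620 to 640 reduces to an interval-overlap test on two cache regions. A minimal sketch follows, representing each region as a hypothetical `(cache_address, length)` pair; the function names and the string labels are illustrative only.

```python
def ranges_overlap(start_a: int, len_a: int, start_b: int, len_b: int) -> bool:
    """True when two cache regions coincide completely or partially (step 620)."""
    return start_a < start_b + len_b and start_b < start_a + len_a

def choose_update_mode(incoming, in_flight):
    """Pick serial or parallel update for the ith data block.

    incoming and in_flight are (cache_address, length) pairs; in_flight is
    None when no other block is currently being XOR-processed.
    """
    if in_flight is not None and ranges_overlap(*incoming, *in_flight):
        return "serial"     # step 630: wait for the current block's update
    return "parallel"       # step 640: start updating immediately

assert choose_update_mode((0, 8), (4, 8)) == "serial"      # partial overlap
assert choose_update_mode((0, 8), (0, 8)) == "serial"      # complete overlap
assert choose_update_mode((0, 4), (4, 4)) == "parallel"    # adjacent, no overlap
```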
It should be understood that, when the storage array 520 splits data in M stripe units, multiple cache addresses corresponding to multiple data blocks split by the same stripe unit do not overlap; when the storage array 520 divides the data in each of the M stripe units using different rules, the respective cache addresses of the data blocks belonging to the different stripe units may overlap.
A process of obtaining data to be restored in the storage array 520 using the first reference information and the second reference information will be described below with reference to fig. 7. Referring to fig. 7, the target cache address in fig. 7 is a union of the first cache address and the second cache address. The first cache address is used for storing first reference information. The second cache address is used for storing second reference information, and when the second data comprises a plurality of data blocks, the second cache address can comprise a plurality of cache addresses respectively corresponding to the plurality of data blocks; at this time, the target cache address is a union of the first cache address and the plurality of cache addresses. The first cache address and any two cache addresses of the plurality of cache addresses may not overlap, or may overlap entirely or partially in the target cache address. The ith data block is one of a plurality of data blocks contained in the first data or the second data.
Specifically, as shown in FIG. 7, the cache address corresponding to the ith data block partially overlaps the cache address corresponding to the (i-1)th data block, and the cache address corresponding to the (i+1)th data block completely overlaps the cache address corresponding to the (i+2)th data block. After the reference information corresponding to each data block is obtained by parallel and/or serial calculation as described in the embodiment of FIG. 6, it is written into the corresponding cache address. After the reference information of the last data block to undergo the consistency operation has been written into its corresponding cache address, the data stored in the target cache address is the data to be recovered on the target stripe in the storage array 520.
The target cache address may be located in the cache unit shown in FIG. 4. The cache unit may be a readable and writable memory, such as a register or a random access memory (RAM), for example a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), or a double data rate SDRAM (DDR SDRAM). It should be noted that the cache unit may be located inside the RAID controller (that is, on-chip, such as SRAM) or outside the RAID controller (that is, off-chip, such as DDR SDRAM).
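The way the target cache address ends up holding the recovered data can be sketched end to end. The example assumes a single damaged unit, P data defined as the byte-wise XOR of the data units, identity preprocessing, and a zero-initialized target cache; the split rules, offsets and values are made up for illustration.

```python
def xor_into(cache: bytearray, address: int, block: bytes) -> None:
    """Read-XOR-write update of the cache region that starts at address."""
    for i, byte in enumerate(block):
        cache[address + i] ^= byte

d0 = bytes(range(0, 8))
d1 = bytes(range(100, 108))        # damaged unit, to be recovered
d2 = bytes(range(50, 58))
p = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))

# (cache offset, data block) pairs. Each surviving unit is split by a
# different rule, so the cache regions of different units overlap partially
# or completely, while blocks from the same unit do not overlap.
arrivals = [
    (0, d0[:3]), (3, d0[3:]),      # D0 split 3 + 5
    (0, d2[:4]), (4, d2[4:]),      # D2 split 4 + 4
    (0, p),                        # P returned unsplit
]

target_cache = bytearray(8)        # union of all the cache regions above
for address, block in reversed(arrivals):   # any arrival order works
    xor_into(target_cache, address, block)

assert bytes(target_cache) == d1
```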
The process by which the RAID controller generates the check information in the target stripe by using the first data received from the storage array 520, namely scenario two, is described in detail below.
In one possible implementation, the RAID controller is further configured to: receive third data to be written to the storage array and third index information corresponding to the third data; determine, based on the third index information, third entry information corresponding to the third data from the preset mapping relationship, where the third entry information comprises a third preprocessing type and a third cache address; preprocess the third data according to the third preprocessing type; and update the data in the third cache address by using the preprocessed third data, to obtain third reference information corresponding to the third data.
It should be appreciated that the third data may be data that the host sends to the RAID controller and that is subsequently to be written to the storage array 520. The third data may include one or more data blocks; a plurality of data blocks may be sent to the RAID controller in any order, and each data block carries its corresponding index information when it is sent.
Specifically, when the third data includes one data block, the RAID controller 510 may determine the third entry information corresponding to the third data from the preset mapping relationship according to the third index information, and perform the corresponding consistency operation on the third data by using the third preprocessing type and the third cache address included in the third entry information. For the consistency operation on the third data, refer to the corresponding process for the first data; details are not described here.
It should be understood that, when the third data includes a plurality of data blocks, the consistency operation corresponding to each data block is the same as the consistency operation corresponding to the third data as one data block, which is not described herein.
It can be seen that, in the embodiment of the present application, the RAID controller may further perform a corresponding consistency operation on third data sent by the host to obtain third reference information corresponding to the third data. When the third data comprises a plurality of data blocks, because each data block carries its corresponding index information, the RAID controller can perform the corresponding consistency operation on each data block directly in the order in which the data blocks arrive (namely out-of-order return and out-of-order calculation), rather than in the order the data blocks had before splitting, thereby effectively improving the system performance. Meanwhile, because out-of-order calculation can be performed, the delay of the consistency operation on the third data can be reduced, which further reduces the delay of subsequently obtaining the verification information from the result of that consistency operation.
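The out-of-order calculation described above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that updating a cache address means a byte-wise XOR into it and that the preprocessing types are "copy" (no change) and "times2" (multiplication by 2 in GF(2^8) with polynomial 0x11D). The names `MAPPING`, `PREPROCESS`, `consume` and `accumulate`, as well as the index values, are hypothetical and do not come from this application.

```python
from itertools import permutations


def times2(byte: int) -> int:
    """Multiply one byte by 2 in GF(2^8), reducing by 0x11D (assumed polynomial)."""
    doubled = byte << 1
    return doubled ^ 0x11D if doubled & 0x100 else doubled


# Hypothetical preprocessing types, applied byte by byte.
PREPROCESS = {"copy": lambda b: b, "times2": times2}

# Hypothetical preset mapping: index information -> (preprocessing type, cache offset).
MAPPING = {0: ("copy", 0), 1: ("times2", 0), 2: ("copy", 0)}


def consume(cache: bytearray, index: int, block: bytes) -> None:
    """Look up the entry for this block, preprocess it, and XOR it into the cache."""
    pre_type, addr = MAPPING[index]
    pre = PREPROCESS[pre_type]
    for i, byte in enumerate(block):
        cache[addr + i] ^= pre(byte)


def accumulate(arrivals, size: int) -> bytes:
    """Process blocks strictly in arrival order, starting from a cleared cache."""
    cache = bytearray(size)
    for index, block in arrivals:
        consume(cache, index, block)
    return bytes(cache)


blocks = {0: b"\x11\x22", 1: b"\x33\x84", 2: b"\x55\x66"}
# Try every possible arrival order and collect the distinct results.
results = {
    accumulate([(i, blocks[i]) for i in order], 2)
    for order in permutations(blocks)
}
```

Because XOR is commutative and associative, `results` contains a single value: under these assumptions every arrival order leaves the same content in the cache, which is why the blocks need not be reordered before the calculation.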
In a possible implementation manner, the first entry information includes a fourth preprocessing type and a fourth cache address; the RAID controller is specifically configured to: preprocessing the first data according to a fourth preprocessing type, and updating the data in a fourth cache address by utilizing the preprocessed first data to obtain fourth reference information corresponding to the first data; the RAID controller is also configured to: and obtaining the check information in the storage array according to the third reference information and the fourth reference information.
Specifically, the process of performing the coherency operation on the first data by using the fourth preprocessing type and the fourth cache address by the RAID controller 510 may correspond to the corresponding process of performing the coherency operation on the first data by using the first preprocessing type and the first cache address, which will not be described herein.
Further, the process by which the RAID controller 510 generates the parity information in the disk according to the third reference information and the fourth reference information is similar to the embodiment shown in fig. 7. When the third data includes a plurality of data blocks, the update process of the parity information is specifically as follows: within the reference cache address, any two cache addresses among the fourth cache address and the plurality of cache addresses corresponding to the plurality of data blocks in the third data may be non-overlapping, completely overlapping or partially overlapping; the reference cache address may be the union of the fourth cache address and the plurality of cache addresses corresponding to the plurality of data blocks in the third data. The RAID controller 510 updates the data in the cache addresses corresponding to the first data and to the plurality of data blocks in the third data according to the calculation process in the foregoing embodiment. After the reference information corresponding to the last data block to undergo the consistency operation is written into the cache address corresponding to that data block, the data stored in the reference cache address is the verification information to be updated on the target stripe in the storage array 520.
It should be appreciated that the order of calculation of the RAID controller 510 to update the parity information using the first data and third data described above may refer to the corresponding process in the embodiment of FIG. 6. Specifically, any one of the data blocks included in the first data and the third data may be used as the ith data block shown in fig. 6, and the processing sequence of the RAID controller 510 for the ith data block follows the principle in the embodiment of fig. 6, that is, whether the processing process of the ith data block and the current data block is parallel or serial is determined according to whether the corresponding buffer addresses of the ith data block and the current data block overlap, which is not described herein.
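The parallel-or-serial decision described above can be sketched as a simple range-overlap test. This is a minimal illustration, assuming each cache address is a contiguous byte range; `CacheRange`, `overlaps` and `schedule` are hypothetical names, not terms from this application.

```python
from typing import Iterable, NamedTuple


class CacheRange(NamedTuple):
    start: int   # first byte offset of the cache address (inclusive)
    length: int  # number of bytes covered


def overlaps(a: CacheRange, b: CacheRange) -> bool:
    """True if the two ranges share at least one byte (partial or complete overlap)."""
    return a.start < b.start + b.length and b.start < a.start + a.length


def schedule(new_block: CacheRange, in_flight: Iterable[CacheRange]) -> str:
    """Serialize a block whose cache range collides with a block still being processed."""
    if any(overlaps(new_block, r) for r in in_flight):
        return "serial"
    return "parallel"


no_overlap = schedule(CacheRange(0, 64), [CacheRange(64, 64)])
partial = schedule(CacheRange(32, 64), [CacheRange(0, 64)])
complete = schedule(CacheRange(0, 64), [CacheRange(0, 64)])
```

Here `no_overlap` is `"parallel"`, while `partial` and `complete` are both `"serial"`, since two read-modify-write updates to shared bytes must not interleave.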
Optionally, in a different scenario, the RAID controller 510 may also use the second data to collectively complete the update of the verification information in the target stripe. At this time, the second entry information may include a fifth preprocessing type and a fifth cache address, and the RAID controller 510 performs a consistency operation on the second data by using the fifth preprocessing type and the fifth cache address to obtain fifth reference information; and then obtaining the verification information to be updated in the target strip according to the third reference information, the fourth reference information and the fifth reference information.
Specifically, the process of performing the consistency operation on the second data by using the fifth preprocessing type and the fifth cache address, and the process of obtaining the to-be-updated verification information in the target stripe according to the third reference information, the fourth reference information and the fifth reference information may be referred to the detailed description of the foregoing embodiments, which is not repeated herein.
It should be understood that, when the second data includes a plurality of data blocks, the second entry information includes a preprocessing type and a buffer address corresponding to the plurality of data blocks, respectively, and the fifth reference information includes a plurality of reference information corresponding to the plurality of data blocks, respectively; the calculation process of each reference information (i.e., the consistency operation process of each data block) in the plurality of reference information may refer to the foregoing embodiments, and will not be described herein.
Optionally, the cache unit where the reference cache address is located may be the same as the cache unit where the target cache address is located, which is not described herein.
It should be appreciated that the embodiments of the present application exemplarily describe that the first entry information may include a first preprocessing type and/or a fourth preprocessing type, and a first cache address and/or a fourth cache address. Those skilled in the art will appreciate that the first entry information may contain Q preprocessing types and Q cache addresses; wherein the Q preprocessing types and the Q cache addresses are in one-to-one correspondence. Each preprocessing type and the buffer address corresponding to the preprocessing type may correspond to one application scenario in RAID5/6, that is, the Q preprocessing types and the Q buffer addresses respectively correspond to Q application scenarios, where the Q application scenarios may be scenarios where first data is obtained from a disk to perform subsequent operations, for example, disk data recovery, disk verification information update, disk data recovery and disk verification information update, or other scenarios, which are not limited in this application, and Q is a positive integer.
It can be seen that in the embodiment of the present application, the first entry information further includes a fourth preprocessing type and a fourth cache address. Because the first index information corresponding to the first data is returned to the RAID controller together with the first data, the RAID controller can determine the corresponding first entry information according to the first index information and perform the corresponding consistency operation on the first data according to the fourth preprocessing type and the fourth cache address contained in the first entry information. The RAID controller does not need to wait for all other data blocks in the storage array to be returned; it performs the corresponding consistency operations directly in the order in which the data is returned (namely out-of-order return and out-of-order calculation), thereby effectively improving the system performance. Meanwhile, because the first data returned by the disk carries the corresponding first index information, the corresponding preprocessing type and cache address can be determined from the first index information and the preset mapping relationship, and the first data does not need to be staged in a cache unit first. This reduces the number of reads and writes of the cache unit and the number of bus transmissions, and further reduces the delay of obtaining the verification information to be updated in the storage array from the consistency operation result of the first data.
In a possible implementation manner, the RAID controller is further configured to: before receiving the first data, initializing the data in the cache address indicated by the first entry information.
Specifically, in different scenarios in RAID5/6, the data used for performing the coherency operation is different, and before the RAID controller 510 receives the data required for the different scenarios, the data in the cache address corresponding to the data used for performing the coherency operation is initialized, where the initialization process may be a zero clearing process.
Wherein, in scenario one in RAID5/6, the data for performing the consistency operation is from the storage array 520 (e.g., may include first data and/or second data); in scenario two in RAID5/6, the data for performing the consistency operation is from the storage array 520 and the host (e.g., may include first data and third data); in scenario three in RAID5/6, the data for performing the consistency operation is from the storage array 520 and the host (e.g., may include first data, second data, and third data).
It should be appreciated that the above description of scenario two under RAID5/6 may be understood as the process of generating one kind of check information. When Q independent kinds of check information need to be updated in the target stripe, the first entry information comprises Q corresponding preprocessing types and Q corresponding cache addresses, and the third entry information likewise comprises Q corresponding preprocessing types and Q corresponding cache addresses, where Q is a positive integer. The generation processes of the Q kinds of check information may be performed in parallel, and the update process of each kind of check information is the same as the check information generation process described in the above embodiment, which is not described herein again.
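As a hedged illustration of entry information that holds several (preprocessing type, cache address) pairs, the sketch below uses the conventional RAID6 pair of check values, P as the XOR of the data blocks and Q as the XOR of the blocks weighted by powers of 2 in GF(2^8). The application does not state these formulas; the polynomial 0x11D, the layout `P_ADDR`/`Q_ADDR`, and the helper names are assumptions chosen for the example.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow2(n: int) -> int:
    """Return 2**n in GF(2^8)."""
    value = 1
    for _ in range(n):
        value = gf_mul(value, 2)
    return value


BLOCK = 4                    # bytes per data block (illustrative)
P_ADDR, Q_ADDR = 0, BLOCK    # hypothetical cache offsets of the two check values


def entry_for(i: int):
    """Hypothetical entry for data block i: one (coefficient, cache address) pair per check value."""
    return [(1, P_ADDR), (gf_pow2(i), Q_ADDR)]


def update(cache: bytearray, i: int, block: bytes) -> None:
    """Apply every pair in the entry: scale the block, then XOR it into that cache address."""
    for coeff, addr in entry_for(i):
        for k, byte in enumerate(block):
            cache[addr + k] ^= gf_mul(coeff, byte)


data = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
cache = bytearray(2 * BLOCK)   # cleared before the first block arrives
for i in (2, 0, 1):            # blocks arrive out of order
    update(cache, i, data[i])
p, q = bytes(cache[:BLOCK]), bytes(cache[BLOCK:])
```

Each arriving block is consumed once and contributes to both check values, so the two updates need no second pass over the data; since the P and Q cache ranges do not overlap, they could also be carried out in parallel.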
It can be seen that, in the embodiment of the present application, before receiving the first data and performing the subsequent consistency operation, the RAID controller needs to initialize, that is, clear, the data in the cache address indicated by the first entry information and/or the second entry information; and then the RAID controller starts to perform corresponding consistency operation on the received first data block and/or second data block, so as to ensure that correct data and/or verification information to be recovered of the disk are generated.
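Why the cache address has to be cleared first can be shown with a few lines of Python. The sketch assumes that the update is a byte-wise XOR accumulation, which the surrounding text does not spell out; `xor_into` and the sample bytes are illustrative only.

```python
def xor_into(cache: bytearray, addr: int, block: bytes) -> None:
    """Accumulate a block into the cache by XOR (assumed update rule)."""
    for i, byte in enumerate(block):
        cache[addr + i] ^= byte


blocks = [b"\x0f\xf0", b"\x33\x33"]
expected = bytes(a ^ b for a, b in zip(*blocks))

stale = b"\xaa\xbb"  # leftovers from an earlier operation at the same address

# Without initialization the stale bytes are folded into the result.
dirty = bytearray(stale)
for block in blocks:
    xor_into(dirty, 0, block)

# With initialization (zero clearing) the accumulation starts from a neutral value.
clean = bytearray(stale)
clean[0:2] = bytes(2)
for block in blocks:
    xor_into(clean, 0, block)
```

After both runs, `clean` equals `expected` while `dirty` does not, because zero is the identity element of XOR and any other starting content corrupts the accumulated value.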
The process by which the RAID controller 510 recovers corrupted data in a target stripe of the storage array 520 and updates verification information in the target stripe, namely scenario three, will be described.
Specifically, the process in the third scenario may refer to the corresponding process in the first scenario and the second scenario, which are not described herein. It should be noted that in scenario three, when updating the check information needs to use the data to be recovered in the target stripe, the RAID controller needs to wait for the recovery of the data to be recovered in the target stripe before updating the check information; when the process of updating the check information does not need to use the data to be recovered, the recovery process of the data to be recovered and the update process of the check information can be performed in parallel.
In addition, the embodiment of the present application only illustrates the recovery of the damaged data in one stripe (i.e. the target stripe in the present application) and/or the update process of the corresponding check information in the stripe, and those skilled in the art may use the embodiment of the present application to execute any one of the above three scenarios or other possible scenarios on the data in the remaining one or more stripes in the storage array 520 in parallel, which is not limited in this application.
Referring to fig. 8, fig. 8 is a schematic structural diagram of another data processing apparatus 500 according to an embodiment of the present application, which is used as a refinement of functional modules of a RAID controller 510 in the data processing apparatus 500 in fig. 5. As shown in fig. 8, the RAID controller 510 may include a management unit 511, a determination unit 512, a parsing unit 513, and an operation unit 514. The management unit 511 is configured to manage entry information (e.g., the first entry information, the second entry information, or the third entry information) corresponding to data of a consistency operation (e.g., the data of the consistency operation may include one or more of the first data, the second data, or the third data). Specifically, the management unit 511 may generate entry information corresponding to the consistency operation data according to the target command sent by the host, and clear data in the cache address indicated by the entry information after receiving the target command; wherein the target command indicates a specific application scenario that the RAID controller 510 will execute subsequently. The determining unit 512 is configured to determine, from a preset mapping relationship, entry information corresponding to the consistency operation data according to index information corresponding to the consistency operation data, and a specific process thereof may refer to the description of the foregoing embodiment, which is not repeated herein. The parsing unit 513 is configured to parse the content in the table entry information to obtain a preprocessing type and a cache address corresponding to the data for consistency operation; and sends the data for the coherency operation, and the type of preprocessing and the cache address corresponding to the data for the coherency operation, to the operation unit 514. 
The operation unit 514 is configured to perform corresponding preprocessing on the data for performing the consistency operation according to a preprocessing type corresponding to the data for performing the consistency operation, so as to obtain preprocessed data for performing the consistency operation; and updating the data in the cache address corresponding to the data of the consistency operation by utilizing the preprocessed consistency operation data to obtain the data to be recovered and/or the verification information in the storage array 520. The specific process of the operation unit 514 generating the data to be recovered and/or the verification information in the storage array by using the data of the consistency operation can be referred to in the corresponding embodiment, and will not be described herein.
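The division of work among the four units can be sketched as one small Python class. This is only an illustration: the target command is modeled as a plain dictionary from index to entry, the only preprocessing type is "identity", and the cache update is assumed to be a byte-wise XOR. All class, method and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    preprocessing: str  # name of the preprocessing type
    cache_addr: int     # offset of the cache address
    length: int         # number of bytes at that address


# Hypothetical preprocessing types; a real controller would offer more.
PREPROCESS = {"identity": lambda block: block}


class RaidControllerSketch:
    def __init__(self, cache_size: int) -> None:
        self.cache = bytearray(cache_size)
        self.mapping = {}

    def manage(self, target_command: dict) -> None:
        """Management unit: install the entries and clear the cache ranges they name."""
        self.mapping = dict(target_command)
        for entry in self.mapping.values():
            end = entry.cache_addr + entry.length
            self.cache[entry.cache_addr:end] = bytes(entry.length)

    def determine(self, index: int) -> Entry:
        """Determining unit: index information -> entry information."""
        return self.mapping[index]

    def parse(self, entry: Entry):
        """Parsing unit: extract the preprocessing function and the cache address."""
        return PREPROCESS[entry.preprocessing], entry.cache_addr

    def operate(self, block: bytes, preprocess, addr: int) -> None:
        """Operation unit: preprocess the block and XOR it into the cache address."""
        for i, byte in enumerate(preprocess(block)):
            self.cache[addr + i] ^= byte

    def receive(self, index: int, block: bytes) -> None:
        """Run one arriving block through determine -> parse -> operate."""
        preprocess, addr = self.parse(self.determine(index))
        self.operate(block, preprocess, addr)


ctrl = RaidControllerSketch(4)
ctrl.cache[:] = b"\xff" * 4  # stale content left by an earlier operation
ctrl.manage({0: Entry("identity", 0, 4), 1: Entry("identity", 0, 4)})
ctrl.receive(1, b"\x01\x02\x03\x04")  # blocks may arrive in any order
ctrl.receive(0, b"\x10\x20\x30\x40")
recovered = bytes(ctrl.cache)
```

In this run `recovered` is the XOR of the two blocks, independent of the stale bytes that were in the cache before `manage` cleared it.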
Referring to fig. 9, fig. 9 is a schematic data flow diagram of an embodiment of the present application, which is used to describe a process of sending data from a disk to a RAID controller. As shown in fig. 9, the data flow (4) describes a process of the storage array 520 returning the consistency operation data to the RAID controller 510 in the embodiment of the present application, and the RAID controller 510 performs a corresponding consistency operation on the received consistency operation data (the consistency operation may be the calculation process in the three scenarios or other RAID5/6 scenarios described above).
It can be seen that, in the embodiment of the present application, the storage array 520 may directly send the consistency operation data and the index information corresponding to the data to the RAID controller 510, and compared with the data acquisition mode in the prior art (as shown in the data stream (5) and the data stream (6) in fig. 9), the cache process of the cache unit is not needed, so that the read-write times of the cache unit are reduced, and further the bus transmission times and the system power consumption are reduced; in addition, because part of the data transmission process is omitted, the delay for obtaining the consistency operation result can be reduced.
Referring to fig. 10, fig. 10 is a schematic flow chart of bus transmission times and data read-write times provided in the embodiment of the present application. As shown in fig. 10, in RAID6, the process of acquiring storage array data to update verification information in the prior art is as follows: the storage array firstly sends data to the cache unit, the RAID controller reads the data from the cache unit, and then the RAID controller carries out corresponding consistency operation on the read data to obtain new P data and new Q data; and finally, writing the new P data and the new Q data into the cache unit. The process of acquiring storage array data for updating verification information in the embodiment of the application is as follows: the storage array directly sends the data to the RAID controller, and the RAID controller carries out corresponding consistency operation on the read data to obtain new P data and new Q data; and finally, writing the new P data and the new Q data into the cache unit.
In the process of writing new P data into the cache unit, the RAID controller first reads the data currently stored at the cache address to which the new P data is to be written, and then writes the new P data into that cache address of the cache unit. The process therefore comprises one read and one write of the cache unit and two data transmissions over the system bus. The above new P data or new Q data may be the third reference information and the fourth reference information in the foregoing embodiments. The above process is described for a cache unit located outside the RAID controller. It should be understood that the cache unit may also be located inside the RAID controller, which is not limited in this application.
The following will describe in detail the bus transfer times and data read/write times in the prior art and the embodiments of the present application under three different conditions.
Condition one: in scenario one (disk data recovery) under RAID6, under the condition that the buffer unit buffer is located outside the RAID controller, compared with the prior art, the embodiment of the present application may reduce the number of bus transmission times and the number of read/write times of the buffer unit (see, for example, the statistics of the number of times in scenario one in table 1). The number of bus transfers and the number of reads and writes of the buffer units under the scenario in table 1 are examples under the single bad disk condition in RAID 6.
In the prior art, as shown in fig. 10, the data of the storage array is first written into the cache unit; in this process the number of bus transmissions is 1 and the number of cache unit reads and writes is 1. The RAID controller then acquires the storage array data from the cache unit; in this process the number of bus transmissions is 1 and the number of cache unit reads and writes is 1. Finally, the RAID controller performs the corresponding consistency operation on the acquired storage array data to obtain the data to be recovered, and writes the data to be recovered into the cache unit; in this process the number of bus transmissions is 2 and the number of cache unit reads and writes is 2. In summary, when the prior art is used, the number of bus transmissions is 4, and the number of cache unit reads and writes is 4.
In the embodiment of the application, as shown in fig. 10, the RAID controller directly acquires storage array data from the disk first, and performs 1 data transmission through the bus in this process; and then the RAID controller carries out corresponding consistency operation on the storage array data to obtain data to be recovered, and writes the data to be recovered into the cache unit, wherein the transmission times of the bus in the process are 2 times, and the reading and writing times of the cache unit are 2 times. To sum up, under the condition that the application embodiment is adopted, the bus transmission times are 3 times, and the read-write times of the cache unit are 2 times.
Condition two: in scenario two (updating the disk check information) under RAID6, with the cache unit (buffer) located outside the RAID controller, the number of bus transmissions and the number of cache unit reads and writes in the prior art and in the embodiment of the present application compare as follows.
In the prior art, as shown in fig. 10, the data of the storage array is first written into the cache unit over the bus; the number of bus transmissions is 1 and the number of cache unit reads and writes is 1. The check information is then updated with the storage array data in the cache unit. When updating the P data in the check information: first, the RAID controller reads the storage array data from the cache unit, which takes 1 data transmission over the bus and 1 read/write of the cache unit; then, the RAID controller performs the corresponding consistency operation on the storage array data and the data to be written to obtain new P data, and writes the new P data into the cache unit, which takes 2 data transmissions over the bus. Specifically, when the new P data is written into the cache unit, the data at the cache address to be written must first be read out, and the new P data is then written into that cache address, so the cache unit undergoes one read and one write (1R1W); that is, the number of bus transmissions and the number of cache unit reads and writes in this process are each 2. It can be seen that, in the process of updating the P data, the number of bus transmissions is 4, and the number of cache unit reads and writes is 4. It should be understood that the update process of the Q data (the second kind of check information in RAID6) is the same as the update process of the P data, and is not described here again. In summary, under condition two, when the prior art is used, the number of bus transmissions is 8, and the number of cache unit reads and writes is 8.
In the embodiment of the application, as shown in fig. 10, the RAID controller directly acquires storage array data first, and performs 1 data transmission through the bus in this process; and then the RAID controller carries out corresponding consistency operation on the storage array data and the data to be written to obtain new P data. The process of writing new P data to the cache unit by the RAID controller is the same as the corresponding process of the prior art. In summary, in the process of updating the P data, the number of bus transmissions is 3, and the number of reading and writing of the cache unit is 2. Similarly, the update process of the Q data is the same as the P data. In summary, under the second condition, when the embodiment of the present application is adopted, the number of bus transmissions is 6, and the number of reading and writing of the cache unit is 4.
Condition three: in scenario three (recovery of disk data and update of the check information in the disk) under RAID6, with the cache unit (buffer) located outside the RAID controller, two data disks in RAID6 are bad, and the check information update and the disk data recovery are performed using only the old data returned from the disks. The corresponding statistics of the number of bus transmissions and the number of cache unit reads and writes are shown in table 1.
In the prior art, as shown in fig. 10, during disk data recovery, the data of the storage array is first written into the cache unit, which takes 1 data transmission over the bus and 1 read/write of the cache unit. When recovering the damaged data in either of the two bad data disks, the RAID controller acquires the corresponding storage array data from the cache unit; in this process the number of bus transmissions is 1 and the number of cache unit reads and writes is 1. The RAID controller then recovers the damaged data in that bad disk based on the obtained storage array data to obtain the data to be recovered, and writes the data to be recovered into the cache unit; in this process the number of bus transmissions is 2 and the number of cache unit reads and writes is 2. In summary, when recovering the damaged data in the two bad disks, the number of bus transmissions is 7, and the number of cache unit reads and writes is 7. Because the data does not need to be read from the storage array again in the process of updating the check information, the P data is updated as follows: the RAID controller first acquires the storage array data from the cache unit, which takes 1 data transmission over the bus and 1 read/write of the cache unit; the RAID controller then performs the corresponding consistency operation on the acquired storage array data and the data to be written to obtain new P data, and writes the new P data into the cache unit, in which process the number of bus transmissions is 2 and the number of cache unit reads and writes is 2. Similarly, the update process of the Q data is the same as the corresponding process of the P data, and is not described here again.
In summary, in the process of updating the verification information, the bus transmission times are 6 times, and the read-write times of the cache unit are 6 times. As can be seen from the above description, under the third condition, when the prior art is used, the number of bus transfers is 13, and the number of reading and writing of the cache unit is 13.
In the embodiment of the present application, as shown in fig. 10, the RAID controller first acquires the storage array data directly from the disk, which takes 1 data transmission over the bus. The RAID controller then recovers the damaged data in either bad disk based on the obtained old data of the storage array to obtain the data to be recovered in that bad disk, and writes the data to be recovered into the cache unit; in this process the number of bus transmissions is 2 and the number of cache unit reads and writes is 2. In summary, when recovering the damaged data in the two bad disks, the number of bus transmissions is 5, and the number of cache unit reads and writes is 4. Because the data does not need to be read from the storage array again in the process of updating the check information, the P data is updated as follows: the RAID controller first acquires the calculated data to be recovered from the cache unit, which takes 1 data transmission over the bus and 1 read/write of the cache unit; the RAID controller then performs the corresponding consistency operation on the data to be recovered and the data to be written to obtain new P data, and writes the new P data into the cache unit, in which process the number of bus transmissions is 2 and the number of cache unit reads and writes is 2. Similarly, the update process of the Q data is the same as the corresponding process of the P data, and is not described here again. In summary, in the process of updating the check information, the number of bus transmissions is 5, and the number of cache unit reads and writes is 5. As can be seen from the above description, under condition three, when the embodiment of the present application is adopted, the number of bus transmissions is 10, and the number of cache unit reads and writes is 9.
Table 1: bus transmission times and cache unit data read-write times under different conditions in the prior art and the embodiment of the application
It should be understood that the embodiments of the present application may also be applied to RAID5/6, where the RAID controller obtains data from a storage array to perform other consistency operations, which is not specifically limited in this application. It can be seen that when the method in the embodiment of the application is adopted to acquire data from the storage array to perform subsequent consistency operation, the data read-write times and bus transmission times of the cache unit can be effectively reduced, the requirement on bus bandwidth is reduced, and the system power consumption is reduced.
In addition, although the data acquired from the storage array by the RAID controller differs between scenarios, within the same scenario the RAID controller acquires the same data from the storage array in the prior art and in the embodiments of the present application. Therefore, to simplify the statistics, each acquisition of data from the storage array is counted as a single bus transmission and a single cache unit read/write in both the prior art and the embodiments of the present application.
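The totals quoted for the three conditions can be re-derived with a short tally. The sketch below uses only counts stated in the text: per-step (bus transmissions, cache unit reads/writes) pairs for conditions one and two, and the per-phase subtotals exactly as given for condition three. The constant and function names are illustrative.

```python
# Each step is a pair: (bus transmissions, cache unit reads/writes).
FETCH_VIA_CACHE = [(1, 1), (1, 1)]  # array -> cache unit, then cache unit -> controller
FETCH_DIRECT = [(1, 0)]             # array -> controller, cache unit not touched
WRITE_RESULT = [(2, 2)]             # read old content, write new content (1R1W)


def total(steps):
    """Sum bus transmissions and cache reads/writes over a list of steps."""
    return tuple(map(sum, zip(*steps)))


counts = {
    # Condition one: a single recovery pass.
    "one": {
        "prior": total(FETCH_VIA_CACHE + WRITE_RESULT),
        "embodiment": total(FETCH_DIRECT + WRITE_RESULT),
    },
    # Condition two: the same pass is counted once for P and once for Q.
    "two": {
        "prior": total(2 * (FETCH_VIA_CACHE + WRITE_RESULT)),
        "embodiment": total(2 * (FETCH_DIRECT + WRITE_RESULT)),
    },
    # Condition three: phase subtotals (recovery, check update) as stated in the text.
    "three": {
        "prior": total([(7, 7), (6, 6)]),
        "embodiment": total([(5, 4), (5, 5)]),
    },
}

# Reduction achieved by the embodiment, per condition.
savings = {
    name: tuple(p - e for p, e in zip(c["prior"], c["embodiment"]))
    for name, c in counts.items()
}
```

With these inputs the tally gives (4, 4) versus (3, 2) for condition one, (8, 8) versus (6, 4) for condition two, and (13, 13) versus (10, 9) for condition three, matching the totals in the description.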
Referring to fig. 11, fig. 11 is a schematic diagram 1100 of a hardware structure of a RAID controller according to an embodiment of the present disclosure. As shown in fig. 11, the RAID controller includes a processor 1101, a memory 1102, an interface circuit 1103, and a bus 1104. The interface circuit 1103 may be coupled to a storage array.
The processor 1101 is configured to: receive, via the interface circuit 1103, first data on a target stripe in a storage array and first index information corresponding to the first data, where the first data is any one of the data on the target stripe; determine, based on the first index information, first entry information corresponding to the first data from a preset mapping relationship, where the preset mapping relationship is generated based on stripe consistency, and the first entry information is used for indicating the preprocessing type and the cache address corresponding to the first data in the consistency operation; and preprocess the first data according to the preprocessing type and update the data in the cache address by using the preprocessed first data. The memory 1102 is configured to store the first entry information. The processor 1101, the memory 1102 and the interface circuit 1103 perform data transmission via the bus 1104.
The memory 1102 includes, but is not limited to, a random access memory (random access memory, RAM), a read-only memory (ROM), an erasable programmable read-only memory (erasable programmable read only memory, EPROM), or a portable read-only memory (compact disc read-only memory, CD-ROM). The processor 1101 may be one or more central processing units (central processing unit, CPU), and in the case where the processor 1101 is one CPU, the CPU may be a single-core CPU or a multi-core CPU.
In a possible implementation manner, the storage array includes M disks, the target stripe includes M stripe units, and the M stripe units are respectively located on the M disks; wherein M is an integer greater than 2; the first data is one or all of the data blocks on any of the M stripe units.
In a possible embodiment, the target stripe further includes second data; the storage array is further configured to send the second data and second index information corresponding to the second data to the RAID controller, where the second data is sent either before or after the first data.
For the specific functions of the processor 1101, the memory 1102, and the interface circuit 1103, refer to the corresponding descriptions in the embodiment of fig. 5; details are not repeated here.
Referring to fig. 12, fig. 12 is a schematic flow chart of a data processing method 1200 according to an embodiment of the present application, where the data processing method is applicable to any one of the data processing apparatuses shown in fig. 5 and fig. 8 and an apparatus including the data processing apparatus. The method includes, but is not limited to, the steps of:
Step S1201: acquire, by a RAID controller, first data on a target stripe in a storage array and first index information corresponding to the first data, where the first data is any one piece of data on the target stripe.
Step S1202: determine, by the RAID controller and based on the first index information, first entry information corresponding to the first data from a preset mapping relation, where the preset mapping relation is generated based on a consistency operation of the stripe, and the first entry information indicates the preprocessing type and the cache address that correspond to the first data in the consistency operation.
Step S1203: preprocess the first data according to the preprocessing type, and update the data in the cache address with the preprocessed first data.
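The three steps can be illustrated with a minimal sketch. The text here does not define the concrete preprocessing types or how the cache is updated, so the sketch assumes an identity preprocessing function and an XOR accumulation into the cache; the names `Entry`, `Controller`, and `handle` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def identity(block: bytes) -> bytes:
    """Assumed preprocessing type: pass the block through unchanged."""
    return block


@dataclass(frozen=True)
class Entry:
    """Hypothetical entry: a preprocessing function and a cache address."""
    preprocess: Callable[[bytes], bytes]
    cache_address: int


class Controller:
    """Hypothetical controller holding the preset mapping and a cache."""

    def __init__(self, mapping: Dict[int, Entry], block_size: int) -> None:
        self.mapping = mapping
        # Every cache address named by the mapping starts as a zero block.
        self.cache = {e.cache_address: bytes(block_size) for e in mapping.values()}

    def handle(self, index: int, data: bytes) -> bytes:
        # S1201: data and its index information have been received.
        entry = self.mapping[index]          # S1202: look up the entry
        processed = entry.preprocess(data)   # S1203: preprocess ...
        updated = xor_bytes(self.cache[entry.cache_address], processed)
        self.cache[entry.cache_address] = updated  # ... and update the cache
        return updated


ctrl = Controller({0: Entry(identity, 0x10), 1: Entry(identity, 0x10)}, block_size=2)
ctrl.handle(0, b"\x01\x02")
result = ctrl.handle(1, b"\x03\x04")
```

Because each block is folded into the cache as it arrives, the sketch never needs to hold the whole stripe at once.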
In one possible implementation, the storage array includes M disks, and the target stripe includes M stripe units, where the M stripe units are located on the M disks, respectively; wherein M is an integer greater than 2; the first data is one or all of the data blocks on any of the M stripe units.
In a possible embodiment, the target stripe further includes second data, and the method further includes: sending, by the storage array, the second data and second index information corresponding to the second data to the RAID controller, where the second data is sent either before or after the first data.
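One reason the arrival order of the first and second data need not matter can be sketched as follows, under the assumption (not stated in this passage) that the cache update is an XOR accumulation, which is commutative and associative. The helper names are hypothetical.

```python
from functools import reduce
from itertools import permutations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def accumulate(blocks):
    """XOR the blocks, in the given order, into a zero-initialized buffer."""
    return reduce(xor_bytes, blocks, bytes(len(blocks[0])))


# Illustrative blocks standing in for data arriving from the storage array.
blocks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]

# Accumulate the same blocks in every possible arrival order.
results = {accumulate(list(order)) for order in permutations(blocks)}
```

Every arrival order yields the same accumulated buffer, so `results` contains a single value.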
In a possible implementation manner, the first entry information includes a first preprocessing type and a first cache address. Preprocessing the first data according to the preprocessing type and updating the data in the cache address with the preprocessed first data includes: preprocessing, by the RAID controller, the first data according to the first preprocessing type, and updating the data in the first cache address with the preprocessed first data, to obtain first reference information corresponding to the first data. The method further includes: determining, by the RAID controller and based on the second index information, second entry information corresponding to the second data from the preset mapping relation, where the second entry information includes a second preprocessing type and a second cache address; preprocessing the second data according to the second preprocessing type, and updating the data in the second cache address with the preprocessed second data, to obtain second reference information corresponding to the second data; and obtaining the data to be recovered in the storage array according to the first reference information and the second reference information.
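A minimal recovery sketch follows, assuming a three-unit stripe with single XOR parity (two data units and one parity unit), identity preprocessing, and XOR cache updates. None of these specifics are given in this passage; the mapping, the addresses, and the `update` helper are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


# Illustrative stripe: two data units and their XOR parity (assumed layout).
d0 = b"\x12\x34\x56\x78"
d1 = b"\x9a\xbc\xde\xf0"   # the unit treated as lost
parity = xor_bytes(d0, d1)

# Hypothetical preset mapping: index information -> cache address.
mapping = {0: 0x100, 2: 0x200}
# Cache addresses are zero-filled before any data arrives.
cache = {0x100: bytes(4), 0x200: bytes(4)}


def update(index: int, data: bytes) -> bytes:
    """Fold the (identity-preprocessed) data into its cache address."""
    addr = mapping[index]
    cache[addr] = xor_bytes(cache[addr], data)
    return cache[addr]


first_ref = update(0, d0)        # first reference information
second_ref = update(2, parity)   # second reference information
recovered = xor_bytes(first_ref, second_ref)
```

Under these assumptions, combining the two pieces of reference information reproduces the lost unit `d1`.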
In one possible embodiment, the method further includes: receiving, by the RAID controller, third data to be written to the storage array and third index information corresponding to the third data; determining, based on the third index information, third entry information corresponding to the third data from the preset mapping relation, where the third entry information includes a third preprocessing type and a third cache address; and preprocessing the third data according to the third preprocessing type, and updating the data in the third cache address with the preprocessed third data, to obtain third reference information corresponding to the third data.
In a possible implementation manner, the first entry information includes a fourth preprocessing type and a fourth cache address. Preprocessing the first data according to the preprocessing type and updating the data in the cache address with the preprocessed first data includes: preprocessing, by the RAID controller, the first data according to the fourth preprocessing type, and updating the data in the fourth cache address with the preprocessed first data, to obtain fourth reference information corresponding to the first data. The method further includes: obtaining, by the RAID controller, check information in the storage array according to the third reference information and the fourth reference information.
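The write path can be sketched in the same spirit. The example assumes single XOR parity and a reconstruct-write, in which the unchanged unit is read from the storage array (the first data) and combined with the new unit to be written (the third data). The passage does not fix the parity scheme or the write strategy, so all names and values below are illustrative.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


BLOCK = 4  # illustrative block size in bytes (assumption)

old_d0 = b"\x01\x02\x03\x04"   # unchanged unit, read from the storage array
new_d1 = b"\xa0\xb0\xc0\xd0"   # new unit to be written to the storage array

# Hypothetical cache addresses, zero-filled before use.
cache = {"third": bytes(BLOCK), "fourth": bytes(BLOCK)}

# Third reference information: accumulate the data to be written.
cache["third"] = xor_bytes(cache["third"], new_d1)
# Fourth reference information: accumulate the data read from the array.
cache["fourth"] = xor_bytes(cache["fourth"], old_d0)

# Check information (parity) for the updated stripe.
check_info = xor_bytes(cache["third"], cache["fourth"])
```

With XOR parity, `check_info` equals the XOR of all data units in the updated stripe, so it can be written alongside `new_d1`.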
In a possible embodiment, the method further comprises: the RAID controller initializes data in the cache address indicated by the first entry information prior to receiving the first data.
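The need for initialization can be shown with a short sketch. It assumes that the cache is updated by XOR accumulation and that initialization means zero-filling, since zero is the identity for XOR; the passage itself says only that the data in the cache address is initialized.

```python
def accumulate_into(buffer: bytearray, block: bytes) -> None:
    """XOR a block into the buffer in place."""
    for i, value in enumerate(block):
        buffer[i] ^= value


blocks = [b"\x01\x02", b"\x10\x20"]
expected = b"\x11\x22"            # XOR of the two blocks

clean = bytearray(2)              # initialized (zero-filled) cache address
stale = bytearray(b"\xff\xff")    # leftover contents from an earlier operation

for block in blocks:
    accumulate_into(clean, block)
    accumulate_into(stale, block)
```

Only the zero-filled buffer ends up holding the expected value; the stale buffer still carries the leftover bytes folded into the result.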
It should be noted that, for the specific flow of the data processing method 1200 described in the embodiment of the present application, reference may be made to the related descriptions in the embodiments shown in fig. 5 and fig. 8, which are not repeated here.
The present embodiment also provides a computer storage medium that may store a computer program; when part of the computer program is executed by a processor (not shown in fig. 5) in the data processing apparatus 500, the processor may perform some or all of the steps of any one of the foregoing method embodiments. The computer storage medium may be a cache unit (not shown in fig. 5) included in the data processing apparatus 500.
Embodiments of the present application also provide a computer program comprising instructions. When a part of the computer program is executed by a processor in the data processing apparatus 500, the processor may perform part or all of the steps of any one of the method embodiments described above.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments. It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should understand that the present application is not limited by the described order of actions, because some steps may be performed in another order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules involved are not necessarily required by the present application.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be electrical or take other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The above embodiments are merely for illustrating the technical solution of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (21)

  1. A data processing apparatus, the apparatus comprising: a redundant array of independent disks, RAID, controller and a storage array coupled to the RAID controller; wherein,
    the RAID controller is used for:
    acquiring first data on a target stripe in the storage array and first index information corresponding to the first data; wherein the first data is any one piece of data on the target stripe;
    determining first entry information corresponding to the first data from a preset mapping relation based on the first index information; the preset mapping relation is generated based on the consistency operation of the stripe, and the first entry information is used for indicating the preprocessing type and the cache address corresponding to the first data in the consistency operation;
    and preprocessing the first data according to the preprocessing type, and updating the data in the cache address by utilizing the preprocessed first data.
  2. The apparatus of claim 1, wherein the storage array comprises M disks, the target stripe comprises M stripe units, the M stripe units being located on the M disks, respectively; wherein M is an integer greater than 2;
    the first data is one or all data blocks on any one of the M stripe units.
  3. The apparatus of claim 1 or 2, wherein the target stripe further comprises second data thereon; the storage array is used for:
    sending the second data and second index information corresponding to the second data to the RAID controller; wherein the transmission time of the second data is before or after the transmission time of the first data.
  4. The apparatus of claim 3, wherein the first entry information comprises a first preprocessing type and a first cache address;
    the RAID controller is specifically configured to:
    preprocessing the first data according to the first preprocessing type, and updating the data in the first cache address by utilizing the preprocessed first data to obtain first reference information corresponding to the first data;
    the RAID controller is further configured to:
    determining second entry information corresponding to the second data from the preset mapping relation based on the second index information; the second entry information comprises a second preprocessing type and a second cache address;
    preprocessing the second data according to the second preprocessing type, and updating the data in the second cache address by utilizing the preprocessed second data to obtain second reference information corresponding to the second data;
    and obtaining the data to be recovered in the storage array according to the first reference information and the second reference information.
  5. The apparatus of claim 2 or 3, wherein the RAID controller is further configured to:
    receiving third data to be written into the storage array and third index information corresponding to the third data;
    determining third entry information corresponding to the third data from the preset mapping relation based on the third index information; the third entry information comprises a third preprocessing type and a third cache address;
    preprocessing the third data according to the third preprocessing type, and updating the data in the third cache address by utilizing the preprocessed third data to obtain third reference information corresponding to the third data.
  6. The apparatus of claim 5, wherein the first entry information comprises a fourth preprocessing type and a fourth cache address;
    the RAID controller is specifically configured to:
    preprocessing the first data according to the fourth preprocessing type, and updating the data in the fourth cache address by utilizing the preprocessed first data to obtain fourth reference information corresponding to the first data;
    the RAID controller is further configured to:
    and obtaining the check information in the storage array according to the third reference information and the fourth reference information.
  7. The apparatus of any of claims 1-6, wherein the RAID controller is further to:
    and initializing the data in the cache address indicated by the first entry information before receiving the first data.
  8. A RAID controller, comprising a processor and interface circuitry; the processor is coupled with the storage array through the interface circuit; wherein,
    the processor is configured to:
    receiving first data on a target stripe in the storage array and first index information corresponding to the first data through the interface circuit; wherein the first data is any one piece of data on the target stripe;
    determining first entry information corresponding to the first data from a preset mapping relation based on the first index information; the preset mapping relation is generated based on the consistency operation of the stripe, and the first entry information is used for indicating the preprocessing type and the cache address corresponding to the first data in the consistency operation;
    and correspondingly preprocessing the first data according to the preprocessing type, and updating the data in the cache address by utilizing the preprocessed first data.
  9. The RAID controller of claim 8, wherein the RAID controller comprises a memory for storing the first entry information.
  10. The RAID controller of claim 8 or 9, wherein,
    the storage array comprises M magnetic disks, the target stripe comprises M stripe units, and the M stripe units are respectively positioned on the M magnetic disks; wherein M is an integer greater than 2;
    the first data is one or all data blocks on any one of the M stripe units.
  11. The RAID controller of any of claims 8-10, wherein the target stripe further comprises second data; the storage array is used for:
    sending the second data and second index information corresponding to the second data to the RAID controller; wherein the transmission time of the second data is before or after the transmission time of the first data.
  12. A method of data processing, the method comprising:
    acquiring first data on a target stripe in a storage array and first index information corresponding to the first data by a RAID controller; wherein the first data is any one piece of data on the target stripe;
    determining, by the RAID controller, first entry information corresponding to the first data from a preset mapping relation based on the first index information; the preset mapping relation is generated based on the consistency operation of the stripe, and the first entry information is used for indicating the preprocessing type and the cache address corresponding to the first data in the consistency operation;
    and preprocessing the first data according to the preprocessing type, and updating the data in the cache address by utilizing the preprocessed first data.
  13. The method of claim 12, wherein the storage array comprises M disks, the target stripe comprises M stripe units, the M stripe units being located on the M disks, respectively; wherein M is an integer greater than 2;
    the first data is one or all data blocks on any one of the M stripe units.
  14. The method of claim 12 or 13, wherein the target stripe further comprises second data; the method further comprises the steps of:
    sending the second data and second index information corresponding to the second data to the RAID controller through the storage array; wherein the transmission time of the second data is before or after the transmission time of the first data.
  15. The method of claim 14, wherein the first entry information comprises a first preprocessing type and a first cache address; the preprocessing the first data according to the preprocessing type, and updating the data in the cache address by using the preprocessed first data, including:
    preprocessing the first data by the RAID controller according to the first preprocessing type, and updating the data in the first cache address by utilizing the preprocessed first data to obtain first reference information corresponding to the first data;
    the method further comprises the steps of:
    determining, by the RAID controller, second entry information corresponding to the second data from the preset mapping relation based on the second index information; the second entry information comprises a second preprocessing type and a second cache address;
    preprocessing the second data according to the second preprocessing type, and updating the data in the second cache address by utilizing the preprocessed second data to obtain second reference information corresponding to the second data;
    and obtaining the data to be recovered in the storage array according to the first reference information and the second reference information.
  16. The method according to claim 13 or 14, characterized in that the method further comprises:
    receiving, by the RAID controller, third data to be written to the storage array and third index information corresponding to the third data;
    determining third entry information corresponding to the third data from the preset mapping relation based on the third index information; the third entry information comprises a third preprocessing type and a third cache address;
    preprocessing the third data according to the third preprocessing type, and updating the data in the third cache address by utilizing the preprocessed third data to obtain third reference information corresponding to the third data.
  17. The method of claim 16, wherein the first entry information includes a fourth preprocessing type and a fourth cache address; the preprocessing the first data according to the preprocessing type, and updating the data in the cache address by using the preprocessed first data, including:
    preprocessing the first data by the RAID controller according to the fourth preprocessing type, and updating the data in the fourth cache address by utilizing the preprocessed first data to obtain fourth reference information corresponding to the first data;
    the method further comprises the steps of:
    and obtaining check information in the storage array by the RAID controller according to the third reference information and the fourth reference information.
  18. The method according to any one of claims 12-17, further comprising:
    before receiving the first data, initializing, by the RAID controller, data in a cache address indicated by the first entry information.
  19. A chip system, comprising at least one processor, a memory and an interface circuit, wherein the memory, the interface circuit and the at least one processor are interconnected by a line, and wherein the memory has instructions stored therein; the method of any of claims 12-18 being implemented when said instructions are executed by the at least one processor.
  20. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein program instructions which, when run on a processor, implement the method of any of claims 12-18.
  21. A computer program product, characterized in that the method of any of claims 12-18 is implemented when the computer program product is run on a terminal.
CN202180098648.5A 2021-05-27 2021-05-27 Data processing apparatus and data processing method Pending CN117377940A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/096303 WO2022246727A1 (en) 2021-05-27 2021-05-27 Data processing apparatus and data processing method

Publications (1)

Publication Number Publication Date
CN117377940A true CN117377940A (en) 2024-01-09

Family

ID=84229441

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180098648.5A Pending CN117377940A (en) 2021-05-27 2021-05-27 Data processing apparatus and data processing method

Country Status (2)

Country Link
CN (1) CN117377940A (en)
WO (1) WO2022246727A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101241420A (en) * 2008-03-20 2008-08-13 杭州华三通信技术有限公司 Method and memory apparatus for promoting write address incontinuous data storage efficiency
US10592166B2 (en) * 2018-08-01 2020-03-17 EMC IP Holding Company LLC Fast input/output in a content-addressable storage architecture with paged metadata
CN111625181B (en) * 2019-02-28 2022-03-29 华为技术有限公司 Data processing method, redundant array controller of independent hard disk and data storage system
CN111158599B (en) * 2019-12-29 2022-03-22 北京浪潮数据技术有限公司 Method, device and equipment for writing data and storage medium

Also Published As

Publication number Publication date
WO2022246727A1 (en) 2022-12-01

Similar Documents

Publication Publication Date Title
TWI777127B (en) A data storage device, a data storage method and computer-readable medium
US7206899B2 (en) Method, system, and program for managing data transfer and construction
EP0927395B1 (en) Fly-by xor
US7036066B2 (en) Error detection using data block mapping
US7664915B2 (en) High performance raid-6 system architecture with pattern matching
CN106021147B (en) Storage device exhibiting direct access under logical drive model
CN101727299A (en) RAID5-orientated optimal design method for writing operation in continuous data storage
US7743308B2 (en) Method and system for wire-speed parity generation and data rebuild in RAID systems
JP2013156977A (en) Elastic cache of redundant cache data
WO1996018141A1 (en) Computer system
US9092152B1 (en) Data storage system employing a distributed compute engine memory controller with embedded logic and arithmetic functionality and method for data migration between high-performance computing architectures and data storage devices using the same
WO2021089036A1 (en) Data transmission method, network device, network system and chip
CN116126251B (en) Method for realizing multi-concurrency writing, controller and solid-state storage device
CN107729536A (en) A kind of date storage method and device
US6052822A (en) Fast destaging method using parity engine
CN117193672B (en) Data processing method and device of storage device, storage medium and electronic device
CN116501264B (en) Data storage method, device, system, equipment and readable storage medium
CN103645995B (en) Write the method and device of data
US5964895A (en) VRAM-based parity engine for use in disk array controller
CN117377940A (en) Data processing apparatus and data processing method
WO2023020136A1 (en) Data storage method and apparatus in storage system
US11334292B2 (en) Autonomous RAID data storage system
CN113687977A (en) Data processing device based on RAID controller to realize calculation performance improvement
US10466921B1 (en) Accelerating data reduction through reinforcement learning
WO2024026956A1 (en) Data storage method and apparatus, storage device, and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination