CN117409844A

CN117409844A - Data recovery method, memory storage device and memory controller

Info

Publication number: CN117409844A
Application number: CN202210789276.7A
Authority: CN
Inventors: 刘京; 刘其铠
Original assignee: Hosin Global Electronics Co Ltd
Current assignee: Hosin Global Electronics Co Ltd
Priority date: 2022-07-06
Filing date: 2022-07-06
Publication date: 2024-01-16

Abstract

The invention provides a data recovery method, a memory storage device and a memory controller, wherein the data recovery method is used for a memory module, the memory module comprises a plurality of memory grains, each memory grain is provided with a plurality of entity erasing units, each entity erasing unit is provided with a plurality of entity programming units, and the method comprises the following steps: acquiring an error probability value of each entity programming unit; the method comprises the steps of disturbing a plurality of entity programming units according to a disorder algorithm, grouping the entity programming units into a plurality of first groups, wherein each first group comprises a plurality of first-type entity programming units and a plurality of second-type entity programming units; and grouping the plurality of physical programming units in each first group into a plurality of subgroups, wherein each subgroup comprises at most one physical programming unit of the second type. Potentially dense second-type physical programming units may be dispersed into subgroups, thereby avoiding situations where data cannot be recovered.

Description

Data recovery method, memory storage device and memory controller

Technical Field

The present invention relates to a memory management technology, and in particular, to a data recovery method, a memory storage device and a memory controller.

Background

Flash memory, particularly NAND flash memory, is one of volatile memories, and is widely used in storage fields such as mobile phones, notebook computers, cloud storage, and the like. The basic function of flash memory is to store data, and ensuring consistency of written data and read data is a fundamental requirement of a memory system. Error checking and correction codes (ECC, error Correcting Code), also known as error correction codes, are used to correct errors in read data during error correction operations, and are an important means to ensure consistency of written and read data and to improve reliability of memory systems. However, the error correction code also has a certain error correction range, if the original error rate of the read data is high, the error correction capability of the error correction code will be exceeded, and the written data cannot be correctly recovered.

The Reed-solomon code (RS code) algorithm is a data recovery method, and when writing data of a host system into a physical program unit in a memory, data in a physical program unit (data page) of a physical erase unit (block) is subjected to an exclusive-or operation, and the exclusive-or result is stored. When any one of the physical programming units in the physical erasing unit is wrong during data reading, the exclusive OR result is exclusive-or with the data stored in the physical programming unit which is not wrong in the physical erasing unit, so that the data stored in the physical programming unit which is wrong in the physical erasing unit is calculated, and the purpose of data recovery is achieved.

The error probability values of the physical programming units of the flash memory have a layering phenomenon, that is, the error probability values of the physical programming units of a certain continuous address range are all close to 0, or the error probability values of the physical programming units of a certain continuous address range are all close to the upper limit of the error probability values, so that dense physical programming units (error pages) with potential errors exist.

Because of the specificity of the exclusive or operation, if and only if the error pages of a group of entity programming units are not more than 1, the data recovery of the error pages can be completed by using the RS algorithm, so that the error pages in the entity programming units in a certain address range cannot be excessive, otherwise, the recovery capability of the RS algorithm can be exceeded.

Due to the specificity of the exclusive or operation, the data recovery of an error page can be accomplished using the RS algorithm if and only if the error page of a set of physical programming units does not exceed 1. The current grouping method is to traverse the physical programming units according to a linear sequence, and if the number of error pages in a certain grouping is 0 or exceeds 1, the data recovery without the RS algorithm is caused or the data recovery capability of the RS algorithm is exceeded.

Disclosure of Invention

Embodiments of the present invention provide a data recovery method, memory storage device and memory controller that can distribute potentially dense bad physical programming units into subgroups.

The embodiment of the invention provides a data recovery method for a memory module, wherein the memory module comprises at least one memory grain, each memory grain has a plurality of physical erasing units, each physical erasing unit has a plurality of physical programming units, the method comprises the following steps: acquiring an error probability value of each entity programming unit; the method comprises the steps of disturbing a plurality of entity programming units according to a disorder algorithm, grouping the entity programming units into a plurality of first groups, wherein each first group comprises a plurality of first-type entity programming units and a plurality of second-type entity programming units; and grouping the physical programming units in each first group into a plurality of subgroups, wherein each subgroup comprises at most one physical programming unit of the second type.

The embodiment of the invention also provides a memory storage device which comprises a connection interface, a memory module and a memory controller. The connection interface is used for connecting with the host system. The memory module comprises at least one memory grain, each memory grain is provided with a plurality of physical erasing units, and each physical erasing unit is provided with a plurality of physical programming units. The memory controller is connected with the connection interface and the memory module, wherein the memory controller is used for acquiring the error probability value of each physical programming unit. The memory controller is further configured to shuffle the plurality of physical programming units according to an out-of-order algorithm, and group the plurality of physical programming units into a plurality of first groups, wherein each first group includes a plurality of physical programming units of a first type and a plurality of physical programming units of a second type. The memory controller is further configured to group the plurality of physical program units in each first group into a plurality of subgroups, wherein each subgroup comprises at most one physical program unit of the second type.

Embodiments of the present invention also provide a memory controller including a host interface, a memory interface, an error checking and correcting circuit, and a memory control circuit. The host interface is used for connecting with a host system. The memory interface is used for connecting a memory module, wherein the memory module comprises at least one memory grain, each memory grain is provided with a plurality of physical erasing units, and each physical erasing unit comprises a plurality of physical programming units. The memory control circuit is connected with the host interface, the memory interface and the error checking and correcting circuit, wherein the memory control circuit is used for obtaining the error probability value of each physical programming unit. The memory control circuit is further configured to shuffle the plurality of physical programming units according to an out-of-order algorithm, and group the plurality of physical programming units into a plurality of first groups, wherein each first group includes a plurality of physical programming units of a first type and a plurality of physical programming units of a second type. The memory control circuit is further configured to group the plurality of physical program units in each first group into a plurality of subgroups, wherein each subgroup comprises at most one physical program unit of the second type.

Based on this, the data recovery method, the memory storage device and the memory controller provided by the invention can disperse the potentially dense bad entity programming units into each subgroup, so as to avoid the situation that the data cannot be recovered due to the fact that the number of the wrong entity programming units in the same entity group is too large to exceed the capacity of the RS algorithm for recovering the data when the RS algorithm is used for decoding the read data.

Drawings

The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.

FIG. 1 is a schematic diagram of a memory storage device according to an embodiment of the invention;

FIG. 2 is a schematic diagram of a memory controller according to an embodiment of the invention;

FIG. 3 is a schematic diagram illustrating managing memory modules according to an embodiment of the invention;

FIG. 4 is a schematic diagram of a multi-frame encoding shown according to an embodiment of the present invention;

FIG. 5 is a flow chart of a data recovery method according to an embodiment of the invention;

FIG. 6 is a schematic diagram of a ranking based disorder algorithm shown in accordance with one embodiment of the present invention;

FIG. 7 is a schematic diagram of a swap-based shuffling algorithm as shown in accordance with an embodiment of the present invention;

FIG. 8 is a specific flow chart of step S505 shown in accordance with an embodiment of the present invention;

FIG. 9 is a diagram illustrating grouping of first physical programming units into subgroups according to an embodiment of the present invention;

FIG. 10 is a diagram illustrating grouping of second physical program elements into subgroups according to an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to the exemplary embodiments of the present invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings and the description to refer to the same or like parts.

FIG. 1 is a schematic diagram of a memory storage device according to an embodiment of the invention. Referring to fig. 1, a memory storage system 10 includes a host system 11 and a memory storage device 12. Host system 11 may be any type of computer system. For example. Host system 11 may be a notebook computer, desktop computer, smart phone, tablet computer, industrial computer, or the like. The memory storage device 12 is used to store data from the host system 11. For example, the memory storage 12 may include a solid state disk, a U disk, or other type of non-volatile storage. Host system 11 may be electrically connected to memory storage device 12 via a serial advanced technology bus attachment (Serial Advanced Technology Attachment, SATA) interface, peripheral component interconnect Express (Peripheral Component Interconnect Express, PCI Express), universal serial bus (Universal Serial Bus, USB), or other type of connection interface. Thus, host system 11 may store data to memory storage device 12 and/or read data from memory storage device 12.

Memory storage device 12 may include a connection interface 121, a memory module 122, and a memory controller 123. The connection interface 121 is used to connect the memory storage device 12 to the host system 11. For example, connection interface 121 may support connection interface standards such as SATA, PCI Express, or USB. Memory storage 12 may communicate with host system 11 via connection interface 121.

The memory module 122 is used for storing data. The memory module 122 may include a rewritable nonvolatile memory module. The memory module 122 includes an array of memory cells. The memory cells in the memory module 122 store data in the form of voltages. For example, the memory module 122 may include a single Level Cell (Single Level Cell, SLC) NAND-type flash memory module, a Multi Level Cell (MLC) NAND-type flash memory module, a third Level Cell (Triple Level Cell, TLC) NAND-type flash memory module, a Quad Level Cell (QLC) NAND-type flash memory module, a three-dimensional NAND-type flash memory module (3D NAND flash memory module) (which may have a plurality of third or fourth Level memory cells), or other memory modules having similar characteristics. The memory cells in the memory module 122 are arranged in an array.

The memory controller 123 is connected to the connection interface 121 and the memory module 122. The memory controller 123 may be used to control the memory storage device 12. For example, the memory controller 123 may control the connection interface 121 and the memory module 122 for data access and data management. For example, the memory controller 123 may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or other programmable general purpose or special purpose microprocessor, digital signal processor (Digital Signal Processor, DSP), programmable controller, application specific integrated circuit (Application Specific Integrated Circuits, ASIC), programmable logic device (Programmable Logic Device, PLD), or other similar device or combination of devices.

In one embodiment, memory controller 123 is also referred to as a flash memory controller. In one embodiment, the memory module 122 is also referred to as a flash memory module. The memory module 122 may receive a sequence of instructions from the memory controller 123 and access data stored in the memory unit according to the sequence of instructions.

FIG. 2 is a schematic diagram of a memory controller according to an embodiment of the invention. Referring to fig. 1 and 2, the memory controller 123 includes a host interface 21, a memory interface 22, a memory control circuit 23 and an error checking and correcting circuit 24.

The host interface 21 is used to connect to the host system 11 via the connection interface 121 to communicate with the host system 11. In the present example embodiment, host interface 504 is compliant with the SATA standard. However, it should be understood that the present invention is not limited thereto, and the host interface 504 may also be compatible with PATA standards, IEEE1394 standards, PCI Express standards, USB standards, SD standards, UHS-I standards, UHS-II standards, MS standards, MMC standards, eMMC standards, UFS standards, CF standards, IDE standards, or other suitable data transfer standards.

The memory interface 22 is configured to connect to the memory module 122 to communicate with the memory module 122. That is, the data to be written into the memory module 122 is converted into a format acceptable to the memory module 122 through the memory interface 22.

The memory control circuit 23 is connected to the host interface 21 and the memory interface 22. The memory control circuit 23 can communicate with the host system 11 via the host interface 21 and access the memory module 122 via the memory interface 22. The memory control circuit 23 may also be regarded as a control core of the memory controller 123, which has a plurality of control instructions that are executed to perform operations such as writing, reading, erasing, etc. of data when the memory storage device 12 is operated. In various embodiments, the control instructions of the memory control circuit 23 may be implemented in firmware, code or hardware, and the invention is not limited thereto. In the following embodiment, the explanation of the memory control circuit 23 is equivalent to the explanation of the memory controller 123. In addition, the memory control circuit 23 may further include one or more buffer memories for temporarily storing data.

The error checking and correcting circuit 24 is electrically connected to the memory control circuit 23 and is used for executing an error checking and correcting program to ensure the correctness of the data. When the memory control circuit 23 receives a write command from the host system 11, the error checking and correcting circuit 24 generates an error checking and correcting Code (Error Checking and Correcting Code, ECC Code) corresponding to the write command, and the memory control circuit 23 writes the data and the corresponding error checking and correcting Code into the memory module 122. When the memory control circuit 23 reads data from the memory module 122, the error checking and correcting codes are read simultaneously. And the memory control circuit 23 performs an error checking and correcting process on the read data according to the read error checking and correcting code.

FIG. 3 is a schematic diagram illustrating managing memory modules according to an embodiment of the invention.

Referring to fig. 1 and 3, the memory module 122 includes a plurality of physical programming units, such as physical units 301 (0) to 301 (C) shown in fig. 3. The memory module 122 is managed on a physical programming unit basis, and each physical programming unit includes a plurality of memory units and is used for non-volatile storage of data. Multiple physical program units may constitute a physical erase unit (e.g., a physical block). Multiple physical program units (or memory cells) in a physical erase unit may be erased simultaneously. In addition, the memory control circuit 23 may configure the plurality of logic units 311 (0) to 311 (D) to map at least a portion of the physical programming units. For example, a logical unit may be composed of one or more logical addresses. The mapping relationship between the logic unit and the physical programming unit can be recorded in the logic-to-physical mapping table.

In one embodiment, the memory control circuit 23 logically divides the memory module 122 into a data area 310, an idle area 320 and a system area 330. The physical units 301 (1) to 301 (a) logically belonging to the data area 310 store data (also referred to as user data) from the host system 11. The physical programmer in the data area 310 is marked and then associated with the idle area 320. In other words, all the physical units 301 (A+1) -301 (B) in the spare area 320 are marked and do not store valid data. Wherein, the plurality of physical program units in the idle area 320 may constitute an idle physical erase unit.

The physical units 301 (b+1) -301 (C) logically belonging to the system area 330 are used for recording system data. For example, the system data includes information about the manufacturer and model of the memory module, the number of physical erased cells of the memory module, the number of physical programmed cells per physical erased cell, etc. In particular, the number of physical Programming units in the data area 310, the idle area 320, and the system area 330 may vary according to different memory specifications.

In an embodiment where a memory cell can store multiple bits (e.g., MLC or TLC NAND-type flash memory), physical programming units belonging to the same word line (or the same word line layer) can be categorized into at least a lower physical programming unit and an upper physical programming unit. For example, in an MLC NAND type flash memory, the least significant bit (Least Significant Bit, LSB) of a memory cell belongs to the lower physical program cell, and the most significant bit (Most Significant Bit, MSB) of the memory cell belongs to the upper physical program cell. The reliability of the lower physical programming unit is generally higher than that of the upper physical programming unit. In an exemplary embodiment, the lower physical programming unit is also referred to as a fast page (fast page), and the upper physical programming unit is also referred to as a slow page (slow page). In other words, taking an MLC NAND type flash memory as an example, in an MLC NAND type flash memory, several memory cells arranged on the same word line may constitute 2 physical programming units, including a lower physical programming unit composed of the least significant bits (also referred to as the first significant bits) of the memory cells and an upper physical programming unit composed of the most significant bits (also referred to as the second significant bits) of the memory cells.

In addition, in the TLC NAND type flash memory, the least significant bit (Least Significant Bit, LSB) of a memory cell belongs to the lower physical program cell, the middle significant bit (Center Significant Bit, CSB) of the memory cell belongs to the middle physical program cell, and the most significant bit (Most Significant Bit, MSB) of the memory cell belongs to the upper physical program cell.

In addition, in the present exemplary embodiment, the error checking and correcting circuit 24 may perform single-frame (single-frame) encoding for the data stored in the same physical program unit, or may perform multi-frame (multi-frame) encoding for the data stored in a plurality of physical program units. The single frame coding and the multi-frame coding may employ at least one of coding algorithms such as low density parity check correction codes (low density parity code, LDPC), BCH codes, convolutional codes (convolutional code), or turbo codes (turbo codes), respectively. Alternatively, in an example embodiment, the multi-frame encoding may also employ a Reed-solomon codes (RS codes) algorithm or an exclusive-or (XOR) algorithm. In addition, in another exemplary embodiment, more encoding algorithms not listed above may be used, and will not be described here. Depending on the encoding algorithm employed, the error checking and correction circuit 24 may encode the data to be protected to generate a corresponding error correcting code and/or error checking code. For convenience of explanation, the error correction codes and/or error check codes generated by encoding will be collectively referred to as encoded data hereinafter.

referring to FIG. 4, taking the example of encoding the data stored in the physical Programming units 810 (0) to 810 (E) to generate corresponding encoded data 820, each of the physical Programming units 810 (0) to 810 (E) is exemplifiedAt least a portion of the data stored by the user may be considered a frame. In multi-frame encoding, the data in the physical programming units 810 (0) to 810 (E) is encoded based on the location of each bit (or byte). For example, bit b at position 801 (1) ₁₁ 、b ₂₁ 、…、b _p1 Will be encoded as bit b in encoded data 820 _o1 Bit b at position 801 (2) ₁₂ 、b ₂₂ 、…、b _p2 Will be encoded as bit b in encoded data 820 _o2 The method comprises the steps of carrying out a first treatment on the surface of the Similarly, bit b at position 801 (r) _1r 、b _2r 、…、b _pr Will be encoded as bit b in encoded data 820 _or . The data read from the physical programming units 810 (0) to 810 (E) can then be decoded based on the encoded data 820 in an attempt to correct errors that may exist in the read data.

In addition, in another example embodiment of fig. 4, the data used to generate the encoded data 820 may also include redundancy bits (redundancy bits) corresponding to data bits (data bits) in the data stored by the physical programming units 810 (0) to 810 (E). Taking the data stored in the physical programming unit 810 (0) as an example, the redundancy bits are, for example, encoded data generated by single frame encoding the data bits stored in the physical programming unit 810 (0). In the present exemplary embodiment, it is assumed that when reading data in the physical programming unit 810 (0), the data read from the physical programming unit 810 (0) can be decoded by using redundancy bits (e.g., encoded data generated by single frame encoding) in the physical programming unit 810 (0) to perform error detection and correction on the read data. However, when decoding using the redundancy bits in the physical program unit 810 (0) fails (e.g., the number of errors in the data stored in the physical program unit 810 (0) after decoding is greater than a threshold), a re-Read (Retry-Read) mechanism may be used to select and use other Read voltages to attempt to Read the correct data from the physical program unit 810 (0). When the correct data cannot be Read from the physical programming unit 810 (0) by the re-Read mechanism, the encoded data 820 and the data of the physical programming units 810 (1) to 810 (E) can be Read, and decoding is performed according to the encoded data 820 and the data of the physical programming units 810 (1) to 810 (E), so as to attempt to correct errors in the data stored in the physical programming unit 810 (0). That is, in the present exemplary embodiment, when decoding of encoded data generated using single frame encoding fails and reading of encoded data generated using a re-Read (Retry-Read) mechanism fails, decoding is performed using encoded data generated using multi-frame encoding.

It should be noted that, in the following embodiments, the error physical programming unit (error page) is that the number of bits (bits) in error in the physical programming unit exceeds the correction capability of the error checking and correcting code (ECC), so that the data stored in the error physical programming unit cannot be read or cannot be read correctly.

Fig. 5 is a flow chart illustrating a data recovery method according to an embodiment of the present invention.

Referring to fig. 5, the method of the present embodiment is applicable to the memory storage device 12 in the above embodiment, and the detailed steps of the present embodiment are described below together with the elements in the memory storage device 12. It should be noted that each step in fig. 5 may be implemented as a plurality of codes or circuits, which is not a limitation of the present invention. In addition, the method of fig. 5 may be used with the following exemplary embodiments, or may be used alone, and the present invention is not limited thereto.

In step S501, the memory controller 123 writes the data received from the host system 11 into the plurality of physical programming units of the memory module 122.

In step S502, when the memory controller 123 reads the data in the physical program unit, the data written in the physical program unit cannot be decoded by using the error checking and correcting operation.

In step S503, the memory controller 123 obtains an error probability value of each physical programming unit, where the error probability value of each physical programming unit is expressed as:

f (x, y, z) =h formula 1

Wherein x is the number of program/erase (P/E) times of the physical erase unit, y is the ECC value of the physical erase unit, z is the number of times the data in the physical erase unit is read, and h is the error probability value of the physical erase unit. The error probability value for each physical programmer is h1, h2., hn, where 1, 2 … n are the number of physical programmer and n is the total number of physical programmer.

For example, x can be searched from the P/E erasure count table corresponding to each physical erasure cell, such as TLC NAND flash memory, in which the P/E erasure count of the physical erasure cell is 3000-5000 times, and the error probability value of the physical programming cell in the physical erasure cell with P/E erasure count greater than 2500 times is relatively large.

Similarly, y can be searched from a bad block table corresponding to each physical erasing unit, the bad block table records the ECC value corresponding to each physical programming unit, the ECC value in the flash memory can be set to be 60, and the physical programming units with ECC values of 0-7,8-15, 16-23, 24-31 and 32-47 are classified and managed. Generally, flash memory uses physically programmed cells with relatively small ECC values to store data. Whereas the error probability value is larger for physically programmed cells with larger ECC values, e.g., 48-60.

Likewise, z is queried from the read number table corresponding to each physical programmer. The greater the number of reads of the physical programming unit, the greater its error probability value.

In one embodiment, the physical programming units are classified into at least a first type of physical programming unit and a second type of physical programming unit according to the error probability value. The physical program units of the first type may be good physical program units or normal or high-quality physical program units, for example, physical program units having P/E erase times smaller than 2500 times or having ECC values smaller or having read times may be classified as the physical program units of the first type. In contrast, the second type of physical program unit may be a bad physical program unit or a potentially faulty or low-quality physical program unit, for example, the physical program unit having a P/E erase count greater than 2500 times or a larger ECC value or a larger read count may be classified as the second type of physical program unit.

In step S504, the memory controller 123 shuffles the plurality of physical programming units in the memory module 122 according to the shuffling algorithm; and then, the disturbed entity programming units are grouped into a plurality of first groups, wherein each first group comprises a plurality of first entity programming units and a plurality of second entity programming units, the number of the second entity programming units in each first group is the same, and the entity programming units in the same first group belong to different entity erasing units.

The following describes how to shuffle the plurality of physical programming units in the memory module 122 according to an out-of-order algorithm with reference to fig. 6-7.

FIG. 6 is a schematic diagram of a sort-based disorder algorithm, according to one embodiment of the present invention. As shown in fig. 6, in one embodiment, the memory controller 123 randomly generates a random number for each physical programming unit (P0, P1, P2, P3, P4, P5, P6, P7, P8) and then sorts the random numbers (for example, in this embodiment, the random numbers may be sorted from large to small, or from small to large in another embodiment, the invention is not limited thereto), and the physical programming units corresponding to the random numbers in the first group are moved while the random numbers are sorted, so that the physical programming units in the first group may be disturbed after the sorting of the physical programming units is completed, and the second group (P8, P2, P7, P6, P0, P5, P1, P4, P3) shown in fig. 6 is obtained.

Fig. 7 is a schematic diagram of a swap-based out-of-order algorithm, according to one embodiment of the present invention. As shown in fig. 7, in one embodiment, a first group (P0, P1, P2, P3, P4, P5, P6, P7, P8) of 9 physical programming units generates a random number (the random number is greater than 1) between the 1 st physical programming unit P0 and the 6 th physical programming unit P5, and in this embodiment, the random number is 4, and the positions of the physical programming unit P5 and the physical programming unit P3 in the first group are exchanged, so as to obtain a second group (P0, P1, P2, P5, P4, P3, P6, P7, P8) shown in fig. 7. Obviously, the random number may be greater than 6 and only need to be smaller than N, where N is the number of physical programming units in the first group.

In step S505, the memory controller 123 groups the physical programming units in each first group into a plurality of subgroups, wherein each subgroup comprises a plurality of physical programming units, and each subgroup comprises at most one physical programming unit of the second type.

Fig. 8 is a specific flowchart of step S505 shown according to an embodiment of the present invention.

Step S505 includes the following steps S801 to S805. Specifically, in step S801, the memory controller 123 groups the first physical programming units in the first group into a plurality of subgroups.

In step S802, the memory controller 123 calculates an H value of each subgroup according to the error probability value of the first physical programming unit, wherein the H value is a mathematical expectation of the error probability value of the physical programming units in each subgroup (i.e. the probability of having 2 or 2 physical programming units in error) [ here, the H value is equal to the error probability value H of the first physical programming unit, because each subgroup only comprises the first physical programming units.

In step S803, the memory controller 123 groups the second physical programming units in the first group into a plurality of subgroups, and groups the second physical programming units with the smallest error probability value into a subgroup with the largest H value in the plurality of subgroups.

In step S804, the memory controller 123 recalculates the H values of the plurality of subgroups.

In step S805, the memory controller 123 groups the third physical programming units in the first group into a plurality of subgroups and groups the third physical programming units with the smallest error probability value into a subgroup with the largest H value until the physical programming units in the first group are grouped into a plurality of subgroups and the H values of the plurality of subgroups are equal or close.

In the following, it is described how to group each of the physical programming units of the first group into a plurality of subgroups, such that each subgroup includes at most one physical programming unit of the second type.

In one embodiment, the memory module 122 includes 260000 (26W) physical program units, and the 26W physical program units are shuffled and grouped into 84 first groups (i.e., the first group G0, the first group G1 …, the first group G83) by an out-of-order algorithm. Wherein each first group includes 3000 physical programming units. The 3000 physical program units in each first group are further grouped into 12 subgroups (i.e., subgroup g0, subgroup g1 … subgroup g 11) such that 250 physical program units are included in each subgroup.

Wherein the H values corresponding to the subgroups g0, g1, g2, g3, g4, g5, g6, g7, g8, g9, g10, g11 are H0, H1, H2, H3, H4, H5, H6, H7, H8, H9, H10, H11, respectively, wherein the H values are mathematical expectations of error probability values for the physical programming units in the subgroup.

For example, the physical programming units P0 to P249 are subjected to grouping operation. As shown in fig. 9, the physical programming units P0 to P11 (i.e., the first physical programming unit) are first grouped, the physical programming units P0 are respectively grouped into the subgroups g0, the physical programming units P1 are grouped into the subgroups g1, and the physical programming units P11 are grouped into the subgroups g11, so as to calculate H values corresponding to the subgroups g0 to g11, which are respectively H0 to H11.

If the error probability H of the physical program unit P7 is 0.3, i.e. the H7 corresponding to the subgroup g7 is 0.3, the physical program units in the other subgroups (subgroups g0 to g6, g8 to g 11) are all the first type of physical program units, i.e. the H values of the other subgroups are 0 (i.e. less than 0.3). Thus, the subgroup g7 has the largest H value relative to the other 11 subgroups.

Next, the grouping operation is continued for the physical programming units in the first group, i.e. the physical programming units P12 to P23 (i.e. the second physical programming unit), as shown in fig. 10, since the subgroup g7 to which the physical programming unit P7 belongs has the largest H value (H7); therefore, in this round of grouping operation, the physical programming units P13 in which the h value is the smallest are selected to be grouped into the subgroup g7. And (3) recalculating the H values of the 12 subgroups to obtain an H value of 0.2 of a subgroup g7, comparing the H value with H values of other subgroups, judging whether the H value of the subgroup g7 is still maximum, and if the H value of the subgroup g7 is still maximum, continuing to group the entity programming units with the minimum H value into the subgroup g7. For example, in the grouping operation of the physical programming units P24 to P35 (i.e., the third physical programming unit), the physical programming unit P30 with the smallest h value is continuously grouped into the subgroup g7. In this way, if the H value of subgroup g3 is the largest in the next grouping operation, then in the grouping operation of physical program units P36 to P47, physical program unit P38 with the smallest H value is grouped into subgroup g3 until physical program units P0 to P249 are all grouped into subgroups g0 to g 11. Thus, each subgroup comprises 249 first-type entity programming units and 1 second-type entity programming unit, and the potentially dense second-type entity programming units can be distributed into each subgroup, so that the condition that the second-type entity programming units in the same subgroup are too many and exceed the capability of the RS algorithm for recovering data is avoided.

In step S506, the memory controller 123 performs a logic operation on the data written into the physical program unit to generate an error checking and correcting code, wherein the error checking and correcting code is used for performing an error checking and correcting operation on the data in the physical program unit to recover the data in the physical program unit.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims

1. A data recovery method for a memory module, wherein the memory module includes at least one memory die having a plurality of physical erase units, each of the physical erase units having a plurality of physical program units, the method comprising:

acquiring an error probability value of each entity programming unit;

the entity programming units are scrambled according to a scrambling algorithm, and are grouped into a plurality of first groups, wherein each first group comprises a plurality of first entity programming units and a plurality of second entity programming units; and

grouping the plurality of physical programming units in each of the first groups into a plurality of subgroups, wherein each subgroup comprises at most one physical programming unit of the second type.

2. The method of data recovery according to claim 1, wherein the method further comprises:

writing data received from a host system into the plurality of physical programming units; and

when the data in the plurality of physical programming units is read, the data cannot be decoded using error checking and correction operations.

3. The method of claim 1, wherein the number of the second type of physical program units in each of the first groups is the same, and the plurality of physical program units grouped into the same first group belong to different ones of the physical erase units.

4. The method of claim 1, wherein the step of grouping the plurality of physical programming units in each of the first groups into the plurality of subgroups, wherein each subgroup comprises at most one physical programming unit of the second type further comprises:

grouping a first physical program element of the plurality of physical program elements in the first group into the plurality of subgroups;

calculating an H value for each of the subgroups based on the error probability values for the first physical programming units, wherein the H value is a mathematical expectation of the error probability values grouped into the first physical programming units in each of the subgroups;

grouping a second physical programming unit of the plurality of physical programming units in the first group into the plurality of subgroups, and grouping the second physical programming unit having a minimum error probability value into a subgroup having a maximum H value of the plurality of subgroups;

Recalculating the H values for the multiple subgroups, an

Grouping a third physical programming unit of the plurality of physical programming units in the first group into the plurality of subgroups and grouping the third physical programming unit having the minimum error probability value into a subgroup of the plurality of subgroups having the maximum H value;

until the plurality of physical programming units in the first group are grouped into the plurality of subgroups and the H values of the plurality of subgroups are equal or close.

5. The data recovery method of claim 1, wherein the error probability value for each of the physical programming units is represented as:

f (x, y, z) =h formula 1

Wherein x is the number of P/E of the physical erasing unit, y is the ECC value of the physical programming unit, z is the number of times the data in the physical programming unit is read, and h is the error probability value.

6. The data recovery method of claim 1, further comprising:

performing logic operation on the data written into the physical programming units to generate error checking and correcting codes, wherein the error checking and correcting codes are used for performing error checking and correcting operation on the data so as to recover the data of the physical programming units.

7. A memory storage device, comprising:

the connection interface is used for connecting a host system;

a memory module including at least one memory die having a plurality of physical erase units, each physical erase unit having a plurality of physical program units; and

a memory controller connecting the connection interface with the memory module,

wherein the memory controller is configured to write data received from a host system into the physical programming unit;

the memory controller is further configured to obtain an error probability value of each of the physical programming units;

the memory controller is further configured to shuffle the plurality of physical programming units according to an out-of-order algorithm, and group the plurality of physical programming units into a plurality of first groups, wherein each of the first groups includes a plurality of first type physical programming units and a plurality of second type physical programming units; and

the memory controller is also configured to group the physical programming units in each of the first groups into a plurality of subgroups, wherein each subgroup comprises at most one physical programming unit of the second type.

8. The memory storage device of claim 7, wherein the memory controller is further configured to write data received from a host system into the plurality of physical programming units, and

the memory controller is also configured to disable decoding of the written data using error checking and correction operations when reading the data in the plurality of physical programming units.

9. The memory storage device of claim 7, wherein the number of physical program units of the second type in each of the first groups is the same and the plurality of physical program units grouped into the same first group belong to different ones of the physical erase units.

10. The memory storage device of claim 7, wherein the memory controller is further configured to group the plurality of physical programming units in each of the first groups into the plurality of subgroups, wherein the operations of each of the subgroups including at most one of the second type of bad physical programming units further comprise:

the memory controller is further configured to group a first physical program unit of the plurality of physical program units in the first group into the plurality of subgroups;

The memory controller is further configured to calculate an H value for each of the subgroups based on the error probability values of the first physical programming units, wherein the H value is a mathematical expectation of the error probability values grouped into the first physical programming units in each of the subgroups;

the memory controller is also configured to group a second physical programming unit of the plurality of physical programming units in the first group into the plurality of subgroups and group the second physical programming unit having a minimum error probability value into a subgroup of the plurality of subgroups having a maximum H value;

the memory controller is also to recalculate the H values for the plurality of subgroups; and

the memory controller is also configured to group a third physical programming unit of the plurality of physical programming units in the first group into the plurality of subgroups and group the third physical programming unit having the minimum error probability value into a subgroup of the plurality of subgroups having the maximum H value until the plurality of physical programming units of the first group are grouped into the plurality of subgroups and the H values of the plurality of subgroups are equal or close.

11. The memory storage device of claim 7, wherein the error probability value for each of the physical programming units is represented as:

f (x, y, z) =h formula 1

12. The memory storage device of claim 7, wherein the memory controller is further configured to perform a logical operation on data written to the plurality of physical programming units to generate an error checking and correction code, wherein the error checking and correction code is configured to perform an error checking and correction operation on the data to recover the data of the plurality of physical programming units.

13. A memory controller, comprising:

a host interface for connecting to a host system;

a memory interface for connecting a memory module, wherein the memory module comprises at least one memory die having a plurality of physical erase units, each of the physical erase units comprising a plurality of physical program units;

An error checking and correcting circuit; and

a memory control circuit connecting the host interface, the memory interface and the error checking and correcting circuit,

the memory control circuit is further configured to obtain an error probability value of each of the physical programming units;

the memory control circuit is further configured to shuffle the plurality of physical programming units according to an out-of-order algorithm, and group the plurality of physical programming units into a plurality of first groups, wherein each first group includes a plurality of first type physical programming units and a plurality of second type physical programming units; and

the memory control circuit is also configured to group the plurality of physical programming units in each of the first groups into a plurality of subgroups, wherein each subgroup comprises at most one physical programming unit of the second type.

14. The memory controller of claim 13, wherein the memory control circuitry is further to write data received from a host system into the plurality of physical programming units, and

the memory control circuit is also configured to disable decoding of the data using error checking and correction operations when reading the data in the plurality of physical programming units.

15. The memory controller of claim 13, wherein the number of physical program units of the second type in each of the first groups is the same and the plurality of physical program units grouped into the same first group belong to different ones of the physical erase units.

16. The memory controller of claim 13, wherein the memory control circuitry is further configured to group the plurality of physical programming units in each of the first groups into the plurality of subgroups, wherein the operations of each of the subgroups including at most one of the second type of physical programming units further comprise:

the memory control circuitry is further to group a first physical program unit of the plurality of physical program units in the first group into the plurality of subgroups;

the memory control circuit is further configured to calculate an H value for each of the subgroups according to the error probability values of the first physical programming units, wherein the H value is a mathematical expectation of the error probability values grouped into the first physical programming units in each of the subgroups;

the memory control circuitry is also to group a second physical programming unit of the plurality of physical programming units in the first group into the plurality of subgroups and group the second physical programming unit having a minimum error probability value into a subgroup of the plurality of subgroups having a maximum H value;

The memory control circuitry is also to recalculate the H values for the plurality of subgroups; and

the memory control circuitry is also to group a third physical programming unit of the plurality of physical programming units in the first group into the plurality of subgroups and group the third physical programming unit having the minimum error probability value into a subgroup of the plurality of subgroups having the maximum H value until the plurality of physical programming units of the first group are grouped into the plurality of subgroups and the H values of the plurality of subgroups are equal or close.

17. The memory controller of claim 13, wherein the error probability value for each of the physical programming units is represented as:

f (x, y, z) =h formula 1

18. The memory controller of claim 13, wherein the memory control circuit is further configured to perform a logical operation on data written to the plurality of physical programming units to generate an error checking and correction code, wherein the error checking and correction code is configured to perform an error checking and correction operation on the data to recover the data of the plurality of physical programming units.