CN117472644A - Error determination method and system, processor and memory - Google Patents

Error determination method and system, processor and memory

Publication number
CN117472644A
Authority
CN
China
Prior art keywords
error
processor
media particles
error information
media
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211204450.3A
Other languages
Chinese (zh)
Inventor
陈智勇
牛元君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to PCT/CN2023/103220 (WO2024016971A1)
Publication of CN117472644A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1044 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices with specific ECC/EDC distribution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Detection And Correction Of Errors (AREA)

Abstract

The application discloses an error determination method and system, a processor, and a memory, and belongs to the technical field of computers. The error determination method is applied to a processor included in the error determination system. The processor obtains error information that indicates which of a plurality of media particles has experienced an error. The processor can then determine the erroneous media particles from the error information, without running a computation that consumes processing resources to locate them. The processor can thus devote more processing resources to handling the erroneous media particles, for example to error correction, which improves the reliability and usability of the error determination system.

Description

Error determination method and system, processor and memory
The present application claims priority to Chinese patent application No. 202210864871.2, entitled "Memory failure detection method, apparatus, and system" and filed on July 21, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and system for determining errors, a processor, and a memory.
Background
The computer device includes a plurality of media particles for storing data. Each media particle may be subject to errors in storing the data. Accordingly, it is desirable to provide an error determination method to determine a media particle in which an error has occurred from among a plurality of media particles.
Disclosure of Invention
The application provides an error determination method and system, a processor and a memory, so as to determine the media particles with errors from a plurality of media particles. The technical scheme provided by the application comprises the following aspects.
In a first aspect, an error determination method is provided that is applied to a processor included in an error determination system that further includes a plurality of media particles. After the processor acquires the error information, the processor can directly determine the media particle with the error according to the error information, because the error information can indicate the media particle with the error in the plurality of media particles.
Therefore, the processor does not need to determine the erroneous media particles through a calculation process that consumes processing resources, and can use the saved processing resources to handle the erroneous media particles, for example to perform error correction, which improves the reliability and usability of the error determination system. The process in which the processor consumes processing resources to calculate which media particles are in error is also referred to as the error detection process. The processing resources consumed by the error detection process are positively correlated with the number of erroneous bits and grow exponentially with it. Skipping the error detection process therefore saves considerable processing resources, which is why it is significant that the processor does not need to execute it.
In one possible implementation, the error determination system further includes a control interface circuit, the processor being connected to the control interface circuit by a line; the processor obtains error information, including: the processor receives the error information sent by the control interface circuit through the line. In the implementation mode, the processor can acquire the error information from the control interface circuit, and the acquisition mode is flexible.
In one possible implementation, the processor includes a main error correction code (ECC) element, the control interface circuit includes a registering clock driver (RCD), and the line includes a signal line. Because the processor includes the main ECC element, a signal line can be used as the line for exchanging error information. Error information travels faster over the signal line, which improves the efficiency and timeliness of error determination.
In one possible implementation, the processor includes a processing core, the control interface circuit includes an RCD, and the line includes a first bus. The speed of transmitting error information using the first bus may be slightly reduced compared to using the signal line. However, this implementation does not require the processor to include the main ECC element, and is widely applicable.
In one possible implementation, the processor includes a processing core, the control interface circuit includes a complex programmable logic device (CPLD), and the line includes a second bus. The speed of transmitting error information over the second bus may be slightly lower than over the signal line. However, this implementation requires neither the main ECC element in the processor nor the RCD, and is therefore even more widely applicable.
In one possible implementation, the control interface circuit is connected to each of the plurality of media particles. The plurality of media particles send a plurality of pieces of sub-error information to the control interface circuit in parallel, and the control interface circuit obtains the error information by processing the sub-error information in a parallel-to-serial manner. The sub-error information sent by any one media particle indicates the error condition of that media particle. Because the control interface circuit gathers the parallel sub-error information into serial error information, the processor can acquire the serial error information directly, which improves the transmission efficiency of the error information. The processor can also learn the error conditions of the plurality of media particles within a short time, which makes it easier to handle the plurality of media particles as a whole.
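The parallel-to-serial step described above can be sketched as follows. This is a minimal illustration only: it assumes that each media particle drives one pin with a single-bit sub-error value and that the control interface circuit shifts the latched values out in pin order. The function names and the bit ordering are hypothetical and are not taken from the application.

```python
from typing import Iterable, Iterator, List


def parallel_to_serial(pin_levels: List[int]) -> Iterator[int]:
    """Control interface side (assumed behavior): latch all pins at once,
    then emit one bit per transfer, starting with pin 0."""
    latched = list(pin_levels)  # snapshot of the parallel sub-error signals
    for level in latched:
        yield level & 1


def serial_to_particles(bits: Iterable[int]) -> List[int]:
    """Processor side: recover the indexes of the particles whose bit is set."""
    return [index for index, bit in enumerate(bits) if bit]


# Hypothetical example: 8 media particles, particles 2 and 4 report an error.
sub_errors = [0, 0, 1, 0, 1, 0, 0, 0]
erroneous = serial_to_particles(parallel_to_serial(sub_errors))
```

The processor only reads the serial stream and never inspects stored data to locate the erroneous particles, which is the point of the scheme.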
In one possible implementation, the error condition of any one media particle includes whether the media particle is in error. Alternatively, the error condition of any one media particle includes: no error has occurred, a correctable error (CE) has occurred, or a detected uncorrectable error (DUE) has occurred. That is, the error condition may be divided coarsely into two cases (no error, error) or, at finer granularity, into three cases (no error, CE, DUE). Either division can be chosen flexibly according to actual requirements.
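The two granularities can be illustrated with a small sketch. The numeric codes and the names below are assumptions made for illustration; the application does not specify an encoding.

```python
from enum import IntEnum


class ErrorCondition(IntEnum):
    """Three-case division; the numeric values are illustrative assumptions."""
    NO_ERROR = 0
    CE = 1   # correctable error
    DUE = 2  # detected uncorrectable error


def to_two_case(condition: ErrorCondition) -> int:
    """Collapse the three-case division into the two-case one:
    0 means no error, 1 means an error (CE or DUE) occurred."""
    return 0 if condition == ErrorCondition.NO_ERROR else 1


def bits_per_particle(three_case: bool) -> int:
    """Minimum number of bits needed to encode one particle's condition."""
    return 2 if three_case else 1
```

The finer division costs one extra bit per media particle in this encoding, which is the trade-off behind choosing between the two.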
In one possible implementation, a media particle sends the sub-error information when a DUE occurs in that media particle. Alternatively, a media particle sends the sub-error information when either a CE or a DUE occurs in that media particle. That is, whether a media particle also sends the sub-error information when a CE occurs can be configured flexibly according to actual requirements. If a media particle sends the sub-error information when a DUE occurs, the processor, which has greater error correction capability, can handle these uncorrectable errors, which improves the reliability and usability of the error determination system. If a media particle sends the sub-error information when a CE occurs, the processor learns that a CE occurred in that media particle, which makes it easier for the processor to manage the media particles and likewise improves the reliability and usability of the error determination system.
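The configurable reporting rule can be sketched as a single predicate. The flag name `report_ce` and the string labels are hypothetical; the application only states that reporting on CE is configurable.

```python
def should_send_sub_error(condition: str, report_ce: bool) -> bool:
    """Decide whether a media particle sends sub-error information.

    condition: one of "NONE", "CE", "DUE" (labels assumed for illustration).
    report_ce: assumed configuration flag; when False, only a DUE is reported.
    """
    if condition == "DUE":
        return True          # a DUE is reported in both configurations
    if condition == "CE":
        return report_ce     # a CE is reported only if configured
    return False             # nothing to report when no error occurred
```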
In one possible implementation, the error information includes a plurality of bits, the plurality of bits corresponding one-to-one to the plurality of media particles. Because the number of the medium particles is always fixed, the number of bits occupied by the error information is also fixed in the mode, so that the error information has higher stability and universality.
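A minimal sketch of this fixed-width format follows, assuming that bit i of the error information corresponds to media particle i and that a set bit means an error occurred. The bit order and the function names are illustrative assumptions.

```python
from typing import List


def encode_bitmap(erroneous: List[int], particle_count: int) -> int:
    """Build error information with one bit per media particle."""
    bitmap = 0
    for index in erroneous:
        if not 0 <= index < particle_count:
            raise ValueError(f"particle index {index} out of range")
        bitmap |= 1 << index
    return bitmap


def decode_bitmap(bitmap: int, particle_count: int) -> List[int]:
    """Processor side: read the erroneous particles directly from the bits."""
    return [i for i in range(particle_count) if (bitmap >> i) & 1]


# Hypothetical example: 40 media particles, particles 3 and 17 in error.
error_info = encode_bitmap([3, 17], particle_count=40)
```

The width of the error information depends only on the number of media particles, not on how many of them are in error.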
In one possible implementation, the error information includes an identification of the media particle in which the error occurred. Under the condition that the number of the medium particles with errors is small, the error information only needs to comprise a small number of marks, so that the number of bits occupied by the error information is small, the processing resources required for transmitting the error information are saved, and the efficiency of transmitting the error information is improved.
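The size advantage of the identification-based format can be estimated with a short calculation. The sketch assumes fixed-width binary identifications and ignores any framing (such as a count or terminator) that a real encoding would need; these assumptions and the function names are not from the application.

```python
def identification_width(particle_count: int) -> int:
    """Bits needed for one fixed-width identification (assumed encoding)."""
    return max(1, (particle_count - 1).bit_length())


def identification_list_bits(error_count: int, particle_count: int) -> int:
    """Payload size when the error information lists erroneous particles."""
    return error_count * identification_width(particle_count)


def bitmap_bits(particle_count: int) -> int:
    """Payload size when the error information uses one bit per particle."""
    return particle_count


# Hypothetical example: 40 media particles, 2 of them in error.
list_size = identification_list_bits(2, 40)
map_size = bitmap_bits(40)
```

Under these assumptions, two erroneous particles out of 40 need 12 bits as identifications versus 40 bits as a one-bit-per-particle map; the advantage shrinks as the number of erroneous particles grows.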
In a second aspect, a processor is provided for obtaining error information indicating a media particle in a plurality of media particles in which an error occurred;
the processor is further configured to determine a media particle that is in error based on the error information.
In one possible implementation, the processor is configured to receive error information sent by the control interface circuit over the line.
In one possible implementation, the processor includes a main ECC element, the control interface circuit includes an RCD, and the lines include signal lines.
In one possible implementation, the processor includes a processing core, the control interface circuit includes an RCD, and the line includes a first bus.
In one possible implementation, the error information includes a plurality of bits, the plurality of bits corresponding one-to-one to the plurality of media particles.
In one possible implementation, the error information includes an identification of the media particle in which the error occurred.
In a third aspect, a memory is provided, the memory comprising a plurality of media particles; the plurality of media particles is configured to generate error information indicating a media particle in which an error has occurred among the plurality of media particles.
In one possible implementation, the memory further includes control interface circuitry; the control interface circuit is used for sending error information to the processor through a line.
In one possible implementation, the processor includes a main ECC element, the control interface circuit includes an RCD, and the lines include signal lines.
In one possible implementation, the processor includes a processing core, the control interface circuit includes an RCD, and the line includes a first bus.
In one possible implementation, the control interface circuit is connected to a plurality of media particles, respectively; the medium particles are used for transmitting a plurality of sub-error messages to the control interface circuit in parallel, and for any one of the medium particles, the sub-error message transmitted by the any one of the medium particles is used for indicating the error condition of the any one of the medium particles; the control interface circuit is also used for processing a plurality of sub-error information in a parallel-to-serial mode to obtain error information.
In one possible implementation, the error condition of any one media particle includes whether any one media particle is in error; alternatively, the error condition of any one media particle includes any one media particle not being in error, having a CE, or having a DUE.
In one possible implementation, any one of the media particles is configured to send a sub-error message in the event of a DUE occurring in any one of the media particles; or, any one of the media particles is used for sending the sub-error information in the case that CE or DUE occurs in any one of the media particles.
In one possible implementation, the error information includes a plurality of bits, the plurality of bits corresponding one-to-one to the plurality of media particles.
In one possible implementation, the error information includes an identification of the media particle in which the error occurred.
In a fourth aspect, an error determination system is provided, the system comprising a processor provided for the second aspect and any of its possible implementations, and a memory provided for the third aspect and any of its possible implementations.
In a fifth aspect, an error determination system includes a processor and a plurality of media particles; the plurality of media particles is used for generating error information, and the error information is used for indicating the media particles with errors in the plurality of media particles; the processor is used for acquiring error information; the processor is further configured to determine a media particle that is in error based on the error information.
In one possible implementation, the system further includes a control interface circuit, the processor being connected to the control interface circuit by a line; the control interface circuit is used for sending error information to the processor through a line; the processor is used for receiving error information sent by the control interface circuit through a line.
In one possible implementation, the processor includes a processing core, the control interface circuit includes a CPLD, and the line includes a second bus.
In one possible implementation, the control interface circuit is connected to a plurality of media particles, respectively; the medium particles are used for transmitting a plurality of sub-error messages to the control interface circuit in parallel, and for any one of the medium particles, the sub-error message transmitted by the any one of the medium particles is used for indicating the error condition of the any one of the medium particles; the control interface circuit is also used for processing a plurality of sub-error information in a parallel-to-serial mode to obtain error information.
In one possible implementation, the error condition of any one media particle includes whether any one media particle is in error; alternatively, the error condition of any one media particle includes any one media particle not being in error, having a CE, or having a DUE.
In one possible implementation, any one of the media particles is configured to send a sub-error message in the event of a DUE occurring in any one of the media particles; or, any one of the media particles is used for sending the sub-error information in the case that CE or DUE occurs in any one of the media particles.
In one possible implementation, the error information includes a plurality of bits, the plurality of bits corresponding one-to-one to the plurality of media particles.
In one possible implementation, the error information includes an identification of the media particle in which the error occurred.
It should be appreciated that, for technical effects achieved by the technical solutions provided by the possible implementation manners corresponding to the second aspect to the fifth aspect of the present application, reference may be made to the technical effects of the technical solutions provided by the first aspect and the corresponding possible implementation manners, which are not repeated herein.
Drawings
FIG. 1 is a schematic diagram of an error determination system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of another error determination system provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of yet another error determination system provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a control interface circuit connected to a plurality of media particles according to an embodiment of the present disclosure;
FIG. 5 is a flowchart of an error determination method according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of sending error information in a time window according to an embodiment of the present application;
FIG. 7 is a schematic diagram of sending error information according to an embodiment of the present application;
FIG. 8 is a schematic flow chart of ECC in a media particle according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of performing ECC twice according to an embodiment of the present application.
Detailed Description
The terminology used in the description section of the present application is for the purpose of describing particular embodiments of the present application only and is not intended to be limiting of the present application.
An embodiment of the present application provides an error determination system. Referring to FIG. 1, the system includes a processor and a plurality of media particles. The processor may be configured to perform the error determination method provided by the embodiments of the present application, which determines the media particles in which an error has occurred from among the plurality of media particles. A media particle may also be referred to as a logical unit (LUN) or a die.
In an exemplary embodiment, the error determination system includes, but is not limited to, case A1 and case A2 as follows.
In case A1, the error determination system includes a processor and a memory, the memory including a plurality of media particles.
In case A1, the plurality of media particles are integrated in the memory. Illustratively, the memory also includes a conductive contact. The plurality of media particles may be directly or indirectly connected to the conductive contact, and the processor is also connected to the conductive contact, so that the processor is connected to the plurality of media particles. The conductive contact may be finger-shaped and gold-plated, and is therefore also referred to as a gold finger.
In case A2, the error determination system includes a processor and a plurality of media particles.
In case A2, the plurality of media particles do not need to be integrated and are discrete. Each of the media particles may be directly or indirectly connected to the processor.
Illustratively, the error determination system further includes a control interface circuit, to which the processor is connected by a line. That is, the error determination system includes a processor, a plurality of media particles, and the control interface circuit. In this arrangement, the error determination system includes, but is not limited to, case B1 and case B2 as follows.
In case B1, which is based on case A1, the error determination system includes a processor and a memory, and the memory includes a plurality of media particles and a control interface circuit.
In this case, the control interface circuit may be connected to the conductive contact included in the memory, and the processor is connected to the conductive contact through the line, so that the processor is connected to the control interface circuit through the line. Accordingly, the plurality of media particles may be indirectly connected to the conductive contact through the control interface circuit. That is, the control interface circuit is connected both to the conductive contact and to the plurality of media particles.
Illustratively, case B1 may include cases 1 and 2 as follows.
In case 1, the processor includes a main (host) ECC element, the control interface circuit includes an RCD, and the line includes a signal line (pin).
The main ECC element is connected to the RCD through the signal line. For example, as shown in FIG. 2, the main ECC element is connected to the conductive contact via the signal line, the conductive contact is connected to the RCD, and the RCD is also connected to the plurality of media particles. The processor may include a memory controller in which the main ECC element is located. Illustratively, the memory controller may also include other elements not shown in FIG. 2, such as a read/write (RD/WR) element, without limitation.
Illustratively, the signal lines include, but are not limited to: a data queue (DQ) signal line for transmitting data, a command address (CA) signal line for transmitting commands and addresses, a signal line for transmitting a clock signal, and signal lines for transmitting other information. The number of signal lines is not limited in the embodiments of the present application; for example, there may be 288 signal lines.
The command transmitted on the CA signal line is a read command or a write command: the read command is used to read data from a media particle, and the write command is used to write data into a media particle. The address transmitted on the CA signal line is the address of the media particle. The clock signal indicates clock cycles. A clock cycle is the time unit for interaction between the memory controller and the media particles, and includes a rising edge and a falling edge. The memory controller and the media particles can interact once at the rising edge and once at the falling edge. A rising edge or a falling edge may be referred to as a burst or a heartbeat. The other information mentioned above may be information related to the media particles, such as the temperature of the media particles, without limitation.
In case 2, the processor comprises a processing core (core), the control interface circuit comprises an RCD, and the line comprises a first bus.
The processing core is connected to the RCD via the first bus. For example, referring to FIG. 2, the processing core is connected to the conductive contact via the first bus, the conductive contact is connected to the RCD, and the RCD is also connected to the plurality of media particles. As can be seen from FIG. 2, the processing core and the memory controller described in case 1 are distinct elements of the processor. Illustratively, the first bus may include, but is not limited to, an inter-integrated circuit (I2C) bus, an improved inter-integrated circuit (I3C) bus, and the like. Compared with the I2C bus, the I3C bus transmits information faster and remains compatible with the I2C bus. The I3C bus combines the advantages of the I2C bus and the serial peripheral interface (SPI) bus and adds new functionality: the advantages of the I2C bus include its simple two-wire design, and the advantages of the SPI bus include low power consumption and high speed. Thus, the first bus may be, for example, an I3C bus. In addition, software such as a basic input output system (BIOS) may run on the processing core.
Illustratively, the RCDs in case 1 and case 2 may include registers having a memory function. In addition, the case shown in fig. 2 is a case where the processor includes both the main ECC element and the processing core, that is, a case where case 1 and case 2 coexist. In addition to this case, the processor may also include only the main ECC element, i.e. only case 1 is present. Alternatively, the processor may include only processing cores, i.e., only case 2 is present.
In case B2, which is based on case A2, the error determination system includes a processor, a plurality of media particles, and a control interface circuit.
The control interface circuit may be connected directly to the processor through the line. The discrete media particles may also be connected to the control interface circuit, so that the processor is indirectly connected to the plurality of media particles via the control interface circuit.
Illustratively, case B2 may include case 3 as follows.
In case 3, the processor includes a processing core, the control interface circuit includes a CPLD, and the line includes a second bus.
The processing core is connected to the CPLD through the second bus, and the CPLD is also connected to the plurality of media particles. For example, referring to FIG. 3, the CPLD may carry the processor and the plurality of media particles. It should be understood that FIG. 3 shows the processor and the plurality of media particles on the same side of the CPLD only as an example; they may also be located on different sides of the CPLD, such as opposite sides. Illustratively, the second bus includes, but is not limited to: an I2C bus, an I3C bus, an SPI bus, a local bus, and the like. The second bus may be, for example, an I3C bus.
Illustratively, the CPLD may have a memory function. In addition, the system in case 3 may also include an RCD that is connected to the CPLD, so that the CPLD provides an indirect connection between the RCD and the processor and between the RCD and the plurality of media particles.
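Cases 1 to 3 differ only in which processor element, control interface circuit, and line are used. As a rough summary, the combinations can be written as a lookup table; the class and field names below are hypothetical and serve only to organize what the text states.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Topology:
    processor_element: str
    control_interface: str
    line: str


# The three combinations described in the text, keyed by case number.
CASES: Dict[int, Topology] = {
    1: Topology("main ECC element", "RCD", "signal line"),
    2: Topology("processing core", "RCD", "first bus"),
    3: Topology("processing core", "CPLD", "second bus"),
}


def cases_without_main_ecc() -> List[int]:
    """Cases that do not rely on a main ECC element in the processor."""
    return sorted(
        number
        for number, topology in CASES.items()
        if topology.processor_element != "main ECC element"
    )
```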
In an exemplary embodiment, the control interface circuit may be separately coupled to a plurality of media particles, whether the control interface circuit is an RCD or a CPLD. Each medium particle comprises a pin, the control interface circuit comprises a plurality of pins, and the number of the pins of the control interface circuit is equal to that of the medium particles, so that the pins of the control interface circuit are in one-to-one correspondence with the pins of each medium particle. Then, the corresponding pins are connected through the connecting wires, so that the control interface circuit can be connected with each medium particle respectively.
Referring to fig. 4, fig. 4 shows a control interface circuit and a plurality of dielectric particles, the pins being represented by circles in fig. 4. It should be understood that the 40 media particles shown in fig. 4 are only examples and are not intended to limit the number of media particles included in the error determination system. Also, for simplicity, only 20 pins are shown on the control interface circuit in fig. 4, with the 20 pins being connected with 20 media particles, respectively. In practice, the control interface circuit of fig. 4 should include 40 pins, and these 40 pins should be connected to 40 media particles, respectively.
Illustratively, the pins include, but are not limited to: general-purpose input/output (GPIO) pins, etc., are not limited herein.
In the embodiments of the present application, the processor is, for example, a central processing unit (CPU), and the media particles are, for example, dynamic random access memory (DRAM). In some embodiments, the memory is classified by physical structure, in which case the memory includes, but is not limited to: a dual in-line memory module (DIMM), a single in-line memory module (SIMM), and so forth. In other embodiments, the memory is classified by memory technology, in which case the memory includes, but is not limited to: double data rate (DDR) memory, high bandwidth memory (HBM), NAND memory, universal backplane management (UBM) memory, compute express link (CXL) memory, and so forth. Illustratively, DDR may include DDR4, DDR5, DDR6, and the like, and HBM may include HBM2, HBM3, and the like.
It should be understood that the processors, media particles, and memories described herein are examples and not limitations. Other processors, media particles, or memories that are suitable for use in the embodiments of the present application also fall within the scope of protection of the embodiments of the present application and are incorporated herein by reference.
Next, the roles of the various elements included in fig. 1 to 4 described above will be described in conjunction with the error determination method provided in the embodiment of the present application. The error determination method can be used in the processor. As shown in fig. 5, the error determination method includes the following steps 501 and 502.
In step 501, a processor obtains error information, the error information indicating the media particle in which an error occurred among a plurality of media particles.
The media particles may be used to store data. During use, any media particle may experience an error. For example, after a long service time the properties of a media particle change, so that an error occurs in the media particle and the data stored in it also becomes erroneous. For another example, even if the properties of a media particle are unchanged, an occasional error in the media particle may also corrupt the data it stores. Thus, there may be erroneous media particles among the plurality of media particles, and the processor needs to acquire error information indicating them.
In an exemplary embodiment, the processor obtains error information, including but not limited to case C1 and case C2 as follows.
In case C1, the control interface circuit sends error information to the processor via the line, and correspondingly, the processor receives the error information sent by the control interface circuit via the line.
This case C1 may be applied to a case where the error determination system includes a control interface circuit, such as the case B1 and the case B2 described above. Accordingly, the processor, control interface circuitry, and circuitry may include cases 1 through 3 described above.
In case 1, the RCD sends error information to the main ECC element via the signal line. Because the processor includes the main ECC element, the signal line can be used to transmit the error information, and transmission over the signal line is fast, which improves the error determination efficiency. Case 1 applies where the processor includes a main ECC element; if the processor does not include one, the processor needs to be modified to add it. In addition, case 1 requires a modification to the RCD so that the RCD can send error information to the main ECC element. In summary, the error determination efficiency of case 1 is high, and modifications to the processor and the RCD may be required.
In case 1, the signal line may be a designated signal line or a signal line that is idle when the error information needs to be transmitted, which is not limited herein. For example, in the case where the memory is DDR5 memory, the signal line may be a designated DQ signal line. For another example, in the case where the memory is DDR4 memory, the signal line may be a signal line that is idle when the error information needs to be transmitted: if the signal line used for transmitting the temperature of the media particles is idle at that time, the error information can be transmitted through it.
In case 2, the RCD sends error information to the processing core via the first bus. Since the main ECC element is not used, the signal line cannot be used, and the error information must be transmitted over the first bus, which is slower than the signal line. However, case 2 is applicable where the processor does not include a main ECC element: no modification to the processor is required, and only the RCD needs to be modified so that it can send error information to the processing core. In summary, the error determination efficiency of case 2 is lower than that of case 1, and the RCD may need to be modified while the processor does not.
In case 1 and case 2, referring to fig. 6, transmission of error information may be completed in one time window. For example, the time window is a CRC_ALERT_PW time window defined by the relevant specification. It should be understood that the error information shown in fig. 6 is only an exemplary form, and is not intended to limit the form of the error information acquired by the processor in the embodiment of the present application.
In case 3, the CPLD sends error information to the processing core via the second bus. In case 3, the main ECC element is not used, so the signal line cannot be used, and the error information must be transmitted over the second bus, which is slower than the signal line. However, case 3 does not require modification of the processor or the RCD; only the programming corresponding to the CPLD needs to be modified so that the CPLD can send error information to the processing core. Accordingly, the CPLD transmits the error information based on the modified programming. Since case 3 requires the use of the second bus and the transmission of error information is completed based on the modified programming, the error determination efficiency of case 3 is lower than that of case 2. However, case 3 requires no hardware modification (i.e., to the processor or the RCD), only a software modification (i.e., to the programming corresponding to the CPLD), and therefore has strong applicability.
In case 2 and case 3, referring to fig. 7, after the control interface circuit obtains the error information and transmits the error information, the processing core included in the processor needs to obtain the error information based on the running BIOS in order to perform the subsequent processing. The subsequent processing may be referred to as the description in step 502, and is not described herein.
Regardless of the manner in which the control interface circuit transmits the error information, the control interface circuit needs to acquire the error information before transmitting it. In some embodiments, where the control interface circuit is not separately connected to each of the plurality of media particles, the media particles may send sub-error information to the control interface circuit one by one: for example, after the 0th media particle sends its sub-error information to the control interface circuit, the 1st media particle sends its sub-error information to the control interface circuit, and so on. The control interface circuit may then aggregate the individually received sub-error information into a plurality of parallel sub-error information. In other embodiments, where the control interface circuit is separately connected to each of the plurality of media particles, the media particles send the plurality of sub-error information to the control interface circuit in parallel, and the control interface circuit directly receives the plurality of parallel sub-error information. For any one of the plurality of media particles, the sub-error information sent by that media particle indicates the error condition of that media particle.
Illustratively, the error condition of any one of the media particles includes whether any one of the media particles is in error, i.e., the error condition includes both no error and an error. Alternatively, the error condition of any one of the media particles includes any one of the media particles not being in error, CE being in error, or DUE being in error, i.e., the error condition includes three conditions. The error condition includes two cases or three cases, and can be selected according to actual conditions.
The case in which the control interface circuit is separately connected to each of the plurality of media particles is the case shown in fig. 4. As can be seen from fig. 4, the pins of the control interface circuit are connected to the pins of each media particle by corresponding connection lines. Any one of the media particles may then send sub-error information to the control interface circuit by controlling the level of its connection line; this process of sending sub-error information is also known as a signaling process, and the level of the connection line is also called an alert signal. Illustratively, when a media particle drives the connection line to a first level, the sub-error information indicates that the media particle is in error. When a media particle drives the connection line to a second level, the sub-error information indicates that the media particle is not in error. The second level is different from the first level: the first level may be greater than the second level, or the first level may be less than the second level. Of course, the first level may be further divided into two different sub-levels that indicate the occurrence of a CE and a DUE, respectively.
Of course, the signaling may also be performed in ways other than controlling the level of the connection line. For example, the signaling may be performed using an encoded signal supported by the media particle interface, or using a message interrupt, which is not limited herein.
After the control interface circuit obtains the plurality of parallel sub-error information, the control interface circuit processes the plurality of sub-error information in a parallel-to-serial manner to obtain the error information. In some embodiments, the control interface circuit sends the error information to the processor in real time after obtaining it. In other embodiments, the control interface circuit stores the error information after obtaining it, for example in a register when the control interface circuit is an RCD, or by a storage function when the control interface circuit is a CPLD. The control interface circuit may then send a request to the processor; after receiving the request, the processor sends an acquisition instruction to the control interface circuit, and the control interface circuit reads the stored error information according to the acquisition instruction and returns it to the processor.
In an exemplary embodiment, the error information obtained by the processor includes, but is not limited to, case D1 and case D2 as follows.
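The parallel-to-serial step can be pictured as latching all alert lines at once and then shifting the latched values out one bit per clock. The following is a minimal Python sketch of that idea, assuming each sub-error information is a single bit and that the 0th media particle is shifted out first; the function names are hypothetical and are not taken from the application.

```python
from typing import Iterable, Iterator, List


def parallel_to_serial(sub_errors: List[int]) -> Iterator[int]:
    """Latch the parallel sub-error bits, then emit them one per clock tick."""
    latched = list(sub_errors)  # snapshot of every connection line at once
    for bit in latched:
        yield bit & 1


def serial_to_word(stream: Iterable[int]) -> int:
    """Receiver side: pack the serial bits into an integer (first bit = bit 0)."""
    word = 0
    for index, bit in enumerate(stream):
        word |= bit << index
    return word


# 40 media particles; particles 3 and 17 report an error (assumed level 1 = error).
levels = [0] * 40
levels[3] = 1
levels[17] = 1
error_information = serial_to_word(parallel_to_serial(levels))
```

The resulting integer can be sent immediately or held in a register until the processor asks for it, matching the two transmission options described above.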
In case D1, the error information includes a plurality of bits, and the plurality of bits are in one-to-one correspondence with the plurality of media particles.
The correspondence between the plurality of bits and the plurality of media particles can be negotiated by the processor and the control interface circuit, so that the processor can identify which bit corresponds to which media particle and can determine from a bit whether the corresponding media particle is the media particle in which an error occurred. Taking 40 media particles as an example, the plurality of bits is 40 bits. The correspondence may then be: the 1st bit corresponds to the 1st media particle, and so on, until the 40th bit corresponds to the 40th media particle. Alternatively, it may be: the 1st bit corresponds to the 40th media particle, and so on, until the 40th bit corresponds to the 1st media particle. Of course, other correspondences are also possible and are not listed here.
For example, for any one of the plurality of bits, the value of any one bit is a first value, which indicates that an error occurs in the media grain corresponding to any one bit, and the value of any one bit is a second value, which indicates that no error occurs in the media grain corresponding to any one bit, where the first value is different from the second value. For example, the first value is 0 and the second value is 1. For another example, the first value is 1 and the second value is 0. Alternatively, the first and second values may be other values than 0 and 1, which are not limited herein.
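As an illustration of case D1, the processor-side decoding of such a bitmap might look like the sketch below. It assumes the error information arrives as an integer with one bit per media particle; the parameter names `error_value` (the first value) and `reverse` (the reversed correspondence) are hypothetical.

```python
from typing import List


def decode_bitmap(error_information: int,
                  num_particles: int = 40,
                  error_value: int = 1,
                  reverse: bool = False) -> List[int]:
    """Return the indices of the media particles flagged as erroneous.

    error_value is the agreed 'first value' (0 or 1) that marks an error.
    reverse=True models the mapping where bit i corresponds to particle
    (num_particles - 1 - i) instead of particle i.
    """
    faulty = []
    for bit_index in range(num_particles):
        bit = (error_information >> bit_index) & 1
        if bit == error_value:
            particle = num_particles - 1 - bit_index if reverse else bit_index
            faulty.append(particle)
    return sorted(faulty)
```

For instance, with the default mapping `decode_bitmap(0b1001)` yields particles 0 and 3, while the same call with `error_value=0` on a 4-particle bitmap `0b1110` yields particle 0 only.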
In case D2, the error information includes an identification of the media particle in which the error occurred.
One identification uniquely indicates one media particle. The form of the identification is not limited in this embodiment; it may be a numerical value, a character string, etc. The processor may negotiate with the control interface circuit so that, after receiving an identification, the processor can identify the media particle indicated by the identification and determine that this media particle is the media particle in which an error occurred.
In the case D1, the number of bits occupied by the error information is fixed, which has strong stability and versatility. Moreover, the number of bits occupied by the error information is the same as the number of media particles, and the number of media particles is often limited, so that the number of bits occupied by the error information is not too large. In the case D2, the number of bits occupied by the error information may be not fixed, because the media particles in which the error occurs may be different and the identifications of the media particles in which the error occurs may be different. However, if the number of erroneous media particles is small, the number of bits occupied by the erroneous information will be small, such as less than the number of bits occupied by the erroneous information in case D1. For example, the embodiment of the present application may select the case D1 by default or select the case D2 by default, or may select one case with a smaller number of bits occupied by the error information from the case D1 and the case D2.
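The trade-off between case D1 and case D2 can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each identification in case D2 is a fixed-width binary number just wide enough to address every media particle, and it ignores any framing bits (such as a count field) that a real format might need.

```python
def bits_case_d1(num_particles: int) -> int:
    """Case D1: one bit per media particle, regardless of how many are faulty."""
    return num_particles


def bits_case_d2(num_particles: int, num_faulty: int) -> int:
    """Case D2: one fixed-width identification per faulty media particle."""
    id_width = max(1, (num_particles - 1).bit_length())  # 6 bits for 40 particles
    return num_faulty * id_width


def choose_format(num_particles: int, num_faulty: int) -> str:
    """Pick whichever case occupies fewer bits (ties go to D1)."""
    d1 = bits_case_d1(num_particles)
    d2 = bits_case_d2(num_particles, num_faulty)
    return "D1" if d1 <= d2 else "D2"
```

Under these assumptions, with 40 media particles case D2 is smaller when at most 6 particles are faulty (6 x 6 = 36 bits versus 40 bits), and case D1 is smaller from 7 faulty particles onward.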
Of course, in addition to the cases D1 and D2 described above, the control interface circuit may directly send the plurality of sub-error information to the processor in parallel via the line, instead of performing parallel-to-serial processing for the plurality of sub-error information. Considering that this approach requires more lines, it is suitable for a scenario where more lines are available between the processor and the control interface circuitry.
Illustratively, regardless of the manner in which the control interface circuit obtains the error information, the control interface circuit may encode the error information to obtain the encoded error information. The error information is encoded, that is, compressed, so that the number of bits occupied by the encoded error information is smaller than the number of bits occupied by the error information. Correspondingly, after the processor receives the encoded error information, the processor decodes the encoded error information to obtain the error information.
In addition, any one of the media particles needs to determine the sub-error information before it sends the sub-error information to the control interface circuit. In an exemplary embodiment, any one of the media particles determines the manner of the sub-error information, including but not limited to case E1 and case E2 as follows.
In case E1, ECC processing is performed inside any one of the media particles to obtain the sub-error information. This process may also be referred to as on-die ECC, and it may be transparent to the processor.
In some embodiments, if the error condition indicated by the sub-error information includes no error and an error occurs, the process of determining the sub-error information may include: any one of the media particles is calculated to obtain a target value according to the data read from any one of the media particles, and a reference value is calculated according to the data written into any one of the media particles. And then, any one medium particle generates a comparison result between the target value and the reference value, and the sub-error information is determined according to the comparison result. That is, the sub-error information sent by any one of the media grains is determined according to a comparison result generated inside any one of the media grains, the comparison result is a comparison result between a target value and a reference value, the target value is calculated according to data read from any one of the media grains, and the reference value is calculated according to data written into any one of the media grains.
If the comparison result is that the target value is different from the reference value, the sub-error information sent by any one of the medium particles indicates that any one of the medium particles is in error, and if the comparison result is that the target value is the same as the reference value, the sub-error information sent by any one of the medium particles indicates that any one of the medium particles is not in error.
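A minimal sketch of this comparison is shown below. The application does not say how the target value and reference value are calculated, so CRC-32 from Python's standard `zlib` module is used here only as a stand-in; the class and attribute names are hypothetical.

```python
import zlib


class MediaParticle:
    """Toy model of a media particle that checks its own stored data."""

    def __init__(self) -> None:
        self.cells = b""          # stored data
        self.reference_value = 0  # value calculated from the written data

    def write(self, data: bytes) -> None:
        self.cells = data
        # Reference value: calculated from the data written into the particle.
        self.reference_value = zlib.crc32(data)

    def sub_error_information(self) -> int:
        # Target value: calculated from the data read back from the particle.
        target_value = zlib.crc32(self.cells)
        # 1 = error occurred, 0 = no error (assumed encoding).
        return 1 if target_value != self.reference_value else 0


particle = MediaParticle()
particle.write(b"\x12\x34")
before = particle.sub_error_information()
particle.cells = b"\x12\x35"  # simulate a single-bit flip in the stored data
after = particle.sub_error_information()
```

Here `before` is 0 because the target value equals the reference value, and `after` is 1 because the flipped bit changes the target value.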
In other embodiments, if the error condition indicated by the sub-error information includes no error, CE, or DUE, then determining the sub-error information may include the following process, see FIG. 8.
During writing, for data (data to memory) to be written into the media grain, the data is both written into the media grain and input into the detection bit generator (check bit generator), so that the detection bit generator generates and outputs detection bits (check bits) corresponding to the data. After that, the detection bit is also written to the media grain. Wherein the data itself and the detection bits may be stored in different units in the media particle, respectively.
In the reading process, for data (data from memory) to be read from the media particles, both the data itself and the detection bits corresponding to the data are read. Then, the data itself and the detection bits are input to a syndrome generator (syndrome generator), and the syndrome generator generates a syndrome (syndrome) corresponding to the data. The syndrome is decoded by a syndrome decoding (syndrome decoding) process.
After decoding the syndrome, if the syndrome is zero, no error has occurred. If the syndrome is non-zero and matches one of the columns of the H matrix (a kind of check matrix), a CE has occurred. If the syndrome is non-zero and matches no column of the H matrix, a DUE has occurred. Thus, the error condition is determined to be no error, CE, or DUE. In the case of a CE, CE positioning is performed on the data read from the media particle, that is, the bit position where the error occurred is determined, and error correction is then performed to obtain the corrected data.
In the on-die ECC process, the number of bits occupied by the data itself and the number of bits occupied by the data used to detect the error condition determine the number of bits that can be corrected as a CE. According to the Joint Electron Device Engineering Council (JEDEC) specification, if the data itself occupies 128 bits and the data used to detect the error condition is an 8-bit Hamming code (for example, the above-mentioned reference value or detection bits occupy 8 bits), a CE covers 1 bit. In other words, one-bit errors can be corrected. According to this specification, two-bit errors can also be detected but cannot be corrected, and this scheme is therefore also called single error correction, double error detection (SEC-DED). Based on this, if only a one-bit error is detected, the error is a CE, and if a two-bit error is detected, the error is a DUE.
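The three-way syndrome decision can be illustrated with a deliberately small code. The sketch below uses an extended Hamming (8,4) code instead of the 128-bit data word of the real on-die ECC, so the H matrix, bit layout, and function names are assumptions chosen for readability and not the layout used by any particular media particle.

```python
from typing import List, Optional, Tuple

# Columns of the H matrix for bit positions 1..8, each stored as a 4-bit integer.
# Position i (1..7) has column (binary of i, followed by a 1); position 8, the
# overall parity bit, has column 0001. Every column therefore has its low bit set.
COLUMNS = [(i << 1) | 1 for i in range(1, 8)] + [1]
DATA_POSITIONS = (3, 5, 6, 7)   # 1-based positions holding the data bits
CHECK_POSITIONS = (1, 2, 4)     # 1-based positions holding Hamming check bits


def encode(data4: List[int]) -> List[int]:
    """Detection bit generator: build an 8-bit codeword from 4 data bits."""
    word = [0] * 8
    for pos, bit in zip(DATA_POSITIONS, data4):
        word[pos - 1] = bit
    s = 0
    for pos in DATA_POSITIONS:
        if word[pos - 1]:
            s ^= pos
    for k, pos in enumerate(CHECK_POSITIONS):
        word[pos - 1] = (s >> k) & 1
    word[7] = sum(word[:7]) % 2  # overall parity bit
    return word


def syndrome(word: List[int]) -> int:
    """Syndrome generator: XOR of the H columns selected by the set bits."""
    s = 0
    for bit, column in zip(word, COLUMNS):
        if bit:
            s ^= column
    return s


def classify(word: List[int]) -> Tuple[str, Optional[int]]:
    """Syndrome decoding: return (condition, 0-based error position or None)."""
    s = syndrome(word)
    if s == 0:
        return ("no_error", None)
    if s in COLUMNS:
        return ("CE", COLUMNS.index(s))  # CE positioning
    return ("DUE", None)


def correct(word: List[int]) -> List[int]:
    """Error correction: flip the located bit when the condition is CE."""
    condition, position = classify(word)
    fixed = list(word)
    if condition == "CE" and position is not None:
        fixed[position] ^= 1
    return fixed
```

In this toy code a single flipped bit produces a syndrome equal to that bit's column, so it is located and corrected, whereas two flipped bits produce a non-zero syndrome with a cleared low bit, which matches no column and is reported as a DUE.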
In some embodiments, any one of the media particles sends a sub-error message in the event that a DUE occurs with any one of the media particles. In yet other embodiments, any one of the media particles transmits a sub-error message in the event that any one of the media particles experiences a CE or a DUE. It follows that in the case of CE, whether any one media particle transmits sub-error information is flexibly configurable. Illustratively, the media particles include a mode register (mode register) by which whether any one of the media particles transmits sub-error information in the event of a CE is configured.
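The reporting rule above amounts to a small decision function. In the sketch below, the boolean `report_ce` stands in for the mode register setting; its name and the enumeration are hypothetical.

```python
from enum import Enum


class ErrorCondition(Enum):
    NO_ERROR = 0
    CE = 1
    DUE = 2


def should_send_sub_error(condition: ErrorCondition, report_ce: bool) -> bool:
    """Decide whether the media particle sends sub-error information.

    report_ce models the mode register bit: when False, only a DUE is
    reported; when True, both a CE and a DUE are reported.
    """
    if condition is ErrorCondition.DUE:
        return True
    if condition is ErrorCondition.CE:
        return report_ce
    return False
```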
In case E2, any one of the media particles determines sub-error information based on the information associated with the media particle.
For example, if the value of the information related to the media particles exceeds the threshold, the sub-error information sent by any one media particle indicates that any one media particle is in error. Further, it may be determined that CE or UCE occurs in any one of the media particles according to the extent to which the value exceeds the threshold. If the value of the information related to the media particles does not exceed the threshold value, the sub-error information sent by any one of the media particles indicates that any one of the media particles has no error.
Illustratively, the information related to the media particles includes, but is not limited to: the temperature of the media particles, the voltage of the media particles, the current of the media particles, etc., are not limited herein.
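Case E2 can be sketched as a threshold comparison. The application does not give concrete thresholds or say how far a value must exceed the threshold before the error is treated as uncorrectable, so `threshold` and `severe_margin` below are illustrative assumptions, as is the function name.

```python
def classify_by_threshold(value: float,
                          threshold: float,
                          severe_margin: float) -> str:
    """Classify a monitored quantity (e.g. temperature, voltage, current).

    value <= threshold                      -> "no_error"
    exceeds threshold by <= severe_margin   -> "CE"
    exceeds threshold by more than that     -> "UCE"
    """
    if value <= threshold:
        return "no_error"
    if value - threshold <= severe_margin:
        return "CE"
    return "UCE"
```

For example, with an assumed temperature threshold of 85 and a margin of 10, a reading of 90 would be classified as a CE and a reading of 100 as a UCE.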
In case C2, the processor receives error information sent by the plurality of media particles.
Case C2 may be applied where the error determination system does not include a control interface circuit, such as case A1 and case A2 described above. In case C2, each media particle may determine its sub-error information, and the plurality of media particles send the plurality of sub-error information to the processor in parallel; the plurality of sub-error information is then the error information. Alternatively, the plurality of parallel sub-error information may be processed in a parallel-to-serial manner by other means to obtain the error information of case D1 or case D2, which is then sent to the processor. The parallel-to-serial processing may be performed by an element other than the control interface circuit, or by a designated one of the plurality of media particles. For the manner in which a media particle determines its sub-error information and the manner of parallel-to-serial processing, refer to the description of case C1; details are not repeated here.
From the above description of the case C1 and the case C2, the error information may be indirectly generated by a plurality of media particles, for example, the plurality of media particles generate a plurality of sub-error information, and the plurality of sub-error information is summarized as the error information. Alternatively, the error information may be generated directly by a plurality of media particles, such as a plurality of sub-error information, i.e., error information. In either case, therefore, the error information may be considered to be generated by a plurality of media particles, i.e., a plurality of media particles are used to generate the error information.
In step 502, the processor determines the media particle with the error based on the error information.
Since the error information obtained in step 501 indicates the media particle in which an error occurred among the plurality of media particles, the processor can determine that media particle directly from the error information and proceed to the subsequent processing, without determining it through calculation or another process that consumes processing resources. Because the processor does not consume processing resources on calculating which media particle is in error, it can devote more processing resources to the subsequent processing, thereby improving the reliability and usability of the error determination system. For example, the processor may generate a processing log after receiving the error information; the processing log contains no record of the processor consuming resources to calculate the media particle in which an error occurred, but does contain a record of the subsequent processing performed by the processor.
The process in which the processor consumes processing resources to calculate the media particle in which an error occurred is also called an error detection process, and the processing resources consumed by the error detection process grow much faster than linearly with the number of erroneous bits. For example, if detecting a one-bit error requires 1 part of processing resources, detecting a two-bit error requires 4 parts, and detecting a three-bit error requires 9 parts. Because the processor in the embodiment of the application does not need to perform the error detection process, considerable processing resources are saved compared with the related technology, in which error detection is required.
As previously described, the processor may perform subsequent processing for the erroneous media particles. For example, the processor may mark the media particles that have an error as unusable media particles, and subsequently the media particles that have an error are no longer used to ensure data security. For another example, the processor may re-write the correct data to the media particle with the error to ensure that the media particle with the error can continue to be used normally.
In an exemplary embodiment, the plurality of media particles may include service media particles and ECC media particles, and the number ratio of service media particles to ECC media particles may be 4:1, that is, 4 service media particles correspond to 1 ECC media particle. The number ratio is also referred to as a redundancy ratio; the redundancy ratio is not limited to this value and may be set according to actual conditions. Accordingly, the processor may employ different processing modes for different media particles. See case F1 and case F2 below.
In case F1, the media particle in which the error occurred is a service media particle, and the data read from it is service data. The processor may then correct the service data based on reference data read from an ECC media particle in which no error has occurred.
Referring to fig. 9, the on-die ECC process described in case E1 above is a first ECC, and what the processor performs here is a second ECC. Illustratively, the second ECC may be an enhanced ECC with a stronger error correction capability than the first ECC. For example, in the first ECC, 128 bits of data are error-corrected using 8 bits of data. In the second ECC, taking a redundancy ratio of 4:1 as an example, if 128 bits of service data are read from each of 4 service media particles and 128 bits of reference data are read from the 1 corresponding ECC media particle, each 128 bits of service data can be allocated 32 bits of reference data, so that the processor can use 32 bits of reference data to correct 128 bits of service data, thereby improving the error correction capability. It should be appreciated that the error correction capability of the second ECC may be further improved if the redundancy ratio is changed to 3:1, 2:1, or a ratio with even more redundancy.
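The allocation of reference data in the second ECC is simple arithmetic, sketched below. The function name is hypothetical, and the even split of reference bits across service media particles is an assumption that follows the 4:1 example; any remainder left by integer division is simply dropped in this sketch.

```python
def reference_bits_per_service_block(service_particles: int,
                                     ecc_particles: int,
                                     bits_per_read: int = 128) -> int:
    """Reference bits available to correct one 128-bit block of service data."""
    total_reference_bits = ecc_particles * bits_per_read
    return total_reference_bits // service_particles


FIRST_ECC_CHECK_BITS = 8  # on-die ECC: 8 bits protect 128 bits of data
second_ecc_bits = reference_bits_per_service_block(4, 1)  # redundancy ratio 4:1
```

With a 4:1 ratio this gives 32 reference bits per 128-bit block, four times the 8 bits used by the first ECC; with a 2:1 ratio the same calculation gives 64 bits.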
For example, the processor may complete a read in consecutive bursts, whether it is reading reference data or service data. Taking the reading of service data as an example, the processor can read in 16 consecutive bursts, reading 8 bits of service data in each burst, so that 128 bits of service data are read in total.
In case F2, the media particle in which the error occurred is an ECC media particle, and the data read from it is reference data used for correcting service data. The processor may then delete the reference data.
Since the ECC media particle is in error, the reference data stored in it is no longer suitable for correcting service data, and the reference data can be deleted.
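The burst read can be sketched as assembling 16 chunks of 8 bits into one 128-bit value. In the sketch below the media particle is modeled as a plain integer and the lowest-order chunk is assumed to arrive in the first burst; both choices, and the function name, are illustrative only.

```python
def read_in_bursts(stored: int, bursts: int = 16, bits_per_burst: int = 8) -> int:
    """Read bursts * bits_per_burst bits, one chunk per burst."""
    mask = (1 << bits_per_burst) - 1
    value = 0
    for burst in range(bursts):
        shift = burst * bits_per_burst
        chunk = (stored >> shift) & mask  # the bits delivered in this burst
        value |= chunk << shift
    return value


# A 128-bit pattern built from the bytes 0x00..0x0F.
stored_data = int.from_bytes(bytes(range(16)), "little")
service_data = read_in_bursts(stored_data)
```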
In summary, in the embodiment of the present application, the error information obtained by the processor is used to indicate the media particle with the error in the plurality of media particles, so that the processor may directly determine the media particle with the error according to the error information, without determining the media particle with the error through a process of consuming processing resources such as calculation. Thus, the processor can use more processing resources to process the media particles with errors, thereby improving the reliability and usability of the error determination system.
The embodiment of the application also provides a processor, which is used for acquiring error information, wherein the error information is used for indicating the medium particles with errors in the plurality of medium particles; the processor is further configured to determine a media particle that is in error based on the error information.
Illustratively, the processor is configured to receive error information sent by the control interface circuit over the line.
In an exemplary embodiment, the processor includes a main ECC element, the control interface circuit includes an RCD, and the lines include signal lines.
Illustratively, the processor includes a processing core, the control interface circuit includes an RCD, and the line includes a first bus.
Optionally, the error information includes a plurality of bits, the plurality of bits being in one-to-one correspondence with the plurality of media particles.
Illustratively, the error information includes an identification of the media particle in which the error occurred.
The embodiment of the application also provides a memory, which comprises a plurality of medium particles; the plurality of media particles is configured to generate error information indicating a media particle in which an error has occurred among the plurality of media particles.
Illustratively, the memory further includes control interface circuitry; the control interface circuit is used for sending error information to the processor through a line.
Illustratively, the processor includes a main ECC element, the control interface circuit includes an RCD, and the lines include signal lines.
Optionally, the processor comprises a processing core, the control interface circuit comprises an RCD, and the line comprises a first bus.
Illustratively, the control interface circuit is coupled to the plurality of media particles, respectively; the medium particles are used for transmitting a plurality of sub-error messages to the control interface circuit in parallel, and for any one of the medium particles, the sub-error message transmitted by the any one of the medium particles is used for indicating the error condition of the any one of the medium particles; the control interface circuit is also used for processing a plurality of sub-error information in a parallel-to-serial mode to obtain error information.
Illustratively, the error condition of any one media particle includes whether any one media particle is in error; alternatively, the error condition of any one media particle includes any one media particle not being in error, having a CE, or having a DUE.
Illustratively, any one of the media particles is configured to transmit a sub-error message in the event that a DUE occurs with any one of the media particles; or, any one of the media particles is used for sending the sub-error information in the case that CE or DUE occurs in any one of the media particles.
Optionally, the error information includes a plurality of bits, the plurality of bits being in one-to-one correspondence with the plurality of media particles.
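When the error information is a set of bits in one-to-one correspondence with the media particles, the processor can determine the failing particles by testing each bit. The sketch below assumes that bit i corresponds to media particle i and that a value of 1 means "error"; both conventions, and the function name, are assumptions for illustration.

```python
from typing import List


def faulty_particles(error_info: int, num_particles: int) -> List[int]:
    """Return the indices of media particles whose bit is set.

    Assumes bit i of error_info maps to media particle i, 1 = error.
    """
    return [i for i in range(num_particles) if (error_info >> i) & 1]
```

For example, with eight media particles, `faulty_particles(0b00100100, 8)` yields particles 2 and 5.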
Illustratively, the error information includes an identification of the media particle in which the error occurred.
In some embodiments, the embodiments of this application further provide an error determination system that includes a processor and a memory, where the processor is any one of the exemplary processors described above, and the memory is any one of the exemplary memories described above.
In other implementations, embodiments of the present application also provide an error determination system that includes a processor and a plurality of media particles; the plurality of media particles is used for generating error information, and the error information is used for indicating the media particles with errors in the plurality of media particles; the processor is used for acquiring error information; the processor is further configured to determine a media particle that is in error based on the error information.
Illustratively, the system further comprises a control interface circuit, the processor being connected to the control interface circuit by a line; the control interface circuit is used for sending error information to the processor through a line; the processor is used for receiving error information sent by the control interface circuit through a line.
Optionally, the processor comprises a processing core, the control interface circuit comprises a CPLD, and the line comprises a second bus.
Illustratively, the control interface circuit is connected to each of the plurality of media particles. The plurality of media particles are configured to send a plurality of pieces of sub-error information to the control interface circuit in parallel; the sub-error information sent by any one media particle indicates the error condition of that media particle. The control interface circuit is further configured to process the plurality of pieces of sub-error information in a parallel-to-serial manner to obtain the error information.
In an exemplary embodiment, the error condition of any one media particle includes whether that media particle is in error; alternatively, the error condition of any one media particle includes one of the following: the media particle is not in error, a CE has occurred, or a DUE has occurred.
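When the error condition distinguishes three cases (no error, CE, DUE), a single bit per media particle is no longer enough. One possible encoding, shown here purely as an assumption, uses a 2-bit field per particle; the codes `NO_ERROR`, `CE`, and `DUE` and the field layout are hypothetical and are not defined by this application.

```python
from typing import List

NO_ERROR, CE, DUE = 0, 1, 2   # hypothetical 2-bit status codes


def pack_states(states: List[int]) -> int:
    """Pack per-particle states into one word, 2 bits per particle."""
    word = 0
    for i, state in enumerate(states):
        if state not in (NO_ERROR, CE, DUE):
            raise ValueError(f"invalid state {state} for particle {i}")
        word |= state << (2 * i)      # particle i occupies bits 2i and 2i+1
    return word


def unpack_states(word: int, num_particles: int) -> List[int]:
    """Recover the per-particle states from the packed word."""
    return [(word >> (2 * i)) & 0b11 for i in range(num_particles)]
```

With this layout the processor can tell not only which media particle failed but also whether the failure was correctable.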
Illustratively, any one of the media particles is configured to send the sub-error information when a DUE occurs in that media particle; or, any one of the media particles is configured to send the sub-error information when a CE or a DUE occurs in that media particle.
Optionally, the error information includes a plurality of bits, the plurality of bits being in one-to-one correspondence with the plurality of media particles.
Illustratively, the error information includes an identification of the media particle in which the error occurred.
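When the error information carries identifications rather than a bitmap, the processor only needs a lookup from identification to media particle. The sketch below assumes integer identifications and a hypothetical `topology` table of location labels; neither the format of the identification nor the label scheme is specified by this application.

```python
from typing import Dict, List


def locate_faulty(error_ids: List[int], topology: Dict[int, str]) -> List[str]:
    """Map each reported media-particle ID to a location label."""
    unknown = [pid for pid in error_ids if pid not in topology]
    if unknown:
        raise ValueError(f"unknown media particle IDs: {unknown}")
    return [topology[pid] for pid in error_ids]
```

A usage example under the same assumptions: with `topology = {0: "rank0/chip0", 2: "rank0/chip2"}`, calling `locate_faulty([2], topology)` returns the label of particle 2.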
The terms "first," "second," and the like in this application are used to distinguish between identical or similar items that have substantially the same function and effect. It should be understood that there is no logical or chronological dependency among "first," "second," and "nth," and that these terms do not limit the number of items or the order of execution. It should further be understood that, although the description uses the terms first, second, etc. to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another element.
It should also be understood that, in the embodiments of the present application, the sequence numbers of the processes do not imply an execution order. The execution order of each process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present application.
It is to be understood that the terminology used in the description of the various examples described herein is for the purpose of describing particular examples only and is not intended to be limiting. As used in the description of the various described examples and in the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "if" may be interpreted to mean "when" or "upon," or "in response to determining" or "in response to detecting." Similarly, the phrase "if it is determined" or "if [a stated condition or event] is detected" may be interpreted to mean "upon determining," "in response to determining," "upon detecting [the stated condition or event]," or "in response to detecting [the stated condition or event]," depending on the context.
The foregoing description is illustrative only and is not intended to limit the present application. Any modification, equivalent replacement, or improvement made within the principles of the present application is intended to be included within the scope of the present application.

Claims (34)

1. An error determination method, the method being applied to a processor comprised by an error determination system, the error determination system further comprising a plurality of media particles, the method comprising:
the processor obtains error information, wherein the error information is used for indicating the media particle, among the plurality of media particles, in which an error has occurred;
the processor determines, according to the error information, the media particle in which the error has occurred.
2. The method of claim 1, wherein the error determination system further comprises a control interface circuit, the processor being connected to the control interface circuit through a line;
the processor obtaining error information includes:
the processor receives the error information sent by the control interface circuit through the line.
3. The method of claim 2, wherein the processor comprises a main error correction code ECC element, the control interface circuit comprises a register clock driver RCD, and the lines comprise signal lines.
4. The method of claim 2, wherein the processor comprises a processing core, the control interface circuit comprises a register clock driver RCD, and the line comprises a first bus.
5. The method of claim 2, wherein the processor comprises a processing core, the control interface circuit comprises a complex programmable logic device CPLD, and the line comprises a second bus.
6. The method according to any one of claims 2-5, wherein the control interface circuit is connected to the plurality of media particles, the error information is obtained by processing a plurality of sub-error information by the control interface circuit in a parallel-to-serial manner, and the plurality of sub-error information is sent to the control interface circuit by the plurality of media particles in parallel;
for any one of the plurality of media particles, the sub-error information sent by the any one of the plurality of media particles is used to indicate an error condition of the any one of the plurality of media particles.
7. The method of claim 6, wherein the error condition of any one of the media particles comprises whether the any one of the media particles is in error;
or the error condition of any one of the media particles comprises that the any one of the media particles is not in error, has a correctable error CE or has a detectable uncorrectable error DUE.
8. The method of claim 7, wherein the sub-error information is sent by the any one of the media particles in the case that the DUE occurs in the any one of the media particles;
or the sub-error information is sent by any one of the media particles in the case that the CE or the DUE occurs in any one of the media particles.
9. The method of any of claims 1-8, wherein the error information comprises a plurality of bits, the plurality of bits corresponding one-to-one to the plurality of media particles.
10. The method according to any of claims 1-8, wherein the error information comprises an identification of the media particle in which the error occurred.
11. A processor for acquiring error information indicating a media particle in which an error has occurred among a plurality of media particles;
the processor is further configured to determine the media particle with the error according to the error information.
12. The processor of claim 11, wherein the processor is configured to receive the error information sent by a control interface circuit through a line.
13. The processor of claim 12, wherein the processor comprises a main error correction code ECC element, the control interface circuit comprises a register clock driver RCD, and the lines comprise signal lines.
14. The processor of claim 12, wherein the processor comprises a processing core, the control interface circuit comprises a register clock driver RCD, and the line comprises a first bus.
15. The processor of any of claims 11-14, wherein the error information comprises a plurality of bits, the plurality of bits corresponding one-to-one to the plurality of media particles.
16. The processor of any of claims 11-14, wherein the error information includes an identification of the media particle in which the error occurred.
17. A memory, wherein the memory comprises a plurality of media particles;
the plurality of media particles is configured to generate error information indicating a media particle in which an error has occurred among the plurality of media particles.
18. The memory of claim 17, wherein the memory further comprises control interface circuitry;
the control interface circuit is used for sending the error information to the processor through a line.
19. The memory of claim 18 wherein the processor includes a main error correction code ECC element, the control interface circuit includes a register clock driver RCD, and the lines include signal lines.
20. The memory of claim 18 wherein the processor includes a processing core, the control interface circuit includes a register clock driver RCD, and the line includes a first bus.
21. The memory according to any one of claims 18-20, wherein the control interface circuit is connected to each of the plurality of media particles;
the plurality of media particles are used for transmitting a plurality of sub-error information to the control interface circuit in parallel, and for any one of the plurality of media particles, the sub-error information transmitted by the any one of the plurality of media particles is used for indicating the error condition of the any one of the plurality of media particles;
the control interface circuit is also used for processing the plurality of sub-error information in a parallel-to-serial mode to obtain the error information.
22. The memory of claim 21, wherein the error condition of any one of the media particles includes whether the any one of the media particles is in error;
or the error condition of any one of the media particles comprises that the any one of the media particles is not in error, has a correctable error CE or has a detectable uncorrectable error DUE.
23. The memory of claim 22, wherein the any one of the media particles is configured to send the sub-error information in the case that the DUE occurs in the any one of the media particles;
or the any one of the media particles is configured to send the sub-error information in the case that the CE or the DUE occurs in the any one of the media particles.
24. The memory of any one of claims 17-23, wherein the error information comprises a plurality of bits, the plurality of bits corresponding one-to-one to the plurality of media particles.
25. The memory of any one of claims 17-23, wherein the error information includes an identification of the media particle in which the error occurred.
26. An error determination system comprising a processor and a memory, the processor being the processor of any one of claims 11-16 and the memory being the memory of any one of claims 17-25.
27. An error determination system, the system comprising a processor and a plurality of media particles;
the plurality of media particles is used for generating error information, and the error information is used for indicating the media particles with errors in the plurality of media particles;
the processor is used for acquiring the error information;
the processor is further configured to determine the media particle with the error according to the error information.
28. The system of claim 27, further comprising a control interface circuit, wherein the processor is connected to the control interface circuit by a line;
the control interface circuit is used for sending the error information to the processor through the line;
the processor is used for receiving the error information sent by the control interface circuit through the line.
29. The system of claim 28, wherein the processor comprises a processing core, the control interface circuit comprises a complex programmable logic device CPLD, and the line comprises a second bus.
30. The system of claim 28 or 29, wherein the control interface circuit is connected to each of the plurality of media particles;
the plurality of media particles are used for transmitting a plurality of sub-error information to the control interface circuit in parallel, and for any one of the plurality of media particles, the sub-error information transmitted by the any one of the plurality of media particles is used for indicating the error condition of the any one of the plurality of media particles;
the control interface circuit is also used for processing the plurality of sub-error information in a parallel-to-serial mode to obtain the error information.
31. The system of claim 30, wherein the error condition of any one of the media particles comprises whether the any one of the media particles is in error;
or the error condition of any one of the media particles comprises that the any one of the media particles is not in error, has a correctable error CE or has a detectable uncorrectable error DUE.
32. The system of claim 31, wherein the any one of the media particles is configured to send the sub-error information in the case that the DUE occurs in the any one of the media particles;
or the any one of the media particles is configured to send the sub-error information in the case that the CE or the DUE occurs in the any one of the media particles.
33. The system of any of claims 27-32, wherein the error information comprises a plurality of bits, the plurality of bits corresponding one-to-one to the plurality of media particles.
34. The system of any of claims 27-32, wherein the error information includes an identification of the media particle in which the error occurred.
CN202211204450.3A 2022-07-21 2022-09-29 Error determination method and system, processor and memory Pending CN117472644A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/103220 WO2024016971A1 (en) 2022-07-21 2023-06-28 Error determination method and system, processor, and memory

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210864871 2022-07-21
CN2022108648712 2022-07-21

Publications (1)

Publication Number Publication Date
CN117472644A true CN117472644A (en) 2024-01-30

Family

ID=89629898

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211204450.3A Pending CN117472644A (en) 2022-07-21 2022-09-29 Error determination method and system, processor and memory

Country Status (1)

Country Link
CN (1) CN117472644A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination