CN110109619B - Closed-loop multistage storage system resistant to single event upset effect and implementation method - Google Patents

Info

Publication number
CN110109619B
CN110109619B CN201910340318.7A CN201910340318A CN110109619B CN 110109619 B CN110109619 B CN 110109619B CN 201910340318 A CN201910340318 A CN 201910340318A CN 110109619 B CN110109619 B CN 110109619B
Authority
CN
China
Prior art keywords
cpu
fpga
memory
instruction
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910340318.7A
Other languages
Chinese (zh)
Other versions
CN110109619A (en)
Inventor
郗洪柱
郑义
钟亮
郑林
蒙瑰
刘蓓
徐暠
周建发
史青
彭泳卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Research Institute of Telemetry
Aerospace Long March Launch Vehicle Technology Co Ltd
Original Assignee
Beijing Research Institute of Telemetry
Aerospace Long March Launch Vehicle Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Research Institute of Telemetry and Aerospace Long March Launch Vehicle Technology Co Ltd
Priority to CN201910340318.7A
Publication of CN110109619A
Application granted
Publication of CN110109619B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error
    • G06F11/1032 Simple parity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Detection And Correction Of Errors (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The invention discloses a closed-loop, multi-stage storage system resistant to the single event upset effect, comprising a CPU, an FPGA and a memory. The CPU sends a read-memory or write-memory instruction to the FPGA, and the FPGA parses the received instruction. After successful parsing, the FPGA either reads data from the corresponding memory address and writes it to the CPU, or reads data from the corresponding address space of the CPU and writes it to the memory. The invention also discloses a method for implementing the storage system. Its advantages are that the error-correction paths of the data inflow and outflow channels among the CPU, the FPGA and the memory form a closed loop; the data operation path is short and the operation speed is high; the FPGA program itself is protected by triple modular redundancy; errors can still be corrected even if all three copies of the data written to the memory are in error; and the single event upset resistance of the storage system is effectively improved.

Description

Closed-loop multistage storage system resistant to single event upset effect and implementation method
Technical Field
The invention belongs to the technical field of resistance to environmental radiation, and in particular relates to an FPGA-based closed-loop multi-stage storage system resistant to the single event upset effect and a method for implementing it.
Background
When high-energy particles strike a sensitive region of a semiconductor device in space, the logic state of the device often flips; this phenomenon is called a single event upset. As the integration scale of semiconductor chips increases, the probability of a single event upset in a complex electromagnetic environment keeps growing. When the logic state of a device flips, the operating state of the system changes, which can lead to disastrous consequences such as casualties or mission failure.
Current methods for hardening memory against single event effects fall into two categories: process and design. Special radiation-hardening processes suffer from high cost and a limited range of measures. Design-level measures are more flexible, for example codes that correct one error and detect two, or that correct multiple errors and detect two. However, these methods often rely on a single error-correction mode, cover only part of the system, and leave the error-correction path open-loop, so their resistance to single event upsets leaves room for improvement.
Disclosure of Invention
The technical problem solved by the invention is to overcome the shortcomings of the prior art by providing an FPGA-based closed-loop multi-stage storage system resistant to the single event upset effect, together with a method for implementing it.
The technical solution of the invention is as follows:
A closed-loop multi-stage storage system resistant to the single event upset effect comprises a CPU, an FPGA and a memory. The CPU sends a read-memory or write-memory instruction to the FPGA; the FPGA parses the received instruction and, after successful parsing, either reads data from the corresponding memory address and writes it to the CPU, or reads data from the corresponding address space of the CPU and writes it to the memory.
A method for implementing the closed-loop multi-stage storage system resistant to the single event upset effect comprises the following steps:
(1) three non-contiguous module address spaces A, B and C are allocated in the storage space of the memory for each functional area;
(2) the CPU sends an instruction to the FPGA, the FPGA performs an odd parity check on the received instruction, and the method proceeds to step (3);
(3) if the odd parity check in step (2) passes, the FPGA parses the received instruction and performs the corresponding memory read or write according to the parsing result; if the check fails, the FPGA sends an odd-parity-failure flag to the CPU, the CPU resends the instruction to the FPGA, and the method returns to step (2); if the number of consecutive parity failures exceeds a specified number, the CPU reloads the FPGA and then restarts it;
(4) when the FPGA performs a memory write, it executes the flow "write three modules, read back and compare, rewrite the faulty module";
(5) if the result of step (4) is inconsistent with the data the FPGA read from the CPU, the FPGA reads the data from the CPU again and repeats step (4); if this is repeated consecutively more than a specified number of times, the CPU reloads the FPGA and restarts it;
(6) when the FPGA performs a memory read, it executes the flow "read three modules, compare, rewrite the faulty module";
(7) if the CRC code computed locally by the CPU over the data obtained in step (6) is inconsistent with the received CRC code, the CPU resends the read instruction and the method returns to step (6); if this is repeated consecutively more than a specified number of times, the CPU reloads the FPGA and restarts it.
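The parity handshake of steps (2) and (3) can be illustrated with a small software model. This is only a sketch under stated assumptions: the patent describes FPGA logic, not Python; the parity is assumed to cover the whole 32-bit word; and `send_once`, `send_with_retry` and the default limit of 3 are hypothetical names and values chosen for illustration.

```python
def odd_parity_ok(word: int) -> bool:
    """A 32-bit word passes odd parity when it holds an odd number of 1 bits."""
    return bin(word & 0xFFFFFFFF).count("1") % 2 == 1


def send_with_retry(send_once, word: int, limit: int = 3):
    """Model of steps (2)-(3): resend until parity passes or the limit is exceeded.

    send_once(word) is a hypothetical channel function that returns the word
    as the FPGA received it (possibly corrupted in transit).
    """
    failures = 0  # consecutive odd-parity failures
    while True:
        received = send_once(word)
        if odd_parity_ok(received):
            return "accepted", received
        failures += 1
        if failures > limit:
            # The source has the CPU reload and restart the FPGA at this point.
            return "reload_fpga", None
```

A channel that corrupts the first transmission and then delivers the word intact ends in `"accepted"`; a channel that always corrupts the word ends in `"reload_fpga"` once the limit is exceeded.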
In step (3), the FPGA parses the received instruction as follows:
(3.1) the FPGA examines the header of the received instruction to determine whether the instruction is valid; if it is valid, the flow goes to step (3.2); if it is invalid, the flow goes to step (3.3);
(3.2) the invalid-instruction register is cleared and the instruction content is parsed to determine whether it is a read-memory or a write-memory instruction. For a read-memory instruction, the FPGA obtains the memory start address contained in the instruction and reads the number of bytes specified by the protocol between the CPU and the FPGA. For a write-memory instruction, the FPGA obtains the number of bytes and the packet number contained in the instruction and compares the packet number with the FPGA's packet-number counter. If they match, the FPGA reads the corresponding bytes from the CPU, writes the specified number of bytes starting at the address following the previously written memory address space, and increments its packet-number counter by 1; the address at which the first write starts is initialized according to the three-module address allocation, and the initial value of the packet-number counter is 0. If they do not match, the FPGA sends a packet-number-mismatch flag to the CPU, and the CPU resends the instruction to the FPGA;
(3.3) if the number of consecutive invalid instructions has not reached the limit, the FPGA increments the invalid-instruction register by 1, reads the instruction sent by the CPU again, and returns to step (3.1); once the number of consecutive invalid instructions reaches the limit, the FPGA writes an error flag to the CPU and waits to be reloaded.
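The bookkeeping in steps (3.1) to (3.3) amounts to two small counters. The sketch below models them in Python purely for illustration; the class name `InstructionGate`, the returned status strings and the default limit are assumptions, not part of the patent.

```python
class InstructionGate:
    """Toy model of the invalid-instruction register and packet-number counter."""

    def __init__(self, invalid_limit: int = 3):
        self.invalid_limit = invalid_limit  # hypothetical limit value
        self.invalid_count = 0              # invalid-instruction register
        self.packet_counter = 0             # packet-number counter, starts at 0

    def on_header(self, header_valid: bool) -> str:
        if header_valid:
            self.invalid_count = 0          # (3.2) clear the register
            return "parse"
        if self.invalid_count >= self.invalid_limit:
            return "error_flag"             # (3.3) limit reached, wait for reload
        self.invalid_count += 1             # (3.3) count and read again
        return "reread"

    def on_write_packet(self, packet_number: int) -> str:
        if packet_number != self.packet_counter:
            return "packet_mismatch"        # CPU is asked to resend
        self.packet_counter += 1
        return "write"
```

With a limit of 3, three consecutive invalid headers each yield `"reread"` and the fourth yields `"error_flag"`; any valid header in between clears the count.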
In step (4), the FPGA executes the flow "write three modules, read back and compare, rewrite the faulty module" as follows:
(4.1) the FPGA reads the data once from the memory space designated by the CPU and writes it in turn into the three address spaces A, B and C of the memory;
(4.2) immediately after the write completes, the FPGA reads the data back from the three addresses in turn and computes a bitwise two-out-of-three vote;
(4.3) the bitwise two-out-of-three result is compared with the original data read from the CPU; if they differ, the consecutive-inconsistency register i is incremented by 1 and the flow goes to step (4.4); if they match, the flow goes to step (4.5);
(4.4) if the count in register i has reached the specified number, an error flag is written to the CPU and the FPGA waits to be reloaded; otherwise the FPGA receives the write-memory instruction from the CPU again and repeats the process;
(4.5) register i is cleared, and the bitwise two-out-of-three result is compared with the data read back from each of the three addresses A, B and C; if any copy differs, the two-out-of-three result is written to the address of the inconsistent module, the two-out-of-three inconsistency flag register j is incremented by 1, and the flow goes to step (4.6); if all copies match, the flow goes to step (4.7);
(4.6) if the count in register j has reached the specified number, an error flag is written to the CPU and the FPGA waits to be reloaded; otherwise the flow goes to step (4.7);
(4.7) the to-be-written byte count register is incremented by 1; if all bytes have been written, the FPGA writes an execution-success flag and the number of two-out-of-three inconsistencies of this write operation to the corresponding address of the CPU; otherwise the write operation continues.
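The write flow of steps (4.1) to (4.7) can be simulated in software to make the roles of registers i and j concrete. This is a minimal sketch, not the patented FPGA logic: `FlakyMemory` is a hypothetical fault-injecting memory, the limit of 3 is an assumed value, and the retry simply rewrites the same byte where the source has the FPGA receive the write instruction from the CPU again.

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority vote."""
    return (a & b) | (a & c) | (b & c)


class FlakyMemory:
    """Toy memory: the first write to any address in `upset` gets bit 0 flipped."""

    def __init__(self, size: int, upset=()):
        self.cells = bytearray(size)
        self.upset = set(upset)

    def __setitem__(self, addr: int, value: int) -> None:
        if addr in self.upset:
            self.upset.discard(addr)
            value ^= 0x01
        self.cells[addr] = value

    def __getitem__(self, addr: int) -> int:
        return self.cells[addr]


def tmr_write(memory, bases, data: bytes, limit: int = 3):
    """Write `data` to three copies; return (status, two-out-of-three mismatch count)."""
    i = 0  # consecutive mismatches between the voted value and the CPU data
    j = 0  # bytes for which at least one copy had to be rewritten
    for offset, value in enumerate(data):
        while True:
            for base in bases:                                   # (4.1)
                memory[base + offset] = value
            copies = [memory[base + offset] for base in bases]   # (4.2)
            voted = vote(*copies)
            if voted == value:                                   # (4.3)
                break
            i += 1
            if i >= limit:                                       # (4.4)
                return "error", j
        i = 0                                                    # (4.5)
        faulty = [b for b, c in zip(bases, copies) if c != voted]
        for base in faulty:
            memory[base + offset] = voted
        if faulty:
            j += 1
            if j >= limit:                                       # (4.6)
                return "error", j
    return "success", j                                          # (4.7)
```

With a single upset in copy B, the vote still matches the CPU data, copy B is rewritten, and the call returns `("success", 1)`. With upsets in two copies at the same bit, the vote disagrees with the CPU data, so the byte is written again.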
In step (6), the FPGA executes the flow "read three modules, compare, rewrite the faulty module" as follows:
(5.1) the FPGA reads the data once from the corresponding start address of each of the three module address spaces A, B and C of the memory;
(5.2) a bitwise two-out-of-three vote is computed over the data read from the three module addresses, and the result is compared with each of the three copies; if any copy differs, the flow goes to step (5.3); if all copies match, the two-out-of-three result is written to the corresponding address of the CPU, the to-be-read byte count register is incremented by 1, and the flow goes to step (5.5);
(5.3) the two-out-of-three result is written to the address of the inconsistent module and then to the corresponding address of the CPU, the two-out-of-three inconsistency flag register k is incremented by 1, and the flow goes to step (5.4);
(5.4) if the count in register k has reached the specified number, an error flag is written to the CPU and the FPGA waits to be reloaded; otherwise the FPGA receives the read-memory instruction from the CPU again and repeats the process;
(5.5) if the required number of bytes has not yet been read, the read operation continues; once it has, the FPGA writes an execution-success flag and the number of two-out-of-three inconsistencies of this read operation to the corresponding address of the CPU.
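The read flow can be sketched the same way. The Python model below is illustrative only: the memory is a plain `bytearray`, the limit of 3 is assumed, and after repairing a faulty copy the loop simply carries on with the next byte, which is one reading of the flow (the detailed description of FIG. 5 also continues reading after a repair).

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority vote."""
    return (a & b) | (a & c) | (b & c)


def tmr_read(memory: bytearray, bases, length: int, limit: int = 3):
    """Read `length` bytes from three copies; return (status, data, k)."""
    out = bytearray()
    k = 0  # two-out-of-three inconsistency flag register
    for offset in range(length):
        copies = [memory[base + offset] for base in bases]    # (5.1)
        voted = vote(*copies)                                 # (5.2)
        faulty = [b for b, c in zip(bases, copies) if c != voted]
        for base in faulty:                                   # (5.3) repair in place
            memory[base + offset] = voted
        if faulty:
            k += 1
            if k >= limit:                                    # (5.4)
                return "error", bytes(out), k
        out.append(voted)                                     # data passed to the CPU
    return "success", bytes(out), k                           # (5.5)


# Usage: three copies of two bytes, with one bit flipped in copy B.
memory = bytearray(64)
for base in (0, 20, 40):
    memory[base:base + 2] = b"\x12\x34"
memory[21] ^= 0x80
result = tmr_read(memory, (0, 20, 40), 2)
```

Here `result` carries the corrected data and a mismatch count of 1, and the flipped bit in copy B has been scrubbed back to the voted value.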
In steps (4.7) and (5.5), when the FPGA writes the execution-success flag to the corresponding address of the CPU, it also generates a local CRC check code using CRC-16 and writes it to the corresponding address of the CPU.
After receiving the data, the CPU generates its own local CRC check code and compares it with the check code from the FPGA. If they differ, the CPU checks whether the number of consecutive check mismatches has reached the limit: if not, the CPU check-mismatch register is incremented by 1 and the memory read is executed again; if so, the CPU makes the FPGA reload and restart.
If the check code generated by the CPU matches the one from the FPGA, the CPU check-mismatch register is cleared and execution of the read-memory instruction is complete.
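The CPU-side CRC comparison can be sketched as follows. The source names only CRC-16 with polynomial 8005; the initial value, bit order and final XOR are not stated, so this sketch assumes the plain non-reflected variant with a zero initial value. `read_once`, `cpu_read_with_crc` and the limit are hypothetical.

```python
def crc16_8005(data: bytes, init: int = 0x0000) -> int:
    """Bitwise CRC-16, polynomial 0x8005, MSB first, no reflection (assumed variant)."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def cpu_read_with_crc(read_once, limit: int = 3):
    """Repeat the read until the local CRC matches the FPGA's, or give up.

    read_once() is a hypothetical function returning (data, crc_from_fpga).
    """
    mismatches = 0  # CPU check-mismatch register
    while True:
        data, crc_from_fpga = read_once()
        if crc16_8005(data) == crc_from_fpga:
            return "done", data          # register would be cleared here
        if mismatches >= limit:
            return "reload_fpga", None   # CPU makes the FPGA reload and restart
        mismatches += 1
```

Because the CRC is computed independently on both sides, a corruption of the data on the way from the FPGA to the CPU shows up as a mismatch and triggers another read.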
In step (1), the three non-contiguous module address spaces A, B and C are allocated in the storage space of the memory for each functional area as follows:
(8.1) determine the required address space size;
(8.2) determine whether the memory has three or more chip-select regions; if so, define the three module addresses A, B and C of the same functional area in three different chip-select regions, with the module address range in each chip-select region starting at address 0;
when there is only one chip-select region, let the required address space be x and the address offsets be a and b, where a and b are both much smaller than x and a ≠ b; the module address space A of the functional area is then 0 to x, module address space B is (x + a) to (2x + a), and module address space C is (2x + a + b) to (3x + a + b);
when there are two chip-select regions, two module address spaces are allocated in one chip-select region and the third in the other. The two spaces sharing a chip-select region are allocated as follows: let the required address space be y and the address offset be c, with c much smaller than y; the first module address space in that region is 0 to y and the second is (y + c) to (2y + c).
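The three allocation cases can be written as a small function. This is a sketch for illustration only: the function name, the `(chip_select, start, end)` tuple with an exclusive end, and the default offsets are assumptions; in the two-region case the source does not say where the third space starts, so address 0 is assumed.

```python
def allocate_tmr_regions(size: int, chip_selects: int, a: int = 16, b: int = 32):
    """Return {"A": ..., "B": ..., "C": ...} as (chip_select, start, end) tuples.

    `size` is the required address space (x or y in the text); `a` and `b` are
    the small address offsets (a doubles as offset c in the two-region case).
    """
    if chip_selects >= 3:
        # One copy per chip-select region, each starting at address 0.
        return {"A": (0, 0, size), "B": (1, 0, size), "C": (2, 0, size)}
    if chip_selects == 2:
        # Two copies share region 0, separated by the offset; third in region 1.
        return {
            "A": (0, 0, size),
            "B": (0, size + a, 2 * size + a),
            "C": (1, 0, size),  # start address assumed, not given in the source
        }
    if chip_selects == 1:
        if a == b:
            raise ValueError("offsets a and b must differ")
        return {
            "A": (0, 0, size),
            "B": (0, size + a, 2 * size + a),
            "C": (0, 2 * size + a + b, 3 * size + a + b),
        }
    raise ValueError("at least one chip-select region is required")
```

For example, with `size=1024`, one chip-select region and offsets 16 and 32, copy B occupies 1040 to 2064 and copy C occupies 2096 to 3120, so the copies never touch and their gaps differ.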
In step (2), each instruction sent by the CPU to the FPGA is 32 bits wide and stored in big-endian order: bit positions 0 to 6 hold the instruction header flag, which indicates the instruction type, and bit position 7 holds the odd parity bit of the instruction.
For a write-memory instruction, bit positions 8 to 19 hold the packet number of the data to be written to the memory; with a 1:1 mapping, this field can represent packet numbers 0 to 4095. Bit positions 20 to 31 hold the number of bytes to be read from the CPU, which is also the number of bytes to be written to the memory; with a 1:1 mapping, this field can represent 1 to 4095 bytes.
For a read-memory instruction, bit positions 9 to 31 hold the start address of the memory to be read; with a 1:1 mapping, the theoretical maximum accessible memory address is $2^{23} - 1$.
Compared with the prior art, the invention has the following advantages:
(1) The error-correction paths of the data inflow and outflow channels among the CPU, the FPGA and the memory form a closed loop, which raises the probability of detecting errors and makes the corrective measures more effective.
(2) The parallelism of the FPGA is fully exploited: the data operation path is short, the operation speed is high, and the FPGA program itself is protected by triple modular redundancy, which effectively improves the single event upset resistance of the storage system.
(3) The invention combines first-level error correction of the data flow between the CPU and the FPGA with second-level error correction of the data flow between the FPGA and the memory. The error-correction modes are comprehensive and varied, so errors can be corrected even if all three copies of the data written to the memory are in error, giving strong resistance to single event upsets.
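One way to read the claim that errors remain correctable when all three copies are wrong is that the vote is taken bit by bit: as long as no bit position is corrupted in more than one copy, the majority at every position is still the original bit. The worked example below illustrates this; the data values are made up.

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority vote."""
    return (a & b) | (a & c) | (b & c)


original = 0b10110010

# Every copy is wrong, but each has a different bit flipped.
copies = (
    original ^ 0b00000001,
    original ^ 0b00010000,
    original ^ 0b10000000,
)
recovered = vote(*copies)

# Counter-example: the same bit flipped in two copies defeats the vote.
overlapping = (original ^ 0b00000001, original ^ 0b00000001, original)
wrong = vote(*overlapping)
```

In the first case `recovered` equals `original`. In the second case `wrong` differs from `original`; during a write, this is the situation caught by comparing the voted result with the data read from the CPU, which triggers another write attempt.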
Drawings
FIG. 1 is a block diagram of the system of the present invention;
FIG. 2 is a flow chart of the working process of the closed-loop multi-stage single event upset effect resistant storage system based on the FPGA;
FIG. 3 is a first level error correction flow chart of the data flow path between the CPU and the FPGA of the present invention;
FIG. 4 is a flow chart of the second-level error correction for FPGA memory writes according to the present invention;
FIG. 5 is a flow chart of the second-level error correction for FPGA memory reads according to the present invention.
Detailed Description
The invention will be further explained with reference to the drawings.
As shown in FIG. 1, the invention provides an FPGA-based closed-loop multi-stage storage system resistant to the single event upset effect, comprising a CPU, an FPGA and a memory; the CPU is not directly connected to the memory. The CPU sends a read-memory or write-memory instruction to the FPGA. After successfully parsing the instruction, the FPGA either reads data from the corresponding memory address and writes it to the CPU, or reads data from the corresponding address space of the CPU and writes it to the memory.
FIG. 2 is the working flow chart of the FPGA-based closed-loop multi-stage storage system resistant to the single event upset effect. Before the system runs, three non-contiguous module address spaces A, B and C are allocated in the storage space of the memory for each functional area. Once the system is running, the CPU decides when to start a data read or write flow. After the CPU sends an instruction to the FPGA, the FPGA first examines the instruction header to determine whether the instruction is valid. When the FPGA determines that the instruction is a read-memory instruction, it first clears the invalid-instruction register, parses the instruction content to obtain the memory start address to be read, and clears the to-be-read byte count register. The FPGA then reads the data once from the corresponding start address of each of the three module spaces (A, B and C) of the memory, according to the number of bytes specified by the protocol between the CPU and the FPGA. A bitwise two-out-of-three vote is computed over the data from the three addresses, and the result is compared with each of the three copies. If any copy differs, the two-out-of-three inconsistency flag register is incremented by 1, the two-out-of-three result is written to the address of the inconsistent module, and the result is written to the corresponding address of the CPU.
After this step, if the required number of bytes has not yet been read, the read operation continues; otherwise the FPGA generates a local CRC check code using CRC-16 (polynomial 0x8005) and writes it to the corresponding address of the CPU together with the execution-success flag and the number of two-out-of-three inconsistencies.
As shown in FIG. 2, after receiving an instruction from the CPU, the FPGA examines the instruction header. If the instruction is a write-memory instruction, the FPGA first clears the invalid-instruction register and parses the instruction content to obtain the number of bytes to be written and the packet number. It then checks whether the packet number matches the FPGA's packet-number counter; if it does, the FPGA clears the packet-number-mismatch flag register and the to-be-written byte count register. The FPGA then reads the data once from the memory space designated by the CPU and writes it in turn to the three module addresses (A, B and C) of the memory that correspond to the instruction. The write starts at the address following the previously written memory address space, and the FPGA's packet-number counter is incremented by 1. The address at which the first write starts is initialized according to the three-module address allocation, and the initial value of the packet-number counter is 0. If the packet number does not match the FPGA's packet-number counter, the FPGA checks whether the number of consecutive mismatches has reached the limit: if not, it increments the packet-number-mismatch flag register by 1 and receives the instruction from the CPU again; if so, it writes an error flag to the CPU and waits to be reloaded.
Immediately after the write completes, the data at the three module addresses is read back in turn and a bitwise two-out-of-three vote is computed. The result is first compared with the original data read from the CPU. If they differ, the FPGA checks whether the number of consecutive inconsistencies of this kind has reached the specified number: if so, it writes an error flag to the CPU and waits to be reloaded; otherwise it increments the consecutive-inconsistency register by 1, receives the instruction from the CPU again, and repeats the process. If the two-out-of-three result matches the original data read from the CPU exactly, the inconsistency register is cleared and the result is compared with the data read back from each of the three memory addresses. If any copy differs, the two-out-of-three inconsistency flag register is incremented by 1 and the two-out-of-three result is written to the address of the inconsistent module. After this step, if the required number of bytes has not yet been written, the write operation continues; otherwise execution of the current write instruction is complete and an execution-success flag is written to the CPU.
As shown in FIG. 2, after receiving an instruction from the CPU, the FPGA examines the instruction header. If the instruction is invalid, the FPGA increments the invalid-instruction register by 1; if the number of consecutive invalid instructions has not reached the limit, it reads the instruction sent by the CPU again and re-evaluates it. Once the count of consecutive invalid instructions reaches the limit, the FPGA writes an error flag to the CPU and waits to be reloaded.
FIG. 3 is the flow chart of the first-level error correction for the data channel between the CPU and the FPGA; it mainly shows how errors are detected and corrected as data flows between the two. After the FPGA memory read shown in FIG. 2 completes and the execution-success flag has been written to the CPU, the CPU finishes receiving the data and the CRC check code, generates its own local CRC check code using CRC-16 (polynomial 0x8005), and compares it internally with the check code from the FPGA. If they differ, the CPU checks whether the number of consecutive check mismatches has reached the limit: if not, it increments the CPU check-mismatch register by 1 and executes the FPGA read operation again; if so, the CPU makes the FPGA reload and restart. If the two check codes match, the CPU check-mismatch register is cleared and instruction execution is complete. The processing of write instructions and invalid instructions shown in FIG. 3 is substantially the same as in FIG. 2.
FIG. 4 is the flow chart of the second-level error correction for FPGA memory writes, which mainly consists of the flow "write three modules, read back and compare, rewrite the faulty module". The memory address space is first divided according to functional requirements so that each functional area has three discrete module addresses (A, B and C). When the FPGA performs a memory write, it writes the data to the module A address first, then to the module B address, and finally to the module C address. Immediately after the write completes, it reads the data back from the module A, module B and module C addresses in turn and computes a bitwise two-out-of-three vote over the data read. The result is first compared with the original data read from the CPU. If they differ, the consecutive-inconsistency register i is incremented by 1 and its count is checked: if it has reached the specified number, an error flag is written to the CPU and the FPGA waits to be reloaded; otherwise the FPGA receives the write-memory instruction from the CPU again and repeats the process.
If the two-out-of-three result matches the original data read from the CPU, register i is cleared and the result is compared with the data read back from module A, module B and module C. If all three match, the to-be-written byte count register is incremented by 1, and the write operation continues while bytes remain to be written. If any of the three modules differs, the two-out-of-three result is written to the address of the inconsistent module, overwriting the data originally stored there, and the inconsistency flag register j is incremented by 1. The value held in this register indicates how strongly the FPGA may have been affected by single event upsets during the write operation: a large value suggests a strong effect, a small value a weak one. When the count in register j reaches the specified number, an error flag is written to the CPU and the FPGA waits to be reloaded; otherwise the to-be-written byte count register is incremented by 1 and the memory write continues. After all data has been written, a local CRC check code is generated using CRC-16 (polynomial 0x8005) and written to the CPU together with the instruction-execution-success flag and the number of two-out-of-three inconsistencies.
Fig. 5 is a flow chart of the two-stage error correction for an FPGA memory read according to the present invention, which mainly comprises the flow "read three modules, compare, rewrite the erroneous module". It is the error detection and correction flow that the FPGA follows for a read operation after the functional area of the memory has been divided into three modules by the method shown in fig. 4. As shown in fig. 2, the FPGA reads data from the memory at the three module addresses and then performs the bitwise two-out-of-three vote. If the two-out-of-three result is inconsistent with the contents of any module, the result is written to the inconsistent module address and to the corresponding address of the CPU, and the two-out-of-three inconsistency flag register k is incremented by 1; the value of register k reflects how strongly the FPGA read operation has been affected by the single event upset effect. When the count in register k reaches the specified number, the FPGA writes an error flag to the CPU and waits to be reloaded; otherwise it continues the memory read until all data have been read out, then generates a local CRC (cyclic redundancy check) code according to CRC-16 (generator polynomial 0x8005) and writes the local CRC code, the instruction-execution-success flag and the two-out-of-three inconsistency count to the CPU.
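The read flow can be sketched in the same spirit. This is an illustrative Python model only; `tmr_read`, `k_limit` and the use of a `bytearray` as the memory are assumptions, and the CRC step is omitted here.

```python
from typing import List, Optional, Tuple


def vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority of three byte values."""
    return (a & b) | (a & c) | (b & c)


def tmr_read(mem: bytearray, bases: List[int], length: int,
             k_limit: int = 3) -> Tuple[Optional[bytes], int]:
    """Read `length` bytes from modules A, B, C, repairing bad copies.

    Returns (data, k); data is None when k reaches k_limit, which stands in
    for writing the error flag to the CPU and waiting for a reload.
    """
    out = bytearray()
    k = 0  # two-out-of-three inconsistency count
    for offset in range(length):
        copies = [mem[base + offset] for base in bases]
        voted = vote(*copies)
        mismatch = False
        for base, copy in zip(bases, copies):
            if copy != voted:
                mem[base + offset] = voted   # scrub the inconsistent module
                mismatch = True
        out.append(voted)                    # voted value goes to the CPU
        if mismatch:
            k += 1
            if k >= k_limit:
                return None, k
    return bytes(out), k
```

Because each inconsistent copy is overwritten with the voted value as it is found, a later read of the same region starts from three agreeing copies again.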
The primary error correction is the error correction channel for the data flow between the CPU and the FPGA. It comprises the following: the CPU sends an instruction carrying an odd parity bit to the FPGA, and the FPGA performs an odd parity check on the received instruction and sends the result of judging the instruction header to the CPU; the CPU sends data to the FPGA, and the FPGA sends a data-comparison correctness flag to the CPU, where the data comparison is a bit-by-bit comparison of the two-out-of-three result read back from the written memory with the data sent by the CPU; the FPGA sends data and a CRC check result to the CPU, and the CPU performs its own CRC check on the received data and compares the two CRC results. The primary error correction further includes operation-count control and the CPU restart decision, as shown in fig. 3.
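The CRC comparison in this channel can be illustrated with a bitwise CRC-16 using the generator polynomial 0x8005. The source gives only the polynomial, so the initial value of 0, the most-significant-bit-first processing and the absence of a final XOR in the sketch below are assumptions, and the function name `crc16_8005` is hypothetical.

```python
def crc16_8005(data: bytes, init: int = 0x0000) -> int:
    """Bitwise CRC-16, polynomial 0x8005, MSB first, no reflection, no final XOR.

    The initial value and bit order are assumptions; the patent only names
    the polynomial.
    """
    crc = init
    for byte in data:
        crc ^= byte << 8                      # bring the next byte into the top bits
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


# Sender (FPGA side in the text) computes the code; receiver (CPU side)
# recomputes it over the received bytes and compares the two values.
payload = bytes(range(32))
sent_crc = crc16_8005(payload)
received_ok = crc16_8005(payload) == sent_crc
```

With these parameters, appending the two CRC bytes in big-endian order to the payload and recomputing yields zero, which is a convenient alternative to an explicit comparison.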
The two-stage (i.e. second-level) error correction of the present invention is the error correction channel for the data flow between the FPGA and the memory. It comprises the flow "write three modules, read back and compare, rewrite the erroneous module" for an FPGA memory write and the flow "read three modules, compare, rewrite the erroneous module" for a memory read, as shown in fig. 4 and fig. 5.
In the invention, each instruction sent by the CPU to the FPGA is 32 bits long and its content is stored in big-endian mode: bit addresses 0 to 6 store the instruction header flag, which indicates the type of the instruction, and bit address 7 stores the parity bit used for the odd parity check of the instruction;
for a write-memory instruction, addresses 8 to 19 store the number of packets to be written to the memory; with 1:1 mapping, this field represents a writable packet count of 0 to 4095. Addresses 20 to 31 store the number of bytes to be read from the CPU, which is also the number of bytes to be written to the memory; with 1:1 mapping, this field represents an operable byte count of 1 to 4095. The mapping ratio can be adjusted according to actual needs: for example, if at most 2048 bytes ever need to be operated on, the FPGA, when parsing the instruction, treats every 2 in the field as 1 actual byte to be operated on.
For a read-memory instruction, addresses 9 to 31 store the start address of the memory to be read; with 1:1 mapping, the theoretical maximum accessible memory address is (2^23 - 1). Depending on the memory type, the theoretical maximum accessible address is mapped to the actually operable address range. For example, if the memory is byte-addressed and has at most 128 kilobytes, then with 1:1 mapping only addresses 15 to 31 are valid and addresses 9 to 14 are invalid.
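The field layout above can be made concrete with a small Python sketch. Several points are assumptions rather than statements from the source: bit address 0 is taken to be the most significant bit of the 32-bit word, the parity bit is chosen so that the whole word contains an odd number of 1 bits, and the header values passed in are arbitrary placeholders. All function names are hypothetical.

```python
def get_field(word: int, first: int, last: int) -> int:
    """Extract bit addresses first..last (bit address 0 = MSB, an assumption)."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def set_field(word: int, first: int, last: int, value: int) -> int:
    """Return `word` with bit addresses first..last replaced by `value`."""
    width = last - first + 1
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in field")
    shift = 31 - last
    mask = ((1 << width) - 1) << shift
    return (word & ~mask) | (value << shift)


def parity_ok(word: int) -> bool:
    """Odd parity: the 32-bit word must contain an odd number of 1 bits."""
    return bin(word & 0xFFFFFFFF).count("1") % 2 == 1


def add_odd_parity(word: int) -> int:
    """Set bit address 7 so that the whole word has odd parity."""
    word = set_field(word, 7, 7, 0)
    return word if parity_ok(word) else set_field(word, 7, 7, 1)


def encode_write(header: int, packets: int, nbytes: int) -> int:
    word = set_field(0, 0, 6, header)        # instruction header flag
    word = set_field(word, 8, 19, packets)   # number of packets, 0..4095
    word = set_field(word, 20, 31, nbytes)   # number of bytes, 1..4095
    return add_odd_parity(word)


def encode_read(header: int, start_address: int) -> int:
    word = set_field(0, 0, 6, header)
    word = set_field(word, 9, 31, start_address)  # up to 2**23 - 1
    return add_odd_parity(word)
```

Flipping any single bit of an encoded word makes `parity_ok` return `False`, which is the condition under which the FPGA would report an odd-check failure to the CPU.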
The three non-contiguous modular address spaces A, B and C are allocated in the storage space of the memory for the same functional area as follows:
firstly, determining the size of a required address space;
then determining whether three or more chip selection areas exist in the memory, if so, respectively defining three module addresses A, B and C of the same functional area in the three chip selection areas, and dividing the module address range of each chip selection area from the address 0;
when only one chip selection area exists, let the required address space be x and the address offsets be a and b, where a and b are both far smaller than x and a is not equal to b; the modular address A of the functional area then spans 0 to x, the modular address B spans (x + a) to (2x + a), and the modular address C spans (2x + a + b) to (3x + a + b);
when two chip selection areas exist, two modular address spaces are allocated in one chip selection area and the third is allocated in the other. The two modular address spaces sharing a chip selection area are allocated as follows: let the required address space be y and the address offset be c, where c is far smaller than y; the first modular address space in that chip selection area spans 0 to y, and the second spans (y + c) to (2y + c).
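The allocation rules above can be expressed as a short Python sketch. The function name `allocate_modules` and the tuple layout are hypothetical, the range endpoints are copied from the text without deciding whether the upper bound is inclusive, and the start address 0 for the lone module in the second chip selection area is an assumption because the text does not state it.

```python
from typing import List, Tuple

Range = Tuple[int, int, int]  # (chip selection index, start, end)


def allocate_modules(size: int, chip_selects: int,
                     a: int = 0, b: int = 0, c: int = 0) -> List[Range]:
    """Return the address ranges of modules A, B and C for one functional area."""
    if chip_selects >= 3:
        # One module per chip selection area, each starting at address 0.
        return [(cs, 0, size) for cs in range(3)]
    if chip_selects == 2:
        # Two modules share area 0, separated by offset c; the third is placed
        # in area 1 (start address 0 assumed here).
        return [(0, 0, size), (0, size + c, 2 * size + c), (1, 0, size)]
    if chip_selects == 1:
        if a == b:
            raise ValueError("offsets a and b must differ")
        return [
            (0, 0, size),
            (0, size + a, 2 * size + a),
            (0, 2 * size + a + b, 3 * size + a + b),
        ]
    raise ValueError("at least one chip selection area is required")
```

For example, with a 1024-unit functional area in a single chip selection area and offsets a = 16 and b = 32, the three modules start at 0, 1040 and 2096.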
Those parts of the present invention that are not described in detail are well known in the art.

Claims (7)

1. A method for implementing a closed-loop multi-stage storage system resistant to the single event upset effect, characterized in that:
the storage system comprises a CPU, an FPGA and a memory, wherein the CPU sends a memory reading instruction or a memory writing instruction to the FPGA, the FPGA analyzes the received instruction, and executes the operation of reading data from a corresponding address of the memory and writing the data into the CPU or reading data from a corresponding address space of the CPU and writing the data into the memory after the analysis is successful;
the storage system implementation method comprises the following steps:
(1) allocating three noncontiguous modular address spaces A, B and C to the memory space of the memory according to the same functional area;
(2) the CPU sends an instruction to the FPGA, the FPGA performs odd check on the received instruction, and the step (3) is entered;
(3) if the odd check result in the step (2) is correct, the FPGA analyzes the received instruction, and executes corresponding operation of reading or writing the memory according to the analysis result; when the odd check result is wrong, the FPGA sends an odd check failure mark to the CPU, the CPU sends an instruction to the FPGA again, and the step (2) is returned; if the number of continuous errors of the odd check result exceeds the specified number, the CPU reloads the FPGA and then restarts the FPGA;
(4) when the FPGA executes a write to the memory, it executes the flow "write three modules, read back and compare, rewrite the erroneous module", which is as follows:
(4.1) the FPGA reads data once from the memory space designated by the CPU and writes the data in sequence into the three address spaces A, B and C of the memory;
(4.2) immediately after the write completes, the data at the three addresses are read in sequence and a bitwise two-out-of-three vote is computed;
(4.3) the bitwise two-out-of-three result is compared with the original data read from the CPU; when the two are inconsistent, the continuous inconsistent register i is incremented by 1 and step (4.4) is entered; when the two are consistent, step (4.5) is entered;
(4.4) judging whether the count of the continuous inconsistent register i has reached the specified number; if so, writing an error mark into the CPU and waiting for reloading; otherwise, receiving the write-memory instruction sent by the CPU again and repeating the process;
(4.5) the continuous inconsistent register i is cleared, and the bitwise two-out-of-three result is then compared with the data read from each of the three address spaces A, B and C of the memory to determine whether each is consistent; if any is inconsistent, the two-out-of-three result is written to the inconsistent memory module address, the two-out-of-three inconsistent flag register j is incremented by 1, and step (4.6) is entered; if all are consistent, step (4.7) is entered;
(4.6) judging whether the count of the continuous inconsistent register j has reached the specified number; if so, writing an error mark into the CPU and waiting for reloading; otherwise, entering step (4.7);
(4.7) the byte count register to be written is incremented by 1 and the FPGA judges whether all bytes to be written have been written; if so, the FPGA writes an execution success mark and the two-out-of-three inconsistency count of the write operation into the corresponding address of the CPU; otherwise, it continues to execute the operation of writing the memory;
(5) if the operation result in the step (4) is inconsistent with the data read by the FPGA from the CPU, the FPGA reads the data from the CPU again, and the step (4) is repeated, if the continuous repeated process exceeds the specified times, the CPU reloads the FPGA and restarts;
(6) when the FPGA executes a read of the memory, it executes the flow "read three modules, compare, rewrite the erroneous module", which is as follows:
(6.1) the FPGA reads data once from the corresponding start address of each of the three modular address spaces A, B and C of the memory;
(6.2) a bitwise two-out-of-three vote is computed over the data taken from the three module addresses, and the result is compared with the data read from each of the three module addresses to determine whether each is consistent; if any is inconsistent, step (6.3) is entered; if all are consistent, the two-out-of-three result is written into the corresponding address of the CPU, the byte count register to be read is incremented by 1, and step (6.5) is entered;
(6.3) writing the two-out-of-three result into the inconsistent module address, then writing the two-out-of-three result into the corresponding address of the CPU, adding 1 to the two-out-of-three inconsistent flag register k, and entering step (6.4);
(6.4) judging whether the count of the two-out-of-three inconsistent flag register k has reached the specified number; if so, writing an error flag into the CPU and waiting for reloading; otherwise, receiving the memory reading instruction sent by the CPU again and repeating the process;
(6.5) if the number of bytes read does not yet meet the requirement, the read operation continues until it does, whereupon the FPGA writes an execution success mark and the two-out-of-three inconsistency count of the read operation into the corresponding address of the CPU;
(7) if the local CRC code computed by the CPU for the data obtained in step (6) is inconsistent with the received CRC code, the CPU resends the read instruction and returns to step (6); if the number of consecutive repetitions exceeds the specified number, the CPU reloads the FPGA and restarts it.
2. The method for realizing the closed-loop multi-stage storage system resisting the single event upset effect according to claim 1, wherein the method comprises the following steps: in the step (3), the process of analyzing the received instruction by the FPGA is as follows:
(3.1) the FPGA judges the received instruction head to determine whether the instruction is effective or not, and when the instruction is effective, the step (3.2) is carried out, and when the instruction is ineffective, the step (3.3) is carried out;
(3.2) the invalid instruction register is cleared and the instruction content is parsed to judge whether the instruction is a memory reading instruction or a memory writing instruction. If the instruction is a memory reading instruction, the memory start address contained in the instruction is obtained, and the read is performed for the number of bytes actually specified by the protocol between the CPU and the FPGA. If the instruction is a memory writing instruction, the number of bytes and the number of packets to be written contained in the instruction are obtained, and the number of packets in the instruction is compared with the packet number counter of the FPGA. When the two are consistent, the corresponding bytes of data are read from the CPU, the specified number of bytes is written starting from the address following the previously written memory address space, and the packet number counter of the FPGA is incremented by 1; the memory address space for the initial write is given its initial value according to the three-module address allocation result, and the initial value of the packet number counter of the FPGA is 0. When the two are not consistent, the FPGA sends a packet number inconsistency mark to the CPU, and the CPU sends the instruction to the FPGA again;
(3.3) if the number of consecutive invalid instructions has not reached the limit number, the FPGA adds 1 to the invalid instruction register, reads the instruction sent by the CPU again, and returns to step (3.1) to judge again; when the number of consecutive invalid instructions reaches the limit number, the FPGA writes an error mark into the CPU and waits for reloading.
3. The method for realizing the closed-loop multi-stage storage system resisting the single event upset effect according to claim 1, wherein the method comprises the following steps: in the step (4.7) and the step (6.5), when the FPGA writes the execution success mark into the corresponding address of the CPU, a local CRC check code is generated according to CRC-16 and is written into the corresponding address of the CPU.
4. The method for realizing the closed-loop multi-stage storage system resisting the single event upset effect according to claim 3, wherein the method comprises the following steps: after the CPU receives the data, it also generates a CPU local CRC check code and judges whether the check code it generated is consistent with the check code from the FPGA; when the two are inconsistent, the CPU judges whether the number of consecutive check inconsistencies has reached the limit number; if not, the CPU check inconsistency register is incremented by 1 and the operation of reading the memory is executed again; if so, the CPU controls the FPGA to reload and restart;
when the check code generated by the CPU is consistent with the check code from the FPGA, the CPU check inconsistency register is cleared, and execution of the read-memory operation instruction is complete.
5. The method for realizing the closed-loop multi-stage storage system resisting the single event upset effect according to claim 1, wherein the method comprises the following steps: in the step (1), the implementation method for allocating three noncontiguous modular address spaces A, B and C to the storage space of the memory according to the same functional area is as follows:
(8.1) determining a required address space size;
(8.2) determining whether three or more chip selection areas exist in the memory, if so, respectively defining three noncontiguous module address spaces A, B and C of the same functional area in the three chip selection areas, and dividing the module address range of each chip selection area from the address 0;
when only one chip selection area exists, let the required address space be x and the address offsets be a and b, where a and b are both far smaller than x and a is not equal to b; the modular address A of the functional area then spans 0 to x, the modular address B spans (x + a) to (2x + a), and the modular address C spans (2x + a + b) to (3x + a + b);
when two chip selection areas exist, two modular address spaces are allocated in one chip selection area and the third is allocated in the other; the two modular address spaces sharing a chip selection area are allocated as follows: let the required address space be y and the address offset be c, where c is far smaller than y; the first modular address space in that chip selection area spans 0 to y, and the second spans (y + c) to (2y + c).
6. The method for realizing the closed-loop multi-stage storage system resisting the single event upset effect according to claim 1, wherein the method comprises the following steps: in the step (2), each instruction sent by the CPU to the FPGA is 32 bits long and its content is stored in big-endian mode, that is, bit addresses 0 to 6 store the instruction header flag, which indicates the type of the instruction, and bit address 7 stores the parity bit used for the odd check of the instruction.
7. The method for realizing the closed-loop multi-stage storage system resisting the single event upset effect according to claim 6, wherein the method comprises the following steps: for a write-memory instruction, addresses 8 to 19 store the number of packets to be written to the memory; with 1:1 mapping, this field represents a writable packet count of 0 to 4095; addresses 20 to 31 store the number of bytes to be read from the CPU, which is also the number of bytes to be written to the memory; with 1:1 mapping, this field represents an operable byte count of 1 to 4095;
for a read-memory instruction, addresses 9 to 31 store the start address of the memory to be read; with 1:1 mapping, the maximum accessible memory address is (2^23 - 1).
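The CPU-side retry behaviour recited in claims 1 and 4 (recompute the check code, retry the read on a mismatch, and reload the FPGA once the consecutive-mismatch limit is reached) can be sketched as follows. This is an illustration only: `cpu_read_with_retry`, `read_once`, `checksum` and `limit` are hypothetical names, and the reload is represented by raising an exception.

```python
from typing import Callable, Tuple


def cpu_read_with_retry(read_once: Callable[[], Tuple[bytes, int]],
                        checksum: Callable[[bytes], int],
                        limit: int = 3) -> Tuple[bytes, int]:
    """Repeat a read until the local check code matches the one received.

    read_once() returns (data, check_code_from_fpga). Returns the data and the
    number of mismatches seen; raises RuntimeError when the mismatch register
    has already reached `limit`, standing in for "reload and restart the FPGA".
    """
    mismatches = 0  # models the CPU check inconsistency register
    while True:
        data, remote_code = read_once()
        if checksum(data) == remote_code:
            # The register would be cleared here; the pre-clear count is
            # returned for inspection.
            return data, mismatches
        if mismatches >= limit:
            raise RuntimeError("check code mismatch limit reached: reload FPGA")
        mismatches += 1
```

Passing the checksum function as a parameter keeps the sketch independent of the exact CRC-16 parameters, which the source does not fully specify.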
CN201910340318.7A 2019-04-25 2019-04-25 Closed-loop multistage storage system resistant to single event upset effect and implementation method Active CN110109619B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910340318.7A CN110109619B (en) 2019-04-25 2019-04-25 Closed-loop multistage storage system resistant to single event upset effect and implementation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910340318.7A CN110109619B (en) 2019-04-25 2019-04-25 Closed-loop multistage storage system resistant to single event upset effect and implementation method

Publications (2)

Publication Number Publication Date
CN110109619A CN110109619A (en) 2019-08-09
CN110109619B true CN110109619B (en) 2022-07-29

Family

ID=67486776

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910340318.7A Active CN110109619B (en) 2019-04-25 2019-04-25 Closed-loop multistage storage system resistant to single event upset effect and implementation method

Country Status (1)

Country Link
CN (1) CN110109619B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111404750B (en) * 2020-03-20 2022-11-01 上海航天测控通信研究所 Centralized parameter management device and method for advanced on-orbit system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7620883B1 (en) * 2001-02-14 2009-11-17 Xilinx, Inc. Techniques for mitigating, detecting, and correcting single event upset effects
CN101937375A (en) * 2010-08-27 2011-01-05 浙江大学 Code and data real-time error correcting and detecting method and device for pico-satellite central processing unit
CN104932954A (en) * 2015-07-01 2015-09-23 西北工业大学 FPGA (Field Programmable Gate Array) key data protection method for microsatellite
CN106557346A (en) * 2016-11-24 2017-04-05 中国科学院国家空间科学中心 A kind of primary particle inversion resistant star-carried data processing system and method
CN108255636A (en) * 2017-12-13 2018-07-06 太原航空仪表有限公司 A kind of anti-single particle overturning system and its application method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8589759B2 (en) * 2010-10-01 2013-11-19 Hamilton Sundstrand Corporation RAM single event upset (SEU) method to correct errors

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
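Research on an anti-single-event-upset technique suitable for space information processing platforms; Wang Suling et al.; 《通信技术》 (Communications Technology); 2018-05-10 (No. 05); pp. 1228-1231 *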

Also Published As

Publication number Publication date
CN110109619A (en) 2019-08-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant