CN110971244A - Forward error correction decoding decoder based on burst error detection - Google Patents


Info

Publication number
CN110971244A
Authority
CN
China
Prior art keywords
module
error
iteration
output
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910994927.4A
Other languages
Chinese (zh)
Inventor
张为
王佳琪
陆薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201910994927.4A
Publication of CN110971244A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/03 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
    • H03M13/05 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
    • H03M13/13 Linear codes
    • H03M13/15 Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes
    • H03M13/151 Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes using error location or error correction polynomials
    • H03M13/1515 Reed-Solomon codes

Landscapes

  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Error Detection And Correction (AREA)

Abstract

The invention belongs to the field of error control coding in channel coding and aims to shorten the critical-path delay of the decoder and improve decoding throughput while preserving decoding performance. To this end, the technical solution adopted by the invention is a forward error correction decoder based on burst error detection, comprising a syndrome computation (SC) module, a key equation solving (KES) module, and a Chien search and error evaluation (CSEE) module. The syndromes computed by the SC module are output to the KES module, and the error location polynomial Λ(X) and error evaluation polynomial Ω(X) computed by the KES module are output to the CSEE module. The invention is mainly applicable to the design and manufacture of decoders.

Description

Forward error correction decoding decoder based on burst error detection
Technical Field
The invention belongs to the field of error control coding in channel coding and relates to Reed-Solomon (RS) encoding and decoding, pipelining, and retiming techniques, and in particular to a forward error correction decoder architecture for the RS(255,239) code suitable for optical communication systems operating at 100 Gb/s and above.
Background
In recent years, advances in communication technology have made the transmission of digital information increasingly frequent. Digital signals are conveyed across time and space through wired or wireless media or through storage devices; because transmission channels are not ideal, however, the signals are corrupted by noise to varying degrees and errors occur. Error control coding is a technique that uses encoding and decoding to correct the errors introduced while digital information is transmitted. As an important error control coding scheme, the RS code is widely used in wireless communication, data storage, deep-space exploration, digital video broadcasting, and other fields, owing to its strong error correction capability, simple structure, and efficient decoding algorithms, and to the development of large-scale integrated circuits.
Because of the high practical value of RS codes, in 2009 the ITU-T (International Telecommunication Union Telecommunication Standardization Sector) defined the RS(255,239) code as the transmission standard for submarine optical fiber systems, high-speed optical fiber systems, gigabit passive optical networks, and similar systems. With the rapid development of optical communication systems, ever higher transmission rates and ever longer transmission distances make data errors during transmission increasingly severe, to the point of limiting the further development of optical communication systems.
RS decoding methods fall into two main categories: soft-decision and hard-decision decoding algorithms. Soft-decision decoding makes full use of the channel soft information in the received signal and therefore achieves higher coding gain and error correction capability than hard-decision decoding, but it consumes more hardware resources, which hinders its practical application. Hard-decision algorithms and their hardware architectures are simple and dominate practical decoder applications today. A hard-decision RS decoder consists mainly of three modules: syndrome computation (SC), key equation solving (KES), and Chien search and error evaluation (CSEE). The hardware architecture of the RiBM (Reformulated inversionless Berlekamp-Massey) algorithm, the most classical RS decoding algorithm, contains 3t+1 identical processing elements (PEs) and needs 2t clock cycles to compute the error location polynomial and the error evaluation polynomial, where t denotes the error correction capability of the RS code throughout this specification. The mCS-RiBM (modified Compensated Simplified Reformulated inversionless Berlekamp-Massey) algorithm, derived from the RiBM algorithm, removes many redundant processing elements and significantly reduces hardware resource consumption.
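As a rough illustration of the baseline named above, the following Python sketch runs the plain RiBM recurrence in software: 3t+1 register pairs updated over 2t iterations. It is not the mCS-RiBM variant of the invention. The field polynomial 0x11D, the primitive element α = 2, the syndrome roots α^0 … α^(2t−1), and all function names are assumptions made for the example.

```python
PRIM = 0x11D   # assumed field polynomial x^8 + x^4 + x^3 + x^2 + 1
ALPHA = 2      # assumed primitive element of GF(2^8)

def gf_mul(a, b):
    """Multiply two GF(2^8) elements by shift-and-XOR."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM
        b >>= 1
    return r

def gf_pow(a, e):
    """Raise a nonzero element to the power e (exponent reduced mod 255)."""
    r = 1
    for _ in range(e % 255):
        r = gf_mul(r, a)
    return r

def poly_eval(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) by Horner's rule."""
    y = 0
    for c in reversed(coeffs):
        y = gf_mul(y, x) ^ c
    return y

def ribm(synd, t):
    """Plain RiBM: 3t+1 (delta, theta) register pairs, 2t iterations."""
    delta = list(synd) + [0] * t + [1]           # delta_0 .. delta_3t
    theta = list(delta)
    gamma, k = 1, 0
    for _ in range(2 * t):
        d0 = delta[0]
        shifted = delta[1:] + [0]                # delta_{i+1}(r), with delta_{3t+1} = 0
        new_delta = [gf_mul(gamma, shifted[i]) ^ gf_mul(d0, theta[i])
                     for i in range(3 * t + 1)]
        if d0 != 0 and k >= 0:
            theta, gamma, k = shifted, d0, -k - 1
        else:
            k += 1
        delta = new_delta
    return delta[t:2 * t + 1], delta[:t]         # locator and evaluator coefficients

def syndromes(errors, t):
    """s_i = sum of e * alpha^(i*p) over an error pattern {position p: value e}."""
    out = []
    for i in range(2 * t):
        s = 0
        for p, e in errors.items():
            s ^= gf_mul(e, gf_pow(ALPHA, i * p))
        out.append(s)
    return out

def error_positions(lam, n=255):
    """Positions p for which Lambda(alpha^(-p)) == 0."""
    return [p for p in range(n) if poly_eval(lam, gf_pow(ALPHA, (n - p) % n)) == 0]
```

For example, `ribm(syndromes({3: 5, 100: 17, 200: 255}, 8), 8)` returns a locator whose roots can be listed with `error_positions`. The locator comes out scaled by a nonzero constant, which does not change its roots.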
Note that in existing hard-decision decoder architectures, the latency of the syndrome computation module and of the Chien search and error evaluation module is n clock cycles (n being the RS code length), whereas the minimum latency of the KES module is 2t-1 clock cycles. The KES module therefore sits idle for much of the time, which wastes hardware resources, and a directly implemented decoder has a large area, low throughput, and low decoding efficiency. Design methods for hard-decision RS decoder architectures that effectively increase decoding speed therefore require further study.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention aims to provide a parallel decoder architecture based on the mCS-RiBM algorithm, currently the most advanced hard-decision RS decoding algorithm, which shortens the critical-path delay of the decoder and thereby improves decoding throughput while preserving decoding performance. To this end, the technical solution adopted by the invention is a forward error correction decoder based on burst error detection, comprising a syndrome computation (SC) module, a key equation solving (KES) module, and a Chien search and error evaluation (CSEE) module. The syndromes computed by the SC module are output to the KES module; the error location polynomial Λ(X) and error evaluation polynomial Ω(X) computed by the KES module are output to the CSEE module; and the CSEE module computes all error locations and error values. The SC module computes the syndromes using the following formula:
$$s_i = \bigl(\cdots\bigl(r_{n-1}\alpha^{i(q-1)} + r_{n-2}\alpha^{i(q-2)} + \cdots + r_{n-q+1}\alpha^{i} + r_{n-q}\bigr)\alpha^{iq} + \cdots + r_q\bigr)\alpha^{iq} + r_{q-1}\alpha^{i(q-1)} + r_{q-2}\alpha^{i(q-2)} + \cdots + r_1\alpha^{i} + r_0 \qquad (2)$$
where $\alpha^i$, $i = 1, 2, 3, \ldots, 2t$, denotes a root of the code generator polynomial; t determines the number of syndromes $s_i$, $0 \le i \le 2t-1$; n is the length of the code block being processed; $r_0, r_1, \ldots, r_{n-1}$ are the coefficients of the received codeword polynomial; and q is the parallelism factor.
In the SC module, computing all 2t syndromes requires a total of 2t(q+1) constant multipliers, 2t adders, and 2t registers. Each register is initialized to 0. In the first clock cycle, the register value 0 multiplied by $\alpha^{iq}$ is still 0; adding it to the other terms gives $r_{q-1}\alpha^{i(q-1)} + \cdots + r_1\alpha^{i} + r_0$, which is stored in the register. In the second clock cycle, the products are summed to give $(r_{q-1}\alpha^{i(q-1)} + \cdots + r_1\alpha^{i} + r_0)\alpha^{iq} + r_{2q-1}\alpha^{i(q-1)} + \cdots + r_{q+1}\alpha^{i} + r_q$, and this new sum of products is stored in the register again. Continuing in this way, the symbols of each path are input in order from the highest-order to the lowest-order position, each iteration occupies one clock cycle, and q symbols are processed in parallel per clock cycle. After n/q cycles, the register holds the final syndrome value. The critical path of the SC module is 3Txor, where Txor is the delay of one XOR gate. The computed syndromes are sent to the KES module.
The KES architecture comprises a controller, t+1 PE1 processing units (numbered 0 to t), t PE4 processing units (numbered t+1 to 2t), and a compensation stack (CS) unit. The t+1 processing units PE1_0 to PE1_t are arranged in ascending order of their numbers; the iteration data signal $\delta_i(r)$ output by PE1_i is fed to PE1_{i-1}, where i = 1, 2, …, t+1. The t processing units PE4_{t+1} to PE4_{2t} are likewise arranged in ascending order; the iteration data signal $\delta_i(r)$ output by PE4_i is fed to PE4_{i-1}, and the intermediate variable signal $\theta_{i-1}(r)$ output by PE4_{i-1} is fed to PE4_i, where i = t+1, t+2, …, 2t. PE1_0 outputs its iteration data signal $\delta_i(r)$ to the controller, PE1_t receives as input the iteration data $\delta_{i+1}(r)$ output by PE4_{t+1}, and the iteration data input of PE4_{2t} is 0.
Each PE contains 2 pipelined Galois field multipliers, 1 Galois field adder, 2-to-1 selectors, and 15 latches; PE4 has 3 more selectors than PE1, as shown in FIG. 3. The upper and lower rows of latches in each PE [latch symbols, shown as images in the original] store the coefficients of the polynomials Δ(x) and Θ(x), respectively; the subscript k is the PE number and the superscript is the channel number. The latch [symbol image] receives and buffers the result of the adder and sends it out as the iteration data signal $\delta_i(r)$ output by the processing unit; at the same time, $\delta_i(r)$ is fed back to the data selector controlled by the data selection signal $MC2_{s,i}(r)$. The latch [symbol image] receives and buffers the output of the data selector controlled by the selection signal $MC1_{s,i}(r)$ and sends it out as the intermediate variable signal $\theta_{i-1}(r)$ output by the processing unit; at the same time, the intermediate variable signal is fed back to the data selector.
The controller controls the selectors in PE4. It marks the processing units that store coefficients of Λ(r, z) with the data selection signal $MC2_{s,i}(r)$ and stores $MC2_{s,i}(r)$ in an internal register block. The controller receives the iteration output signal $\delta_i(r)$ provided by processing unit PE1 and also sends $\delta_i(r)$ out as an output signal. The data selection signal $MC1_{s,i}(r)$ output by the controller, the iteration data signal $\delta_i(r)$, and the intermediate variable signal $\theta_{i-1}(r)$ together serve as input signals of processing units PE1_0 to PE1_t. The data selection signals $MC2_{s,i}(r)$ and $MC3_{s,i}(r)$ output by the controller serve as input signals of processing units PE4_{t+1} to PE4_{2t}. The iteration count r starts at 1 and is incremented by 1 in each iteration until 2t iterations have been completed. When r = 2t, that is, after 2t iterations, processing units PE1_0 to PE1_t output the coefficients of the error value polynomial, and processing units PE4_{t+1} to PE4_{2t} output the coefficients of the error location polynomial.
The overflow coefficients in the mCS-RiBM algorithm must be stored in a CS unit, which operates as follows:
Λ(r+1, z) = γ(r) ⋯ γ(r′+1) · δ_0(r) & B(r′, z)
It computes the compensation coefficient δ_c and passes it to the PE units at the appropriate time, where γ(r) represents the error probability, r′ represents the higher-order portion of r, δ_0(r) represents the 0th iteration data signal, and B(r′, z) represents the intermediate polynomial.
The CS unit consists of 5 two-way selectors, 1 multiplier, and 7 registers. The three selectors M1, M2, and M3 are cascaded; the register output is buffered by three register stages and then either sent to register D4 or, after passing through the multiplier, used as one input of selector M4. In each clock cycle, the data in register D3, D4, or Dm can be selected by a selector and sent to the leftmost register; that is, the CS architecture has three feedback loops, named L3, L4, and Lm. Three update modes are used to cyclically generate and transfer the relevant polynomial coefficients. Mode 1 applies when k(r) ≥ 0 and flag(r) ≥ 0, where k(r) and flag(r) denote, respectively, the initial value in the r-th iteration and the position of the first coefficient of Λ(r, z); its purpose is to shift the current coefficients one position to the left and receive a new overflow coefficient once the current iteration is complete. Mode 2 applies when k(r) < 0; its purpose is to multiply all current coefficients by the error probability γ(r) and shift them one position to the right. Mode 3 applies when k(r) ≥ 0 and flag(r) < 0; its effect is that all coefficients remain in their original positions after the iteration is complete, that is, the current coefficients traverse the L4 feedback loop twice in this iteration while the value of D5 is kept unchanged.
The CSEE module consists of 8 cells, 1 register, 1 two-to-one data selector, and 32 adders. The 8 cells are used to compute the error locations and error values, and each cell contains 1 selector, 1 register, and 1 multiplier. $\lambda_j$, $j \in [1, 8]$, denotes the coefficient of the degree-j term of the error location polynomial. In the first clock cycle, the selection signals of the selectors in all 8 cells are set to 1, so that $\lambda_j$, $j \in [1, 8]$, is selected as the output and sent to the constant multipliers to complete the corresponding multiplications; the highest-power product in each cell is stored in the register and presented to the selector for the next clock cycle. In the second clock cycle, the selectors in all 8 cells are set to 0, and the highest-power product of the previous clock cycle is accumulated in the register. Continuing in this way, the result in the register is raised to successively higher powers in each iteration, so that all positions can be tested. The Forney algorithm module is obtained by removing the C8 cell from this structure; after performing the same operations, it outputs the error value at each error location, which is added to the corresponding codeword symbol.
The invention has the characteristics and beneficial effects that:
To address the complex structure and long decoding time of RS decoders, the invention combines retiming and a pipelined parallel architecture with the mCS-RiBM algorithm and proposes a 16-channel four-degree parallel RS forward error correction (RS-FEC) decoder architecture comprising four 4-channel sub-decoders, in which the SC module and the CSEE module of each sub-decoder have a parallelism of 4. The critical-path delay is shortened, which reduces the RS decoding time and substantially improves decoder throughput.
Description of the drawings:
FIG. 1: Four-degree parallel syndrome computation (SC) module, a modification of the conventional SC cell
FIG. 2: 4-channel key equation solving (KES) module architecture
FIG. 3: Four-degree parallel PE1 and PE4 processing units: (1) PE1; (2) PE4
FIG. 4: Four-degree parallel CS circuit diagram
FIG. 5: Architecture of the four-degree parallel pipelined Chien search module
FIG. 6: 16-channel four-degree parallel FEC architecture based on RS codes
Detailed Description
An architecture for a four-degree parallel RS forward error correction (RS-FEC) decoder based on the mCS-RiBM RS decoding algorithm, with improvements in the following aspects:
(1) A parallel structure is used to implement the SC module. To shorten the critical path, each syndrome is split into an odd part and an even part, which are computed separately and then added to give the 2t syndromes. The parallelism factor of the syndrome computation circuit is 4: the symbols of each path are input in order from the highest-order to the lowest-order position, the module processes 4 symbols per clock cycle, and after n/4 clock cycles the computed syndromes are sent to the KES module.
(2) Because each sub-decoder has 4 channels, the single pair of registers in each processing unit of the original KES module is increased to 4 pairs. Since the number of register stages is thereby increased, retiming is used to replace the ordinary multipliers with pipelined multipliers.
(3) After the KES module has finished, the computed error location polynomial Λ(X) and error evaluation polynomial Ω(X) are fed to the Chien search and error evaluation (CSEE) module. The CSEE module is also designed as a four-degree parallel architecture, that is, the circuit processes 4 symbols per clock cycle, and the CSEE module needs n/4 clock cycles to compute all error locations and error values.
The invention targets the mCS-RiBM algorithm, currently the most advanced hard-decision RS decoding algorithm, and designs a parallel decoder architecture that uses module reuse and a pipelined parallel design to improve the utilization of the key modules, realizing a hard-decision decoder architecture with low hardware cost and high throughput. The invention is described in detail below with reference to the drawings and an embodiment. For convenience, the most widely used code, RS(255,239) (n = 255, t = 8), is taken as the example.
(1) The first step of decoding is to compute the 2t syndromes $s_i$, $0 \le i \le 2t-1$. If all 2t syndromes are 0, no error has occurred; otherwise, errors occurred during transmission. The basic syndrome formula is:
$$s_i = R(\alpha^i) = \sum_{j=0}^{n-1} r_j\,\alpha^{ij} \qquad (1)$$
To improve decoder speed and throughput, the invention implements the SC module with a parallel structure. Equation (1) is rewritten in the following form:
$$s_i = \bigl(\cdots\bigl(r_{n-1}\alpha^{i(q-1)} + r_{n-2}\alpha^{i(q-2)} + \cdots + r_{n-q+1}\alpha^{i} + r_{n-q}\bigr)\alpha^{iq} + \cdots + r_q\bigr)\alpha^{iq} + r_{q-1}\alpha^{i(q-1)} + r_{q-2}\alpha^{i(q-2)} + \cdots + r_1\alpha^{i} + r_0 \qquad (2)$$
The parallelism factor of the circuit is q (q = 4 in the invention). A conventional multi-degree parallel structure, however, inevitably lengthens the critical path of the module. To solve this problem, the syndrome is split into an odd part $R_{\mathrm{odd}}(\alpha^i)$ and an even part $R_{\mathrm{even}}(\alpha^i)$, which are computed separately and then added, namely:
$$s_i = R(\alpha^i) = R_{\mathrm{odd}}(\alpha^i) + R_{\mathrm{even}}(\alpha^i) \qquad (3)$$
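The text does not spell out how the odd and even parts are defined. One natural reading, stated here only as an assumption, is that they collect the odd-indexed and even-indexed received coefficients. Under that assumption, the split and one group of q = 4 symbols can be written as:

```latex
R_{\mathrm{even}}(\alpha^i) = \sum_{\substack{0 \le j \le n-1 \\ j\ \mathrm{even}}} r_j\,\alpha^{ij},
\qquad
R_{\mathrm{odd}}(\alpha^i) = \sum_{\substack{0 \le j \le n-1 \\ j\ \mathrm{odd}}} r_j\,\alpha^{ij}

% One group of four symbols (q = 4), regrouped into its odd and even halves:
r_{4m+3}\alpha^{3i} + r_{4m+2}\alpha^{2i} + r_{4m+1}\alpha^{i} + r_{4m}
  = \underbrace{\alpha^{i}\bigl(r_{4m+3}\alpha^{2i} + r_{4m+1}\bigr)}_{\text{odd part}}
  + \underbrace{\bigl(r_{4m+2}\alpha^{2i} + r_{4m}\bigr)}_{\text{even part}}
```

With this grouping each half sums only two products before the final addition, which is one plausible way a shorter adder chain could arise; the actual gate-level arrangement is the one shown in FIG. 1.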
FIG. 1 shows the four-degree parallel module obtained by modifying the conventional parallel SC cell. A single SC circuit requires 1 adder, q+1 constant multipliers, and 1 register; in the four-degree parallel SC architecture, computing all 2t syndromes requires a total of 2t(q+1) constant multipliers, 2t adders, and 2t registers. Each register is initialized to 0. In the first clock cycle, the register value 0 multiplied by $\alpha^{iq}$ is still 0; adding it to the other terms gives $r_{q-1}\alpha^{i(q-1)} + \cdots + r_1\alpha^{i} + r_0$, which is stored in the register. In the second clock cycle, the products are summed to give $(r_{q-1}\alpha^{i(q-1)} + \cdots + r_1\alpha^{i} + r_0)\alpha^{iq} + r_{2q-1}\alpha^{i(q-1)} + \cdots + r_{q+1}\alpha^{i} + r_q$, and this new sum of products is stored in the register again. Continuing in this way, the symbols of each path are input in order from the highest-order to the lowest-order position, each iteration occupies one clock cycle, and the SC circuit processes q symbols in parallel per clock cycle. After n/q cycles, the register holds the required final syndrome value. The critical path of the SC module is then 3Txor. In the four-degree parallel structure, only 64 clock cycles are needed to complete the syndrome computation. The computed syndromes are sent to the KES module.
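The register recursion described above can be checked in software. The sketch below compares the q-parallel Horner form of equation (2) with the direct sum over all received symbols. The field polynomial 0x11D, the primitive element α = 2, the zero-padding of the block to a multiple of q, and all function names are assumptions made for the example.

```python
PRIM = 0x11D   # assumed field polynomial x^8 + x^4 + x^3 + x^2 + 1
ALPHA = 2      # assumed primitive element of GF(2^8)

def gf_mul(a, b):
    """Multiply two GF(2^8) elements by shift-and-XOR."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM
        b >>= 1
    return r

def gf_pow(a, e):
    """Raise a nonzero element to the power e (exponent reduced mod 255)."""
    r = 1
    for _ in range(e % 255):
        r = gf_mul(r, a)
    return r

def syndrome_direct(r, i):
    """Direct sum s_i = sum_j r_j * alpha^(i*j); r[j] holds coefficient r_j."""
    s = 0
    for j, rj in enumerate(r):
        s ^= gf_mul(rj, gf_pow(ALPHA, i * j))
    return s

def syndrome_parallel(r, i, q=4):
    """Equation (2): one register, q symbols consumed per clock cycle."""
    r = list(r) + [0] * ((-len(r)) % q)          # pad high-order end so q divides the length
    consts = [gf_pow(ALPHA, i * m) for m in range(q + 1)]   # alpha^0 .. alpha^(iq)
    reg, cycles = 0, 0
    for top in range(len(r) - 1, -1, -q):        # highest-order group first
        acc = gf_mul(reg, consts[q])             # register times alpha^(iq)
        for m in range(q):
            acc ^= gf_mul(r[top - (q - 1) + m], consts[m])
        reg = acc
        cycles += 1
    return reg, cycles
```

For a 255-symbol block and q = 4, `syndrome_parallel` takes 64 loop passes, matching the 64 clock cycles quoted in the text.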
(2) 4-channel KES architecture. Because each sub-decoder has 4 channels, only the single pair of registers in each processing unit of the original KES module needs to be increased to 4 pairs; the overall architecture of the 4-channel KES module is shown in FIG. 2. The complete KES architecture includes a controller, t+1 PE1 units, t PE4 units, and a compensation stack (CS).
The t+1 processing units PE1_0 to PE1_t are arranged in ascending order of their numbers; the iteration data signal $\delta_i(r)$ output by PE1_i is fed to PE1_{i-1}, where i = 1, 2, …, t+1. The t processing units PE4_{t+1} to PE4_{2t} are likewise arranged in ascending order; the iteration data signal $\delta_i(r)$ output by PE4_i is fed to PE4_{i-1}, and the intermediate variable signal $\theta_{i-1}(r)$ output by PE4_{i-1} is fed to PE4_i, where i = t+1, t+2, …, 2t. PE1_0 outputs its iteration data signal $\delta_i(r)$ to the controller, PE1_t receives as input the iteration data $\delta_{i+1}(r)$ output by PE4_{t+1}, and the iteration data input of PE4_{2t} is 0. Because the number of register stages is increased, retiming can be used to replace the ordinary multipliers with pipelined multipliers.
The four-degree parallel PE processing units are shown in FIG. 3. Each PE contains 2 pipelined Galois field multipliers, 1 Galois field adder, 2-to-1 selectors, and 15 latches. PE4 differs slightly from PE1 in having 3 more selectors. The upper and lower rows of latches in each PE [latch symbols, shown as images in the original] store the coefficients of the polynomials Δ(x) and Θ(x), respectively; the subscript k is the PE number and the superscript is the channel number. The latch [symbol image] receives and buffers the result of the adder and sends it out as the iteration data signal $\delta_i(r)$ output by the processing unit; at the same time, $\delta_i(r)$ is fed back to the data selector controlled by $MC2_{s,i}(r)$. The latch [symbol image] receives and buffers the output of the data selector controlled by $MC1_{s,i}(r)$ and sends it out as the intermediate variable signal $\theta_{i-1}(r)$ output by the processing unit; at the same time, the intermediate variable signal is fed back to the data selector.
The controller controls the selectors in PE4. It marks the processing units that store coefficients of Λ(r, z) with $MC2_{s,i}(r)$ and stores $MC2_{s,i}(r)$ in an internal register block. The controller receives the iteration output signal $\delta_i(r)$ provided by processing unit PE1 and also sends $\delta_i(r)$ out as an output signal. The data selection signal $MC1_{s,i}(r)$ output by the controller, the iteration data signal $\delta_i(r)$, and the intermediate variable signal $\theta_{i-1}(r)$ together serve as input signals of processing units PE1_0 to PE1_t. The data selection signals $MC2_{s,i}(r)$ and $MC3_{s,i}(r)$ output by the controller serve as input signals of processing units PE4_{t+1} to PE4_{2t}. The iteration count r starts at 1 and is incremented by 1 in each iteration until 2t iterations have been completed. When r = 2t, that is, after 2t iterations, processing units PE1_0 to PE1_t output the coefficients of the error value polynomial, and processing units PE4_{t+1} to PE4_{2t} output the coefficients of the error location polynomial.
The overflow coefficients in the mCS-RiBM algorithm must be stored in a CS unit, which operates as follows:
Λ(r+1, z) = γ(r) ⋯ γ(r′+1) · δ_0(r) & B(r′, z)
It computes the compensation coefficient δ_c and passes it to the PE units at the appropriate time. FIG. 4 shows the four-degree parallel CS circuit, which consists of 5 two-way selectors, 1 multiplier, and 7 registers. The three selectors M1, M2, and M3 are cascaded; the register output is buffered by three register stages and then either sent to D4 or, after passing through the multiplier, used as one input of selector M4. In each clock cycle, the data in D3, D4, or Dm can be selected by a selector and sent to the leftmost register; that is, the CS architecture has three feedback loops, L3, L4, and Lm. Because the channel parallelism of this architecture is 4, each iteration takes 8 clock cycles. The proposed CS unit has three update modes for cyclically generating and transferring the relevant polynomial coefficients.
Mode 1 applies when k(r) ≥ 0 and flag(r) ≥ 0; its purpose is to shift the current coefficients one position to the left and receive a new overflow coefficient after the iteration is complete. Specifically, the current coefficients first traverse the 4-stage L4 feedback loop once, so that they return to their original positions after 4 cycles; from the 5th clock cycle they traverse the 3-stage L3 feedback loop while the values of D5 and Dm are kept unchanged; and in the 8th clock cycle the new overflow coefficient θ_f is loaded into the rightmost register.
Mode 2 applies when k(r) < 0; its purpose is to multiply all current coefficients by γ(r) and shift them one position to the right. Specifically, the current coefficients traverse the 5-stage Lm feedback loop during the first 4 clock cycles, and from the 5th cycle all data traverse the L4 feedback loop.
Mode 3 applies when k(r) ≥ 0 and flag(r) < 0; its effect is that all coefficients remain in their original positions after the iteration is complete. That is, the current coefficients traverse the L4 feedback loop twice in this iteration while the value of D5 is kept unchanged.
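The three update modes amount to a small decision rule on k(r) and flag(r), followed by a fixed 8-cycle loop schedule. The Python sketch below restates that rule. The function names, and the reading of Mode 1 as four L4 cycles, three L3 cycles, and one load cycle, are assumptions drawn from the description and not a cycle-accurate model of the CS circuit.

```python
def cs_update_mode(k, flag):
    """Select the CS update mode from k(r) and flag(r) as described in the text."""
    if k < 0:
        return 2                      # multiply by gamma(r), shift right
    return 1 if flag >= 0 else 3      # shift left and take overflow, or hold position

def cs_loop_schedule(mode):
    """Feedback loop assumed to be active in each of the 8 clock cycles of one iteration."""
    if mode == 1:
        # one pass of the 4-stage L4 loop, then the 3-stage L3 loop,
        # then the new overflow coefficient is loaded in cycle 8
        return ["L4"] * 4 + ["L3"] * 3 + ["load_overflow"]
    if mode == 2:
        # Lm loop during the first 4 cycles, L4 loop from cycle 5
        return ["Lm"] * 4 + ["L4"] * 4
    if mode == 3:
        # L4 loop twice, so every coefficient returns to its original register
        return ["L4"] * 8
    raise ValueError("mode must be 1, 2 or 3")
```

For instance, `cs_loop_schedule(cs_update_mode(k=0, flag=-1))` yields eight L4 cycles, the hold-in-place behaviour of Mode 3.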
In the four-degree parallel architecture, each iteration processes the 4 channels serially, so the coefficients Θ(x) of the product of the intermediate polynomial and the syndrome for the first channel are output in the 60th clock cycle after the start of each iteration, and the output for all 4 channels is complete in the 63rd clock cycle.
(3) After the key equation has been solved, the resulting error location polynomial and error value polynomial are sent to the CSEE module, in which the Chien search module computes the roots of the error location polynomial and the Forney algorithm module computes each error value. To suit a high-speed RS decoder, the CSEE module must also be pipelined. Because of the feedback loop, the coefficients of α must be adjusted in the pipelined CSEE architecture to correct the timing. FIG. 5 shows the architecture of the four-degree parallel pipelined Chien search module, the circuit that implements the Chien search function. In the figure, $\lambda_j$, $j \in [1, 8]$, denotes the coefficient of the degree-j term of the error location polynomial. In the first clock cycle, the selection signals of the selectors in all 8 cells are set to 1, so that $\lambda_j$, $j \in [1, 8]$, is selected as the output and sent to the constant multipliers to complete the corresponding multiplications; the highest-power product in each cell is stored in the register and presented to the selector for the next clock cycle. In the second clock cycle, the selectors in all 8 cells are set to 0, and the highest-power product of the previous clock cycle is accumulated in the register. Continuing in this way, the result in the register is raised to successively higher powers in each iteration, so that all positions can be tested. The Forney algorithm module simply removes the C8 cell from this structure; the remaining parts are essentially the same. The critical path of the CSEE module is 3Txor + Tmux, where Txor and Tmux are the delays of one XOR gate and one multiplexer, respectively. There is a latency of 3 cycles before the first received codeword is output, and 64 clock cycles are needed to compute all error values.
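The register-and-constant-multiplier scheme of the Chien search can be mimicked in software. In the sketch below each cell j starts from λ_j, produces q products per cycle, and feeds the highest-power product back into its register. The field polynomial 0x11D, the primitive element α = 2, the test order α^1, α^2, …, and all function names are assumptions for the example; pipelining and the Forney error-value path are left out.

```python
PRIM = 0x11D   # assumed field polynomial x^8 + x^4 + x^3 + x^2 + 1
ALPHA = 2      # assumed primitive element of GF(2^8)

def gf_mul(a, b):
    """Multiply two GF(2^8) elements by shift-and-XOR."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM
        b >>= 1
    return r

def gf_pow(a, e):
    """Raise a nonzero element to the power e (exponent reduced mod 255)."""
    r = 1
    for _ in range(e % 255):
        r = gf_mul(r, a)
    return r

def locator_from_positions(positions):
    """Build Lambda(z) = product of (1 + alpha^p z), lowest-degree coefficient first."""
    lam = [1]
    for p in positions:
        x = gf_pow(ALPHA, p)
        nxt = lam + [0]
        for i, c in enumerate(lam):
            nxt[i + 1] ^= gf_mul(c, x)
        lam = nxt
    return lam

def chien_parallel(lam, n=255, q=4):
    """Return exponents l with Lambda(alpha^l) == 0, testing q exponents per cycle."""
    t = len(lam) - 1
    regs = list(lam[1:])                      # cell j starts from lambda_j (selector = 1)
    roots, cycles = [], 0
    for base in range(0, n, q):               # one loop pass models one clock cycle
        for m in range(1, q + 1):             # q constant-multiplier outputs per cell
            val = lam[0]
            for j in range(1, t + 1):
                val ^= gf_mul(regs[j - 1], gf_pow(ALPHA, j * m))
            if val == 0 and base + m <= n:
                roots.append(base + m)
        # feed the highest-power product back into each register (selector = 0)
        regs = [gf_mul(regs[j - 1], gf_pow(ALPHA, j * q)) for j in range(1, t + 1)]
        cycles += 1
    return roots, cycles
```

A root found at exponent l corresponds to error position (255 − l) mod 255 under these conventions. With n = 255 and q = 4 the search takes 64 passes, in line with the 64 clock cycles stated above.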
FIG. 6 shows the proposed 16-channel four-degree parallel RS-FEC architecture, which comprises four 4-channel multi-degree parallel RS decoders. The SC module and the CSEE module in each sub-decoder have a parallelism of 4. Syndrome computation takes 64 clock cycles. The KES architecture begins to output the coefficients of the error location polynomial and error evaluation polynomial of the first channel in the 124th clock cycle and sends them to the CSEE module; the output of all error location and error evaluation polynomial coefficients is complete after 127 clock cycles.
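The cycle counts quoted in the description follow from a few lines of arithmetic. The sketch below reproduces them for RS(255,239) with q = 4. The function name and dictionary keys are hypothetical, and the KES figure assumes 2t iterations of 8 clock cycles each, as stated for the CS unit.

```python
from math import ceil

def cycle_budget(n=255, k=239, q=4, cycles_per_iteration=8, csee_latency=3):
    """Per-module clock-cycle counts implied by the description (illustrative only)."""
    t = (n - k) // 2                           # error correction capability
    return {
        "t": t,
        "sc": ceil(n / q),                     # q symbols per cycle in the SC module
        "kes": 2 * t * cycles_per_iteration,   # 2t iterations, 8 cycles each
        "csee": ceil(n / q),                   # q positions per cycle in the CSEE module
        "csee_first_output": csee_latency,     # latency before the first output
    }
```

With the defaults this gives t = 8, 64 cycles for SC, 128 for KES, and 64 for CSEE. The 128-cycle KES figure agrees with the statement that the last coefficients are output around cycle 127 if cycles are counted from 0; that numbering convention is an assumption.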

Claims (4)

1. A forward error correction decoding decoder based on burst error detection, comprising: the system comprises a syndrome calculation SC module, a key equation solving KES module and a chien search and error estimation CSEE module, wherein a syndrome calculated by the SC module is output to the KES module, an error position polynomial Λ (X) and an error estimation polynomial Ω (X) calculated by the KES module are output to the chien search and error estimation CSEE module, and all dislocation positions and error values are calculated by the CSEE module; the SC module adopts the syndrome calculation formula as follows to calculate the syndrome:
si=(…(rn-1αi(q-1)+rn-2αi(q-2)+…+rn-q+1αi+rn-qiq+…+rqiq+rq-1αi(q-1)+rq-2αi(q-2)+…+r1αi+r0, (2)
wherein $\alpha^{i}$ denotes a root of the symbol generator polynomial, i = 1, 2, 3, …, 2t (the index range is also given as 0 ≤ i ≤ 2t−1); 2t is the number of syndromes $s_i$; n is the length of the processed symbol block; $r_0, r_1, \ldots, r_{n-1}$ denote the coefficients of the received codeword polynomial; and q is the parallelism factor.
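As a small worked instance of formula (2), assume purely for illustration that n = 4 and q = 2 (these values are not taken from the patent). The nested form then expands to the ordinary syndrome sum:

$$s_i = \bigl(r_3\alpha^{i} + r_2\bigr)\alpha^{2i} + r_1\alpha^{i} + r_0 = r_3\alpha^{3i} + r_2\alpha^{2i} + r_1\alpha^{i} + r_0 = \sum_{j=0}^{3} r_j\,\alpha^{ij}.$$

Each parenthesised group corresponds to the q symbols consumed in one clock cycle, and the multiplication by $\alpha^{iq}$ is the single feedback multiplication per cycle.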
2. The forward error correction decoder based on burst error detection as claimed in claim 1, wherein the SC module requires a total of 2t(q+1) constant multipliers, 2t adders and 2t registers to calculate all 2t syndromes; the initial value of each register is set to 0; in the first clock cycle, the product of 0 and $\alpha^{iq}$ is still 0, and adding it to the terms on the left gives the value $r_{q-1}\alpha^{i(q-1)}+\cdots+r_1\alpha^{i}+r_0$, which is stored in the register; in the second clock cycle, the products are summed to give $(r_{q-1}\alpha^{i(q-1)}+\cdots+r_1\alpha^{i}+r_0)\alpha^{iq}+r_{2q-1}\alpha^{i(q-1)}+\cdots+r_{q+1}\alpha^{i}+r_q$, and this new sum of products is stored back in the register; continuing in this way, the symbols of each lane are input in order from the most significant to the least significant, each iteration takes one clock cycle, and q symbols are processed in parallel per clock cycle; after n/q cycles the register holds the final syndrome value; the critical path of the SC module is 3Txor, where Txor is the delay of one XOR gate; and the calculated syndromes are sent to the KES module.
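The register update in this claim can be modelled in a few lines. The following is a minimal sketch under stated assumptions: GF(2^8) with primitive polynomial 0x11D, q = 4, a 255-symbol word zero-padded to 256 symbols, and the hypothetical function names `syndrome_parallel` and `syndrome_direct`. It follows the symbol ordering of formula (2), with the highest-index symbols entering first, and checks the nested q-parallel recursion against the direct sum.

```python
import random

# GF(2^8) log/antilog tables. The primitive polynomial 0x11D is an assumption.
PRIM = 0x11D
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _k in range(255):
    EXP[_k] = EXP[_k + 255] = _x
    LOG[_x] = _k
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM


def gf_mul(a, b):
    """Multiply two GF(2^8) elements using the log tables."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def syndrome_direct(r, i):
    """Reference value: s_i = sum_j r[j] * alpha^(i*j)."""
    s = 0
    for j, c in enumerate(r):
        s ^= gf_mul(c, EXP[(i * j) % 255])
    return s


def syndrome_parallel(r, i, q=4):
    """q symbols per cycle: reg <- reg * alpha^(i*q) + weighted sum of the new group."""
    n = len(r)
    assert n % q == 0, "pad the high end with zeros so that q divides n"
    reg = 0  # register starts at 0
    for cycle in range(n // q):
        hi = n - 1 - cycle * q  # highest symbol index in this group
        group = 0
        for m in range(q):
            # symbol r[hi - m] is weighted by alpha^(i*(q-1-m))
            group ^= gf_mul(r[hi - m], EXP[(i * (q - 1 - m)) % 255])
        reg = gf_mul(reg, EXP[(i * q) % 255]) ^ group
    return reg


rng = random.Random(0)
received = [rng.randrange(256) for _ in range(255)] + [0]  # zero-padded to 256 symbols
syndromes = [syndrome_parallel(received, i) for i in range(1, 17)]
print(syndromes == [syndrome_direct(received, i) for i in range(1, 17)])
```

With 256 padded symbols and q = 4, the loop body runs 64 times per syndrome, in line with the n/q cycle count stated in the claim.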
3. The forward error correction decoder of claim 1, wherein the KES structure comprises a controller, t+1 PE1 processing units (numbered 0 to t), t PE4 processing units (numbered t+1 to 2t), and a compensation stage (CS) unit; the t+1 PE1 processing units, $\mathrm{PE1}_0$ to $\mathrm{PE1}_t$, are arranged in ascending order of index, and $\mathrm{PE1}_i$ feeds its output iteration signal $\delta_i(r)$ to $\mathrm{PE1}_{i-1}$, where i = 1, 2, …, t+1; the t PE4 processing units, $\mathrm{PE4}_{t+1}$ to $\mathrm{PE4}_{2t}$, are arranged in ascending order of index, $\mathrm{PE4}_i$ feeds its output iteration data signal $\delta_i(r)$ to $\mathrm{PE4}_{i-1}$, and $\mathrm{PE4}_{i-1}$ feeds its output intermediate variable signal $\theta_{i-1}(r)$ to $\mathrm{PE4}_i$, where i = t+1, t+2, …, 2t; $\mathrm{PE1}_0$ outputs its iteration data signal $\delta_i(r)$ to the controller; $\mathrm{PE1}_t$ receives as input the iteration data $\delta_{i+1}(r)$ output by $\mathrm{PE4}_{t+1}$; and the iteration data input of $\mathrm{PE4}_{2t}$ is 0;
each PE contains 2 pipelined Galois field multipliers, 1 Galois field adder, 2-to-1 selectors and 15 latches, and PE4 has 3 more selectors than PE1, as shown in FIG. 2. The upper and lower rows of latches in each PE, [Figure FDA0002239451350000011] and [Figure FDA0002239451350000012], store the polynomials Δ(x) and Θ(x), respectively; the subscript k is the index of the PE and the superscript is the channel index. The latch [Figure FDA0002239451350000013] receives the result of the adder, buffers it, and sends it out as the iteration data signal $\delta_i(r)$ output by the processing unit PE; at the same time, the iteration data signal $\delta_i(r)$ is fed back to the data selector controlled by the data selection signal $\mathrm{MC2}_{s,i}(r)$. The latch [Figure FDA0002239451350000014] receives the output of the data selector controlled by the selection signal $\mathrm{MC1}_{s,i}(r)$, buffers it, and sends it out as the intermediate variable signal $\theta_{i-1}(r)$ output by the processing circuit; at the same time, the intermediate variable signal is fed back to the data selector;
the controller controls the selectors in PE4; it uses the data selection signal $\mathrm{MC2}_{s,i}(r)$ to mark the processing units PE that hold coefficients of Λ(r, z), and stores the signal $\mathrm{MC2}_{s,i}(r)$ in an internal register block; the controller receives the iteration output signal $\delta_i(r)$ provided by processing unit PE1 and also sends the iteration data signal $\delta_i(r)$ out as an output signal; the data selection signal $\mathrm{MC1}_{s,i}(r)$ output by the controller, the iteration data signal $\delta_i(r)$ and the intermediate variable signal $\theta_{i-1}(r)$ together serve as the input signals of processing units $\mathrm{PE1}_0$ to $\mathrm{PE1}_t$; the data selection signals $\mathrm{MC2}_{s,i}(r)$ and $\mathrm{MC3}_{s,i}(r)$ output by the controller serve as the input signals of processing units $\mathrm{PE4}_{t+1}$ to $\mathrm{PE4}_{2t}$; the iteration count r starts from 1 and is incremented by 1 in each iteration until 2t iterations have been completed; when r = 2t, that is, after 2t iterations, processing units $\mathrm{PE1}_0$ to $\mathrm{PE1}_t$ output the error evaluator polynomial coefficients and processing units $\mathrm{PE4}_{t+1}$ to $\mathrm{PE4}_{2t}$ output the error location polynomial coefficients;
in the mCS-RiBM algorithm, the overflow coefficients are stored by means of a CS unit, which operates according to
Λ(r+1, z) = γ(r)·…·γ(r′+1)·δ_0(r) & B(r′, z) to calculate the compensation coefficient δ_c and pass it to the PE units at the appropriate time, where γ(r) represents the error probability, r′ represents the higher-order portion of r, δ_0(r) represents the 0th iteration data signal, and B(r′, z) represents the intermediate polynomial;
the CS unit consists of 5 two-input selectors, 1 multiplier and 7 registers, wherein the three selectors M1, M2 and M3 are cascaded, and their output is buffered by three register stages and then either sent to register D4 or, after passing through the multiplier, used as one input of selector M4. In each clock cycle, the data in register D3, D4 or DM is sent through a selector to the leftmost register; that is, the CS architecture contains three feedback loops, named L3, L4 and Lm, which provide three update modes for cyclically generating and passing on the relevant polynomial coefficients. Mode 1 applies when k(r) ≥ 0 and flag(r) ≥ 0, where k(r) and flag(r) denote, respectively, the initial value in the r-th iteration and the position of the first coefficient of Λ(r, z); its purpose is to shift the current coefficients one position to the left and to accept a new overflow coefficient once the current iteration is complete. Mode 2 applies when k(r) < 0; its purpose is to multiply all current coefficients by the error probability γ(r) and shift them one position to the right. Mode 3 applies when k(r) ≥ 0 and flag(r) < 0; its effect is that all coefficients keep their original positions after the iteration is complete, that is, the current coefficients pass through the L4 feedback loop twice in this iteration while the value of D5 is kept unchanged.
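For orientation, the neighbour-to-neighbour data flow of claim 3 (each unit passing δ to the unit with the next lower index over 2t iterations) can be compared with the standard RiBM recursion that mCS-RiBM builds on. The sketch below is the plain RiBM update with 3t+1 cells, not the patent's 2t+1-unit mCS-RiBM structure; the compensation stage is not modelled. It assumes GF(2^8) with primitive polynomial 0x11D, and the names `ribm`, `syndromes_of_errors` and `poly_eval` are hypothetical.

```python
# GF(2^8) log/antilog tables. The primitive polynomial 0x11D is an assumption.
PRIM = 0x11D
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _k in range(255):
    EXP[_k] = EXP[_k + 255] = _x
    LOG[_x] = _k
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM


def gf_mul(a, b):
    """Multiply two GF(2^8) elements using the log tables."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def poly_eval(p, x):
    """Evaluate a polynomial (p[j] is the x^j coefficient) by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, x) ^ c
    return acc


def syndromes_of_errors(errors, t):
    """S_i = sum of val * alpha^(pos*i) for i = 1..2t, from a {position: value} map."""
    out = []
    for i in range(1, 2 * t + 1):
        s = 0
        for pos, val in errors.items():
            s ^= gf_mul(val, EXP[(pos * i) % 255])
        out.append(s)
    return out


def ribm(synd, t):
    """Standard RiBM recursion; returns the t+1 error locator coefficients."""
    delta = list(synd) + [0] * t + [1]  # cells 0..3t
    theta = list(delta)
    gamma, k = 1, 0
    for _ in range(2 * t):
        d0 = delta[0]
        shifted = delta[1:] + [0]  # delta_{i+1}(r), taken from the right-hand neighbour
        new_delta = [gf_mul(gamma, shifted[i]) ^ gf_mul(d0, theta[i])
                     for i in range(3 * t + 1)]
        if d0 != 0 and k >= 0:
            theta, gamma, k = shifted, d0, -k - 1
        else:
            k += 1
        delta = new_delta
    return delta[t:2 * t + 1]


errors = {5: 0x1D, 77: 0xA3, 200: 0x01}
lam = ribm(syndromes_of_errors(errors, 8), 8)
# Lambda has a root at alpha^(-pos) for every error position pos.
found = [pos for pos in range(255) if poly_eval(lam, EXP[(255 - pos) % 255]) == 0]
print(found)
```

The locator returned here is a scaled version of the usual monic-constant form, which leaves its roots unchanged.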
4. The forward error correction decoder based on burst error detection as claimed in claim 1, wherein the CSEE module is composed of 8 cells, 1 register, 1 two-to-one data selector and 32 adders; the 8 cells are used to calculate the error positions and error values, and each cell includes 1 selector, 1 register and 1 multiplier; $\lambda_j$, $j \in [1,8]$, denotes the coefficient of the degree-$j$ term of the error location polynomial; in the first clock cycle, the selection signals of the selectors in all 8 cells are set to 1, so that $\lambda_j$, $j \in [1,8]$, is selected as the output and sent to the constant multipliers for the corresponding multiplications; then, on the one hand, the products obtained in each cell are summed to decide whether any of the 4 positions under test is an error position and, on the other hand, the highest-power product in each cell is stored in a register and presented to the selector in the next clock cycle; in the second clock cycle, the selection signals in all 8 cells are set to 0, so that the highest-power product from the previous clock cycle, held in the register, is selected; continuing in this way, the register contents are raised to a higher power in each iteration until all positions have been checked; the Forney algorithm module removes the C8 cell from this structure and, after performing the same operations, outputs the error value at each error position, which is added to the corresponding transmitted codeword.
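The error-value step of this claim can be illustrated with the textbook Forney formula. This is a minimal sketch, not the patented datapath: it assumes GF(2^8) with primitive polynomial 0x11D, syndromes taken at α^1 to α^{2t}, and an evaluator defined as Ω(x) = S(x)Λ(x) mod x^{2t}; all function names are hypothetical. Under those assumptions the error value at a root x of Λ is Ω(x)/Λ'(x).

```python
# GF(2^8) log/antilog tables. The primitive polynomial 0x11D is an assumption.
PRIM = 0x11D
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _k in range(255):
    EXP[_k] = EXP[_k + 255] = _x
    LOG[_x] = _k
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM


def gf_mul(a, b):
    """Multiply two GF(2^8) elements using the log tables."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a nonzero element."""
    return EXP[255 - LOG[a]]


def poly_mul(a, b):
    """Polynomial product; index j holds the x^j coefficient."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            out[i + j] ^= gf_mul(ca, cb)
    return out


def poly_eval(p, x):
    """Evaluate a polynomial by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, x) ^ c
    return acc


def forney(lam, omega, root):
    """Error value Omega(root) / Lambda'(root) at a root of Lambda."""
    # In characteristic 2 the formal derivative keeps only the odd-degree terms.
    deriv = [lam[j] if j % 2 == 1 else 0 for j in range(1, len(lam))]
    return gf_mul(poly_eval(omega, root), gf_inv(poly_eval(deriv, root)))


t = 8
errors = {5: 0x1D, 77: 0xA3, 200: 0x01}  # illustrative {position: value} pairs

# Syndromes S_1..S_2t of the error pattern, used as coefficients of S(x).
synd = []
for i in range(1, 2 * t + 1):
    s = 0
    for pos, val in errors.items():
        s ^= gf_mul(val, EXP[(pos * i) % 255])
    synd.append(s)

# Locator built directly from the known positions, so only Forney is exercised.
lam = [1]
for pos in errors:
    lam = poly_mul(lam, [1, EXP[pos]])
omega = poly_mul(synd, lam)[:2 * t]  # Omega = S * Lambda mod x^(2t)

recovered = {pos: forney(lam, omega, EXP[(255 - pos) % 255]) for pos in errors}
print(recovered == errors)
```

In a full decoder the roots passed to `forney` would come from the Chien search rather than from known positions.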
CN201910994927.4A 2019-10-18 2019-10-18 Forward error correction decoding decoder based on burst error detection Pending CN110971244A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910994927.4A CN110971244A (en) 2019-10-18 2019-10-18 Forward error correction decoding decoder based on burst error detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910994927.4A CN110971244A (en) 2019-10-18 2019-10-18 Forward error correction decoding decoder based on burst error detection

Publications (1)

Publication Number Publication Date
CN110971244A true CN110971244A (en) 2020-04-07

Family

ID=70029771

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910994927.4A Pending CN110971244A (en) 2019-10-18 2019-10-18 Forward error correction decoding decoder based on burst error detection

Country Status (1)

Country Link
CN (1) CN110971244A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112436842A (en) * 2021-01-27 2021-03-02 睿迪纳(南京)电子科技有限公司 Method for realizing signal processing device based on fractional folding
CN113395137A (en) * 2021-06-08 2021-09-14 龙迅半导体(合肥)股份有限公司 FEC encoding and decoding module
CN117200809A (en) * 2023-11-06 2023-12-08 浙江大学 Low-power-consumption Chien search and error estimation circuit for RS code for correcting two error codes

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030192007A1 (en) * 2001-04-19 2003-10-09 Miller David H. Code-programmable field-programmable architecturally-systolic Reed-Solomon BCH error correction decoder integrated circuit and error correction decoding method
US20090024902A1 (en) * 2007-06-04 2009-01-22 Samsung Electronics Co., Ltd. Multi-channel error correction coder architecture using embedded memory
CN102970049A (en) * 2012-10-26 2013-03-13 北京邮电大学 Parallel circuit based on chien search algorithm and forney algorithm and RS decoding circuit
CN108768407A (en) * 2018-04-23 2018-11-06 天津大学 A kind of Hard decision decoding device framework of low hardware cost, high-throughput

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEI LU et al.: "The Design of an RS Decoder Based on the mCS-RiBM Algorithm for 100 Gb/s Optical Communication Systems" *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112436842A (en) * 2021-01-27 2021-03-02 睿迪纳(南京)电子科技有限公司 Method for realizing signal processing device based on fractional folding
CN113395137A (en) * 2021-06-08 2021-09-14 龙迅半导体(合肥)股份有限公司 FEC encoding and decoding module
CN113395137B (en) * 2021-06-08 2023-04-25 龙迅半导体(合肥)股份有限公司 FEC coding and decoding module
CN117200809A (en) * 2023-11-06 2023-12-08 浙江大学 Low-power-consumption Chien search and error estimation circuit for RS code for correcting two error codes
CN117200809B (en) * 2023-11-06 2024-04-12 浙江大学 Low-power-consumption Chien search and error estimation circuit for RS code for correcting two error codes

Similar Documents

Publication Publication Date Title
CN107241106B (en) Deep learning-based polar code decoding algorithm
Lee High-speed VLSI architecture for parallel Reed-Solomon decoder
US20060059409A1 (en) Reed-solomon decoder systems for high speed communication and data storage applications
CN110971244A (en) Forward error correction decoding decoder based on burst error detection
US7941734B2 (en) Method and apparatus for decoding shortened BCH codes or reed-solomon codes
Lee et al. A high-speed low-complexity concatenated BCH decoder architecture for 100 Gb/s optical communications
Xie et al. Fast nested key equation solvers for generalized integrated interleaved decoder
Chen et al. Area efficient parallel decoder architecture for long BCH codes
CN108768407A (en) A kind of Hard decision decoding device framework of low hardware cost, high-throughput
KR101094574B1 (en) APPARATUS FOR PERFORMING THE HIGH-SPEED LOW-COMPELEXITY PIPELINED BERLEKAMP-MASSEY ALGORITHM OF BCH decoder AND METHOD THEREOF
Zhu et al. Factorization-free low-complexity Chase soft-decision decoding of Reed-Solomon codes
Li et al. Unified architecture for Reed-Solomon decoder combined with burst-error correction
Liu et al. Area-efficient Reed–Solomon decoder using recursive Berlekamp–Massey architecture for optical communication systems
Tan et al. Area-efficient pipelined vlsi architecture for polar decoder
KR100756424B1 (en) An Area-Efficient Reed-Solomon Decoder using Pipelined Recursive Technique
Zhang et al. Modified low-complexity Chase soft-decision decoder of Reed–Solomon codes
KR100963015B1 (en) method and circuit for error location Processing of the discrepancy-computationless Reformulated inversionless Berlekamp-Massey algorithm for high-speed low-complexity BCH decoder
WO2010054526A1 (en) Rs decoding device and key multinomial solving device used by rs decoding device
Zhu et al. Efficient Reed-Solomon decoder with adaptive error-correcting capability
CN113395138B (en) PC-SCMA joint iterative detection decoding method based on deep learning
Ji et al. 16-channel two-parallel Reed-Solomon based forward error correction architecture for optical communications
CN112671415B (en) Product code-oriented high throughput coding method
Lee et al. 100-Gb/s three-parallel Reed-Solomon based foward error correction architecture for optical communications
Shanmugam VHDL Implementation of Reed-Solomon FEC architecture for high-speed optical communications
Hui et al. Design and Implementation of RS (1023, 847) Parallel Decoder

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200407