CN109586733A

CN109586733A - An LDPC-BCH decoding method based on a graphics processor

Info

Publication number: CN109586733A
Application number: CN201811403301.3A
Authority: CN
Inventors: 刘永鑫; 赵明; 张秀军
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2018-11-23
Filing date: 2018-11-23
Publication date: 2019-04-05
Anticipated expiration: 2038-11-23
Also published as: CN109586733B

Abstract

The present invention relates to an LDPC-BCH decoding method based on a graphics processor and belongs to the field of communication technology. First, the check matrices for the different code lengths and code rates are converted into quasi-cyclic form; the quasi-cyclic check matrix is then compressed and subjected to inter-row and intra-row interleaving. Iterative decoding uses the correctness of BCH decoding as the iteration stopping criterion, which eliminates the error floor and improves the error-correcting performance of the decoder. Specifically, the method comprises: matrix interleaving of the parity part of the variable-node extrinsic information of the input codewords; allocation of computing resources; parallel updating of the posterior log-likelihood values of the variable nodes and of the log-likelihood values passed from check nodes to variable nodes; hard decision; and termination of iterative decoding when BCH decoding is correct or the maximum number of iterations is reached, followed by output of the information bits and a decoding-success flag. The decoder implemented by the present invention has a delay on the order of milliseconds and a decoding throughput on the order of 100 Mbps, and its error-correcting performance is comparable to that recommended by the second-generation digital satellite broadcasting standard.

Description

An LDPC-BCH decoding method based on a graphics processor

Technical field

The present invention relates to an LDPC-BCH decoding method based on a graphics processor, and in particular to a graphics-processor-based LDPC-BCH decoding method for the second-generation digital satellite broadcasting standard (DVB-S2). It belongs to the field of communication technology.

Background technique

DVB-S2 is a new-generation DVB system for broadband satellite applications. Compared with DVB-S, DVB-S2 uses concatenated LDPC and BCH codes as its channel coding scheme and supports multiple code rates, including 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 5/6, 8/9 and 9/10. The LDPC code brings decoding performance close to the Shannon limit, and the BCH code eliminates the error floor. Under the same transmission conditions, DVB-S2 increases transmission capacity by about 30% or more, and at the same spectral efficiency it achieves more robust reception. In interactive point-to-point applications DVB-S2 uses VCM and ACM techniques, so that different service types (such as SDTV, HDTV, audio and multimedia) can be transmitted hierarchically with different levels of error protection, which greatly improves transmission efficiency. When VCM is combined with a return channel, adaptive coding and modulation (ACM) can be realized, so that the transmission parameters are optimized for the path conditions of each user.

The graphics processor (GPU) is a general-purpose processor with a single-instruction, multiple-thread (SIMT) architecture and massively parallel processing capability that has developed rapidly in recent years. The NVIDIA GTX 1080 Ti, a current mainstream graphics processor, contains 3584 arithmetic cores and delivers up to 10 TFLOPS of single-precision floating-point performance. Compared with CPUs, ARM processors and DSPs, which use a single-instruction, multiple-data (SIMD) architecture, the GPU offers higher computing capability; compared with FPGA and ASIC hardware, it is more flexible to configure and easier to program.

Compute Unified Device Architecture (CUDA) is a development environment for GPU computing. It is a new hardware and software architecture that treats the GPU as a device for data-parallel computation and allocates and manages the computations performed on it. Under CUDA, computations no longer have to be mapped onto a graphics API (OpenGL or Direct3D) as in earlier so-called GPGPU architectures, which greatly lowers the entry barrier for developers.

A search of the prior art found Chinese patent application No. 201610955524.5, entitled "A high-speed adaptive DVB-S2 LDPC decoder and decoding method based on FPGA", which discloses an FPGA-based DVB-S2 LDPC decoder. FPGA-based decoders all use the Min-Sum decoding algorithm, represent likelihood values in 8-bit fixed point, and do not use the BCH code as a termination condition during iteration; they raise decoding throughput at the cost of decoding performance. In addition, an FPGA implementation must deal with issues such as timing and resource allocation, so its complexity is higher, its development cycle is longer, and its generality and reconfigurability are limited.

Existing DVB-S2 LDPC decoders based on graphics processors also have significant limitations. For example, the decoder in (G. Falcao, J. Andrade, V. Silva, and L. Sousa, "GPU-based DVB-S2 LDPC decoder with high throughput and fast error floor detection," Electronics Letters, vol. 47, no. 9, pp. 542-543, Apr. 28, 2011) uses the Min-Sum decoding algorithm, represents likelihood values in 8-bit fixed point, and does not use the BCH code as a termination condition during iteration, raising decoding throughput at the cost of decoding performance. The decoder in (D. Kun, "High throughput GPU LDPC encoder and decoder for DVB-S2," in 2018 IEEE Aerospace Conference, 2018, pp. 1-9) raises GPU utilization by decoding 8000 frames simultaneously and reaches a throughput of 500-1000 Mbps, but this greatly increases decoding delay and GPU memory consumption, which makes it unsuitable for practical communication systems. That decoder also uses the Min-Sum decoding algorithm and likewise does not use the BCH code as a termination condition during iteration, so it also suffers a loss in error-correcting performance.

Summary of the invention

The purpose of the present invention is to propose an LDPC-BCH decoding method based on a graphics processor. The method uses a novel thread allocation scheme that fully exploits the parallelism within a codeword, reduces function-call overhead and improves the L1 and L2 cache hit rates, thereby raising decoding throughput and reducing decoding delay. In addition, each iteration uses correct BCH decoding as the iteration termination condition, which eliminates the error floor and improves the error-correcting performance of decoding.

The LDPC-BCH decoding method based on a graphics processor proposed by the present invention comprises the following steps:

(1) Reconstruct the check matrix H′_{(q×z)×(n×z)} of the LDPC code to obtain a check matrix H_{(q×z)×(n×z)} with quasi-cyclic structure. The reconstruction steps are as follows:

(1-1) Interleave the rows of the check matrix H′_{(q×z)×(n×z)} of the LDPC code. The row interleaving is a matrix interleaving with parameter z × q and yields a temporary matrix. Here z is the repetition factor of the LDPC code; the rows of H′_{(q×z)×(n×z)} are divided into q groups of z rows each, and its columns into n groups of z columns each; (q × z) is the number of parity bits of the LDPC code, and (n × z) is the total number of bits of the LDPC code, that is, the sum of the number of parity bits and the number of information bits. The matrix interleaving with parameter z × q sets row a of the temporary matrix, 0 ≤ a ≤ (q × z)-1, equal to a row of H′_{(q×z)×(n×z)} selected by an index expression involving mod(a, z) [formula omitted in source], where mod(a, z) is the remainder of a divided by z;

(1-2) Apply column interleaving, with interleaving parameter z × q, to the last (q × z) columns of the temporary matrix obtained in step (1-1), to obtain a quasi-cyclic matrix H_{(q×z)×(n×z)};

(2) Compress the quasi-cyclic matrix H_{(q×z)×(n×z)} of step (1) to obtain a temporary compressed check matrix D′_q×dmax, where dmax is the maximum row weight of the quasi-cyclic matrix H_{(q×z)×(n×z)}. The compression method is as follows:

(2-1) Divide the quasi-cyclic matrix H_{(q×z)×(n×z)} into q × n submatrices and denote the divided matrix as D_q×n. The elements of D_q×n are d_ij; each element represents one submatrix, and submatrix d_ij is either an all-zero matrix or a circulant matrix of dimension z × z; i is the row index of D_q×n, 0 ≤ i ≤ q-1, and j is the column index of D_q×n, 0 ≤ j ≤ n-1;

(2-2) Compress the matrix D_q×n to obtain a temporary compressed check matrix D′_q×dmax whose elements are (x_mk, y_mk), where x_mk is the column index of the k-th non-zero submatrix in row m of D_q×n and y_mk is the cyclic shift of that submatrix; m is the row index of D′_q×dmax, 0 ≤ m ≤ q-1, and k is its column index, 0 ≤ k ≤ dmax-1. When the column index k is greater than d_m-1, the element (x_mk, y_mk) of D′_q×dmax is set to (-1, -1), where d_m is the number of non-zero submatrices in row m of D_q×n;

(3) Apply inter-row interleaving and intra-row interleaving to the temporary compressed check matrix D′_q×dmax obtained in step (2) to obtain the compressed check matrix M_q×dmax;

(4) Using the compressed check matrix M_q×dmax obtained in step (3), decode the LDPC-BCH codewords received by the graphics processor from the channel, comprising the following steps:

(4-1) Apply matrix interleaving, with interleaving parameter z × q, to the parity-bit log-likelihood values of the N codewords received by the graphics processor from the channel, to obtain the posterior log-likelihood values of the variable nodes of the N codewords, indexed by g and x, where g is the codeword index, 0 ≤ g ≤ N-1, x is the bit index within the codeword, 0 ≤ x ≤ n × z-1, and (n × z) is the bit length of the codeword;

(4-2) At initialization, set the maximum number of sub-iterations P and the maximum number of iterations L, set the iteration count u = 0, and set the log-likelihood values R^g(m, v, k) passed from the check nodes of the LDPC code to the variable nodes to 0, where m is the row index of M_q×dmax, 0 ≤ m ≤ q-1, v is the row index within a submatrix, 0 ≤ v ≤ z-1, and k is the column index of M_q×dmax, 0 ≤ k ≤ dmax-1;

(4-3) According to the number N of codewords received by the graphics processor from the channel, the compressed check matrix M_q×dmax of the LDPC code and the maximum number of sub-iterations P, divide the computing resources of the graphics processor into N × q × P thread blocks and denote the three-dimensional index of a thread block as (g, m, p), where g is the codeword index, 0 ≤ g ≤ N-1, m is the row index of M_q×dmax, 0 ≤ m ≤ q-1, and p is the sub-iteration index;

(4-4) Allocate z threads to each thread block of step (4-3) in the graphics processor, giving N × q × P × z threads in total;

(4-5) According to the compressed check matrix M_q×dmax of step (3), in the thread blocks of step (4-3), update in parallel the posterior log-likelihood values of the variable nodes and the log-likelihood values R^g(m, v, k) passed from check nodes to variable nodes. There are two methods for the parallel update.

The first method uses the sum-product decoding method of the LDPC code to update the posterior log-likelihood values and R^g(m, v, k). The process is as follows:

(4-5-1) Compute the log-likelihood values Q(k) passed from the variable nodes to the check node [formula omitted in source], where 0 ≤ k ≤ d_m-1, d_m is the number of elements in row m of the compressed check matrix M_q×dmax that are not equal to (-1, -1), and (b_mk, e_mk) is the element in row m, column k of M_q×dmax;

(4-5-2) Compute the first temporary variable S according to the sum-product decoding formula of the LDPC code [formula omitted in source], where ∏ denotes the product operator;

(4-5-3) Compute the second temporary variable Q_sum according to the sum-product decoding formula of the LDPC code [formula omitted in source];

(4-5-4) Compute the updated log-likelihood values passed from the check node to the variable nodes according to the sum-product decoding formula of the LDPC code [formula omitted in source];

(4-5-5) Using the log-likelihood values updated in step (4-5-4), update the posterior log-likelihood values of the variable nodes with an atomic add operation [formula omitted in source];

(4-5-6) Using the log-likelihood values updated in step (4-5-4), update R^g(m, v, k) [formula omitted in source];

The second method performs the following parallel computation in the thread blocks of step (4-3), according to the compressed check matrix M_q×dmax obtained in step (3), to update the posterior log-likelihood values and R^g(m, v, k), 0 ≤ k ≤ d_m-1. The process is as follows:

(4-5-7) Compute the log-likelihood values Q(k) passed from the variable nodes to the check node [formula omitted in source], where 0 ≤ k ≤ d_m-1, d_m is the number of elements in row m of the compressed check matrix M_q×dmax that are not equal to (-1, -1), and (b_mk, e_mk) is the element in row m, column k of M_q×dmax;

(4-5-8) According to the offset min-sum decoding formula of the LDPC code, compute the minimum value min0 and the second-smallest value min1 of |Q(k)|, and the index k_min corresponding to the minimum value;

(4-5-9) According to the offset min-sum decoding formula of the LDPC code, update min0 and min1: min0 = max(min0 - β, 0), min1 = max(min1 - β, 0), where β is the decoding offset in the offset min-sum decoding formula of the LDPC code;

(4-5-10) According to the offset min-sum decoding formula of the LDPC code, compute the first temporary variable S [formula omitted in source], where ∏ denotes the product operator;

(4-5-11) According to the offset min-sum decoding formula of the LDPC code, compute the updated log-likelihood values passed from the check node to the variable nodes [formula omitted in source];

(4-5-12) Using the log-likelihood values updated in step (4-5-11), update the posterior log-likelihood values of the variable nodes with an atomic add operation [formula omitted in source];

(4-5-13) Using the log-likelihood values updated in step (4-5-11), update R^g(m, v, k) [formula omitted in source];

(4-6) According to the number N of codewords received by the graphics processor from the channel and the codeword length (n × z), re-divide the computing resources of the graphics processor into N × n thread blocks, each divided into z threads. The two-dimensional index of a thread block is denoted (g, h), where g is the codeword index and h is the column index of the matrix D_q×n of step (2-1); the thread index is denoted v;

(4-7) In the thread blocks allocated in step (4-6), perform the hard decision in parallel on the posterior log-likelihood values updated in step (4-5) to obtain the hard-decision bits at bit index x = h × z + v [formula omitted in source];

(4-8) Using the Berlekamp or Euclidean algorithm together with the Chien search algorithm, perform BCH decoding on the LDPC information bits among the hard-decision bits of step (4-7) to obtain the number of errors corrected by the BCH code, and evaluate it: if the number of corrected errors is greater than t-1, codeword g is judged to have failed decoding; if it is less than or equal to t-1, codeword g is judged to have been decoded successfully, where t is the maximum number of errors the BCH code can correct;

(4-9) From the results of step (4-8), count the codewords among the N codewords that failed decoding. If this count equals 0, decoding is judged successful: the decoding result is obtained, a decoding-success flag is given, and the LDPC-BCH decoding process is complete. If the count is greater than 0, check the iteration count u: if u is greater than or equal to the maximum number of iterations L, the LDPC-BCH decoding process is complete, with a decoding-failure flag given for each codeword that failed and a decoding-success flag for each codeword that succeeded; if u is less than L, set u = u + P and return to step (4-3).
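The matrix interleaving with parameter z × q used in steps (1-1), (1-2) and (4-1) can be illustrated with a minimal Python sketch. The index formula is not legible in this text, so the sketch assumes the common block-interleaver convention in which output position a reads input position mod(a, z) × q + ⌊a / z⌋. The function names are hypothetical and the formula should be checked against the granted patent text before it is relied on.

```python
def matrix_interleave_indices(z, q):
    """Return perm such that out[a] = inp[perm[a]] for a z-by-q block interleaver.

    Assumption (not confirmed by the source text):
    perm[a] = (a mod z) * q + floor(a / z).
    """
    return [(a % z) * q + a // z for a in range(q * z)]


def interleave(seq, z, q):
    """Apply the assumed interleaver to a sequence of length q * z."""
    perm = matrix_interleave_indices(z, q)
    return [seq[p] for p in perm]


# Toy sizes for readability; the embodiment uses z = 360 and q = 25.
z, q = 3, 2
rows = ["r0", "r1", "r2", "r3", "r4", "r5"]
grouped = interleave(rows, z, q)
print(grouped)  # ['r0', 'r2', 'r4', 'r1', 'r3', 'r5']
```

Under this assumption, rows that are q apart in the original matrix end up adjacent, in q groups of z rows each, which is the grouping that step (1-1) describes.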

The advantages of the LDPC-BCH decoding method based on a graphics processor proposed by the present invention are as follows:

As the embodiment shown in Fig. 2 and Fig. 3 illustrates, because the BCH code is used as the outer code, the decoding method eliminates the error floor. Compared with the error-correcting performance curves of the second-generation digital satellite broadcasting standard LDPC code realized in (D. Kun, "High throughput GPU LDPC encoder and decoder for DVB-S2," in 2018 IEEE Aerospace Conference, 2018, pp. 1-9), the sum-product decoding realized by the method of the present invention improves performance by 0.2 dB, and the offset min-sum decoding realized by the present invention improves performance by 0.0-0.1 dB. Moreover, the decoder realized in that reference must decode 8000 codewords simultaneously, with a decoding latency of 99.7 to 2592 ms depending on the coding mode. At BER = 10^-7 and a maximum of 50 iterations, the offset min-sum decoder realized by the present invention has a delay of 1.1-3.5 ms and a decoding throughput of 120.8-303.7 Mbps, with an error-correcting performance loss of 0.1-0.2 dB relative to the sum-product decoder realized by the present invention. The sum-product decoder realized by the present invention has a delay of 1.1-8.1 ms and a throughput of 115.5-127.5 Mbps, and its error-correcting performance is comparable to that recommended by the second-generation digital satellite broadcasting standard. Because the method uses an early-termination criterion, the actual number of iterations decreases as the signal-to-noise ratio increases, so the decoding throughput increases accordingly.

Brief description of the drawings

Fig. 1 is a flow diagram of the LDPC-BCH decoding method based on a graphics processor proposed by the present invention.

Fig. 2 and Fig. 3 are error-correcting performance curves of the DVB-S2 LDPC-BCH decoder implemented according to the method of the present invention.

Detailed description of the embodiments

The flow of the LDPC-BCH decoding method based on a graphics processor proposed by the present invention is shown in Fig. 1 and comprises the following steps:

(2-2) Compress the matrix D_q×n to obtain a temporary compressed check matrix D′_q×dmax whose elements are (x_mk, y_mk), where x_mk is the column index of the k-th non-zero submatrix in row m of D_q×n and y_mk is the cyclic shift of that submatrix; m is the row index of D′_q×dmax, 0 ≤ m ≤ q-1, and k is its column index, 0 ≤ k ≤ dmax-1. When the column index k is greater than d_m-1, the element (x_mk, y_mk) of D′_q×dmax is set to (-1, -1), where d_m is the number of non-zero submatrices in row m of D_q×n;

(4-5-1) Compute the log-likelihood values Q(k) passed from the variable nodes to the check node [formula omitted in source], where 0 ≤ k ≤ d_m-1, d_m is the number of elements in row m of the compressed check matrix M_q×dmax that are not equal to (-1, -1), and (b_mk, e_mk) is the element in row m, column k of M_q×dmax;

(4-5-6) Using the log-likelihood values updated in step (4-5-4), update R^g(m, v, k) [formula omitted in source];

(4-5-7) Compute the log-likelihood values Q(k) passed from the variable nodes to the check node [formula omitted in source], where 0 ≤ k ≤ d_m-1, d_m is the number of elements in row m of the compressed check matrix M_q×dmax that are not equal to (-1, -1), and (b_mk, e_mk) is the element in row m, column k of M_q×dmax;

(4-5-10) According to the offset min-sum decoding formula of the LDPC code, compute the first temporary variable S [formula omitted in source], where ∏ denotes the product operator;

(4-6) According to the number N of codewords received by the graphics processor from the channel and the codeword length (n × z), re-divide the computing resources of the graphics processor, such as stream processors, shared memory and registers, into N × n thread blocks, each divided into z threads. The two-dimensional index of a thread block is denoted (g, h), where g is the codeword index and h is the column index of the matrix D_q×n of step (2-1); the thread index is denoted v;

An embodiment of the method of the present invention is described below:

Take the DVB-S2 LDPC-BCH code with a code length of 16200 bits and a code rate of 4/9 as an example. The information length of this code is 7200 bits and the parity length is 9000 bits; the repetition factor of the LDPC code is z = 360, with q = 25 row groups and n = 45 column groups. The BCH code has a code length of 7200 bits, an information length of 7032 bits and an error-correcting capability of t = 12. The LDPC code and the BCH code are serially concatenated, with the LDPC code as the inner code and the BCH code as the outer code;

The graphics processor used in this embodiment is an NVIDIA GTX 1080 Ti, which contains 3584 stream processors and has a single-precision floating-point capability of 10 TFLOPS;
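The code parameters quoted for this embodiment can be cross-checked with a few lines of arithmetic. The following Python sketch only restates the numbers given above; the variable names are illustrative and do not come from the source.

```python
# Parameters quoted in the embodiment (variable names are illustrative).
z = 360               # repetition factor (size of each circulant submatrix)
q = 25                # number of row groups
n_groups = 45         # number of column groups (called n in the step descriptions)
code_len = 16200      # LDPC code length in bits
ldpc_info_len = 7200  # LDPC information length, equal to the BCH code length
bch_info_len = 7032   # BCH information length
t = 12                # BCH error-correcting capability

parity_len = q * z                          # LDPC parity bits
bch_parity = ldpc_info_len - bch_info_len   # BCH parity bits
rate = ldpc_info_len / code_len             # LDPC code rate

print(parity_len, n_groups * z, bch_parity, round(rate, 4))
```

The products q × z = 9000 and n × z = 16200 match the stated parity length and code length, and 7200 / 16200 equals the stated rate of 4/9.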

The method comprises the following steps:

(1-1) Interleave the rows of the check matrix H′_{(q×z)×(n×z)} of the LDPC code. The row interleaving is a matrix interleaving with parameter z × q and yields a temporary matrix. Here z is the repetition factor of the LDPC code, with z = 360; the rows of H′_{(q×z)×(n×z)} are divided into q groups of z rows each, with q = 25, and its columns into n groups of z columns each, with n = 45; (q × z) is the number of parity bits of the LDPC code, and (n × z) is the total number of bits of the LDPC code, that is, the sum of the number of parity bits and the number of information bits. The matrix interleaving with parameter z × q sets row a of the temporary matrix, 0 ≤ a ≤ (q × z)-1, equal to a row of H′_{(q×z)×(n×z)} selected by an index expression involving mod(a, z) [formula omitted in source], where mod(a, z) is the remainder of a divided by z;

(2-1) Divide the quasi-cyclic matrix H_{(q×z)×(n×z)} into q × n submatrices and denote the divided matrix as D_q×n. The elements of D_q×n are d_ij; each element represents one submatrix of dimension z × z, and submatrix d_ij is either an all-zero matrix or a circulant matrix; i is the row index of D_q×n, 0 ≤ i ≤ q-1, and j is the column index of D_q×n, 0 ≤ j ≤ n-1. The block structure of D_q×n is:

D_q×n=[A B],

where the definitions of A and B are not reproduced in this text [matrices omitted in source]. One submatrix differs from a standard circulant submatrix in that a single 1 is missing from its upper-right corner, which requires special handling during decoding; I₁₂₅ denotes a circulant matrix with a cyclic offset of 125.

(2-2) Compress the matrix D_q×n to obtain a temporary compressed check matrix D′_q×dmax whose elements are (x_mk, y_mk), where x_mk is the column index of the k-th non-zero submatrix in row m of D_q×n and y_mk is the cyclic shift of that submatrix; m is the row index of D′_q×dmax, 0 ≤ m ≤ q-1, and k is its column index, 0 ≤ k ≤ dmax-1. When the column index k is greater than d_m-1, the element (x_mk, y_mk) of D′_q×dmax is set to (-1, -1), where d_m is the number of non-zero submatrices in row m of D_q×n. D′_q×dmax is shown in Table 1;

m	k=0	k=1	k=2	k=3	k=4	k=5	k=6
0	(5,0)	(9,206)	(19,222)	(20,0)	(44,359)	(-1,-1)	(-1,-1)
1	(1,125)	(2,323)	(2,132)	(6,0)	(20,0)	(21,0)	(-1,-1)
2	(3,248)	(3,217)	(4,112)	(7,0)	(14,45)	(21,0)	(22,0)
3	(1,107)	(4,280)	(8,0)	(17,239)	(22,0)	(23,0)	(-1,-1)
4	(0,106)	(9,0)	(23,0)	(24,0)	(-1,-1)	(-1,-1)	(-1,-1)
5	(6,246)	(10,0)	(13,237)	(24,0)	(25,0)	(-1,-1)	(-1,-1)
6	(11,0)	(13,176)	(25,0)	(26,0)	(-1,-1)	(-1,-1)	(-1,-1)
7	(2,220)	(12,0)	(18,318)	(26,0)	(27,0)	(-1,-1)	(-1,-1)
8	(0,154)	(8,314)	(13,0)	(14,175)	(27,0)	(28,0)	(-1,-1)
9	(5,83)	(14,0)	(15,205)	(28,0)	(29,0)	(-1,-1)	(-1,-1)
10	(4,313)	(15,0)	(16,3)	(29,0)	(30,0)	(-1,-1)	(-1,-1)
11	(0,265)	(0,198)	(16,0)	(19,64)	(30,0)	(31,0)	(-1,-1)
12	(0,332)	(0,318)	(7,352)	(17,0)	(31,0)	(32,0)	(-1,-1)
13	(2,263)	(4,310)	(18,0)	(18,121)	(32,0)	(33,0)	(-1,-1)
14	(1,237)	(8,223)	(17,330)	(19,0)	(33,0)	(34,0)	(-1,-1)
15	(2,233)	(4,155)	(10,349)	(34,0)	(35,0)	(-1,-1)	(-1,-1)
16	(3,317)	(6,358)	(35,0)	(36,0)	(-1,-1)	(-1,-1)	(-1,-1)
17	(3,174)	(4,171)	(11,302)	(12,271)	(36,0)	(37,0)	(-1,-1)
18	(1,259)	(2,213)	(15,86)	(37,0)	(38,0)	(-1,-1)	(-1,-1)
19	(2,350)	(7,93)	(38,0)	(39,0)	(-1,-1)	(-1,-1)	(-1,-1)
20	(0,0)	(0,159)	(3,180)	(12,48)	(39,0)	(40,0)	(-1,-1)
21	(1,0)	(5,199)	(16,161)	(40,0)	(41,0)	(-1,-1)	(-1,-1)
22	(1,168)	(2,0)	(4,101)	(9,184)	(41,0)	(42,0)	(-1,-1)
23	(1,131)	(1,267)	(3,0)	(42,0)	(43,0)	(-1,-1)	(-1,-1)
24	(3,148)	(3,183)	(4,0)	(10,124)	(11,199)	(43,0)	(44,0)

Table 1
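The compression of step (2-2) can be sketched in Python as follows. The input representation is an assumption made for illustration: each block of D_q×n is given as a list of cyclic shifts, with an empty list for an all-zero block. A list is used because some rows of Table 1 contain two entries with the same column index, such as (2,323) and (2,132). The function name is hypothetical.

```python
def compress_block_matrix(blocks):
    """Compress a block matrix into rows of (column, shift) pairs.

    blocks[m][j] is a list of cyclic shifts of the circulants in block (m, j);
    an empty list marks an all-zero block (assumed representation).
    Rows are padded with (-1, -1) up to the maximum row weight dmax.
    """
    rows = [
        [(j, shift) for j, shifts in enumerate(row) for shift in shifts]
        for row in blocks
    ]
    dmax = max(len(r) for r in rows)
    return [r + [(-1, -1)] * (dmax - len(r)) for r in rows]


# Toy 2 x 4 block matrix, not taken from the patent.
blocks = [
    [[0], [], [5], []],
    [[], [3, 7], [], [0]],
]
compressed = compress_block_matrix(blocks)
for row in compressed:
    print(row)
```

For the toy input the first row has two non-zero blocks and receives one (-1, -1) padding entry, while the second row has three entries and defines dmax = 3.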

(3) Apply inter-row interleaving and intra-row interleaving to the temporary compressed check matrix D′_q×dmax obtained in step (2) to obtain the compressed check matrix M_q×dmax, which is shown in Table 2;

m	k=0	k=1	k=2	k=3	k=4	k=5	k=6
0	(5,0)	(19,222)	(44,359)	(9,206)	(20,0)	(-1,-1)	(-1,-1)
1	(8,314)	(14,175)	(28,0)	(13,0)	(27,0)	(0,154)	(-1,-1)
2	(35,0)	(3,317)	(36,0)	(6,358)	(-1,-1)	(-1,-1)	(-1,-1)
3	(18,121)	(33,0)	(4,310)	(32,0)	(2,263)	(18,0)	(-1,-1)
4	(0,106)	(23,0)	(9,0)	(24,0)	(-1,-1)	(-1,-1)	(-1,-1)
5	(5,83)	(15,205)	(29,0)	(14,0)	(28,0)	(-1,-1)	(-1,-1)
6	(44,0)	(3,183)	(10,124)	(43,0)	(3,148)	(4,0)	(11,199)
7	(1,125)	(2,132)	(20,0)	(2,323)	(6,0)	(21,0)	(-1,-1)
8	(15,0)	(29,0)	(4,313)	(16,3)	(30,0)	(-1,-1)	(-1,-1)
9	(3,180)	(39,0)	(0,0)	(12,48)	(40,0)	(0,159)	(-1,-1)
10	(24,0)	(6,246)	(13,237)	(25,0)	(10,0)	(-1,-1)	(-1,-1)
11	(33,0)	(1,237)	(17,330)	(34,0)	(8,223)	(19,0)	(-1,-1)
12	(13,176)	(26,0)	(25,0)	(11,0)	(-1,-1)	(-1,-1)	(-1,-1)
13	(22,0)	(3,217)	(7,0)	(21,0)	(3,248)	(4,112)	(14,45)
14	(1,0)	(16,161)	(41,0)	(5,199)	(40,0)	(-1,-1)	(-1,-1)
15	(0,318)	(17,0)	(32,0)	(7,352)	(31,0)	(0,332)	(-1,-1)
16	(11,302)	(36,0)	(3,174)	(12,271)	(37,0)	(4,171)	(-1,-1)
17	(19,64)	(31,0)	(0,198)	(30,0)	(0,265)	(16,0)	(-1,-1)
18	(38,0)	(2,213)	(37,0)	(1,259)	(15,86)	(-1,-1)	(-1,-1)
19	(23,0)	(4,280)	(17,239)	(1,107)	(8,0)	(22,0)	(-1,-1)
20	(12,0)	(26,0)	(2,220)	(18,318)	(27,0)	(-1,-1)	(-1,-1)
21	(1,168)	(4,101)	(41,0)	(2,0)	(9,184)	(42,0)	(-1,-1)
22	(4,155)	(34,0)	(2,233)	(10,349)	(35,0)	(-1,-1)	(-1,-1)
23	(3,0)	(43,0)	(1,267)	(42,0)	(1,131)	(-1,-1)	(-1,-1)
24	(39,0)	(7,93)	(2,350)	(38,0)	(-1,-1)	(-1,-1)	(-1,-1)

Table 2
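The effect of the two interleaving stages of step (3) can be checked directly on the tabulated data: each row of Table 2 should hold the same (column, shift) entries as some row of Table 1, in a different order. The Python sketch below does this for the first three rows of Table 2 only; the helper name `source_row` is hypothetical, and the interleaving rule itself is not stated in this text and is not reconstructed here.

```python
# Rows 0, 8 and 16 of Table 1, with the (-1, -1) padding removed.
table1 = {
    0: [(5, 0), (9, 206), (19, 222), (20, 0), (44, 359)],
    8: [(0, 154), (8, 314), (13, 0), (14, 175), (27, 0), (28, 0)],
    16: [(3, 317), (6, 358), (35, 0), (36, 0)],
}
# Rows 0, 1 and 2 of Table 2, with the (-1, -1) padding removed.
table2 = {
    0: [(5, 0), (19, 222), (44, 359), (9, 206), (20, 0)],
    1: [(8, 314), (14, 175), (28, 0), (13, 0), (27, 0), (0, 154)],
    2: [(35, 0), (3, 317), (36, 0), (6, 358)],
}


def source_row(row, table):
    """Return the index of the row in `table` holding the same entries as `row`."""
    for idx, candidate in table.items():
        if sorted(candidate) == sorted(row):
            return idx
    return None


mapping = {m: source_row(r, table1) for m, r in table2.items()}
print(mapping)  # {0: 0, 1: 8, 2: 16}
```

Rows 0, 1 and 2 of Table 2 are thus reorderings of rows 0, 8 and 16 of Table 1: the inter-row interleaving moves whole rows and the intra-row interleaving reorders the valid entries within a row, leaving the (-1, -1) padding at the end.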

(4-5) According to the compressed check matrix M_q×dmax of step (3), in the thread blocks of step (4-3), update in parallel the posterior log-likelihood values of the variable nodes and the log-likelihood values R^g(m, v, k) passed from check nodes to variable nodes. There are two methods for the parallel update. The first is as follows:

Using the sum-product decoding method of the LDPC code, update the posterior log-likelihood values and R^g(m, v, k). The process is as follows:

(4-5-1) Compute the log-likelihood values Q(k) passed from the variable nodes to the check node [formula omitted in source], where 0 ≤ k ≤ d_m-1, d_m is the number of elements in row m of the compressed check matrix M_q×dmax that are not equal to (-1, -1), and (b_mk, e_mk) is the element in row m, column k of M_q×dmax;

(4-5-2) Compute the first temporary variable S [formula omitted in source];

(4-5-3) Compute the second temporary variable Q_sum [formula omitted in source];

(4-5-4) Compute the updated log-likelihood values passed from the check node to the variable nodes [formula omitted in source];

(4-5-5) Update the posterior log-likelihood values of the variable nodes with an atomic add operation [formula omitted in source];

(4-5-6) Update R^g(m, v, k) [formula omitted in source];
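Because the formulas of steps (4-5-1) to (4-5-6) are not legible in this text, the following Python sketch uses the textbook sum-product check-node update, with φ(x) = -ln(tanh(x/2)), as a stand-in. It is an assumption that the patent's formulas take this form. The sketch processes one check row serially, whereas the method runs z threads per thread block and applies the posterior update with atomic adds; all names are hypothetical.

```python
import math


def phi(x):
    """phi(x) = -ln(tanh(x / 2)), clamped to avoid log(0) and saturation."""
    x = min(max(x, 1e-9), 30.0)
    return -math.log(math.tanh(x / 2.0))


def sign(x):
    return -1.0 if x < 0 else 1.0


def sum_product_row_update(llr, r_old, cols):
    """Textbook sum-product update for one check row (assumed form).

    llr   : posterior log-likelihood values, updated in place
    r_old : previous check-to-variable messages of this row
    cols  : variable-node index of each edge of this row
    """
    q_msgs = [llr[c] - r for c, r in zip(cols, r_old)]      # cf. (4-5-1)
    s = 1.0
    for qk in q_msgs:                                       # cf. (4-5-2)
        s *= sign(qk)
    phis = [phi(abs(qk)) for qk in q_msgs]
    q_sum = sum(phis)                                       # cf. (4-5-3)
    r_new = [s * sign(qk) * phi(q_sum - p)                  # cf. (4-5-4)
             for qk, p in zip(q_msgs, phis)]
    for c, rn, ro in zip(cols, r_new, r_old):               # cf. (4-5-5); atomic add on a GPU
        llr[c] += rn - ro
    return r_new                                            # cf. (4-5-6)


llr_post = [2.0, -1.5, 3.0]
r_new = sum_product_row_update(llr_post, [0.0, 0.0, 0.0], [0, 1, 2])
print(r_new, llr_post)
```

The posterior values are updated by the difference between the new and old check-to-variable messages, which is why an atomic add is sufficient when several check rows touch the same variable node concurrently.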

The second is as follows:

Using the offset min-sum decoding method of the LDPC code, update the posterior log-likelihood values and R^g(m, v, k). The process is as follows:

(4-5-7) Compute the log-likelihood values Q(k) passed from the variable nodes to the check node [formula omitted in source], where 0 ≤ k ≤ d_m-1, d_m is the number of elements in row m of the compressed check matrix M_q×dmax that are not equal to (-1, -1), and (b_mk, e_mk) is the element in row m, column k of M_q×dmax;

(4-5-8) Compute the minimum value min0 and the second-smallest value min1 of |Q(k)|, 0 ≤ k ≤ d_m-1, and the index k_min corresponding to the minimum value;

(4-5-9) Recompute min0 and min1 according to the decoding offset β of the LDPC code: min0 = max(min0 - β, 0), min1 = max(min1 - β, 0);

(4-5-10) Compute the first temporary variable S [formula omitted in source];

(4-5-11) Compute the updated log-likelihood values passed from the check node to the variable nodes [formula omitted in source];

(4-5-12) Update the posterior log-likelihood values of the variable nodes with an atomic add operation [formula omitted in source];

(4-5-13) Update R^g(m, v, k) [formula omitted in source];
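Steps (4-5-7) to (4-5-13) can be sketched in Python for a single check row. Steps (4-5-8) and (4-5-9) follow the text directly; the forms of Q(k), S and the output message are not legible here, so the sketch assumes the usual offset min-sum rule in which each edge receives the sign product of the other edges times the offset-corrected minimum of the other magnitudes. Names are hypothetical, the row is assumed to have at least two edges, and the code is serial rather than one thread per submatrix row.

```python
def offset_min_sum_row_update(llr, r_old, cols, beta):
    """Offset min-sum update for one check row (assumed textbook form).

    llr   : posterior log-likelihood values, updated in place
    r_old : previous check-to-variable messages of this row
    cols  : variable-node index of each edge (at least two edges)
    beta  : decoding offset
    """
    q_msgs = [llr[c] - r for c, r in zip(cols, r_old)]        # cf. (4-5-7)
    mags = [abs(qk) for qk in q_msgs]
    k_min = min(range(len(mags)), key=mags.__getitem__)       # (4-5-8)
    min0 = mags[k_min]
    min1 = min(m for k, m in enumerate(mags) if k != k_min)
    min0 = max(min0 - beta, 0.0)                              # (4-5-9)
    min1 = max(min1 - beta, 0.0)
    s = 1.0
    for qk in q_msgs:                                         # cf. (4-5-10)
        s *= -1.0 if qk < 0 else 1.0
    r_new = []
    for k, qk in enumerate(q_msgs):                           # cf. (4-5-11)
        mag = min1 if k == k_min else min0
        r_new.append(s * (-1.0 if qk < 0 else 1.0) * mag)
    for c, rn, ro in zip(cols, r_new, r_old):                 # cf. (4-5-12); atomic add on a GPU
        llr[c] += rn - ro
    return r_new                                              # cf. (4-5-13)


llr_post = [2.0, -1.5, 3.0]
r_new = offset_min_sum_row_update(llr_post, [0.0, 0.0, 0.0], [0, 1, 2], beta=0.5)
print(r_new)     # [-1.0, 1.5, -1.0]
print(llr_post)  # [1.0, 0.0, 2.0]
```

Only min0, min1 and k_min need to be kept per row, which avoids the logarithm and hyperbolic tangent of the sum-product update and is consistent with the shorter delay the text reports for the offset min-sum decoder.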

(4-7) In the thread blocks allocated in step (4-6), perform the hard decision in parallel on the posterior log-likelihood values updated in step (4-5) to obtain the hard-decision bits at bit index x = h × z + v [formula omitted in source];

(4-8) Using the Berlekamp or Euclidean algorithm together with the Chien search algorithm, perform BCH decoding on the LDPC information bits among the hard-decision bits of step (4-7) to obtain the number of errors corrected by the BCH code, and evaluate it: if the number of corrected errors is greater than t-1, codeword g is judged to have failed decoding; if it is less than or equal to t-1, codeword g is judged to have been decoded successfully, where t is the maximum number of errors the BCH code can correct, t = 12;

(4-9) From the results of step (4-8), count the codewords among the N codewords that failed decoding. If this count equals 0, decoding is successful: the decoding result is obtained, a decoding-success flag is given, and the LDPC-BCH decoding process is complete. If the count is greater than 0, check the iteration count u: if u is greater than or equal to the maximum number of iterations L, the decoding result is obtained, with a decoding-failure flag given for each codeword that failed and a decoding-success flag for each codeword that succeeded, and the LDPC-BCH decoding process is complete; if u is less than L, set u = u + P and return to step (4-3);
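The control flow of steps (4-2) to (4-9) can be summarized in a short Python sketch. The LDPC sub-iterations and the BCH decoder are represented by hypothetical callables (`run_sub_iterations`, `bch_decode`) that stand in for the GPU kernels; only the termination logic follows the text.

```python
def decode_batch(num_codewords, run_sub_iterations, bch_decode, P, L_max, t):
    """Outer loop with BCH-based early termination.

    run_sub_iterations(P): hypothetical callable running P LDPC sub-iterations
                           and the hard decision on all codewords
    bch_decode(g):         hypothetical callable returning the number of errors
                           the BCH decoder corrected in codeword g
    Returns the per-codeword success flags and the final iteration count u.
    """
    u = 0                                          # step (4-2)
    while True:
        run_sub_iterations(P)                      # steps (4-3) to (4-7)
        # Step (4-8): success when at most t - 1 errors were corrected.
        ok = [bch_decode(g) <= t - 1 for g in range(num_codewords)]
        if all(ok) or u >= L_max:                  # step (4-9)
            return ok, u
        u += P


# Stub usage: the reported error count drops by 10 after each round.
state = {"rounds": 0}


def fake_sub_iterations(P):
    state["rounds"] += 1


def fake_bch(g):
    return max(0, 30 - 10 * state["rounds"])


flags, u_final = decode_batch(2, fake_sub_iterations, fake_bch, P=5, L_max=50, t=12)
print(flags, u_final)  # [True, True] 5
```

Running P sub-iterations between BCH checks trades a coarser stopping granularity for fewer kernel launches and fewer BCH decoding passes.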

Fig. 2 and Fig. 3 show the error-correcting performance curves of the second-generation digital satellite broadcasting standard LDPC-BCH codes achieved by this decoding method; the code length is n = 64800 for the curves of Fig. 2 and n = 16200 for those of Fig. 3. The simulated channel is additive white Gaussian noise. For each code rate with code length 64800, at most 10⁷ codewords are simulated, and for each code rate with code length 16200, at most 4 × 10⁷ codewords are simulated; the number of codewords that fail decoding reaches at least 50 frames. Solid lines show the sum-product algorithm and dashed lines the min-sum algorithm.

Claims

1. a kind of LDPC-BCH interpretation method based on graphics processor, it is characterised in that the interpretation method the following steps are included:

(1) to the check matrix H of LDPC code '_{(q×z)×(n×z)}It is reconstructed, obtains the check matrix with quasi- cycle characteristics H_{(q×z)×(n×z)}, steps are as follows for reconstruct:

(1-1) to the check matrix H of LDPC code '_{(q×z)×(n×z)}Capable intertexture is carried out, row interleaving mode is matrix intersector, matrix intersector Parameter is z × q, obtains a provisional matrixWherein z indicates the repetition factor of LDPC code, by check matrix H′_{(q×z)×(n×z)}Row be divided into q group, every group includes z row, by check matrix H '_{(q×z)×(n×z)}Column be divided into n group, every group includes z Column, (q × z) indicate that the check bit bit number of LDPC code, (n × z) indicate the total bit number of LDPC code, check bit bit number and letter The sum of position bit number is ceased, wherein interleave parameter is the matrix intersector of z × q, matrix intersector method are as follows: makeWhereinIndicate provisional matrixA row, 0≤a≤(q × z) -1,Expression check matrix H '_{(q×z)×(n×z)}?Row, mod (a, z) indicate a divided by The remainder of z；

(1-2) Column-interleave the last (q × z) columns of the temporary matrix obtained in step (1-1), with interleaving parameter z × q, to obtain a quasi-cyclic matrix H_{(q×z)×(n×z)};
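
The matrix interleaving used in step (1) can be sketched as follows. This is only an illustration: the exact row-index formula is not reproduced in the text above, so the mapping used here (new index a is read from old index mod(a, z) × q + floor(a / z), the usual write-by-rows, read-by-columns block interleaver) is an assumption, and the function name `matrix_interleave` is hypothetical.

```python
def matrix_interleave(items, z, q):
    """Block ("matrix") interleaver with parameter z x q.

    ASSUMPTION: new position a takes the item at old position
    (a mod z) * q + (a // z). This gathers items that were spaced q apart
    into runs of z consecutive items. The patent's own formula may differ.
    """
    if len(items) != z * q:
        raise ValueError("expected exactly z * q items")
    return [items[(a % z) * q + a // z] for a in range(z * q)]


# Toy example: z = 3, q = 2, rows labelled by their old index.
old_rows = list(range(6))
new_rows = matrix_interleave(old_rows, z=3, q=2)
print(new_rows)  # old rows 0, 2, 4 form the first group; 1, 3, 5 the second
```

The same function could be applied to a list of matrix rows (step (1-1)) or to the last q × z column indices (step (1-2)).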

(2) Compress the quasi-cyclic matrix H_{(q×z)×(n×z)} of step (1) to obtain a temporary compressed check matrix D′_{q×dmax}, where dmax denotes the maximum row weight of the quasi-cyclic matrix H_{(q×z)×(n×z)}. The compression method is as follows:

(2-2) Compress the matrix D_{q×n} to obtain a temporary compressed check matrix D′_{q×dmax}. The elements of D′_{q×dmax} are (x_mk, y_mk), where x_mk denotes the column index of the k-th non-zero submatrix in row m of D_{q×n}, and y_mk denotes the cyclic shift of the k-th non-zero submatrix in row m of D_{q×n}; m is the row index of D′_{q×dmax}, 0 ≤ m ≤ q − 1; k is the column index of D′_{q×dmax}, 0 ≤ k ≤ dmax − 1. When the column index k is greater than d_m − 1, the element (x_mk, y_mk) of D′_{q×dmax} is set to (−1, −1), where d_m is the number of non-zero submatrices in row m of D_{q×n};
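
The compression of step (2-2) can be illustrated with a minimal sketch. It assumes that D_{q×n} stores one cyclic-shift value per z × z submatrix and marks an all-zero submatrix with −1; that encoding, and the name `compress_base_matrix`, are assumptions made for this example, since step (2-1) is not reproduced above.

```python
def compress_base_matrix(D, zero=-1):
    """Compress a q x n shift matrix into q rows of (column, shift) pairs.

    ASSUMPTION: D[m][h] is the cyclic shift of submatrix (m, h), and the
    value `zero` (-1 here) marks an all-zero submatrix.
    Rows shorter than dmax are padded with (-1, -1), as in step (2-2).
    """
    d = [sum(1 for s in row if s != zero) for row in D]  # d_m per row
    dmax = max(d)                                        # maximum row weight
    compressed = []
    for row in D:
        pairs = [(col, s) for col, s in enumerate(row) if s != zero]
        pairs += [(-1, -1)] * (dmax - len(pairs))
        compressed.append(pairs)
    return compressed, d


D = [[0, -1, 2, -1],
     [-1, 1, -1, -1]]
D_prime, d = compress_base_matrix(D)
print(D_prime)  # [[(0, 0), (2, 2)], [(1, 1), (-1, -1)]]
print(d)        # [2, 1]
```

Storing only (column, shift) pairs lets each check row be processed with dmax table lookups instead of scanning n submatrices.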

(3) Apply inter-row interleaving and intra-row interleaving to the temporary compressed check matrix D′_{q×dmax} obtained in step (2) to obtain the compressed check matrix M_{q×dmax};

(4) Using the compressed check matrix M_{q×dmax} obtained in step (3), decode the LDPC-BCH codewords that the graphics processor receives from the channel, comprising the following steps:

(4-1) Apply matrix interleaving, with interleaving parameter z × q, to the parity-bit log-likelihood values of the N codewords that the graphics processor receives from the channel, to obtain the posterior log-likelihood values of the variable nodes of the N codewords. Each posterior log-likelihood value is indexed by g and x, where g is the codeword index, 0 ≤ g ≤ N − 1, and x is the bit index within the codeword, 0 ≤ x ≤ n × z − 1; (n × z) denotes the bit length of a codeword;

(4-2) Initialization: set the maximum number of sub-iterations P and the maximum number of iterations L, set the iteration count u = 0, and set the log-likelihood values R^g(m, v, k) passed from the check nodes of the LDPC code to the variable nodes to 0, where m is the row index of M_{q×dmax}, 0 ≤ m ≤ q − 1; v is the row index within a submatrix, 0 ≤ v ≤ z − 1; and k is the column index of M_{q×dmax}, 0 ≤ k ≤ dmax − 1;

(4-3) According to the number N of codewords received by the graphics processor from the channel, the compressed check matrix M_{q×dmax} of the LDPC code, and the maximum number of sub-iterations P, divide the computing resources of the graphics processor into N × q × P thread blocks. The three-dimensional index of a thread block is denoted (g, m, p), where g is the codeword index, 0 ≤ g ≤ N − 1; m is the row index of the matrix M_{q×dmax}, 0 ≤ m ≤ q − 1; and p is the sub-iteration index;

(4-4) Assign z sub-threads to each thread block of step (4-3) in the graphics processor, giving N × q × P × z sub-threads in total;
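
The resource division of steps (4-3) and (4-4) amounts to a three-dimensional grid of thread blocks with z threads each. The sketch below only does the bookkeeping in Python; the function name `launch_shape` and the numeric values in the example (q = 90, z = 360, N = 2, P = 4) are illustrative assumptions, not values fixed by the claim.

```python
def launch_shape(N, q, P, z):
    """Return the grid shape, block size, and total thread count for
    N codewords, q check-row groups, P sub-iterations, z threads per block."""
    grid = (N, q, P)            # one thread block per index (g, m, p)
    threads_per_block = z       # one sub-thread per row v inside a submatrix
    total_threads = N * q * P * z
    return grid, threads_per_block, total_threads


grid, block, total = launch_shape(N=2, q=90, P=4, z=360)
print(grid, block, total)
```

In a CUDA kernel the same layout would typically be read back through the block index and thread index, with (g, m, p) taken from the block and v from the thread.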

(4-5) According to the compressed check matrix M_{q×dmax} of step (3), within the thread blocks of step (4-3), update in parallel the posterior log-likelihood values of the variable nodes and the log-likelihood values R^g(m, v, k) passed from the check nodes to the variable nodes. There are two methods for the parallel update computation.

The first method uses LDPC sum-product decoding to update the posterior log-likelihood values of the variable nodes and R^g(m, v, k), as follows:

(4-5-2) Compute the first temporary variable S according to the LDPC sum-product decoding formula, in which ∏ denotes the product operator;

(4-5-3) Compute the second temporary variable Q_sum according to the LDPC sum-product decoding formula;

(4-5-4) According to the LDPC sum-product decoding formula, compute the updated log-likelihood value passed from the check node to the variable node;

(4-5-5) According to the log-likelihood value updated in step (4-5-4), update the posterior log-likelihood value of the variable node using an atomic add operation;

(4-5-6) According to the log-likelihood value updated in step (4-5-4), update R^g(m, v, k);
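
For orientation, a sum-product check-node update can be sketched in its textbook tanh form. This is not claimed to be the exact S and Q_sum formulation of steps (4-5-2) and (4-5-3), whose formulas are not reproduced above; the function name `spa_check_update` and the clamping constant are assumptions of this sketch, and it runs sequentially where the graphics processor would use an atomic add.

```python
import math


def spa_check_update(Q, R_old):
    """One check node, textbook sum-product (tanh rule).

    Q     : posterior values of the variable nodes attached to this check
    R_old : previous check-to-variable messages on the same edges
    Returns (Q_new, R_new).
    """
    # Variable-to-check inputs: remove this check's previous contribution.
    q = [Qk - Rk for Qk, Rk in zip(Q, R_old)]
    t = [math.tanh(x / 2.0) for x in q]
    limit = 1.0 - 1e-12  # ASSUMED clamp that keeps atanh finite
    R_new = []
    for k in range(len(q)):
        prod = 1.0
        for j, tj in enumerate(t):
            if j != k:            # exclude the edge being updated
                prod *= tj
        prod = max(min(prod, limit), -limit)
        R_new.append(2.0 * math.atanh(prod))
    # Posterior update (an atomic add on the GPU, a plain add here).
    Q_new = [qk + rk for qk, rk in zip(q, R_new)]
    return Q_new, R_new


Q_new, R_new = spa_check_update([2.0, -1.0, 3.0], [0.0, 0.0, 0.0])
print(R_new)
```

The atomic add matters on the graphics processor because several check rows can touch the same variable node at once.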

Or the second method: according to the compressed check matrix M_{q×dmax} obtained in step (3), perform the following parallel computation in the thread blocks of step (4-3) to update the posterior log-likelihood values of the variable nodes and R^g(m, v, k), 0 ≤ k ≤ d_m − 1, as follows:

(4-5-9) Update min0 and min1 according to the LDPC offset min-sum decoding formula: min0 = max(min0 − β, 0), min1 = max(min1 − β, 0), where β is the decoding offset coefficient in the LDPC offset min-sum decoding formula;

(4-5-11) According to the LDPC offset min-sum decoding formula, compute the updated log-likelihood value passed from the check node to the variable node;
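
The offset min-sum update can be sketched for a single check node as follows. The offset step min0 = max(min0 − β, 0), min1 = max(min1 − β, 0) is taken from step (4-5-9); reading min0 and min1 as the smallest and second-smallest input magnitudes, and the rest of the routine, follow the standard offset min-sum algorithm and are assumptions here, as is the name `offset_min_sum_update`.

```python
def offset_min_sum_update(Q, R_old, beta):
    """One check node (degree >= 2), standard offset min-sum.

    ASSUMPTION: min0 / min1 are the smallest and second-smallest
    magnitudes of the variable-to-check inputs.
    """
    q = [Qk - Rk for Qk, Rk in zip(Q, R_old)]      # variable-to-check inputs
    mags = [abs(x) for x in q]
    idx0 = min(range(len(q)), key=lambda k: mags[k])
    min0 = mags[idx0]
    min1 = min(mags[k] for k in range(len(q)) if k != idx0)
    min0 = max(min0 - beta, 0.0)                   # offset correction
    min1 = max(min1 - beta, 0.0)
    sign_all = 1
    for x in q:                                    # product of all input signs
        if x < 0:
            sign_all = -sign_all
    R_new = []
    for k, x in enumerate(q):
        mag = min1 if k == idx0 else min0          # exclude own magnitude
        sign_k = sign_all * (-1 if x < 0 else 1)   # exclude own sign
        R_new.append(sign_k * mag)
    Q_new = [qk + rk for qk, rk in zip(q, R_new)]  # atomic add on the GPU
    return Q_new, R_new


Q_new, R_new = offset_min_sum_update([2.0, -1.0, 3.0], [0.0, 0.0, 0.0], beta=0.5)
print(R_new)  # [-0.5, 1.5, -0.5]
print(Q_new)  # [1.5, 0.5, 2.5]
```

Compared with the sum-product form, this needs only comparisons and subtractions, at the cost of an approximation that the offset β partly compensates.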

(4-6) According to the number N of codewords received by the graphics processor from the channel and the codeword length (n × z), re-divide the computing resources of the graphics processor into N × n thread blocks, each thread block being divided into z sub-threads. The two-dimensional index of a thread block is denoted (g, h), where g is the codeword index and h is the column index of the matrix D_{q×n} of step (2-1); the sub-thread index is denoted v;

(4-7) In the thread blocks allocated in step (4-6), perform a parallel hard-decision computation on the posterior log-likelihood values updated in step (4-5) to obtain the hard-decision bits;

(4-8) Using the Berlekamp algorithm or the Euclidean algorithm, together with the Chien search algorithm, perform BCH decoding on the LDPC information bits among the hard-decision bits of step (4-7) to obtain the number of errors corrected by the BCH code, and judge this number: if the number of corrected errors is greater than t − 1, the g-th codeword is judged to have failed decoding; if the number of corrected errors is less than or equal to t − 1, the g-th codeword is judged to have decoded successfully, where t is the maximum number of errors the BCH code can correct;
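
The hard decision of step (4-7) and the success test of step (4-8) can be sketched as below. The sign convention (a negative value maps to bit 1) is an assumption, the BCH decoder itself is not implemented, and both function names are hypothetical; `num_corrected` stands for the error count that a Berlekamp or Euclidean decoder with Chien search would report.

```python
def hard_decision(posterior):
    """ASSUMED convention: negative log-likelihood value -> bit 1, else bit 0."""
    return [1 if x < 0 else 0 for x in posterior]


def bch_decode_ok(num_corrected, t):
    """Success test of step (4-8): fail if more than t - 1 errors were corrected."""
    return num_corrected <= t - 1


bits = hard_decision([1.5, -0.2, 0.0, -3.0])
print(bits)                     # [0, 1, 0, 1]
print(bch_decode_ok(3, t=12))   # True
print(bch_decode_ok(12, t=12))  # False: exactly t corrections counts as failure
```

Treating exactly t corrections as a failure keeps one error of margin, which makes the BCH result a stricter stopping test for the LDPC iterations.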

(4-9) According to the judgment results of step (4-8), count the number of codewords among the N codewords that failed to decode, and judge this number. If the number of failed codewords equals 0, decoding is judged successful: output the decoding result, give a decoding-success flag, and end the LDPC-BCH decoding process. If the number of failed codewords is greater than 0, check the iteration count u. If u is greater than or equal to the maximum number of iterations L, end the LDPC-BCH decoding process, giving a decoding-failure flag for each codeword that failed and a decoding-success flag for each codeword that succeeded. If u is less than L, set u = u + P and return to step (4-3).
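
The iteration control of steps (4-3) to (4-9) can be summarized as a short loop. The callables `run_sub_iterations` and `bch_check` are hypothetical placeholders for the LDPC update of steps (4-3) to (4-5) and the hard decision plus BCH test of steps (4-6) to (4-8); only the stopping logic follows the text.

```python
def decode_batch(run_sub_iterations, bch_check, N, P, L):
    """Outer loop with BCH-based early termination.

    run_sub_iterations(P): performs P LDPC sub-iterations (hypothetical hook)
    bch_check(g)         : True if codeword g passes the BCH test (hypothetical hook)
    Returns (per-codeword success flags, final iteration count u).
    """
    u = 0                                       # step (4-2)
    while True:
        run_sub_iterations(P)                   # steps (4-3) to (4-5)
        ok = [bch_check(g) for g in range(N)]   # steps (4-6) to (4-8)
        if all(ok) or u >= L:                   # step (4-9): stop conditions
            return ok, u
        u += P                                  # otherwise iterate again


# Toy run: every codeword passes after the third batch of sub-iterations.
state = {"batches": 0}

def fake_update(P):
    state["batches"] += 1

flags, u = decode_batch(fake_update, lambda g: state["batches"] >= 3, N=4, P=5, L=50)
print(flags, u)  # [True, True, True, True] 10
```

Checking the BCH result only once every P sub-iterations trades a few possibly unnecessary LDPC iterations for fewer BCH decoding passes.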