CN113608219B

CN113608219B - Multichannel SAR-oriented azimuth uniform sampling realization system and method

Info

Publication number: CN113608219B
Application number: CN202110832424.4A
Authority: CN
Inventors: 李炳沂; 孙晗伟; 刘振; 赵勇; 王新民; 郭路鹏
Original assignee: Beijing Institute of Radio Measurement
Current assignee: Beijing Institute of Radio Measurement
Priority date: 2021-07-22
Filing date: 2021-07-22
Publication date: 2023-11-17
Anticipated expiration: 2041-07-22
Also published as: CN113608219A

Abstract

The invention relates to a system and a method for realizing uniform azimuth sampling for multichannel SAR. The system homogenizes the original received signal by inverse-filter reconstruction, and the matrix inversion part of the inverse filter implements fast inversion of a variable-order Vandermonde matrix based on Lagrange interpolation on an FPGA. The fast variable-order Vandermonde matrix inversion part comprises a first branch consisting of an element buffer module, a u matrix element calculation module and a v matrix element calculation module; a second branch consisting of a π′ vector and reciprocal calculation module; and a matrix multiplication module. According to the invention, the original received signal is reconstructed and homogenized by the inverse filter, realizing uniform azimuth sampling of the multichannel SAR. For the matrix inversion part of the inverse filtering, variable-order Vandermonde matrix inversion based on Lagrange interpolation is implemented on the FPGA, and the two branches operate in parallel, which improves operation efficiency and saves hardware resources, so that high-precision Vandermonde matrix inversion is achieved with low resource usage and low delay.

Description

Multichannel SAR-oriented azimuth uniform sampling realization system and method

Technical Field

The invention relates to the technical field of aerospace, and in particular to a system and method for realizing uniform azimuth sampling for multichannel SAR.

Background

In multichannel mode, a SAR transmits a single beam and receives multiple echoes simultaneously through multiple receivers. When a spaceborne SAR works in multichannel mode, a multiple-phase-center azimuth multi-beam scheme is adopted, which resolves the conflict between high azimuth resolution and a wide range swath in conventional single-channel SAR imaging while maintaining high azimuth resolution. A key problem of multichannel SAR is that the multichannel reception mechanism can lead to non-uniform azimuth sampling across channels.

Disclosure of Invention

The invention aims to solve the technical problems existing in the prior art and provides a system and a method for realizing uniform sampling of azimuth directions of a multichannel SAR.

To solve the above technical problem, an embodiment of the invention provides an FPGA implementation system for multichannel-SAR-oriented uniform azimuth sampling. The system homogenizes the sampling of the original received signal through an inverse-filter reconstruction preprocessing operation and recovers the azimuth spectrum of the original received signal; the matrix inversion part of the inverse-filter reconstruction preprocessing operation implements fast inversion of a variable-order Vandermonde matrix based on Lagrange interpolation on an FPGA;

the part that implements the fast inversion of the variable-order Vandermonde matrix based on Lagrange interpolation on the FPGA comprises: a first branch consisting of an element buffer module, a u matrix element calculation module and a v matrix element calculation module; a second branch consisting of a π′ vector and reciprocal calculation module; and a matrix multiplication module.

To solve the above technical problem, an embodiment of the invention also provides an FPGA implementation method for multichannel-SAR-oriented uniform azimuth sampling, comprising: homogenizing the sampling of the original received signal through an inverse-filter reconstruction preprocessing operation, and recovering the azimuth spectrum of the original received signal; the matrix inversion part of the inverse-filter reconstruction preprocessing operation implements fast inversion of a variable-order Vandermonde matrix based on Lagrange interpolation on an FPGA.

The beneficial effects of the invention are as follows: the sampling of the original received signal is homogenized through the inverse-filter reconstruction preprocessing operation and the azimuth spectrum of the signal is recovered, realizing uniform azimuth sampling of the multichannel SAR. For the inversion part of the multichannel inverse filter matrix of the spaceborne SAR, variable-order Vandermonde matrix inversion based on Lagrange interpolation is implemented on the FPGA: the u matrix elements and v matrix elements are calculated by a first branch, the π′ vector and its reciprocal are calculated by a second branch, and the two branches operate in parallel. This effectively improves operation efficiency and saves hardware resources, so that high-precision Vandermonde matrix inversion is finally achieved with low resource usage and low delay.

Additional aspects of the invention and advantages thereof will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

FIG. 1 is a block diagram of the inverse filter reconstruction part of a multichannel-SAR-oriented azimuth uniform sampling implementation system according to an embodiment of the present invention;

FIG. 2 is a dependency graph of the u and v matrix element calculations (n=5);

FIG. 3 is a block diagram of a first-type feedback column element calculator;

FIG. 4 shows the u matrix calculation steps and task schedule;

FIG. 5 is a flowchart of the in-place storage in the intermediate cache;

FIG. 6 shows the v matrix calculation structure and schedule;

FIG. 7 is a block diagram of the π′ vector calculation module;

FIG. 8 is a block diagram of the matrix multiplication module;

FIG. 9 is a block diagram of the delay-related pipelined accumulator design;

FIG. 10 is a timing diagram of an example of the delay-related pipelined accumulator (n=12, d_mul=4, d=28);

FIG. 11a is a graph of the test results of the 32nd-order accuracy analysis (max = 4.475×10⁻⁷);

FIG. 11b is a graph of the test results of the 128th-order accuracy analysis (max = 1.176×10⁻⁶).

Detailed Description

The principles and features of the present invention are described below with reference to the drawings. The examples are given only to illustrate the invention and are not to be construed as limiting its scope.

Because of unavoidable hardware and circuit errors and satellite attitude pointing errors during operation of the radar system, the amplitude and phase of the channels are inconsistent; that is, there are inter-channel amplitude and phase errors. In addition, the distance that the antenna of a multichannel SAR advances in one pulse period is generally not equal to half the antenna length, so multichannel azimuth sampling is non-uniform most of the time, which can cause range migration correction errors and focusing errors. The sampling of the original received signal therefore needs to be homogenized by a preprocessing operation so that the azimuth spectrum of the signal is recovered. The inverse-filter reconstruction preprocessing method can reconstruct the original signal and effectively solves the problem of non-uniform azimuth sampling.
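The uniformity condition can be illustrated numerically. The sketch below assumes the usual displaced-phase-center model, in which receive channel k, spaced dx apart along track, contributes an effective sample at half its offset from the transmitter. The function names and all parameter values are hypothetical and do not come from the source.

```python
def azimuth_sample_positions(n_channels, dx, v, prf, n_pulses):
    """Effective along-track sample positions in metres, sorted.

    Assumed model: pulse m is transmitted at position m * v / prf, and
    channel k adds an effective phase-center offset of k * dx / 2.
    """
    positions = []
    for m in range(n_pulses):
        for k in range(n_channels):
            positions.append(m * v / prf + k * dx / 2.0)
    return sorted(positions)


def sample_gaps(positions):
    """Spacing between consecutive effective samples."""
    return [b - a for a, b in zip(positions, positions[1:])]


def uniform_prf(n_channels, dx, v):
    """PRF at which the platform advances half the antenna length per pulse."""
    antenna_length = n_channels * dx
    return 2.0 * v / antenna_length


if __name__ == "__main__":
    n, dx, v = 3, 2.0, 6000.0  # hypothetical values
    for prf in (uniform_prf(n, dx, v), 1500.0):
        gaps = sample_gaps(azimuth_sample_positions(n, dx, v, prf, 4))
        print(prf, gaps)
```

At the PRF returned by `uniform_prf` all gaps are equal; at any other PRF the gaps alternate between different values, which is the non-uniform sampling that the inverse-filter reconstruction is meant to correct.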

Matrix inversion is an important part of constructing the transfer function for multichannel inverse filtering, and it accounts for as much as 15% of the operations in multichannel mode. The design of a low-cost, reconfigurable matrix inversion structure therefore also meets the needs of the channel-reconstruction inverse filtering part in multichannel mode. Conventional LU decomposition and QR decomposition offer little parallelism, are inefficient for inverting high-order matrices, and consume considerable resources when implemented in hardware.

Exploiting the fact that the multichannel inverse filter matrix is a Vandermonde matrix, and considering the application scenario of FPGA-based real-time imaging processing for spaceborne SAR, the embodiment of the invention provides a low-cost, efficient implementation method for fast inversion of a variable-order Vandermonde matrix.

The embodiment of the invention provides a multichannel-SAR-oriented azimuth uniform sampling implementation system, which homogenizes the sampling of the original received signal through an inverse-filter reconstruction preprocessing operation and recovers the azimuth spectrum of the original received signal. The inverse-filter reconstruction preprocessing operation implements fast inversion of a variable-order Vandermonde matrix based on Lagrange interpolation on an FPGA. The part that implements this fast inversion on the FPGA comprises: a first branch consisting of an element buffer module, a u matrix element calculation module and a v matrix element calculation module; a second branch consisting of a π′ vector and reciprocal calculation module; and a matrix multiplication module.

In the above embodiment, the sampling of the original received signal is homogenized through the inverse-filter reconstruction preprocessing operation and the azimuth spectrum of the signal is recovered, realizing uniform azimuth sampling of the multichannel SAR. For the inversion part of the multichannel inverse filter matrix of the spaceborne SAR, variable-order Vandermonde matrix inversion based on Lagrange interpolation is implemented on the FPGA: the u matrix elements and v matrix elements are calculated by the first branch, the π′ vector and its reciprocal are calculated by the second branch, and the two branches operate in parallel. This improves operation efficiency and saves hardware resources, so that high-precision Vandermonde matrix inversion is finally achieved with low resource usage and low delay.

The Vandermonde matrix inversion method based on Lagrange interpolation is described in detail below:

For n mutually distinct nodes a₁, a₂, …, aₙ of the Vandermonde matrix, i.e. the Vandermonde matrix elements, the elementary Lagrange interpolation polynomials are expressed as:

where x represents the input variable, which in the embodiment of the invention is an input Vandermonde matrix element; Π denotes a cumulative product, and π(x) and π′(aᵢ) are defined as follows:
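For reference, the standard textbook forms of these quantities are given below. They are assumed here to correspond to formulas (1) and (2); the running index k is introduced only for illustration.

```latex
l_i(x) = \frac{\pi(x)}{(x - a_i)\,\pi'(a_i)}, \qquad i = 1, 2, \dots, n

\pi(x) = \prod_{k=1}^{n} (x - a_k), \qquad
\pi'(a_i) = \prod_{\substack{k=1 \\ k \neq i}}^{n} (a_i - a_k)
```

With these definitions the numerator of l_i vanishes at every node except a_i, where the quotient equals 1, which is the Kronecker property used in the next paragraph.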

Note that in formula (1), lᵢ(aⱼ) = δᵢⱼ (i, j = 1, 2, …, n), where n is the number of elements and δᵢⱼ is the Kronecker symbol (that is, lᵢ(aⱼ) = 1 when i = j, and lᵢ(aⱼ) = 0 otherwise). From this, the inverse of the Vandermonde matrix can be expressed as:

where diag{ } denotes a diagonal matrix formed from the listed elements. The process of obtaining the inverse matrix V⁻¹ can therefore be described as follows, its most critical part being the calculation of the quantities set out below. Setting:

The elements of the intermediate matrix u are calculated, each element being identified by its row i and column k in the u matrix.

The v matrix is calculated from the intermediate matrix u, each element being identified by its row i and column k in the v matrix.

The π′ vector is calculated, and the inverse matrix of the Vandermonde matrix is obtained using formula (4).

π′(aᵢ) and the corresponding 1/π′(aᵢ) are calculated, and finally the inverse matrix of the Vandermonde matrix is obtained using formula (3).
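One set of recurrences consistent with the modules described later in the text (a multiplier and subtracter for u, a multiplier and adder for v, and a reciprocal of π′ on the diagonal) is sketched below. The 0-based column index, the symbols p_i, U and q_i, and the assumption that the Vandermonde matrix has entry a_j^k in row k+1 and column j are illustrative choices, not the source's own notation.

```latex
% Coefficients of the partial products
% p_i(x) = \prod_{m=1}^{i} (x - a_m) = \sum_{k=0}^{i} U_{i,k}\, x^{k}, with p_0(x) = 1
U_{0,0} = 1, \qquad U_{i,0} = -a_i\, U_{i-1,0}, \qquad U_{i,i} = 1, \qquad
U_{i,k} = U_{i-1,k-1} - a_i\, U_{i-1,k} \quad (0 < k < i)

% Synthetic division of \pi(x) = p_n(x) by (x - a_i):
% q_i(x) = \pi(x) / (x - a_i) = \sum_{k=0}^{n-1} v_{i,k}\, x^{k}
v_{i,n-1} = 1, \qquad v_{i,k-1} = U_{n,k} + a_i\, v_{i,k} \quad (k = n-1, \dots, 1)

% Row i of the inverse is row i of v scaled by 1 / \pi'(a_i)
\left(V^{-1}\right)_{i,\,k+1} = \frac{v_{i,k}}{\pi'(a_i)}, \qquad V_{k+1,\,j} = a_j^{\,k}
```

Under these assumptions U is lower triangular of order n+1 with unit diagonal, its first column is a running product of the negated nodes, and the v recurrence consumes the last row of U in the opposite order to that in which U was built.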

According to the fast Vandermonde matrix inversion method based on Lagrange interpolation described above, the invention designs an FPGA-based overall structure for variable-order Vandermonde matrix inversion. Referring to FIG. 1, the overall structure of the inverse filter reconstruction part is divided into 5 blocks: the input buffer module for the Vandermonde matrix elements a₁, a₂, …, aₙ; the u matrix operation module (corresponding to formula (5)); the v matrix operation module (corresponding to formula (6)); the π′ vector and reciprocal calculation module (corresponding to formula (7)); and the matrix multiplication module. The element buffer module uses a small-capacity DPRAM data buffer inside the FPGA. Based on this overall structure, the fast Vandermonde matrix inversion proceeds as follows:

After the Vandermonde matrix elements a₁, a₂, …, aₙ have been written into the element buffer module, they are fed in sequence to the u matrix operation module and the π′ vector calculation module according to the algorithm description of formula (5), and the u matrix calculation and the π′ vector calculation are performed in parallel.

As shown in FIG. 1, the part that implements fast inversion of the variable-order Vandermonde matrix based on Lagrange interpolation on the FPGA comprises an element buffer module, a u matrix operation module, a v matrix operation module, a π′ vector and reciprocal calculation module, and a matrix multiplication module.

The element buffer module is used for buffering the input Vandermonde matrix elements a₁, a₂, …, aₙ, where n is a positive integer;

the u matrix operation module is used for calculating the u matrix elements on the first branch from the Vandermonde matrix elements a₁, a₂, …, aₙ;

the v matrix operation module is used for calculating the v matrix elements after the u matrix element calculation is completed;

the π′ vector and reciprocal calculation module is used for calculating the π′ vector on the second branch from the Vandermonde matrix elements a₁, a₂, …, aₙ and, after the π′ vector calculation is completed, taking the reciprocal 1/π′(aᵢ) of each π′ vector element to obtain the diagonal element result;

the matrix multiplication module is used for performing matrix multiplication on the v matrix elements and the diagonal element result to obtain the inverse matrix of the Vandermonde matrix;

where diag{ } denotes a diagonal matrix formed from the listed elements.

In the above embodiment, after the first branch has finished calculating the u matrix elements, it stores the results in internal registers of the FPGA (belonging to the u matrix operation module) and then starts the v matrix operation module to calculate the v matrix elements; after the second branch has finished calculating the π′ vector, it takes the reciprocals to obtain the diagonal element result in formula (3). The first branch and the second branch operate in parallel, and within the first branch a parallel or serial pipelined calculation structure is designed according to the characteristics of the v and u matrix calculations, which effectively improves operation efficiency.

In addition, because the operation delays of the two branches are not necessarily the same, a small-capacity FIFO is embedded in the π′ vector and reciprocal calculation module. The FIFO buffers the results of the divider and aligns them in time with the v matrix elements, so that the results enter the matrix multiplication module in sequence; it thus performs delay matching.

According to the analysis of formulas (5) and (6), the elements of the intermediate matrices u and v are calculated iteratively column by column, and the two matrices cannot be calculated in parallel. As the element calculation dependency in FIG. 2 shows, the calculation order of the v matrix elements is opposite to that of the u matrix, so the v matrix element calculation can be started only after the elements of the last row of the u matrix have been calculated. After the whole u matrix has been calculated, the result of the last row is stored in a RAM/REG buffer (embedded in the u matrix operation module), and the v matrix calculation module is then started. The specific implementation steps for the u and v matrices are as follows:

u matrix element pipeline computing structure

The u matrix operation module adopts a pipelined calculation structure. A delay-related cumulative multiplier is instantiated to calculate the first-column elements; a first-type feedback column element calculator (shown in FIG. 3), comprising a multiplier and a subtracter, is instantiated to calculate the elements of the subsequent columns; and a register group of depth n (n being the number of Vandermonde matrix elements) is instantiated. That is, the u matrix operation module comprises a delay-related cumulative multiplier, a u matrix element register group of depth n, and a first-type feedback column element calculator comprising a multiplier and a subtracter.

Referring to FIG. 4, the specific implementation procedure of this step is as follows:

As can be seen from formula (5), the u matrix is a lower triangular matrix of order n+1 whose diagonal elements are all 1. The other elements are calculated column by column, and the first column is calculated differently from the following n columns. The first-column calculation can be regarded as a cumulative multiplication of elements, with the input element sequence {−a₁, −a₂, …, −aₙ}, i.e. the negated Vandermonde matrix elements. The embodiment of the invention is based on a floating-point design, in which negating a floating-point number only requires inverting the sign bit. Moreover, as the timing example of the delay-related cumulative multiplier shows, the final result is the cumulative product of all elements, and the cumulative products of the preceding elements can be output as well; that is, the output sequence corresponding to the input elements is:

Comparing formula (11) with formula (5), the elements of the result sequence output by the delay-related cumulative multiplier correspond one-to-one with the first-column elements of the u matrix. After the first-column elements have been calculated, they are stored in the register group and used as the input for the calculation of the subsequent column elements. Columns 2 to n+1 of the u matrix are all calculated in the same way; the calculation steps and scheduling for the elements of a 5×5 u matrix are shown in FIG. 4. The u matrix operation module comprises a delay-related cumulative multiplier, a u matrix element register group of depth n, and a first-type feedback column element calculator comprising a multiplier and a subtracter;

Referring to FIG. 5, during task scheduling the input data are arranged appropriately so that a partially pipelined operation with small delay is achieved. The specific implementation steps are as follows:

The first-column elements of the u matrix are calculated by the delay-related cumulative multiplier from the negated input elements and stored in sequence in the u matrix element register group of depth n, as the input for the calculation of the subsequent column elements; the stored element sequence is:

The elements of the subsequent columns are calculated by the first-type feedback column element calculator using in-place replacement storage. Once an element of the second column has been calculated with the u matrix element calculation formula, the first-column element from which it was calculated no longer takes part in subsequent operations, so the new element replaces it in its storage location; the next second-column element likewise replaces the next first-column element in its storage location; and so on, until the last-row element of the second column has replaced its first-column counterpart, at which point all elements of the second column have been calculated;

the intermediate elements calculated for the i-th subsequent column replace the values at positions 1 to n−i of the register group one by one, and finally the complete input sequence for the v matrix calculation is obtained:

In this embodiment, in-place replacement storage is used in the u element calculation, which effectively saves hardware storage resources.
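The in-place scheme can be modelled in software as follows. This is a minimal sketch under one reading of the steps above: the register group first holds the running products of the negated nodes, and each later column overwrites the slots whose contents are no longer needed. The function name `u_last_row_in_place`, the 0-based indexing and the sequential loops are hypothetical; the hardware schedule in FIG. 4 and FIG. 5 is not modelled.

```python
def u_last_row_in_place(a):
    """Return the non-leading coefficients of pi(x) = prod (x - a_k).

    The result is ordered from the x^(n-1) coefficient down to the constant
    term, and is built in a single register list of depth n.
    """
    n = len(a)

    # First column: running products of the negated nodes.
    reg = []
    prod = 1.0
    for ak in a:
        prod *= -ak
        reg.append(prod)

    # Subsequent columns: each new element is
    #   (element of the previous column, one row up) - a_row * (element above),
    # and it overwrites the previous-column element it was computed from.
    for col in range(1, n):
        above = 1.0  # diagonal element of the u matrix
        for row in range(col + 1, n + 1):
            slot = row - col - 1
            new = reg[slot] - a[row - 1] * above
            reg[slot] = new
            above = new
    return reg


if __name__ == "__main__":
    # (x - 1)(x - 2)(x - 4) = x^3 - 7x^2 + 14x - 8
    print(u_last_row_in_place([1.0, 2.0, 4.0]))
```

In this model, column `col` updates only the first n − col slots, so the slots at the end of the register list already hold their final values while earlier slots are still being rewritten.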

v matrix element parallel computing structure

After the last-row elements of the u matrix have been obtained, the v matrix operation module can start calculating the v matrix elements. The v matrix operation module instantiates two register groups of depth n, which store the input original Vandermonde matrix elements and the contents of the u matrix element register group, respectively. Because the v matrix uses a parallel calculation structure in which each row of elements can be calculated independently, n second-type feedback column element calculators are instantiated to calculate the v matrix elements. That is, the v matrix operation module comprises two register groups of depth n and n second-type feedback column element calculators, each comprising a multiplier and an adder; each row of v matrix elements is calculated by one of the second-type feedback column element calculators.

Referring to FIG. 6, the specific implementation steps are as follows:

the two register groups of depth n store the input Vandermonde matrix elements and the elements held in the u matrix element register group, respectively;

for the calculation of the i-th row of the v matrix (i = 1, 2, …, n), one input of the multiplier is fixed to the Vandermonde matrix element aᵢ corresponding to that row; this input is not updated again during the whole v matrix calculation;

the other input of the multiplier has an initial value of 1 and thereafter receives the result of the previous round of calculation; the multiplication result is fed to one input of the adder, whose other input is an element from the u matrix element register group;

after each round of column element calculation is completed, the other inputs of all the adders are updated synchronously from the u matrix element register group, refreshing the last-row (row n+1) u matrix value used by each row, and the output results are stored in an n×n memory array.

In the above embodiment, the v matrix elements are calculated in parallel by a column-by-column parallel calculation method, which effectively saves hardware operation time.
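The multiply-then-add loop described above has the shape of Horner-style synthetic division. The sketch below assumes that reading: each row divides π(x) by (x − aᵢ), starting from 1 and repeatedly multiplying by the fixed aᵢ and adding the next coefficient from the u register group. The function name `v_rows` and the coefficient ordering (highest power first, leading 1 omitted from the input) are hypothetical.

```python
def v_rows(a, pi_coeffs_desc):
    """Coefficients of pi(x) / (x - a_i) for every node a_i.

    pi_coeffs_desc holds the coefficients of pi(x) from x^(n-1) down to the
    constant term; the leading coefficient 1 of x^n is implied.
    Each returned row lists its coefficients from x^(n-1) down to x^0.
    """
    n = len(a)
    rows = []
    for ai in a:  # in hardware the n rows run in parallel
        acc = 1.0  # initial value on the multiplier's feedback input
        row = [acc]
        for k in range(n - 1):
            # multiplier: fixed a_i times previous result; adder: plus u element
            acc = ai * acc + pi_coeffs_desc[k]
            row.append(acc)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # pi(x) = (x - 1)(x - 2)(x - 4) = x^3 - 7x^2 + 14x - 8
    for row in v_rows([1.0, 2.0, 4.0], [-7.0, 14.0, -8.0]):
        print(row)
```

For the node 1 the row is the coefficient list of (x − 2)(x − 4) = x² − 6x + 8, and similarly for the other nodes. The constant term of π(x) is never read, because the division by (x − aᵢ) leaves no remainder.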

Pi' vector calculation structure

The π′ vector is the result of cumulatively multiplying the differences between Vandermonde matrix elements. Its core consists of instantiating a pipelined cumulative multiplier for the cumulative multiplication, a pipelined subtracter for the element subtraction, and a divider for taking the reciprocal. That is, the π′ vector and reciprocal calculation module comprises a pipelined subtracter, a delay-related cumulative multiplier, and a divider.

Referring to FIG. 7, the specific implementation method is as follows:

A1, one input of the pipelined subtracter is fixed to a reference element aᵢ (i = 1, 2, …, n) among the Vandermonde matrix elements;

A2, the other input of the pipelined subtracter reads the other Vandermonde matrix elements in sequence from the element buffer module and performs the subtraction with the reference element aᵢ;

A3, the operation results of the pipelined subtracter are output to the delay-related cumulative multiplier to obtain the π′ vector element corresponding to the reference element; after a fixed delay, the delay-related cumulative multiplier outputs the cumulative product to the divider, one division yields the reciprocal of the π′ vector element corresponding to the reference element, and the reciprocal is stored in a register or RAM, completing the calculation for that π′ vector element;

A4, the reference element input to the pipelined subtracter is replaced and steps A1 to A3 are repeated until all π′ vector elements and their reciprocals have been calculated.

In the above embodiment, fixing one input of the pipelined subtracter to a reference element among the Vandermonde matrix elements and using a pipelined subtracter for the operation effectively saves hardware resources; at the same time, the delay-related cumulative multiplier effectively reduces the hardware time and resource cost of the cumulative multiplication.
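Steps A1 to A4 can be sketched in software as follows. The sketch assumes the standard definition π′(aᵢ) = ∏ₖ≠ᵢ (aᵢ − aₖ), so each difference is taken as reference minus other element; the function name `pi_prime_reciprocals` is hypothetical, and the pipelining and fixed delays of the hardware are not modelled.

```python
def pi_prime_reciprocals(a):
    """Return [1 / pi'(a_i)] for distinct nodes a, one division per node."""
    reciprocals = []
    for i, ref in enumerate(a):          # A1 / A4: choose the reference element
        product = 1.0
        for k, other in enumerate(a):    # A2: stream the other elements
            if k != i:
                product *= ref - other   # subtracter feeding the cumulative multiplier
        reciprocals.append(1.0 / product)  # A3: single division, then store
    return reciprocals


if __name__ == "__main__":
    # pi'(1) = (1-2)(1-4) = 3, pi'(2) = (2-1)(2-4) = -2, pi'(4) = (4-1)(4-2) = 6
    print(pi_prime_reciprocals([1.0, 2.0, 4.0]))
```

The nodes must be mutually distinct, as the method requires; a repeated node makes one difference zero and the division fails.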

Matrix multiplication calculation module structure

The matrix multiplication module mainly multiplies the π′ vector by the v matrix result, and instantiates a pipelined multiplier and a delay-related accumulator for the calculation. That is, the matrix multiplication module comprises a pipelined multiplier and a delay-related accumulator.

Referring to FIG. 8, the specific implementation steps are as follows:

B1, one input of the pipelined multiplier is fixed to the corresponding element of the π′ vector, and the other input receives the elements of one column of the v matrix in sequence;

B2, the operation result of the pipelined multiplier is output to the delay-related accumulator, and one element of the inverse Vandermonde matrix is obtained after a fixed delay;

B3, while the delay-related accumulator is working, the pipelined multiplier can continue to take in the next element of the v matrix and multiply it by the corresponding element of the π′ vector;

B4, after the v matrix elements have been processed column by column with the same π′ vector element, all elements of the corresponding row of the inverse Vandermonde matrix are obtained;

B5, the next π′ vector element is loaded and another round of operations is performed according to steps B1 to B4; after n rounds, the complete result of the Vandermonde matrix inversion is obtained.

In the above embodiment, fixing one input of the pipelined multiplier to the corresponding element of the π′ vector, feeding the elements of one column of the v matrix in sequence to the other input, and using a pipelined multiplier effectively saves hardware resources; at the same time, the delay-related accumulator effectively reduces the hardware time and resource cost of the accumulation.
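Because one factor is diagonal, the final product reduces to scaling each group of v elements by one stored value. The sketch below assumes that the value held on the multiplier input is the reciprocal 1/π′(aᵢ) produced by the divider, that each v row lists polynomial coefficients from the highest power down, and that the Vandermonde matrix is laid out as V[k][j] = a[j]**k. The function names and these layout conventions are hypothetical; the accumulator stage is not modelled.

```python
def scale_rows(v_rows_desc, reciprocals):
    """Row i of the inverse: v row i in ascending-power order times 1/pi'(a_i)."""
    inverse = []
    for row, r in zip(v_rows_desc, reciprocals):
        inverse.append([r * c for c in reversed(row)])
    return inverse


def matmul(p, q):
    """Plain matrix product, used here only to check the result."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


if __name__ == "__main__":
    a = [1.0, 2.0, 4.0]
    n = len(a)
    vandermonde = [[a[j] ** k for j in range(n)] for k in range(n)]
    v = [[1.0, -6.0, 8.0], [1.0, -5.0, 4.0], [1.0, -3.0, 2.0]]
    recips = [1.0 / 3.0, -0.5, 1.0 / 6.0]
    inv = scale_rows(v, recips)
    for row in matmul(inv, vandermonde):
        print([round(x, 12) for x in row])
```

The check multiplies the assembled inverse by the Vandermonde matrix of the same nodes; under the stated layout the product is the identity up to rounding.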

For the cumulative multiplication and accumulation operations in the Vandermonde matrix inversion method based on Lagrange interpolation, the embodiment of the invention designs a delay-related pipelined accumulation structure (a cumulative multiplier or an accumulator), providing an efficient FPGA-based pipelined floating-point accumulation unit.

In the delay-related accumulator structure designed here, the number of multipliers/adders and the overall operation delay depend only on the delay d_mul of a single multiplier/adder and are independent of the number of data points.

Referring to FIG. 9 and FIG. 10, the delay-related cumulative multiplier and the delay-related accumulator operate as follows:

(1) As shown in FIG. 9, the elements x(n) = {x(1), x(2), …, x(n)} to be cumulatively multiplied/accumulated are input in sequence to the first input port a₁ of the first-stage multiplier/adder. After the data have in turn been multiplied by 1/added to 0, the first output result of the first-stage multiplier/adder is fed back to the second input port b₁ of the first-stage multiplier/adder; that is, after the input data sequence has been delayed by d_mul cycles, it is multiplied by/added to the subsequent elements of its own sequence in the first-stage multiplier/adder. The result sequence at the output a₂ of the first-stage multiplier/adder is as follows:

where ⌊·⌋ denotes the floor (round-down) operation and mod denotes the modulo operation;

(2) The result a₂ output by the first stage is used directly as one input of the second-stage multiplier/adder, while a₂ delayed by one cycle is used as the b₂ input of the second-stage multiplier/adder; in the cycle by which b₂ and a₂ are staggered at the start, the operation is performed with the missing operand padded with 1. The last element of the result sequence of the output a₃ is:

(3) For the multipliers/adders after the second stage, up to the N-th stage, the inputs a₃ … a_N are all the output of the previous-stage multiplier/adder, and the inputs b₂ … b_N are delayed copies of the output of the previous-stage multiplier/adder; each stage of multiplier/adder has a delay of d_mul + 1 cycles, and within the valid-signal range the cycles by which b₃ … b_N and a₃ … a_N are staggered are all operated on with the missing operand padded with 1. Finally, after the output has passed through the N stages of multipliers/adders, the last output result is the cumulative multiplication/accumulation result of all elements. The output delay of the cumulative multiplication/accumulation result is:

where N is the number of stages of the delay-related cumulative multiplier/accumulator, and d_mul is the delay applied to the data sequence before it is cumulatively multiplied/accumulated.

In the above embodiment, the number of arithmetic units required by the delay-related pipelined accumulator is independent of the amount of input data and depends only on the delay of the arithmetic unit, so that for cumulative operations on large amounts of data both operation time and hardware resources are saved.

FIG. 10 shows an example of the pipelined cumulative multiplication timing, assuming that a single multiplier has a delay of 4 cycles, i.e. d_mul = 4, and that the sequence to be processed has 12 elements, i.e. N = 12. According to formula (10), the cumulative multiplication result is output after 28 cycles. In hardware design, the accumulation part can be fixed according to the specific resource and delay requirements. The method also applies when the multiplier in FIG. 9 is replaced with another operator that has a delay. The structure can serve as a reusable hardware IP and is suitable for a variety of pipelined accumulation scenarios. As can also be seen from FIG. 10, any intermediate cumulative multiplication result can be output during pipelined processing.

The accumulation (addition) operations in the Vandermonde matrix inversion method based on Lagrange interpolation are performed with the same structure after the multiplier in FIG. 9 is replaced with an adder. For the large number of accumulation operations in the algorithm, the embodiment of the invention uses the delay-related accumulator, which reduces the dependence of hardware resources on the calculation order and saves hardware resources.
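The idea behind the staged structure can be illustrated with a purely functional model. It is one plausible reading, not a cycle-accurate reproduction of FIG. 9: the first stage feeds its own output back after d_op cycles, which yields d_op interleaved partial results, and later stages combine each stream with a delayed copy of itself, padding with the identity where the delayed copy is not yet valid. The doubling lags 1, 2, 4, …, the restriction to a power-of-two delay, and the names `delay_pipelined_scan` and `d_op` are assumptions of this sketch; the latency formula is not modelled.

```python
import operator


def delay_pipelined_scan(x, d_op=4, op=operator.mul, identity=1.0):
    """Running product (or sum) of x, built from delayed self-combinations."""
    if d_op < 1 or d_op & (d_op - 1):
        raise ValueError("this sketch assumes d_op is a power of two")

    # Stage 1: each input meets the stage output from d_op cycles earlier,
    # so d_op interleaved partial results are maintained.
    s = []
    for t, xt in enumerate(x):
        fed_back = s[t - d_op] if t >= d_op else identity
        s.append(op(xt, fed_back))

    # Later stages: combine the stream with a copy delayed by 1, 2, 4, ...
    # cycles; missing operands are padded with the identity element.
    lag = 1
    while lag < d_op:
        s = [op(s[t], s[t - lag] if t >= lag else identity)
             for t in range(len(s))]
        lag *= 2
    return s


if __name__ == "__main__":
    data = [float(k) for k in range(1, 13)]  # 12 elements, as in FIG. 10
    print(delay_pipelined_scan(data)[-1])                          # product
    print(delay_pipelined_scan(data, 4, operator.add, 0.0)[-1])   # sum
```

Every position of the returned list holds the cumulative result of all inputs up to that position, which matches the observation that intermediate cumulative results are available during pipelined processing. The number of combining stages in this model depends only on `d_op`, not on the length of the input.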

The embodiment of the invention provides a method for realizing uniform azimuth sampling for multichannel SAR, comprising: homogenizing the sampling of the original received signal through an inverse-filter reconstruction preprocessing operation, and recovering the azimuth spectrum of the original received signal; the inverse-filter reconstruction preprocessing operation implements fast inversion of a variable-order Vandermonde matrix based on Lagrange interpolation on an FPGA.

In the above embodiment, the sampling of the original received signal is homogenized through the inverse-filter reconstruction preprocessing operation and the azimuth spectrum of the signal is recovered, realizing uniform azimuth sampling of the multichannel SAR. For the inversion part of the multichannel inverse filter matrix of the spaceborne SAR, variable-order Vandermonde matrix inversion based on Lagrange interpolation is implemented on the FPGA: the u matrix elements and v matrix elements are calculated by the first branch, the π′ vector and its reciprocal are calculated by the second branch, and the two branches operate in parallel. This effectively improves operation efficiency and saves hardware resources, so that high-precision Vandermonde matrix inversion is finally achieved with low resource usage and low delay.

The embodiment of the invention also provides a method for realizing uniform sampling of the azimuth direction of the multichannel SAR, wherein the method comprises the steps of realizing quick inversion of a variable-order vandermonde matrix based on Lagrange interpolation through an FPGA, and specifically comprises the following steps:

S110, buffering the input Vandermonde matrix elements a_1, a_2, ..., a_n, where n is a positive integer; performing u matrix element calculation on the first branch according to the Vandermonde matrix elements a_1, a_2, ..., a_n, and performing v matrix element calculation after the u matrix element calculation is completed;

S120, performing pi' vector calculation on the second branch according to the Vandermonde matrix elements a_1, a_2, ..., a_n, and, after the pi' vector calculation is completed, calculating the reciprocals 1/pi'(a_i) of the pi' vector elements to obtain the diagonal element results;

S130, performing matrix multiplication according to the v matrix elements and the diagonal element results to obtain the inverse matrix of the Vandermonde matrix;

where diag { } represents a diagonal matrix made up of a plurality of elements.
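Steps S110 to S130 can be sketched in software as the textbook Lagrange-interpolation inverse of a Vandermonde matrix. This is a sketch under stated assumptions rather than the FPGA data path: it assumes the convention V[i][k] = a_i^k, reads the u elements as the coefficients of pi(x) = (x - a_1)(x - a_2)...(x - a_n), and reads each v row as the result of a multiply-add recurrence over those coefficients. The function name `vandermonde_inverse` and the variable names are illustrative.

```python
def vandermonde_inverse(a):
    """Inverse of V with V[i][k] = a[i]**k, via Lagrange basis polynomials."""
    n = len(a)

    # First branch, part 1 (assumed meaning of "u"): coefficients of
    # pi(x) = prod_j (x - a[j]), highest power first.
    u = [1.0]
    for aj in a:
        u = [c - aj * p for c, p in zip(u + [0.0], [0.0] + u)]

    inv = [[0.0] * n for _ in range(n)]
    for i, ai in enumerate(a):
        # First branch, part 2 (assumed meaning of "v"): multiply-add
        # recurrence, i.e. synthetic division of pi(x) by (x - a[i]).
        v = [1.0]
        for k in range(1, n):
            v.append(ai * v[-1] + u[k])

        # Second branch: pi'(a[i]) = prod_{j != i} (a[i] - a[j]).
        pi_prime = 1.0
        for j, aj in enumerate(a):
            if j != i:
                pi_prime *= ai - aj

        # Final scaling by 1/pi'(a[i]): column i of the inverse holds the
        # coefficients of the i-th Lagrange basis polynomial, lowest power first.
        for k in range(n):
            inv[k][i] = v[n - 1 - k] / pi_prime
    return inv


a = [1.0, 2.0, 3.0, 4.0]
n = len(a)
V = [[ai ** k for k in range(n)] for ai in a]
W = vandermonde_inverse(a)
identity = [[sum(V[r][k] * W[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]
```

The product of V and the computed inverse should be the identity up to rounding, which gives a simple software check for a hardware result.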

Optionally, the calculating u matrix elements according to the vandermonde matrix elements includes:

calculating the first-column elements of the u matrix and, after the inverting calculation, storing them sequentially in a u matrix element register group of depth n as the input for calculating the subsequent column elements, the stored element sequence being:

The subsequent elements are calculated using in-place replacement storage: once a new element has been calculated with the u matrix element calculation formula, the stored element it was derived from no longer participates in subsequent operations, so the new element takes over that storage location; the next new element takes over the next storage location; and so on until the last-row element of the second column has replaced its stored counterpart, at which point all elements of the second column have been calculated;

the v matrix element calculation after the u matrix element calculation is completed comprises the following steps:

for the i-th row of the v matrix (i = 1, 2, ..., n), one input of the multiplier is fixed to the Vandermonde matrix element a_i corresponding to that row; this port input is not updated again during the entire v matrix calculation;

the other input of the multiplier is initially 1, and thereafter is the calculation result of the previous round; the multiplication result is one input of an adder; the other input of the adder is an element from the u matrix element register group;
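The multiply-add feedback cell described above behaves like one step of Horner's rule per round. A minimal sketch, assuming the u matrix element register group supplies the coefficients of pi(x) after the leading 1 (the stored sequence itself is not reproduced in this text, so that reading is an assumption, as is the function name `multiply_add_row`):

```python
def multiply_add_row(a_i, u_elements):
    """Run r <- a_i * r + u_k over the stored u elements, starting from r = 1."""
    results = []
    r = 1.0                        # initial value on the multiplier's second input
    for u_k in u_elements:
        r = a_i * r + u_k          # multiplier output feeds the adder; adder reads u_k
        results.append(r)          # result is fed back for the next round
    return results


# pi(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6, row for a_i = 2.
row = multiply_add_row(2.0, [-6.0, 11.0, -6.0])
print(row)
```

Under this assumption the first n - 1 outputs are the coefficients of pi(x) / (x - a_i), and running one extra round returns pi(a_i) = 0, which can serve as a self-check.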

Optionally, performing pi' vector calculation on the second branch according to the Vandermonde matrix elements a_1, a_2, ..., a_n, and calculating the reciprocals 1/pi'(a_i) of the pi' vector elements after the pi' vector calculation is completed to obtain the diagonal element results, includes:

A1, fixing one input of the pipelined subtractor to a reference element a_i (i = 1, 2, ..., n) among the Vandermonde matrix elements;

A3, outputting the operation results of the pipelined subtractor to a delay-related accumulator to obtain the pi' vector element corresponding to the reference element; after a fixed delay, the delay-related accumulator outputs the accumulated result to a divider, a single division yields the reciprocal of the pi' vector element corresponding to the reference element, and the reciprocal is stored in a register or RAM, completing the calculation of that pi' vector element;

A4, replacing the reference element input to the pipelined subtractor and repeating the above steps until all pi' vector elements and their reciprocals have been calculated.
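The second branch can be sketched in software as a subtract, cumulative-multiply and divide loop. This is an illustration only: the subtraction order (reference element minus the other element) and the function name `pi_prime_reciprocals` are assumptions, and the inner loop stands in for the pipelined subtractor feeding the delay-related accumulator.

```python
def pi_prime_reciprocals(a):
    """Return [1 / pi'(a_i)] with pi'(a_i) = prod_{j != i} (a_i - a_j)."""
    reciprocals = []
    for i, reference in enumerate(a):        # fix, then later replace, the reference element
        acc = 1.0
        for j, other in enumerate(a):
            if j != i:
                acc *= reference - other     # subtractor output into the cumulative product
        reciprocals.append(1.0 / acc)        # one division per pi' vector element
    return reciprocals


recips = pi_prime_reciprocals([1.0, 2.0, 4.0])
print(recips)
```

For the elements 1, 2 and 4 the products are 3, -2 and 6, so only n divisions are needed for an order-n matrix regardless of how many multiplications precede them.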

Optionally, the performing matrix multiplication according to the v matrix element and the diagonal element result to obtain an inverse matrix of the vandermonde matrix includes:

B3, while the delay-related accumulator is working, the pipelined multiplier can continue to input the next column of elements of the v matrix and multiply them by the corresponding element of the pi' vector;

B5, updating to the next pi' vector element and running another round according to steps B1 to B5; after n rounds of operation, all results of the Vandermonde matrix inversion are obtained.

Optionally, the delay-related cumulative multiplier and the delay-related accumulator operate as follows:

the elements to be cumulatively multiplied/accumulated, x(n) = {x(1), x(2), ..., x(N)}, are input sequentially to the first input port a_1 of the first-stage multiplier/adder. After the initial data have been multiplied by 1 / added to 0 in turn, the first output of the first-stage multiplier/adder is fed back to its second input port b_1; that is, after a delay of d_mul cycles, the input data sequence is multiplied/added with the subsequent elements of the same sequence. The result sequence at the output port a_2 of the first-stage multiplier/adder is as follows:


the result a_2 output by the first stage is used directly as one input of the second-stage multiplier/adder, while a_2 delayed by one cycle is used as its b_2 input; in the one cycle by which b_2 and a_2 are offset, the operation is performed with the missing value filled with 1. The last element of the resulting output sequence a_3 is:

for the multipliers/adders from the second stage to the N-th stage, the inputs a_3 ... a_N are the multiplication/addition outputs of the preceding stage and the inputs b_2 ... b_N are delayed copies of the first-stage multiplier/adder output, with each stage delayed by d_mul + 1 cycles; within the valid signal range, the cycles by which b_3 ... b_N and a_3 ... a_N are offset are all filled with 1. Finally, after the result has passed through the N stages of multipliers/adders, the last output is the cumulative multiplication/accumulation result of all elements. The output delay of the cumulative multiplication/accumulation result is:

Experimental results and analysis, verifying the advancement and practicality of the present invention

Using the method described in the embodiment of the invention, Vandermonde matrix inversion of order 8, 32 and 128 was implemented on a Xilinx XC7VX690T FPGA. In the implementation, the matrix inversion operations use the IP cores provided with the Xilinx EDA development tools. For the cumulative multiplication, the addition, subtraction and multiplication IP cores involved in complex multiplication are configured with a 2-cycle delay, and the single-precision floating-point multiplication IP core is configured to output its result after a 4-cycle delay, so that the final calculation delay is the same for both types of input data. The specific verification results are as follows:

Processing delay verification

At a clock frequency of 150 MHz, inversion of the 8th-order Vandermonde matrix takes about 11.1 µs (1477 cycles); inversion of the 32nd-order Vandermonde matrix takes about 162.4 µs (21649 cycles), about 15 times the 8th-order time; inversion of the 128th-order Vandermonde matrix takes about 2.54 ms (338497 cycles), about 229 times the 8th-order time. The growth roughly tracks the increase in data size.
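The quoted growth factors can be checked with a few lines of arithmetic. The sketch below uses only the cycle counts given above and compares them with the growth in the number of matrix elements (n squared), which is one plausible reading of "data size" and is an assumption here; the variable names are illustrative.

```python
cycles = {8: 1477, 32: 21649, 128: 338497}      # cycle counts quoted in the text

cycle_ratio = {}
element_ratio = {}
for order, count in cycles.items():
    cycle_ratio[order] = count / cycles[8]              # growth in processing delay
    element_ratio[order] = (order * order) / (8 * 8)    # growth in matrix element count
    print(order, round(cycle_ratio[order], 2), element_ratio[order])
```

The cycle ratios round to 15 and 229, while the element counts grow by factors of 16 and 256, so the delay grows slightly more slowly than the number of matrix elements.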

Processing error verification

Table 1 shows the FPGA resource consumption (mainly lookup tables (Slice LUTs), registers (Slice Registers), block random access memory (Block RAM) and digital signal processing units (DSP48Es)) and the delay (clock cycle count) of the technical solution provided by the embodiment of the invention.

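Table 1 FPGA resource consumption at different orders

Order   Slice LUTs   Slice Registers   Block RAM   DSP48Es   Delay (clock cycles)
8       9112         865               0.5         14        1477
32      13141        21047             1           14        21649
128     28235        52949             14.5        14        338497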

According to the table, the technical solution provided by the embodiment of the invention achieves fast inversion of a variable-order Vandermonde matrix. Slice LUT consumption is 9112 at order 8, 13141 at order 32 and 28235 at order 128; Slice Register consumption is 865, 21047 and 52949 respectively; Block RAM consumption is 0.5, 1 and 14.5; DSP48E consumption is 14 at every order; and the delay is 1477, 21649 and 338497 clock cycles. These data show that the technical solution provided by the embodiment of the invention has low resource consumption and low delay.

The results of the technical solution provided by the embodiment of the invention were compared with MATLAB results in terms of relative error. Referring to FIGS. 11a and 11b, the relative error at order 32 is on the order of 10^-7 and the relative error at order 128 is on the order of 10^-6. Although the relative error gradually increases with the order, it remains within the range acceptable for SAR imaging processing and is lower than in the prior art.

The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims

1. A multichannel SAR oriented azimuth uniform sampling realization system is characterized in that the system homogenizes an original received signal through preprocessing operation of inverse filter reconstruction and recovers azimuth spectrum of the original received signal; the matrix inversion part of the preprocessing operation of the inverse filter is used for realizing the fast inversion of a variable-order vandermonde matrix based on Lagrange interpolation through an FPGA;

the fast inversion part for realizing the variable order vandermonde matrix based on Lagrange interpolation through the FPGA comprises the following steps: the device comprises a first branch consisting of an element buffer module, a u matrix element operation module and a v matrix element operation module, a second branch consisting of a pi' vector and a reciprocal calculation module thereof, and a matrix multiplication module;

the element buffer module is used for buffering the input Vandermonde matrix elements a_1, a_2, ..., a_n, where n is a positive integer;

the u matrix operation module is used for performing u matrix element calculation on the first branch according to the Vandermonde matrix elements a_1, a_2, ..., a_n;

the u matrix operation module comprises: the device comprises a delay-related accumulator, a u matrix element register set with depth of n and a first type feedback column element calculator, wherein the first type feedback column element calculator comprises a multiplier and a subtracter;

the intermediate elements calculated in the subsequent i columns replace the i-n-i number values in the u matrix element register group one by one, and finally all v matrix calculation input sequences are obtained:

the v matrix operation module is used for performing v matrix element calculation after u matrix element calculation is completed;

the v matrix operation module comprises: two register sets with depth of n and n second-type feedback column element calculators, wherein the second-type feedback column element calculators comprise multipliers and adders; each row of v matrix elements is calculated by adopting one second type feedback column element calculator;

for the i-th row of the v matrix, i = 1, 2, ..., n, one input of the multiplier is fixed to the Vandermonde matrix element a_i corresponding to that row; this port input is not updated again during the entire v matrix calculation;

after each round of column element calculation is completed, synchronously updating the other ends of all adders from the u matrix element register group, refreshing the value of each row, and storing the output result into an n multiplied by n storage array;

the pi' vector and its reciprocal calculation module is used for performing pi' vector calculation on the second branch according to the Vandermonde matrix elements a_1, a_2, ..., a_n; after the pi' vector calculation is completed, the reciprocals of the pi' vector elements are calculated to obtain the diagonal element results;

the pi' vector and its reciprocal calculation module comprises a flow subtractor, a delay-related accumulator and a divider;

one input of the pipelined subtractor is fixed to a reference element a_i among the Vandermonde matrix elements, i = 1, 2, ..., n;

the other input of the pipelined subtractor sequentially reads the other Vandermonde matrix elements from the element buffer module and performs subtraction with the reference element a_i;

the operation result of the pipelining subtracter is output to a delay-related accumulator to obtain pi ' vector elements corresponding to the reference elements, the delay-related accumulator outputs the accumulated result to a divider after fixed delay, and the accumulated result is divided once to obtain the reciprocal of the pi ' vector elements corresponding to the reference elements, and the reciprocal is stored in a register or a RAM to complete pi ' vector element calculation;

replacing the reference element input by the pipeline subtracter, repeating the steps to finish all pi' vectors and related inversion calculation;

the matrix multiplication module is used for carrying out matrix multiplication according to the v matrix element and the diagonal element result to obtain an inverse matrix of the vandermonde matrix.

2. The system of claim 1, wherein the matrix multiplication module comprises a pipelined multiplier and a delay-dependent accumulator;

one input of the pipelined multiplier is fixed to the corresponding element of the pi' vector, and the other input sequentially receives a column of elements of the v matrix;

the operation result of the flow multiplier is output to a delay related accumulator, and after fixed delay, one element of the vandermonde inverse matrix is obtained;

in the working process of the delay correlation accumulator, the flow multiplier continues to input the next element of the v matrix and multiplies the next element with the corresponding element in the pi' vector;

after the operation of v matrix elements and the same pi' vector elements is completed according to column pairs, all elements of corresponding rows of the vandermonde inverse matrix are obtained;

and updating to the next pi' vector element and running another round according to the above steps; after n rounds of operation, all results of the Vandermonde matrix inversion are obtained.

3. The system of claim 1, wherein the pi' vector and its reciprocal calculation module further includes a FIFO coupled to the divider, the FIFO configured to buffer the operation results of the divider and perform timing matching with v matrix elements, so that the operation results enter the matrix multiplication module sequentially.

4. A method for realizing multi-channel SAR-oriented azimuth uniform sampling, which is realized by using the multi-channel SAR-oriented azimuth uniform sampling realizing system according to any one of claims 1 to 3, and comprises the following steps:

homogenizing an original received signal through a preprocessing operation sampling reconstructed by an inverse filter, and recovering an azimuth spectrum of the original received signal; the matrix inversion part of the preprocessing operation of the inverse filter is used for realizing the fast inversion of the variable-order vandermonde matrix based on Lagrange interpolation through an FPGA.

5. The method of claim 4, wherein said implementing, by the FPGA, a fast inversion of the variable order vandermonde matrix based on lagrangian interpolation comprises:

buffering the input Vandermonde matrix elements a_1, a_2, ..., a_n, where n is a positive integer; performing u matrix element calculation on the first branch according to the Vandermonde matrix elements a_1, a_2, ..., a_n, and performing v matrix element calculation after the u matrix element calculation is completed;

performing pi' vector calculation on the second branch according to the Vandermonde matrix elements a_1, a_2, ..., a_n, and, after the pi' vector calculation is completed, calculating the reciprocals of the pi' vector elements to obtain the diagonal element results;

and performing matrix multiplication according to the v matrix element and the diagonal element result to obtain an inverse matrix of the vandermonde matrix.