CN109446478A - A complex covariance matrix computing system based on iteration and reconfiguration - Google Patents

Info

Publication number: CN109446478A
Application number: CN201811284263.4A
Authority: CN (China)
Prior art keywords: complex, bank, data, covariance matrix, matrix
Legal status: Granted (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN109446478B (en
Inventors: 李丽, 陈辉, 傅玉祥, 陈沁雨, 何国强, 何书专, 李伟
Current Assignee: Nanjing University (the listed assignees may be inaccurate)
Original Assignee: Nanjing University
Application filed by Nanjing University
Priority to CN201811284263.4A
Publication of CN109446478A
Application granted
Publication of CN109446478B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The present invention relates to a complex covariance matrix computing system based on iteration and reconfiguration, comprising an on-chip SRAM memory, an off-chip DDR memory, a reconfigurable unit, a DMA controller, and an acceleration core. The acceleration core comprises: a matrix covariance computing module, which polls the source data in each region of the on-chip SRAM memory by iterative calculation and computes the lower-triangular covariance matrix; a conjugate symmetry module, which uses the conjugate symmetry of the covariance matrix to obtain the complete complex covariance matrix from the lower-triangular covariance matrix by address mapping and reconstructed storage, forming the final result; and a DMA interface function module, which stores the data read by DMA from the off-chip DDR memory into the on-chip SRAM memory in partitions. Beneficial effects: the invention supports covariance computation on complex matrices with any number of columns, and it reduces both the source-data computation of conventional hardware implementations and the time spent repeatedly writing result data back to DDR.

Description

A complex covariance matrix computing system based on iteration and reconfiguration
Technical field
The present invention relates to the field of computer technology, and in particular to a complex covariance matrix computing system based on iteration and reconfiguration.
Background art
Computing a covariance matrix is a typical operation in signal processing. It is a key part of multi-level estimators, spatial spectrum estimation, coherent source number detection, and affine-invariant pattern recognition, and it is widely used in radar, sonar, digital image processing, and other fields. Covariance matrices are also widely used in image matching and image steganalysis. The calculation is relatively complex, however: computing the covariance matrices of the multidimensional random variables formed by the feature vectors of every region of an image, for example, takes a long time, and this is a major obstacle to computing image covariance matrices in real time on a general-purpose PC platform.
With the rapid development of the integrated circuit industry, the embedded field constantly pursues high performance and real-time operation. At present, most hardware implementations of the complex covariance matrix are designed for platforms such as DSPs, GPUs, and FPGAs. The present invention is instead based on a reconfigurable intelligent acceleration core and proposes a hardware implementation method that computes the complex covariance matrix by iteration and reconfiguration. Compared with conventional hardware implementations, this method achieves higher resource utilization and faster hardware execution. Because covariance computation is a typical signal-processing operation, the method has good reference value and broad application prospects.
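In statistics and probability theory, the covariance matrix is a matrix each of whose elements is the covariance between two elements of a random vector. Let $X = (X_1, X_2, X_3, \ldots, X_n)^T$ be an n-dimensional random variable. The matrix
$$C = (c_{ij})_{n \times n} = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{pmatrix}$$
is called the covariance matrix of the n-dimensional random variable X and is denoted D(X), where $c_{ij} = \mathrm{Cov}(X_i, X_j)$, $i, j = 1, \ldots, n$, is the covariance of the components $X_i$ and $X_j$ of X. Because $c_{ij} = c_{ji}$, the covariance matrix is symmetric.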
As shown in Figure 1, the conventional hardware implementation works as follows. For an M × N complex matrix $A = [a_{ij}]$, the complex covariance matrix B is obtained as $B = AA^H$. The matrix A is first transposed and then conjugated to obtain $A^H$, and each element of B is then $b_{ij} = \sum_{k=1}^{N} a_{ik}\,\overline{a_{jk}}$, with $i, j = 1, \ldots, M$. Because this approach is implemented as a matrix multiplication, computing the covariance of a complex matrix with a large number of points requires moving data in and out many times, which makes memory access time excessive. The approach also does not exploit the conjugate symmetry of the covariance matrix to reduce the number of operations. Both factors make the covariance computation slow. In many practical application scenarios, inefficient covariance computation becomes a major obstacle, so the present invention has reference value and application prospects.
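The conventional formula can be illustrated with a minimal pure-Python sketch. It is not the patent's hardware design; the function names `covariance_full` and `is_hermitian` and the sample matrix are hypothetical and chosen only for illustration. The sketch computes every element of B = A·A^H and then confirms the conjugate symmetry that the conventional approach leaves unused.

```python
def covariance_full(a):
    """Return B = A * A^H for an M x N complex matrix given as a list of rows."""
    m = len(a)
    n = len(a[0])
    b = [[0j] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            # b_ij = sum over k of a_ik * conj(a_jk)
            b[i][j] = sum(a[i][k] * a[j][k].conjugate() for k in range(n))
    return b


def is_hermitian(b, tol=1e-12):
    """Check that b_ij equals conj(b_ji) for every index pair."""
    m = len(b)
    return all(abs(b[i][j] - b[j][i].conjugate()) <= tol
               for i in range(m) for j in range(m))


# Hypothetical 2 x 3 sample matrix (M = 2 rows, N = 3 columns).
a = [[1 + 2j, 3 - 1j, 0 + 1j],
     [2 - 1j, 1 + 1j, 4 + 0j]]
b = covariance_full(a)
print(b[0][1], b[1][0], is_hermitian(b))
```

Because the result is conjugate-symmetric, only M(M+1)/2 of the M² elements are independent, which is the redundancy the invention removes.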
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by providing a complex covariance matrix computing system based on iteration and reconfiguration that effectively reduces the number of operations on the source data, makes full use of storage resources, speeds up the calculation, and thereby improves the overall performance of the algorithms that use it. This is achieved by the following technical solution:
The complex covariance matrix computing system based on iteration and reconfiguration comprises an on-chip SRAM memory, an off-chip DDR memory, a reconfigurable unit, a DMA controller, and an acceleration core. The acceleration core is communicatively connected to the on-chip SRAM memory and to the reconfigurable unit, the DMA controller is communicatively connected to the reconfigurable unit, and the off-chip DDR memory is communicatively connected to the DMA controller through a bus. The acceleration core comprises:
a matrix covariance computing module, which polls the source data in each region of the on-chip SRAM memory by iterative calculation and computes the lower-triangular covariance matrix;
a conjugate symmetry module, which uses the conjugate symmetry of the covariance matrix to obtain the complete complex covariance matrix from the lower-triangular covariance matrix by address mapping and reconstructed storage, forming the final operation result;
a DMA interface function module, which stores the data read by DMA from the off-chip DDR memory into the on-chip SRAM memory in partitions, and writes the operation result back to the off-chip DDR memory by DMA.
In a further design of the system, the storage resources of the on-chip SRAM memory are divided into k banks, each of depth d. If m banks are allocated to store the lower-triangular covariance matrix, covariance computation on a complex matrix of size M × N must satisfy M² ≤ md, while N may take any value. If the degree of computational parallelism is b, a complex matrix is classified as small-point when M² ≤ bd; a matrix that does not satisfy this condition is classified as large-point.
In a further design of the system, the one-dimensional data transfer mode of the DMA is used if the complex matrix is small-point, and the two-dimensional data transfer mode of the DMA is used if it is large-point.
In a further design of the system, the matrix covariance computing module stores the source data according to the following rules: one column of the complex matrix is stored in a single bank; the columns written when every bank receives one column form an area; and the contents of all banks when filled once form a segment. The module also computes the total size of a single partition, the total number of segments, the number of areas in the last segment, and the number of columns in the last area.
In a further design of the system, zero padding is used if the columns of any area are not filled. If the source data exceeds the maximum number of points that can be stored at one time, it is processed in segments using ping-pong operation.
In a further design of the system, the matrix covariance computing module constructs a complex multiply-accumulate computing unit by reconfiguration and iterates over the intermediate result data of each segment, each area, and each bank to obtain the lower-triangular covariance matrix.
In a further design of the system, the complex multiply-accumulate computing unit uses as many complex multipliers and complex adders as there are banks storing source data. Denoting this number by b, each computation takes 2b + 1 input data. The data at the m-th address of every source data bank (m = 1, ..., M) are read as input A of the complex multipliers; the conjugates of the data at addresses 1 to m of every source data bank are read in turn as input B; and the remaining input is the value at the corresponding address of the result bank from the previous pass. Each result is written back to the same address of the same bank.
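One pass through the multiply-accumulate unit described above can be modelled in software as follows. This is a behavioural sketch only, not the pipelined hardware; the function name `cmac_step` and its argument names are hypothetical. It takes b values for input A, b values for input B (conjugated inside the function), and one previously stored partial result, which gives the 2b + 1 inputs.

```python
def cmac_step(prev, a_inputs, b_inputs):
    """Return prev + sum_i a_i * conj(b_i), one multiply-accumulate pass.

    a_inputs and b_inputs each hold one value per source bank, so the
    unit consumes 2 * len(a_inputs) + 1 complex inputs in total.
    """
    if len(a_inputs) != len(b_inputs):
        raise ValueError("inputs A and B must have one value per source bank")
    acc = prev
    for x, y in zip(a_inputs, b_inputs):
        acc += x * y.conjugate()  # one complex multiplier feeding one adder
    return acc


# Hypothetical example with b = 2 source banks (the patent's example uses 13).
partial = cmac_step(1 + 1j, [1 + 2j, 3 - 1j], [2 - 1j, 1 + 1j])
print(partial)
```

Writing the returned value back to the address that supplied `prev` is what lets later areas and segments keep accumulating into the same result element.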
In a further design of the system, the conjugate symmetry module reads data in sequence from the banks holding the lower-triangular covariance matrix, resolves the row and column of each datum in the lower-triangular matrix, and then, following the rule that one matrix column is stored in one bank, stores the datum and its conjugate-symmetric counterpart in the reused source data banks until the complete complex covariance matrix is obtained.
In a further design of the system, when the acceleration core distributes the data read from DDR, it resolves the row and column of each incoming datum in the complex matrix from the address passed in by the DMA, and then stores the datum in the corresponding source data bank according to the column-wise distribution rule.
In a further design of the system, the address mapping used by the conjugate symmetry module is as follows: the data of the lower-triangular covariance matrix are read in sequence from the result banks; the row and column in the original lower-triangular matrix are resolved from the bank number and the address within the bank; and then, following the rule that one column of the complete covariance matrix is stored in one bank, the current datum and its conjugate-symmetric counterpart are stored in the reconstructed banks.
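The address mapping of the conjugate symmetry module can be sketched as below. The exact mapping is defined by a figure that is not reproduced in the text, so this sketch rests on two labelled assumptions: the lower triangle is packed row by row (element (r, c) at index r(r+1)/2 + c), and column j of the full matrix goes to bank j mod num_banks. The names `unpack_index` and `expand_lower_triangle` are hypothetical.

```python
def unpack_index(k):
    """Map a packed lower-triangle index k to a 0-based (row, col) pair.

    Assumes row-major packing: (0,0), (1,0), (1,1), (2,0), ...
    """
    row = 0
    while (row + 1) * (row + 2) // 2 <= k:
        row += 1
    col = k - row * (row + 1) // 2
    return row, col


def expand_lower_triangle(packed, m, num_banks):
    """Scatter packed lower-triangle data into banks, one matrix column per bank slot.

    Assumption: column j is stored in bank j % num_banks, at offset
    (j // num_banks) * m + row within that bank.
    """
    columns_per_bank = -(-m // num_banks)  # ceiling division
    banks = [[0j] * (columns_per_bank * m) for _ in range(num_banks)]

    def store(row, col, value):
        banks[col % num_banks][(col // num_banks) * m + row] = value

    for k, value in enumerate(packed):
        row, col = unpack_index(k)
        store(row, col, value)                  # the element itself
        if row != col:
            store(col, row, value.conjugate())  # its conjugate-symmetric twin
    return banks


banks = expand_lower_triangle([16 + 0j, 2 - 5j, 23 + 0j], m=2, num_banks=2)
print(banks)
```

Each packed element is read once and written at most twice, so the extension costs on the order of M(M+1)/2 reads regardless of the bank layout chosen.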
The advantages of the present invention are as follows:
The complex covariance matrix computing system based on iteration and reconfiguration is designed in a parameterized manner. For different storage and computing resources, it reconfigures itself into the scheme that maximizes resource utilization for the complex covariance computation. The method reduces the amount of computation on the source data, improves resource utilization, and speeds up the hardware implementation, providing a good reference for the design of covariance matrix computation in signal processing.
Brief description of the drawings
Fig. 1 is the hardware architecture diagram for computing the complex covariance matrix in the present invention.
Fig. 2 is the hardware architecture diagram for computing the complex covariance matrix in the conventional approach.
Fig. 3 is a schematic diagram of the source data arrangement in the present invention.
Fig. 4 is a schematic diagram of the address mapping for source data transfer in the present invention.
Fig. 5 is a flowchart of small-point source data storage in the present invention.
Fig. 6 is a schematic diagram of the computing unit design in the present invention.
Fig. 7 is a schematic diagram of source data reads during calculation in the present invention.
Fig. 8 is a schematic diagram of the address mapping for conjugate symmetry of the lower-triangular matrix in the present invention.
Fig. 9 is a performance comparison between the present invention and the conventional approach for computing the complex covariance matrix.
Detailed description of the embodiments
The solution of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the complex covariance matrix computing system based on iteration and reconfiguration in this example mainly comprises a DMA interface function module, a matrix covariance computing module, and a conjugate symmetry module. The DMA interface function module has two main responsibilities: first, storing the data read by DMA from the off-chip DDR memory (hereinafter DDR) into the on-chip SRAM memory (hereinafter SRAM) in partitions; second, writing the final operation result back to DDR by DMA. The matrix covariance computing module polls the source data in each region of the SRAM by iterative calculation and computes the lower-triangular covariance matrix. The conjugate symmetry module uses the conjugate symmetry of the covariance matrix to obtain the complete complex covariance matrix from the lower-triangular covariance matrix by address mapping and reconstructed storage. The invention supports covariance computation on complex matrices with any number of columns, reduces the source-data computation of conventional hardware implementations and the time spent repeatedly writing result data back to DDR, balances computing and storage resources to maximize multi-way parallelism, constructs the computing unit by reconfiguration, and establishes specific address mapping rules to obtain the complex covariance matrix.
The storage resources of the on-chip SRAM memory are divided into k banks, each of depth d. If m banks are allocated to store the lower-triangular covariance matrix, covariance computation on a complex matrix of size M × N must satisfy M² ≤ md, while N may take any value. If the degree of computational parallelism is b, a complex matrix is classified as small-point when M² ≤ bd; a matrix that does not satisfy this condition is classified as large-point.
An example implementation of the present invention is described in detail below and verified with a cycle-accurate system-level model built in SystemC.
The present invention computes the complex covariance matrix on the hardware architecture shown in Fig. 2. In this example the matrix X is assumed to be of order M × N (M ≤ 256, N ≤ 8K), so the covariance result $Y = XX^H$ is an M × M matrix. (Below, "%" denotes the modulo operation.)
Take as an example an SRAM with a capacity of 2 MB, divided into 32 banks of depth 8k each. In hardware, the data are processed area by area (large-point matrices must additionally be processed in segments), and the source data region is filled repeatedly in column blocks. To support ping-pong operation for large-point matrices, note that the covariance result Y has at most 256 × 256 points, so the lower-triangular matrix has at most 256 × (256 + 1)/2 = 32896 points and needs at least 5 banks of storage. Of the 32 available banks, 5 are therefore reserved for the calculation result, and all remaining banks store source data in ping-pong fashion, so the number of banks holding source data at one time is A = floor[(32 − 5)/2] = 13. (The floor function returns the largest integer not greater than its argument.)
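The bank budget above is plain integer arithmetic and can be checked with a short sketch. The function name `bank_budget` and its parameter names are hypothetical; the numbers passed in are the ones from this example.

```python
def bank_budget(total_banks, bank_depth, max_rows):
    """Split SRAM banks between result storage and ping-pong source storage."""
    lower_points = max_rows * (max_rows + 1) // 2   # size of the lower triangle
    result_banks = -(-lower_points // bank_depth)   # ceiling division
    source_banks = (total_banks - result_banks) // 2  # half per ping-pong buffer
    return lower_points, result_banks, source_banks


# 32 banks of depth 8k, M up to 256 rows.
print(bank_budget(32, 8 * 1024, 256))  # (32896, 5, 13)
```

With one bank left over after 5 + 2 × 13 = 31, the floor in A = floor[(32 − 5)/2] simply discards the odd bank.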
Each bank has depth 8k and stores the M numbers of a single column consecutively (M is at most 256), so B = floor[8k/M] areas are defined. The A banks of each area hold A columns, and all banks together hold at most C = A × B columns. (Definition of large-point: the size of the complex matrix exceeds C × M.)
Taking an M × N matrix with N ≤ C as an example, the source data are arranged as shown in Fig. 3. If N > C, the procedure repeats, and D = ceil(N/C) transfers are needed in total, where ceil returns the smallest integer greater than or equal to its argument. The last area occupies (N − 1) % A + 1 banks, and the remaining banks of that area are filled with zeros.
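The partition parameters B, C, and D and the occupancy of the last area follow directly from the formulas above. The sketch below evaluates them; the function name `partition_plan` and the dictionary keys are hypothetical, and the defaults A = 13 and depth 8k are taken from this example.

```python
def partition_plan(m_rows, n_cols, source_banks=13, bank_depth=8 * 1024):
    """Evaluate the area/segment layout for an M x N source matrix."""
    areas = bank_depth // m_rows               # B = floor(depth / M)
    cols_per_segment = source_banks * areas    # C = A * B
    segments = -(-n_cols // cols_per_segment)  # D = ceil(N / C)
    last_area_banks = (n_cols - 1) % source_banks + 1
    return {
        "B": areas,
        "C": cols_per_segment,
        "D": segments,
        "last_area_banks": last_area_banks,
        # Large-point means M * N exceeds C * M, i.e. N exceeds C.
        "large_point": n_cols > cols_per_segment,
    }


print(partition_plan(256, 8192))
```

For M = 256 and N = 8192 this gives 32 areas per segment, 416 columns per segment, and 20 segments, with 2 banks used in the final area and the other 11 zero-filled.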
The AXI bus used in this example is 256 bits wide. The DMA splits each bus word into four 64-bit data and passes them to the DMA interface function of the matrix covariance operation, which stores each datum into the corresponding bank according to the address mapping shown in Fig. 4. For large-point operations, the partition size of each segment is therefore preferably a multiple of 4, so that the number of columns transferred each time is exactly a multiple of 4. Two data are then never written to the same bank at the same moment, which avoids delayed writes that would slow the overall computation. For small-point operations, the DMA carries all data in a single transfer and the number of columns is not necessarily a multiple of 4, so two data may be written to the same bank at the same moment; in that case the writes must be delayed and performed in turn, as shown in Fig. 5. For the last area of the last segment of a large-point operation, the DMA uses the two-dimensional transfer mode, so if the number of remaining columns is not a multiple of 4, zeros are padded automatically to reach a multiple of 4. Although the extra zero columns are written to the banks, the last area would have been zero-filled afterwards anyway, so this does no harm; it actually reduces the number of columns left to zero-fill and shortens the source data storage time.
The reconfigurable computing unit is reconfigured into a complex multiply-accumulate unit with 27 complex data inputs and a 5-stage full pipeline, as shown in Fig. 6, using 13 complex multipliers and 13 complex adders in total.
Because the covariance result is conjugate-symmetric, only the lower triangle of the complex covariance matrix is computed in order to shorten the operation time; the upper triangle is obtained by conjugate-symmetric extension. The way source data are read during the calculation is shown in Fig. 7. The specific steps are as follows:
1) Divide the matrix into D segments by column and import one segment at a time, placing it in the source data banks as shown in Fig. 3;
2) Perform the following operations on each area occupied by the current segment (the input Pre_Region Data is the data at the corresponding position from the previous area):
Read the 1st number from the A banks simultaneously as input I1, and take the conjugate of the 1st number as input I2;
Read the 2nd number from the A banks simultaneously as input I1, and read the 1st and 2nd numbers in turn, conjugated, as input I2;
Read the 3rd number from the A banks simultaneously as input I1, and read the 1st, 2nd, and 3rd numbers in turn, conjugated, as input I2;
……;
Read the M-th number from the A banks simultaneously as input I1, and read the 1st, 2nd, ……, M-th numbers in turn, conjugated, as input I2;
3) Repeat step 2) to complete the operation for all areas of the current segment;
4) Repeat steps 2) and 3) until all segments have been computed, yielding M(M+1)/2 results;
5) Extend the M(M+1)/2 results by conjugate symmetry to M × M results and store them in a new storage array.
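Steps 1) to 5) can be modelled end to end in software. The sketch below is a functional model under simplifying assumptions, not the hardware schedule: it treats each group of `banks_per_area` columns as one area, ignores segment boundaries (which only affect when data are loaded), and lets a short final group stand in for zero padding. The names `lower_triangle_by_areas` and `extend_to_full` are hypothetical.

```python
def lower_triangle_by_areas(a, banks_per_area):
    """Accumulate the packed lower triangle of A * A^H one area at a time."""
    m, n = len(a), len(a[0])
    packed = [0j] * (m * (m + 1) // 2)
    for start in range(0, n, banks_per_area):
        cols = range(start, min(start + banks_per_area, n))
        k = 0
        for i in range(m):          # i-th number of each bank is input I1
            for j in range(i + 1):  # numbers 1..i, conjugated, are input I2
                # packed[k] plays the role of Pre_Region Data from earlier areas
                packed[k] += sum(a[i][c] * a[j][c].conjugate() for c in cols)
                k += 1
    return packed


def extend_to_full(packed, m):
    """Rebuild the M x M matrix from the packed lower triangle (step 5)."""
    full = [[0j] * m for _ in range(m)]
    k = 0
    for i in range(m):
        for j in range(i + 1):
            full[i][j] = packed[k]
            if i != j:
                full[j][i] = packed[k].conjugate()
            k += 1
    return full


# Hypothetical 2 x 3 sample matrix, processed two columns per area.
a = [[1 + 2j, 3 - 1j, 0 + 1j],
     [2 - 1j, 1 + 1j, 4 + 0j]]
packed = lower_triangle_by_areas(a, banks_per_area=2)
print(extend_to_full(packed, m=2))
```

Only M(M+1)/2 accumulators are updated per area, which is where the saving over a full M × M matrix product comes from.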
Because the source data storage area can be reused once the calculation is complete, a new storage array is constructed there to hold the M × M covariance result. So that the data correspond to the 256 bits of each DMA transfer, and for the convenience of fetching four numbers at a time, the new storage array is planned as bank0 to bank15. The computed M × M complex covariance matrix is stored in the new storage array column by column; the address mapping for conjugate symmetry of the lower-triangular matrix is shown in Fig. 8.
The performance of this example is evaluated as follows: 1) the number of segments that fill all B areas is floor(N/C); 2) the number of areas occupied by the last segment is ceil((N % C)/A); 3) for each area, fetching input I1 takes M cycles, and fetching input I2 and input Pre_Region Data in parallel takes M × (M + 1)/2 cycles; 4) the conjugate-symmetric extension of the lower-triangular matrix takes M × (M + 1)/2 cycles; 5) the computation time of the complex multiply-accumulate unit is T1, and the times to read data from and store data to the banks are T2 and T3 respectively. The total number of cycles for computing the complex covariance matrix is then floor(N/C) × [(M × (M + 1)/2 + M) × B] + ceil[(N % C)/A] × [M × (M + 1)/2 + M] + M × (M + 1)/2 + T1 + T2 + T3, whereas the conventional approach takes M × N × (M + 4)/4 cycles.
Because T1, T2, and T3 are small relative to the total number of cycles, they can be neglected. Fig. 9 compares the performance of the two implementations when computing the covariance of complex matrices of different sizes. The figure shows clearly that computing the complex covariance matrix by iteration and reconfiguration takes far fewer cycles than the conventional approach. The complex covariance matrix computing system of the present invention supports covariance computation on complex matrices with any number of columns, reduces the source-data computation of conventional hardware implementations and the time spent repeatedly writing result data back to DDR, balances computing and storage resources to maximize multi-way parallelism, constructs the computing unit by reconfiguration, and establishes specific address mapping rules to obtain the complex covariance matrix, greatly improving resource utilization and the speed of the hardware implementation. As a typical operation in signal processing, this hardware implementation method has good reference value and broad application prospects.
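The two cycle-count expressions can be evaluated directly. The sketch below does so with T1, T2, and T3 set to zero, as the text treats them as negligible; the function names `cycles_iterative` and `cycles_conventional` are hypothetical, and the defaults A = 13 and depth 8k come from this example. The printed numbers are evaluations of the formulas, not measurements.

```python
def cycles_iterative(m, n, source_banks=13, bank_depth=8 * 1024):
    """Cycle estimate for the iterative scheme, neglecting T1, T2, and T3."""
    areas = bank_depth // m                      # B
    cols_per_segment = source_banks * areas      # C
    per_area = m * (m + 1) // 2 + m              # I2 fetch plus I1 fetch
    full_segments = n // cols_per_segment        # floor(N / C)
    last_areas = -(-(n % cols_per_segment) // source_banks)  # ceil((N % C) / A)
    extension = m * (m + 1) // 2                 # conjugate-symmetric extension
    return full_segments * per_area * areas + last_areas * per_area + extension


def cycles_conventional(m, n):
    """Cycle estimate for the conventional approach: M * N * (M + 4) / 4."""
    return m * n * (m + 4) // 4


for m, n in [(64, 1024), (128, 4096), (256, 8192)]:
    print(m, n, cycles_iterative(m, n), cycles_conventional(m, n))
```

For the largest case in this example (M = 256, N = 8192) the formulas give roughly 21 million cycles against roughly 136 million for the conventional approach.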

Claims (10)

1. A complex covariance matrix computing system based on iteration and reconfiguration, comprising an on-chip SRAM memory, an off-chip DDR memory, a reconfigurable unit, a DMA controller, and an acceleration core, wherein the acceleration core is communicatively connected to the on-chip SRAM memory and to the reconfigurable unit, the DMA controller is communicatively connected to the reconfigurable unit, and the off-chip DDR memory is communicatively connected to the DMA controller through a bus, characterized in that the acceleration core comprises:
a matrix covariance computing module, which polls the source data in each region of the on-chip SRAM memory by iterative calculation and computes the lower-triangular covariance matrix;
a conjugate symmetry module, which uses the conjugate symmetry of the covariance matrix to obtain the complete complex covariance matrix from the lower-triangular covariance matrix by address mapping and reconstructed storage, forming the final operation result;
a DMA interface function module, which stores the data read by DMA from the off-chip DDR memory into the on-chip SRAM memory in partitions, and writes the operation result back to the off-chip DDR memory by DMA.
2. the complex covariance matrix computing system according to claim 1 based on iteration and restructural mode, feature exist It is divided into k bank on piece SRAM memory setting storage resource, the depth of each bank is d, if m bank of distribution is used In storing lower triangle covariance matrix, covariance operation is carried out for the complex matrix that size is M × N, meets M2≤ md, N are to appoint Meaning value;If calculating degree of parallelism is b, the condition that complex matrix to be asked is small point delimited are as follows: M2≤ bd, condition is not satisfied sentences Fixed complex matrix to be asked is big points.
3. the complex covariance matrix computing system according to claim 2 based on iteration and restructural mode, feature exist If being small point in complex matrix to be asked, the one-dimensional data transmission mode of DMA is used;It is used if complex matrix to be asked is big points The 2-D data transmission mode of DMA.
4. the complex covariance matrix computing system according to claim 1 based on iteration and restructural mode, feature exist All bank of single bank, single deposit one is stored according to by column of complex matrix to be asked in the matrix covariance computing module It arranges and list will be calculated when forefront divide an area into, all bank of single divide the rule an of section into and store former data Subzone total size, segmentation total degree, area's number of final stage and the columns in the last one area.
5. the complex covariance matrix computing system according to claim 4 based on iteration and restructural mode, feature exist If columns is unfilled in any area, zero padding mechanism is used;If source data stores over single maximum storage points, using table tennis Pang operation segment processing uses batch processing if source data stores over single maximum storage points.
6. The complex covariance matrix calculation system based on iteration and reconfigurable mode according to claim 1, characterized in that the matrix covariance computing module constructs complex multiply-accumulate computing units in a reconfigurable manner and iteratively accumulates the temporary result data of each segment, each partition and each bank to obtain the lower-triangular covariance matrix.
7. The complex covariance matrix calculation system based on iteration and reconfigurable mode according to claim 6, characterized in that the number of complex multipliers and complex adders used in the complex multiply-accumulate computing unit equals the number of banks used for source data storage; with that number set to b, each computing unit has 2b+1 input data: the data at the m-th address of all source data banks are read as input A of the complex multipliers; the conjugates of the data at addresses 1 to M of all source data banks are read in turn as input B of the complex multipliers; the remaining input is the value at the corresponding storage address in the bank holding the previous result; and each calculation result is written back to the same address in the same bank.
8. The complex covariance matrix calculation system based on iteration and reconfigurable mode according to claim 1, characterized in that the conjugate symmetric module reads data successively from the banks holding the lower-triangular covariance matrix, resolves the row and column of each datum in the lower-triangular matrix, and then, following the rule that one column of the matrix is stored in one bank, stores the datum and its conjugate-symmetric counterpart into the source data banks, until the complete complex covariance matrix is obtained.
9. The complex covariance matrix calculation system based on iteration and reconfigurable mode according to claim 1, characterized in that when the acceleration core distributes the data read from the DDR, it resolves, from the address supplied by the DMA, the row and column of the incoming data in the complex matrix to be processed, and then stores the data into the corresponding source data bank according to the column-wise distribution principle.
10. The complex covariance matrix calculation system based on iteration and reconfigurable mode according to claim 4, characterized in that the address mapping used in the conjugate symmetric module is as follows: the data of the lower-triangular covariance matrix are read successively from the result banks; the row and column of each datum in the original lower-triangular matrix are resolved from the number of the bank holding the datum and its address within that bank; then, following the distribution principle that one column of the complete covariance matrix is stored in one bank, the current datum and its conjugate-symmetric counterpart are stored into the reconstructed banks.
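The data flow described in claims 2 and 5 to 8 can be illustrated with a minimal software sketch. It is an interpretation under stated assumptions, not the claimed hardware: all function names are hypothetical, the covariance is taken as the unnormalized product of the M × N matrix with its conjugate transpose (the claims do not mention any scaling by N), the lower triangle is held in a dictionary rather than in result banks, and ping-pong buffering and DMA transfers are not modelled.

```python
# Illustrative sketch only; names and data layout are assumptions, not the patent's design.

def classify_points(M, b, d):
    """Claim 2 rule: small-point if M^2 <= b*d, otherwise large-point."""
    return "small" if M * M <= b * d else "large"


def load_batches(X, b):
    """Store one column per bank, b columns per batch; zero-pad the last batch."""
    M, N = len(X), len(X[0])
    batches = []
    for start in range(0, N, b):
        banks = [[X[row][col] for row in range(M)]
                 for col in range(start, min(start + b, N))]
        banks += [[0j] * M for _ in range(b - len(banks))]  # zero padding
        batches.append(banks)
    return batches


def accumulate_lower_triangle(batches, M):
    """Iteratively multiply-accumulate every batch into the lower triangle."""
    result = {(i, j): 0j for i in range(M) for j in range(i + 1)}
    for banks in batches:
        for i in range(M):
            for j in range(i + 1):
                acc = result[(i, j)]  # the "+1" input: previous partial result
                for bank in banks:    # b products A * conj(B), one per bank
                    acc += bank[i] * bank[j].conjugate()
                result[(i, j)] = acc  # written back to the same location
    return result


def conjugate_symmetric_fill(lower, M):
    """Rebuild the full Hermitian matrix from its lower triangle."""
    full = [[0j] * M for _ in range(M)]
    for (i, j), value in lower.items():
        full[i][j] = value
        full[j][i] = value.conjugate()
    return full


def complex_covariance(X, b):
    """Chain the three steps for an M x N matrix X given as a list of rows."""
    M = len(X)
    lower = accumulate_lower_triangle(load_batches(X, b), M)
    return conjugate_symmetric_fill(lower, M)
```

For example, calling complex_covariance(X, 2) on a matrix with 2 rows and 3 columns processes two batches, the second of which is zero-padded by one column. Only the lower triangle is accumulated, so roughly half of the multiply-accumulate work of a full matrix product is needed before the conjugate-symmetric fill.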
CN201811284263.4A 2018-10-30 2018-10-30 Complex covariance matrix calculation system based on iteration and reconfigurable mode Active CN109446478B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811284263.4A CN109446478B (en) 2018-10-30 2018-10-30 Complex covariance matrix calculation system based on iteration and reconfigurable mode

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811284263.4A CN109446478B (en) 2018-10-30 2018-10-30 Complex covariance matrix calculation system based on iteration and reconfigurable mode

Publications (2)

Publication Number Publication Date
CN109446478A (en) 2019-03-08
CN109446478B (en) 2021-09-28

Family

ID=65550425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811284263.4A Active CN109446478B (en) 2018-10-30 2018-10-30 Complex covariance matrix calculation system based on iteration and reconfigurable mode

Country Status (1)

Country Link
CN (1) CN109446478B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614582A (en) * 2018-11-06 2019-04-12 Hainan University Lower-triangular-part storage device for self-adjoint matrices and parallel reading method
CN111045965A (en) * 2019-10-25 2020-04-21 Nanjing University Hardware implementation method for multi-channel conflict-free splitting, computer device running the method, and readable storage medium
CN111723336A (en) * 2020-06-01 2020-09-29 Nanjing University Cholesky decomposition-based arbitrary-order matrix inversion hardware acceleration system adopting loop iteration mode

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002037259A1 (en) * 2000-11-01 2002-05-10 Bops, Inc. Methods and apparatus for efficient complex long multiplication and covariance matrix implementation
EP1215507A2 (en) * 2000-12-12 2002-06-19 Matsushita Electric Industrial Co., Ltd. Radio-wave arrival-direction estimating apparatus and directional variable transceiver
CN101211333A (en) * 2006-12-30 2008-07-02 Beijing University of Posts and Telecommunications Signal processing method, device and system
CN103685110A (en) * 2013-12-17 2014-03-26 Comba Telecom Systems (China) Ltd. Predistortion processing method and system and predistortion factor arithmetic unit
CN105426345A (en) * 2015-12-25 2016-03-23 Nanjing University Matrix inverse operation method
CN105630735A (en) * 2015-12-25 2016-06-01 Nanjing University Coprocessor based on reconfigurable computational array
CN105893333A (en) * 2016-03-25 2016-08-24 Hefei University of Technology Hardware circuit for calculating covariance matrix in MUSIC algorithm
US9686069B2 (en) * 2015-05-22 2017-06-20 ZTE Canada Inc. Adaptive MIMO signal demodulation using determinant of covariance matrix

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
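Yu Dong et al.: "Design of a high-precision large-point two-dimensional FFT processor", Modern Radar *
He Zishu et al.: "MUSIC angle estimation algorithm based on conjugate reconstruction of the data matrix", Journal of University of Electronic Science and Technology of China *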

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614582A (en) * 2018-11-06 2019-04-12 Hainan University Lower-triangular-part storage device for self-adjoint matrices and parallel reading method
CN111045965A (en) * 2019-10-25 2020-04-21 Nanjing University Hardware implementation method for multi-channel conflict-free splitting, computer device running the method, and readable storage medium
CN111723336A (en) * 2020-06-01 2020-09-29 Nanjing University Cholesky decomposition-based arbitrary-order matrix inversion hardware acceleration system adopting loop iteration mode
CN111723336B (en) * 2020-06-01 2023-01-24 Nanjing University Cholesky decomposition-based arbitrary-order matrix inversion hardware acceleration system adopting loop iteration mode

Also Published As

Publication number Publication date
CN109446478B (en) 2021-09-28

Similar Documents

Publication Publication Date Title
CN111178519B (en) Convolutional neural network acceleration engine, convolutional neural network acceleration system and method
CN108241890B (en) Reconfigurable neural network acceleration method and architecture
CN104915322B Hardware acceleration method for convolutional neural networks
Pınar et al. Fast optimal load balancing algorithms for 1D partitioning
CN107341544A Reconfigurable accelerator based on a divisible array and its implementation method
CN108805266A Reconfigurable high-concurrency CNN convolution accelerator
TW201913460A (en) Chip device and related products
CN111667051A (en) Neural network accelerator suitable for edge equipment and neural network acceleration calculation method
CN107392309A General fixed-point neural network convolution accelerator hardware structure based on FPGA
CN111242289A (en) Convolutional neural network acceleration system and method with expandable scale
CN110110844B (en) Convolutional neural network parallel processing method based on OpenCL
CN103970720B Large-scale coarse-grained embedded reconfigurable system and its processing method
CN109446478A Complex covariance matrix calculation system based on iteration and reconfigurable mode
CN102135951B (en) FPGA (Field Programmable Gate Array) implementation method based on LS-SVM (Least Squares-Support Vector Machine) algorithm restructured at runtime
CN110647719B (en) Three-dimensional FFT (fast Fourier transform) calculation device based on FPGA (field programmable Gate array)
CN106156851A Accelerator and method for deep learning services
CN110222818A Multi-bank row-column interleaved read/write method for convolutional neural network data storage
CN109993293A Deep learning accelerator suitable for stacked hourglass networks
CN110069444A Computing unit, array, module, hardware system and implementation method
CN209708122U Computing unit, array, module and hardware system
CN105955896B Reconfigurable DBF algorithm hardware accelerator and control method
CN112732630A (en) Floating-point matrix multiplier many-core parallel optimization method for deep learning
CN106156142A Text clustering processing method, server and system
CN114492753A (en) Sparse accelerator applied to on-chip training
Wang et al. A scalable FPGA engine for parallel acceleration of singular value decomposition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant