CN1187698C - Design method for an inherently parallel VLSI architecture for the two-dimensional discrete wavelet transform

Info

Publication number: CN1187698C
Application number: CNB031146023A
Authority: CN
Inventors: 兰旭光; 郑南宁; 周宁; 梅魁志
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2003-04-07
Filing date: 2003-04-07
Publication date: 2005-02-02
Anticipated expiration: 2023-04-07
Also published as: CN1448871A

Abstract

The invention discloses a design method for an inherently parallel VLSI architecture for the two-dimensional discrete wavelet transform (2-D DWT) based on the lifting algorithm. Using a shift-add technique, the multiplications in the filters are converted into shift-register and adder operations, which greatly reduces the amount of computation and saves hardware resources and cost. Using a line-buffer technique, the row filter and the column filter are connected and operate simultaneously, which saves memory. Using a delay-register technique, pipelined processing is realized, which raises both processing speed and hardware utilization. Using a unified symmetric extension technique, the architecture is made general across filter types. The invention is an efficient hardware implementation scheme for the 2-D DWT with several advantages: through optimized design of the row filter and the column filter, hardware utilization reaches 100%; two outputs are produced in each working clock cycle, so parallelism is achieved without additional hardware cost; and the resulting hardware structure is simple and therefore easy to implement in VLSI.

Description

VLSI architecture design method for an inherently parallel two-dimensional discrete wavelet transform

I. Technical field

The invention belongs to the field of VLSI design. Specifically, it relates to a method for designing an inherently parallel, efficient VLSI architecture for the lifting-based two-dimensional discrete wavelet transform (2-D DWT), for use in JPEG2000 hardware implementations.

II. Background art

The 2-D DWT has good time-frequency analysis properties and is used in many practical systems, such as JPEG2000 coding systems and video processing. Given the requirements on speed and area, the compression system needs to be implemented on a chip. In most existing 2-D DWT chip architectures, the two stages of the fast discrete wavelet transform (row filtering and column filtering) use exactly the same structural design. Although this reduces control complexity, storing the intermediate results of row filtering requires a large amount of memory. This lowers hardware utilization, increases hardware resource consumption and chip cost, and limits the speed at which the chip can process data.

III. Summary of the invention

In view of the defects and shortcomings of the background art described above, the object of the invention is to provide a design method for an inherently parallel VLSI architecture for the lifting-based 2-D DWT that offers high hardware utilization, low cost, parallelism, low hardware overhead and a simple structure.

To achieve this object, the invention adopts the following solution. The VLSI architecture design method for the inherently parallel two-dimensional discrete wavelet transform proceeds as follows:

1) By the "shift-add" technique, the multiplications in the filters are converted into shift-register and adder operations;

2) By the "line buffer (Line Buffer)" technique, high-pass and low-pass filtering, and row-direction and column-direction filtering, operate in parallel within one compact hardware structure, which reduces storage, saves hardware resources, and increases hardware utilization and processing speed;

3) By the "delay register" technique, pipelining is realized, which optimizes the structure and improves processing speed;

4) By the unified "symmetric boundary extension" technique, the filter structure is made general.

The "shift-add" technique is as follows. Multiplying a real number by 2 to the power n is equivalent to shifting that number left by n bits, and any wavelet filter coefficient can be expressed as a sum of a finite number of integer powers of 2. The product of a real number and a wavelet filter coefficient can therefore be expressed as the sum of a finite number of shifted copies of that number.
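The decomposition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `shift_terms` and `shift_add_multiply` are hypothetical, the coefficient is assumed non-negative, and it is truncated to a chosen number of fractional bits before being split into powers of 2.

```python
def shift_terms(coeff, frac_bits):
    """Return shift amounts k such that coeff ~= sum(2**-k for k in result).

    Assumes coeff >= 0. The value is truncated to frac_bits fractional
    bits, so the decomposition is exact only for the truncated value.
    """
    scaled = int(coeff * (1 << frac_bits))  # truncate toward zero
    shifts = []
    for bit in range(scaled.bit_length() - 1, -1, -1):
        if (scaled >> bit) & 1:
            shifts.append(frac_bits - bit)  # this bit has weight 2**(bit - frac_bits)
    return shifts


def shift_add_multiply(x, shifts):
    """Multiply integer x by sum(2**-k) using only shifts and additions.

    A negative k stands for a left shift by -k bits.
    """
    acc = 0
    for k in shifts:
        acc += (x >> k) if k >= 0 else (x << -k)
    return acc


# 0.625 = 1/2 + 1/8, so 64 * 0.625 = (64 >> 1) + (64 >> 3)
print(shift_terms(0.625, 3), shift_add_multiply(64, shift_terms(0.625, 3)))
```

In hardware the shift amounts are fixed at design time, so each shift is just a choice of wiring and only the additions cost logic.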

The "line buffer" technique is as follows. The intermediate results of row-direction (or column-direction) filtering of the wavelet transform are stored in a row (or column) buffer array (Line Buffer Array). Once a certain number of rows (or columns) has accumulated (no more than 9 rows in the embodiment), column-direction (or row-direction) filtering can read data from the row (or column) buffer and carry out the column (or row) transform. In this way, column (or row) filtering can start shortly after the row (or column) transform has begun. Row and column filtering can both be carried out within one clock cycle, which realizes parallel row-direction and column-direction filtering. The size of the line buffer array is determined by the lifting algorithm and the number of symmetric extension samples.
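The buffering idea can be modelled in software roughly as follows. This is a behavioural sketch, not the patented circuit: the generator name `column_windows`, the window of 9 rows and the advance of 2 rows per window are assumptions chosen to mirror the numbers in the text, and boundary extension and clock-level timing are ignored.

```python
from collections import deque


def column_windows(filtered_rows, window=9, step=2):
    """Yield vertical windows as soon as enough row-filtered rows exist.

    filtered_rows: iterable of equal-length rows (row-filter outputs).
    Only `window` rows are ever held, instead of the whole image.
    Each yielded item is a list of columns, each holding `window` values.
    """
    buf = deque(maxlen=window)   # the oldest row is dropped automatically
    since_last = 0
    for row in filtered_rows:
        buf.append(row)
        since_last += 1
        if len(buf) == window and since_last >= step:
            since_last = 0
            yield [list(col) for col in zip(*buf)]


rows = ([r * 10 + c for c in range(4)] for r in range(13))
for cols in column_windows(rows):
    print(cols[0])   # first column of each 9-row window
```

The first window becomes available after the ninth row arrives, and later windows follow every two rows, so column filtering overlaps with the remaining row filtering.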

The "delay register" technique is as follows. To improve hardware processing speed, the row and column filters of the two-dimensional discrete wavelet transform are designed as pipelines. Delay registers buffer the input data and the intermediate data of the computation, so that the filter can process and output data continuously and the hardware resources are fully used.
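A software analogy may help here: two variables play the role of delay registers that hold the previous two samples, so that one lifting stage can consume a continuous stream. The generator name `lifting_stage_stream` is hypothetical, and the code models behaviour only, not register-transfer timing.

```python
def lifting_stage_stream(samples, coeff):
    """Stream x0, x1, x2, ... and emit (even, lifted_odd) pairs.

    d1 and d2 act as delay registers holding x[n-1] and x[n-2].
    Whenever an even-indexed sample arrives, the odd sample between
    the two most recent even samples can be lifted.
    """
    d1 = d2 = None
    for n, x in enumerate(samples):
        if n >= 2 and n % 2 == 0:
            # d2 = x[n-2] (even), d1 = x[n-1] (odd), x = x[n] (even)
            yield d2, d1 + coeff * (d2 + x)
        d2, d1 = d1, x   # shift the delay line by one sample


# On a linear ramp with coeff = -0.5 every lifted odd sample is 0.0
print(list(lifting_stage_stream([1, 2, 3, 4, 5], -0.5)))
```

Because each new input sample only shifts the delay line, the stage never has to wait for a full row before producing output.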

The unified "symmetric boundary extension" technique is as follows. In the lifting-based wavelet transform, the signal must be extended at its boundaries so that the image can be reconstructed exactly. In the JPEG2000 standard, the lifting algorithm uses different numbers of extension samples for wavelet filters of different lengths. By applying the symmetric extension principle uniformly, the extension is made general, so that the 5/3 and 9/7 filters recommended by the JPEG2000 standard can share one extension scheme and the filter structure is general.
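A minimal sketch of symmetric extension is given below, assuming the whole-sample (mirror without repeating the end sample) form; the function name `symmetric_extend` and its parameters are illustrative and do not come from the patent.

```python
def symmetric_extend(signal, left, right):
    """Extend a sequence by mirroring about its first and last samples.

    The boundary samples themselves are not repeated (whole-sample
    symmetry). Requires at least two samples.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples to mirror")
    period = 2 * (n - 1)

    def mirror(k):
        k %= period                      # Python's % is non-negative here
        return k if k < n else period - k

    return [signal[mirror(k)] for k in range(-left, n + right)]


print(symmetric_extend([1, 2, 3, 4], 2, 2))   # [3, 2, 1, 2, 3, 4, 3, 2]
```

With one extension routine parameterised only by the left and right counts, the same hardware path can serve filters of different lengths.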

The invention is an efficient hardware implementation scheme for the 2-D DWT with several advantages. Through optimized design of the row filtering and column filtering of the discrete wavelet transform, hardware utilization reaches 100% and row and column filtering run in parallel. Replacing multiplications with shift-add operations greatly reduces the amount of computation without increasing hardware overhead. The resulting hardware structure is also very simple and well suited to VLSI implementation.

IV. Description of the drawings

Fig. 1 is a structural diagram of the conversion of multiplication into shift-add operations in the embodiment of the invention.

Fig. 2 is the architecture diagram of the lifting-based VLSI implementation of the two-dimensional discrete wavelet transform in the embodiment of the invention.

Fig. 3 is a block diagram of the combined row and column filtering in the embodiment of the invention.

Fig. 4 is a block diagram of row-direction filtering in the embodiment of the invention.

Fig. 5 is a block diagram of column-direction filtering in the embodiment of the invention.

V. Embodiment

The invention is described in more detail below with reference to the drawings and an embodiment, but the invention is not limited to this embodiment.

Following the technical scheme of the invention, the inventors provide an embodiment. It uses one of the biorthogonal wavelet filter sets in the JPEG2000 standard, the Daubechies 9/7 (D9/7) biorthogonal wavelet.

In this embodiment, the multiplications in the filters are first implemented with shift registers and adders by the general "shift-add" technique. The computation uses fixed-point numbers with a data width of 24 bits, of which the lowest 13 bits are the fractional part.
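The number format can be illustrated as follows. The text specifies only the 24-bit width and the 13 fractional bits; the signed two's-complement range, the rounding on conversion and the helper names `to_fixed` and `from_fixed` are assumptions made for this sketch.

```python
FRAC_BITS = 13   # fractional bits, from the text
WIDTH = 24       # total data width in bits, from the text


def to_fixed(value):
    """Convert a real value to a 24-bit fixed-point integer.

    Rounding to nearest is an assumption; the patent does not say how
    inputs are quantised.
    """
    raw = round(value * (1 << FRAC_BITS))
    lo, hi = -(1 << (WIDTH - 1)), (1 << (WIDTH - 1)) - 1
    if not lo <= raw <= hi:
        raise OverflowError("value does not fit in the 24-bit format")
    return raw


def from_fixed(raw):
    """Convert a fixed-point integer back to a real value."""
    return raw / (1 << FRAC_BITS)


print(to_fixed(1.0), from_fixed(to_fixed(-2.5)))   # 8192 -2.5
```

Under these assumptions the format leaves a sign bit and 10 integer bits, so values of magnitude up to just under 1024 can be represented with a resolution of 1/8192.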

By the "shift-add" technique, this embodiment converts multiplication into shift-add operations. Because the filter coefficients of the two-dimensional discrete wavelet transform are fixed, this makes the design more optimized. The core of the lifting-based two-dimensional discrete wavelet transform is the prediction of the odd samples and the update of the even samples. Step 1 of the D9/7 lifting algorithm serves as an illustration:

Step 1:

$$Y(2n+1) = X_{ext}(2n+1) + \alpha \left( X_{ext}(2n) + X_{ext}(2n+2) \right), \qquad i_0 - 3 \le 2n+1 < i_1 + 3$$

where i₀ and (i₁ - 1) denote the first index and the last index, respectively, of the input row (or column) of data.
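Step 1 can be written out directly in floating point as a check on the index range. In this sketch the function name `lifting_step1` and the `offset` convention (the list element `x_ext[offset + k]` holds X_ext(k)) are assumptions introduced for illustration.

```python
ALPHA = -1.586134342   # coefficient value quoted in the text


def lifting_step1(x_ext, offset, i0, i1):
    """Apply step 1 to every odd index m with i0 - 3 <= m < i1 + 3.

    x_ext[offset + k] holds X_ext(k), so the list must cover
    k = i0 - 4 .. i1 + 3. Returns a dict mapping m to Y(m).
    """
    y = {}
    for m in range(i0 - 3, i1 + 3):
        if m % 2:   # odd index (Python's % also works for negative m)
            y[m] = x_ext[offset + m] + ALPHA * (
                x_ext[offset + m - 1] + x_ext[offset + m + 1]
            )
    return y


# 8 samples (i0 = 0, i1 = 8) already extended to cover k = -4 .. 10
y = lifting_step1([1.0] * 15, offset=4, i0=0, i1=8)
print(sorted(y))   # [-3, -1, 1, 3, 5, 7, 9]
```

For a constant input every output equals 1 + 2α times the input, which is a quick sanity check on the indexing.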

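The filter coefficient is expressed in binary, truncated to 13 fractional bits, as follows:

$$\alpha = -1.586134342 \approx -(1.1001011000001)_2 = -\left(1 + \frac{1}{2} + \frac{1}{16} + \frac{1}{64} + \frac{1}{128} + \frac{1}{8192}\right)$$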

In Fig. 1, because α is negative in this embodiment, the two's complement of Num is taken first:

Num = ~Num + 1;

Then

α × Num = Num + (Num >> 1) + (Num >> 4) + (Num >> 6) + (Num >> 7) + (Num >> 13),

so the multiplication is reduced to shift-add operations. The other coefficients are handled in the same way. Negative numbers are represented in two's complement in this embodiment.
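The shift-add form of the multiplication by α can be checked numerically. The sketch below uses the shift amounts and the 13 fractional bits given in the text; the function name `mul_alpha` is hypothetical, and Python's unbounded integers stand in for the 24-bit datapath, so overflow is not modelled.

```python
FRAC_BITS = 13                       # fractional bits, from the text
ALPHA = -1.586134342                 # coefficient value quoted in the text
ALPHA_SHIFTS = (0, 1, 4, 6, 7, 13)   # 1 + 1/2 + 1/16 + 1/64 + 1/128 + 1/8192


def mul_alpha(num):
    """Multiply a fixed-point integer by alpha using shifts and adds only."""
    num = -num   # same result as two's-complement ~num + 1, since alpha < 0
    return sum(num >> k for k in ALPHA_SHIFTS)   # >> is an arithmetic shift


x = 100 << FRAC_BITS                 # 100.0 in fixed point
approx = mul_alpha(x) / (1 << FRAC_BITS)
print(approx, ALPHA * 100)           # shift-add result next to the exact product
```

The two printed values differ by less than 0.01, which reflects the truncation of α to 13 fractional bits.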

Fig. 2 shows the overall architecture of the lifting-based VLSI implementation of this embodiment. The control module decides, from the current decomposition level and the required number of decomposition levels, whether to perform the next level of wavelet decomposition, and controls the switching of the input data. The row extension module extends the input, and the row transform module filters the extended data. The line buffer module (Line Buffer Array) stores a certain number of row transform results so that the row and column transforms can run in parallel. The column transform module reads data from the line buffer and performs column filtering. The address generation unit generates and assigns addresses according to the column filtering output and writes the coefficients of the current wavelet decomposition to the memory module. With this architecture, high-pass and low-pass filtering run in parallel in both the row and column directions, and row-direction and column-direction filtering also run in parallel with each other.

Inherent parallelism here means that true parallel processing is achieved without increasing hardware cost: high-pass and low-pass filtering within the row direction and within the column direction, as well as filtering in the two directions, are carried out simultaneously. Two outputs are produced in each working clock cycle.

Fig. 3 shows the integrated row-direction and column-direction filter of this embodiment, which allows row-direction and column-direction filtering to be carried out in parallel.

Fig. 4 shows the row-direction filtering structure of this embodiment. Delay registers split the input into even and odd samples, which enter the α module. There the even samples are added together, the sum is multiplied by the filter coefficient (optimized into shift-add operations), and the product is added to the odd sample; pipeline registers provide pipelined processing. In the same way as in the α module, the output of each module is then split into odd and even samples and passed through the β, γ and δ modules in turn. As data keeps arriving, the row filter processes it continuously. The hardware is thus fully used, and its utilization reaches 100%. Control is also very simple: a few delay registers (DFFs) suffice for continuous filtering. The results of row filtering are stored in the Line Buffer array. This embodiment uses 11 line buffers, each 24 bits wide and as long as the image width, 256. Eleven buffers are used so that the row transform and the column transform can run in parallel. This greatly reduces the memory needed for intermediate row transform results, makes the hardware more compact, raises arithmetic speed while reducing area, and is in keeping with chip design practice. The structure shows that row filtering with the D9/7 biorthogonal wavelet filter needs only 8 pipeline registers, 6 delay registers, 8 adders and 4 shift registers. Because the shifts can be implemented with wiring alone, the shift registers add essentially no hardware overhead. Compared with conventional methods, this structure greatly reduces hardware overhead and structural complexity.
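The arithmetic performed by the four modules can be modelled in floating point as follows. This is a software sketch, not the pipelined circuit. Only α is quoted in this document; the values used here for β, γ and δ are the commonly published 9/7 lifting coefficients and should be treated as assumptions, as should the function names and the omission of any final scaling, which the text does not mention.

```python
ALPHA = -1.586134342                                   # from the text
BETA, GAMMA, DELTA = -0.052980118, 0.882911075, 0.443506852   # assumed values


def mirror(k, n):
    """Whole-sample symmetric index into a length-n sequence (n >= 2)."""
    period = 2 * (n - 1)
    k %= period
    return k if k < n else period - k


def lift(values, coeff, first):
    """Add coeff * (left + right neighbour) to every second sample."""
    out = list(values)
    for k in range(first, len(values) - 1, 2):
        out[k] = values[k] + coeff * (values[k - 1] + values[k + 1])
    return out


def row_filter_97(row):
    """Return (low, high) subbands of one row; start index taken as 0."""
    n = len(row)
    ext = [row[mirror(k, n)] for k in range(-4, n + 4)]   # 4 extra samples per side
    s = lift(ext, ALPHA, 1)    # alpha module: predict odd samples
    s = lift(s, BETA, 2)       # beta module: update even samples
    s = lift(s, GAMMA, 3)      # gamma module: predict odd samples
    s = lift(s, DELTA, 4)      # delta module: update even samples
    core = s[4:4 + n]          # drop the extension
    return core[0::2], core[1::2]


low, high = row_filter_97([5.0] * 8)
print(low, high)
```

For a constant row the high-pass outputs come out at essentially zero, as expected of a wavelet high-pass filter, while the low-pass outputs are the input scaled by a constant factor.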

Fig. 5 shows the structure of the column filtering stage of this embodiment. The column filter is designed differently from the row filtering stage, because column filtering can start only after a certain number of rows have been row-filtered (no more than 9 rows in this embodiment). The data processed by column filtering are the results of row filtering, so column filtering can begin as soon as the number of rows required by the column transform is available (this number follows from the lifting algorithm and the number of symmetric extension samples). In this embodiment, column filtering can begin from the fifth row (symmetric extension is applied first, giving 9 rows after extension). To speed up column filtering and prevent data from accumulating in the Line Buffer Array, 9 data values are read from the Line Buffer Array in one clock cycle for the column transform, while the results of the row transform continue to be written into the line buffers; the Line Buffer array is read and written simultaneously. The 9 data values pass through the 4 modules in turn, and two values are output in the final clock cycle: either LL and LH, or HL and HH. Addresses are generated according to the output data, and the data are written to memory. As the structural diagram shows, column filtering needs 8 pipeline registers, 1 delay register, 19 adders and 10 shift-add operations. Reading 9 data values at once greatly increases processing speed. In this way the row transform and the column transform are processed in parallel.
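One way to picture how 9 values produce two outputs is to run the four lifting stages over a single 9-sample vertical window. The sketch below is an interpretation, not the circuit of Fig. 5: the function name, the choice of which positions are emitted, and the β, γ and δ values (the commonly published 9/7 lifting coefficients; only α is quoted in this document) are assumptions.

```python
ALPHA = -1.586134342                                   # from the text
BETA, GAMMA, DELTA = -0.052980118, 0.882911075, 0.443506852   # assumed values


def column_filter_window(w):
    """Lift a window of 9 vertically adjacent values down to two outputs.

    Returns (low, high): the low-pass value at the window centre w[4]
    and the high-pass value at the next position w[5] (assumed choice).
    """
    if len(w) != 9:
        raise ValueError("window must hold exactly 9 values")
    # alpha stage: odd positions 1, 3, 5, 7 (4 multiplications)
    a = {k: w[k] + ALPHA * (w[k - 1] + w[k + 1]) for k in (1, 3, 5, 7)}
    # beta stage: even positions 2, 4, 6 (3 multiplications)
    b = {k: w[k] + BETA * (a[k - 1] + a[k + 1]) for k in (2, 4, 6)}
    # gamma stage: odd positions 3, 5 (2 multiplications)
    g = {k: a[k] + GAMMA * (b[k - 1] + b[k + 1]) for k in (3, 5)}
    # delta stage: centre position 4 (1 multiplication)
    low = b[4] + DELTA * (g[3] + g[5])
    high = g[5]
    return low, high


print(column_filter_window([2.0] * 9))
```

In this reading the stages use 4 + 3 + 2 + 1 = 10 coefficient multiplications, which coincides with the 10 shift-add operations quoted in the text, although the patent does not spell out this breakdown.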

In this embodiment, the unified "symmetric extension" technique makes the extension general. The number of symmetric extension samples depends on the filter type and on the parity of the start index and the end index of a row (or column) of data. In the design, the start index i₀ of each row (or column) is uniformly set to zero, and then, following the extension principle, the number of extension samples is uniformly set to the largest symmetric extension count over all filter types.
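The rule can be illustrated with a small calculation. The counts below are not tabulated in the patent; they are derived here on the assumption that each lifting step reaches one neighbour further, so a filter with S lifting steps needs up to S extension samples per side when the start index is 0. The names `LIFTING_STEPS`, `extension_needed` and `unified_extension` are hypothetical.

```python
# Number of lifting steps per filter type (assumed: 2 for 5/3, 4 for 9/7)
LIFTING_STEPS = {"5/3": 2, "9/7": 4}


def extension_needed(filter_name, length):
    """Return (left, right) extension counts for a row starting at index 0.

    With an even start index the left side needs one sample per lifting
    step. The right side needs one fewer when the last sample has an odd
    index, that is, when the length is even.
    """
    steps = LIFTING_STEPS[filter_name]
    right = steps if length % 2 else steps - 1
    return steps, right


def unified_extension():
    """Largest count over all filter types and both parities."""
    return max(LIFTING_STEPS.values())


print(extension_needed("9/7", 256), extension_needed("5/3", 7), unified_extension())
```

Under these assumptions the unified count is 4, which agrees with the embodiment's statement that column filtering can begin from the fifth row with 9 rows after extension.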

Claims

1. A VLSI architecture design method for an inherently parallel two-dimensional discrete wavelet transform, characterized in that it comprises the following:

2) by the "line buffer" technique, row-direction and column-direction filtering are performed in parallel within one compact hardware structure;

3) by the "delay register" technique, a pipeline structure is realized;

4) by the unified "symmetric boundary extension" technique, the filter structure is made general;

The "shift-add" technique is:

Multiplying a real number by 2 to the power n is equivalent to shifting that number left by n bits; the product of a real number and a wavelet filter coefficient is thus expressed as the sum of a finite number of shifted copies of that number;

The "line buffer" technique is:

The intermediate results of row-direction filtering of the wavelet transform are stored in a row buffer array; once a certain number of rows has accumulated, column-direction filtering can read data from the row buffer and carry out the column-direction transform, so that row filtering and column filtering can be carried out within one clock cycle, realizing parallel operation; or the intermediate results of column-direction filtering of the wavelet transform are stored in a column buffer array; once a certain number of columns has accumulated, row-direction filtering can read data from the column buffer and carry out the row-direction transform, so that row filtering and column filtering can be carried out within one clock cycle, realizing parallel operation;

The "delay register" technique is:

The row and column filters of the two-dimensional discrete wavelet transform are designed as pipelines, and delay registers buffer the input data and the intermediate data of the computation so that the filters process data continuously;

The unified "symmetric boundary extension" technique is:

In the lifting-based wavelet transform, the signal is extended at its boundaries so that the image can be reconstructed exactly; in the design, the start index of each row or column is uniformly set to zero, and then, following the extension principle, the number of extension samples is uniformly set to the largest symmetric extension count over all filter types.