CN101355700B - Method for encoding parallel series entropy and apparatus thereof - Google Patents

Method for encoding parallel series entropy and apparatus thereof

Publication number
CN101355700B
Authority
CN
China
Prior art keywords
code table
module
code
run
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200810119769
Other languages
Chinese (zh)
Other versions
CN101355700A (en
Inventor
彭小明
曹喜信
张兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi lead Speed Technology Co., Ltd.
Original Assignee
SCHOOL OF SOFTWARE AND MICROELECTRONICS PEKING UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SCHOOL OF SOFTWARE AND MICROELECTRONICS PEKING UNIVERSITY
Priority to CN 200810119769
Publication of CN101355700A
Application granted
Publication of CN101355700B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention discloses a parallel entropy coding method and a device thereof, belonging to the field of video coding and decoding. The method comprises the following steps: first, the quantized coefficients of each transform block are grouped and passed in parallel into and out of two register groups; next, the coding coefficients (run, level) of each group of quantized coefficients are obtained by scanning; then the code table of the current quantized coefficient is selected according to the maximum level among the already coded coefficients, and the code table of the EOB of the transform block is selected according to the maximum level among all coding coefficients; finally, the selected code table is converted into a bit-width table and the number of coded bits of the transform block is calculated. The device comprises a data input dump module, a run-length coding module, a reverse-order matrix module, a code table selection module, a table look-up module, a Golomb coding module, and a module that sums the bit counts of the transform block. The method and the device greatly increase data processing speed, optimize the tables, and save hardware resources.

Description

A parallel entropy coding method and device thereof
Technical field
The present invention relates to an entropy coding method and a device thereof, in particular to a parallel entropy coding method and a device thereof, and belongs to the field of video coding and decoding.
Background technology
In conventional image and video compression, the entropy coder is an indispensable part and forms the last stage of encoding; it represents the processed image data with the shortest possible string of 0s and 1s.
AVS is a new video coding standard with independent intellectual property rights and home-grown innovations; its video part (AVS-P2) has formally become a national standard. AVS video coding targets high-definition broadcast television and audio-video applications. In real-time coding, a large amount of data must be processed per unit time, which is almost impossible for a pure software encoder, so hardware acceleration of the encoder is essential. The present invention is one stage of a hardware-accelerated AVS encoder. Within the encoder, entropy coding accounts for roughly 30% of the overall complexity. The process mainly consists of a serial zig-zag scan of the coefficients to obtain the (run, level) of each coded coefficient, followed by a look-up of the corresponding Golomb code table with (run, level) to obtain the number of bits of its representation.
AVS entropy coding uses adaptive variable-length coding. In the AVS entropy coding process, all syntax elements and residual data are mapped to the binary bitstream as Exp-Golomb codes. The block transform coefficients of the prediction residual are scanned into a sequence of (run, level) pairs. level and run are not independent events but are strongly correlated, so AVS codes level and run jointly in two dimensions and adaptively changes the order of the Exp-Golomb code according to the current probability distribution of level and run. Observation of the (run, level) pairs of residual coefficients shows that the level magnitude tends to increase while the run value tends to decrease, and that pairs with a large level usually have a small run. In other words, the local probability distribution of (run, level) differs between stages of sequential coding. Based on these statistics, multiple 2D-VLC code tables are used to match the local probability distribution of (run, level); because an increase in level magnitude signals a change in the local distribution, the level magnitude is used to switch code tables automatically.
The AVS-P2 standard uses 19 2D-VLC tables in total for entropy coding, and every table uses Exp-Golomb codes. Different types of transform block (intra, inter and chroma) use different tables: 7 intra tables, 7 inter tables and 5 chroma tables code the transform coefficients of intra, inter and chroma blocks respectively. Each 2D-VLC table defines the mapping from (level, run) and EOB (end of block) to a codenumber. The codenumber ranges from 0 to 59, where 59 denotes escape_code. Fig. 1 shows the flow chart of the entropy coder in the existing method.
Summary of the invention
The present invention provides an efficient hardware-accelerated implementation of the AVS entropy coder. Its object is to provide a parallel entropy coding method and a hardware-accelerated AVS-class entropy coding device that raise encoder speed and thereby enable real-time high-definition coding.
The technical solution of the present invention is as follows:
A parallel entropy coding method, comprising the steps of:
1) grouping the quantized coefficients of each transform block and passing them in parallel into and out of two register groups, as follows:
A) in one clock, one group of quantized coefficients is stored in one register group;
B) after the current transform block has been entered into this register group group by group, the quantized coefficients are output from this register group group by group in zig-zag order starting at the next clock, and from the same clock the following groups of quantized coefficients are entered into the other register group;
C) steps a) and b) are repeated, realizing ping-pong operation of the two register groups;
2) computing the coding coefficients (run, level) of each group of quantized coefficients within a single clock by run-length coding, while marking the EOB symbol on the first non-zero quantized coefficient of the current transform block, where level is the absolute value of the quantized coefficient and run is the number of zeros between the quantized coefficient and the previous non-zero quantized coefficient;
3) outputting the coding coefficients (run, level) in reverse order;
4) selecting the code table of the current quantized coefficient according to the maximum level among the coding coefficients already output;
5) selecting the code table of the EOB of this transform block according to the maximum level of all coding coefficients in the current transform block;
6) converting the code tables selected in steps 4) and 5) into bit-width tables by the logic decisions of Golomb coding, and summing the number of coded bits of all coefficients of the current transform block and the number of coded bits of the EOB.
The transform block is an 8x8 block, and the quantized coefficients of the 8x8 block are divided row by row into 8 groups of data.
The level value of a coding coefficient is the absolute value of the quantized coefficient. The run value of a coding coefficient is calculated as follows: a variable base0 is set for assigning the run value, a variable base1 records the number of adjacent zeros at the end of each group of quantized coefficients, and a counter records the row number within the transform block. When counter = 0, base0 = 0; when counter != 0 and base1 != 8, the base0 of the next group of quantized coefficients equals base1.
A parallel entropy coding device, comprising, connected in sequence, a data input dump module, a run-length coding module, a reverse-order matrix module, a code table selection module, a table look-up module, a Golomb coding module, and a transform-block bit-count summation module;
The data input dump module processes each group of quantized coefficients input in parallel and comprises two storage matrices that realize ping-pong operation of data input and output;
The run-length coding module computes, within a single clock, the coding coefficients (run, level) of each group of quantized coefficients read from the register group;
The reverse-order matrix module outputs the coding coefficients (run, level) of the transform block in reverse order;
The code table selection module determines the code table of the current quantized coefficient and the code table of the EOB of this transform block according to the coding coefficients (run, level) currently output in reverse order;
The table look-up module uses the obtained code table to map to the corresponding code word;
The Golomb coding module converts code words into bit widths, yielding the bit-width table corresponding to the code table;
The transform-block bit-count summation module calculates, from the bit-width table, the number of coded bits of all quantized coefficients in the current transform block.
The run-length coding module comprises a sign-bit comparator, a counter, and several input-data comparators that compare each group of input data with zero within a single clock. The input-data comparators are connected to an EOB comparator and to a comparator for the number of adjacent zeros at the end of the group; the EOB comparator is connected to a selector, and the end-of-group adjacent-zero comparator is connected to a run-value comparator through a selector; the counter is connected to both selectors.
The circuit connections of the code table selection module are as follows: several input-data magnitude comparators, which compare the input data of the transform block, are each connected to a comparison circuit that determines the maximum value preceding a coefficient and to a selector; the output of the selector is connected to an EOB multiplexer and to the preceding-maximum comparison circuit; the preceding-maximum comparison circuit is connected to several parallel single-stage multiplexers; the selector and the EOB multiplexer are each connected to the output of the same counter; the outputs of the input-data magnitude comparators connected to the selector are also connected to the EOB multiplexer, and the output of the EOB multiplexer is connected to a code table selection comparator.
The circuit connections of the table look-up module are as follows: a code table type gating switch is connected to the code table gating switches; each code table gating switch is connected through a comparator to a code word gating switch and then to a multiplexer; the code table gating switches and the multiplexer are each connected to a code table number input line; the comparator and the multiplexer are each connected to the input data. The selection signal of the multiplexer refers to the level value for each of the code tables inter_VLC0, inter_VLC1, inter_VLC2, inter_VLC3, intra_VLC0, intra_VLC1, intra_VLC2, intra_VLC3, chroma_VLC0, chroma_VLC1 and chroma_VLC2, and to the run value for each of the code tables inter_VLC4, inter_VLC5, inter_VLC6, intra_VLC4, intra_VLC5, intra_VLC6, chroma_VLC3 and chroma_VLC4.
The circuit connections of the Golomb coding module are as follows: a cascaded comparator that takes the code table number and code table type as input is connected through shifter 1 to adder 1; an input of adder 1 is connected to the level absolute value output, and its output is connected to bitwise comparator 1. The cascaded comparator is also connected through shifter 2 to adder 2; the inputs of adder 2 are connected to the level sign bit output and, through shifter 3 and selector 1, to the run value output, and its output is connected to bitwise comparator 2. Bitwise comparator 1 and bitwise comparator 2 are connected through adder 3 to shifter 4; the outputs of shifter 4 and of the cascaded comparator are connected through a subtracter to an input of selector 2. The cascaded comparator output that is connected to shifter 2 is also connected to a subtracter, whose input is connected through shifter 5 to the code word output and whose output is likewise connected to an input of selector 2; an input of selector 2 is connected to the code word output.
The parallel AVS-class entropy coding method of the present invention comprises:
1) 8-way parallel input and output of quantized data: to further increase the speed of the hardware design, the coefficients of a quantized 8x8 block are loaded into registers over 8 clocks, 8 coefficients per clock. The 8 coefficients, which arrive in natural order, are stored at the positions of the current register group that correspond to the zig-zag scanning order;
2) ping-pong operation of the two register groups: the data of one 8x8 block are loaded into one register group in 8 clocks, so the subsequent operations start at the 9th clock and output 8 coefficients each time. To keep the pipeline running smoothly, the data of the next 8x8 block begin loading into the other, identically structured register group at the 9th clock. From the 9th clock on, the coefficients of the previous 8x8 block are output while the coefficients of the current 8x8 block are loaded, so data input and output are continuous and switch between the two register groups in ping-pong fashion;
3) zig-zag scanning in run-length coding and derivation of (run, level): the coefficients of the input 8x8 block are stored in the register group in zig-zag scanning order, so run-length coding can compute the (run, level) of each coefficient directly in order. However, input and output are 8-way parallel here, that is, 8 coefficients are processed at a time, and the run depends not only on the current 8 coefficients but also on the last non-zero coefficient among the preceding 8. When the (run, level) information of the current 8 coefficients is processed, the number of zeros following the last non-zero coefficient among them must therefore be recorded as the basis for computing the run values of the next 8 coefficients;
4) code table selection according to the maximum of the coding coefficients: in a software entropy coder, code table switching is always based on the level of the previous coding coefficient, so the code table of the current coefficient can only be selected after that of the previous coefficient has been determined. The hardware design of the AVS-class entropy coder of the present invention processes 8 coefficients in parallel; following the software method would require 8 clocks to process one group of 8 parallel coefficients. The present invention therefore modifies the switching rule while preserving the function of the software method: the maximum of the coding coefficients preceding the current coefficient serves as the basis for switching, so that 8 comparators can select the code tables of 8 coefficients in one clock. The hardware function of the AVS-class entropy coder of the present invention remains fully consistent with the software implementation.
5) determining the code table for coding the EOB according to the maximum of all coefficients of the 8x8 block: for each 8x8 block, an EOB is added before the first non-zero coefficient, and the EOB corresponds one-to-one with the code table. In the AVS entropy coder, the coefficients are coded in reverse order, from back to front, and the code table of each coefficient is determined by the non-zero coefficient immediately after it. The EOB code table is determined in the same way as that of a coefficient, namely by the first non-zero coefficient. In practice, however, following method 4), the EOB code table is determined by the maximum of all coefficients of the 8x8 block.
6) establishing a direct relation between the maximum coefficient of the 8x8 block and the number of bits for coding the EOB: because the code table and the EOB map one-to-one, the hardware design method of the present invention establishes a direct relation between the maximum coefficient of the 8x8 block and the number of EOB bits, without determining the EOB through the corresponding code table.
7) conversion between code table and bit-width table: the bit-width calculation of the entropy coder is realized by logic decisions. For the code table selected for each coefficient, the exact code word is not needed, only its bit width. The present invention therefore converts each code table into a bit-width table that maps each codenumber in the code table directly to the number of bits needed to code it.
8) handling of special cases in the code table: according to the AVS standard, the codenumber found in the code table of the original appendix needs further processing: if the original level is negative, codenumber = codenumber + 1. The number of bits needed for the codenumber found in the code table may therefore differ between positive and negative values, although in most cases the bit width is the same, with only a few exceptions. These special cases can be handled by analyzing each code table one by one, which yields the bit-width table of each code table.
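The conversion from code table to bit-width table in items 7) and 8) can be illustrated with a short Python sketch. It uses the standard length formula of a k-th order Exp-Golomb code; the function names, the table size, and the idea of passing the order `k` as a parameter are illustrative assumptions and are not taken from the patent or from the AVS code tables.

```python
def exp_golomb_bits(codenum, k=0):
    """Length in bits of the k-th order Exp-Golomb code word for codenum."""
    # m = floor(log2(codenum + 2**k)), computed with bit_length
    m = (codenum + (1 << k)).bit_length() - 1
    return 2 * m - k + 1


def bit_width_table(k, size=59):
    """Bit widths for codenumbers 0..size-1 at order k (hypothetical helper)."""
    return [exp_golomb_bits(n, k) for n in range(size)]


def coeff_bits(codenum, negative, k):
    """Bits for one coefficient; a negative level uses codenum + 1."""
    return exp_golomb_bits(codenum + (1 if negative else 0), k)
```

At order 0, codenumbers 1 and 2 both need 3 bits, so the sign makes no difference for codenumber 1, whereas codenumber 2 needs 3 bits and codenumber 3 needs 5. Boundaries of this kind are the special points that have to be handled table by table.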
The AVS-class entropy coder hardware device of the present invention comprises: a data input dump module, a run-length coding module, a reverse-order matrix module, a code table selection module, a table look-up module, a Golomb coding module, and a transform-block bit-count summation module.
1) The data input dump module processes the coefficients input in parallel and stores them according to the zig-zag scanning order, so that the dumped coefficients are arranged in progressive (row-by-row) scanning order.
2) In hardware, the data input dump module of the present invention uses 2 register groups of identical structure to achieve ping-pong operation and thus pipelined processing of 8x8 blocks.
3) The run-length coding module of the present invention processes the parallel input data of 1) and obtains the (run, level) of each coefficient. Because the (run, level) values of the coefficients of an input 8x8 block are interrelated, the current input coefficients need run information from the previously processed coefficients. The hardware design therefore adds a record of the run of the previous 8 parallel input coefficients, which serves as the reference for calculating the run of the current 8 parallel input data.
4) To meet the needs of the hardware design, the code table selection module of the present invention modifies the rule of the standard, which selects the code table of the current coding coefficient according to the level of the previous coding coefficient. Instead, the code table of the current coefficient is determined by the maximum of the coefficients preceding it in the current 8x8 block. The code table selections of the individual coefficients then no longer depend on one another, so one clock suffices to process 8 parallel input coefficients.
5) Likewise, the code table selection module of 4) need not wait until the first non-zero coefficient has been processed to select the EOB code table; computing the maximum of all coefficients is enough to determine the code table of the EOB.
6) The reverse-order module of the present invention reverses the order of the coding coefficients of the transform block, as required by the AVS standard. The basic method is to use a counter with 8 states to control data input and reverse-order output (as shown in Figure 12).
7) The table look-up module of the present invention obtains the code word of the current coefficient from the code table determined by the code table selection module. In the precoding stage, however, the actual code word of each coefficient is not needed, only its bit width. The bit-width table of each code table is easy to derive from the properties of Golomb code words, but some special points need handling (for example, at certain positions the sign of the data affects the bit width of the code word, so the sign bit is used as a flag).
8) As described in 6), the special points in converting a code table into a bit-width table arise from the different handling of positive and negative values.
9) When the AVS-class entropy coder hardware device of the present invention is used for precoding, the code table is reduced to a bit-width table, and the bit-width look-up can be implemented in hardware entirely by logic decisions.
10) The Golomb coding module of the present invention mainly codes the bit width obtained in 7) according to the different Golomb orders, and also calculates the number of bits required for coding escape events.
11) The transform-block bit-count summation module of the present invention adds up the number of coded bits of all coefficients of the transform block and the number of coded bits of the EOB.
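The modified selection rule in items 4) and 5) can be sketched in Python as follows. This is only a model under stated assumptions: the threshold values, the mapping from a maximum level to a table index, and the sequential reference model are hypothetical and do not come from the patent or the AVS standard. The sketch only shows that, when the table index never decreases within a block, choosing the table from the maximum of the preceding levels gives the same result as switching one coefficient at a time.

```python
from itertools import accumulate

# Illustrative switching thresholds; NOT taken from the patent or AVS.
THRESHOLDS = (1, 2, 4, 7, 10)


def table_for_max(max_level, thresholds=THRESHOLDS):
    """Hypothetical mapping from a maximum level to a code table index."""
    return sum(max_level >= t for t in thresholds)


def select_tables(levels):
    """Parallel-friendly rule: each table depends only on a prefix maximum.

    levels: absolute level values in coding order.
    Returns (table index per coefficient, table index for the EOB).
    """
    prev_max = [0, *accumulate(levels, max)][:len(levels)]
    coeff_tables = [table_for_max(m) for m in prev_max]
    eob_table = table_for_max(max(levels, default=0))
    return coeff_tables, eob_table


def select_tables_sequential(levels):
    """Assumed software-style model: switch after each coded coefficient."""
    table, coeff_tables = 0, []
    for level in levels:
        coeff_tables.append(table)
        table = max(table, table_for_max(level))
    return coeff_tables, table
```

Because each entry of `prev_max` depends only on the input levels, all 8 table indices of a row could be produced by independent comparators in one clock, which is the point of the modification.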
Beneficial effects of the present invention:
In conventional entropy coding, the stages of the entropy coder must each be completed and can only run serially, which is inefficient and makes real-time coding of HD video difficult. With the parallel entropy coding device of the present invention, processing speed is 8 times that of the conventional method, which greatly improves the performance of the whole encoder. In its hardware implementation, the present invention also optimizes the VLC tables, which consume the most resources, so that area is greatly reduced; the tables are implemented with logic decisions, saving a large amount of hardware resources.
Description of drawings
Fig. 1. AVS-P2 two-dimensional variable-length coding flow chart;
Fig. 2. Schematic diagram of the present invention;
Fig. 3. Block diagram of the AVS-class entropy coder device of the present invention;
Fig. 4. Register storage allocation: (a) normal zig-zag scanning order; (b) corresponding register storage addresses on output;
Fig. 5. Run-length coding algorithm flow chart;
Fig. 6. Hardware pipeline structure of the run-length coding module;
Fig. 7. Algorithm flow chart of code table selection;
Fig. 8. Hardware pipeline structure of the code table selection module;
Fig. 9. Code table ROM storage;
Fig. 10. Hardware pipeline structure of the table look-up module;
Fig. 11. Hardware pipeline structure of the Golomb coding module;
Fig. 12. Reverse-order schematic diagram.
Embodiment
The present invention is described in further detail below with reference to the drawings and specific embodiments.
To exploit the advantages of hardware parallel processing, to minimize the cost of the hardware design, and to resolve conflicts between the algorithm and the hardware design, the present invention proposes several equivalent algorithmic schemes. These comprise: parallel zig-zag scanning, parallel processing of coefficients in run-length coding, elimination of the dependence between coefficients in code table selection, the basis for selecting the EOB code table, simplification of the table look-up operation in the precoding module, and logic implementation of the table look-up operation in hardware.
The principle of the present invention is shown in Fig. 2. The hardware-implemented AVS-class entropy coding method and device comprise: ping-pong operation of two register groups for parallel input of image residual data; a run-length coding module that zig-zag scans the image residual coefficients; a code table selection module that selects the code table used to code each coding coefficient; a table look-up module that maps the coding coefficient (run, level) to a code word once the code table is selected; and a Golomb coding module that maps the codenum (code word) obtained by look-up to a bit width (number of bits). First, the residual coefficients produced by the modules preceding entropy coding (motion estimation and compensation, transform and quantization) are input in parallel and stored in one of the two register groups in a predetermined order, namely with each coefficient placed at the position given by the zig-zag scan to be performed. The subsequent run-length coding can then fetch the rows from the registers one after another in natural order. Run-length coding produces the (level, run) of each coefficient. Next, a code table is selected for each (level, run) according to the level values: AVS 2D-VLC has 19 code tables for coefficient coding, and the code table of each coefficient is determined by its level and the levels of the already coded coefficients. Finally, the code word obtained from the code table is converted, according to the structure and order of the Golomb code word, into the bit width to be written to the bitstream. Each module involved in this process is described in detail below.
Fig. 3 shows the pipeline block diagram of the AVS-class entropy coder of the present invention, which comprises the following submodules:
1. Storage allocation of data:
The data width after transform and quantization is 12 bits. The data of each 8x8 block enter the registers one row at a time and are stored in the register group according to the zig-zag scanning structure shown in Fig. 4. To keep the 8-way parallel pipeline running smoothly, this module uses two 8x8 register groups that swap every 8 clocks. The internal conversion structure of the storage matrix is described below.
Because the data are input 8 in parallel, they are stored into the register group in zig-zag order at storage time to avoid ordering conflicts during run-length coding; arranging the data this way also solves the parallel processing of the data. As shown in Fig. 4, the zig-zag order is 0, 1, 8, 16, 9, 2, 3, 10, 17, 24, ..., corresponding to register matrix positions 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ... Taking one of the matrices as an example, the clock schedule of the data flow is as follows. At the 1st clock, the 8 input data are stored at addresses 0, 1, 8, 16, 9, 2, 3, 10; at the 2nd clock, the input data are stored at 17, 24, 32, 25, 18, 11, 4, 5; ...; at the 8th clock, the input data are stored at 53, 60, 61, 54, 47, 55, 62, 63. At the 9th clock, input switches to the other matrix and the current matrix begins to output data: 0, 1, 8, 16, 9, 2, 3, 10; at the 10th clock it outputs 17, 24, 32, 25, 18, 11, 4, 5; ...; at the 16th clock it outputs 53, 60, 61, 54, 47, 55, 62, 63.
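The zig-zag sequence and the 8-per-clock grouping quoted above can be reproduced with a small Python sketch. The function names are hypothetical, and the way `load_registers` places raster-ordered coefficients is one reading of the text (position p of the register group holds the coefficient whose raster index is the p-th entry of the zig-zag order), not a description of the actual circuit.

```python
def zigzag_order(n=8):
    """Raster indices of an n x n block listed in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):          # s = row + column of an anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                   # even diagonals run bottom-left to top-right
            rows = reversed(rows)
        for r in rows:
            order.append(r * n + (s - r))
    return order


def output_schedule(n=8):
    """Groups of n raster indices, one group per output clock."""
    order = zigzag_order(n)
    return [order[i:i + n] for i in range(0, n * n, n)]


def load_registers(block, n=8):
    """Store a raster-ordered block so that reading positions 0..n*n-1
    in sequence yields the coefficients in zig-zag order."""
    return [block[idx] for idx in zigzag_order(n)]
```

Calling `output_schedule()` gives the groups listed for the 9th to 16th clocks, starting with 0, 1, 8, 16, 9, 2, 3, 10 and ending with 53, 60, 61, 54, 47, 55, 62, 63.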
2. Run-length coding:
The run-length coding algorithm is shown in Fig. 5. This module run-length codes the numbers in the register group: it scans each number of the matrix, produces a (run, level) pair for each non-zero coefficient, and marks the EOB symbol on the first non-zero coefficient, so that every block that is not entirely zero (cbp non-zero) has an EOB symbol.
level is the absolute value of the input coefficient and is obtained directly by taking the absolute value. The key is computing run. Because the data are input in parallel, a reference variable for run, called base0, is constructed. Suppose the data are input 8 in parallel, denoted {a0, a1, a2, a3, a4, a5, a6, a7}; base0 is the number of adjacent zeros preceding a0. If a0 is not 0, the run of a0 equals base0. Once the run of a0 is known, the runs of a0, a1, a2, a3, a4, a5, a6 and a7 in the same row can all be computed, so the data can be processed in parallel, one row per clock.
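As a rough illustration of how base0 seeds the run values of one row, the following Python sketch models the computation sequentially. The function name is hypothetical, the loop stands in for what the hardware does in parallel, and the (0, 0) placeholder for zero coefficients follows the convention given for the fourth pipeline stage later in the text.

```python
def run_level_row(row, base0):
    """(run, level) for every coefficient of one row.

    row:   the coefficients a0..a7 of the current row, in scan order
    base0: number of adjacent zeros immediately preceding a0
    """
    pairs = []
    zeros = base0                        # zeros seen since the last non-zero value
    for coeff in row:
        if coeff != 0:
            pairs.append((zeros, abs(coeff)))
            zeros = 0
        else:
            pairs.append((0, 0))         # placeholder, costs no bits
            zeros += 1
    return pairs
```

For the row 0, 12, 3, 0, 0, 4, 0, 0 with base0 = 0, the non-zero coefficients 12, 3 and 4 get run values 1, 0 and 2.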
base0 is calculated as follows:
Taking 8-way parallel input as an example, a counter running from 0 to 7 is first needed to indicate that the data input during this period belong to the same block. Clearly, when counter = 0, base0 = 0 (counter = 0 when the first group of 8 data is input, that is, at the first clock; the value of counter identifies the clock). At other times base0 is a varying quantity. Another variable, base1, is therefore defined as the number of adjacent zeros at the end of the 8 data of the input row. For example, for the 8 inputs 0, 12, 3, 0, 0, 4, 0, 0, base1 = 2; for 1, 0, 3, 0, 0, 0, 3, 7, base1 = 0; for 0, 0, 0, 0, 0, 0, 0, 0, base1 = 8. In summary, base0 is determined as follows: when counter = 0, base0 = 0; when counter != 0 and base1 != 8, the base0 of the next row equals base1.
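The base1 examples and the base0 update can be checked with a short Python sketch. The helper names are hypothetical. The all-zero case (base1 = 8) uses the rule base0 = 8 + base0 that the text gives for the third pipeline stage; treating the two statements as one update rule is an assumption of this sketch.

```python
def trailing_zeros(row):
    """base1: number of adjacent zeros at the end of a row."""
    count = 0
    for coeff in reversed(row):
        if coeff != 0:
            break
        count += 1
    return count


def base0_per_row(rows):
    """base0 seen by each row of one block (rows given in scan order)."""
    values = []
    base0 = 0
    for counter, row in enumerate(rows):
        if counter == 0:
            base0 = 0                    # first row of a block starts from zero
        values.append(base0)
        base1 = trailing_zeros(row)
        # An all-zero row extends the pending zero run; otherwise it restarts.
        base0 = base0 + len(row) if base1 == len(row) else base1
    return values
```

The three example rows from the text give `trailing_zeros` results of 2, 0 and 8 respectively.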
As shown in Figure 6, the hardware structure of the whole module can be roughly divided into a 4-stage pipeline. The four stages are described below, taking 8 parallel inputs as the example:
The first stage consists mainly of a number of parallel comparators and a counter. Eight comparators each produce an indication of whether the corresponding input coefficient is zero, denoted sign_zero[0~7]; for example, if the 8 flags produced are 0, 1, 0, 1, 0, 0, 1, 1, then sign_zero=01010011. The counter counts the beats of valid input data; every 8 beats make up the data of one 8x8 block.
The second stage consists mainly of two parallel find-first-1 logic blocks. Searching for a 1 from sign_zero[7] to sign_zero[0] gives the first nonzero coefficient of each row, which is marked 1 while the others are marked 0. This yields EOB[7:0] for each row; clearly the EOB[7:0] of a row contains at most one 1, or is all zero. Searching for a 1 from sign_zero[0] to sign_zero[7] gives the number of zeros after the last nonzero coefficient of the row, denoted base1.
The third stage consists mainly of two selectors. One selector, controlled by the counter, gates EOB: within one block, once the first nonzero coefficient has appeared, all later coefficients are marked 0, which guarantees that each transform block has at most one EOB equal to 1. The other selector constructs the run value of the first coefficient of each row, denoted base0[5:0]: if base1!=8, base0<=base1; if base1=8, base0<=8+base0 (here <= denotes register assignment).
The fourth stage consists mainly of 8 comparators, which produce the (run, level) of each coefficient. Note that traditional zig-zag run-length coding encodes only the nonzero coefficients; because an 8-way parallel structure is used here, an input coefficient equal to zero is recorded as (run=0, level=0) and also takes part in coding, except that the number of bits it needs is 0. The third stage has already produced base0, the run value preceding the first coefficient of each row; combined with sign_zero[7:0], the comparators readily give the (run, level) of every coefficient in the row.
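The net effect of the four stages can be modelled in software. The sketch below is a behavioural model only: it walks each row sequentially, whereas the hardware evaluates all 8 positions in one clock. The names `run_level_rows` and `sequential_run_level` are hypothetical, and the EOB flag is placed on the first nonzero coefficient of the block as the text describes.

```python
from typing import List, Tuple


def run_level_rows(rows: List[List[int]]) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Model one block: a (run, level) pair and an EOB flag per input."""
    pairs: List[Tuple[int, int]] = []
    eob: List[int] = []
    base0 = 0          # zeros pending before the first input of the row
    eob_done = False   # set once the block's single EOB has been marked
    for row in rows:
        zeros_before = base0
        for value in row:
            if value == 0:
                pairs.append((0, 0))   # zero input: recorded, costs 0 bits
                eob.append(0)
                zeros_before += 1
            else:
                pairs.append((zeros_before, abs(value)))
                eob.append(0 if eob_done else 1)
                eob_done = True
                zeros_before = 0
        # Equals base1, or base0 plus the row width for an all-zero row.
        base0 = zeros_before
    return pairs, eob


def sequential_run_level(flat: List[int]) -> List[Tuple[int, int]]:
    """Reference: ordinary run-length coding over nonzero coefficients."""
    out, zeros = [], 0
    for value in flat:
        if value == 0:
            zeros += 1
        else:
            out.append((zeros, abs(value)))
            zeros = 0
    return out
```

Dropping the (0, 0) placeholders from the row-wise result gives the same pairs as the sequential reference, which is the property the parallel structure relies on.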
3. Code table selection
According to the AVS standard, the variable maxAbsLevel (the largest absolute level so far) is initialized to 0, and the first nonzero coefficient selects code table VLC0. The absolute value of each coefficient (abslevel) is then compared with maxAbsLevel: if abslevel is greater than maxAbsLevel, the code table is switched; otherwise the code table stays the same. After a switch, the value of abslevel is assigned to maxAbsLevel. This repeats until every coefficient has been decoded.
For the encoder, the code table used for the next coefficient depends on the code table selected for the previous coefficient and on the maxAbsLevel associated with that table. Since a parallel structure is used here, such a sequential algorithm is clearly difficult to implement in hardware. Analysis leads to an important conclusion: the code table selected for the current coefficient is determined solely by the maximum level of the coefficients already coded. So as long as the maximum before a coefficient can be obtained, the code table of that coefficient is known. The code table selection algorithm based on this conclusion is shown in Figure 7.
The algorithm steps are as follows (taking 8 parallel inputs as the example):
Step 1: compare the 8 coefficient level values (level[0~7]) in sequence to obtain 8 maxima, denoted max[0~6] and maxlevel1. They are defined as max0=max{level0}, max1=max{level0, level1}, max2=max{level0, level1, level2}, ..., max6=max{level0~6}, maxlevel1=max{level0~7}; maxlevel1 is the largest value in the current input row.
Step 2: determine the maximum of the data already input for the 8x8 block, denoted maxlevel0. Clearly, when counter=0, maxlevel0=0; when counter!=0, maxlevel0 and maxlevel1 are compared and the larger becomes maxlevel0.
Step 3: determine the parameter values used to select the code table, denoted tab_value0~7. By the conclusion above, the code table selected for the current coefficient is determined solely by the maximum of the coefficients already coded. So for the 8 input coefficients level0~7, the code table of level0 is determined by tab_value0=maxlevel0, the code table of level1 by tab_value1=max{maxlevel0, max0}, ..., and the code table of level7 by tab_value7=max{maxlevel0, max6}.
Step 4: select the corresponding code table according to the AVS standard.
In addition, according to the standard the EOB code table is selected by the level of the last coded nonzero coefficient; by the analysis above, however, the EOB code table is in fact determined by the largest value in the 8x8 block.
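Steps 1 to 3 amount to a prefix-maximum computation carried across rows. The following sketch, with the hypothetical function name `tab_values`, computes the selection parameter of every coefficient and the block maximum used for EOB; mapping a parameter to an actual AVS code table (step 4) depends on thresholds in the standard and is deliberately left out.

```python
from typing import List, Tuple


def tab_values(rows: List[List[int]]) -> Tuple[List[List[int]], int]:
    """Per-coefficient table-selection parameters, plus the block maximum."""
    result = []
    maxlevel0 = 0  # maximum over rows already input (0 when counter == 0)
    for row in rows:
        levels = [abs(v) for v in row]
        # Step 1: prefix maxima; the last entry is maxlevel1 (row maximum).
        prefix, running = [], 0
        for lv in levels:
            running = max(running, lv)
            prefix.append(running)
        maxlevel1 = prefix[-1]
        # Step 3: each coefficient sees only the maximum of what precedes it.
        row_tab = [maxlevel0] + [
            max(maxlevel0, prefix[i - 1]) for i in range(1, len(levels))
        ]
        result.append(row_tab)
        # Step 2, applied for the next row.
        maxlevel0 = max(maxlevel0, maxlevel1)
    return result, maxlevel0
```

The second return value is the largest level in the block, which is the quantity the text says determines the EOB code table.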
The hardware pipeline of the whole module is divided into 4 stages, as shown in Figure 8:
The first stage consists mainly of 56 comparators and the counter counter[2:0]. To obtain max0~7[11:0] (that is, max0~6 and maxlevel1) within one clock, each input is compared with the other 7 inputs, so 56 comparators are needed in total. counter[2:0] indicates the beat of the input data; every 8 beats make up one 8x8 block.
The second stage contains a selector controlled by counter[2:0]. When the first row of data is being input, maxlevel0[11:0] equals 0; otherwise it equals the larger of maxlevel0 and maxlevel1.
The third stage is the comparison circuit that determines the maximum preceding each coefficient; it contains 7 parallel comparators. max0~6[11:0] are each compared with maxlevel0[11:0] to give the lookup parameters tab_value1~7 of the coefficients; the lookup parameter of the first coefficient is determined by maxlevel0[11:0] alone, i.e. tab_value0[11:0]=maxlevel0[11:0].
The fourth stage contains 8 parallel single-stage MUXes, an EOB MUX, and the EOB code table selection comparator.
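The text states only that each input is compared with the other 7, giving 56 comparators. One plausible way to turn those comparison results into all prefix maxima without a sequential chain is sketched below; the combining logic and the name `prefix_max_by_pairwise_compare` are assumptions made for illustration, not details taken from the patent.

```python
from typing import List, Tuple


def prefix_max_by_pairwise_compare(levels: List[int]) -> Tuple[List[int], int]:
    """Prefix maxima from pairwise >= results; also returns comparator count."""
    n = len(levels)
    # One comparator per ordered pair (j, m) with j != m: n * (n - 1) results.
    ge = {
        (j, m): levels[j] >= levels[m]
        for j in range(n)
        for m in range(n)
        if j != m
    }
    result = []
    for i in range(n):
        # max_i is any level_j (j <= i) that is >= every other level_m (m <= i).
        for j in range(i + 1):
            if all(ge[(j, m)] for m in range(i + 1) if m != j):
                result.append(levels[j])
                break
    return result, len(ge)
```

Every output depends only on the comparison results, so in hardware all eight maxima can settle within the same clock.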
4. Table lookup
The main function of this module is to query the code table and determine the code word (codenumber) value. The most common hardware approach is to store the code tables in ROM and look them up by address; the ROM storage of the code tables is shown in Figure 9.
Part 3 has already determined the code table that each (run, level) must query. A MUX driven by the code table selection then addresses the ROM to reach the corresponding code table, and the codenumber value is selected from that table according to level.
The biggest difference between 2D VLC entropy coding and precoding is that the former must generate the code stream from codenumber, whereas the latter only needs the number of bits of the codenumber code stream. From the bit width of each codenumber in a code table and the Golomb code word, the bit-width table corresponding to each code table can be obtained. The mapping between a code table and its bit-width table is illustrated with intra_VLC4 as the example.
Table 1. AVS standard appendix table intra_VLC4
Figure G200810119769XD00111
According to the standard, intra_VLC4 is coded with a second-order Golomb code. From the codenumber in the code table and the Golomb coding rule, the corresponding number of information bits can be found. The converted code table is as follows:
Table 2. intra_VLC4 bit-width table
Figure G200810119769XD00112
For the escape event, the number of information bits can be set to 7 as a flag. Two special cases in Table 2, (run=1, level=2) and (run=0, level=6), can each take two different values. This is because, according to the standard, the codenumber found in the original appendix code table must be further processed as follows: if the original level is negative, codenumber=codenumber+1. The codenumbers of these two special cases are 11 and 27 respectively; if level is positive the numbers of information bits are 3 and 4, and if level is negative they are 4 and 5. The other code tables are converted in the same way. Observation shows that the number of information bits in the code tables takes only the values 1~6 and 7, so simple logic can replace the ROM to implement precoding. Figure 10 uses intra_VLC4 as the example to show how the number of information bits of a (run, level) is obtained: the inputs are the (run, level) of a coefficient, the table type and the corresponding code table, and a series of logic decisions yields the number of information bits of that coefficient.
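The two special cases can be reproduced numerically from the Golomb information-bit rule given in the Golomb coding part of this description, M = floor(log2(codenum + 2^k)), with k = 2 for intra_VLC4. The sketch below is illustrative; the function names are hypothetical.

```python
def info_bits(codenum: int, k: int) -> int:
    """floor(log2(codenum + 2**k)), computed with integer arithmetic only."""
    return (codenum + (1 << k)).bit_length() - 1


def info_bits_signed(codenum_for_positive: int, negative: bool, k: int = 2) -> int:
    """Information bits after the sign adjustment codenumber + 1 for level < 0."""
    codenum = codenum_for_positive + (1 if negative else 0)
    return info_bits(codenum, k)
```

With k = 2, codenumber 11 gives 11 + 4 = 15 (3 information bits) while 12 gives 16 (4 bits); likewise 27 gives 31 (4 bits) and 28 gives 32 (5 bits). The sign adjustment changes the bit count only when it pushes codenum + 2^k across a power of two, which is why only these two entries of the table take two values.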
Based on the above analysis, the table lookup hardware can be divided into a 3-stage pipeline, as shown in Figure 10:
The first stage is a cascade of the code table type gating switch and the code table gating switch, which together locate one specific code table. That is, only one code table is gated at any time.
The second stage consists of a number of parallel comparators, which determine the code word from the (run, level) of each coefficient. In the hardware design the lookup is implemented with comparators, and it is worth noting that the quantity compared varies with the code table. For example, inter_VLC0 can use run as the comparison object while inter_VLC6 can use level; the aim in every case is to minimize the number of comparators.
The third stage consists mainly of a gating switch and a MUX. The gating switch uses the same gating logic as before and selects one of 3 paths. The MUX selection signal is the counterpart of the second-stage comparison signal: for the 11 tables inter_VLC0, inter_VLC1, inter_VLC2, inter_VLC3, intra_VLC0, intra_VLC1, intra_VLC2, intra_VLC3, chroma_VLC0, chroma_VLC1 and chroma_VLC2, the selection signal sel refers to the level value; for the other tables, sel refers to the run value.
These three stages give the number of information bits for each (run, level) pair; a bit number equal to 7 indicates an escape event.
5. Golomb coding
The function of this module is to obtain the number of bits of each (run, level) pair. The number of information bits of a Golomb code follows the rule:
$M = \lfloor \log_2(\mathrm{codenum} + 2^{k}) \rfloor$,
where k is the order of the Golomb code, codenum is the value to be encoded (the codenumber), and M is the resulting information bit width. The number of bits needed to code the codenumber of each coefficient is therefore:
$M_{\mathrm{stream}} = 2M + 1 - k$
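The length formula can be checked against an explicit k-th order Exp-Golomb codeword. The sketch below builds the codeword as a bit string (a prefix of M - k zeros followed by the binary form of codenum + 2^k) and compares its length with 2M + 1 - k; the function names are hypothetical.

```python
def exp_golomb_codeword(codenum: int, k: int) -> str:
    """k-th order Exp-Golomb codeword of codenum, as a string of 0s and 1s."""
    value = codenum + (1 << k)
    m = value.bit_length() - 1        # M = floor(log2(codenum + 2**k))
    return "0" * (m - k) + format(value, "b")


def exp_golomb_length(codenum: int, k: int) -> int:
    """Codeword length from the closed form M_stream = 2M + 1 - k."""
    m = (codenum + (1 << k)).bit_length() - 1
    return 2 * m + 1 - k
```

A precoder that only needs bit counts can use the closed form alone and never construct the codeword.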
For the escape event, run is still coded with the Golomb order of the current code table, while level is coded with the Golomb order prescribed by the standard.
The hardware implementation of the whole module, shown in Figure 11, is completed in the following steps:
Step 1: a cascade comparator, 3 shifters and a selector. Following the AVS standard, the cascade comparator derives the Golomb orders for coding run and for the escape event from the input signals table_type[1:0] and table_num[2:0]. Two of the shifters shift by the Golomb orders of run and level; the third shifts run[5:0]. The selector chooses the constant added when run is coded in an escape event: 60 when the sign is positive, 59 otherwise.
Step 2: two adders, which compute C_level=level+(1<<esc_rank) and C_run=2*run+59/60+(1<<rank).
Step 3: two parallel comparators, which obtain the information bit widths for coding level and run. Since hardware cannot take a logarithm directly, this function is implemented by logic that searches for the first 1 from the high bit to the low bit.
Step 4: one adder, which computes M_level[3:0]+M_run[3:0].
Step 5: two shifters and two adders (performing subtraction), which obtain the numbers of coded bits for the escape event and the non-escape event respectively.
Step 6: one selector. When value[2:0] equals 7, an escape event is indicated and bits_esc[5:0] is output; otherwise bits[3:0] is output.
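Steps 3 to 6 can be sketched in software as follows. The sketch assumes the escape bit count is the sum of the two Golomb lengths 2M+1-k for run and level, which is consistent with the formulas above but is not spelled out in the text; `codenum_run` and `codenum_level` stand for whatever code numbers the standard assigns, and all function names are hypothetical.

```python
def leading_one_position(x: int, width: int = 16) -> int:
    """Index of the most significant 1, scanning from the high bit down."""
    if x <= 0 or x >> width:
        raise ValueError("x must be positive and fit in `width` bits")
    for bit in range(width - 1, -1, -1):
        if (x >> bit) & 1:
            return bit
    raise AssertionError("unreachable for positive x")


def escape_bits(codenum_run: int, rank: int, codenum_level: int, esc_rank: int) -> int:
    """Bits for an escape event: Golomb length of run plus that of level."""
    c_run = codenum_run + (1 << rank)          # step 2
    c_level = codenum_level + (1 << esc_rank)  # step 2
    m_run = leading_one_position(c_run)        # step 3
    m_level = leading_one_position(c_level)    # step 3
    # Steps 4 and 5: (2*m_run + 1 - rank) + (2*m_level + 1 - esc_rank).
    return 2 * (m_run + m_level) + 2 - rank - esc_rank


def select_bits(value: int, bits: int, bits_esc: int) -> int:
    """Step 6: value == 7 flags an escape event."""
    return bits_esc if value == 7 else bits
```

The leading-one scan returns floor(log2(x)) for positive x, which is what replaces the logarithm in hardware.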

Claims (8)

1. A parallel entropy coding method, comprising the steps of:
1) grouping the quantized coefficients of each transform block and inputting and outputting them in parallel through two register groups, as follows:
a) within one clock, storing one group of quantized coefficients into a register group;
b) after the current transform block has been input to this register group group by group, outputting the quantized coefficients from this register group group by group in zig-zag order starting from the next clock, while from the same clock the subsequent groups of quantized coefficients are input to the other register group;
c) repeating steps a) and b) to realize ping-pong operation of the two register groups;
2) calculating, within the same clock and by run-length coding, the coding coefficients (run, level) of each group of quantized coefficients, while marking the EOB symbol on the first nonzero quantized coefficient of the current transform block; wherein level is the absolute value of the quantized coefficient and run is the number of zeros between the quantized coefficient and the previous nonzero quantized coefficient;
3) outputting the coding coefficients (run, level) in reverse order;
4) selecting the code table of the current quantized coefficient according to the maximum level among the coding coefficients already output;
5) selecting the EOB code table of the transform block according to the maximum level of all coding coefficients in the current transform block;
6) converting the code tables selected in steps 4) and 5) into bit-width tables through the logic decisions of Golomb coding, and summing to calculate the number of coded bits of all coefficients of the current transform block and the number of coded bits of EOB.
2. The method of claim 1, wherein the transform block is an 8x8 block, and the quantized coefficients of the 8x8 block are divided row by row into 8 groups of data.
3. The method of claim 1, wherein the level value of a coding coefficient is the absolute value of the quantized coefficient, and the run value of a coding coefficient is calculated as follows: a variable base0 is set for assigning the run value, a variable base1 records the number of adjacent zeros at the end of each group of quantized coefficients, and a counter counter records the row number within the transform block; when counter=0, base0=0; when base1!=8, the base0 of the next group of quantized coefficients equals base1.
4. A parallel entropy coding device, comprising a data input and transfer module, a run-length coding module, a reverse-order matrix module, a code table selection module, a table lookup module, a Golomb coding module and a transform block bit-number summing module, connected in sequence;
the data input and transfer module processes each group of quantized coefficients input in parallel, and comprises two storage matrices that realize ping-pong operation of data input and output;
the run-length coding module calculates, within the same clock, the coding coefficients (run, level) of each group of quantized coefficients input from the register group, wherein level is the absolute value of the quantized coefficient and run is the number of zeros between the quantized coefficient and the previous nonzero quantized coefficient;
the reverse-order matrix module outputs the coding coefficients (run, level) of the transform block in reverse order;
the code table selection module determines the code table of the current quantized coefficient and the EOB code table of the transform block from the coding coefficients (run, level) currently output in reverse order;
the table lookup module maps the obtained code table to the corresponding code words;
the Golomb coding module converts the code words into bit widths to obtain the bit-width table corresponding to the code table;
the transform block bit-number summing module calculates the number of coded bits of all quantized coefficients of the current transform block from the bit-width table.
5. The device of claim 4, wherein the run-length coding module comprises a sign bit comparator, a counter, and a number of input data comparators that compare each group of input data with zero within the same clock; the input data comparators are connected to an EOB comparator and to a group-end adjacent-zero count comparator respectively; the EOB comparator is connected to a selector, and the group-end adjacent-zero count comparator is connected to a run value comparator through a selector; the counter is connected to each of the two selectors.
6. The device of claim 4, wherein the circuit connections of the code table selection module are: a number of input data magnitude comparators, which compare the input data of the transform block, are connected to a comparison circuit that determines the maximum preceding each coefficient and to a selector respectively; the output of the selector is connected to an EOB MUX and to the comparison circuit that determines the maximum preceding each coefficient respectively; the comparison circuit that determines the maximum preceding each coefficient is connected to a number of parallel single-stage MUXes; the selector and the EOB MUX are each connected to the output of the same counter; the outputs of the input data magnitude comparators connected to the selector are connected to the EOB MUX, and the output of the EOB MUX is connected to a code table selection comparator.
7. The device of claim 4, wherein the circuit connections of the table lookup module are: a code table type gating switch is connected to each of the code table gating switches; each code table gating switch is connected through a comparator to a code word gating switch and then to a MUX; the code table gating switches and the MUX are each connected to a code table number input line; the comparators and the MUX are each connected to the input data; for the code tables inter_VLC0, inter_VLC1, inter_VLC2, inter_VLC3, intra_VLC0, intra_VLC1, intra_VLC2, intra_VLC3, chroma_VLC0, chroma_VLC1 and chroma_VLC2, the selection signal of the MUX refers to the level value corresponding to each code table; for the code tables inter_VLC4, inter_VLC5, inter_VLC6, intra_VLC4, intra_VLC5, intra_VLC6, chroma_VLC3 and chroma_VLC4, the selection signal refers to the run value corresponding to each code table.
8. The device of claim 4, wherein the circuit connections of the Golomb coding module are: a cascade comparator, which receives the code table number and the code table type, is connected to an adder 1 through a shifter 1; an input of adder 1 is connected to the level absolute value output, and its output is connected to a bitwise comparator 1; the cascade comparator is connected to an adder 2 through another shifter 2; the inputs of adder 2 are connected to the level sign bit output through a selector 1 and to the run value output through a shifter 3, and its output is connected to a bitwise comparator 2; bitwise comparator 1 and bitwise comparator 2 are connected to a shifter 4 through an adder 3, and shifter 4 and the two outputs of the cascade comparator are connected to an input of a selector 2 through a subtracter; the cascade comparator output connected to shifter 2 is also connected to a subtracter, whose input is connected to the code word output through a shifter 5 and whose output is also connected to an input of selector 2; an input of selector 2 is connected to the code word output.
CN 200810119769 2008-09-09 2008-09-09 Method for encoding parallel series entropy and apparatus thereof Expired - Fee Related CN101355700B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810119769 CN101355700B (en) 2008-09-09 2008-09-09 Method for encoding parallel series entropy and apparatus thereof

Publications (2)

Publication Number Publication Date
CN101355700A CN101355700A (en) 2009-01-28
CN101355700B true CN101355700B (en) 2010-06-02

Family

ID=40308242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810119769 Expired - Fee Related CN101355700B (en) 2008-09-09 2008-09-09 Method for encoding parallel series entropy and apparatus thereof

Country Status (1)

Country Link
CN (1) CN101355700B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108471537B (en) * 2010-04-13 2022-05-17 Ge视频压缩有限责任公司 Device and method for decoding transformation coefficient block and device for coding transformation coefficient block
CN102625096A (en) * 2011-01-28 2012-08-01 联合信源数字音视频技术(北京)有限公司 Parallel precoding equipment based on AVS
CN113382238A (en) * 2020-02-25 2021-09-10 北京君正集成电路股份有限公司 Method for accelerating calculation speed of partial bit number of residual coefficient
CN116233389A (en) * 2021-12-03 2023-06-06 维沃移动通信有限公司 Point cloud coding processing method, point cloud decoding processing method and related equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1889689A (en) * 2006-06-01 2007-01-03 上海交通大学 Runs decoding, anti-scanning, anti-quantization and anti-inverting method and apparatus
CN101114833A (en) * 2007-08-06 2008-01-30 北京航空航天大学 Encoding device and method for index Golomb coding

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yangling Chen et al. A Memory-Efficient CAVLC Decoding Scheme for H.264/AVC. Advanced Communication Technology, 2008, full text. *
Wang Qingchun, Cao Xixin. VLSI design of a fractional-pixel motion vector cost generator. Digital TV & Digital Video, 2007, full text. *

Also Published As

Publication number Publication date
CN101355700A (en) 2009-01-28

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: WUXI INSPEED COMMUNICATIONS CO.,LTD.

Free format text: FORMER OWNER: SCHOOL OF SOFTWARE AND MICROELECTRONICS, PEKING UNIVERSITY

Effective date: 20140217

COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 102600 DAXING, BEIJING TO: 214043 WUXI, JIANGSU PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20140217

Address after: Tong Hui Road Beitang District 214043 Jiangsu city of Wuxi Province, No. 436 building 45

Patentee after: Wuxi Inspeed Communications Co.,Ltd.

Address before: 102600 Beijing Jinyuan Industrial Zone Daxing District Road No. 24

Patentee before: School of Software and Microelectronics, Peking University

TR01 Transfer of patent right
ASS Succession or assignment of patent right

Owner name: JIANGSU MINLEHUI COMMERCE TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: WUXI INSPEED COMMUNICATIONS CO.,LTD.

Effective date: 20140827

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 214043 WUXI, JIANGSU PROVINCE TO: 214045 WUXI, JIANGSU PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20140827

Address after: Beitang District Road 214045 Jiangsu Minfeng 198-404 city of Wuxi Province

Patentee after: Jiangsu Minle Business Technology Co., Ltd.

Address before: Tong Hui Road Beitang District 214043 Jiangsu city of Wuxi Province, No. 436 building 45

Patentee before: Wuxi Inspeed Communications Co.,Ltd.

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20151120

Address after: Tong Hui Road Beitang District 214043 Jiangsu city of Wuxi Province, No. 436 building 45

Patentee after: Wuxi lead Speed Technology Co., Ltd.

Address before: Beitang District Road 214045 Jiangsu Minfeng 198-404 city of Wuxi Province

Patentee before: Jiangsu Minle Business Technology Co., Ltd.

DD01 Delivery of document by public notice

Addressee: Ai Zhuxuan

Document name: Notification of Passing Examination on Formalities

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100602

Termination date: 20160909
