CN101060337A

CN101060337A - An optimized Huffman decoding method and device

Info

Publication number: CN101060337A
Application number: CN 200710099475
Authority: CN
Inventors: 王箫程
Original assignee: Vimicro Corp
Current assignee: Zhongxing Technology Co ltd
Priority date: 2007-05-22
Filing date: 2007-05-22
Publication date: 2007-10-24
Anticipated expiration: 2027-05-22
Also published as: CN100578943C

Abstract

The disclosed optimized Huffman decoding method comprises: grouping all code words in the Huffman code table into variable-length fragments; generating nodes from the variable-length code word fragments to form a variable-length grouped Huffman search tree; and searching the code stream according to that tree to obtain the corresponding source symbols. The invention reduces the required memory space.

Description

An optimized Huffman decoding method and device

Technical field

The present invention relates to the field of data compression encoding and decoding, and in particular to an optimized Huffman decoding method and device.

Background technology

Huffman coding is a coding method, proposed by Huffman in 1952, that exploits the statistical properties of source symbols; it is described as a top-to-bottom coding method. Huffman coding is a widely used form of entropy coding and one of the most basic and important coding techniques.

Figure 1 shows a simple implementation flow of Huffman coding, which comprises the following steps:

Step 101: Arrange the N source symbols in descending order of their probabilities of occurrence P_i (i = 1, 2, ..., N), so that P_1 ≥ P_2 ≥ ... ≥ P_N;

Step 102: Add the probabilities of the two least probable source symbols to form a combined probability, then re-sort this combined probability together with the probabilities of the remaining source symbols in order of size;

Step 103: Determine whether the combined probability is 1; if so, go to step 104, otherwise return to step 102;

Step 104: Connect the source symbols with lines and encode progressively from back to front. Each node has two branches: assign 0 to the branch with the larger probability and 1 to the branch with the smaller probability (or the reverse, 1 to the larger and 0 to the smaller). After passing through several nodes, a terminal node, also called an end point, is reached;

Step 105: The 0s and 1s on the path from the first node to an end point, arranged in order, form the code word of the source symbol corresponding to that end point.
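Steps 101 to 105 can be sketched in Python roughly as follows. This is a minimal illustration rather than the procedure of Figure 1 itself: the function name `huffman_code`, the use of a heap in place of repeated re-sorting, and the example probabilities are all assumptions made for the sketch.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Build a Huffman code from a {symbol: probability} mapping (hypothetical helper)."""
    if len(probabilities) == 1:
        # Degenerate case: a single symbol still needs one bit.
        return {symbol: "0" for symbol in probabilities}
    tie_breaker = count()  # keeps heap entries comparable when probabilities tie
    # Each heap entry is (probability, tie_breaker, {symbol: partial code word}).
    heap = [(p, next(tie_breaker), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)  # plays the role of the sorting in steps 101 and 102
    while len(heap) > 1:  # step 103: stop once everything is merged (probability 1)
        # Step 102: merge the two least probable entries.
        p_small, _, small = heapq.heappop(heap)
        p_large, _, large = heapq.heappop(heap)
        # Step 104: the more probable branch gets 0, the less probable branch gets 1.
        merged = {s: "0" + c for s, c in large.items()}
        merged.update({s: "1" + c for s, c in small.items()})
        heapq.heappush(heap, (p_small + p_large, next(tie_breaker), merged))
    # Step 105: each code word is the branch labels read from the first node down.
    return heap[0][2]


# Assumed example probabilities, not taken from the patent.
codes = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
for symbol, code_word in sorted(codes.items()):
    print(symbol, code_word)
```

The most probable symbol ends up with the shortest code word, as the background section describes; the exact bit patterns depend on how ties are broken.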

Figure 2 is a schematic diagram of the code word structure of Huffman coding, in which white circles represent intermediate nodes (Internal Node) and gray circles represent end points (Result Node). As can be seen, the code word length is variable. It follows from the coding flow above that the most probable source symbol receives the shortest code word and the least probable source symbol the longest, which shortens the total code length.

At present, such code words are usually decoded by the binary tree search method. Its basic principle is to start from the first node and read one bit of the code stream at a time, select a branch of the binary tree according to whether the bit is 0 or 1, and then determine from the node reached whether the required code word has been found or a further search step is needed; bits that have already been read can be discarded. In the worst case, the number of lookups in the binary tree search method equals the length of the longest Huffman code word. Once a code word has been found, a preset mapping table between code words and source symbols is consulted to obtain the source symbol corresponding to that code word.
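The bit-by-bit search described above might look like the following sketch. The nested-dictionary tree, the names `build_binary_tree` and `decode_bit_by_bit`, and the four-symbol code table are assumptions chosen for illustration only.

```python
END_POINT = object()  # sentinel marking the end of a complete code word


def build_binary_tree(code_table):
    """Build a binary search tree as nested dicts keyed by "0" and "1"."""
    root = {}
    for code_word in code_table:
        node = root
        for bit in code_word[:-1]:
            node = node.setdefault(bit, {})
        node[code_word[-1]] = END_POINT
    return root


def decode_bit_by_bit(bits, code_table):
    """Decode a string of "0"/"1" characters one bit at a time."""
    root = build_binary_tree(code_table)
    symbols, node, word = [], root, ""
    for bit in bits:
        node = node[bit]  # one branch decision per bit read
        word += bit
        if node is END_POINT:
            # The code word is complete; the symbol comes from a separate mapping table.
            symbols.append(code_table[word])
            node, word = root, ""
    return symbols


# Hypothetical code table, not taken from the patent.
EXAMPLE_TABLE = {"0": "a", "10": "b", "110": "c", "111": "d"}
print(decode_bit_by_bit("0101100111", EXAMPLE_TABLE))
```

Every bit costs one lookup, so the longest code word determines the worst-case number of steps.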

Viewed in terms of grouping, the binary search tree can be regarded as having a fixed group length of 1: during decoding the code stream is searched bit by bit, that is, one bit is analyzed at a time. Huffman decoding by binary tree search can achieve very high decoding efficiency, but every intermediate node and end point must be stored. As Figure 2 shows, each node corresponds to only one bit of a code word, yet storage of a certain length must be allocated for every node; the existing binary tree search method therefore consumes a relatively large amount of memory.

Summary of the invention

In view of this, the object of the present invention is to propose an optimized Huffman decoding method and device that greatly reduce the required memory space while leaving decoding efficiency essentially unaffected.

The Huffman decoding method comprises the following steps:

performing dynamic variable-length grouping on all code words in the Huffman code table, generating nodes from the variable-length code word fragments obtained by the grouping, and forming the nodes into a variable-length grouped Huffman search tree;

searching the code stream according to the variable-length grouped Huffman search tree to obtain the source symbols corresponding to the code words.

The Huffman decoding device comprises a code table information lookup module, a grouping module and a search module, wherein:

the code table information lookup module stores the information relating to the variable-length grouped Huffman search tree and sends the stored node information of that tree to the search module and the grouping module respectively;

the grouping module receives the Huffman code stream, groups the received code stream according to the next-level node group length supplied by the search module to obtain the code word fragment currently to be searched, and sends that fragment to the search module;

the search module receives the code word fragment from the grouping module and the node information of the variable-length grouped Huffman search tree from the code table information lookup module, searches for the fragment according to the node information to find the node corresponding to it, and in this way finally finds the source symbol corresponding to the complete Huffman code word.

As can be seen from the above technical solution, the present invention groups Huffman code words into variable-length fragments and can therefore implement Huffman decoding with less memory space.

Description of drawings

Fig. 1 is a flow chart of an implementation of Huffman coding;

Fig. 2 is a schematic diagram of the code word structure of Huffman coding;

Fig. 3 is a schematic diagram of the node storage structure in an embodiment of the invention;

Fig. 4 is a flow chart of decoding a code stream according to the variable-length grouped Huffman search tree in an embodiment of the invention;

Fig. 5 is an example of a variable-length grouped Huffman search tree;

Fig. 6 is an example of the binary tree corresponding to Fig. 5;

Fig. 7 is a schematic diagram of the device of an embodiment of the invention.

Embodiment

To make the purpose, technical solutions and advantages of the present invention clearer, the invention is further elaborated below with reference to the accompanying drawings.

In an embodiment of the invention, all code words in the Huffman code table are grouped into variable-length fragments, nodes are generated from the variable-length code word fragments obtained by the grouping, and the nodes are formed into a variable-length grouped Huffman search tree. The code stream produced by Huffman coding is then searched according to that tree.

Figure 3 shows how a node of the variable-length grouped Huffman search tree is stored in an embodiment of the invention. Suppose the storage length of a node is 16 bits. The node contains a one-bit end-point flag indicating whether the node is an end point; for example, a value of 1 means the node is an end point and a value of 0 means it is an intermediate node, although the opposite convention may equally be used. The next three bits give the number of bits in the code word fragments of the child nodes, called the group length (Segmentation Length) of the next-level nodes. The remaining 12 bits store either the first address of the next-level nodes or the offset of that first address; the offset may be relative to the address of the first node or relative to the first address of the parent level, and both cases are referred to below simply as the first address. The number of bits used to store the group length is not limited to the above value and can be adjusted as needed. Nor is the total storage length of a node limited to 16 bits; other values such as 24, 32 or 64 bits may be used. The content stored in an end point comprises the end-point flag, the code word fragment length of the end point, and the source symbol. The source symbol is therefore obtained directly when the search finishes, with no further lookup in a mapping table between code words and source symbols, which improves decoding efficiency.
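As an illustration of the 16-bit layout, the sketch below packs and unpacks one node word. The bit positions (flag in the most significant bit, then the 3-bit length, then the 12-bit payload), the names `pack_node` and `unpack_node`, and the choice of storing a symbol index in the payload of an end point are assumptions; Figure 3 may arrange the fields differently.

```python
FLAG_SHIFT = 15      # assumed position of the 1-bit end-point flag
LENGTH_SHIFT = 12    # assumed position of the 3-bit length field
LENGTH_MASK = 0x7    # 3 bits
PAYLOAD_MASK = 0xFFF  # 12 bits: first address (intermediate node) or symbol (end point)


def pack_node(is_end_point, length, payload):
    """Pack one node into a 16-bit integer."""
    if not 0 <= length <= LENGTH_MASK:
        raise ValueError("length does not fit in 3 bits")
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 12 bits")
    return (int(is_end_point) << FLAG_SHIFT) | (length << LENGTH_SHIFT) | payload


def unpack_node(word):
    """Return (is_end_point, length, payload) from a 16-bit integer."""
    is_end_point = bool((word >> FLAG_SHIFT) & 1)
    length = (word >> LENGTH_SHIFT) & LENGTH_MASK
    payload = word & PAYLOAD_MASK
    return is_end_point, length, payload


# Intermediate node: children use 3-bit fragments and start at (assumed) address 4.
intermediate = pack_node(False, 3, 4)
# End point: last fragment was 1 bit long, payload is an (assumed) symbol index 1.
end_point = pack_node(True, 1, 1)
print(hex(intermediate), hex(end_point))
```

With this layout an intermediate node and an end point share the same word size, so a whole sibling block can be indexed as a plain array.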

The embodiment of the invention comprises following two basic steps:

A. Perform variable-length grouping on the code words in the Huffman code table, generate nodes from the variable-length code word fragments obtained by the grouping, and form the nodes into a variable-length grouped Huffman search tree. Variable-length grouping can be done in many ways, so the resulting variable-length grouped Huffman search tree is not necessarily unique.

For example, one simple variable-length grouping method is as follows: the group length of the first-level nodes is the length of the shortest code word in the code table, and the group length of each intermediate level is 3. The last level of each branch depends on the number of remaining code word bits, so its length may be 1, 2 or 3. This constitutes one grouping method.
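The simple grouping rule just described can be sketched for a single code word as below. The function name `split_code_word` and the sample code words are assumptions for illustration. Note that the sketch splits each code word in isolation; in a real tree all nodes under the same parent must end up with the same group length.

```python
def split_code_word(code_word, first_length, middle_length=3):
    """Split one code word into variable-length fragments.

    first_length is the shortest code word length in the code table;
    every following fragment has middle_length bits except the last,
    which takes whatever bits remain (1 to middle_length).
    """
    fragments = [code_word[:first_length]]
    rest = code_word[first_length:]
    while rest:
        fragments.append(rest[:middle_length])
        rest = rest[middle_length:]
    return fragments


# Illustrative code words with a shortest code word length of 1.
for word in ["0", "1001", "11010", "101100"]:
    print(word, "->", split_code_word(word, first_length=1))
```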

The variable-length grouped Huffman search tree is then generated in memory, specifically by the following steps:

a. Allocate memory space for the nodes of each level. All nodes belonging to the same parent node occupy contiguous memory, and the size of that memory depends on the group length of the current nodes. For example, if the group length of the current nodes is 2, a 2-bit code word fragment has 4 possible values (00, 01, 10 and 11), so if one node occupies 16 bits, at least 4 16-bit storage locations must be allocated; if the group length of the current nodes is 8, then with 16-bit nodes at least 256 16-bit storage locations must be allocated.

b. Assign an end-point flag to each code word fragment obtained by grouping the code words. Determine whether the fragment is the last fragment of a complete code word; if so, its end-point flag is true, otherwise it is false. The two values can be represented by 0 and 1.

For a code word fragment whose end-point flag is false, the end-point flag, the next-level node group length and the first address corresponding to the node are saved in the node's memory space; for a code word fragment whose end-point flag is true, the end-point flag, the code word fragment length and the source symbol are saved in the node's memory space. If the node is an empty node, its memory space is still reserved but left empty.
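Steps a and b can be sketched as a recursive builder over a flat list that stands in for memory. Several things here are assumptions rather than the patent's own procedure: the tuple records `("mid", next_group_length, first_address)` and `("end", fragment_length, symbol)`, list indices used as addresses, the rule that a sibling block uses the shorter of 3 bits and the shortest remaining code word tail, and the small code table.

```python
def build_search_tree(code_table, max_group_length=3):
    """Return a flat node list; memory[0] is the first node."""
    memory = [None]  # slot 0 is reserved for the first node

    def allocate(prefix):
        # Remaining bits of every code word that starts with this prefix.
        tails = {c[len(prefix):]: s for c, s in code_table.items()
                 if c.startswith(prefix)}
        # Assumed rule: all siblings share one group length, capped at max_group_length.
        group_length = min(max_group_length, min(len(t) for t in tails))
        first_address = len(memory)
        # Step a: one contiguous block of 2**group_length slots for the siblings.
        memory.extend([None] * (1 << group_length))
        for value in range(1 << group_length):
            fragment = format(value, "0{}b".format(group_length))
            if fragment in tails:
                # Step b: last fragment of a complete code word, so an end point.
                memory[first_address + value] = ("end", group_length, tails[fragment])
            elif any(t.startswith(fragment) for t in tails):
                child_length, child_address = allocate(prefix + fragment)
                memory[first_address + value] = ("mid", child_length, child_address)
            # Otherwise the slot stays None: an empty node whose space is reserved.
        return group_length, first_address

    memory[0] = ("mid",) + allocate("")
    return memory


# Hypothetical incomplete code table (the code word 111 is unused).
EXAMPLE_TABLE = {"0": "a", "100": "b", "101": "c", "110": "d"}
memory = build_search_tree(EXAMPLE_TABLE)
for address, node in enumerate(memory):
    print(address, node)
```

In this example the first level uses 1-bit groups and the level below the fragment 1 uses 2-bit groups, so the unused fragment 11 becomes an empty node at the end of its sibling block.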

B. Search the code stream according to the variable-length grouped Huffman search tree to obtain the source symbols corresponding to the code words.

Figure 4 shows the flow by which an embodiment of the invention decodes a Huffman-coded code stream according to the variable-length grouped Huffman search tree. It comprises the following steps:

Step 401: Take the first node as the current node, and obtain the group length and first address of the first-level nodes from the content stored in the first node.

Step 402: Take from the input Huffman code stream a code word fragment whose length equals the group length of the current level, and locate the current node among the nodes of this level using the fragment and the first address of this level.

For example, if the code word fragment is "011", then the first node after the first address of this level corresponds to the fragment "001", the second node to "010", and the third node is the current node. Likewise, if the code word fragment is "1101", the 13th node stored after the first address of this level is the current node.

Step 403: Determine whether the current node is an end point; if so, go to step 405, otherwise go to step 404.

Step 404: Obtain the next-level node group length and node first address from the content stored in the current node, then go to step 402.

Step 405: Output the source symbol stored in the end point as the decoding result.

For a continuous Huffman code stream, the processing of steps 401 to 405 is repeated until all code words have been decoded.
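Steps 401 to 405 can be sketched as a loop over a flat node table. The table below is hand-written for a hypothetical code (0 for a, 100 for b, 101 for c, 110 for d); the tuple records, the use of list indices as addresses, and the name `decode` are assumptions made for the sketch.

```python
# Hypothetical flat node table; list indices play the role of addresses.
MEMORY = [
    ("mid", 1, 1),    # 0: first node, first level uses 1-bit groups at address 1
    ("end", 1, "a"),  # 1: fragment 0
    ("mid", 2, 3),    # 2: fragment 1, next level uses 2-bit groups at address 3
    ("end", 2, "b"),  # 3: fragment 00
    ("end", 2, "c"),  # 4: fragment 01
    ("end", 2, "d"),  # 5: fragment 10
    None,             # 6: fragment 11 (empty node)
]


def decode(bits, memory):
    """Decode a string of "0"/"1" characters using the grouped search tree."""
    symbols, position = [], 0
    while position < len(bits):
        _, group_length, first_address = memory[0]             # step 401
        while True:
            fragment = bits[position:position + group_length]  # step 402
            if len(fragment) < group_length:
                raise ValueError("code stream ends inside a code word")
            position += group_length
            # The fragment value is the offset from the first address of this level.
            node = memory[first_address + int(fragment, 2)]
            if node is None:
                raise ValueError("empty node reached: invalid code stream")
            kind, length, payload = node
            if kind == "end":                                  # step 403
                symbols.append(payload)                        # step 405
                break
            group_length, first_address = length, payload      # step 404
    return symbols


print(decode("0100101110", MEMORY))
```

The node lookup is a single indexed access per fragment, which is why a "011" fragment lands on the third node after the first address of its level.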

The decoding process of the invention is further illustrated below with a concrete example. Suppose a set of source symbols is Huffman-coded, giving the Huffman code words shown in Table 1:

Code word	Length	Source symbol
0	1	a1
1000	4	a2
1001	4	a3
1010	4	a4
101100	6	a5
101101	6	a6
101111	6	a7
1100	4	a8
11010	5	a9
11011	5	a10
1110	4	a11
1111	4	a12

Table 1

Variable-length grouping is applied to the code words in Table 1: the group length of the first-level nodes is 1, the length of the shortest code word in the code table; the group length of the single intermediate level is 3; and the last level of each branch has a length of 1 or 2, depending on the number of remaining code word bits. One resulting variable-length grouped Huffman search tree is shown in Figure 5, in which the diamond represents the first node, white ovals represent intermediate nodes, gray ovals represent end points, and the black oval represents an empty node.

Memory space is then allocated for the nodes of each level, with the following storage structure:

Node 501 is the first node, so its end-point flag is set to 0. Its child nodes 502 and 503 have a fragment length of 1, so the next-level group length is set to 1, and the first address of the child nodes is the memory address ADD502 of node 502. The storage structure of node 501 is therefore:

0	001	ADD502

Node 502 is an end point, so its end-point flag is set to 1. Its code word fragment length is 1 and its Huffman decoding content is a1, so the storage structure of node 502 is:

1	001	a1

Node 503 is an intermediate node. Its next-level group length is 3 and the first address of its child nodes is ADD504, the address of node 504, so the storage structure of node 503 is:

0	011	ADD504

……

Node 516 is an end point. Its code word fragment length is 1 and its Huffman decoding content is a9, that is:

1	001	a9

……

Suppose the binary code stream produced by Huffman coding is 10011110110110010110010100...... Decoding follows the variable-length grouped Huffman search tree of Figure 5. First, according to the first node, the next-level group length is 1 and the first address of the child nodes is ADD502; the first bit of the code stream is 1, so 1 is added to this first address and node 503 is found. Next, according to node 503, the next-level group length is 3 and the first address of the child nodes is ADD504; bits 2 to 4 of the code stream are 001, so 1 is added to ADD504 and node 505 is found. This yields the first code word, 1001, whose Huffman decoding content is a3. Repeating the same search for the following code words decomposes the code stream into: 1,001; 1,110; 1,101,1; 0; 0; 1,011,00; 1,010; 0; .... Here a comma marks a group boundary and a semicolon marks a code word boundary. The decoded Huffman content is therefore a3a11a10a1a1a5a4a1......
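The worked example can be replayed with the sketch below, which lays the 17 nodes of Figure 5 out in a flat list and prints the same decomposition. The layout is partly assumed: the text only states the contents of nodes 501, 502, 503, 505 and 516, so placing node 5xx at list index 5xx − 501 (giving ADD502 = 1, ADD504 = 3) and the assignment of the remaining symbols to nodes are inferred from Table 1 for illustration.

```python
# Assumed layout: node number minus 501 is the list index ("address").
FIG5 = [
    ("mid", 1, 1),     # 501 first node, next level at ADD502
    ("end", 1, "a1"),  # 502 fragment 0
    ("mid", 3, 3),     # 503 fragment 1, next level at ADD504
    ("end", 3, "a2"),  # 504 fragment 000
    ("end", 3, "a3"),  # 505 fragment 001
    ("end", 3, "a4"),  # 506 fragment 010
    ("mid", 2, 11),    # 507 fragment 011, next level at node 512
    ("end", 3, "a8"),  # 508 fragment 100
    ("mid", 1, 15),    # 509 fragment 101, next level at node 516
    ("end", 3, "a11"), # 510 fragment 110
    ("end", 3, "a12"), # 511 fragment 111
    ("end", 2, "a5"),  # 512 fragment 00
    ("end", 2, "a6"),  # 513 fragment 01
    None,              # 514 fragment 10 (empty node, code word 101110)
    ("end", 2, "a7"),  # 515 fragment 11
    ("end", 1, "a9"),  # 516 fragment 0
    ("end", 1, "a10"), # 517 fragment 1
]


def trace(bits, memory):
    """Return a list of (comma-separated fragments, symbol) per code word."""
    position, words = 0, []
    while position < len(bits):
        _, group_length, first_address = memory[0]
        fragments = []
        while True:
            fragment = bits[position:position + group_length]
            if len(fragment) < group_length:
                raise ValueError("code stream ends inside a code word")
            position += group_length
            fragments.append(fragment)
            node = memory[first_address + int(fragment, 2)]
            if node is None:
                raise ValueError("empty node reached: invalid code stream")
            kind, length, payload = node
            if kind == "end":
                words.append((",".join(fragments), payload))
                break
            group_length, first_address = length, payload
    return words


STREAM = "10011110110110010110010100"
words = trace(STREAM, FIG5)
print("; ".join(fragments for fragments, _ in words))
print("".join(symbol for _, symbol in words))
```

Under these assumptions the two printed lines reproduce the decomposition 1,001; 1,110; 1,101,1; 0; 0; 1,011,00; 1,010; 0 and the symbol sequence a3a11a10a1a1a5a4a1 given in the text.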

As can be seen from Figure 5, nodes at different levels may have different group lengths, and nodes at the same level may also have different group lengths. For example, node 512 and node 517 belong to the same level, but the group length of node 512 is 2 bits whereas that of node 517 is 1 bit. However, nodes at the same level that share the same parent node, for example nodes 512 to 515, all have the same group length. The next-level first address stored in their parent node 507 is the address of node 512, and nodes 512 to 515 are stored contiguously, so any of nodes 512 to 515 can be located from this first address.

The code word 101110 is empty; that is, it does not exist in the code table. Reaching it indicates that a bit error may have occurred in the code stream or that an error has occurred in the decoding process.

If the traditional binary tree search is applied to the above code stream, the node structure is as shown in Figure 6. The tree in Figure 5 has 17 nodes in total (including the first node and the empty node), whereas the tree in Figure 6 has 23 nodes (not counting the first node). If every node requires a storage unit of the same size in both methods, the method of Figure 5 needs 6 fewer storage units than that of Figure 6; in this embodiment the variable-length grouped Huffman decoding method thus saves 6/23 ≈ 26.1% of the memory space.

As for the number of comparisons, the binary tree has 6 levels of nodes, whereas the variable-length grouped Huffman search tree has 3. For the code word 1001, for example, the variable-length grouped search needs only 2 decisions to decode successfully, whereas the binary search tree needs 4. Although the scheme of the invention must additionally obtain the grouping information, it decodes efficiently because the result and the information needed at intermediate steps are stored together in the nodes.
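The storage and lookup figures quoted above can be checked with a short sketch over the code words of Table 1. The helper names and the grouping rule (each sibling block uses the shorter of 3 bits and the shortest remaining tail) are assumptions; for Table 1 this rule yields the 1, 3, then 1-or-2 grouping described for Figure 5.

```python
TABLE_1 = ["0", "1000", "1001", "1010", "101100", "101101",
           "101111", "1100", "11010", "11011", "1110", "1111"]


def binary_tree_nodes(code_words):
    """Nodes of the binary search tree, not counting the first node."""
    return len({w[:i] for w in code_words for i in range(1, len(w) + 1)})


def group_length_at(code_words, prefix, max_group_length=3):
    """Assumed rule: shortest remaining tail under this prefix, capped at 3 bits."""
    tails = [w[len(prefix):] for w in code_words if w.startswith(prefix)]
    return min(max_group_length, min(len(t) for t in tails))


def grouped_tree_nodes(code_words, prefix=""):
    """Slots allocated below this prefix (end points, intermediate and empty nodes)."""
    length = group_length_at(code_words, prefix)
    total = 1 << length
    children = {w[:len(prefix) + length] for w in code_words
                if w.startswith(prefix) and len(w) > len(prefix) + length}
    for child in children:
        total += grouped_tree_nodes(code_words, child)
    return total


def grouped_lookups(code_words, word):
    """Number of node lookups needed to decode one code word from the table."""
    prefix, steps = "", 0
    while prefix != word:
        prefix = word[:len(prefix) + group_length_at(code_words, prefix)]
        steps += 1
    return steps


binary = binary_tree_nodes(TABLE_1)        # first node not counted
grouped = grouped_tree_nodes(TABLE_1) + 1  # plus the first node
saving = (binary - grouped) / binary
print(binary, grouped, round(100 * saving, 1))
print(len("1001"), grouped_lookups(TABLE_1, "1001"))
print(max(len(w) for w in TABLE_1), max(grouped_lookups(TABLE_1, w) for w in TABLE_1))
```

The binary tree needs one lookup per bit, so its per-word cost is simply the code word length, while the grouped tree needs one lookup per fragment.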

Figure 7 shows the variable-length grouped Huffman decoding device of an embodiment of the invention, which comprises a grouping module 701, a code table information lookup module 702 and a search module 703, wherein:

The code table information lookup module 702 stores the information relating to the variable-length grouped Huffman search tree, including the information of each node of the tree. If a node is an intermediate node, its node information comprises the end-point flag, the next-level group length and the next-level node first address information; if a node is an end point, its node information comprises the end-point flag, the code word fragment length and the Huffman decoding content. The first address information may be the physical address of the first of the next-level nodes, or an offset of that physical address. The code table information lookup module 702 may further comprise a generation unit for generating the variable-length grouped Huffman search tree, that is, for allocating the memory space of each node of the tree according to the group length of the nodes of each level and storing the information of each node in the corresponding memory space.

The code table information lookup module 702 also comprises an end-point judging unit for determining whether the current node is an end point or an intermediate node. If it is an intermediate node, the generation unit sets the end-point flag of the intermediate node to false and stores the next-level node group length and first address corresponding to the node in the memory space of the intermediate node; if it is an end point, the generation unit sets the end-point flag of the end point to true and stores the code word fragment length of the end point and the source symbol corresponding to the end point in the memory space of the end point.

The code table information lookup module 702 sends the information relating to the variable-length grouped Huffman search tree to the grouping module 701 and the search module 703 respectively.

The grouping module 701 receives the Huffman code stream and groups it according to the next-level node group length supplied by the search module 703 to obtain the code word fragment currently to be searched. On the first pass, it instead groups the Huffman code stream according to the group length indicated by the first node of the variable-length grouped Huffman search tree, supplied by the code table information lookup module 702. The resulting code word fragment is sent to the search module 703.

The search module 703 receives the code word fragment from the grouping module 701 and the node information of the variable-length grouped Huffman search tree from the code table information lookup module 702, and searches for the code word fragment according to the physical address or physical address offset in the node information to find the node corresponding to the fragment.

The search module 703 further comprises a judging unit and an output unit. The judging unit determines from the end-point flag of the node found whether the node is an end point; if so, it notifies the output unit to extract the source symbol stored in the node and output it. The output unit outputs the source symbol.

If the judging unit determines that the node is not an end point, the search module 703 sends the next-level node group length of the node to the grouping module 701.

The above is only a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. An optimized Huffman decoding method, characterized in that the method comprises the following steps:

2. The method according to claim 1, characterized in that taking the variable-length code word fragments obtained by grouping as nodes and forming the nodes into a variable-length grouped Huffman search tree comprises:

allocating memory space for the nodes of each level;

assigning an end-point flag to each variable-length code word fragment;

for a code word fragment whose end-point flag is false, saving the end-point flag and the next-level node group length and next-level node first address corresponding to the node in the memory space of the node; for a code word fragment whose end-point flag is true, saving the end-point flag, the length of the code word fragment and the source symbol in the memory space of the node.

3. The method according to claim 2, characterized in that allocating memory space for the nodes of each level comprises: allocating contiguous memory space for all nodes belonging to the same parent node.

4. The method according to claim 2, characterized in that allocating memory space for the nodes of each level comprises: determining the size of the memory space of the nodes of a level according to the group length corresponding to that level.

5. The method according to claim 2, characterized in that the first address is:

the start address of the memory space used to store the next-level nodes; or

the offset of that start address relative to the start address of the memory space used to store the parent-level nodes; or

the offset of that start address relative to the first address of the memory space used to store the first node.

6. The method according to any one of claims 1 to 5, characterized in that searching the code stream according to the variable-length grouped Huffman search tree comprises:

taking the first node as the current node, and obtaining the group length and first address of the first-level nodes from the content stored in the first node;

taking from the input Huffman code stream a code word fragment whose length equals the group length of the current level, and locating the current node among the nodes of this level using the fragment and the first address of this level;

determining whether the current node is an end point; if not, obtaining the next-level node group length and node first address from the content stored in the current node, and then returning to the step of taking from the input Huffman code stream a code word fragment whose length equals the group length of the current level; if so, outputting the source symbol stored in the current node as the search result.

7. An optimized Huffman decoding device, characterized in that the device comprises a code table information lookup module, a grouping module and a search module, wherein:

8. The Huffman decoding device according to claim 7, characterized in that the code table information lookup module further comprises a generation unit for allocating the memory space of each node of the variable-length grouped Huffman search tree and storing the information of each node in the corresponding memory space.

9. The Huffman decoding device according to claim 8, characterized in that the code table information lookup module further comprises an end-point judging unit for determining whether the current node is an end point or an intermediate node; if it is an intermediate node, the generation unit sets the end-point flag of the intermediate node to false and stores the next-level node group length and first address corresponding to the node in the memory space of the intermediate node;

if it is an end point, the generation unit sets the end-point flag of the end point to true and stores the code word fragment length of the end point and the source symbol corresponding to the end point in the memory space of the end point.

10. The Huffman decoding device according to claim 7, characterized in that the code table information lookup module is further used to send the group length of the first node of the variable-length grouped Huffman search tree to the grouping module;

the grouping module is then used to group the code words according to the group length from the code table information lookup module to obtain the code word fragment currently to be searched, and to send the resulting code word fragment to the search module.

11. The Huffman decoding device according to any one of claims 7 to 10, characterized in that the search module further comprises a judging unit and an output unit; the judging unit is used to determine whether the node found is an end point and, if so, to notify the output unit to extract the source symbol stored in the node and output it; the output unit is used to output the source symbol.