CN114095036B - Code length generating device for dynamic Huffman coding - Google Patents

Code length generating device for dynamic Huffman coding

Info

Publication number
CN114095036B
CN114095036B
Authority
CN
China
Prior art date
Legal status
Active
Application number
CN202210055822.4A
Other languages
Chinese (zh)
Other versions
CN114095036A (en)
Inventor
刘宇豪
张永兴
王振
马孔明
赵璠
Current Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202210055822.4A priority Critical patent/CN114095036B/en
Publication of CN114095036A publication Critical patent/CN114095036A/en
Application granted granted Critical
Publication of CN114095036B publication Critical patent/CN114095036B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a code length generating device for dynamic Huffman coding, comprising: a ping-pong storage module configured to store sorted data; a code length generation module configured to receive the data stored in the ping-pong storage module and to calculate a code length matrix and a codeword matrix based on that data and a plurality of comparators; and a code length recovery module configured to reallocate the code lengths corresponding to the characters in the data according to the frequency matrix, based on the code length matrix and the codeword matrix calculated by the code length generation module and the frequency matrix of the data. With the scheme of the invention, 2/3 of the tree-building time can be saved and a code length can be established for one codeword in each clock cycle, which increases the throughput of the encoder, shortens the link processing delay, and thereby improves the coding efficiency of the data sequence.

Description

Code length generating device for dynamic Huffman coding
Technical Field
The present invention relates to the field of computers, and more particularly, to a code length generating apparatus for dynamic Huffman coding.
Background
With the rapid development of technologies such as 5G, the Internet of Things, cloud computing, big data and artificial intelligence, high-speed and secure data storage services face new challenges. Among these technologies, cloud computing plays a role comparable to the human brain: it provides large-capacity data storage and efficient computing while concentrating computing and storage resources. At the same time, however, the volume of generated data grows exponentially and puts enormous pressure on existing storage equipment. Efficient and secure data compression has therefore become an effective way to reduce storage cost and save storage resources. Huffman coding in the Deflate format is a combined form of LZ77 coding and Huffman coding. The data is first encoded by LZ77 and then exists in three forms: Literal, Length and Distance. Before Huffman coding, the Literal and the Length are treated as one class of information and converted into Literal_Length codewords by looking up the Literal_Length code table, while the Distance is treated as a separate class and converted into Distance codewords by looking up the Distance code table. The two classes of codewords are then Huffman-coded through Huffman code table 1 and Huffman code table 2, respectively, to obtain the CL1 sequence and the CL2 sequence.
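To make the two-stream structure concrete, the following is a minimal software sketch (not part of the patent): it assumes a simple tuple format for the LZ77 tokens and merely splits them into the Literal_Length class and the Distance class; the resulting frequency counts are what the two Huffman code tables would be built from.

```python
# Illustrative sketch only: split LZ77 output into the two symbol classes that
# Deflate encodes with separate Huffman code tables. The token format and the
# function name are assumptions made for this example.
from collections import Counter

def split_lz77_tokens(tokens):
    """tokens: iterable of ('lit', byte) or ('match', length, distance) tuples."""
    literal_length_stream = []   # symbols of the Literal_Length class
    distance_stream = []         # symbols of the Distance class
    for tok in tokens:
        if tok[0] == 'lit':
            literal_length_stream.append(('lit', tok[1]))
        else:
            _, length, distance = tok
            literal_length_stream.append(('len', length))  # would map to a Literal_Length codeword
            distance_stream.append(('dist', distance))      # would map to a Distance codeword
    # each frequency table feeds its own Huffman code-length construction (CL1 / CL2)
    return Counter(literal_length_stream), Counter(distance_stream)

cl1_freqs, cl2_freqs = split_lz77_tokens(
    [('lit', 97), ('lit', 98), ('match', 3, 2), ('lit', 99)])
```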
Huffman coding is a lossless data compression code invented by David A. Huffman in 1952. Owing to its high efficiency, it has been widely adopted in computing, data encryption, communications and other fields. Huffman coding is generally implemented either with a static lookup table or dynamically. The former simplifies the Huffman tree-growing process but yields a lower compression ratio and less flexibility than the dynamic implementation; the latter achieves a higher compression ratio, but its core step, growing the Huffman tree, is a relatively complex and frequent operation for a general-purpose CPU. Software tree building therefore struggles to meet the demands of fast, real-time Huffman coding.
Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a code length generating apparatus for dynamic Huffman coding which saves 2/3 of the tree-building time and establishes a code length for one codeword in each clock cycle, thereby increasing the throughput of the encoder, shortening the link processing delay, and improving the coding efficiency of the data sequence.
In view of the above object, an aspect of the embodiments of the present invention provides a code length generating apparatus for dynamic Huffman coding, including:
a ping-pong storage module configured to store the sorted data;
a code length generation module configured to receive the data stored in the ping-pong storage module and to calculate a code length matrix and a codeword matrix based on that data and a plurality of comparators;
and a code length recovery module configured to reallocate, based on the code length matrix and the codeword matrix calculated by the code length generation module and the frequency matrix of the data, the code lengths corresponding to the characters in the data according to the frequency matrix.
According to an embodiment of the present invention, the apparatus further comprises:
a control module connected respectively to the ping-pong storage module, the code length generation module and the code length recovery module, and configured to send an enable signal to the code length generation module after receiving a completion signal from the ping-pong storage module, and to send an enable signal to the code length recovery module after receiving a completion signal from the code length generation module.
According to one embodiment of the invention, the ping-pong memory module comprises an odd memory and an even memory, wherein the odd memory stores the data at odd positions of the sorted data and the even memory stores the data at even positions.
According to one embodiment of the present invention, the code length generation module includes:
a memory group comprising a first memory, a second memory, a third memory, a fourth memory, a fifth memory and a sixth memory, wherein the first memory receives the data stored in the odd memory of the ping-pong storage module and the second memory receives the data stored in the even memory of the ping-pong storage module;
a comparator group comprising 15 comparators, each comparator configured to take as inputs one datum from each of two different memories of the memory group, compare them, and output a comparison result within one clock cycle, the two data input to each comparator coming from different source memories;
and a selector configured to receive the comparison results output by the comparator group and to generate the code length matrix and the codeword matrix based on the comparison results.
According to an embodiment of the present invention, the comparator group is configured such that the first comparator compares one datum in the first memory with one datum in the second memory and outputs the smaller; the second comparator compares one datum in the first memory with one datum in the third memory and outputs the smaller; the third comparator compares one datum in the first memory with one datum in the fourth memory and outputs the smaller; the fourth comparator compares one datum in the first memory with one datum in the fifth memory and outputs the smaller; the fifth comparator compares one datum in the first memory with one datum in the sixth memory and outputs the smaller; the sixth comparator compares one datum in the second memory with one datum in the third memory and outputs the smaller; the seventh comparator compares one datum in the second memory with one datum in the fourth memory and outputs the smaller; the eighth comparator compares one datum in the second memory with one datum in the fifth memory and outputs the smaller; the ninth comparator compares one datum in the second memory with one datum in the sixth memory and outputs the smaller; the tenth comparator compares one datum in the third memory with one datum in the fourth memory and outputs the smaller; the eleventh comparator compares one datum in the third memory with one datum in the fifth memory and outputs the smaller; the twelfth comparator compares one datum in the third memory with one datum in the sixth memory and outputs the smaller; the thirteenth comparator compares one datum in the fourth memory with one datum in the fifth memory and outputs the smaller; the fourteenth comparator compares one datum in the fourth memory with one datum in the sixth memory and outputs the smaller; and the fifteenth comparator compares one datum in the fifth memory with one datum in the sixth memory and outputs the smaller.
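The pairing above amounts to one comparator per unordered pair of the six memories, i.e. C(6,2) = 15 pairs. As a hedged software analogue of one comparison cycle (the memory labels, function name and data representation are assumptions made for illustration), the comparator group can be modelled as follows.

```python
# Software model of one cycle of the comparator group: the 15 comparators are
# the 15 unordered pairs of the six memories, and a null (empty) memory is
# treated as holding the maximum value. An illustrative sketch, not the circuit.
from itertools import combinations

MEMORIES = ["mem1", "mem2", "mem3", "mem4", "mem5", "mem6"]  # first .. sixth memory

def comparator_outputs(heads):
    """heads: dict mapping memory name -> current head value, or None for null."""
    outputs = []
    for a, b in combinations(MEMORIES, 2):       # exactly 15 pairs
        va = heads[a] if heads[a] is not None else float("inf")
        vb = heads[b] if heads[b] is not None else float("inf")
        outputs.append(((a, b), min(va, vb)))    # each comparator keeps the smaller datum
    return outputs
```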
According to one embodiment of the invention, the code length recovery module uses the formula W(q(i)) <= M(i) to reallocate the code lengths corresponding to the characters according to the frequency matrix, wherein M is the code length matrix, q is the frequency matrix, W is the code length matrix in the final adjusted order, and i is the codeword index.
According to one embodiment of the invention, the sorted data is data sorted from small to large according to the occurrence frequency of each character in the data to be compressed.
According to one embodiment of the invention, the comparator group is configured to read one datum from each memory of the memory group in one clock cycle for comparison, sum the two smallest data, and store the sum into the memory group according to a preset rule.
According to an embodiment of the present invention, the preset rule is that if either of the two smallest data comes from the first memory or the second memory, the summed datum is stored in the third memory or the fourth memory; if the two smallest data come from the third memory and the fourth memory, the summed datum is stored in the fifth memory or the sixth memory; and if either of the two smallest data comes from the fifth memory or the sixth memory, the summed datum is stored in the fifth memory or the sixth memory.
According to one embodiment of the present invention, the initial values in the third memory, the fourth memory, the fifth memory and the sixth memory are null, and when the comparator group performs a data comparison, a memory whose value is detected to be null is treated as holding the maximum value.
The invention has the following beneficial technical effects: by providing a ping-pong storage module configured to store the sorted data, a code length generation module configured to receive the data stored in the ping-pong storage module and to calculate a code length matrix and a codeword matrix based on that data and a plurality of comparators, and a code length recovery module configured to reallocate the code lengths corresponding to the characters in the data according to the frequency matrix based on the code length matrix and codeword matrix calculated by the code length generation module and the frequency matrix of the data, the code length generating device for dynamic Huffman coding provided by the embodiments of the invention saves 2/3 of the tree-building time and establishes the code length of one codeword in each clock cycle, which increases the throughput of the encoder, shortens the link processing delay, and thereby improves the coding efficiency of the data sequence.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other embodiments from these drawings without creative effort.
Fig. 1 is a schematic diagram of a code length generating apparatus for dynamic Huffman coding according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a ping-pong memory module, according to one embodiment of the invention;
fig. 3 is a schematic diagram of a code length generation module according to an embodiment of the present invention.
Detailed Description
Embodiments of the present disclosure are described below. However, it is to be understood that the disclosed embodiments are merely examples and that other embodiments may take various and alternative forms. The figures are not necessarily to scale; certain features may be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the present invention. As one of ordinary skill in the art will appreciate, various features illustrated and described with reference to any one of the figures may be combined with features illustrated in one or more other figures to produce embodiments that are not explicitly illustrated or described. The combination of features shown provides a representative embodiment for a typical application. However, various combinations and modifications of the features consistent with the teachings of the present disclosure may be desirable for certain specific applications or implementations.
In view of the above-mentioned objects, a first aspect of the embodiments of the present invention proposes an embodiment of a code length generation apparatus for dynamic Huffman coding. Fig. 1 shows a schematic diagram of the device.
As shown in fig. 1, the apparatus may include:
a ping-pong storage module configured to store the sorted data. The sorted data are the characters of the data to be compressed sorted from small to large by occurrence frequency. As shown in fig. 2, the ping-pong storage module mainly buffers the sorted data. To accelerate data transfer, the module interface transfers data as an array; to reduce hardware circuit resources, the ping-pong storage module moves the array-format data into random access memories (RAM), which makes later read operations easier and reduces system delay. The module operates on the following principle: the sorted data are written alternately into an odd RAM (odd memory) and an even RAM (even memory) according to their odd or even position, so each datum stored in the odd RAM is smaller than or equal to the corresponding datum in the even RAM and the number of data in the odd RAM is equal to or greater than the number in the even RAM, and a MUX (selector) controlled by a state machine determines whether a datum is written into the odd RAM or the even RAM.
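As a simple software analogue of this odd/even split (an assumed illustration, not the hardware interface), the alternating write can be expressed as two slices of the sorted array:

```python
# Assumed software analogue of the ping-pong storage module: frequencies sorted
# ascending are written alternately into the odd and even RAMs.
def ping_pong_split(sorted_freqs):
    odd_ram = sorted_freqs[0::2]    # 1st, 3rd, 5th, ... (odd positions)
    even_ram = sorted_freqs[1::2]   # 2nd, 4th, 6th, ... (even positions)
    return odd_ram, even_ram

# With the frequencies of the embodiment below, 1,1,2,3,5,5,11,11:
odd_ram, even_ram = ping_pong_split([1, 1, 2, 3, 5, 5, 11, 11])
# odd_ram  -> [1, 2, 5, 11]
# even_ram -> [1, 3, 5, 11]
```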
The device further comprises a code length generation module configured to receive the data stored in the ping-pong storage module and to calculate a code length matrix and a codeword matrix based on that data and a plurality of comparators. As shown in fig. 3, the code length generation module comprises a memory group consisting of a LEAF odd RAM (first memory), a LEAF even RAM (second memory), a NODE odd RAM (third memory), a NODE even RAM (fourth memory), an ST odd RAM (fifth memory) and an ST even RAM (sixth memory), wherein the LEAF odd RAM receives the data stored in the odd RAM of the ping-pong storage module and the LEAF even RAM receives the data stored in the even RAM of the ping-pong storage module; a comparator group comprising 15 comparators, each comparator configured to take one datum from each of two different RAMs of the memory group, compare them, and output a comparison result within one clock cycle, the two data input to each comparator coming from different source RAMs; and a selector MUX configured to receive the comparison results output by the comparator group and to generate the code length matrix and the codeword matrix based on them. In each clock cycle the comparator group reads one datum from each RAM of the memory group for comparison, sums the two smallest data, and stores the sum into the memory group according to a preset rule: if either of the two smallest data comes from the LEAF odd RAM or the LEAF even RAM, the sum is stored in the NODE odd RAM or the NODE even RAM; if the two smallest data come from the NODE odd RAM and the NODE even RAM, the sum is stored in the ST odd RAM or the ST even RAM; and if either of the two smallest data comes from the ST odd RAM or the ST even RAM, the sum is stored in the ST odd RAM or the ST even RAM. The initial values in the NODE odd RAM, the NODE even RAM, the ST odd RAM and the ST even RAM are null, and when the comparator group performs a comparison, a RAM whose value is detected to be null is treated as holding the maximum value. Whenever a datum in the NODE odd RAM or the NODE even RAM is combined with a datum in the ST odd RAM or the ST even RAM, the code lengths of all leaf nodes under the two combined nodes are increased by 1; and when a datum in the LEAF odd RAM or the LEAF even RAM is combined with any datum in the NODE odd RAM, the NODE even RAM, the ST odd RAM or the ST even RAM, the code length of the leaf node taken from the LEAF odd RAM or the LEAF even RAM becomes 1 and the code lengths of all leaf nodes under the node taken from the NODE odd RAM, the NODE even RAM, the ST odd RAM or the ST even RAM are increased by 1.
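The merge loop just described can be cross-checked with a behavioral software model (an assumed sketch, not the RTL: the odd/even banking of each level is collapsed into a single FIFO per level, and an explicit leaf set per node replaces the code-length increment logic; all names are illustrative):

```python
# Assumed behavioral model of the code length generation module (not the hardware).
# LEAF / NODE / ST are FIFOs; in the patent each level is split into odd/even RAM
# banks so that two heads can be read in a single clock cycle.
from collections import deque

def generate_code_lengths(sorted_freqs):
    """sorted_freqs: frequencies sorted ascending; returns the leaf code lengths."""
    n = len(sorted_freqs)
    depths = [0] * n
    # each queue entry is (node value, set of leaf indices covered by the node)
    queues = {
        "LEAF": deque((f, {i}) for i, f in enumerate(sorted_freqs)),
        "NODE": deque(),
        "ST":   deque(),
    }

    def head(name):                      # an empty (null) RAM acts as the maximum value
        q = queues[name]
        return q[0][0] if q else float("inf")

    def pop_min():                       # selector output: the globally smallest head
        name = min(queues, key=head)
        value, leaves = queues[name].popleft()
        return name, value, leaves

    for _ in range(n - 1):               # one merge per clock cycle
        src_a, val_a, leaves_a = pop_min()
        src_b, val_b, leaves_b = pop_min()
        for i in leaves_a | leaves_b:    # every leaf under the merged node grows by 1 bit
            depths[i] += 1
        merged = (val_a + val_b, leaves_a | leaves_b)
        # preset rule: a merge involving a LEAF datum goes to NODE, otherwise to ST
        if "LEAF" in (src_a, src_b):
            queues["NODE"].append(merged)
        else:
            queues["ST"].append(merged)
    return depths
```

Run on the frequencies of the embodiment below, generate_code_lengths([1, 1, 2, 3, 5, 5, 11, 11]) returns [5, 5, 4, 3, 3, 3, 2, 2], matching the final code length matrix of the worked example.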
The code length recovery module is configured to reallocate, based on the code length matrix and the codeword matrix calculated by the code length generation module and the frequency matrix of the data, the code lengths corresponding to the characters in the data according to the frequency matrix. The code length matrix output by the code length generation module is out of order: it is stored in the order of the comparator outputs of each cycle. The code length recovery module therefore reallocates the code lengths of the characters according to the code length matrix, the codeword matrix and the frequency matrix, so that each character obtains its true code length. The recovery uses the relation W(q(i)) <= M(i), where M is the code length matrix, q is the frequency matrix, W is the code length matrix in the final adjusted order, and i is the codeword index.
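One plausible reading of the relation W(q(i)) <= M(i), treated here as assigning M(i) to the character position selected by q(i), is sketched below; the function and variable names are assumptions made for this illustration:

```python
# Sketch of one reading of the recovery step: M holds the code lengths in the
# ascending-frequency order produced by the generation module, q(i) is the original
# position of the character with the i-th smallest frequency, and W is the recovered
# per-character code length table.
def recover_code_lengths(freqs, M):
    # q: permutation mapping sorted (ascending-frequency) order back to character positions
    q = sorted(range(len(freqs)), key=lambda k: freqs[k])
    W = [0] * len(freqs)
    for i, length in enumerate(M):
        W[q[i]] = length                 # W(q(i)) = M(i)
    return W

# Embodiment (characters A..H): freqs = [1, 2, 5, 1, 3, 11, 5, 11] and
# M = [5, 5, 4, 3, 3, 3, 2, 2] give W = [5, 4, 3, 5, 3, 2, 3, 2],
# i.e. A->5, B->4, C->3, D->5, E->3, F->2, G->3, H->2.
```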
The technical scheme of the invention optimizes the algorithmic implementation of Huffman tree generation in Huffman coding and designs a dynamic Huffman code length generation hardware circuit suitable for the Deflate format. The whole dynamic Huffman coding hardware circuit is divided into several small functional circuits, so the overall circuit has good scalability and strong performance and fully exploits the advantages of implementing a software algorithm in hardware. Compared with a traditional hardware implementation of the Huffman tree growing process, the invention uses parallel comparator groups to compare in real time, so that one node of the Huffman tree can be output in each clock cycle. For a Huffman tree with 50 leaves, the invention needs only 50 clock cycles in total, whereas the tree-building process of a traditional hardware circuit consumes 150 clock cycles, which greatly improves the efficiency of growing the Huffman tree in hardware.
With the technical scheme of the invention, 2/3 of the tree-building time can be saved and a code length can be established for one codeword in each clock cycle, which increases the throughput of the encoder, shortens the link processing delay, and improves the coding efficiency of the data sequence.
In a preferred embodiment of the present invention, the apparatus further comprises:
a control module connected respectively to the ping-pong storage module, the code length generation module and the code length recovery module, and configured to send an enable signal to the code length generation module after receiving a completion signal from the ping-pong storage module, and to send an enable signal to the code length recovery module after receiving a completion signal from the code length generation module.
In a preferred embodiment of the present invention, the ping-pong memory module includes an odd RAM and an even RAM, the odd RAM stores data at odd-numbered positions of the sorted data, and the even RAM stores data at even-numbered positions of the sorted data. That is, of the sorted data, data at the first, third, fifth, and so on odd-numbered positions is stored in the odd RAM, and data at the second, fourth, sixth, and so on even-numbered positions is stored in the even RAM.
In a preferred embodiment of the present invention, the code length generating module includes:
the memory group comprising a LEAF odd RAM, a LEAF even RAM, a NODE odd RAM, a NODE even RAM, an ST odd RAM and an ST even RAM, wherein the LEAF odd RAM receives the data stored in the odd RAM of the ping-pong storage module and the LEAF even RAM receives the data stored in the even RAM of the ping-pong storage module;
the comparator group comprising 15 comparators, each comparator configured to take as inputs one datum from each of two different RAMs of the memory group, compare them, and output a comparison result within one clock cycle, the two data input to each comparator coming from different source RAMs;
and the selector configured to receive the comparison results output by the comparator group and to generate the code length matrix and the codeword matrix based on the comparison results.
In a preferred embodiment of the present invention, the comparator group is configured such that the first comparator compares one datum in the LEAF odd RAM with one datum in the LEAF even RAM and outputs the smaller; the second comparator compares one datum in the LEAF odd RAM with one datum in the NODE odd RAM and outputs the smaller; the third comparator compares one datum in the LEAF odd RAM with one datum in the NODE even RAM and outputs the smaller; the fourth comparator compares one datum in the LEAF odd RAM with one datum in the ST odd RAM and outputs the smaller; the fifth comparator compares one datum in the LEAF odd RAM with one datum in the ST even RAM and outputs the smaller; the sixth comparator compares one datum in the LEAF even RAM with one datum in the NODE odd RAM and outputs the smaller; the seventh comparator compares one datum in the LEAF even RAM with one datum in the NODE even RAM and outputs the smaller; the eighth comparator compares one datum in the LEAF even RAM with one datum in the ST odd RAM and outputs the smaller; the ninth comparator compares one datum in the LEAF even RAM with one datum in the ST even RAM and outputs the smaller; the tenth comparator compares one datum in the NODE odd RAM with one datum in the NODE even RAM and outputs the smaller; the eleventh comparator compares one datum in the NODE odd RAM with one datum in the ST odd RAM and outputs the smaller; the twelfth comparator compares one datum in the NODE odd RAM with one datum in the ST even RAM and outputs the smaller; the thirteenth comparator compares one datum in the NODE even RAM with one datum in the ST odd RAM and outputs the smaller; the fourteenth comparator compares one datum in the NODE even RAM with one datum in the ST even RAM and outputs the smaller; and the fifteenth comparator compares one datum in the ST odd RAM with one datum in the ST even RAM and outputs the smaller.
In a preferred embodiment of the present invention, the code length recovery module uses the formula W(q(i)) <= M(i) to reallocate the code lengths corresponding to the characters according to the frequency matrix, wherein M is the code length matrix, q is the frequency matrix, W is the code length matrix in the final adjusted order, and i is the codeword index.
In a preferred embodiment of the present invention, the sorted data is data sorted from small to large according to the number of occurrences of each character in the data to be compressed.
In a preferred embodiment of the present invention, the comparator group is configured to read one datum from each RAM of the memory group in one clock cycle for comparison, sum the two smallest data, and store the sum into the memory group according to the preset rule.
In a preferred embodiment of the present invention, the preset rule is that if either of the two smallest data comes from the LEAF odd RAM or the LEAF even RAM, the summed datum is stored in the NODE odd RAM or the NODE even RAM; if the two smallest data come from the NODE odd RAM and the NODE even RAM, the summed datum is stored in the ST odd RAM or the ST even RAM; and if either of the two smallest data comes from the ST odd RAM or the ST even RAM, the summed datum is stored in the ST odd RAM or the ST even RAM.
In a preferred embodiment of the present invention, the initial values in the NODE odd RAM, the NODE even RAM, the ST odd RAM and the ST even RAM are null, and when the comparator group performs a data comparison, a RAM whose value is detected to be null is treated as holding the maximum value.
Examples
The characters and the frequency of occurrence of the characters in the data to be compressed are shown in table 1 below:
TABLE 1 Characters and frequencies
Character   A   B   C   D   E   F   G   H
Frequency   1   2   5   1   3   11  5   11
Thus, the data sorted by frequency are 1,1,2,3,5,5,11,11. They are stored in the ping-pong storage module such that 1,2,5,11 are written into the odd RAM and 1,3,5,11 into the even RAM, and the data are then transferred to the code length generation module: the data 1,2,5,11 of the odd RAM are stored in the LEAF odd RAM, the data 1,3,5,11 of the even RAM are stored in the LEAF even RAM, and the other 4 RAMs of the memory group are null. In the first clock cycle the data of the 6 RAMs are compared by the comparators of the comparator group. Specifically, the first comparator compares the 1 in the LEAF odd RAM with the 1 in the LEAF even RAM and outputs 1; the second to fifth comparators compare the 1 in the LEAF odd RAM with the maximum number in the NODE odd RAM, the NODE even RAM, the ST odd RAM and the ST even RAM, respectively (these RAMs are null and are therefore treated as the maximum value), and each outputs 1; the sixth to ninth comparators compare the 1 in the LEAF even RAM with the maximum number in the NODE odd RAM, the NODE even RAM, the ST odd RAM and the ST even RAM, respectively, and each outputs 1; and the tenth to fifteenth comparators compare the maximum numbers of the NODE odd RAM, the NODE even RAM, the ST odd RAM and the ST even RAM with one another and each outputs the maximum number. The output of each comparator is sent to the selector MUX, which selects the two smallest data, here 1 and 1. Since these come from the LEAF odd RAM and the LEAF even RAM respectively, according to the preset rule 1 and 1 are summed and the result is stored in the NODE odd RAM. At this point the codeword matrix is (1, 1) and the code length matrix is (1, 1).
In the second clock cycle, the LEAF odd RAM holds 2, the LEAF even RAM holds 3, the NODE odd RAM holds 2, and the other 3 RAMs are null (i.e. treated as the maximum). For brevity the data are not traced through each individual comparator; the comparison proceeds as described above. The two smallest of the 6 data are 2 and 2, coming from the LEAF odd RAM and the NODE odd RAM respectively, so according to the preset rule 2 and 2 are summed and stored in the NODE even RAM. The codeword matrix is now (2, 2, 2) and the code length matrix is (2, 2, 1).
In the third clock cycle, the LEAF odd RAM holds 5, the LEAF even RAM holds 3, the NODE odd RAM is null, the NODE even RAM holds 4, and the other 2 RAMs are null. The two smallest of the 6 data are 3 and 4, coming from the LEAF even RAM and the NODE even RAM respectively, so according to the preset rule 3 and 4 are summed and stored in the NODE odd RAM. The codeword matrix is now (3, 3, 3, 3) and the code length matrix is (3, 3, 2, 1).
In the fourth clock cycle, the LEAF odd RAM holds 5, the LEAF even RAM holds 5, the NODE odd RAM holds 7, and the other 3 RAMs are null. The two smallest of the 6 data are 5 and 5, coming from the LEAF odd RAM and the LEAF even RAM respectively, so according to the preset rule they are summed and stored in the NODE even RAM. The codeword matrix is now (3, 3, 3, 3, 4, 4) and the code length matrix is (3, 3, 2, 1, 1, 1).
In the fifth clock cycle, the LEAF odd RAM holds 11, the LEAF even RAM holds 11, the NODE odd RAM holds 7, the NODE even RAM holds 10, and the other 2 RAMs are null. The two smallest of the 6 data are 7 and 10, coming from the NODE odd RAM and the NODE even RAM respectively, so according to the preset rule they are summed and stored in the ST odd RAM. The codeword matrix is now (5, 5, 5, 5, 5, 5) and the code length matrix is (4, 4, 3, 2, 2, 2).
In the sixth clock cycle, the LEAF odd RAM holds 11, the LEAF even RAM holds 11, the NODE odd RAM is null, the NODE even RAM is null, the ST odd RAM holds 17, and the ST even RAM is null. The two smallest of the 6 data are 11 and 11, coming from the LEAF odd RAM and the LEAF even RAM respectively, so according to the preset rule they are summed and stored in the NODE odd RAM. The codeword matrix is now (5, 5, 5, 5, 5, 5, 6, 6) and the code length matrix is (4, 4, 3, 2, 2, 2, 1, 1).
In the seventh clock cycle, the LEAF odd RAM is null, the LEAF even RAM is null, the NODE odd RAM holds 22, the NODE even RAM is null, the ST odd RAM holds 17, and the ST even RAM is null. The two smallest of the 6 data are 22 and 17, coming from the NODE odd RAM and the ST odd RAM respectively, so according to the preset rule they are summed and stored in the ST even RAM. The codeword matrix is now (7, 7, 7, 7, 7, 7, 7, 7) and the code length matrix is (5, 5, 4, 3, 3, 3, 2, 2). The final codeword matrix is therefore (7, 7, 7, 7, 7, 7, 7, 7) and the final code length matrix is (5, 5, 4, 3, 3, 3, 2, 2).
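As an independent cross-check of this worked example (not part of the patent), an ordinary heap-based Huffman construction over the same frequencies reproduces the code lengths obtained above:

```python
# Cross-check with a conventional heap-based Huffman construction (not the
# patent's circuit); the input frequencies are those of Table 1 sorted ascending.
import heapq
from itertools import count

def huffman_lengths(freqs):
    tie = count()                                    # tie-breaker keeps heap entries comparable
    heap = [(f, next(tie), [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    depths = [0] * len(freqs)
    while len(heap) > 1:
        freq_a, _, leaves_a = heapq.heappop(heap)
        freq_b, _, leaves_b = heapq.heappop(heap)
        for i in leaves_a + leaves_b:                # leaves under the merged node grow by 1 bit
            depths[i] += 1
        heapq.heappush(heap, (freq_a + freq_b, next(tie), leaves_a + leaves_b))
    return depths

print(huffman_lengths([1, 1, 2, 3, 5, 5, 11, 11]))   # -> [5, 5, 4, 3, 3, 3, 2, 2]
```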
With the technical scheme of the invention, 2/3 of the tree-building time can be saved and a code length can be established for one codeword in each clock cycle, which increases the throughput of the encoder, shortens the link processing delay, and improves the coding efficiency of the data sequence.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.
The embodiments described above, particularly any "preferred" embodiments, are possible examples of implementations and are presented merely to clearly understand the principles of the invention. Many variations and modifications may be made to the above-described embodiments without departing from the spirit and principles of the technology described herein. All such modifications are intended to be included within the scope of this disclosure and protected by the following claims.

Claims (6)

1. A code length generating apparatus for dynamic huffman coding, comprising:
a ping-pong storage module configured to store the sorted data;
a code length generation module configured to receive the data stored in the ping-pong storage module and to calculate a code length matrix and a codeword matrix based on the data stored in the ping-pong storage module and a plurality of comparators, the code length generation module comprising:
a memory group comprising a first memory, a second memory, a third memory, a fourth memory, a fifth memory and a sixth memory, wherein the first memory receives the data stored in an odd memory of the ping-pong storage module and the second memory receives the data stored in an even memory of the ping-pong storage module;
a comparator group comprising 15 comparators, each comparator configured to take as inputs one datum from each of two different memories of the memory group, compare them, and output a comparison result within one clock cycle, the two data input to each comparator coming from different source memories, the comparator group being configured to read one datum from each memory of the memory group in one clock cycle for comparison, sum the two smallest data, and store the sum into the memory group according to a preset rule, wherein the preset rule is that if either of the two smallest data comes from the first memory or the second memory, the summed datum is stored in the third memory or the fourth memory; if the two smallest data come from the third memory and the fourth memory, the summed datum is stored in the fifth memory or the sixth memory; and if either of the two smallest data comes from the fifth memory or the sixth memory, the summed datum is stored in the fifth memory or the sixth memory;
a selector configured to receive the comparison results output by the comparator group and to generate the code length matrix and the codeword matrix based on the comparison results;
a code length recovery module configured to reallocate, based on the code length matrix and the codeword matrix calculated by the code length generation module and the frequency matrix of the data, the code lengths corresponding to the characters in the data according to the frequency matrix, the code length recovery module using the formula W(q(i)) <= M(i) to reallocate the code lengths corresponding to the characters according to the frequency matrix, wherein M is the code length matrix, q is the frequency matrix, W is the code length matrix in the final adjusted order, and i is the codeword index.
2. The apparatus of claim 1, further comprising:
the control module is respectively connected to the ping-pong storage module, the code length generation module and the code length recovery module, and is configured to send an enable signal to the code length generation module after receiving a completion signal of the ping-pong storage module, and send an enable signal to the code length recovery module after receiving the completion signal of the code length generation module.
3. The apparatus of claim 1, wherein the ping-pong storage module comprises an odd memory and an even memory, the odd memory storing therein data at odd numbered locations of the sorted data, the even memory storing therein data at even numbered locations.
4. The apparatus of claim 1, wherein the comparator group is configured such that the first comparator compares one datum in the first memory with one datum in the second memory and outputs the smaller; the second comparator compares one datum in the first memory with one datum in the third memory and outputs the smaller; the third comparator compares one datum in the first memory with one datum in the fourth memory and outputs the smaller; the fourth comparator compares one datum in the first memory with one datum in the fifth memory and outputs the smaller; the fifth comparator compares one datum in the first memory with one datum in the sixth memory and outputs the smaller; the sixth comparator compares one datum in the second memory with one datum in the third memory and outputs the smaller; the seventh comparator compares one datum in the second memory with one datum in the fourth memory and outputs the smaller; the eighth comparator compares one datum in the second memory with one datum in the fifth memory and outputs the smaller; the ninth comparator compares one datum in the second memory with one datum in the sixth memory and outputs the smaller; the tenth comparator compares one datum in the third memory with one datum in the fourth memory and outputs the smaller; the eleventh comparator compares one datum in the third memory with one datum in the fifth memory and outputs the smaller; the twelfth comparator compares one datum in the third memory with one datum in the sixth memory and outputs the smaller; the thirteenth comparator compares one datum in the fourth memory with one datum in the fifth memory and outputs the smaller; the fourteenth comparator compares one datum in the fourth memory with one datum in the sixth memory and outputs the smaller; and the fifteenth comparator compares one datum in the fifth memory with one datum in the sixth memory and outputs the smaller.
5. The apparatus of claim 1, wherein the sorted data is sorted from small to large according to the number of occurrences of each character in the data to be compressed.
6. The apparatus of claim 1, wherein the initial values in the third memory, the fourth memory, the fifth memory and the sixth memory are null, and wherein, when the comparator group performs a data comparison, a memory whose value is detected to be null is treated as holding the maximum value.
CN202210055822.4A 2022-01-18 2022-01-18 Code length generating device for dynamic Huffman coding Active CN114095036B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210055822.4A CN114095036B (en) 2022-01-18 2022-01-18 Code length generating device for dynamic Huffman coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210055822.4A CN114095036B (en) 2022-01-18 2022-01-18 Code length generating device for dynamic Huffman coding

Publications (2)

Publication Number Publication Date
CN114095036A CN114095036A (en) 2022-02-25
CN114095036B true CN114095036B (en) 2022-04-22

Family

ID=80308755

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210055822.4A Active CN114095036B (en) 2022-01-18 2022-01-18 Code length generating device for dynamic Huffman coding

Country Status (1)

Country Link
CN (1) CN114095036B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101998122A (en) * 2010-12-13 2011-03-30 山东大学 Method and device for decoding normal form Hoffman hardware in JPEG (Joint Photographic Expert Group) image
CN107565974A (en) * 2017-08-14 2018-01-09 同济大学 A kind of parallel full coding implementation method of static Huffman
CN112332854A (en) * 2020-11-27 2021-02-05 平安普惠企业管理有限公司 Hardware implementation method and device of Huffman coding and storage medium
CN113839678A (en) * 2021-08-31 2021-12-24 山东云海国创云计算装备产业创新中心有限公司 Huffman decoding system, method, equipment and computer readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060146944A1 (en) * 2005-01-05 2006-07-06 Integrated Programmable Communications, Inc. System and method of processing frequency-diversity signals with reduced-sampling-rate receiver
KR102098247B1 (en) * 2013-11-25 2020-04-08 삼성전자 주식회사 Method and apparatus for encoding and decoding data in memory system


Also Published As

Publication number Publication date
CN114095036A (en) 2022-02-25

Similar Documents

Publication Publication Date Title
US9454552B2 (en) Entropy coding and decoding using polar codes
CN110943744B (en) Data compression, decompression and processing method and device based on data compression and decompression
CN111884660B (en) Huffman coding equipment
WO2022148304A1 (en) Sorting network-based dynamic huffman coding method, apparatus and device
CN111510156A (en) Method for dynamically compressing and decompressing large file based on segmentation
CN111211787A (en) Industrial data compression method, system, storage medium and terminal
CN114268323B (en) Data compression coding method, device and time sequence database supporting line memory
Zhang et al. CEAZ: accelerating parallel I/O via hardware-algorithm co-designed adaptive lossy compression
CN114095036B (en) Code length generating device for dynamic Huffman coding
Mantoro et al. The performance of text file compression using Shannon-Fano and Huffman on small mobile devices
CN110401451B (en) Automaton space compression method and system based on character set transformation
CN107623524B (en) Hardware-based Huffman coding method and system
CN113381769B (en) Decoder based on FPGA
CN114884618A (en) GPU-based 5G multi-user LDPC (Low Density parity check) code high-speed decoder and decoding method thereof
CN110569487B (en) Base64 expansion coding method and system based on high-frequency character substitution algorithm
KR20220100030A (en) Pattern-Based Cache Block Compression
CN113726342B (en) Segmented difference compression and inert decompression method for large-scale graph iterative computation
Howard et al. Parallel lossless image compression using Huffman and arithmetic coding
CN114640357B (en) Data encoding method, apparatus and storage medium
CN112073069B (en) Test vector lossless compression method suitable for integrated circuit test
CN112200301B (en) Convolution computing device and method
US11914443B2 (en) Power-aware transmission of quantum control signals
CN111213146A (en) Pseudo data generating device, method thereof, and program
Kim et al. Dual pattern compression using data-preprocessing for large-scale gpu architectures
Balevic et al. Using arithmetic coding for reduction of resulting simulation data size on massively parallel GPGPUs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant