CN113839679B

CN113839679B - Huffman decoding system, method, equipment and computer readable storage medium

Info

Publication number: CN113839679B
Application number: CN202111011902.1A
Authority: CN
Inventors: 苏建龙; 马恒
Original assignee: Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Current assignee: Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date: 2021-08-31
Filing date: 2021-08-31
Publication date: 2023-09-15
Anticipated expiration: 2041-08-31
Also published as: CN113839679A

Abstract

The application discloses a Huffman decoding system comprising a decoding control unit, a comparator group array, a content addressable memory, a vector fusion unit and multi-path decoding engine calculation units. Huffman decoding of Deflate-encoded data is realized by the decoding control unit, the comparator group array, the content addressable memory, the vector fusion unit and the decoding engine calculation units. On this basis, because multiple decoding engine calculation units are provided, the units can decode in parallel, so that the decoding throughput can be greatly improved. The application also discloses a Huffman decoding method, a Huffman decoding device and a computer readable storage medium, which have the same technical effects.

Description

Huffman decoding system, method, equipment and computer readable storage medium

Technical Field

The application relates to the technical field of decoding, in particular to a Huffman decoding system, and also to a Huffman decoding method, a Huffman decoding device and a computer readable storage medium.

Background

Huffman coding in the Deflate format combines LZ77 coding with Huffman coding. The data is first LZ77 encoded, after which it exists in three forms: literal, length and distance. Before Huffman coding, the literal and the length are treated as one type of information and mapped to literal_length codewords by looking up a literal_length code table, while the distance is treated as a separate type of information and mapped to distance codewords by looking up a distance code table. The two types of codewords are then Huffman coded through Huffman code table 1 and Huffman code table 2, respectively, yielding the code length sequences CL1 and CL2. Because the maximum depth of a Huffman tree in the Deflate format is 15, the values in CL1 and CL2 range from 0 to 15. Within one compression, many repeated code lengths and many code lengths of 0 may occur. To handle this case, the CL1 sequence and the CL2 sequence are each run-length coded, with the three additional codewords 16, 17 and 18 used to express the runs. The SQ1 sequence and the SQ2 sequence obtained after run-length coding are Huffman coded again through Huffman code table 3 to obtain the CCL sequence. Decoding of the Deflate format is the inverse of this encoding process.
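As a reading aid, the run-length codewords 16, 17 and 18 mentioned above can be expanded as in the following minimal Python sketch. The repeat counts follow the Deflate specification (RFC 1951): 16 copies the previous code length 3 to 6 times, 17 repeats a zero length 3 to 10 times, and 18 repeats a zero length 11 to 138 times. The function name expand_code_lengths and the (symbol, extra) input representation are assumptions made for illustration and are not part of the application; reading the extra bits from the bit stream is omitted.

```python
def expand_code_lengths(tokens):
    """Expand run-length coded code lengths (hypothetical helper).

    tokens: list of (symbol, extra) pairs, where symbol is 0..18 and extra is
    the already-extracted value of the extra bits (0 for symbols 0..15).
    """
    out = []
    for symbol, extra in tokens:
        if symbol <= 15:
            out.append(symbol)                    # plain code length
        elif symbol == 16:
            if not out:
                raise ValueError("code 16 has no previous length to copy")
            out.extend([out[-1]] * (3 + extra))   # copy previous length 3..6 times
        elif symbol == 17:
            out.extend([0] * (3 + extra))         # 3..10 zero lengths
        elif symbol == 18:
            out.extend([0] * (11 + extra))        # 11..138 zero lengths
        else:
            raise ValueError(f"invalid symbol {symbol}")
    return out


# One length 8, four more copies of it, three zeros, then a length 5.
example = expand_code_lengths([(8, 0), (16, 1), (17, 0), (5, 0)])
```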

At present, most Huffman decoding schemes for the Deflate format use a single decoding engine. With a single engine, computation and data transfer proceed strictly in series: the next batch of data can be accepted for decoding only after computation on the current data has completely finished. This leaves many computing cycles idle and lowers the throughput of the code stream input. How to improve decoding throughput has therefore become a technical problem to be solved by those skilled in the art.

Disclosure of Invention

The application aims to provide a Huffman decoding system that can greatly improve decoding throughput while still effectively performing Huffman decoding of Deflate-encoded data. Another object of the present application is to provide a Huffman decoding method, a Huffman decoding device and a computer readable storage medium, all of which have the same technical effects.

In order to solve the above technical problems, the present application provides a Huffman decoding system, including:

the decoding control unit, the comparator group array, the content addressing memory, the vector fusion unit and the multipath decoding engine calculation unit;

the decoding control unit is used for receiving the data frame to be decoded, splicing the data frame to be decoded to obtain a spliced data frame, and transmitting the spliced data frame to the comparator group array and the decoding engine computing unit; the decoding control unit issues different spliced data frames to each decoding engine computing unit;

The comparator group array is used for determining the code length and the offset value corresponding to each data segment in the spliced data frame; wherein the code length comprises a first code length and a second code length; the offset value comprises a first offset value and a second offset value;

the content addressing memory is configured to obtain a first vector according to the first code length and the first offset value; obtaining a second vector according to the second code length and the second offset value;

the vector fusion unit is used for fusing the first vector and the second vector to obtain a fusion vector;

and each decoding engine computing unit is used for decoding the received spliced data frame according to the fusion vector and the valid position information output by the previous-stage decoding engine computing unit, and outputting a decoding result.

Optionally, the decoding engine calculating unit is specifically configured to:

calculating a length vector according to the fusion vector;

calculating a position vector according to the length vector and the valid position information output by the previous-stage decoding engine calculation unit;

and calculating a result vector according to the position vector, and outputting a decoding result according to the result vector after the previous-stage decoding engine calculation unit has output its decoding result.

Optionally, the decoding control unit is specifically configured to:

splicing two adjacent data frames to be decoded to obtain a first spliced data frame, and issuing the first spliced data frame to the decoding engine computing unit;

and splicing the data frame to be decoded with data of the overlapping region width taken from the adjacent next data frame to be decoded to obtain a second spliced data frame, and issuing the second spliced data frame to the comparator group array.

Optionally, the decoding control unit is further configured to enlarge the overlapping region width.

Optionally, the vector fusion unit is specifically configured to:

when the type of the current bit of the first vector is length and valid, if the next position pointed by the code length of the current bit is larger than a boundary value, marking the current bit of the fusion vector as valid and incomplete, wherein the code length of the current bit of the fusion vector is the code length of the current bit of the first vector;

when the type of the current bit of the first vector is length and valid, if the next position pointed to by the code length of the current bit is smaller than or equal to the boundary value, marking the current bit of the fusion vector as valid and complete, wherein the code length of the current bit of the fusion vector is the sum of the code length of the current bit of the first vector and the code length of the bit of the second vector pointed to by the code length of the current bit;

when the current bit of the first vector is invalid, the current bit of the fusion vector is marked as invalid;

when the type of the current bit of the first vector is literal, the current bit of the fusion vector is marked as valid and complete.

Optionally, the content addressable memory includes:

a first content addressing memory, configured to obtain the first vector according to the first code length and the first offset value;

and the second content addressing memory is used for obtaining the second vector according to the second code length and the second offset value.

In order to solve the technical problem, the application also provides a Huffman decoding method, which comprises the following steps:

receiving a data frame to be decoded through a decoding control unit, splicing the data frame to be decoded to obtain a spliced data frame, and transmitting the spliced data frame to a comparator group array and a decoding engine computing unit; the decoding control unit issues different spliced data frames to each decoding engine computing unit;

determining code length and offset value corresponding to each data segment in the spliced data frame through the comparator group array; wherein the code length comprises a first code length and a second code length; the offset value comprises a first offset value and a second offset value;

Obtaining a first vector through a content addressable memory according to the first code length and the first offset value; obtaining a second vector according to the second code length and the second offset value;

fusing the first vector and the second vector through a vector fusing unit to obtain a fused vector;

and decoding, by each decoding engine computing unit, the received spliced data frame according to the fusion vector and the valid position information output by the previous-stage decoding engine computing unit, and outputting a decoding result.

Optionally, decoding, by the decoding engine computing unit, the received spliced data frame according to the fusion vector and the valid position information output by the decoding engine computing unit at the previous stage, and outputting a decoding result includes:

calculating, by the decoding engine calculation unit, an L vector according to the fusion vector; calculating a P vector according to the L vector and the valid position information output by the previous-stage decoding engine calculation unit; and calculating an R vector according to the P vector, and outputting a decoding result according to the R vector after the previous-stage decoding engine calculation unit has output its decoding result.

In order to solve the technical problem, the present application further provides a Huffman decoding device, including:

a memory for storing a computer program;

and a processor for implementing the steps of the Huffman decoding method as described above when executing the computer program.

To solve the above technical problem, the present application also provides a computer readable storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the Huffman decoding method as described above.

The Huffman decoding system provided by the application comprises: the decoding control unit, the comparator group array, the content addressing memory, the vector fusion unit and the multipath decoding engine calculation unit; the decoding control unit is used for receiving the data frame to be decoded, splicing the data frame to be decoded to obtain a spliced data frame, and transmitting the spliced data frame to the comparator group array and the decoding engine computing unit; the decoding control unit issues different spliced data frames to each decoding engine computing unit; the comparator group array is used for determining the code length and the offset value corresponding to each data segment in the spliced data frame; wherein the code length comprises a first code length and a second code length; the offset value comprises a first offset value and a second offset value; the content addressing memory is configured to obtain a first vector according to the first code length and the first offset value; obtaining a second vector according to the second code length and the second offset value; the vector fusion unit is used for fusing the first vector and the second vector to obtain a fusion vector; and each decoding engine computing unit is used for decoding the received spliced data frame according to the fusion vector and the effective position information output by the upper stage decoding engine computing unit and outputting a decoding result.

Therefore, the Huffman decoding system provided by the application obtains two code lengths and offset values through the comparator group array, obtains two vectors through the content addressing memory, fuses the two vectors through the vector fusion unit, and finally decodes the data frame through the decoding engine calculation unit, thereby realizing Huffman decoding aiming at the deflate coding. On the basis, the decoding engine computing unit is provided with multiple paths, and each path of decoding engine computing unit can decode different data frames in parallel, so that the decoding throughput can be greatly improved.

The Huffman decoding method, the Huffman decoding equipment and the computer readable storage medium provided by the application have the technical effects.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required in the prior art and the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic diagram of a Huffman decoding system according to an embodiment of the present application;

Fig. 2 is a schematic diagram of data splicing according to an embodiment of the present application;

Fig. 3 is a schematic diagram of a comparator according to an embodiment of the present application;

Fig. 4 is a schematic diagram of vector fusion according to an embodiment of the present application;

Fig. 5 is a schematic diagram of a specific vector fusion according to an embodiment of the present application;

Fig. 6 is a schematic diagram of each calculation stage of a decoding engine calculation unit according to an embodiment of the present application;

Fig. 7 is a schematic diagram of decoding calculation performed by a decoding engine calculation unit according to an embodiment of the present application;

Fig. 8 is a schematic flow chart of a Huffman decoding method according to an embodiment of the present application.

Detailed Description

The core of the application is to provide a Huffman decoding system that can greatly improve decoding throughput while still effectively performing Huffman decoding of Deflate-encoded data. Another core of the present application is to provide a Huffman decoding method, a Huffman decoding device and a computer readable storage medium, all of which have the same technical effects.

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

Referring to fig. 1, which is a schematic diagram of a Huffman decoding system according to an embodiment of the present application, the system mainly includes: a decoding control unit 10, a comparator group array 20, a content addressable memory 30, a vector fusion unit 40, and multi-path decoding engine calculation units 50.

The decoding control unit 10 is responsible for receiving the data frames to be decoded, and splicing the data frames to be decoded to obtain spliced data frames. The aim of splicing the data frames to be decoded is to ensure that the codewords of a group of data frames can be decoded in their entirety. After the decoding control unit 10 splices to obtain a spliced data frame, the spliced data frame is sent to the comparator group array 20. In addition, in the handshake process between the decoding control unit 10 and the decoding engine computing unit 50, once a certain decoding engine computing unit 50 requests decoding data, the decoding control unit 10 issues a corresponding spliced data frame to the decoding engine computing unit 50. The splice data frames issued by the decoding control unit 10 to each decoding engine calculation unit 50 are different.

In one specific embodiment, the decoding control unit 10 is specifically configured to: splice two adjacent data frames to be decoded to obtain a first spliced data frame, and issue the first spliced data frame to the decoding engine computing unit 50; and splice the data frame to be decoded with data of the overlapping region width taken from the adjacent next data frame to be decoded to obtain a second spliced data frame, and issue the second spliced data frame to the comparator group array 20.

For example, assuming that one frame of data to be decoded includes 64 bits, the decoding control unit 10 splices two adjacent frames of data to be decoded to obtain a first spliced data frame with 128 bits, and then the decoding control unit 10 issues the first spliced data frame with 128 bits to the corresponding decoding engine computing unit 50.

For example, referring to fig. 2, the decoding control unit 10 splices the data frames to be decoded D0 and D1 and issues the result to the decoding engine computing unit 500, splices D1 and D2 and issues the result to the decoding engine computing unit 501, splices D2 and D3 and issues the result to the decoding engine computing unit 502, and splices D3 and D4 and issues the result to the decoding engine computing unit 503.

For the decoding engine calculation unit 501, the decoding engine calculation unit 50 of the upper stage of the decoding engine calculation unit 501 is the decoding engine calculation unit 500; for the decoding engine calculating unit 502, the decoding engine calculating unit 50 of the upper stage of the decoding engine calculating unit 502 is the decoding engine calculating unit 501, and so on, for the decoding engine calculating unit 503, the decoding engine calculating unit 50 of the upper stage of the decoding engine calculating unit 503 is the decoding engine calculating unit 502.

In addition, assuming that the bit width of a frame of data to be decoded is 64 bits and the overlapping area width is 16 bits, the decoding control unit 10 performs splicing on the data frame to be decoded and 16 bits of data in the adjacent next frame of data frame to be decoded to obtain a second spliced data frame with 80 bits, and then transfers the second spliced data frame with 80 bits to the comparator bank array 20.
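The two splicing modes described above can be sketched in Python as follows. Frames are modelled as strings of '0'/'1' characters in stream order; the function names and this representation are assumptions for illustration, and only the 64-bit frame width and the 16-bit overlap come from the example in the text.

```python
FRAME_BITS = 64      # frame width from the example in the text
OVERLAP_BITS = 16    # overlap region width from the example in the text


def first_spliced_frame(cur, nxt):
    """Frame issued to a decoding engine: two adjacent frames, 128 bits."""
    assert len(cur) == FRAME_BITS and len(nxt) == FRAME_BITS
    return cur + nxt


def second_spliced_frame(cur, nxt, overlap=OVERLAP_BITS):
    """Frame issued to the comparator group array: the current frame plus
    the first `overlap` bits of the next frame (80 bits by default)."""
    assert len(cur) == FRAME_BITS and len(nxt) == FRAME_BITS
    return cur + nxt[:overlap]


d0 = "01" * 32      # arbitrary 64-bit test pattern
d1 = "0011" * 16    # arbitrary 64-bit test pattern
engine_frame = first_spliced_frame(d0, d1)
comparator_frame = second_spliced_frame(d0, d1)
```

Enlarging the overlap region, as the optional feature of the decoding control unit allows, corresponds to passing a larger `overlap` value in this sketch.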

The comparator bank array 20 is responsible for determining the code length and offset value corresponding to each data segment in the spliced data frame it receives. Specifically, the comparator array 20 divides the received spliced data frame, and determines the code length and offset value corresponding to each data segment by comparing with the base code. The code length comprises a first code length and a second code length; the offset values include a first offset value and a second offset value.

In the compression of the deflate format, the maximum value of the code length after data compression is 15 bits, so that the data stream input into the comparator array 20 can be grouped bit by bit with each group of 15 bits, and compared with the base code respectively to obtain the code length and the offset value corresponding to each group of data segments.

Two Huffman code tables exist in the compression process of the Deflate format: a literal-length code table and a distance code table. During decoding, neither the position of a codeword in the input data stream nor the code table used to compress it is known in advance. Each group therefore has to be compared with both the literal-length base codes and the distance base codes to obtain the literal-length code length and its offset value, and the distance code length and its offset value. The first code length refers to the literal-length code length, and the second code length refers to the distance code length. The first offset value is the offset value corresponding to the literal-length code length, and the second offset value is the offset value corresponding to the distance code length.

Grouping bit by bit with each group of 15 bits can result in the following data segments:

data segment 1 containing bits 0 to 14; a data segment 2 containing bits 1 to 15; a data segment 3 containing bits 2 to 16; … …, and so on.

Each 15-bit data segment is further divided into a plurality of segments that all start at the starting position of the data segment and grow by one bit at a time. Taking bit0 as the starting position of the data segment, the segments are: segment 1: bit0; segment 2: bit0+bit1; segment 3: bit0+bit1+bit2; segment 4: bit0+bit1+bit2+bit3; and so on, up to segment 15: bit0+bit1+bit2+bit3+ … +bit14.
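The grouping and segmentation just described can be illustrated with a short Python sketch. The helper names are hypothetical, and the bits are modelled as a string in stream order.

```python
MAX_CODE_LEN = 15  # maximum Huffman code length in the Deflate format


def data_segments(bits, width=MAX_CODE_LEN):
    """Yield (start, segment) for every window of `width` bits, sliding by one bit."""
    for start in range(len(bits) - width + 1):
        yield start, bits[start:start + width]


def prefixes(segment):
    """Return the prefixes of length 1, 2, ..., len(segment) of one data segment."""
    return [segment[:n] for n in range(1, len(segment) + 1)]


bits = "1011001110001111" + "0" * 4   # 20 arbitrary bits
segments = list(data_segments(bits))
first_prefixes = prefixes(segments[0][1])
```

In hardware all windows and all prefixes would be compared at the same time; the loops here only enumerate what is compared.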

Each segment is compared with the literal-length base code whose code length equals the length of the segment, and with the distance base code whose code length equals the length of the segment. When a segment matches a literal-length base code, the literal-length code length of the data segment equals the code length of that literal-length base code. Similarly, when a segment matches a distance base code, the distance code length of the data segment equals the code length of that distance base code.

For example, for segment 1 to segment 15 above, bit0 is compared with the literal-length base code and the distance base code of code length 1; bit0+bit1 is compared with the literal-length base code and the distance base code of code length 2; bit0+bit1+bit2 is compared with the literal-length base code and the distance base code of code length 3; and so on, until bit0+bit1+ … +bit14 is compared with the literal-length base code and the distance base code of code length 15. If bit0+bit1+ … +bit6 matches the literal-length base code of code length 7, the corresponding literal-length code length is 7, and the 7 bits from bit0 to bit6 together represent one codeword.

A segment matches a base code in the following sense: if the segment contains a valid codeword and the segment is greater than the base code of the same length, it is then judged whether the next segment is smaller than the next base code. The next segment is one bit longer than the segment, and the next base code has the same length as the next segment. If the next segment is smaller than the next base code, the length of the segment is determined as the code length of the data segment, and the difference between the segment and the base code of equal length is determined as the offset value of the data segment.

As shown in fig. 3, BC_n represents the base code of length n. If the value represented by Bit[n:0] is greater than BC_n and the value represented by Bit[n+1:0] is less than BC_{n+1}, the output of the AND gate is 1, the obtained code length CL is n, and the offset value is equal to the value of Bit[n:0] minus the value of BC_n.
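The comparison against the base codes can be sketched in Python under several stated assumptions. The base codes are built with the canonical code construction of the Deflate specification (RFC 1951), the first bit in stream order is taken as the most significant bit of the codeword, and the lower comparison is treated as inclusive (>=) so that the first codeword of each length also matches, whereas the text above says "greater than". The function names are hypothetical, and the code lengths (3, 3, 3, 3, 3, 2, 4, 4) are the small example alphabet used in RFC 1951, not data from the application.

```python
def base_codes(code_lengths, max_len):
    """Return (base, count); base[n] is the smallest canonical code of length n.

    base has one extra entry (index max_len + 1) for the look-ahead comparison.
    """
    count = [0] * (max_len + 2)
    for length in code_lengths:
        if length:
            count[length] += 1
    base = [0] * (max_len + 2)
    code = 0
    for n in range(1, max_len + 2):
        code = (code + count[n - 1]) << 1
        base[n] = code
    return base, count


def compare(bits, base, max_len):
    """Return (code_length, offset) for the codeword starting at bits[0], or None.

    bits: list of 0/1 values in stream order; the caller is assumed to supply
    at least as many bits as the codeword is long.
    """
    padded = list(bits) + [0] * (max_len + 1)    # padding for the look-ahead bit
    value = 0
    for n in range(1, max_len + 1):
        value = (value << 1) | padded[n - 1]     # n-bit segment
        lookahead = (value << 1) | padded[n]     # (n+1)-bit segment
        # Assumption: inclusive lower bound, strict upper bound.
        if value >= base[n] and lookahead < base[n + 1]:
            return n, value - base[n]
    return None


base, count = base_codes([3, 3, 3, 3, 3, 2, 4, 4], max_len=4)
match = compare([1, 1, 0, 0], base, max_len=4)   # codeword 110
```

In this sketch only one length can satisfy both comparisons for a given starting bit, which is why the hardware can evaluate all lengths in parallel and combine the two comparator outputs with an AND gate.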

Each cell of the content addressable memory 30 contains embedded comparison logic, and data written into the content addressable memory 30 is compared with each item of codeword information stored internally. In the Huffman decoding process, the content addressable memory 30 has two data inputs: one is the code length and the other is the offset value. The value of the codeword can be obtained by a lookup using the code length and the offset value. For decoding of the Deflate format, two content addressable memories 30 are used: the first content addressable memory stores the literal-length codewords, and the second content addressable memory stores the distance codewords.

The first content addressable memory obtains a first vector according to the first code length and the first offset value, that is, it obtains a literal-length vector according to the literal-length code length and the corresponding offset value. The second content addressable memory obtains a second vector according to the second code length and the second offset value, that is, it obtains a distance vector according to the distance code length and the corresponding offset value.
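As a software analogy only, the lookup performed by a content addressable memory can be modelled with a dictionary keyed by (code length, offset value). The sketch below assumes canonical ordering, in which codewords of the same length are assigned in symbol order as in the Deflate specification; the function name build_cam and the example code lengths are hypothetical, and a real content addressable memory performs the comparison in all cells in parallel.

```python
def build_cam(code_lengths):
    """Map (code_length, offset) -> symbol index, assuming canonical ordering."""
    cam = {}
    next_offset = {}
    for symbol, length in enumerate(code_lengths):
        if length == 0:
            continue                          # symbol not used in this block
        offset = next_offset.get(length, 0)
        cam[(length, offset)] = symbol
        next_offset[length] = offset + 1
    return cam


# Example alphabet with code lengths (3, 3, 3, 3, 3, 2, 4, 4), symbols 0..7.
ll_cam = build_cam([3, 3, 3, 3, 3, 2, 4, 4])
symbol = ll_cam[(3, 4)]                       # code length 3, offset 4
missing = ll_cam.get((5, 0))                  # no codeword of length 5
```

Two such tables, one for literal-length codewords and one for distance codewords, would correspond to the first and second content addressable memories.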

The inputs to the vector fusion unit 40 are the literal-length vector and the distance vector. In fig. 4, the LL vector represents the literal-length vector and the DIST vector represents the distance vector. Each vector mainly contains two types of information. One is the symbol information obtained by looking up the content addressable memory 30, i.e. the Huffman codeword represented by the current bit. The other is the length information obtained by the comparator group array 20, i.e. the length of the codeword represented by the current bit. Huffman coding in the Deflate format does not cover the extra bits, so the code length obtained by the comparator group array 20 is the length of the codeword excluding extra bits. One function of the vector fusion unit 40 is therefore to determine, from the value of the codeword, whether the codeword carries extra bits and how long they are; specifically, the extra bits information of a literal-length codeword or a distance codeword can be obtained.

The decoded content of the Deflate format is of two types: literals and match pairs. A match pair contains a length and a distance. Within one decoding, both the length and the distance must be completely decoded to produce a valid match pair. Therefore, the following three attributes are defined for the fusion vector:

1. Validity: marks whether the codeword decoded at the current bit is valid.

2. Integrity: marks whether the codeword represented by the current bit is complete. If the current bit is a match pair and the match pair extends into the code stream of the next frame, the match pair is marked as incomplete and must be decoded in the code stream of the next frame.

3. Length: represents the code length of the symbol decoded at the current bit. For a match pair, it is the total code length of the length plus the total code length of the distance.

The rules for vector fusion, i.e. for fusing the lengths of the literal-length vector and the distance vector, are as follows:

when the type of the current bit of the first vector is length and valid, if the next position pointed by the code length of the current bit is larger than the boundary value, marking the current bit of the fusion vector as valid and incomplete, wherein the code length of the current bit of the fusion vector is the code length of the current bit of the first vector;

when the type of the current bit of the first vector is length and valid, if the next position pointed by the code length of the current bit is smaller than or equal to the boundary value, marking the current bit of the fusion vector as valid and complete, wherein the code length of the current bit of the fusion vector is the sum of the code length of the current bit of the first vector and the code length of the bit of the second vector pointed by the code length of the current bit;

when the type of the current bit of the first vector is literal, the current bit of the fusion vector is marked as valid and complete.

The vector fusion unit 40 performs vector fusion based on the above rule to obtain a fusion vector.

For example, referring to fig. 5, bit0 of the literal-length vector (i.e., the LL vector shown in fig. 5) decodes to a literal with a code length of 4, so bits 0 to 3 represent the literal. bit4 decodes to a length with a code length of 6, so the next position, bit10, should hold a distance. bit10 indeed represents a distance, and the code length of the distance is 5. Thus, bit0 of the fusion vector is marked as valid and complete with a code length of 4, and bit4 of the fusion vector is marked as valid and complete with a code length of 6+5=11.
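The fusion rules and the fig. 5 example can be reproduced with a small Python sketch. The data layout (lists with None for invalid positions), the function name fuse, the 16-position width and the boundary value of 15 are assumptions chosen for illustration; code lengths are taken to already include any extra bits, and the treatment of a length codeword with no valid distance at the pointed-to position is an assumption, since the rules above do not cover that case.

```python
LITERAL, LENGTH = "literal", "length"


def fuse(ll_vec, dist_vec, boundary):
    """Fuse the literal-length vector and the distance vector per position.

    ll_vec[i]  : None (invalid) or (kind, code_len), kind is LITERAL or LENGTH
    dist_vec[i]: None (invalid) or the code length of a distance codeword at bit i
    Returns a list whose entries are None (invalid) or (complete, code_len).
    """
    fused = [None] * len(ll_vec)
    for i, entry in enumerate(ll_vec):
        if entry is None:
            continue                          # invalid stays invalid
        kind, code_len = entry
        if kind == LITERAL:
            fused[i] = (True, code_len)       # valid and complete
            continue
        nxt = i + code_len                    # where the distance codeword starts
        if nxt > boundary:
            fused[i] = (False, code_len)      # valid but incomplete
        elif nxt < len(dist_vec) and dist_vec[nxt] is not None:
            fused[i] = (True, code_len + dist_vec[nxt])
        # Assumption: no valid distance at nxt leaves the position invalid.
    return fused


WIDTH = 16                                    # hypothetical vector width
ll_vec = [None] * WIDTH
dist_vec = [None] * WIDTH
ll_vec[0] = (LITERAL, 4)                      # fig. 5: literal, code length 4
ll_vec[4] = (LENGTH, 6)                       # fig. 5: length, code length 6
dist_vec[10] = 5                              # fig. 5: distance, code length 5
fused = fuse(ll_vec, dist_vec, boundary=15)
```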

Each decoding engine calculation unit 50 is responsible for decoding the received spliced data frame according to the fusion vector and the valid position information output by the previous-stage decoding engine calculation unit 50, and for outputting the decoding result.

In a specific embodiment, the decoding engine calculation unit 50 is specifically configured to: calculate a length vector according to the fusion vector; calculate a position vector according to the length vector and the valid position information output by the previous-stage decoding engine calculation unit 50; and calculate a result vector according to the position vector, and output a decoding result according to the result vector after the previous-stage decoding engine calculation unit 50 has output its decoding result.

Specifically, the decoding engine calculation unit 50 calls different operators under the control of one state machine to complete calculation of a length vector, i.e., L vector, a position vector, i.e., P vector, and a result vector, i.e., R vector. And after the calculation is completed, controlling the decoding result to be output in parallel. The calculation process is mainly divided into three stages.

Stage one: calculating L vector

In the decoding engine calculation unit 50, the fusion vector obtained by fusion is taken as the L0 vector, from which the subsequent L vectors are calculated. The L vector is calculated as follows:

L_m[n] = L_{m-1}[n] + L_{m-1}[L_{m-1}[n] + n];

where L_m[n] represents the length at the n-th position of the L_m vector currently being calculated. If any position referenced in the expression on the right-hand side of the equals sign is invalid, L_m[n] is invalid.

If a position of the previous-level L vector is invalid, the corresponding position of the current L vector is marked as invalid. As the above expression shows, an L vector depends only on the L vector of the previous level, so all positions of an L vector can be calculated in parallel, and one level of the L vector can be completed in one clock cycle. When all positions of the L vector of a certain level are invalid, the calculation of the L vector ends.
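The recurrence for the L vector can be sketched in Python as follows. Invalid positions are modelled as None, a position whose target index falls outside the vector is treated as invalid, and all code lengths are assumed to be positive; the function names and the 8-position example vector are hypothetical. The inner loop stands in for what the text describes as a parallel, single-cycle computation of one level.

```python
def next_level(prev):
    """Compute L_m from L_{m-1}; None marks an invalid position."""
    width = len(prev)
    cur = [None] * width
    for n in range(width):                    # parallel in hardware
        if prev[n] is None:
            continue
        target = n + prev[n]
        if target < width and prev[target] is not None:
            cur[n] = prev[n] + prev[target]
    return cur


def l_vectors(l0):
    """Return [L0, L1, ...], stopping before the first all-invalid level."""
    if any(v is not None and v <= 0 for v in l0):
        raise ValueError("code lengths must be positive")
    levels = [list(l0)]
    while True:
        nxt = next_level(levels[-1])
        if all(v is None for v in nxt):
            return levels
        levels.append(nxt)


l0 = [2, 3, 1, 2, 1, 1, 2, 1]                 # hypothetical fused code lengths
levels = l_vectors(l0)
```

Each level doubles the number of codewords spanned by one entry, so in this sketch the number of levels grows with the logarithm of the number of codewords in the frame.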

Stage two: calculating P vector

After the L vectors have been calculated, the P vector is calculated according to the start position information obtained from the previous calculation. The calculation of the P vector is the reverse of the L vector calculation: the first valid position in the P vector is marked based on the valid position information output by the previous-stage decoding engine calculation unit 50, and each P vector calculation step then uses one L vector. After the position vector corresponding to the L0 vector has been calculated, the final P vector is obtained. At this point the P vector marks the positions of all valid codewords in the current code stream. During the P vector calculation, if the next valid position falls in the overlap region, it is recorded in the valid position information as the start position of the next batch of data.
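One possible reading of this reverse process is sketched below in Python: starting from the start position handed over by the previous stage, the L vectors are walked from the highest level down to L0, and each level adds the positions reachable from the positions already marked. This interpretation, the function name p_vector, and the simplification that any position at or beyond the vector width belongs to the next batch (the overlap region is not modelled separately) are assumptions. The example levels are hypothetical values consistent with the recurrence for L_m[n] given above.

```python
def p_vector(levels, start):
    """Mark all codeword start positions reachable from `start`.

    levels: [L0, L1, ...] with None for invalid positions.
    Returns (p_vec, next_start); next_start is relative to the next batch,
    or None if no marked codeword crosses the end of the vector.
    """
    width = len(levels[0])
    marked = {start}
    next_start = None
    for level in reversed(levels):            # highest level first, L0 last
        for p in sorted(marked):              # snapshot of current marks
            step = level[p]
            if step is None:
                continue
            target = p + step
            if target < width:
                marked.add(target)
            else:
                next_start = target - width   # start position for the next batch
    p_vec = [1 if n in marked else 0 for n in range(width)]
    return p_vec, next_start


levels = [
    [2, 3, 1, 2, 1, 1, 2, 1],                 # L0 (hypothetical code lengths)
    [3, 4, 3, 3, 2, 3, None, None],           # L1
    [6, 7, 6, None, None, None, None, None],  # L2
]
p_vec, next_start = p_vector(levels, start=0)
```

With start position 0 the marked positions are 0, 2, 3, 5 and 6, which is the same chain obtained by stepping through L0 one codeword at a time.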

Stage three: calculating R vector and outputting the decoding result

Based on the codeword information output from the content addressable memory 30, the P vector, and the current data frame, the corresponding codeword and extra-bit information in the Huffman table can be obtained, and the result is converted into the final output and stored in the R vector. Finally, the decoding results are output in parallel according to the number of valid positions in the P vector and the recorded position information.
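The selection step of this stage can be illustrated with a short sketch. It assumes a hypothetical list `cam_symbols` that holds, for every bit position, the symbol the content addressable memory would report if a codeword started there; extra-bit handling and the conversion into the final output format are not modelled.

```python
from typing import List, Sequence, Tuple


def r_vector(cam_symbols: Sequence[str], p_vec: Sequence[bool]) -> List[Tuple[int, str]]:
    """Keep only the candidates at positions marked in the P vector, in bit order."""
    if len(cam_symbols) != len(p_vec):
        raise ValueError("one candidate symbol is expected per bit position")
    return [(pos, sym)
            for pos, (sym, valid) in enumerate(zip(cam_symbols, p_vec))
            if valid]
```

Because the candidates at unmarked positions are simply discarded, each marked position can be converted independently of the others, which is consistent with the parallel output described above.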

Referring to fig. 6, an analysis of how the state machine calls different operators during decoding in the decoding engine calculation unit 50 shows whether data dependencies exist between adjacent decoding engine calculation units 50 at each calculation stage. The findings are summarized as follows:

1. In the IDLE state, no data to be decoded has been input, so there is no data dependency.

2. When calculating the L vector, each engine can calculate the L vector for every bit position from the data frame to be decoded and the output of the comparators, so there is no data dependency between engines at this stage. The L vector for each bit position contains an L vector for literal_length and an L vector for distance.

3. The calculation of the P vector requires the trusted info, that is, the valid position information, output by the previous decoding engine calculation unit 50. If the previous engine has not yet calculated the trusted info, the current engine can only wait for it; otherwise, the current engine can calculate the P vector directly.

4. Although there are data dependencies between engines, the P vectors of multiple engines are not calculated completely serially. The current engine can begin calculating its P vector as soon as the previous engine has calculated the trusted info, so two adjacent engines may be calculating P vectors at the same time.

5. After the P vector calculation is completed, the R vector can be calculated from the position information of the codewords and the codeword information output from the content addressable memory 30, and the decoding result can be output. Decoding results are output in order, not out of order, so the current decoding engine calculation unit 50 must wait until the previous decoding engine calculation unit 50 has finished outputting its result.

The above analysis shows that the parallel decoding algorithm can realize multi-engine parallel pipelined calculation by passing the valid position information and the output completion signal between engines.

Referring to fig. 7, the ordinate represents the index of the input data and the abscissa represents the calculation cycle. It is assumed that, for each frame of the data stream, calculating the L vector takes two cycles and calculating the P vector takes two cycles.

At calculation cycle 0, the decoding engine calculation unit 500 starts calculating Data0. Because this is the first frame of the data stream, its start position must be 0, so the calculation of the P vector begins directly after the L vector calculation is complete. At clock cycle 1, the decoding engine calculation unit 501 starts calculating Data1. Because the decoding engine calculation unit 500 has not yet output the valid position information when the L vector calculation of the decoding engine calculation unit 501 finishes, the decoding engine calculation unit 501 waits for the valid position information in the cal_p_wait state. After the first valid position information is obtained, the decoding engine calculation unit 501 calculates the P vector and the R vector and outputs the decoding result. The decoding engine calculation units 502 to 507 operate in the same manner as the decoding engine calculation unit 501. The downward arrows indicate the direction in which the valid position information is passed. After calculating the valid position information, the decoding engine calculation unit 507 outputs it to the decoding engine calculation unit 500 so that decoding continues.

The above procedure is idealized: it assumes that the valid position information is produced only in the last cycle of the P vector calculation, which makes the P vector calculations appear serial. In actual decoding, the P vector calculations overlap significantly. The above flow also assumes that there is no back pressure downstream of the decoding engine calculation unit 50 and that calculated results can be output immediately, so the current decoding engine calculation unit 50 can always obtain the output completion signal of the previous decoding engine calculation unit 50 by the time it calculates the R vector.

Further, on the basis of the above-described embodiment, as a specific implementation, the decoding control unit 10 is further configured to expand the overlap region width.

Specifically, regarding the transfer of valid position information between the decoding engine calculation units 50: when the overlap region width is set to 16 bits and the previous-stage decoding engine calculation unit 50 is calculating the P vector, as soon as the position of a valid codeword falls within the overlap region, that trusted position can be passed to the current decoding engine calculation unit 50 for its P vector calculation. In addition, Huffman coding in the Deflate format uses two code tables, namely the literal_length code table and the distance code table. The multi-engine implementation follows two principles. First, a literal entering the overlap region must be completely decoded. Second, for a match pair entering the overlap region, the length must be decoded, and the codeword information after the length has been decoded is output as valid position information to the next decoding engine calculation unit 50; the start position of the next decoding engine calculation unit 50 is then the position of the distance. Regarding output control between the decoding engine calculation units 50: when the current decoding engine calculation unit 50 has decoded successfully, it should output its decoding result only after the previous decoding engine calculation unit 50 has completely output its result.

Therefore, in a multi-engine pipelined implementation, the calculation of the valid position information is the key to pipeline performance, and for this reason the overlap region width can be enlarged. For example, the overlap region width is enlarged from 16 bits to 32 bits. Experiments show that when the overlap region width is enlarged to 32 bits, the trusted position, namely the first valid position, of the next group of data frames to be decoded can be obtained in the first cycle of the P vector calculation. In that case the cal_p_wait state does not occur in any decoding engine calculation unit 50 and, theoretically, data can be input every clock cycle.
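The effect of receiving the valid position information earlier can be checked with a small timing model. The sketch below is an illustration under stated assumptions, not a reproduction of fig. 7: the function `schedule` and its parameters are hypothetical, one frame enters the pipeline per cycle, engine reuse after the eighth frame is ignored, and the output stage and back pressure are not modelled. The parameter `info_cycle` is the cycle of the P vector calculation after which the valid position information is available to the next engine.

```python
from typing import Dict, List


def schedule(n_frames: int, l_cycles: int = 2, p_cycles: int = 2,
             info_cycle: int = 2) -> List[Dict[str, int]]:
    """Cycle-level timeline; frame i enters the pipeline at cycle i."""
    if not 1 <= info_cycle <= p_cycles:
        raise ValueError("info_cycle must be between 1 and p_cycles")
    rows: List[Dict[str, int]] = []
    info_ready = 0  # frame 0 starts at position 0, so it never waits
    for i in range(n_frames):
        l_done = i + l_cycles                 # L vector finished
        p_start = max(l_done, info_ready)     # wait in cal_p_wait if needed
        rows.append({"frame": i, "l_done": l_done, "p_start": p_start,
                     "wait": p_start - l_done, "p_done": p_start + p_cycles})
        info_ready = p_start + info_cycle     # handed to the next engine
    return rows


if __name__ == "__main__":
    for label, cyc in (("idealized", 2), ("early info", 1)):
        print(label, [row["wait"] for row in schedule(8, info_cycle=cyc)])
```

Under these assumptions, setting `info_cycle` equal to `p_cycles` corresponds to the idealized flow, in which each later frame waits longer than the one before it, while `info_cycle=1` corresponds to obtaining the trusted position in the first cycle of the P vector calculation, in which case no frame waits.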

In summary, in the huffman decoding system provided by the application, two code lengths and offset values are obtained through the comparator array, two vectors are obtained through the content addressing memory, the two vectors are fused through the vector fusion unit, and finally, the data frame is decoded through the decoding engine calculation unit, so that the huffman decoding aiming at the deflate coding is realized. On the basis, the decoding engine computing unit is provided with multiple paths, and each path of decoding engine computing unit can decode different data frames in parallel, so that the decoding throughput can be greatly improved.

The application also provides a Huffman decoding method; for this method, reference may be made to the system described above. Referring to fig. 8, which is a schematic diagram of a Huffman decoding method according to an embodiment of the present application, the method includes:

S101: receiving a data frame to be decoded through a decoding control unit, splicing the data frame to be decoded to obtain a spliced data frame, and issuing the spliced data frame to a comparator group array and a decoding engine computing unit; the decoding control unit issues different spliced data frames to each decoding engine computing unit;

S102: determining the code length and offset value corresponding to each data segment in the spliced data frame through the comparator group array; wherein the code length comprises a first code length and a second code length, and the offset value comprises a first offset value and a second offset value;

S103: obtaining a first vector through a content addressable memory according to the first code length and the first offset value, and obtaining a second vector according to the second code length and the second offset value;

S104: fusing the first vector and the second vector through a vector fusion unit to obtain a fused vector;

S105: decoding the received spliced data frame through the decoding engine computing unit according to the fused vector and the valid position information output by the previous-stage decoding engine computing unit, and outputting a decoding result.

On the basis of the above embodiment, decoding, by the decoding engine computing unit, the received spliced data frame according to the fusion vector and the valid position information output by the decoding engine computing unit at the previous stage, and outputting a decoding result includes:

calculating, by the decoding engine calculation unit, a length vector from the fused vector; calculating a position vector from the length vector and the valid position information output by the previous-stage decoding engine calculation unit; and calculating a result vector from the position vector, and outputting a decoding result according to the result vector after the previous-stage decoding engine calculation unit has output its decoding result.

On the basis of the above embodiment, receiving a data frame to be decoded through the decoding control unit, splicing the data frame to be decoded to obtain a spliced data frame, and issuing the spliced data frame to the comparator group array and the decoding engine computing unit includes:

splicing two adjacent data frames to be decoded through the decoding control unit to obtain a first spliced data frame, and issuing the first spliced data frame to the decoding engine calculation unit;

and splicing, through the decoding control unit, the data frame to be decoded with the data of overlap region width from the adjacent next frame to obtain a second spliced data frame, and issuing the second spliced data frame to the comparator group array.
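The two splicing operations can be illustrated as follows. This is a sketch under assumptions: frames are modelled as strings of '0' and '1' characters, the function name `splice_frames` is hypothetical, and the overlap data is taken from the start of the next frame, which is one reading of the description rather than a confirmed detail of the hardware.

```python
from typing import List, Tuple


def splice_frames(frames: List[str], i: int, overlap_bits: int = 16) -> Tuple[str, str]:
    """Return (first_spliced, second_spliced) for frame i.

    first_spliced:  frame i followed by the whole of frame i + 1
                    (issued to the decoding engine calculation unit)
    second_spliced: frame i followed by the first overlap_bits of frame i + 1
                    (issued to the comparator group array)
    """
    if not 0 <= i < len(frames) - 1:
        raise IndexError("frame i needs a following frame")
    cur, nxt = frames[i], frames[i + 1]
    if not 0 <= overlap_bits <= len(nxt):
        raise ValueError("overlap_bits must not exceed the frame width")
    return cur + nxt, cur + nxt[:overlap_bits]
```

For example, with 8-bit frames "10110010" and "01101100" and an overlap of 4 bits, the first spliced frame is 16 bits long and the second is 12 bits long.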

On the basis of the above embodiment, the method further comprises: the overlap region width is enlarged by the decoding control unit.

On the basis of the above embodiment, fusing the first vector and the second vector by a vector fusion unit, to obtain a fused vector includes:

On the basis of the above embodiment, obtaining a first vector through a content addressable memory according to the first code length and the first offset value, and obtaining a second vector according to the second code length and the second offset value, includes:

obtaining the first vector through a first content addressable memory according to the first code length and the first offset value;

and obtaining the second vector through a second content addressable memory according to the second code length and the second offset value.

The application also provides a Huffman decoding device, which comprises a memory and a processor. The memory is used to store a computer program; the processor is used to execute the computer program to implement the steps of the Huffman decoding method described above.

For the description of the device provided by the present application, refer to the above method embodiments; details are not repeated here.

The application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the huffman decoding method as described above.

The computer readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

For the description of the computer-readable storage medium provided by the present application, refer to the above method embodiments, and the disclosure is not repeated here.

In this description, the embodiments are described in a progressive manner, each focusing on its differences from the others, so the same or similar parts of the embodiments may be understood by reference to one another. Because the apparatus, device, and computer readable storage medium disclosed in the embodiments correspond to the method disclosed in the embodiments, they are described relatively briefly; for relevant details, refer to the description of the method.

Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The technical solution provided by the application has been described in detail above. The principles and embodiments of the present application have been described herein with reference to specific examples, and the description of these examples is intended only to facilitate an understanding of the method of the present application and its core ideas. Those skilled in the art may make various changes and modifications without departing from the principles of the application, and such changes and modifications also fall within the scope of the appended claims.

Claims

1. A huffman decoding system comprising:

each decoding engine computing unit is used for decoding the received spliced data frames according to the fusion vector and the effective position information output by the upper-level decoding engine computing unit and outputting a decoding result;

The vector fusion unit is specifically configured to:

2. The huffman decoding system according to claim 1, wherein the decoding engine calculating unit is specifically configured to:

calculating a length vector according to the fusion vector;

3. The huffman decoding system according to claim 1, wherein the decoding control unit is specifically configured to:

and splicing the data frame to be decoded and the adjacent data frame with the width of the overlapping area in the data frame to be decoded, obtaining a second spliced data frame, and sending the second spliced data frame to the comparator group array.

4. The Huffman decoding system according to claim 3, characterized in that the decoding control unit is further adapted to enlarge the overlap region width.

5. The huffman decoding system of claim 1, wherein the content addressable memory comprises:

6. A huffman decoding method characterized by comprising:

decoding the received spliced data frame by a decoding engine computing unit according to the fusion vector and the effective position information output by the decoding engine computing unit at the upper stage, and outputting a decoding result;

fusing the first vector and the second vector through a vector fusing unit, wherein the obtaining a fused vector comprises:

7. The huffman decoding method according to claim 6, wherein decoding the received spliced data frame by the decoding engine calculating unit according to the fusion vector, the effective position information outputted by the decoding engine calculating unit of the previous stage, and outputting the decoding result comprises:

8. A huffman decoding device comprising:

a memory for storing a computer program;

a processor for implementing the steps of the Huffman decoding method according to claim 6 or 7 when executing said computer program.

9. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the huffman decoding method according to claim 6 or 7.