CN115865098A - Data compression method based on Huffman coding - Google Patents

Data compression method based on Huffman coding

Info

Publication number
CN115865098A
Authority
CN
China
Prior art keywords
binary
data
huffman coding
binary data
bit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111119214.7A
Other languages
Chinese (zh)
Inventor
杜力
朱俊翰
杜源
杜伟丞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN202111119214.7A
Publication of CN115865098A
Legal status: Pending

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The application provides a data compression method based on Huffman coding. For a plurality of original data to be compressed, each original data is converted into a binary number whose length is a preset target bit width; all the binary numbers are then arranged according to the same bit expansion sequence to obtain a binary data block; finally, Huffman coding is performed separately on the data in the same bit in all the binary numbers.

Description

Data compression method based on Huffman coding
Technical Field
The application relates to the technical field of data compression, in particular to a data compression method based on Huffman coding.
Background
At present, a commonly used coding scheme for data compression is Huffman coding, a variable-length coding method that constructs prefix-free codewords with the shortest average length based entirely on the occurrence probability of the characters.
Because the code length of Huffman coding is variable, parallel coding is difficult. In the code stream to be decoded, the length of the codeword currently being decoded is not known in advance, so the starting position of the next symbol cannot be determined until the current symbol has been decoded. Huffman coding is therefore difficult to decode in parallel.
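The dependency described above can be seen in a minimal sketch. The code table is a made-up prefix-free code chosen only for illustration, not one taken from this application, and the function names are hypothetical. The decoder learns where a codeword ends only once it has matched it, so the start of the next symbol is unknown until then.

```python
# Hypothetical prefix-free code table, chosen only for illustration.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE = {bits: sym for sym, bits in CODE.items()}


def encode(symbols):
    """Concatenate the variable-length codewords of all symbols."""
    return "".join(CODE[s] for s in symbols)


def decode(stream):
    """Decode bit by bit and record the start offset of every codeword."""
    out, starts, current, start = [], [], "", 0
    for i, bit in enumerate(stream):
        current += bit
        if current in DECODE:          # the codeword boundary is known only now
            out.append(DECODE[current])
            starts.append(start)
            current, start = "", i + 1
    return "".join(out), starts


stream = encode("abcad")
text, starts = decode(stream)
print(stream, text, starts)
```

The start offsets are irregular because they depend on every earlier codeword, which is why a single Huffman stream cannot simply be cut into pieces and handed to several decoders.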
Disclosure of Invention
The code length of Huffman coding is variable, so in the code stream to be decoded the length of the codeword currently being decoded is not known in advance, and the starting position of the next symbol cannot be determined until the current symbol has been decoded. This makes Huffman coding difficult to decode in parallel. To solve this technical problem, the present application discloses a data compression method based on Huffman coding through the following embodiments.
The application discloses a data compression method based on Huffman coding in a first aspect, which comprises the following steps:
acquiring a plurality of original data to be compressed;
converting each original data into a binary number with the length being a preset target bit width;
acquiring a binary data block, wherein the binary data block comprises binary numbers corresponding to all original data, and all the binary numbers are arranged according to the same bit expansion sequence from high bit to low bit or from low bit to high bit;
and respectively carrying out Huffman coding on the data in the same bit in all the binary numbers according to the binary data blocks.
Optionally, the obtaining the binary data block includes:
arranging the binary numbers corresponding to each original data in sequence according to columns to obtain the binary data block, wherein each column of the binary data block corresponds to one binary number, the binary numbers of all the columns are arranged according to the same bit expansion sequence, and the number of rows of the binary data block is the target bit width.
Optionally, the performing Huffman coding respectively on the data in the same bit in all the binary numbers according to the binary data block includes:
and simultaneously performing Huffman coding on each row of data in the binary data block.
Optionally, the obtaining the binary data block includes:
and sequentially arranging the binary numbers corresponding to each original data according to rows to obtain the binary data block, wherein each row of the binary data block corresponds to one binary number, the binary numbers of all the rows are arranged according to the same bit expansion sequence, and the number of the columns of the binary data block is the target bit width.
Optionally, the performing Huffman coding respectively on the data in the same bit in all the binary numbers according to the binary data block includes:
simultaneously performing Huffman coding on each column of data in the binary data block.
The second aspect of the present application discloses a data compression method based on Huffman coding, which includes:
acquiring a plurality of original data to be compressed;
converting each original data into a binary number with the length being a preset target bit width;
obtaining a plurality of binary data blocks, wherein each binary data block comprises binary numbers corresponding to target amount of original data, the binary numbers in the same binary data block are arranged according to the same bit expansion sequence, the binary data in any two adjacent binary data blocks are arranged according to the reverse bit expansion sequence, and the bit expansion sequence is from high bit to low bit or from low bit to high bit;
and simultaneously performing Huffman coding on all the binary data blocks respectively.
Optionally, the obtaining a plurality of binary data blocks includes:
arranging the binary numbers corresponding to each original data in sequence according to columns to obtain a binary data set, wherein each column of the binary data set corresponds to one binary number, and the number of rows of the binary data set is the target bit width;
transversely dividing the binary data set according to the target number to obtain a plurality of data blocks to be rearranged, wherein each data block to be rearranged comprises the binary numbers of the target number, each column of each data block to be rearranged corresponds to one binary number, and the number of rows of each data block to be rearranged is the target bit width;
and adjusting the bit expansion sequence of each column of binary numbers in the data blocks to be rearranged to obtain a plurality of binary data blocks, wherein the binary numbers in each column in the same binary data block are arranged according to the same bit expansion sequence, and the binary data in any two adjacent binary data blocks are arranged according to the reverse bit expansion sequence.
Optionally, the performing Huffman coding on all the binary data blocks simultaneously respectively includes:
sequentially and transversely arranging all the binary data blocks to obtain a binary array;
and simultaneously performing Huffman coding on each row of data in the binary array.
Optionally, the obtaining a plurality of binary data blocks includes:
sequentially arranging binary numbers corresponding to each original data according to rows to obtain a binary data set, wherein each row of the binary data set corresponds to one binary number, and the number of columns of the binary data set is the target bit width;
longitudinally dividing the binary data set according to the target number to obtain a plurality of data blocks to be rearranged, wherein each data block to be rearranged comprises the binary numbers of the target number, each row of each data block to be rearranged corresponds to one binary number, and the number of columns of each data block to be rearranged is the target bit width;
and adjusting the bit expansion sequence of each row of binary numbers in the plurality of data blocks to be rearranged to obtain a plurality of binary data blocks, wherein the binary numbers in each row in the same binary data block are arranged according to the same bit expansion sequence, and the binary data in any two adjacent binary data blocks are arranged according to the reverse bit expansion sequence.
Optionally, the performing Huffman coding on all the binary data blocks simultaneously respectively includes:
sequentially and longitudinally arranging all the binary data blocks to obtain a binary array;
and simultaneously performing Huffman coding on each column of data in the binary array.
A third aspect of the present application discloses a computer device comprising:
a memory for storing a computer program; and
a processor for implementing the steps of the Huffman coding-based data compression method according to the first aspect of the present application when executing the computer program.
A fourth aspect of the present application discloses a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the Huffman coding-based data compression method according to the first aspect of the present application.
A fifth aspect of the present application discloses a computer device, comprising:
a memory for storing a computer program; and
a processor for implementing the steps of the Huffman coding-based data compression method according to the second aspect of the present application when executing the computer program.
A sixth aspect of the present application discloses a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the Huffman coding-based data compression method according to the second aspect of the present application.
According to the data compression method based on Huffman coding, each piece of original data to be compressed is converted into a binary number with the length being the preset target bit width, all the binary numbers are arranged according to the same bit expansion sequence to obtain a binary data block, and finally data in the same bit in all the binary numbers are subjected to Huffman coding respectively.
Drawings
Fig. 1 is a schematic workflow diagram of a data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 2 is a schematic diagram of a binary data block obtained by arranging binary numbers in columns in a data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 3 is a schematic diagram of data of the same bit in a binary data block arranged in columns in a data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 4 is a schematic diagram of a binary data block obtained by arranging binary numbers in rows in a data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 5 is a schematic diagram of data of the same bit in a binary data block arranged in rows in a data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 6 is a schematic workflow diagram of another data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 7 is a schematic diagram of a plurality of binary data blocks obtained by column arrangement in another data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 8 is a schematic diagram of a binary array composed of a plurality of binary data blocks arranged transversely in another data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 9 is a schematic diagram of a plurality of binary data blocks obtained by row arrangement in another data compression method based on Huffman coding disclosed in an embodiment of the present application;
Fig. 10 is a schematic diagram of a binary array composed of a plurality of binary data blocks arranged longitudinally in another data compression method based on Huffman coding disclosed in an embodiment of the present application.
Detailed Description
The code length of Huffman coding is variable, so in the code stream to be decoded the length of the codeword currently being decoded is not known in advance, and the starting position of the next symbol cannot be determined until the current symbol has been decoded. This makes Huffman coding difficult to decode in parallel. To solve this technical problem, the present application discloses a data compression method based on Huffman coding through the following embodiments.
A first embodiment of the present application provides a data compression method based on Huffman coding, referring to fig. 1, the method includes:
Step 101, obtaining a plurality of original data to be compressed. Generally, each original data is a decimal number, and the number of original data to be compressed is not limited. As an example, the plurality of original data to be compressed are: 40, 2, 0, 30, -30, -38, 0, 11.
Step 102, converting each original data into a binary number with the length being a preset target bit width.
In this embodiment, the value of the target bit width is not limited, and a technician may set the target bit width to any bit size according to an actual requirement, for example, to 8 bits, 16 bits, 32 bits, or 64 bits.
As an example, the target bit width is set to 8 bits, and the plurality of original data to be compressed, 40, 2, 0, 30, -30, -38, 0 and 11, are converted into binary numbers each 8 bits long. The 8-bit binary number corresponding to 40 is "00101000", the 8-bit binary number corresponding to 2 is "00000010", the 8-bit binary number corresponding to 0 is "00000000", the 8-bit binary number corresponding to 30 is "00011110", the 8-bit binary number corresponding to -30 is "11100010", the 8-bit binary number corresponding to -38 is "11011010", and the 8-bit binary number corresponding to 11 is "00001011".
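The conversion in this example can be sketched as follows. The bit strings listed for the negative values match an 8-bit two's-complement representation; the source does not name the representation, so that is an inference, and the helper name `to_bits` is hypothetical.

```python
def to_bits(value: int, width: int = 8) -> str:
    """Return `value` as a two's-complement bit string of exactly `width` bits.

    Assumption: negative numbers use two's complement, which is what the
    example bit strings for -30 and -38 correspond to.
    """
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {width} signed bits")
    # Masking maps a negative value onto its unsigned two's-complement pattern.
    return format(value & ((1 << width) - 1), f"0{width}b")


for v in (40, 2, 0, 30, -30, -38, 11):
    print(f"{v:>4} -> {to_bits(v)}")
```

A wider target bit width, such as 16 bits, only changes the `width` argument.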
Step 103, obtaining a binary data block, where the binary data block includes binary numbers corresponding to all the original data, and all the binary numbers are arranged according to the same bit expansion sequence, where the bit expansion sequence is from high to low, or from low to high.
Step 104, respectively performing Huffman coding on the data in the same bit in all the binary numbers according to the binary data block.
The binary number and the data length contained in the binary data block are clear, and when the Huffman coding is carried out, the parallel coding can be carried out on the data at different bit positions simultaneously, so that the parallelism degree of the Huffman coding is improved, and the coding speed and the decoding speed are accelerated. For example, for the above example, in the obtained binary data block, each original data is expanded into an 8-bit binary number, and is arranged according to the same bit expansion sequence, in the encoding process, data at the same bit can be extracted and used as a path of data to perform Huffman encoding, and then for the binary data block, 8 paths of data can be extracted at the same time to perform parallel encoding, so that the encoding parallelism can be greatly improved, and the encoding speed and the decoding speed are increased.
The Huffman coding-based data compression method provided by the embodiment converts each piece of original data to be compressed into a binary number with a length of a preset target bit width, then arranges all the binary numbers according to the same bit expansion sequence to obtain a binary data block, and finally performs Huffman coding on the data with the same bit in all the binary numbers respectively.
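The steps of this embodiment can be sketched end to end as follows. Several details are assumptions, because the source does not specify them: each path of same-bit data is cut into 2-bit symbols before coding (`group=2`), one code table is built per path, and a thread pool stands in for parallel hardware. All function names and the input list are hypothetical.

```python
import heapq
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import count


def huffman_code(symbols):
    """Build a prefix-free code {symbol: bit string} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                      # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    tie = count()                           # tie-breaker so dicts are never compared
    heap = [(n, next(tie), {s: ""}) for s, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


def bit_lanes(values, width=8):
    """Lane i holds bit i (0 = highest bit) of every value, in input order."""
    words = [format(v & ((1 << width) - 1), f"0{width}b") for v in values]
    return ["".join(w[i] for w in words) for i in range(width)]


def encode_lane(lane, group=2):
    # ASSUMPTION: consecutive bits of a lane are grouped into `group`-bit symbols.
    symbols = [lane[i:i + group] for i in range(0, len(lane), group)]
    code = huffman_code(symbols)
    return code, "".join(code[s] for s in symbols)


def decode_lane(code, stream):
    """Sequentially decode one lane; lanes are independent of each other."""
    inverse = {bits: sym for sym, bits in code.items()}
    out, current = [], ""
    for bit in stream:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


values = [40, 2, 0, 30, -30, -38, 0, 11]    # assumed sample input
lanes = bit_lanes(values)
with ThreadPoolExecutor(max_workers=len(lanes)) as pool:
    encoded = list(pool.map(encode_lane, lanes))   # one task per lane
decoded = [decode_lane(code, stream) for code, stream in encoded]
print(decoded == lanes)
```

Python threads do not give real speed-up for this CPU-bound work; the pool only illustrates that the eight paths share no state, so each could be assigned to its own encoder or decoder.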
In one implementation, the obtaining the binary data block includes:
arranging the binary numbers corresponding to each original data in sequence according to columns to obtain the binary data block, wherein each column of the binary data block corresponds to one binary number, the binary numbers of all the columns are arranged according to the same bit expansion sequence, and the number of rows of the binary data block is the target bit width.
With reference to the above example, if the original data include 40, 2, 0, 30, -30, -38, 0 and 11, and the target bit width is 8 bits, the obtained binary data block is as shown in fig. 2. The binary numbers corresponding to each original data are arranged sequentially by columns, wherein the binary number corresponding to 40 is in the first column, the binary number corresponding to 2 is in the second column, …, and the binary number corresponding to 11 is in the last column. As can be seen from fig. 2, each binary number is arranged from top to bottom according to the bit expansion sequence from the high bit to the low bit; in actual operation, the binary numbers may also be arranged according to the bit expansion sequence from the low bit to the high bit. The number of columns of the binary data block is the number of original data to be compressed, and the number of rows is the target bit width.
In this implementation, the performing Huffman coding respectively on the data in the same bit in all the binary numbers according to the binary data block includes: simultaneously performing Huffman coding on each row of data in the binary data block.
With reference to fig. 2 and fig. 3, the data in the same row in the binary data block is the data in the same bit in all the binary data, and Huffman coding is performed on each row of data at the same time, so that parallel coding of 8-way data can be realized, and thus, the coding speed and the subsequent decoding speed can be effectively increased.
In another implementation, the obtaining a binary data block includes:
and sequentially arranging the binary numbers corresponding to each original data according to rows to obtain the binary data block, wherein each row of the binary data block corresponds to one binary number, the binary numbers of all the rows are arranged according to the same bit expansion sequence, and the number of the columns of the binary data block is the target bit width.
With reference to the above example, if the original data include 40, 2, 0, 30, -30, -38, 0 and 11, and the target bit width is 8 bits, the obtained binary data block is as shown in fig. 4. The binary numbers corresponding to each original data are arranged sequentially by rows, wherein the binary number corresponding to 40 is in the first row, the binary number corresponding to 2 is in the second row, …, and the binary number corresponding to 11 is in the last row. As can be seen from fig. 4, each binary number is arranged from left to right according to the bit expansion sequence from the high bit to the low bit; in actual operation, the binary numbers may also be arranged according to the bit expansion sequence from the low bit to the high bit. The number of rows of the binary data block is the number of original data to be compressed, and the number of columns is the target bit width.
In this implementation, the performing Huffman coding respectively on the data in the same bit in all the binary numbers according to the binary data block includes: simultaneously performing Huffman coding on each column of data in the binary data block.
With reference to fig. 4 and 5, the data in the same column in the binary data block is the data in the same bit in all the binary data, and Huffman coding is performed simultaneously on each column of data, so that parallel coding of 8-way data can be realized, and thus the coding speed and the subsequent decoding speed can be effectively improved.
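The two layouts of the binary data block are transposes of each other, which the following sketch makes explicit. The input list is an assumed sample and all names are hypothetical; the figures of the application are not reproduced here.

```python
def to_bits(value, width=8):
    """Two's-complement bit string of `width` bits (assumed representation)."""
    return format(value & ((1 << width) - 1), f"0{width}b")


values = [40, 2, 0, 30, -30, -38, 0, 11]     # assumed sample input
words = [to_bits(v) for v in values]

# Row arrangement: one binary number per row, highest bit on the left.
row_block = [list(w) for w in words]
# Column arrangement: one binary number per column, highest bit at the top.
column_block = [list(col) for col in zip(*row_block)]

# Same-bit data: rows of the column layout, columns of the row layout.
paths_from_rows = ["".join(r) for r in column_block]
paths_from_columns = ["".join(row[j] for row in row_block) for j in range(8)]

for path in paths_from_columns:
    print(path)
```

Either layout yields the same eight paths of same-bit data, so the choice between them affects only how the block is stored and read, not what is coded.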
In practical applications, some data, such as the weights of a convolutional neural network, consist mostly of small values. Viewed at the binary level, the data formed from the high-order bits therefore has low information entropy, while the data formed from the low-order bits has high information entropy. This may cause the high-order data to be decoded quickly and the low-order data to be decoded slowly.
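The imbalance described here can be made concrete by computing the Shannon entropy of each path of same-bit data. The values below are hypothetical small non-negative numbers standing in for quantized weights; they are not data from the application. With negative values in two's complement the high bits would be ones rather than zeros, but a path that is nearly constant has low entropy either way.

```python
from collections import Counter
from math import log2


def entropy(bits):
    """Shannon entropy, in bits per symbol, of a string of '0'/'1' characters."""
    n = len(bits)
    return sum(-(c / n) * log2(c / n) for c in Counter(bits).values())


def bit_lanes(values, width=8):
    """Lane i holds bit i (0 = highest bit) of every value."""
    words = [format(v & ((1 << width) - 1), f"0{width}b") for v in values]
    return ["".join(w[i] for w in words) for i in range(width)]


weights = [3, 1, 0, 2, 5, 1, 0, 4]           # hypothetical small values
entropies = [entropy(lane) for lane in bit_lanes(weights)]
for i, h in enumerate(entropies):
    print(f"bit {7 - i}: {h:.3f}")
```

For this input the five highest bit positions carry no information at all, while the lowest position reaches the maximum of one bit per symbol.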
In order to increase the decoding speed and the utilization rate of a hardware decoder, a second embodiment of the present application provides a data compression method based on Huffman coding, referring to fig. 6, where the method includes:
Step 601, obtaining a plurality of original data to be compressed. Generally, each original data is a decimal number, and the number of original data to be compressed is not limited. As an example, the plurality of original data to be compressed are: 40, 2, 0, 30, -30, -38, 0, 11.
Step 602, converting each of the original data into a binary number with a length of a preset target bit width.
In this embodiment, the value of the target bit width is not limited, and a technician may set the target bit width to any bit size according to an actual requirement, for example, to 8 bits, 16 bits, 32 bits, or 64 bits.
As an example, the target bit width is set to 8 bits, and the plurality of original data to be compressed, 40, 2, 0, 30, -30, -38, 0 and 11, are converted into binary numbers each 8 bits long. For the specific conversion results, refer to the first embodiment; they are not repeated here.
Step 603, obtaining a plurality of binary data blocks, each of which includes binary data corresponding to a target amount of original data, the binary data in the same binary data block are arranged according to the same bit expansion sequence, the binary data in any two adjacent binary data blocks are arranged according to the opposite bit expansion sequence, and the bit expansion sequence is from high to low or from low to high.
The value of the target number is not limited in this embodiment, and a technician may set the target number to any value according to actual requirements. As an example, the present embodiment sets the target number to 5, so each binary data block includes 5 binary numbers. It should be noted that if the last binary data block contains fewer binary numbers than the target number, it is padded with zeros. Combining the above example, the binary numbers corresponding to 40, 2, 0 and 30 are used as the first binary data block, the binary numbers corresponding to -30, -38, 0, 11 and 0 are used as the second binary data block, and the last zero in the second binary data block is obtained through the zero padding operation.
Binary numbers in the same binary data block are arranged according to the same bit expansion sequence, and binary data in any two adjacent binary data blocks are arranged according to the opposite bit expansion sequence. If the binary numbers in the first binary data block are all arranged according to the bit expansion sequence from high bit to low bit, the binary numbers in the second binary data block are all arranged according to the bit expansion sequence from low bit to high bit, the binary numbers in the third binary data block are all arranged according to the bit expansion sequence from high bit to low bit, and so on.
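Splitting, zero padding and alternating the bit expansion sequence can be sketched as follows. The input list is an assumed sample, so the block contents it produces are illustrative and may differ from the blocks listed in the text; the function names are hypothetical, and the first block is assumed to run from high bit to low bit.

```python
def to_bits(value, width=8):
    """Two's-complement bit string of `width` bits (assumed representation)."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def make_blocks(values, target=5, width=8):
    """Split into blocks of `target` numbers, zero-pad the last block, and
    store every second block with its bits in the opposite order."""
    padded = list(values) + [0] * (-len(values) % target)
    blocks = []
    for k in range(0, len(padded), target):
        words = [to_bits(v, width) for v in padded[k:k + target]]
        if (k // target) % 2 == 1:           # second, fourth, ... block
            words = [w[::-1] for w in words]  # low bit first
        blocks.append(words)
    return blocks


blocks = make_blocks([40, 2, 0, 30, -30, -38, 0, 11])
for number, block in enumerate(blocks, start=1):
    print(number, block)
```

Reversing a bit string is its own inverse, so a decoder that knows the block index can undo the reordering without any extra stored information.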
Step 604, simultaneously performing Huffman coding on all the binary data blocks respectively.
When Huffman coding is carried out on a single binary data block, the binary numbers and the data length contained in the binary data block are definite, and the data in different bit positions can be coded in parallel, so that the parallelism of the Huffman coding is improved, and the coding speed and the decoding speed are increased. For example, for the above example, in a single obtained binary data block, each original data is expanded into an 8-bit binary number, and is arranged according to the same bit expansion sequence, in the encoding process, data at the same bit can be extracted and used as a path of data to perform Huffman encoding, and then, for the binary data block, 8 paths of data can be extracted at the same time to perform parallel encoding, so that the encoding parallelism can be greatly improved, and the encoding speed and the decoding speed are increased.
When Huffman coding is performed on different binary data blocks, the binary data in two adjacent binary data blocks are arranged according to opposite bit expansion sequences, so high-order data and low-order data are coded together in the same path rather than separately. This avoids the situation in which, during subsequent decoding, high-order decoding finishes first and the decoder must sit idle until low-order decoding finishes, thereby improving the utilization rate of the hardware decoder.
In one implementation, the obtaining a plurality of binary data blocks includes:
arranging the binary numbers corresponding to each original data in sequence according to columns to obtain a binary data set, wherein each column of the binary data set corresponds to one binary number, and the number of rows of the binary data set is the target bit width.
And transversely dividing the binary data set according to the target number to obtain a plurality of data blocks to be rearranged, wherein each data block to be rearranged comprises the binary numbers of the target number, each column of each data block to be rearranged corresponds to one binary number, and the number of rows of each data block to be rearranged is the target bit width.
And adjusting the bit expansion sequence of each column of binary numbers in the multiple data blocks to be rearranged to obtain multiple binary data blocks, wherein the binary numbers of each column in the same binary data block are arranged according to the same bit expansion sequence, and the binary data in any two adjacent binary data blocks are arranged according to the reverse bit expansion sequence.
With reference to the above example, if the original data include 40, 2, 0, 30, -30, -38, 0 and 11, the target bit width is 8 bits, and the target number is 5, then the binary numbers corresponding to the original data can be divided transversely into two binary data blocks, as shown in fig. 7. The binary numbers corresponding to each original data are arranged in sequence by columns, and because the second binary data block contains fewer than 5 binary numbers, a zero padding operation is performed on it. In practical application, all the binary numbers in the first binary data block may be arranged from top to bottom according to the bit expansion sequence from the highest bit to the lowest bit, and all the binary numbers in the second binary data block may be arranged from top to bottom according to the bit expansion sequence from the lowest bit to the highest bit. As can be seen from fig. 7, the number of rows of each binary data block is the target bit width, and the number of columns is the target number.
In this implementation, the Huffman coding all the binary data blocks simultaneously, respectively, includes:
and sequentially and transversely arranging all the binary data blocks to obtain a binary array. Huffman coding is performed simultaneously for each row of data in the binary array.
With reference to the above example, the first binary data block and the second binary data block are transversely spliced together to obtain the binary array, as shown in fig. 8, each row of data of the binary array includes high-order data and low-order data of binary data, and the information entropy of each row of data is equalized as much as possible, so that when Huffman coding is performed on all rows of data simultaneously, the problem that subsequent high-order decoding ends first and a period of time is needed to wait for the low-order decoding to end can be effectively avoided, and the utilization rate of the hardware decoder is improved.
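The transverse splicing can be sketched as follows. It assumes a sample input list, a column layout with the first block running from high bit to low bit, and hypothetical function names; figs. 7 and 8 are not reproduced.

```python
def to_bits(value, width=8):
    """Two's-complement bit string of `width` bits (assumed representation)."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def binary_array_rows(values, target=5, width=8):
    """Return the rows of the binary array formed by placing the blocks side
    by side; each column is one binary number, and every second block is
    stored with its bits in the opposite order."""
    padded = list(values) + [0] * (-len(values) % target)
    rows = [""] * width
    for k in range(0, len(padded), target):
        words = [to_bits(v, width) for v in padded[k:k + target]]
        if (k // target) % 2 == 1:
            words = [w[::-1] for w in words]  # low bit at the top
        for r in range(width):
            rows[r] += "".join(w[r] for w in words)
    return rows


rows = binary_array_rows([40, 2, 0, 30, -30, -38, 0, 11])
for row in rows:
    print(row)
```

In the output, the first five characters of the top row are highest bits and the last five are lowest bits, so every row that is Huffman coded mixes both kinds of data.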
In another implementation, the obtaining a plurality of binary data blocks includes:
and sequentially arranging the binary numbers corresponding to each original data according to rows to obtain a binary data set, wherein each row of the binary data set corresponds to one binary number, and the number of columns of the binary data set is the target bit width.
And longitudinally dividing the binary data set according to the target number to obtain a plurality of data blocks to be rearranged, wherein each data block to be rearranged comprises the binary numbers of the target number, each row of each data block to be rearranged corresponds to one binary number, and the number of columns of each data block to be rearranged is the target bit width.
And adjusting the bit expansion sequence of each row of binary numbers in the plurality of data blocks to be rearranged to obtain a plurality of binary data blocks, wherein the binary numbers in each row in the same binary data block are arranged according to the same bit expansion sequence, and the binary data in any two adjacent binary data blocks are arranged according to the reverse bit expansion sequence.
With reference to the above example, if the original data include 40, 2, 0, 30, -30, -38, 0 and 11, the target bit width is 8 bits, and the target number is 5, then the binary numbers corresponding to the original data can be divided longitudinally into two binary data blocks, as shown in fig. 9. The binary numbers corresponding to each original data are arranged in sequence by rows, and because the second binary data block contains fewer than 5 binary numbers, a zero padding operation is performed on it. In practical application, all the binary numbers in the first binary data block may be arranged from left to right according to the bit expansion sequence from the highest bit to the lowest bit, and all the binary numbers in the second binary data block may be arranged from left to right according to the bit expansion sequence from the lowest bit to the highest bit. As can be seen from fig. 9, the number of rows and the number of columns of each binary data block are the target number and the target bit width, respectively.
In this implementation, the Huffman coding all the binary data blocks simultaneously, respectively, includes:
and sequentially and longitudinally arranging all the binary data blocks to obtain a binary array. Huffman coding is performed simultaneously for each column of data in the binary array.
With reference to the above example, the first binary data block and the second binary data block are longitudinally spliced together to obtain the binary array, as shown in fig. 10. Each column of data of the binary array includes both high-order data and low-order data of the binary numbers, and the information entropy of each column of data is balanced as much as possible. Therefore, when Huffman coding is performed on all columns of data simultaneously, the problem that subsequent high-order decoding ends first and must wait a period of time for low-order decoding to end can be effectively avoided, and the utilization rate of the hardware decoder is improved.
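The source does not describe how the original data are recovered after decoding, so the following is only a sketch of one way the longitudinal rearrangement could be undone. It assumes that the number of original data is stored separately so the zero padding can be dropped, that values are two's complement, and that the first block runs from high bit to low bit; all names are hypothetical.

```python
def to_bits(value, width=8):
    """Two's-complement bit string of `width` bits (assumed representation)."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def from_bits(bits):
    """Inverse of to_bits: interpret a bit string as a signed integer."""
    n = int(bits, 2)
    return n - (1 << len(bits)) if bits[0] == "1" else n


def build_array(values, target=5, width=8):
    """Row layout: one binary number per row, blocks stacked vertically,
    every second block stored low bit first."""
    padded = list(values) + [0] * (-len(values) % target)
    return [to_bits(v, width)[::-1] if (i // target) % 2 == 1 else to_bits(v, width)
            for i, v in enumerate(padded)]


def restore(array, count, target=5):
    """Undo the bit reversal per block and drop the zero padding.
    ASSUMPTION: `count`, the number of original data, is known to the decoder."""
    values = []
    for i, word in enumerate(array[:count]):
        if (i // target) % 2 == 1:
            word = word[::-1]
        values.append(from_bits(word))
    return values


values = [40, 2, 0, 30, -30, -38, 0, 11]     # assumed sample input
array = build_array(values)
first_column = "".join(row[0] for row in array)
print(first_column, restore(array, len(values)))
```

Because the rearrangement only reorders bits, it is lossless on its own; any loss of information would have to come from the coding step, and Huffman coding is itself lossless.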
A third embodiment of the present application discloses a computer device, comprising:
a memory for storing a computer program; and
a processor for implementing the steps of the Huffman coding-based data compression method according to the first embodiment of the present application when executing the computer program.
A fourth embodiment of the present application discloses a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the Huffman coding-based data compression method according to the first embodiment of the present application.
A fifth embodiment of the present application discloses a computer device, comprising:
a memory for storing a computer program; and
a processor for implementing the steps of the Huffman coding-based data compression method according to the second embodiment of the present application when executing the computer program.
A sixth embodiment of the present application discloses a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the Huffman coding-based data compression method according to the second embodiment of the present application.

Claims (10)

1. A data compression method based on Huffman coding is characterized by comprising the following steps:
acquiring a plurality of original data to be compressed;
converting each original data into a binary number with the length being a preset target bit width;
acquiring a binary data block, wherein the binary data block comprises binary numbers corresponding to all original data, and all the binary numbers are arranged according to the same bit expansion sequence from high bit to low bit or from low bit to high bit;
and respectively performing Huffman coding on the data in the same bit in all the binary numbers according to the binary data block.
2. The Huffman coding-based data compression method as recited in claim 1, wherein said obtaining a binary data block comprises:
arranging the binary numbers corresponding to each original data in sequence according to columns to obtain the binary data block, wherein each column of the binary data block corresponds to one binary number, the binary numbers of all the columns are arranged according to the same bit expansion sequence, and the number of rows of the binary data block is the target bit width.
3. The Huffman coding-based data compression method according to claim 2, wherein respectively performing Huffman coding on the data in the same bit in all the binary numbers according to the binary data block comprises:
simultaneously performing Huffman coding on each row of data in the binary data block.
4. The Huffman coding-based data compression method as recited in claim 1, wherein said obtaining a binary data block comprises:
arranging the binary numbers corresponding to each original data in sequence according to rows to obtain the binary data block, wherein each row of the binary data block corresponds to one binary number, the binary numbers of all the rows are arranged according to the same bit expansion sequence, and the number of columns of the binary data block is the target bit width.
5. The Huffman coding-based data compression method according to claim 4, wherein respectively performing Huffman coding on the data in the same bit in all the binary numbers according to the binary data block comprises:
simultaneously performing Huffman coding on each column of data in the binary data block.
6. A data compression method based on Huffman coding is characterized by comprising the following steps:
acquiring a plurality of original data to be compressed;
converting each original data into a binary number with the length being a preset target bit width;
obtaining a plurality of binary data blocks, wherein each binary data block comprises the binary numbers corresponding to a target number of original data, the binary numbers in the same binary data block are arranged according to the same bit expansion sequence, the binary numbers in any two adjacent binary data blocks are arranged according to reverse bit expansion sequences, and the bit expansion sequence is from high bit to low bit or from low bit to high bit;
and simultaneously performing Huffman coding on all the binary data blocks respectively.
7. The Huffman coding based data compression method of claim 6, wherein obtaining a plurality of binary data blocks comprises:
arranging the binary numbers corresponding to each original data in sequence according to columns to obtain a binary data set, wherein each column of the binary data set corresponds to one binary number, and the number of rows of the binary data set is the target bit width;
transversely dividing the binary data set according to the target number to obtain a plurality of data blocks to be rearranged, wherein each data block to be rearranged comprises the binary numbers of the target number, each column of each data block to be rearranged corresponds to one binary number, and the number of rows of each data block to be rearranged is the target bit width;
and adjusting the bit expansion sequence of each column of binary numbers in the multiple data blocks to be rearranged to obtain multiple binary data blocks, wherein the binary numbers of each column in the same binary data block are arranged according to the same bit expansion sequence, and the binary data in any two adjacent binary data blocks are arranged according to the reverse bit expansion sequence.
8. The Huffman coding-based data compression method according to claim 7, wherein simultaneously performing Huffman coding on all the binary data blocks respectively comprises:
sequentially and transversely arranging all the binary data blocks to obtain a binary array;
and simultaneously performing Huffman coding on each row of data in the binary array.
9. The Huffman coding based data compression method of claim 6, wherein obtaining a plurality of binary data blocks comprises:
sequentially arranging binary numbers corresponding to each original data according to rows to obtain a binary data set, wherein each row of the binary data set corresponds to one binary number, and the number of columns of the binary data set is the target bit width;
longitudinally dividing the binary data set according to the target number to obtain a plurality of data blocks to be rearranged, wherein each data block to be rearranged comprises the binary numbers of the target number, each row of each data block to be rearranged corresponds to one binary number, and the number of columns of each data block to be rearranged is the target bit width;
and adjusting the bit expansion sequence of each row of binary numbers in the plurality of data blocks to be rearranged to obtain a plurality of binary data blocks, wherein the binary numbers in each row in the same binary data block are arranged according to the same bit expansion sequence, and the binary data in any two adjacent binary data blocks are arranged according to the reverse bit expansion sequence.
10. The Huffman coding-based data compression method according to claim 9, wherein simultaneously performing Huffman coding on all the binary data blocks respectively comprises:
sequentially and longitudinally arranging all the binary data blocks to obtain a binary array;
and simultaneously performing Huffman coding on each column of data in the binary array.
CN202111119214.7A 2021-09-24 2021-09-24 Data compression method based on Huffman coding Pending CN115865098A (en)

Publications (1)

Publication Number Publication Date
CN115865098A (en) 2023-03-28

Family

ID=85652431

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination