CN112165331A - Data compression method and device, data decompression method and device, storage medium and electronic equipment - Google Patents

Publication number
CN112165331A
Authority
CN
China
Prior art keywords
data
differential
frame
byte
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011001267.4A
Other languages
Chinese (zh)
Inventor
沈亮
孙宝辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai East China Automotive Information Technology Co Ltd
Original Assignee
Shanghai OFilm Smart Car Technology Co Ltd
Shanghai East China Automotive Information Technology Co Ltd
Application filed by Shanghai OFilm Smart Car Technology Co Ltd, Shanghai East China Automotive Information Technology Co Ltd filed Critical Shanghai OFilm Smart Car Technology Co Ltd
Priority to CN202011001267.4A priority Critical patent/CN112165331A/en
Publication of CN112165331A publication Critical patent/CN112165331A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/70Type of the data to be coded, other than image and sound

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The embodiments of the present application disclose a data compression method and device, a data decompression method and device, a storage medium, and electronic equipment. The method comprises: obtaining at least one type of feature data contained in original file data; obtaining data information of each type of feature data in the at least one type of feature data; performing data encoding on each type of data information based on the encoding mode corresponding to the data information of each type of feature data, to obtain data frames corresponding to each type of data information; and compressing the data frames corresponding to each type of data information to obtain a data compression packet corresponding to the original file data. The embodiments of the present application can improve the compression rate of data compression.

Description

Data compression method and device, data decompression method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data compression method and apparatus, a data decompression method and apparatus, a storage medium, and an electronic device.
Background
With the development of communication technology, electronic devices must meet users' performance requirements for information, resources, communication, and the like, which involves scenarios such as data updates or data upgrades of the electronic devices. In these scenarios, data is compressed in order to save data overhead (such as memory overhead and data transmission overhead).
At present, in a data compression process, an electronic device usually compresses original file data by using a fixed compression algorithm to generate a compressed data packet after the compression process.
Disclosure of Invention
The embodiment of the application provides a data compression method and device, a data decompression method and device, a storage medium and electronic equipment, which can improve the compression rate of data compression. The technical scheme of the embodiment of the application is as follows:
In a first aspect, an embodiment of the present application provides a data compression method, where the method includes:
acquiring at least one type of characteristic data contained in original file data, and acquiring data information of various types of characteristic data in the at least one type of characteristic data;
respectively carrying out data coding on various data information based on coding modes corresponding to the data information of various characteristic data to obtain data frames corresponding to the various data information;
and compressing the data frames corresponding to the various types of data information respectively to obtain a data compression packet corresponding to the original file data.
In a second aspect, an embodiment of the present application provides a data decompression method, where the method includes:
acquiring a data compression packet;
decompressing the data compression packet to obtain at least one type of data frame;
and performing data decoding on the at least one type of data frames based on the decoding modes corresponding to the various types of data frames in the at least one type of data frames to obtain original file data corresponding to the data compression packet.
In a third aspect, an embodiment of the present application provides a data compression apparatus, where the apparatus includes:
the data information acquisition module is used for acquiring at least one type of characteristic data contained in original file data and acquiring data information of each type of characteristic data;
the data information coding module is used for respectively carrying out data coding on various data information based on coding modes corresponding to the data information of various characteristic data to obtain data frames corresponding to the various data information;
and the data compression packet generation module is used for compressing the data frames corresponding to the various data information respectively to obtain the data compression packet corresponding to the original file data.
In a fourth aspect, an embodiment of the present application provides a data decompression apparatus, where the apparatus includes:
the compressed packet acquisition module is used for acquiring a data compressed packet;
the compressed packet decompression module is used for decompressing the data compressed packet to obtain at least one type of data frame;
and the data frame decoding module is used for carrying out data decoding on the at least one type of data frames based on the decoding modes corresponding to the various types of data frames in the at least one type of data frames to obtain the original file data corresponding to the data compression packet.
In a fifth aspect, embodiments of the present application provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a sixth aspect, an embodiment of the present application provides an electronic device, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
The beneficial effects brought by the technical scheme provided by some embodiments of the application at least comprise:
in one or more embodiments of the present application, a first electronic device obtains at least one type of feature data included in original file data, obtains data information of each type of feature data in the at least one type of feature data, performs data encoding on each type of data information based on an encoding mode corresponding to the data information of each type of feature data, respectively, to obtain data frames corresponding to each type of data information, and compresses the data frames corresponding to each type of data information, respectively, to obtain a data compression packet corresponding to the original file data. By acquiring at least one type of characteristic data (namely data compiling characteristics) obtained after subdividing and mining the original file data and then encoding the various types of characteristic data, the problem of low compression ratio when data compression is directly carried out based on a fixed compression algorithm can be solved, the compression ratio of data compression is improved, and the data overhead (such as memory overhead, data transmission overhead and the like) is saved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a data compression method provided in an embodiment of the present application;
Fig. 2 is a schematic flow chart diagram illustrating another data compression method according to an embodiment of the present disclosure;
Fig. 3 is a schematic view of a scenario of repetitive byte calculation involved in a data compression method provided in an embodiment of the present application;
Fig. 4 is a schematic view of a scenario in which original file data is partitioned according to a data compression method provided in an embodiment of the present application;
Fig. 5 is a schematic flowchart of a data decompression method according to an embodiment of the present application;
Fig. 6 is a schematic flow chart of another data decompression method provided in the embodiments of the present application;
Fig. 7 is a schematic structural diagram of a data compression apparatus according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a data information obtaining module according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a data compression module according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a differential byte information obtaining unit provided in an embodiment of the present application;
Fig. 11 is a schematic structural diagram of another data compression apparatus provided in an embodiment of the present application;
Fig. 12 is a schematic structural diagram of a data decompression device according to an embodiment of the present application;
Fig. 13 is a schematic structural diagram of a data frame decoding module according to an embodiment of the present application;
Fig. 14 is a schematic structural diagram of another data decompression device according to an embodiment of the present application;
Fig. 15 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 16 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description of the present application, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In the description of the present application, it is noted that, unless explicitly stated or limited otherwise, "including" and "having" and any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus. The specific meaning of the above terms in the present application can be understood in a specific case by those of ordinary skill in the art. Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
In the related art, an electronic device usually compresses original file data with a fixed compression algorithm to directly generate a compressed data packet. However, compiled original file data has corresponding data compiling features, such as a repeated byte feature and a differential byte feature, and the data corresponding to these features may show a degree of data code similarity or data code reusability. If the original file data is compressed directly with a fixed compression algorithm, these features are not exploited and the compression rate may be low.
The present application will be described in detail with reference to specific examples.
In one embodiment, as shown in Fig. 1, a data compression method is proposed. The method may be implemented by a computer program and may run on a data compression apparatus based on the von Neumann architecture. The computer program may be integrated into an application or run as an independent tool application. In the following, the data compression apparatus is described as the first electronic device.
Specifically, the data compression method includes:
Step S101: Obtain at least one type of feature data contained in original file data, and obtain data information of each type of feature data in the at least one type of feature data.
The original file data is data in a database (e.g., a user database for upgrading a device) or any data stored for use by a device user, a device developer, a back-end administrator, and the like. The original file data is typically raw or reduced data, which may or may not be in machine-readable form, and constitutes physically present data. It can take many forms, such as text data, image data, audio data, upgrade data, or a mixture of several kinds of data.
The feature data may be understood as at least one type of data identified by subdividing and mining the original file data according to the data compiling features it contains, using all or part of the data compression method in the embodiments of the present application. Encoding based on this feature data can greatly increase the reuse rate of encoded bytes or data segments in the original file data, so that feature data is reused when the original file data is encoded and the compression rate is improved. Take a specific implementation scenario as an example. First, embedded systems commonly use a reduced instruction set, where a 16-bit system is aligned on 2 bytes and a 32-bit system is aligned on 4 bytes, so 2-byte and 4-byte units have the highest repetition or reuse rate when the unit length for compression encoding of the original file data is chosen. Second, in the compiled data structure, basic statement code segments such as loop, switch, and if statements have characteristic attributes such as repeatability, difference, and reproducibility. Furthermore, the compiled structures of similar functions also show a degree of code similarity, so the code for these parts can be restored by reusing compiled code.
As another example, in a practical application environment, original file data is often padded with repeated bytes according to a protocol standard, and the padding strategy may be repeated data of 1 to 16 bytes. The assignment areas of initial variables in flash space likewise contain many continuous, regular data segments composed of repeated bytes. After a suitable data compiling process, this part of the data reaches a higher compression rate than when a single compression algorithm is applied to the original file data directly, as in the related art.
Specifically, the data structure produced when the original file data was compiled is analyzed, and feature data of each type is extracted from the original file data based on the predefined data structure of that type, for example by segment extraction or block extraction with a certain length (for example, 12 bytes). The data information of the feature data is then determined according to the encoding rule corresponding to each type of feature data.
The data information is used to encode the corresponding feature data into a data frame. It may include at least the feature bytes in the feature data (such as reusable differential bytes or reusable repeated bytes), the encoded frame sequence corresponding to the feature data, the feature type corresponding to the feature data, the feature byte length, and the like.
Optionally, one type of feature data may be repeated byte data. For example, about 20% of the flash space on an automotive controller may be required to be reserved. This reserved space is usually padded according to a protocol standard, and the padding strategy is usually repeated data of 1 to 16 bytes. The assignment areas of initial variables in flash are likewise filled with large runs of continuous, regular repeated data. For example, the data segment FFFFFFFF… is formed by repeating the byte "FF", and the data segment 55AA55AA… is formed by repeating the bytes "55 AA".
Optionally, one type of feature data may be differential byte data. A target data segment in the original file data (all or part of a run of consecutive compiled bytes) is compiled as differential byte data according to a difference rule. The data compiled from the differential byte data occupies less byte space than the original data, which improves the compression rate.
Further, the difference rule may be: the differential byte data cannot be obtained by reference restoration (such as differential restoration or padding restoration) from the compiled data in the original file data; that is, it is new data that does not match any of the compiled data. When encoded, this part of the data can be encoded into a corresponding data frame based on the data information of the new data. In some embodiments, differential byte data that cannot be restored by reference to the compiled data may be referred to as the new added data type, and its data information may be referred to as first differential information.
The difference rule may also be: the differential byte data can be restored by a set differential calculation from a compiled data segment A in the original file data and the current differential byte segment a. That is, during decoding, a similar data segment (such as the compiled data segment A) can be found in the data already restored ahead of the differential byte data and used as a reference for restoration. The differential byte segment a is obtained by the first electronic device by comparing the compiled data segment A with the differential byte data byte by byte and computing their difference; it is one element of the data information corresponding to the differential byte data. The differential calculation may be a logical operation such as XOR, AND, or NOT. For example, XORing the data segment A with the differential byte data yields the differential byte segment a, and during restoration XORing the differential byte segment a with the data segment A yields the original differential byte data. In some embodiments, differential byte data of this kind may be referred to as the first differential data type, and its data information may be referred to as second differential information.
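The XOR case described above can be illustrated with a short sketch. The function name `xor_bytes` and the sample byte values are assumptions made for illustration only; the text does not prescribe them.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length segments."""
    if len(a) != len(b):
        raise ValueError("segments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical sample data: segment A is already compiled, target is the
# differential byte data that differs from A in a single byte.
segment_a = bytes.fromhex("10203040a0b0c0d0")
target = bytes.fromhex("10203041a0b0c0d0")

# Compression side: compute the differential byte segment a.
diff = xor_bytes(segment_a, target)

# Decompression side: XOR the reference with the difference to restore.
restored = xor_bytes(segment_a, diff)
```

When the two segments are similar, the difference consists mostly of zero bytes, which a general-purpose compressor can usually encode compactly.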
The difference rule may also be: the differential byte data can be copied directly from a compiled data segment B (e.g., n bytes of data) in the original file data. That is, target data that can be copied directly (the data segment B) can be found for the current differential byte data in the compiled data of the original file data, and the data segment B is identical to the differential byte data. The data information of this kind of differential byte data corresponds to the third differential information used in the encoding step described later.
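A minimal sketch of the direct-copy case follows. It assumes that only data located before the current segment may serve as a copy source, since only that data is available to a decoder at that point. The function name `find_copy_source` is hypothetical.

```python
from typing import Optional


def find_copy_source(data: bytes, start: int, length: int) -> Optional[int]:
    """Return the address of an identical earlier segment, or None."""
    segment = data[start:start + length]
    # Restrict the search to data[0:start] so the match lies entirely
    # in front of the segment being encoded.
    address = data.find(segment, 0, start)
    return None if address < 0 else address
```

For the input `b"HEADERxxxxHEADER"`, the six bytes starting at offset 10 can be copied from address 0, so only that address and the length would need to be recorded.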
Specifically, in one aspect, the first electronic device may perform repeated byte block extraction on the original file data. That is, it performs repeated byte calculation on all bytes contained in the original file data and determines at least one repeated byte block, where a repeated byte block is formed by repeating the repeated bytes. For example, the data segment FFFFFFFF… is formed by repeating the byte "FF".
The repeated byte calculation may set a byte number (e.g., two bytes) for comparing repeated bytes and compare the original file data cyclically, in units of that byte number, from the data start position to the data end position. For example, the current two bytes are extracted in order from the beginning of the original file data and compared with the bytes immediately preceding them. If they are the same, the next two bytes are obtained and the comparison step is performed again. If they are different, a repeated byte block composed of at least one repetition of the repeated bytes (e.g., repeated bytes made up of two bytes) is determined.
Then the next unit of the set byte number (e.g., two bytes) after the repeated byte block is obtained and the cyclic comparison continues in the same way: the current two bytes after the repeated byte block are extracted from the original file data and compared with the bytes immediately preceding them; if they are the same, the next two bytes are obtained and the comparison step is performed again; if they are different, another repeated byte block composed of at least one repetition of the repeated bytes is determined.
The original file data is traversed in this manner. By comparing cyclically in units of the set byte number, at least one repeated byte block can be determined in the original file data, which completes the repeated byte calculation over all bytes contained in the original file data. Once a repeated byte block is determined, the corresponding data information can be determined as well. For example, the data information of a repeated byte block includes at least the repeated bytes of the block, the padding length (i.e., the byte size of the repeated bytes), the first data length (i.e., the byte size of the repeated byte block), and the frame sequence corresponding to the repeated byte block.
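The cyclic comparison can be sketched roughly as follows. This is an illustrative reading of the procedure, not the patented implementation: the names `RepeatBlock` and `find_repeat_blocks` and the `min_repeats` threshold (which keeps very short runs from being reported) are assumptions that do not appear in the text.

```python
from typing import List, NamedTuple


class RepeatBlock(NamedTuple):
    offset: int        # start position of the block in the original data
    unit: bytes        # the repeated bytes, e.g. b"\x55\xAA"
    total_length: int  # byte size of the whole repeated byte block


def find_repeat_blocks(data: bytes, unit_len: int = 2,
                       min_repeats: int = 4) -> List[RepeatBlock]:
    """Scan from start to end, comparing each unit with the one before it."""
    blocks = []
    i, n = 0, len(data)
    while i + unit_len <= n:
        unit = data[i:i + unit_len]
        j = i + unit_len
        # Advance while the current unit equals the preceding unit.
        while j + unit_len <= n and data[j:j + unit_len] == unit:
            j += unit_len
        if (j - i) // unit_len >= min_repeats:
            blocks.append(RepeatBlock(i, unit, j - i))
            i = j      # continue after the repeated byte block
        else:
            i += 1     # no block here, move on by one byte
    return blocks
```

Each `RepeatBlock` carries the elements of data information listed above: the repeated bytes, the padding length (`len(unit)`), and the first data length (`total_length`). A frame sequence would be assigned when the blocks are encoded.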
On the other hand, the first electronic device may perform differential byte block comparison on the original file data. That is, it partitions all bytes contained in the original file data based on a preset block length to obtain at least two data blocks, determines from the at least two data blocks a target data block to be compared, and then performs differential comparison between the target data block and all data blocks before it to obtain the differential byte data of the target data block. For example, the original file data is partitioned with a block length of 1 KB and a cyclic differential comparison is performed: the second block is compared with the first block, the third block is compared with the previous two blocks, the fourth block is compared with the previous three blocks, and so on. The data information of the resulting differential byte data may include the encoded frame sequence corresponding to the feature data, the feature type corresponding to the feature data, the feature byte length, and the like.
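The partition-and-compare procedure might look like the following sketch. It assumes that "closest" means the fewest differing byte positions and that only blocks of equal length are compared; the text fixes neither choice. The names `split_blocks` and `best_reference` are hypothetical.

```python
from typing import List, Optional, Tuple


def split_blocks(data: bytes, block_len: int = 1024) -> List[bytes]:
    """Partition the data into consecutive blocks of block_len bytes."""
    return [data[i:i + block_len] for i in range(0, len(data), block_len)]


def best_reference(blocks: List[bytes],
                   target_index: int) -> Optional[Tuple[int, int]]:
    """Compare the target block with every earlier block.

    Returns (reference_index, differing_byte_count) for the closest
    earlier block, or None if there is no earlier block to compare with.
    """
    target = blocks[target_index]
    best = None
    for ref_index in range(target_index):
        ref = blocks[ref_index]
        if len(ref) != len(target):
            continue  # this sketch only compares equal-length blocks
        differing = sum(1 for x, y in zip(ref, target) if x != y)
        if best is None or differing < best[1]:
            best = (ref_index, differing)
    return best
```

Under these assumptions, a count of 0 would correspond to the direct-copy rule, a small count to the differential-calculation rule, and a large count (or `None`) to new data. Any threshold separating these cases would be an implementation choice.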
Step S102: Perform data encoding on each type of data information based on the encoding mode corresponding to the data information of each type of feature data, to obtain the data frame corresponding to each type of data information.
Specifically, after determining at least one type of feature data based on the original file data and further determining data information of each type of feature data, the first electronic device may perform data encoding on the data information of the feature data according to corresponding encoding modes based on preset encoding modes corresponding to each type of feature data, so as to obtain a data frame corresponding to the type of data information.
Specifically, when the feature data is repeated byte data, the first electronic device is preset with a first encoding mode corresponding to repeated byte data and encodes the data information of the repeated byte data according to the first encoding mode to obtain a first type data frame, that is, a data frame generated from the data information of the repeated byte data.
It is understood that the first encoding mode corresponds to the encoding structure of a data frame and to the data elements that must be filled into the components of that structure. For example, the first encoding mode may generate the first type data frame based on at least the padding length in the data information (i.e., the byte size of the repeated bytes), the first data length corresponding to the repeated bytes (i.e., the byte size of the repeated byte block), and the first order of the repeated bytes among all data frames (i.e., the frame sequence corresponding to the encoded repeated bytes).
In a specific implementation scenario, the first type data frame may consist of a frame header, a data segment, and a frame trailer. When encoding according to the first encoding mode, the frame header section may carry a header byte, the frame sequence, the frame length, and the frame type. For example, the header byte may be the fixed byte 0x55, the frame length is the length of the entire first type data frame, and the frame type may be a fixed byte assigned to the first type data frame, such as 0x02. The data segment carries the frame data, namely the repeated bytes, the first data length corresponding to the repeated bytes (i.e., the byte size of the repeated byte block), and the padding length (i.e., the byte size of the repeated bytes). The frame trailer may carry a check code for verifying the first type data frame, such as the CRC8 value of the frame data; in some embodiments it may carry both the check code and a fixed trailer byte, such as 0xAA.
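A first type data frame along these lines could be packed as in the sketch below. The fixed bytes 0x55, 0x02, and 0xAA come from the text. Everything else is an assumption made for illustration: the field order and widths, big-endian packing, the omission of a separate frame length field, and the CRC-8 parameters (polynomial 0x07, initial value 0), none of which the text specifies.

```python
import struct

FRAME_HEAD = 0x55   # fixed header byte from the text
FRAME_TAIL = 0xAA   # fixed trailer byte from the text
TYPE_REPEAT = 0x02  # frame type byte for first type data frames

_HEADER = ">BHBBI"  # assumed: head, sequence, type, padding length, data length


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first; polynomial and initial value are assumed."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def encode_repeat_frame(seq: int, unit: bytes, total_length: int) -> bytes:
    """Pack one repeated byte block into a frame."""
    body = struct.pack(_HEADER, FRAME_HEAD, seq, TYPE_REPEAT,
                       len(unit), total_length) + unit
    return body + bytes([crc8(body), FRAME_TAIL])


def decode_repeat_frame(frame: bytes) -> bytes:
    """Verify a frame and expand it back into the repeated byte block."""
    body, check, tail = frame[:-2], frame[-2], frame[-1]
    if tail != FRAME_TAIL or crc8(body) != check:
        raise ValueError("corrupt frame")
    size = struct.calcsize(_HEADER)
    head, _seq, ftype, unit_len, total_length = struct.unpack(_HEADER, body[:size])
    if head != FRAME_HEAD or ftype != TYPE_REPEAT:
        raise ValueError("not a repeated-byte frame")
    unit = body[size:size + unit_len]
    return (unit * (total_length // unit_len + 1))[:total_length]
```

With this assumed layout, a block of the two bytes `55 AA` is described by a 13-byte frame regardless of how long the block is, because only the repeated bytes and the two lengths are stored.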
Specifically, when the feature data is differential byte data, the first electronic device is preset with a second encoding mode corresponding to differential byte data and encodes the data information of the differential byte data according to the second encoding mode to obtain a second type data frame, that is, a data frame generated from the data information of the differential byte data.
It is understood that the second encoding mode likewise corresponds to the encoding structure of a data frame and to the data elements that must be filled into the components of that structure.
For example: when the differential byte information corresponding to the differential data is the first differential information, the second encoding mode may generate the second type data frame based on the differential data (e.g., the first differential data), the data length corresponding to the differential data, and the frame sequence of the differential data among all data frames.
For another example: when the differential byte information corresponding to the differential data is the second differential information, the second encoding method may be to perform data encoding based on the differential data (e.g., the second differential data), the frame sequence of all data frames corresponding to the differential data, the reference data of the differential data in the original file data (i.e., the data that can be restored by differential reference), the data address of the reference data, and the data length (i.e., the original data length corresponding to the encoded and restored frame data), so as to obtain a second type of data frame, where the second type of data frame is a data frame generated based on the data information corresponding to the differential byte data.
For another example: when the differential byte information corresponding to the differential data is the third differential information, the second encoding method may be to perform data encoding based on the differential data (e.g., the third differential data), the frame sequence corresponding to the differential data in all data frames, the data length corresponding to the differential information (i.e., the original data length corresponding to the frame data when encoding and restoring), and the data address of the differential data in the original file data, which can be directly copied, so as to obtain a second type data frame, where the second type data frame is a data frame generated based on the data information corresponding to the differential byte data.
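The three cases above differ mainly in which fields a frame must carry. The sketch below encodes only those payload fields and shows how a decoder could replay them against already restored data. The sub-type codes, field widths, and function names are assumptions; the header byte, check code, and trailer of a full frame are left out for brevity.

```python
import struct

KIND_NEW, KIND_XOR, KIND_COPY = 1, 2, 3  # hypothetical sub-type codes


def encode_new(seq: int, data: bytes) -> bytes:
    # First differential information: the new data itself and its length.
    return struct.pack(">BHI", KIND_NEW, seq, len(data)) + data


def encode_xor(seq: int, ref_addr: int, diff: bytes) -> bytes:
    # Second differential information: reference address, length, difference.
    return struct.pack(">BHII", KIND_XOR, seq, ref_addr, len(diff)) + diff


def encode_copy(seq: int, src_addr: int, length: int) -> bytes:
    # Third differential information: only the copy address and the length.
    return struct.pack(">BHII", KIND_COPY, seq, src_addr, length)


def apply_frame(frame: bytes, restored: bytearray) -> None:
    """Append the data described by one frame to the restored output."""
    kind = frame[0]
    if kind == KIND_NEW:
        _, _, length = struct.unpack(">BHI", frame[:7])
        restored.extend(frame[7:7 + length])
    elif kind == KIND_XOR:
        _, _, addr, length = struct.unpack(">BHII", frame[:11])
        ref = bytes(restored[addr:addr + length])
        restored.extend(r ^ d for r, d in zip(ref, frame[11:11 + length]))
    elif kind == KIND_COPY:
        _, _, addr, length = struct.unpack(">BHII", frame[:11])
        restored.extend(bytes(restored[addr:addr + length]))
    else:
        raise ValueError("unknown frame kind")
```

In this layout a copy frame has a fixed size of 11 bytes however long the copied segment is, which is where the saving for identical segments would come from.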
The above only summarizes the data encoding process, and the implementation process of specific data frame encoding can be referred to in the relevant part of the following implementation.
In some embodiments, the type of the feature data may be determined based on a difference rule, and different data frames may be generated based on different difference rules, which may be determined according to a practical application scenario.
Step S103: compressing the data frames corresponding to the various types of data information respectively to obtain a data compression packet corresponding to the original file data.
Specifically, mining the data characteristics of the original file data as described above, and extracting and encoding the corresponding feature data, allows the data characteristics of the original file data to be fully exploited, improves the encoding efficiency, and reduces the storage space occupied by the original file data during encoding. Compression algorithms are commonly divided into lossy compression and lossless compression; the compression here may use, for example, the Huffman algorithm, the LZW (Lempel-Ziv-Welch) compression algorithm, the LZR (Lempel-Ziv-Renau) compression method, and so forth.
In the embodiment of the application, a first electronic device obtains at least one type of feature data contained in original file data, obtains data information of each type of feature data, respectively performs data coding on each type of data information based on a coding mode corresponding to the data information of each type of feature data to obtain data frames corresponding to each type of data information, and compresses the data frames corresponding to each type of data information to obtain a data compression packet corresponding to the original file data. By acquiring at least one type of characteristic data (namely data compiling characteristics) obtained after subdividing and mining the original file data and then performing data encoding on the various types of characteristic data, the problem of low compression ratio when data compression is directly performed based on a fixed compression algorithm can be solved, the compression ratio of data compression is improved, and the data overhead (such as memory overhead, data transmission overhead and the like) is saved.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating another embodiment of a data compression method according to the present application. Specifically, the method comprises the following steps:
Step S201: acquiring at least one repeated byte in original file data, and carrying out segmentation calculation processing on the original file data based on the repeated byte to obtain repeated byte information corresponding to the repeated byte.
The repetition byte information comprises the repetition byte, a padding length, a first data length corresponding to the repetition byte, and a first sequence of the repetition byte in all data frames.
Specifically, the first electronic device may perform repeated-byte calculation on the original file data, that is, extract repeated bytes block by block (which may also be understood as repeated-byte block extraction) over all bytes contained in the original file data. The first electronic device may obtain at least one repeated byte in the original file data and perform segmentation calculation processing on the original file data based on the repeated byte. The calculation generally determines at least one repeated byte block in the original file data, where a repeated byte block is formed by repetitions of a repeated byte. For example, the data segment FFFFFFF … … is constructed by repeating the repeated byte "F". In other words, performing repeated-byte calculation on the original file data extracts the repeated byte data in the original file data and yields the repeated byte information corresponding to that data.
As shown in fig. 3, fig. 3 is a schematic diagram of the repeated-byte calculation. The calculation may set a byte number for the repeated-byte comparison (e.g., byte number A in fig. 3) and cyclically compare the original file data, from the data start position to the data end position, in units of that byte number (e.g., two bytes). For example, the current two bytes are extracted from the original file data in order from beginning to end and compared with the two bytes immediately preceding them. With byte number A, this comparison can be expressed as comparing the current comparison bytes "Byte X + A" with the reference bytes "Byte X", as follows:
if "Byte X + A" is the same as "Byte X", the number of repeated bytes is accumulated (i.e., "Count + A" in fig. 3) and the cyclic comparison continues with X = X + A: the next A bytes after the current bytes become the new "Byte X + A" and are compared with the new "Byte X", and this is repeated until "Byte X + A" and "Byte X" differ;
if "Byte X + A" is different from "Byte X", a repeated byte block composed of at least one repeated byte (e.g., the repeated bytes counted by the Count counter) is determined, and the repeated byte information corresponding to the repeated byte is then obtained based on the repeated byte corresponding to the repeated byte block, where the repeated byte information includes a padding length, a first data length corresponding to the repeated byte (i.e., the byte size of the repeated byte block), and a first order of the repeated byte in all data frames (i.e., the frame order corresponding to the repeated byte block);
then, the next "set byte number" (the next set byte number may be the same as or different from the previous round of byte number) after the repeated byte block is obtained, and the cyclic comparison is performed according to the above-mentioned manner, for example, the current two bytes (i.e. the byte number set in the current round) after the repeated byte block is extracted from the original file data are compared with the previous byte before the "two bytes", the same is to obtain the next byte of the current two bytes, the step of comparing the current two bytes with the previous byte before the "current two bytes" is performed, the difference is to determine another repeated byte block composed of at least one repeated byte (e.g. the repeated byte composed of two bytes), so as to obtain the repeated byte information corresponding to the repeated byte of the other repeated byte block;
based on the above manner, the original file data is traversed, and at least one repeated byte block can be determined in the original file data by performing cyclic comparison sequentially based on the set number of bytes, so that the process of performing repeated byte calculation on all bytes included in the original file data is completed, and after the repeated byte block is determined, repeated byte information corresponding to the repeated byte block can be determined accordingly, for example, the characteristic data "repeated byte block" data information at least includes a repeated byte corresponding to the repeated byte block, a padding length (i.e., a byte size of the repeated byte) corresponding to the repeated byte, a first data length (i.e., a byte size of the repeated byte block) corresponding to the repeated byte, and a first order of the repeated byte in all data frames.
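The cyclic comparison described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `find_repeated_blocks`, the parameters `unit` (byte number A) and `threshold` (frame length threshold Z), and the choice to advance by one unit when no run is found are all assumptions made for the example.

```python
def find_repeated_blocks(data: bytes, unit: int = 2, threshold: int = 8):
    """Return (start, pattern, total_length) for each run in which a
    `unit`-byte pattern repeats back to back for more than `threshold` bytes.

    `unit` plays the role of byte number A and `threshold` the role of the
    frame length threshold Z; both names are hypothetical.
    """
    if unit <= 0:
        raise ValueError("unit must be positive")
    runs = []
    x = 0
    while x + unit <= len(data):
        pattern = data[x:x + unit]      # "Byte X"
        count = unit                    # bytes covered by the run so far
        # Compare "Byte X + A" with "Byte X" and accumulate while they match.
        while data[x + count:x + count + unit] == pattern:
            count += unit
        if count > threshold:
            runs.append((x, pattern, count))
            x += count                  # continue after the repeated byte block
        else:
            x += unit                   # run too short: keep the raw bytes
    return runs


if __name__ == "__main__":
    sample = b"AB" + b"\xff" * 12 + b"CD"
    print(find_repeated_blocks(sample, unit=2, threshold=4))
```

Each returned tuple carries the elements of the repeated byte information other than the frame order: the repeated byte (pattern), the padding length (`len(pattern)`), and the first data length (`total_length`).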
Step S202: carrying out data encoding on the repeated byte information according to a first encoding mode corresponding to the repeated byte data to obtain at least one first type data frame of the original file data.
Specifically, the first electronic device may generate at least one first type data frame corresponding to the original file data based on each repetition byte information based on the determined repetition byte information. The method specifically comprises the following steps: the terminal is preset with a first coding mode corresponding to the repeated byte data, and data coding is carried out on data information of the repeated byte data according to the first coding mode, so that a first type data frame is obtained, wherein the first type data frame is a data frame generated correspondingly on the basis of the data information of the repeated byte data.
It is understood that the first encoding mode corresponds to an encoding structure of a frame data, and data elements required to be filled in the constituent elements in the encoding structure, such as: the first encoding manner may be to generate the first type frame data based on at least a padding length (i.e., a byte size of the repeated bytes) in the data information, a first data length (i.e., a byte size of the repeated byte block) corresponding to the repeated bytes, and a first order of the repeated bytes in all data frames (i.e., a frame order corresponding to the encoded repeated bytes).
In a specific implementation scenario, the first type data frame may be composed of a frame header, a data segment, and a frame trailer. When encoding according to the first encoding method, the frame header may be filled with a header identifier, the frame sequence, the frame length, and the frame type: for example, the header identifier may be the fixed byte "0x55", the frame length is the length of the entire first type data frame, and the frame type may be a fixed byte corresponding to the first type data frame, for example "0x02". The data segment is used to fill in the frame data, i.e., the repeated byte, the first data length corresponding to the repeated byte (i.e., the byte size of the repeated byte block), and the padding length (i.e., the byte size of the repeated byte). The frame trailer may be filled with a check code for checking the first type frame data, such as a CRC8 value of the frame data, and in some embodiments may be filled with both a check code and a fixed trailer byte, such as 0xAA.
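A first type data frame of this shape could be assembled roughly as below. The text fixes only the header byte 0x55, the example type byte 0x02, a CRC8 check value, and the trailer byte 0xAA; the field order, the field widths (2-byte sequence number, 2-byte frame length, 1-byte padding length, 4-byte data length), the big-endian byte order, and the CRC-8 polynomial 0x07 are assumptions chosen for illustration.

```python
import struct


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8 with initial value 0 (the polynomial is an assumption)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def encode_repeat_frame(sn: int, pattern: bytes, total_length: int) -> bytes:
    """Pack one repeated-byte frame: header | data segment | trailer."""
    # Data segment: padding length, first data length, repeated byte(s).
    payload = struct.pack(">BI", len(pattern), total_length) + pattern
    frame_len = 6 + len(payload) + 2          # header + payload + trailer
    # Header: 0x55, sequence number, frame length, frame type 0x02.
    header = struct.pack(">BHHB", 0x55, sn, frame_len, 0x02)
    body = header + payload
    # Trailer: CRC8 over header and payload, then the fixed byte 0xAA.
    return body + bytes([crc8(body), 0xAA])


if __name__ == "__main__":
    frame = encode_repeat_frame(sn=1, pattern=b"\xff", total_length=4096)
    print(frame.hex(), len(frame))
```

In this sketch a 4096-byte run of 0xFF shrinks to a 14-byte frame, which is the effect the first encoding mode aims for.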
In a possible implementation, when the total length of the repetition bytes included in the repetition byte information is greater than a frame length threshold, the first electronic device may generate at least one first type data frame based on the repetition byte information.
The total length of the repeated bytes may be the value of the Count counter in fig. 3, that is, the byte size of the repeated byte block, and the frame length threshold is the minimum byte size of a repeated byte block that is worth encoding as first type frame data, that is, the frame length threshold Z in fig. 3. It can be understood that, when the total length of the repeated bytes included in the repeated byte information is less than or equal to the frame length threshold, encoding the repeated byte data according to the first encoding method is usually inefficient, that is, the compression rate achieved on the original data is low, and the data length of the first type data frame may even exceed the first data length. In this case, the repeated byte data corresponding to the repeated bytes in the original file data can usually be retained as is.
Further, when the total length of the repeated bytes included in the repeated byte information is greater than the frame length threshold, encoding the repeated byte data according to the first encoding method is generally efficient, that is, the compression rate achieved on the original data is high and the length of the first type data frame is smaller than the first data length. In this case, the repeated byte data corresponding to the repeated byte in the original file data can be encoded according to the first encoding method to obtain at least one first type data frame of the original file data.
Step S203: segmenting the original file data according to a preset segmentation length to obtain at least two data blocks.
Specifically, the first electronic device performs differential comparison calculation on the original file data, that is, performs differential byte blocking comparison on the original file data, that is, blocks all bytes included in the original file data based on a preset block length to obtain at least two data blocks.
As shown in fig. 4, fig. 4 is a schematic diagram of a scene in which the original file data is blocked. The first electronic device is provided with a block length for blocking the original file data, which may be 4 KB, 2 KB, 1 KB, 512 B, and the like, and blocks the original file data according to the block length to obtain at least two data blocks. It can be understood that, after the blocking, the original file data corresponds to Block0, Block1, and so on.
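The blocking step can be sketched in a few lines. The function name `split_blocks` is hypothetical, and letting the last block be shorter than the block length is an assumption, since the text does not say how a trailing partial block is handled.

```python
from typing import List


def split_blocks(data: bytes, block_len: int = 4096) -> List[bytes]:
    """Split `data` into consecutive blocks Block0, Block1, ... of `block_len` bytes.

    The final block may be shorter when len(data) is not a multiple of block_len.
    """
    if block_len <= 0:
        raise ValueError("block_len must be positive")
    return [data[i:i + block_len] for i in range(0, len(data), block_len)]


if __name__ == "__main__":
    blocks = split_blocks(bytes(range(10)), block_len=4)
    print([len(b) for b in blocks])
```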
Step S204: acquiring a target data block from the at least two data blocks, and performing differential comparison processing on the target data block and all data blocks before the target data block to obtain differential byte data of the target data block.
Specifically, after the first electronic device blocks the original file data, a target data block to be currently compared, for example Block1, is determined from the at least two data blocks, and the target data block is then subjected to differential comparison with all data blocks before it. For example, if the current target data block is Block1, Block1 is differentially compared with the preceding data block "Block0"; if the current target data block is Block4, Block4 is differentially compared with the preceding data blocks "Block0-Block3". A differential comparison result of the target data block, that is, the differential byte data, is thereby obtained. The purpose of the differential comparison is to find, in the compiled original data before the current data block, a differential reference basis for the subsequent data (i.e., the current data block).
Step S205: acquiring differential byte information corresponding to the differential byte data, acquiring a next data block of the target data block, taking the next data block as the target data block, and executing the step of performing differential comparison processing on the target data block and all data blocks before the target data block.
And the differential byte information is used for carrying out frame data coding in a corresponding coding mode according to the coding type of the differential byte data. The differential byte information may be at least a characteristic byte (e.g., a reusable differential byte, a reproducible differential repeated byte, a newly added byte, etc.) in the differential byte data, an encoded frame sequence corresponding to the differential byte data, a characteristic type corresponding to the differential byte data, a characteristic byte length, etc.
Specifically, after obtaining the differential byte data corresponding to the target data block, the first electronic device analyzes the differential byte data, specifically, determines a data type based on the differential byte data to determine the data type of the differential byte data, and may obtain corresponding differential byte information based on the data type, which is specifically as follows:
for example, for the first data type, also called the newly added data type: when the differential byte data is analyzed, if the differential byte data cannot be obtained by reference restoration (for example, differential restoration or padding restoration) from all data before the target data block in the original file data, the data type of the differential byte data may be determined to be the first data type; the byte information corresponding to the differential byte data is then determined based on the first data type, such as: the first differential data (i.e., the newly added data bytes), the second order of the first differential data in all data frames (i.e., the frame order to which the first differential data corresponds when encoded), and the second data length corresponding to the first differential data.
For example, for the second data type, also called the first differential data type (non-duplicable type): when the differential byte data is analyzed, if the original data corresponding to the target data block can be restored, according to a set differential calculation method, from the "compiled data segment A" in all data before the target data block and the "current differential byte segment a", the data type of the differential byte data may be determined to be the second data type;
that is, the second electronic device can find an approximate data segment from the data (such as the compiled data segment a) which is restored before the "differential byte data" for reference restoration when decoding according to the set differential calculation mode; the differential byte segment a can be understood as a differential byte segment a obtained by the first electronic device through performing difference comparison processing on the compiled data segment a and the differential byte data in the original file data one by one, and calculating, wherein the differential byte segment is one element in data information corresponding to the differential byte data. Further, the differential calculation manner may be a logical operation, such as exclusive or, and, not, and so on. For example, the data segment a and the differential byte data are subjected to an exclusive or operation to obtain the differential byte segment a, and it can be understood that the original differential byte data can be obtained by performing an exclusive or operation based on the differential byte segment a and the data segment a during data reduction.
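The exclusive-or case mentioned above can be illustrated with a short round trip. This is only a sketch of the XOR variant; the helper name `xor_bytes` and the sample byte values are made up for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise exclusive-or of two equally long byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


if __name__ == "__main__":
    segment_a = b"\x10\x20\x30\x40"   # compiled data segment A (reference)
    original = b"\x10\x21\x30\x44"    # differential byte data of the current block
    # Encoding side: differential byte segment a = A XOR original.
    diff_a = xor_bytes(segment_a, original)
    # Decoding side: original = A XOR differential byte segment a.
    restored = xor_bytes(segment_a, diff_a)
    print(diff_a.hex(), restored == original)
```

Because XOR is its own inverse, the same operation serves for encoding and for restoration, and bytes that already match the reference become zero in the differential byte segment.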
At the same time, the byte information corresponding to the differential byte data is determined based on the second data type, such as: the second differential data (i.e., the differential byte segment a), the third order of the second differential data in all data frames, the first reference data of the second differential data in all data blocks before the target data block (i.e., the data segment A), the first data address corresponding to the first reference data (i.e., the data address of the data segment A), and the third data length corresponding to the first reference data. When data is restored, the reference differential length of the reference differential data can be determined based on the third data length, the reference data length corresponding to the first reference data, and the preset differential calculation manner, so that the data indicated by the reference differential length can be conveniently obtained from the first data address corresponding to the reference differential data.
For example, when the differential byte data is analyzed and processed, if the differential byte data can be directly copied and data-filled based on the compiled data segment B (e.g., n-byte data) in the original file data, that is, the current differential byte data can find the target data (i.e., the data segment B) that can be directly copied in the compiled data in the original file data, it can be understood that the target data (i.e., the data segment B) is completely consistent with the data of the differential byte data, and at this time, it can be determined that the differential byte data type is the third data type; and simultaneously determining byte information corresponding to the differential byte data based on the determined third data type, such as: third differential data (differential byte data of the current data block), a fourth sequence of the third differential data in all data frames, a fourth data length corresponding to the second reference data (i.e. an original data length corresponding to the third differential information), and a second data address of the third differential data in all data blocks before the target data block.
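The three data types described above can be told apart with a simple search over the data that precedes the target data block. The sketch below is one possible reading, not the patented procedure: the function name `classify_block`, the dictionary layout, the brute-force window scan, and the `min_ratio` similarity cut-off that decides when a differential reference is worth using are all assumptions.

```python
def classify_block(history: bytes, block: bytes, min_ratio: float = 0.5) -> dict:
    """Classify `block` against all preceding data `history`.

    Returns a dict whose "type" is "copy" (third data type), "diff"
    (second data type) or "new" (first data type).
    """
    n = len(block)
    # Third data type: an identical segment already exists and can be copied.
    address = history.find(block)
    if n and address != -1:
        return {"type": "copy", "address": address, "length": n}
    # Second data type: look for the most similar window of the same length.
    best_address, best_same = -1, 0
    for offset in range(len(history) - n + 1):
        window = history[offset:offset + n]
        same = sum(a == b for a, b in zip(window, block))
        if same > best_same:
            best_address, best_same = offset, same
    if n and best_same / n >= min_ratio:
        reference = history[best_address:best_address + n]
        diff = bytes(a ^ b for a, b in zip(reference, block))
        return {"type": "diff", "address": best_address, "length": n, "diff": diff}
    # First data type: nothing usable was found, so the bytes are newly added.
    return {"type": "new", "data": block, "length": n}


if __name__ == "__main__":
    history = b"abcdefgh"
    for candidate in (b"cdef", b"cdXf", b"WXYZ"):
        print(candidate, classify_block(history, candidate)["type"])
```

The scan is quadratic in the worst case; a practical encoder would likely index the history, but the text does not describe how the search is performed.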
Step S206: when the next data block does not exist, obtaining all the differential byte information in the original file data.
Specifically, when the first electronic device determines that the target data block is the last data block, the next data block does not exist, and at this time, the first electronic device may end the data block differential comparison processing, so as to obtain all the differential byte information in the original file data after the original file data is subjected to data block differential comparison.
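Steps S204 to S206 amount to a loop over the data blocks in which each target data block is analysed against everything that precedes it. A minimal sketch of that control flow follows; the function name `collect_differential_info` and the `analyse` callback (standing in for whatever differential comparison is used) are hypothetical.

```python
from typing import Callable, List


def collect_differential_info(blocks: List[bytes],
                              analyse: Callable[[bytes, bytes], object]) -> list:
    """Run `analyse(history, block)` for every block in order.

    `history` is the concatenation of all data blocks before the target block.
    The loop ends when there is no next data block.
    """
    infos = []
    history = bytearray()
    for block in blocks:
        infos.append(analyse(bytes(history), block))
        history += block          # the block becomes reference data for later blocks
    return infos


if __name__ == "__main__":
    blocks = [b"ab", b"cd", b"e"]
    # Toy analysis: report how much history each block can refer to.
    print(collect_differential_info(blocks, lambda h, b: (len(h), len(b))))
```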
Step S207: carrying out data encoding on the differential byte information according to a second encoding mode corresponding to the differential byte data to obtain at least one second type data frame of the original file data.
Specifically, the first electronic device may generate at least one second type data frame corresponding to the original file data based on each differential byte information based on the determined differential byte information. The method specifically comprises the following steps: the terminal is preset with a second coding mode corresponding to the differential byte data, and data coding is carried out on the differential byte information of the differential byte data according to the second coding mode, so that a second type data frame is obtained, wherein the second type data frame is a data frame generated correspondingly based on the differential byte information of the differential byte data.
It is understood that the second encoding method corresponds to an encoding structure of frame data, and data elements required to be filled in the constituent elements in the encoding structure, such as: the second encoding method may be to generate the second type frame data based on at least a padding length (i.e., a byte size of the differential data) in the data information, a data length corresponding to the differential data (i.e., a byte size of the differential byte block), and a frame order of the differential data in all data frames (i.e., a frame order corresponding to the differential byte to be encoded).
In a specific implementation scenario, the second type data frame may be composed of a frame header, a data segment, and a frame trailer. When encoding according to the second encoding method, the frame header may be filled with a header identifier, the frame sequence, the frame length, and the frame type: for example, the header identifier may be the fixed byte "0x55", the frame length is the length of the entire second type data frame, and the frame type may be a fixed byte corresponding to the second type data frame, for example "0x02". The data segment is used to fill in the frame data, i.e., the differential data, the data length corresponding to the differential data, and the padding length (i.e., the byte size of the differential data). The frame trailer may be filled with a check code for checking the second type frame data, such as a CRC8 value of the frame data, and in some embodiments may be filled with both a check code and a fixed trailer byte, such as 0xAA.
A frame structure of a data frame is as follows:
[Figure: frame structure of a data frame; image not reproduced]
In a specific embodiment, the differential byte information at least includes first differential information, second differential information, and third differential information, and the first electronic device performs data encoding on each type of differential byte information according to the second encoding method corresponding to the differential byte data to obtain at least one second type data frame of the original file data:
for example, data encoding is performed on the first differential information according to the second encoding mode corresponding to the differential byte data to obtain a newly added data frame of the original file data;
for example, data encoding is performed on the second differential information according to the second encoding mode corresponding to the differential byte data to obtain a first differential data frame of the original file data;
for example, data encoding is performed on the third differential information according to the second encoding mode corresponding to the differential byte data to obtain a second differential data frame of the original file data.
The specific implementation details are as follows:
The first electronic device generally associates the process of encoding the differential information (e.g., the first, second, and third differential information) with the process of performing differential comparison processing on a target data block. In one encoding manner of the second type data frame, once the differential comparison processing of a target data block has produced the corresponding differential information, the first electronic device encodes that differential information into a second type data frame in parallel: while encoding, it obtains the next data block of the target data block and processes it at the same time. In another encoding manner of the second type data frame, differential comparison processing is first performed on all target data blocks of the original file data (namely all data blocks contained in the original file data) to obtain the differential information corresponding to all target data blocks, and only after the differential comparison processing is complete is the differential information of all target data blocks encoded into second type data frames. The encoding process of the second type data frame based on the differential information of a target data block is explained in detail as follows:
taking an example that a target data block corresponds to first differential information, a first electronic device may obtain first differential information of the target data block, where the first differential information includes first differential data, a second order of the first differential data in all data frames, and a second data length corresponding to the first differential data, and generate a new data frame corresponding to original file data based on the first differential information;
the second order is determined based on the number of currently encoded data frames, and if the number of encoded data frames is n before the first differential information of the target data block is encoded, the second order is n + 1.
During encoding, a newly added data frame corresponding to the original file data is generated from the first differential information according to the frame structure of the second type data frame. Specifically, the first electronic device may fill a fixed byte (e.g., "0x55") in the frame header, fill the second order in the SN corresponding to the frame sequence, fill the length of the entire second type data frame in the frame length, fill the frame type corresponding to the first differential information in the frame type, fill the first differential data and the second data length in the frame data, and generate a frame check value and a frame end identifier and fill them in the frame end, thereby generating the newly added data frame corresponding to the original file data.
Taking an example in which a target data block corresponds to second differential information, the first electronic device may obtain the second differential information of the target data block, where the second differential information includes second differential data, a third order of the second differential data in all data frames, first reference data of the second differential data in all data blocks before the target data block, a first data address corresponding to the first reference data, and a third data length corresponding to the first reference data, and generate a first differential data frame of the original file data based on the third order, the first data address, the third data length, and the second differential data, where the second differential data is obtained by performing a logical operation on the first reference data and the original differential data corresponding to the second differential information;
the third order is determined based on the number of currently encoded data frames, and if the number of encoded data frames is n before the second differential information of the target data block is encoded, the third order is n + 1.
During encoding, a first differential data frame corresponding to the original file data is generated from the third order, the first data address, the third data length, and the second differential data according to the frame structure of the second type data frame. Specifically, the first electronic device may fill a fixed byte (e.g., "0x55") in the frame header, fill the third order in the SN corresponding to the frame sequence, fill the length of the entire second type data frame in the frame length, fill the frame type corresponding to the second differential information in the frame type, then fill the first data address, the third data length corresponding to the first reference data, and the second differential data in the frame data, and then generate a frame check value and a frame end identifier and fill them in the frame end, thereby generating the first differential data frame corresponding to the original file data.
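The frame data of a first differential data frame carries three items: the first data address, the third data length, and the second differential data. One way to serialise them is sketched below. The 4-byte big-endian fields, the function names, and the assumption that the second differential data has exactly the third data length are illustrative choices, not details given in the text.

```python
import struct


def pack_diff_payload(address: int, length: int, diff: bytes) -> bytes:
    """Serialise first data address, third data length and second differential data."""
    if len(diff) != length:
        raise ValueError("differential data must match the reference length")
    return struct.pack(">II", address, length) + diff


def unpack_diff_payload(payload: bytes):
    """Inverse of pack_diff_payload; returns (address, length, diff)."""
    address, length = struct.unpack_from(">II", payload, 0)
    diff = payload[8:8 + length]
    if len(diff) != length:
        raise ValueError("truncated frame data")
    return address, length, diff


if __name__ == "__main__":
    payload = pack_diff_payload(4096, 4, b"\x00\x01\x00\x04")
    print(payload.hex(), unpack_diff_payload(payload))
```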
During decoding, the decoding end can determine the starting position of first reference data in all original file data before the first differential data frame through the first data address, and obtain data with a third data length, namely the first reference data, from the starting position, and then the decoding end can perform logical operation based on the first reference data and the second differential data, wherein the logical operation is the inverse operation of a set differential calculation mode during encoding, for example, the first reference data and the original data of a target data block perform an exclusive-or operation during encoding to obtain second differential data, and the first reference data and the second differential data perform an exclusive-or operation during decoding to obtain the original data of the target data block.
Taking an example that a target data block corresponds to third differential information, a first electronic device may obtain third differential information of the target data block, where the third differential information includes third differential data, a fourth order of the third differential data in all data frames, a fourth data length corresponding to the second reference data, and a second data address of the third differential data in all data blocks before the target data block, and generate a second differential data frame of the original file data based on the fourth frame order, the fourth data length, and the second data address.
During encoding, a second differential data frame corresponding to the original file data is generated from the fourth order, the fourth data length, and the second data address according to the frame structure of the second type data frame. Specifically, the first electronic device may fill a fixed byte (e.g., "0x55") in the frame header, fill the fourth order in the SN corresponding to the frame sequence, fill the length of the entire second type data frame in the frame length, fill the frame type corresponding to the third differential information in the frame type, fill the second data address and the fourth data length corresponding to the second reference data in the frame data, and generate a frame check value and a frame end identifier and fill them in the frame end, thereby generating the second differential data frame corresponding to the original file data.
During decoding, the decoding end may determine a start position of second reference data in all original file data before the second differential data frame based on the second data address, and obtain data of a fourth data length from the start position, where the data is the second reference data, and according to a definition of the third differential information in some embodiments, the second reference data is original data of the target data block.
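As a rough sketch of this address-plus-length scheme, the Python below packs the second data address and the fourth data length into a frame data segment and then restores the block by copying from the already restored data. The field widths (two 4-byte big-endian unsigned integers) and all function names are assumptions made for illustration; the description does not fix them.

```python
import struct

# Hypothetical frame data layout: 4-byte address, then 4-byte length (big-endian).
PAYLOAD = struct.Struct(">II")


def encode_second_differential_payload(address: int, length: int) -> bytes:
    """Pack the second data address and the fourth data length."""
    return PAYLOAD.pack(address, length)


def decode_second_differential(restored: bytes, payload: bytes) -> bytes:
    """Copy the referenced range out of the data restored so far."""
    address, length = PAYLOAD.unpack(payload)
    block = restored[address:address + length]
    if len(block) != length:
        raise ValueError("reference range lies outside the restored data")
    return block


restored = b"ABCDEFGHABCD"
payload = encode_second_differential_payload(2, 4)
assert decode_second_differential(restored, payload) == b"CDEF"
```

No differential bytes travel in such a frame: the target block is identical to the reference data, so eight bytes of payload stand in for a block of any length.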
In a possible implementation, the first electronic device is provided with a plurality of preset block lengths. By performing steps S203 to S207 in turn for each preset block length, it obtains the "at least one second type data frame of the original file data" corresponding to each block length. The first electronic device may then determine the encoding scheme with the highest coding efficiency, that is, determine the most suitable block length and select the "at least one second type data frame of the original file data" corresponding to that block length. The specific steps are as follows:
1. The first electronic device determines, based on the plurality of preset block lengths, the coding efficiency of the at least one second type data frame corresponding to each preset block length.
In a specific implementation, the first electronic device performs steps S203 to S207 in turn for each of the plurality of preset block lengths to obtain the "at least one second type data frame of the original file data" corresponding to each block length, and then calculates the coding efficiency of the at least one second type data frame obtained for each block length. The coding efficiency may be calculated as the ratio of the memory size of the original data before encoding with that block length to the total memory size of the at least one second type data frame. In this way, the coding efficiency of the at least one second type data frame corresponding to each preset block length is obtained.
2. The first electronic device determines the highest coding efficiency among the coding efficiencies and determines the at least one target second type data frame indicated by the highest coding efficiency.
In a specific implementation, based on the coding efficiency obtained for each preset block length, the first electronic device determines the highest coding efficiency, then determines the target block length indicated by the highest coding efficiency, and finally determines the at least one target second type data frame corresponding to the target block length.
3. The at least one target second type data frame is taken as the at least one second type data frame of the original file data.
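The selection procedure above can be sketched as follows. The efficiency formula (original size divided by total frame size) follows the description; everything else is an assumption for illustration, in particular `toy_encode`, which is a hypothetical stand-in for steps S203 to S207 and not the encoder of this application.

```python
from typing import Callable, Iterable, List, Tuple

Encoder = Callable[[bytes, int], List[bytes]]


def coding_efficiency(original: bytes, frames: List[bytes]) -> float:
    """Ratio of the original data size to the total size of the encoded frames."""
    total = sum(len(frame) for frame in frames)
    return len(original) / total if total else 0.0


def pick_block_length(original: bytes, block_lengths: Iterable[int],
                      encode: Encoder) -> Tuple[int, float, List[bytes]]:
    """Encode once per preset block length and keep the most efficient result."""
    best = None
    for block_length in block_lengths:
        frames = encode(original, block_length)
        efficiency = coding_efficiency(original, frames)
        if best is None or efficiency > best[1]:
            best = (block_length, efficiency, frames)
    if best is None:
        raise ValueError("no preset block lengths were supplied")
    return best


def toy_encode(data: bytes, block_length: int) -> List[bytes]:
    """Hypothetical encoder: a repeated block costs 2 bytes, a new block len + 1."""
    seen = set()
    frames = []
    for start in range(0, len(data), block_length):
        block = data[start:start + block_length]
        if block in seen:
            frames.append(b"\x81\x00")      # placeholder reference frame
        else:
            seen.add(block)
            frames.append(b"\x00" + block)  # placeholder literal frame
    return frames


target_length, efficiency, frames = pick_block_length(
    b"ABCD" * 8, [3, 4, 8, 16], toy_encode)
```

With this toy encoder and input, a block length that does not divide the repeating pattern finds few repeated blocks, which is why trying several preset lengths can pay off.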
It should be noted that the following two procedures have no fixed order: performing the repeated byte calculation on the original file data, extracting the repeated byte data in the original file data, and acquiring the repeated byte information corresponding to the repeated byte data (i.e., performing steps S201 to S202); and performing the differential comparison calculation on the original file data, extracting the differential byte data in the original file data, and acquiring the differential byte information corresponding to the differential byte data (i.e., performing steps S203 to S207). The first electronic device may execute "steps S201 to S202" and "steps S203 to S207" in parallel, or may execute them in series.
Further, when the first electronic device executes "steps S201 to S202" and "steps S203 to S207" in series, it may execute steps S201 to S202 first.
Preferably, once steps S201 to S202 have been performed, the at least one first type data frame of the original file data has generally been obtained. To further improve the encoding efficiency or the encoding compression rate of the original file data frames, the first electronic device may filter out the repeated byte data that has already been encoded. That is, the first electronic device filters the repeated byte data out of the original file data to obtain target file data, and then performs the differential comparison calculation on the target file data, extracts the differential byte data in the target file data, and obtains the differential byte information corresponding to that differential byte data. In other words, the first electronic device performs steps S203 to S207 with the target file data taking the place of the original file data.
Step S208: acquire a target compression algorithm corresponding to the original file data, and compress the various data frames based on the target compression algorithm to obtain a data compression packet corresponding to the original file data.
The target compression algorithm is the algorithm used to compress the various types of data frames after data mining has been performed on the data characteristics of the original file data and those data frames have been re-encoded. Data compression can be understood as reducing the data volume without losing useful information, so as to reduce storage space and improve transmission, storage, and processing efficiency. In practical applications, the encoded data frames can be compressed with a relevant compression algorithm, which reduces data redundancy and storage space. Compression algorithms are commonly divided into lossy compression and lossless compression; lossless algorithms include the Huffman algorithm, the LZW (Lempel-Ziv-Welch) compression algorithm, the LZR (LZ-Renau) compression method, and so forth.
In a possible implementation, the first electronic device may compress the various types of data frames with each of a plurality of set compression algorithms, generating a compressed data packet for each compression algorithm.
The first electronic device then determines the optimal compressed data packet among them. It determines the compression ratio corresponding to each compression algorithm based on the compressed data packets, where the compression ratio may be calculated as the ratio of the data memory size of the original file data to the packet memory size of the compressed data packet, or as the ratio of the total memory size of the various data frames to the packet memory size of the compressed data packet, and so on.
The first electronic device determines the highest compression ratio among the compression ratios, takes the compression algorithm indicated by the highest compression ratio as the target compression algorithm, and takes the data compression packet produced by the target compression algorithm as the data compression packet corresponding to the original file data.
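A minimal sketch of this "compress with every candidate, keep the best ratio" selection is shown below. The description names Huffman, LZW, and LZR as examples; this sketch swaps in the codecs from the Python standard library (`zlib`, `bz2`, `lzma`) purely so that it runs without extra dependencies. The function name `pick_compression` and the second ratio definition (total frame size over packet size) are the choices assumed here.

```python
import bz2
import lzma
import zlib

# Candidate algorithms (stand-ins for the algorithms named in the description).
COMPRESSORS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
DECOMPRESSORS = {"zlib": zlib.decompress, "bz2": bz2.decompress,
                 "lzma": lzma.decompress}


def pick_compression(frames_blob: bytes, compressors=COMPRESSORS):
    """Return (algorithm name, compression ratio, packet) with the highest ratio.

    The ratio is the size of the concatenated data frames divided by the
    size of the compressed packet.
    """
    if not frames_blob:
        raise ValueError("nothing to compress")
    results = {}
    for name, compress in compressors.items():
        packet = compress(frames_blob)
        results[name] = (len(frames_blob) / len(packet), packet)
    best = max(results, key=lambda name: results[name][0])
    ratio, packet = results[best]
    return best, ratio, packet


blob = b"\x55\x00\x01" * 500          # placeholder for the encoded data frames
name, ratio, packet = pick_compression(blob)
assert DECOMPRESSORS[name](packet) == blob
```

Recording the winning algorithm's name alongside the packet matters, because the receiver must apply the matching decompression algorithm.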
In a feasible implementation, the first electronic device may obtain a preset compression algorithm corresponding to the original file data; that is, the first electronic device is provided with a default compression algorithm. The preset compression algorithm may be set manually by a user on the first electronic device, or may be an optimal compression algorithm determined by analyzing, with big data techniques, sample data collected in an actual environment (such as the compression rates corresponding to the respective compression algorithms); that optimal compression algorithm is then used as the preset compression algorithm.
Step S209: acquire a data frame decoding file and a decompression file of the target compression algorithm, and generate a data encapsulation packet based on the data compression packet, the decompression file, and the data frame decoding file.
Specifically, the first electronic device obtains the data frame decoding file corresponding to the data frame decoding of the original file data. The data frame decoding file contains the decoding rules for decoding the various data frames into the original file data and generally exists in the form of a frame decoding protocol. The first electronic device also obtains the decompression file of the target compression algorithm, which may be a decoding algorithm corresponding to the target compression algorithm, a decoding rule, or the like. The first electronic device then encapsulates the data compression packet, the decompression file, and the data frame decoding file to obtain the data encapsulation packet.
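The description does not specify the container format of the data encapsulation packet. One simple possibility, shown here only as an assumed sketch, is to prefix each of the three parts with a 4-byte big-endian length so that the receiver can split them again; the function names `encapsulate` and `decapsulate` are hypothetical.

```python
import struct

LENGTH_PREFIX = struct.Struct(">I")  # assumed 4-byte big-endian length field


def encapsulate(compression_packet: bytes, decompression_file: bytes,
                frame_decoding_file: bytes) -> bytes:
    """Concatenate the three parts, each preceded by its length."""
    parts = (compression_packet, decompression_file, frame_decoding_file)
    return b"".join(LENGTH_PREFIX.pack(len(part)) + part for part in parts)


def decapsulate(package: bytes):
    """Split a package built by `encapsulate` back into its three parts."""
    parts = []
    offset = 0
    for _ in range(3):
        if offset + LENGTH_PREFIX.size > len(package):
            raise ValueError("truncated package: missing length field")
        (size,) = LENGTH_PREFIX.unpack_from(package, offset)
        offset += LENGTH_PREFIX.size
        part = package[offset:offset + size]
        if len(part) != size:
            raise ValueError("truncated package: missing payload bytes")
        parts.append(part)
        offset += size
    return tuple(parts)


package = encapsulate(b"<packet>", b"<decompress rules>", b"<frame rules>")
assert decapsulate(package) == (b"<packet>", b"<decompress rules>", b"<frame rules>")
```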
In the embodiment of the application, a first electronic device obtains at least one type of feature data contained in original file data and obtains data information of each type of feature data; it encodes each type of data information using the coding mode corresponding to that data information to obtain the corresponding data frames, and compresses those data frames to obtain a data compression packet corresponding to the original file data. Because the original file data is first subdivided and mined to obtain at least one type of feature data (namely data compiling characteristics), and each type of feature data is then encoded, the low compression ratio that results from compressing data directly with a fixed compression algorithm can be avoided, the compression ratio of data compression is improved, and data overhead (such as memory overhead and data transmission overhead) is saved. A corresponding coding mode is adopted for the data information of each type of feature data (such as repeated byte information and differential information), which makes maximum use of the feature data of the original file data and greatly improves the reuse rate of already encoded data (namely encoded bytes or data segments) when the original file data is encoded. The repeated byte data in the original file data can be encoded and then filtered out, and the differential data can then be encoded based on the filtered target file data, which further improves the encoding efficiency or the encoding compression ratio of the original file data frames. In addition, the optimal compression algorithm among several different compression algorithms for compressing the various data frames can be determined, so that the data compression packet produced by the optimal compression algorithm is obtained and the compression rate is further improved.
In one embodiment, as shown in fig. 5, a data decompression method is proposed. The method can be implemented by a computer program and can run on a data decompression device based on the von Neumann architecture. The computer program may be integrated into an application or may run as an independent tool application. For convenience of explanation, the data decompression device is described in detail below as the second electronic device.
Specifically, the data decompression method includes:
Step S301: acquire a data compression packet.
Specifically, the second electronic device may establish a communication connection with the first electronic device in advance, based on the communication connection, the second electronic device may send an acquisition request to the first electronic device for the data compression packet, and the first electronic device sends the data compression packet to the second electronic device in response to the acquisition request; or after the communication connection is established, the first electronic device may actively send a data compression packet to the second electronic device based on the communication connection, and at this time, the second electronic device may obtain the data compression packet pushed by the first electronic device.
The established communication connection may be a communication long connection or a communication short connection.
A long connection means that multiple packets can be sent continuously over one connection; while the connection is held, if no packet is being sent, both sides need to send link check packets.
The operation steps of a long connection are: establish connection, data transmission (connection maintained).
A short connection means that a connection is established when the two communicating parties have data to exchange and is closed after the data transmission is completed; that is, each connection carries only one service transmission.
The operation steps of a short connection are: establish connection, data transmission, close connection.
Long connections are often used for frequent, point-to-point communication, where the number of connections cannot be too large. Each TCP connection requires a three-way handshake, which takes time; if every operation used a short connection and had to reconnect before operating, the processing speed would be greatly reduced. With a long connection, the connection is not closed after each operation, and the next data packet can be sent directly without establishing a new TCP connection. For example, database connections may use long connections: frequent communication over short connections can cause socket errors, and frequent socket creation also wastes resources.
HTTP services such as web sites generally use short connections, because long connections consume certain resources on the server side (e.g., the first electronic device); for web sites with thousands or even billions of frequently connecting clients, short connections save resources. Therefore, when concurrency is high but each user does not need to operate frequently, short connections are preferable.
It should be noted that the communication connection established between the first electronic device and the second electronic device may be a communication long connection or a communication short connection, and is not limited herein.
Step S302: decompress the data compression packet to obtain at least one type of data frame.
Specifically, after acquiring the data compression packet, the second electronic device decompresses it to obtain the at least one type of data frame contained in the data compression packet. The decompression may use a default decompression algorithm, or may use the decompression algorithm corresponding to the compression algorithm that was used when the data compression packet was compressed.
The decompression algorithm includes, but is not limited to, the Huffman decompression algorithm, the LZW (Lempel-Ziv-Welch) decompression algorithm, the LZR (LZ-Renau) decompression method, and the like.
According to some embodiments, the second electronic device may decompress the data compression packet to obtain at least one first type data frame and at least one second type data frame.
The first type data frame is obtained by the first electronic device by performing data encoding on data information contained in repeated byte data in original file data.
The second type data frame is obtained by the first electronic device performing data encoding on data information contained in differential byte data in original file data.
The above embodiments may be referred to for the process of encoding the first type data frame and the second type data frame and the associated concepts, and detailed description is omitted here.
Step S303: perform data decoding on the at least one type of data frame based on the decoding mode corresponding to each type of data frame in the at least one type of data frame, to obtain the original file data corresponding to the data compression packet.
In the embodiment of the present application, each type of data frame is decoded based on the decoding manner of that type, so that the original data corresponding to that type of data frame is obtained. It can be understood that after all types of data frames contained in the data compression packet have been decoded, the second electronic device obtains the original file data corresponding to the data compression packet.
According to some embodiments, the first type data frame is encoded in a first encoding manner, and the corresponding decoding process is an inverse process of the encoding, and the second electronic device may decode the first type data frame based on the decoding manner corresponding to the first encoding manner. The decoding recovery is performed based on the repetition byte information in the first type data frame, wherein the repetition byte information comprises the repetition byte, the padding length, the first data length corresponding to the repetition byte, and the first sequence of the repetition byte in all data frames.
According to some embodiments, the second type data frame is encoded in a second encoding manner, and the corresponding decoding process is an inverse process of the encoding, and the second electronic device may decode the second type data frame based on the decoding manner corresponding to the second encoding manner. For example, the decoding is performed by adopting a corresponding decoding mode according to different data frames in the second type data frames.
For a detailed explanation of the data decoding performed by the second electronic device on the at least one type of data frame (e.g., the first type of data frame, the second type of data frame), the following explanations may be referred to.
In this embodiment of the present application, a second electronic device obtains a data compression packet, decompresses it to obtain at least one type of data frame, and decodes the at least one type of data frame based on the decoding manner corresponding to each type of data frame to obtain the original file data corresponding to the data compression packet. Because the data compression packet was re-encoded based on at least one type of feature data of the original file data and has a high compression rate, it can be acquired with small data overhead (such as memory overhead and data transmission overhead), and the original file data can be restored simply by decoding the data frames it contains. Data overhead is therefore reduced over the whole process of acquiring the original file data, the time needed to acquire the original file data is shortened, data acquisition efficiency is improved, and device resources are saved.
Referring to fig. 6, fig. 6 is a schematic flowchart illustrating another embodiment of a data decompression method according to the present application. Specifically, the method comprises the following steps:
Step S401: acquire a data encapsulation packet, and decapsulate the data encapsulation packet to obtain a data compression packet, together with a decompression file and a data frame decoding file corresponding to the data compression packet.
The decompression file may be a decoding algorithm, a decoding rule, a decoding file, etc. corresponding to the compression algorithm in the process of compressing the data compression packet by using the corresponding compression algorithm. In some embodiments, the decompression file may be a decompression software driver integrated with a function of decompressing the data compression packet, and the second electronic device may initialize the decompression software driver by downloading and installing the decompression file, so that the second electronic device may decompress the data compression packet quickly based on the decompression software driver in practical application, thereby obtaining at least one type of data frame after decompression processing.
The data frame decoding file contains decoding rules for decoding various types of data frames into original file data, and the frame decoding file usually exists in the form of a frame decoding protocol, such as decoding rules for a first type of data frame, decoding rules for a second type of data frame, and the like. In some embodiments, the data frame decoding file may be a de-differential driver integrated with a function of decoding various types of data frames, and the second electronic device may initialize the de-differential driver by downloading and installing the data frame decoding file, so that the second electronic device may decode various types of data frames based on the de-differential driver in practical application, thereby obtaining original data corresponding to at least one type of data frames after decoding processing.
Step S402: decompress the data compression packet to obtain at least one type of data frame.
Specifically, the second electronic device may decompress the data compression packet based on the decompression file to obtain the at least one type of data frame corresponding to the data compression packet, such as a first type data frame and a second type data frame. In practical application, the second electronic device caches the decompression file in a corresponding storage space through a corresponding download service (e.g., a UDS service), installs the decompression file to initialize the decompression software driver corresponding to it, and then calls the decompression software driver to decompress the data compression packet, thereby obtaining the at least one type of data frame corresponding to the data compression packet, such as a first type data frame and a second type data frame.
In a specific implementation scenario, after the second electronic device acquires the at least one type of data frame, it may perform a frame data validity check on each data frame of each type (e.g., the first type data frames and the second type data frames).
Specifically, the second electronic device obtains the frame sequence and the frame check value of each data frame in the at least one type of data frame and performs a data check based on them. On the one hand, based on the frame sequence of each data frame, it detects whether any data frame is missing from the downloaded frame data, that is, it checks the frame integrity of all the frame data. On the other hand, it uses the frame check value to check the validity of the corresponding frame data. The check may be implemented with a set check algorithm, such as a parity check algorithm, the BCC (block check character) XOR check method, a cyclic redundancy check (CRC), an MD check algorithm, and the like. The second electronic device computes a check result over the frame data according to the check algorithm and compares it with the frame check value (such as a CRC code) carried in the frame data: if they are consistent, the frame data is intact; if they are inconsistent, the frame data is erroneous.
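The two checks can be sketched as below. The sketch assumes that frame sequence numbers are consecutive integers with known first and last values, and it uses CRC-32 from Python's `zlib` module as one possible cyclic redundancy check; the description does not say which CRC variant or field width is used, and the function names are hypothetical.

```python
import zlib
from functools import reduce
from typing import Iterable, List


def missing_sequence_numbers(received: Iterable[int], first: int,
                             last: int) -> List[int]:
    """Frame integrity: list the sequence numbers that were never received."""
    return sorted(set(range(first, last + 1)) - set(received))


def bcc_xor(data: bytes) -> int:
    """Block check character: XOR of all bytes in the frame data."""
    return reduce(lambda acc, byte: acc ^ byte, data, 0)


def frame_is_valid(frame_data: bytes, check_value: int,
                   algorithm: str = "crc32") -> bool:
    """Recompute the check value and compare it with the one carried in the frame."""
    if algorithm == "crc32":
        computed = zlib.crc32(frame_data)
    elif algorithm == "bcc":
        computed = bcc_xor(frame_data)
    else:
        raise ValueError(f"unsupported check algorithm: {algorithm}")
    return computed == check_value


assert missing_sequence_numbers([1, 2, 4, 5], first=1, last=5) == [3]
assert frame_is_valid(b"123456789", 0xCBF43926)        # standard CRC-32 check value
assert not frame_is_valid(b"123456780", 0xCBF43926)    # one corrupted byte
```

A BCC is cheaper to compute but misses many multi-byte errors that a CRC detects, which is one reason a device might support more than one check algorithm.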
Specifically, after the second electronic device passes the validity check of the frame data of all types, the second electronic device may generally perform data decoding on the frame data of all types based on the data frame decoding file, specifically, perform de-difference processing on at least one first type data frame and at least one second type data frame, and obtain, through the de-difference processing, original file data corresponding to the frame data of all types after the de-difference processing.
Step S403: acquire a target data frame from among the at least one first type data frame and the at least one second type data frame based on the data frame decoding file.
The data frames of all types are usually decoded serially, that is, each data frame is decoded in turn according to its frame sequence. In each round of data frame decoding performed by the second electronic device, the target data frame is the data frame to be decoded in the current round.
In practical application, the second electronic device may determine the decoding mode or decoding rule for each data frame based on the data frame decoding file, and then determine the current target data frame to be decoded based on the frame sequence of the "at least one first type data frame and at least one second type data frame". For example, if the second electronic device has completed decoding the data frame whose frame sequence is n, it obtains the data frame whose frame sequence is n + 1 as the target data frame and decodes it using the decoding mode corresponding to that target data frame.
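The serial, frame-sequence-driven loop can be sketched as follows. The `Frame` tuple, the handler table keyed by frame type, and the `append_literal` handler are assumptions made for illustration; in the description, the rule for each frame type would come from the data frame decoding file.

```python
from typing import Callable, Dict, List, NamedTuple


class Frame(NamedTuple):
    sn: int          # frame sequence number
    frame_type: int  # fixed byte identifying the frame type
    data: bytes      # frame data segment


Handler = Callable[[bytearray, Frame], None]


def decode_in_order(frames: List[Frame], handlers: Dict[int, Handler]) -> bytes:
    """Decode frames one by one in ascending frame sequence (n, then n + 1)."""
    restored = bytearray()
    expected = None
    for frame in sorted(frames, key=lambda f: f.sn):
        if expected is not None and frame.sn != expected:
            raise ValueError(f"expected frame {expected}, got frame {frame.sn}")
        handler = handlers.get(frame.frame_type)
        if handler is None:
            raise ValueError(f"no decoding rule for frame type {frame.frame_type:#04x}")
        handler(restored, frame)   # each handler extends the restored data
        expected = frame.sn + 1
    return bytes(restored)


def append_literal(restored: bytearray, frame: Frame) -> None:
    """Hypothetical rule: the frame data is the original data itself."""
    restored.extend(frame.data)


frames = [Frame(2, 0x00, b"lo"), Frame(1, 0x00, b"hel")]
assert decode_in_order(frames, {0x00: append_literal}) == b"hello"
```

Serial decoding is required by frames that refer to a data address in the previously restored data: such a reference only resolves once every earlier frame has been decoded.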
Step S404: perform de-difference processing on the target data frame to obtain an original data frame corresponding to the target data frame.
Specifically, the processing depends on the type of the target data frame. In a specific implementation, the second electronic device may obtain the fixed byte representing the type of the target data frame from the "frame type" included in the frame header of the target data frame; for example, "0x02" represents the first type data frame. The corresponding de-difference processing is then performed, for example, first de-difference processing on a first type data frame and second de-difference processing on a second type data frame, thereby obtaining the original data frame corresponding to the target data frame. The specific steps are as follows:
1. When the target data frame is the first type data frame, the second electronic device may, based on the data frame decoding file, perform first de-difference processing on the first type data frame according to the decoding manner specified by the data frame decoding file, so as to obtain original padding data; that is, the original data obtained by first de-difference processing of the first type data frame is the original padding data.
Specifically, in the decoding process of the first type data frame, that is, the inverse process of the encoding of the first type data, the second electronic device may perform frame data reduction on the relevant byte data (such as repeated byte information) based on the relevant byte data in the first type data frame, so as to obtain the original padding data corresponding to the target data frame.
Further, the second electronic device may obtain repeat byte information corresponding to the target data frame, where the repeat byte information includes the repeat byte, a padding length, a first data length corresponding to the repeat byte, and a first order of the repeat byte in all data frames;
for example, according to the encoding process of the first type data frame and the frame structure of the first type data frame mentioned in some embodiments, the second electronic device may obtain the repetition byte, the padding length, and the first data length corresponding to the repetition byte in the "data segment".
Then, first de-difference processing is performed based on the repeated byte information to obtain the original padding data corresponding to the target data frame. Specifically, the second electronic device may restore the original data corresponding to the first type data frame (that is, restore the target data block that was encoded into this frame) based on the first data length (that is, the byte size of the original repeated byte block at encoding time) and the padding length (that is, the byte size of the repetition byte). For example, if the repetition byte is "F", the padding length is "1", and the first data length is "10", the second electronic device only needs to repeat the repetition byte up to the length indicated by the first data length, that is, copy "F" 10 times in succession to obtain "FFFFFFFFFF". This "FFFFFFFFFF" is the original padding data that was encoded into the first type data frame. The second electronic device then appends this segment of data after the data already restored from all data frames preceding the first type data frame, which completes the frame data restoration for a target data frame that is a first type data frame.
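The restoration of a first type data frame amounts to run-length expansion, which can be sketched as below. The function name and the decision to truncate the last copy when the first data length is not a multiple of the padding length are assumptions made for this example.

```python
def decode_repeat_frame(restored: bytearray, repeat: bytes,
                        padding_length: int, data_length: int) -> None:
    """Append `repeat`, copied until `data_length` bytes have been produced.

    repeat:         the repetition byte (or byte unit) carried in the frame
    padding_length: byte size of the repetition unit
    data_length:    first data length, i.e. size of the original repeated block
    """
    if padding_length <= 0 or len(repeat) != padding_length:
        raise ValueError("padding length must match the size of the repetition unit")
    if data_length < 0:
        raise ValueError("data length must not be negative")
    copies = -(-data_length // padding_length)   # ceiling division
    restored.extend((repeat * copies)[:data_length])


# Example from the text: repetition byte "F", padding length 1, data length 10.
restored = bytearray(b"\x01\x02")                # data restored from earlier frames
decode_repeat_frame(restored, b"F", 1, 10)
assert bytes(restored) == b"\x01\x02" + b"FFFFFFFFFF"
```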
Specifically, the processing depends on the type of the target data frame. In a specific implementation, the second electronic device may obtain the fixed byte representing the type of the target data frame from the "frame type" included in the frame header of the target data frame; for example, "0x00", "0x01", and "0x81" represent second type data frames. The corresponding de-difference processing is then performed, for example, first de-difference processing on a first type data frame and second de-difference processing on a second type data frame, thereby obtaining the original data frame corresponding to the target data frame. The specific steps are as follows:
When the target data frame is the second type data frame, second de-difference processing is performed on the target data frame to obtain original differential data.
According to some embodiments, the second type data frame includes at least a newly added data frame, a first differential data frame, and a second differential data frame; for example, "0x00" represents the newly added data frame, "0x01" represents the first differential data frame, and "0x81" represents the second differential data frame. Further:
2. when the second type of data frame is a newly added data frame, the second electronic device may obtain, based on the data frame decoding file, first differential data of the newly added data frame, a second order of the first differential data in all data frames, and a second data length corresponding to the first differential data in a decoding manner corresponding to the data frame decoding file, and perform second de-differential processing based on the first differential data, the second order of the first differential data in all data frames, and the second data length corresponding to the first differential data, to obtain original newly added data.
Specifically, in the decoding process of the newly added data frame, that is, the inverse process of encoding the newly added data frame, the second electronic device may perform frame data reduction on the relevant byte data (such as the first difference data) in the newly added data frame, based on the relevant byte data in the newly added data frame, so as to obtain the original newly added data corresponding to the target data frame.
Further, the second electronic device may obtain first differential data of the newly added data frame, a second order of the first differential data in all data frames, and a second data length corresponding to the first differential data.
For example, according to the encoding process of the newly added data frame among the second type data frames and the frame structure of the second type data frame mentioned in some embodiments, the second electronic device may obtain, in the "data segment", the first differential data of the newly added data frame and the second data length corresponding to the first differential data.
The second electronic device then performs the second de-differential processing based on the first differential data, the second order of the first differential data in all data frames, and the second data length corresponding to the first differential data. Specifically, the first differential data of the second data length is filled in after the restored original data of all the data frames before the newly added data frame, thereby completing the frame data restoration of the newly added data frame as a second type data frame.
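As an illustration only, the restoration of a newly added data frame can be sketched in Python as follows. The function name `decode_new_frame` and the use of byte-granular lengths are assumptions made for this sketch; the embodiment does not prescribe them.

```python
def decode_new_frame(restored: bytearray, payload: bytes, length: int) -> None:
    """Append `length` literal bytes from the frame's data segment.

    `restored` holds the data recovered from all earlier frames
    (hypothetical representation, not defined by the embodiment).
    """
    if length > len(payload):
        raise ValueError("frame declares more bytes than it carries")
    restored.extend(payload[:length])


restored = bytearray(b"\x01\x02")            # data restored from earlier frames
decode_new_frame(restored, b"\xaa\xbb\xcc", 3)
```

Because a newly added data frame carries its data literally, no reference lookup is needed in this sketch.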
3. When the second type data frame is a first differential data frame, the second electronic device may obtain, in the decoding manner specified by the data frame decoding file, the third data length, the second differential data, and the first data address of the first differential data frame; acquire the first reference data indicated at the first data address based on the third data length and the first data address; and perform second de-differential processing on the first reference data and the second differential data to obtain the first original differential data.
Specifically, decoding the first differential data frame is the inverse of encoding it: the second electronic device restores the frame data from the relevant byte data in the first differential data frame (for example, the second differential data), so as to obtain the first original differential data corresponding to the first differential data frame.
Further, the second electronic device may obtain the third data length, the second differential data, and the first data address.
For example, according to the encoding process of the first differential data frame in the second type data frame and the frame structure of the second type data frame mentioned in some embodiments, the second electronic device may obtain the third data length, the second differential data, and the first data address in the "data segment".
According to some embodiments, using the differential calculation manner set in the data frame decoding file, the second electronic device may find a reference data segment (i.e., the first reference data) in the frame data restored before the first differential data frame and use it as the reference for restoration during decoding. Specifically, the second electronic device determines the start address of the first reference data from the first data address within the frame data restored before the first differential data frame, and reads the first reference data of the third data length from that start address; for example, if the third data length is 128 bits, the second electronic device copies 128 bits of first reference data starting at the first data address. In addition, the data frame decoding file specifies the differential calculation manner for the first differential data frame, which may be a logical operation such as exclusive-or, AND, or NOT. For example, during encoding, the first electronic device performs a logical encoding operation (e.g., an exclusive-or operation) on the first original differential data and the first reference data to obtain the second differential data; during decoding, the second electronic device applies the inverse of that logical encoding operation, that is, the differential calculation manner (which may also be understood as a logical decoding operation) specified by the data frame decoding file for the first differential data frame, to the second differential data and the first reference data, to obtain the first original differential data corresponding to the first differential data frame. If the logical encoding operation is an exclusive-or operation, the logical decoding operation is likewise an exclusive-or operation, since exclusive-or is its own inverse. This decoding process is the second de-differential processing.
And then the second electronic device fills the first original differential data into the restored original data of all data frames before the first differential data frame, so as to complete the frame data restoration of the first differential data frame as the second type data frame.
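The restoration of a first differential data frame can be sketched as follows, assuming that exclusive-or is the configured logical operation and that addresses and lengths are counted in bytes. The name `decode_first_diff_frame` and the sample values are hypothetical.

```python
def decode_first_diff_frame(restored: bytearray, address: int,
                            length: int, diff: bytes) -> None:
    """Rebuild the original bytes as reference XOR diff and append them."""
    if address < 0 or address + length > len(restored):
        raise ValueError("reference range lies outside the restored data")
    if len(diff) != length:
        raise ValueError("differential data does not match the declared length")
    reference = restored[address:address + length]
    restored.extend(r ^ d for r, d in zip(reference, diff))


# Encoder side (assumed): diff = original XOR reference.
reference = b"\x10\x20\x30\x40"
original = b"\x10\x21\x30\x44"
diff = bytes(r ^ o for r, o in zip(reference, original))

# Decoder side: the reference is already part of the restored data.
restored = bytearray(reference)
decode_first_diff_frame(restored, address=0, length=4, diff=diff)
```

When the original data closely resembles the reference, most bytes of `diff` are zero, which is what makes the frames easier to compress afterwards.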
4. When the second type data frame is a second differential data frame, the second electronic device obtains the fourth data length and the second data address of the second differential data frame, acquires the second reference data indicated at the second data address based on the fourth data length and the second data address, and performs second de-differential processing on the second reference data to obtain the second original differential data.
Specifically, decoding the second differential data frame is the inverse of encoding it: the second electronic device restores the frame data from the relevant byte data in the second differential data frame (such as the second data address), so as to obtain the second original differential data corresponding to the second differential data frame.
Further, the second electronic device may obtain a fourth data length and a second data address of the second differential data frame.
For example, according to the encoding process of the second differential data frame in the second type data frame and the frame structure of the second type data frame mentioned in some embodiments, the second electronic device may obtain the fourth data length and the second data address in the "data segment".
According to some embodiments, the second electronic device may find the reference data segment (i.e. the second reference data) from the frame data that has been restored before the second differential data frame for direct copy and restoration during decoding by the differential calculation method set in the data frame decoding file; specifically, the second electronic device determines a start address of "second reference data" based on the second data address in "frame data restored before the second differential data frame", and acquires the second reference data indicated by the fourth data length from the start address, and if the fourth data length is 130 bits, the second electronic device copies 130 bits from the second data address backward to obtain the second reference data. According to some embodiments, the second reference data is the second original differential data corresponding to the second differential data frame, according to the definition of the second differential data frame encoding process. And then the second electronic device fills the second original differential data into the restored original data of all data frames before the second differential data frame, so as to complete the frame data restoration of the second differential data frame as the second type data frame.
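A minimal sketch of the direct-copy restoration described above, again assuming byte-granular addresses and lengths; the name `decode_second_diff_frame` is hypothetical.

```python
def decode_second_diff_frame(restored: bytearray, address: int, length: int) -> None:
    """Copy `length` bytes starting at `address` from the already restored
    data and append them; the frame itself carries no payload."""
    if address < 0 or address + length > len(restored):
        raise ValueError("reference range lies outside the restored data")
    # The slice is a copy, so extending the same buffer is safe.
    restored.extend(restored[address:address + length])


restored = bytearray(b"abcdef")
decode_second_diff_frame(restored, address=2, length=3)
```

In this sketch the second reference data is appended unchanged, matching the statement that it is itself the second original differential data.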
Step S405: acquiring a next data frame corresponding to the target data frame, taking the next data frame as the target data frame, and returning to the step of performing the de-difference processing on the target data frame.
Specifically, the second electronic device may determine the next data frame based on the frame sequence of the target data frame: if the frame sequence of the target data frame is n, the next data frame is the data frame whose frame sequence is n + 1. The second electronic device then takes the next data frame as the target data frame and performs the step of performing the de-difference processing on the target data frame.
Step S406: when the next data frame does not exist, generating original file data based on all the original data frames.
Specifically, when the second electronic device determines that the target data frame is the last data frame, no next data frame exists. At this point the second electronic device has completed data restoration for all types of data frames, and thereby obtains the original file data restored from all the data frames.
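Steps S405 and S406 amount to a loop over frame sequences n, n + 1, and so on, until no further frame exists. The sketch below illustrates that loop for the three second type data frames only. The `Frame` class, its field names, the byte-granular lengths, and the choice of exclusive-or are assumptions of this sketch; the type codes are the example values quoted in the description.

```python
from dataclasses import dataclass
from typing import Dict

NEW, DIFF_XOR, DIFF_COPY = 0x00, 0x01, 0x81   # example type codes from the text


@dataclass
class Frame:                 # hypothetical in-memory frame layout
    kind: int
    length: int
    address: int = 0
    payload: bytes = b""


def restore_file(frames: Dict[int, Frame], first_seq: int = 0) -> bytes:
    """Decode frames in frame-sequence order until no next frame exists."""
    out = bytearray()
    seq = first_seq
    while seq in frames:                      # S406: stop when frame n + 1 is missing
        f = frames[seq]
        if f.kind == NEW:                     # literal data
            out += f.payload[:f.length]
        elif f.kind == DIFF_XOR:              # reference XOR differential data
            ref = out[f.address:f.address + f.length]
            out += bytes(r ^ d for r, d in zip(ref, f.payload))
        elif f.kind == DIFF_COPY:             # direct copy of reference data
            out += out[f.address:f.address + f.length]
        else:
            raise ValueError(f"unknown frame type {f.kind:#04x}")
        seq += 1                              # S405: move on to frame n + 1
    return bytes(out)


frames = {
    0: Frame(NEW, 4, payload=b"ABCD"),
    1: Frame(DIFF_COPY, 2, address=1),
    2: Frame(DIFF_XOR, 2, address=0, payload=b"\x00\x01"),
}
data = restore_file(frames)
```

Every frame after the first may refer back to any data restored so far, which is why the frames must be decoded strictly in frame-sequence order.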
In this embodiment of the present application, a second electronic device obtains a data compression packet, decompresses the data compression packet to obtain at least one type of data frame, and performs data decoding on the at least one type of data frame based on a decoding manner corresponding to each type of data frame in the at least one type of data frame to obtain original file data corresponding to the data compression packet. By acquiring a data compression packet which is recoded based on at least one type of feature data of original file data and has a high compression rate, the original file data corresponding to the data compression packet can be restored only by carrying out corresponding data decoding on at least one type of data frames in the data compression packet with small data overhead (such as memory overhead, data transmission overhead and the like), the data overhead (such as memory overhead, data transmission overhead and the like) is reduced in the whole original file data acquisition process, the time for acquiring the original file data is shortened, the data acquisition efficiency is improved, and equipment resources are saved; and adopting corresponding decoding modes according to different types of data frames, maximizing the reuse rate of the restored data (namely, decoded bytes or data segments), and greatly shortening the time for decoding the original file data.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 7, a schematic structural diagram of a data compression apparatus according to an exemplary embodiment of the present application is shown. The data compression apparatus may be implemented as all or part of an apparatus in software, hardware, or a combination of both. The device 1 comprises a data information acquisition module 11, a data information coding module 12 and a data compression packet generation module 13.
The data information acquiring module 11 is configured to acquire at least one type of feature data included in original file data, and acquire data information of each type of feature data in the at least one type of feature data;
the data information encoding module 12 is configured to perform data encoding on each type of data information based on an encoding mode corresponding to the data information of each type of feature data, so as to obtain data frames corresponding to each type of data information;
and the data compression packet generation module 13 is configured to compress data frames corresponding to the various types of data information, respectively, to obtain a data compression packet corresponding to the original file data.
Optionally, as shown in fig. 8, the data information obtaining module 11 includes:
a repeated byte information obtaining unit 111, configured to perform repeated byte calculation on original file data, extract repeated byte data in the original file data, and obtain repeated byte information corresponding to the repeated byte data;
the differential byte information obtaining unit 112 is configured to perform differential comparison calculation on the original file data, extract differential byte data in the original file data, and obtain differential byte information corresponding to the differential byte data.
Optionally, as shown in fig. 9, the data information encoding module 12 includes:
a first type data frame encoding unit 121, configured to perform data encoding on the repeated byte information according to a first encoding manner corresponding to the repeated byte data, so as to obtain at least one first type data frame of the original file data;
and a second type data frame encoding unit 122, configured to perform data encoding on the differential byte information according to a second encoding manner corresponding to the differential byte data, so as to obtain at least one second type data frame of the original file data.
Optionally, the repeated byte information obtaining unit 111 is specifically configured to:
acquiring at least one repeated byte in original file data, and performing segmentation calculation processing on the original file data based on the repeated byte to obtain repeated byte information corresponding to the repeated byte, wherein the repeated byte information comprises the repeated byte, a filling length, a first data length corresponding to the repeated byte and a first sequence of the repeated byte in all data frames;
the first-type data frame encoding unit 121 is specifically configured to:
and generating at least one first type data frame corresponding to the original file data based on the repeated byte information.
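One possible way to extract repeated byte information is a single scan for runs of identical bytes. The sketch below is illustrative only; the function name `find_repeated_runs` and the minimum run length `min_run` are assumptions, and the embodiment's segmentation calculation may differ.

```python
from typing import List, Tuple


def find_repeated_runs(data: bytes, min_run: int = 4) -> List[Tuple[int, int, int]]:
    """Return (offset, repeated_byte, fill_length) for every run of at
    least `min_run` identical bytes."""
    runs = []
    i = 0
    while i < len(data):
        j = i + 1
        while j < len(data) and data[j] == data[i]:
            j += 1                       # extend the run of identical bytes
        if j - i >= min_run:
            runs.append((i, data[i], j - i))
        i = j
    return runs


sample = b"\x01" + b"\xff" * 6 + b"\x02\x03" + b"\x00" * 4
runs = find_repeated_runs(sample)
```

Each tuple holds what a first type data frame needs in this sketch: where the fill starts, which byte to fill, and for how long.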
Optionally, as shown in fig. 10, the differential byte information obtaining unit 112 includes:
a data block dividing subunit 1121, configured to divide the original file data according to a preset block length to obtain at least two data blocks;
a differential comparison subunit 1122, configured to obtain a target data block from the at least two data blocks, and perform differential comparison processing on the target data block and all data blocks before the target data block to obtain differential byte data of the target data block;
the differential comparison subunit 1122 is further configured to obtain differential byte information corresponding to the differential byte data, obtain a next data block of the target data block, use the next data block as the target data block, and return to the step of performing differential comparison processing on the target data block and all data blocks before the target data block;
a differential byte information generating subunit 1123, configured to obtain all the differential byte information in the original file data when the next data block does not exist.
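The block division and differential comparison performed by these subunits can be sketched as follows. The classification rule (identical block gives a copy, a block differing in at most `max_diff` bytes gives an exclusive-or difference, anything else is new data) and all names are assumptions of this sketch; the embodiment does not fix a similarity threshold.

```python
from typing import List, Optional, Tuple


def split_blocks(data: bytes, block_len: int) -> List[bytes]:
    """Divide the data into blocks of the preset block length."""
    return [data[i:i + block_len] for i in range(0, len(data), block_len)]


def classify_blocks(data: bytes, block_len: int,
                    max_diff: int = 1) -> List[Tuple[str, Optional[int]]]:
    """Compare each block with all earlier blocks and return
    (kind, reference_address) per block."""
    blocks = split_blocks(data, block_len)
    result: List[Tuple[str, Optional[int]]] = []
    for n, target in enumerate(blocks):
        best = None                               # (differing_bytes, block_index)
        for k in range(n):                        # all blocks before the target
            ref = blocks[k]
            if len(ref) != len(target):
                continue
            diff = sum(a != b for a, b in zip(ref, target))
            if best is None or diff < best[0]:
                best = (diff, k)
        if best is None or best[0] > max_diff:
            result.append(("new", None))          # newly added data frame
        elif best[0] == 0:
            result.append(("copy", best[1] * block_len))   # second differential frame
        else:
            result.append(("xor", best[1] * block_len))    # first differential frame
    return result


kinds = classify_blocks(b"ABCD" + b"ABCD" + b"ABCE" + b"WXYZ", block_len=4)
```

The comparison is quadratic in the number of blocks, which is acceptable for a sketch; a practical encoder would likely index earlier blocks.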
Optionally, the differential byte information at least includes first differential information, second differential information, and third differential information, and the second-type data frame encoding unit 122 is specifically configured to:
acquiring first differential information of the target data block, wherein the first differential information comprises first differential data, a second sequence of the first differential data in all data frames and a second data length corresponding to the first differential data, and generating a newly added data frame corresponding to the original file data based on the first differential information; and/or,
acquiring second differential information of the target data block, wherein the second differential information includes second differential data, a third sequence of the second differential data in all data frames, first reference data of the second differential data in all data blocks before the target data block, a first data address corresponding to the first reference data, and a third data length corresponding to the first reference data, and generating a first differential data frame of the original file data based on the third sequence, the first data address, the third data length, and the second differential data, wherein the second differential data is obtained by performing a logical operation on the first reference data and the original differential data corresponding to the second differential information; and/or,
acquiring third differential information of the target data block, wherein the third differential information includes third differential data, a fourth sequence of the third differential data in all data frames, a fourth data length corresponding to the second reference data, and a second data address of the third differential data in all data blocks before the target data block, and generating a second differential data frame of the original file data based on the fourth sequence, the fourth data length, and the second data address.
Optionally, the preset block lengths are multiple, and the apparatus 1 is specifically configured to:
determining the coding efficiency of the at least one second type data frame corresponding to each preset block length based on a plurality of preset block lengths;
determining the highest coding efficiency in the frame coding efficiencies, and determining at least one target second type data frame indicated by the highest frame coding efficiency;
the second-type data frame encoding unit 122 is specifically configured to:
and taking the at least one target second type data frame as at least one second type data frame of the original file data.
Optionally, as shown in fig. 8, the data information obtaining module 11 further includes:
a repeated data filtering unit 113, configured to filter the repeated byte data in the original file data to obtain filtered target file data;
the differential byte information obtaining unit 112 is further configured to perform differential comparison calculation on the target file data, extract differential byte data in the target file data, and obtain differential byte information corresponding to the differential byte data.
Optionally, the data compression packet generating module 13 is specifically configured to:
and acquiring a target compression algorithm corresponding to the original file data, and compressing the various data frames based on the target compression algorithm to obtain a data compression packet corresponding to the original file data.
Optionally, the data compression packet generating module 13 is specifically configured to:
respectively compressing the various types of data frames based on a plurality of compression algorithms to generate compressed data packets respectively corresponding to the plurality of compression algorithms;
determining compression rates respectively corresponding to the plurality of compression algorithms based on the compressed data packets, determining the highest compression rate among the compression rates, and taking the compression algorithm indicated by the highest compression rate as the target compression algorithm; or,
and acquiring a preset compression algorithm corresponding to the original file data.
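Selecting the target compression algorithm by trial compression can be sketched with the compressors in the Python standard library. The candidate set (zlib, bz2, lzma) and the function name `pick_compressor` are assumptions; the embodiment does not name specific algorithms. Here the highest compression rate is taken to mean the smallest compressed output.

```python
import bz2
import lzma
import zlib
from typing import Tuple

# Candidate algorithms (assumed set for illustration).
CANDIDATES = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def pick_compressor(frames: bytes) -> Tuple[str, bytes]:
    """Compress the encoded frames with every candidate and keep the
    smallest result."""
    packets = {name: fn(frames) for name, fn in CANDIDATES.items()}
    best = min(packets, key=lambda name: len(packets[name]))
    return best, packets[best]


encoded_frames = b"\x00" * 1000 + b"frame" * 50
name, packet = pick_compressor(encoded_frames)
```

Trying every candidate costs extra encoding time on the first electronic device but no extra time on the decoding side, which only runs the one chosen algorithm.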
Optionally, as shown in fig. 11, the apparatus 1 further includes:
a data encapsulation packet generating module 14, configured to obtain a data frame decoding file and a decompressed file of the target compression algorithm, and generate a data encapsulation packet based on the data compression packet, the decompressed file, and the data frame decoding file.
It should be noted that, when the data compression apparatus provided in the foregoing embodiment executes the data compression method, the division into the functional modules described above is only an example. In practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the data compression apparatus and the data compression method provided by the above embodiments belong to the same concept; for details of the implementation process, refer to the method embodiments, which are not repeated here.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In this embodiment, a first electronic device obtains at least one type of feature data included in original file data, obtains data information of each type of the feature data, performs data encoding on each type of the data information based on an encoding mode corresponding to the data information of each type of the feature data to obtain data frames corresponding to each type of the data information, and compresses the data frames corresponding to each type of the data information to obtain a data compression packet corresponding to the original file data. By acquiring at least one type of characteristic data (namely data compiling characteristics) obtained after subdividing and mining the original file data and then performing data encoding on the various types of characteristic data, the problem of low compression ratio when data compression is directly performed based on a fixed compression algorithm can be solved, the compression ratio of data compression is improved, and data overhead (such as memory overhead and data transmission overhead) is saved. A corresponding encoding mode is adopted according to the data information (such as repeated byte information and differential information) of each type of characteristic data, which maximizes the utilization of the characteristic data of the original file data and greatly improves the reuse rate of encoded data (namely encoded bytes or data segments) when the original file data is encoded. The repeated byte data in the original file data can be encoded and then filtered out, and the differential data can then be encoded based on the filtered target file data, which can further improve the encoding efficiency or the encoding compression ratio of the data frames of the original file data. In addition, the optimal compression algorithm among multiple different compression algorithms for the various types of data frames can be determined, so that the data compression packet obtained by compressing the various types of data frames with the optimal compression algorithm is obtained, and the compression rate is further improved.
Please refer to fig. 12, which shows a schematic structural diagram of a data decompression apparatus according to an exemplary embodiment of the present application. The data decompression means may be implemented as all or part of the apparatus in software, hardware or a combination of both. The apparatus 2 includes a compressed packet obtaining module 21, a compressed packet decompressing module 22, and a data frame decoding module 23.
A compressed packet obtaining module 21, configured to obtain a data compressed packet;
a compressed packet decompression module 22, configured to decompress the data compressed packet to obtain at least one type of data frame;
and the data frame decoding module 23 is configured to perform data decoding on the at least one type of data frame based on a decoding manner corresponding to each type of data frame in the at least one type of data frame, so as to obtain original file data corresponding to the data compression packet.
Optionally, the compressed packet obtaining module 21 is specifically configured to:
and acquiring a data encapsulation packet, and decapsulating the data encapsulation packet to obtain a data compression packet, and a decompression file and a data frame decoding file corresponding to the data compression packet.
Optionally, the data frame decoding module 23 is specifically configured to:
and carrying out de-difference processing on at least one first type data frame and at least one second type data frame based on the data frame decoding file to obtain original file data subjected to de-difference processing.
Optionally, as shown in fig. 13, the data frame decoding module 23 includes:
a data frame acquiring unit 231, configured to acquire a target data frame of at least one of the first type data frames and at least one of the second type data frames based on the data frame decoding file;
a data frame de-differentiating unit 232, configured to perform de-differentiating processing on the target data frame to obtain an original data frame corresponding to the target data frame;
the data frame acquiring unit 231 is further configured to acquire a next data frame corresponding to the target data frame, use the next data frame as the target data frame, and perform the step of performing the difference-removing processing on the target data frame;
an original data generating unit 233 configured to generate original file data based on all the original data frames when the next data frame does not exist.
Optionally, the data frame de-differentiating unit 232 is specifically configured to:
when the target data frame is the first type data frame, performing first de-differential processing on the target data frame to obtain original filling data;
and when the target data frame is the second type data frame, performing second de-differential processing on the target data frame to obtain original differential data.
Optionally, the data frame de-differentiating unit 232 is specifically configured to:
acquiring repeated byte information corresponding to the target data frame, wherein the repeated byte information comprises the repeated bytes, a filling length, a first data length corresponding to the repeated bytes and a first sequence of the repeated bytes in all data frames;
and performing first de-differential processing based on the repeated byte information to obtain original filling data corresponding to the target data frame.
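For a first type data frame, restoration reduces to writing the repeated byte the indicated number of times. A minimal sketch, assuming the filling length is counted in bytes; the name `decode_fill_frame` is hypothetical.

```python
def decode_fill_frame(restored: bytearray, repeated_byte: int, fill_length: int) -> None:
    """Append `fill_length` copies of `repeated_byte` to the restored data."""
    if not 0 <= repeated_byte <= 0xFF:
        raise ValueError("repeated byte must fit in one byte")
    if fill_length < 0:
        raise ValueError("filling length must be non-negative")
    restored.extend(bytes([repeated_byte]) * fill_length)


restored = bytearray(b"\x7f")
decode_fill_frame(restored, repeated_byte=0xFF, fill_length=5)
```

A frame of this kind stays the same small size regardless of how long the original run of repeated bytes was.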
Optionally, the second type data frame at least includes a newly added data frame, a first differential data frame, and a second differential data frame, and the data frame de-differentiating unit 232 is specifically configured to:
when the second type data frame is a newly added data frame, acquiring first differential data of the newly added data frame, a second sequence of the first differential data in all data frames and a second data length corresponding to the first differential data, and performing second de-differential processing based on the first differential data, the second sequence of the first differential data in all data frames and the second data length corresponding to the first differential data to obtain original newly added data;
when the second type data frame is a first differential data frame, acquiring a third data length, second differential data, and a first data address of the first differential data frame; acquiring first reference data indicated at the first data address based on the third data length and the first data address; and performing second de-differential processing on the first reference data and the second differential data to obtain first original differential data;
when the second type data frame is a second differential data frame, acquiring a fourth data length and a second data address of the second differential data frame, acquiring second reference data indicated at the second data address based on the fourth data length and the second data address, and performing second de-differential processing on the second reference data to obtain second original differential data.
Optionally, as shown in fig. 14, the apparatus 2 includes:
and the data checking module 24 is configured to obtain a frame sequence and a frame check value of each type of data frame in the at least one type of data frame, and perform data checking based on the frame sequence and the frame check value.
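The data check can be sketched as a pass over the frames that verifies both the frame sequence and the frame check value. The embodiment does not state which check algorithm is used; CRC-32 from the Python `zlib` module is an assumption of this sketch, as are the tuple layout and the name `check_frames`.

```python
import zlib
from typing import Iterable, Tuple


def check_frames(frames: Iterable[Tuple[int, int, bytes]]) -> bool:
    """Each item is (frame_sequence, frame_check_value, frame_body).

    Returns True only if the sequences run 0, 1, 2, ... without gaps and
    every body matches its check value (CRC-32 assumed here).
    """
    for expected_seq, (seq, check, body) in enumerate(frames):
        if seq != expected_seq:
            return False                  # missing or reordered frame
        if zlib.crc32(body) != check:
            return False                  # corrupted frame body
    return True


good = [(0, zlib.crc32(b"a"), b"a"), (1, zlib.crc32(b"bc"), b"bc")]
ok = check_frames(good)
```

Checking the sequence matters as much as checking the body in this scheme, because later frames address data restored from earlier ones.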
It should be noted that, when the data decompression apparatus provided in the foregoing embodiment executes the data decompression method, the division into the functional modules described above is only an example. In practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the data decompression apparatus and the data decompression method provided by the above embodiments belong to the same concept; for details of the implementation process, refer to the method embodiments, which are not repeated here.
In this embodiment of the present application, a second electronic device obtains a data compression packet, decompresses the data compression packet to obtain at least one type of data frame, and performs data decoding on the at least one type of data frame based on a decoding manner corresponding to each type of data frame in the at least one type of data frame to obtain original file data corresponding to the data compression packet. By acquiring a data compression packet which is recoded based on at least one type of feature data of original file data and has a high compression rate, the original file data corresponding to the data compression packet can be restored only by carrying out corresponding data decoding on at least one type of data frames in the data compression packet with small data overhead (such as memory overhead, data transmission overhead and the like), the data overhead (such as memory overhead, data transmission overhead and the like) is reduced in the whole original file data acquisition process, the time for acquiring the original file data is shortened, the data acquisition efficiency is improved, and equipment resources are saved; and adopting corresponding decoding modes according to different types of data frames, maximizing the reuse rate of the restored data (namely, decoded bytes or data segments), and greatly shortening the time for decoding the original file data.
An embodiment of the present application further provides a computer storage medium, where the computer storage medium may store a plurality of instructions, and the instructions are suitable for being loaded by a processor and executing the data compression method according to the embodiments shown in fig. 1 to fig. 6, and a specific execution process may refer to specific descriptions of the embodiments shown in fig. 1 to fig. 6, which is not described herein again.
The present application further provides a computer program product, where at least one instruction is stored, and the at least one instruction is loaded by the processor and executes the data compression method according to the embodiment shown in fig. 1 to 6, where a specific execution process may refer to specific descriptions of the embodiment shown in fig. 1 to 6, and is not described herein again.
Fig. 15 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 15, the electronic device 1000 may include: at least one processor 1001, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002.
Wherein a communication bus 1002 is used to enable connective communication between these components.
The user interface 1003 may include a display screen (Display) and a camera (Camera); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
Processor 1001 may include one or more processing cores, among other things. The processor 1001 connects various parts throughout the server 1000 using various interfaces and lines, and performs various functions of the server 1000 and processes data by executing or executing instructions, programs, code sets, or instruction sets stored in the memory 1005, and calling data stored in the memory 1005. Alternatively, the processor 1001 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 1001 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. Wherein, the CPU mainly processes an operating system, a user interface, an application program and the like; the GPU is used for rendering and drawing the content required to be displayed by the display screen; the modem is used to handle wireless communications. It is understood that the modem may not be integrated into the processor 1001, but may be implemented by a single chip.
The memory 1005 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1005 includes a non-transitory computer-readable medium. The memory 1005 may be used to store an instruction, a program, code, a set of codes, or a set of instructions. The memory 1005 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the stored data area may store data and the like referred to in the above respective method embodiments. The memory 1005 may optionally be at least one memory device located remotely from the processor 1001. As shown in fig. 15, the memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a data compression application program.
In the electronic device 1000 shown in fig. 15, the user interface 1003 is mainly used as an interface for providing input for a user, and acquiring data input by the user; and the processor 1001 may be configured to call the data compression application stored in the memory 1005, and specifically perform the following operations:
acquiring at least one type of characteristic data contained in original file data, and acquiring data information of various types of characteristic data in the at least one type of characteristic data;
respectively carrying out data coding on various data information based on coding modes corresponding to the data information of various characteristic data to obtain data frames corresponding to the various data information;
and compressing the data frames corresponding to the various types of data information respectively to obtain a data compression packet corresponding to the original file data.
In an embodiment, when the processor 1001 executes at least one type of feature data included in the acquired original file data and acquires data information of each type of feature data, the following operation is specifically executed:
performing repeated byte calculation on original file data, extracting repeated byte data in the original file data, and acquiring repeated byte information corresponding to the repeated byte data;
and carrying out differential comparison calculation on the original file data, extracting differential byte data in the original file data, and acquiring differential byte information corresponding to the differential byte data.
In an embodiment, when the processor 1001 executes the encoding modes corresponding to the data information based on the various types of feature data to perform data encoding on the various types of data information respectively, so as to obtain data frames corresponding to the various types of data information respectively, the following operations are specifically executed:
performing data coding on the repeated byte information according to a first coding mode corresponding to the repeated byte data to obtain at least one first type data frame of the original file data;
and carrying out data encoding on the differential byte information according to a second encoding mode corresponding to the differential byte data to obtain at least one second type data frame of the original file data.
In an embodiment, when the processor 1001 performs the repeated byte calculation on the original file data, extracts the repeated byte data in the original file data, and acquires the repeated byte information corresponding to the repeated byte data, the following operations are specifically performed:
acquiring at least one repeated byte in original file data, and performing segmentation calculation processing on the original file data based on the repeated byte to obtain repeated byte information corresponding to the repeated byte, wherein the repeated byte information comprises the repeated byte, a filling length, a first data length corresponding to the repeated byte and a first sequence of the repeated byte in all data frames;
the data encoding of the repeated byte data according to the first encoding mode corresponding to the repeated byte data to obtain at least one first type data frame of the original file data includes:
and generating at least one first type data frame corresponding to the original file data based on the repeated byte information.
In an embodiment, when the processor 1001 executes the step of generating at least one first type data frame corresponding to the original file data based on the repeated byte information, the following steps are specifically executed:
when the total length of the repeated bytes contained in the repeated byte information is larger than a frame length threshold value, at least one first type data frame is generated based on the repeated byte information.
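The repeated-byte step above can be illustrated with a minimal sketch. It assumes that a "repeated byte" means a run of identical consecutive bytes and that a run is emitted as a first type data frame only when it is longer than the frame length threshold; the names RepeatFrame, find_repeat_frames and frame_length_threshold, and the field layout, are hypothetical and are not taken from the application.

```python
from typing import List, NamedTuple


class RepeatFrame(NamedTuple):
    """Hypothetical layout of a first type (repeated byte) data frame."""
    seq: int          # order of this frame among the repeat frames found
    offset: int       # start position of the run in the original data
    byte: int         # the repeated byte value
    fill_length: int  # number of times the byte repeats


def find_repeat_frames(data: bytes, frame_length_threshold: int = 8) -> List[RepeatFrame]:
    """Scan data for runs of one byte value longer than the threshold."""
    frames: List[RepeatFrame] = []
    i, n = 0, len(data)
    while i < n:
        j = i + 1
        # Extend the run while the next byte equals the first byte of the run.
        while j < n and data[j] == data[i]:
            j += 1
        run_length = j - i
        # Short runs are left for the differential step; a frame header would
        # cost more than the bytes it replaces.
        if run_length > frame_length_threshold:
            frames.append(RepeatFrame(len(frames), i, data[i], run_length))
        i = j
    return frames
```

In this sketch the threshold plays the role described above: runs no longer than the threshold stay in the data stream and are handled by the differential encoding.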
In an embodiment, when the processor 1001 performs the differential comparison calculation on the original file data, extracts differential byte data in the original file data, and acquires differential byte information corresponding to the differential byte data, the following steps are specifically performed:
dividing the original file data according to a preset block length to obtain at least two data blocks;
acquiring a target data block from the at least two data blocks, and performing differential comparison processing on the target data block and all data blocks before the target data block to obtain differential byte data of the target data block;
acquiring differential byte information corresponding to the differential byte data, acquiring a next data block of the target data block, taking the next data block as the target data block, and executing the step of determining to perform differential comparison processing on the target data block and all data blocks before the target data block;
and when the next data block does not exist, obtaining all the differential byte information in the original file data.
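The block-by-block comparison loop described above might look as follows. This is only a sketch under stated assumptions: the comparison is taken to be a byte-wise count of differing positions against each earlier block, and the function names split_blocks and compare_blocks are hypothetical. The application does not specify the comparison metric.

```python
from typing import List, Optional, Tuple


def split_blocks(data: bytes, block_length: int) -> List[bytes]:
    """Divide the data into blocks of block_length bytes (the last may be shorter)."""
    return [data[i:i + block_length] for i in range(0, len(data), block_length)]


def compare_blocks(data: bytes, block_length: int) -> List[Tuple[int, Optional[int], int]]:
    """For each block return (block index, best earlier block index or None,
    number of differing bytes against that earlier block)."""
    blocks = split_blocks(data, block_length)
    results = []
    for idx, target in enumerate(blocks):
        best_ref: Optional[int] = None
        best_diff = len(target)  # no reference: every byte counts as different
        # Compare the target block with all data blocks before it.
        for ref_idx in range(idx):
            diff = sum(a != b for a, b in zip(blocks[ref_idx], target))
            if diff < best_diff:
                best_ref, best_diff = ref_idx, diff
        results.append((idx, best_ref, best_diff))
    return results
```

The loop ends when no next data block exists, at which point the list holds the comparison result for every block of the original file data.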
In an embodiment, the differential byte information at least includes first differential information, second differential information, and third differential information, and when the processor 1001 performs data encoding on the differential byte data according to a second encoding method corresponding to the differential byte data to obtain at least one second type data frame of the original file data, the following steps are specifically performed:
acquiring first differential information of the target data block, wherein the first differential information comprises first differential data, a second sequence of the first differential data in all data frames and a second data length corresponding to the first differential data, and generating a new data frame corresponding to the original file data based on the first differential information; and/or,
acquiring second differential information of the target data block, wherein the second differential information includes second differential data, a third sequence of the second differential data in all data frames, first reference data of the second differential data in all data blocks before the target data block, a first data address corresponding to the first reference data, and a third data length corresponding to the first reference data, and generating a first differential data frame of the original file data based on the third sequence, the first data address, the third data length, and the second differential data, wherein the second differential data is obtained by performing a logical operation on the first reference data and the original differential data corresponding to the second differential information; and/or,
acquiring third differential information of the target data block, wherein the third differential information includes third differential data, a fourth sequence of the third differential data in all data frames, a fourth data length corresponding to the second reference data, and a second data address of the third differential data in all data blocks before the target data block, and generating a second differential data frame of the original file data based on the fourth sequence, the fourth data length, and the second data address.
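The three kinds of second type data frame can be sketched as below. Several points are assumptions made for illustration only: the "logical operation" is taken to be a byte-wise XOR, a block is encoded as a first differential data frame only when at most max_diff bytes differ from its best reference, and the class names NewFrame, XorFrame and CopyFrame and their fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class NewFrame:      # new data frame: data with no usable reference
    seq: int
    data: bytes


@dataclass
class XorFrame:      # first differential data frame: reference plus XOR residue
    seq: int
    ref_address: int
    length: int
    xor_data: bytes


@dataclass
class CopyFrame:     # second differential data frame: exact copy of earlier data
    seq: int
    ref_address: int
    length: int


Frame = Union[NewFrame, XorFrame, CopyFrame]


def encode_blocks(data: bytes, block_length: int, max_diff: int = 2) -> List[Frame]:
    """Encode each block as a new, XOR-differential, or copy frame."""
    frames: List[Frame] = []
    for start in range(0, len(data), block_length):
        target = data[start:start + block_length]
        seq = len(frames)
        best_addr, best_diff = -1, len(target) + 1
        # Search all earlier blocks for the closest reference.
        for addr in range(0, start, block_length):
            ref = data[addr:addr + len(target)]
            diff = sum(a != b for a, b in zip(ref, target))
            if diff < best_diff:
                best_addr, best_diff = addr, diff
        if best_addr >= 0 and best_diff == 0:
            frames.append(CopyFrame(seq, best_addr, len(target)))
        elif best_addr >= 0 and best_diff <= max_diff:
            ref = data[best_addr:best_addr + len(target)]
            xor = bytes(a ^ b for a, b in zip(ref, target))
            frames.append(XorFrame(seq, best_addr, len(target), xor))
        else:
            frames.append(NewFrame(seq, target))
    return frames
```

With XOR as the logical operation, a nearly identical block yields a residue that is mostly zero bytes, which a general-purpose compressor applied afterwards can shrink well; a copy frame carries no payload at all.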
In an embodiment, the preset partition length is multiple, and when the processor 1001 executes the data compression method, the following steps are specifically executed:
determining the coding efficiency of the at least one second type data frame corresponding to each preset block length based on a plurality of preset block lengths;
determining the highest coding efficiency in the frame coding efficiencies, and determining at least one target second type data frame indicated by the highest frame coding efficiency;
the obtaining of the at least one second type data frame of the original file data includes:
and taking the at least one target second type data frame as at least one second type data frame of the original file data.
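Selecting among several preset block lengths can be sketched as a simple search. The measure of coding efficiency used here (original size divided by an estimated encoded size), the fixed per-frame overhead HEADER_SIZE, and the simplified encoder that only recognises exact duplicate blocks are all assumptions chosen for illustration; the application does not define them in this passage.

```python
from typing import Iterable, Tuple

HEADER_SIZE = 6  # hypothetical per-frame overhead in bytes


def encoded_size(data: bytes, block_length: int) -> int:
    """Estimate the encoded size: every block costs a header, and only a block
    not seen before also costs its payload."""
    seen = set()
    size = 0
    for start in range(0, len(data), block_length):
        block = data[start:start + block_length]
        size += HEADER_SIZE
        if block not in seen:
            size += len(block)
            seen.add(block)
    return size


def best_block_length(data: bytes, candidates: Iterable[int]) -> Tuple[int, float]:
    """Return (block length, efficiency) for the candidate with the highest efficiency."""
    if not data:
        raise ValueError("data must not be empty")
    best = None
    for block_length in candidates:
        efficiency = len(data) / encoded_size(data, block_length)
        if best is None or efficiency > best[1]:
            best = (block_length, efficiency)
    if best is None:
        raise ValueError("no candidate block lengths given")
    return best
```

The frames produced with the winning block length would then be used as the second type data frames of the original file data.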
In one embodiment, after performing the repeated byte calculation on the original file data, extracting repeated byte data in the original file data, and acquiring repeated byte information corresponding to the repeated byte data, the processor 1001 further performs the following steps:
filtering the repeated byte data in the original file data to obtain filtered target file data;
the differential comparison calculation of the original file data, the extraction of the differential byte data in the original file data, and the acquisition of the differential byte information corresponding to the differential byte data includes:
and carrying out differential comparison calculation on the target file data, extracting differential byte data in the target file data, and acquiring differential byte information corresponding to the differential byte data.
In an embodiment, when the processor 1001 performs the compression processing on the various types of data frames to obtain the data compression packet corresponding to the original file data, the following steps are specifically performed:
and acquiring a target compression algorithm corresponding to the original file data, and compressing the various data frames based on the target compression algorithm to obtain a data compression packet corresponding to the original file data.
In an embodiment, when the processor 1001 executes the target compression algorithm for obtaining the original file data, the following steps are specifically executed:
respectively compressing the various types of data frames based on a plurality of compression algorithms to generate compressed data packets respectively corresponding to the plurality of compression algorithms;
determining compression rates respectively corresponding to the plurality of compression algorithms based on the compression data packets, determining the highest compression rate in the compression rates, and taking the compression algorithm indicated by the highest compression rate as a target compression algorithm;
or,
and acquiring a preset compression algorithm corresponding to the original file data.
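The first alternative above, trying several compression algorithms and keeping the one with the highest compression rate, can be sketched with the compressors in the Python standard library. The choice of zlib, bz2 and lzma as candidates and the function name pick_target_algorithm are assumptions; the application does not name specific algorithms in this passage.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict, Tuple

# Candidate algorithms (an assumed set for illustration).
ALGORITHMS: Dict[str, Callable[[bytes], bytes]] = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def pick_target_algorithm(frame_bytes: bytes) -> Tuple[str, bytes]:
    """Compress the serialized data frames with every candidate and return the
    name and output of the one producing the smallest packet."""
    packets = {name: compress(frame_bytes) for name, compress in ALGORITHMS.items()}
    best = min(packets, key=lambda name: len(packets[name]))
    return best, packets[best]
```

The name of the chosen algorithm has to travel with the data compression packet so that the receiving side knows which decompressor to apply; the second alternative avoids the search cost by using a preset algorithm.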
In one embodiment, when the processor 1001 executes the data compression method, the following steps are specifically executed:
and acquiring a data encapsulation packet, and decapsulating the data encapsulation packet to obtain a data compression packet, and a decompression file and a data frame decoding file corresponding to the data compression packet.
In this embodiment, a first electronic device obtains at least one type of feature data included in original file data, obtains data information of each type of the feature data, performs data encoding on each type of the data information based on an encoding mode corresponding to the data information of each type of the feature data, obtains data frames corresponding to each type of the data information, and compresses the data frames corresponding to each type of the data information, to obtain a data compression packet corresponding to the original file data. By acquiring at least one type of characteristic data (namely data compiling characteristics) obtained after subdividing and mining original file data and then performing data encoding on the various types of characteristic data, the problem of low compression ratio when data compression is directly performed based on a fixed compression algorithm can be solved, the compression ratio of data compression is improved, and data expenses (such as memory expenses, data transmission expenses and the like) are saved; and adopting a corresponding coding mode according to data information (such as repeated byte information and differential information) of various types of characteristic data of different types, maximizing the utilization of the characteristic data of the original file data, and greatly improving the reuse rate of coded data (namely coded bytes or data segments) when the original file data is coded; the repeated byte data in the original file data can be encoded and then filtered, and then the differential data is encoded based on the filtered target file data, so that the encoding efficiency or the encoding compression ratio of the original file data frame can be further improved; and the optimal compression algorithm when various different compression algorithms encode various data frames can be obtained, so that the data compression packet after the various data frames are encoded by the optimal compression algorithm is obtained, and the compression rate is further improved.
Referring to fig. 16, a schematic structural diagram of another electronic device is provided in the embodiment of the present application. As shown in fig. 16, the electronic device 2000 may include: at least one processor 2001, at least one network interface 2004, a user interface 2003, memory 2005, at least one communication bus 2002.
The communication bus 2002 is used to implement connection communication between these components.
The user interface 2003 may include a display screen (Display) and a camera (Camera); optionally, the user interface 2003 may also include a standard wired interface and a wireless interface.
The network interface 2004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
The processor 2001 may include one or more processing cores. The processor 2001 connects the various parts of the electronic device 2000 using various interfaces and lines, and performs the functions of the electronic device 2000 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 2005, and by calling data stored in the memory 2005. Optionally, the processor 2001 may be implemented in at least one hardware form of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 2001 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing the content to be displayed by the display screen; the modem is used to handle wireless communications. It is understood that the modem may alternatively not be integrated into the processor 2001 and may instead be implemented as a separate chip.
The memory 2005 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 2005 includes a non-transitory computer-readable medium. The memory 2005 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 2005 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the stored data area may store data and the like referred to in the above respective method embodiments. The memory 2005 may optionally also be at least one memory device located remotely from the aforementioned processor 2001. As shown in fig. 16, the memory 2005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a data decompression application program.
In the electronic device 2000 shown in fig. 16, the user interface 2003 is mainly used as an interface for providing input for a user, and acquiring data input by the user; and the processor 2001 may be configured to invoke the data decompression application stored in the memory 2005 and specifically perform the following operations:
acquiring a data compression packet;
decompressing the data compression packet to obtain at least one type of data frame;
and performing data decoding on the at least one type of data frames based on the decoding modes corresponding to the various types of data frames in the at least one type of data frames to obtain original file data corresponding to the data compression packet.
In an embodiment, when the processor 2001 executes the obtaining of the data compression packet, and the decompression file and the data frame decoding file corresponding to the data compression packet, the following steps are specifically executed:
and acquiring a data encapsulation packet, and decapsulating the data encapsulation packet to obtain a data compression packet, and a decompression file and a data frame decoding file corresponding to the data compression packet.
In an embodiment, when the processor 2001 performs the decoding method corresponding to each type of data frame in the at least one type of data frame, and performs data decoding on the at least one type of data frame to obtain original file data corresponding to the data compression packet, the following steps are specifically performed:
and carrying out de-difference processing on at least one first type data frame and at least one second type data frame based on the data frame decoding file to obtain original file data subjected to de-difference processing.
In an embodiment, when the processor 2001 performs the de-difference processing on the at least one first type data frame and the at least one second type data frame based on the data frame decoding file to obtain original file data after the de-difference processing, the following steps are specifically performed:
acquiring at least one first type data frame and at least one target data frame in a second type data frame based on the data frame decoding file;
performing de-difference processing on the target data frame to obtain an original data frame corresponding to the target data frame;
acquiring a next data frame corresponding to the target data frame, taking the next data frame as the target data frame, and executing the step of performing the de-difference processing on the target data frame;
and when the next data frame does not exist, generating original file data based on all the original data frames.
In an embodiment, when the processor 2001 performs the de-difference processing on the target data frame to obtain an original data frame corresponding to the target data frame, the following steps are specifically performed:
when the target data frame is the first type data frame, performing first de-difference processing on the target data frame to obtain original filling data;
and when the target data frame is the second type data frame, performing second de-difference processing on the target data frame to obtain original differential data.
In an embodiment, when the processor 2001 performs the first de-difference processing on the target data frame to obtain the original padding data, the following steps are specifically performed:
acquiring repeated byte information corresponding to the target data frame, wherein the repeated byte information comprises the repeated bytes, a filling length, a first data length corresponding to the repeated bytes and a first sequence of the repeated bytes in all data frames;
and performing first difference solving processing based on the repeated byte information to obtain original filling data corresponding to the target data frame.
In one embodiment, the second type data frame at least includes a new data frame, a first differential data frame, and a second differential data frame, and when the processor 2001 performs the second de-difference processing on the target data frame to obtain original differential data, the following steps are specifically performed:
when the target data frame is a new data frame, acquiring first differential data of the new data frame, a second sequence of the first differential data in all data frames and a second data length corresponding to the first differential data, and performing second de-difference processing based on the first differential data, the second sequence and the second data length to obtain original newly added data;
when the target data frame is a first differential data frame, acquiring a third data length, second differential data, and a first data address of the first differential data frame; acquiring first reference data indicated at the first data address based on the third data length and the first data address; and performing second de-difference processing on the first reference data and the second differential data to obtain first original differential data;
when the target data frame is a second differential data frame, acquiring a fourth data length and a second data address of the second differential data frame, acquiring second reference data indicated at the second data address, and performing second de-difference processing based on the second reference data and the fourth data length to obtain second original differential data.
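The second de-difference processing for the three frame kinds can be sketched as follows. The tuple layouts are hypothetical, data addresses are assumed to be offsets into the data already restored, and the inverse of the "logical operation" is assumed to be a byte-wise XOR (XOR is its own inverse); none of these details is fixed by the application.

```python
from typing import List

# Hypothetical frame layouts, listed in frame order:
#   ("new",  data)                           new data frame
#   ("xor",  ref_address, length, xor_data)  first differential data frame
#   ("copy", ref_address, length)            second differential data frame


def decode_frames(frames: List[tuple]) -> bytes:
    """Restore the original data by decoding the frames in sequence."""
    out = bytearray()
    for frame in frames:
        kind = frame[0]
        if kind == "new":
            # The payload is the original data itself.
            out += frame[1]
        elif kind == "xor":
            _, addr, length, xor_data = frame
            # Fetch the reference data already restored at the given address,
            # then undo the XOR to recover the original differential data.
            ref = bytes(out[addr:addr + length])
            out += bytes(a ^ b for a, b in zip(ref, xor_data))
        elif kind == "copy":
            _, addr, length = frame
            # The original data equals the reference data; copy it.
            out += bytes(out[addr:addr + length])
        else:
            raise ValueError(f"unknown frame kind: {kind!r}")
    return bytes(out)
```

Because every reference points to data restored by an earlier frame, decoding in frame order is sufficient and no second pass is needed.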
In an embodiment, before the processor 2001 performs the data decoding on the at least one type of data frame based on the decoding manner corresponding to each type of data frame in the at least one type of data frame, the following steps are further performed:
and acquiring the frame sequence and the frame check value of each type of data frame in the at least one type of data frame, and performing data check based on the frame sequence and the frame check value.
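The data check before decoding might be sketched as below. It assumes that the frame check value is a CRC-32 of the frame payload and that frame sequences count up from zero without gaps; both are assumptions for illustration, as are the helper names make_frame and verify_frames.

```python
import zlib
from typing import Iterable, Tuple

CheckedFrame = Tuple[int, int, bytes]  # (frame sequence, check value, payload)


def make_frame(seq: int, payload: bytes) -> CheckedFrame:
    """Attach a sequence number and a CRC-32 check value to a payload."""
    return seq, zlib.crc32(payload), payload


def verify_frames(frames: Iterable[CheckedFrame]) -> bool:
    """Return True only if every frame is in order and its check value matches."""
    for expected_seq, (seq, check, payload) in enumerate(frames):
        if seq != expected_seq:
            return False  # a frame is missing, duplicated, or out of order
        if zlib.crc32(payload) != check:
            return False  # the payload was corrupted
    return True
```

Running this check first means a damaged or reordered packet is rejected before any de-difference processing is attempted, so a bad reference address never reaches the decoder.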
In this embodiment of the present application, a second electronic device obtains a data compression packet, decompresses the data compression packet to obtain at least one type of data frame, and performs data decoding on the at least one type of data frame based on a decoding manner corresponding to each type of data frame in the at least one type of data frame to obtain original file data corresponding to the data compression packet. By acquiring a data compression packet which is recoded based on at least one type of feature data of original file data and has a high compression rate, the original file data corresponding to the data compression packet can be restored only by carrying out corresponding data decoding on at least one type of data frames in the data compression packet with small data overhead (such as memory overhead, data transmission overhead and the like), the data overhead (such as memory overhead, data transmission overhead and the like) is reduced in the whole original file data acquisition process, the time for acquiring the original file data is shortened, the data acquisition efficiency is improved, and equipment resources are saved; and adopting corresponding decoding modes according to different types of data frames, maximizing the reuse rate of the restored data (namely, decoded bytes or data segments), and greatly shortening the time for decoding the original file data.
It is clear to a person skilled in the art that the solution of the present application can be implemented by means of software and/or hardware. The "unit" and "module" in this specification refer to software and/or hardware that can perform a specific function independently or in cooperation with other components, where the hardware may be, for example, a Field-Programmable Gate Array (FPGA), an Integrated Circuit (IC), or the like.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some service interfaces, devices or units, and may be an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program, which is stored in a computer-readable memory, and the memory may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above description is only an exemplary embodiment of the present disclosure, and the scope of the present disclosure should not be limited thereby. That is, all equivalent changes and modifications made in accordance with the teachings of the present disclosure are intended to be included within the scope of the present disclosure. Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (24)

1. A method of data compression, the method comprising:
acquiring at least one type of characteristic data contained in original file data, and acquiring data information of various types of characteristic data in the at least one type of characteristic data;
respectively carrying out data coding on various data information based on coding modes corresponding to the data information of various characteristic data to obtain data frames corresponding to the various data information;
and compressing the data frames corresponding to the various types of data information respectively to obtain a data compression packet corresponding to the original file data.
2. The method according to claim 1, wherein the obtaining at least one type of feature data included in the original document data and obtaining data information of each type of feature data comprises:
performing repeated byte calculation on original file data, extracting repeated byte data in the original file data, and acquiring repeated byte information corresponding to the repeated byte data;
and carrying out differential comparison calculation on the original file data, extracting differential byte data in the original file data, and acquiring differential byte information corresponding to the differential byte data.
3. The method according to claim 2, wherein the encoding modes corresponding to the data information based on the various types of feature data respectively perform data encoding on the various types of data information to obtain data frames corresponding to the various types of data information, respectively, includes:
performing data coding on the repeated byte information according to a first coding mode corresponding to the repeated byte data to obtain at least one first type data frame of the original file data;
and carrying out data encoding on the differential byte information according to a second encoding mode corresponding to the differential byte data to obtain at least one second type data frame of the original file data.
4. The method according to claim 3, wherein the performing the repeated byte calculation on the original file data, extracting the repeated byte data in the original file data, and obtaining the repeated byte information corresponding to the repeated byte data comprises:
acquiring at least one repeated byte in original file data, and performing segmentation calculation processing on the original file data based on the repeated byte to obtain repeated byte information corresponding to the repeated byte, wherein the repeated byte information comprises the repeated byte, a filling length, a first data length corresponding to the repeated byte and a first sequence of the repeated byte in all data frames;
the data encoding of the repeated byte data according to the first encoding mode corresponding to the repeated byte data to obtain at least one first type data frame of the original file data includes:
and generating at least one first type data frame corresponding to the original file data based on the repeated byte information.
5. The method of claim 4, wherein the generating at least one first type data frame corresponding to the original file data based on the repeated byte information comprises:
when the total length of the repeated bytes contained in the repeated byte information is larger than a frame length threshold value, at least one first type data frame is generated based on the repeated byte information.
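The repeated-byte extraction of claims 4 and 5 can be sketched as a simple run scan. This is a minimal illustration, not the claimed encoding: the `FillFrame` layout, the `offset` field, and the default threshold of 8 are assumptions introduced here, since the claims name only the repeated byte, the filling length, the data length, and the frame sequence.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FillFrame:
    seq: int          # order of this frame among the frames produced here
    fill_byte: int    # the repeated byte value
    fill_length: int  # how many times the byte repeats
    offset: int       # start of the run in the original data (assumed field)


def extract_fill_frames(data: bytes, frame_length_threshold: int = 8) -> List[FillFrame]:
    """Scan for runs of one byte; emit a frame only when the run exceeds the threshold."""
    frames: List[FillFrame] = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        run = j - i
        if run > frame_length_threshold:
            frames.append(FillFrame(len(frames), data[i], run, i))
        i = j
    return frames
```

Runs at or below the threshold are left in place, on the assumption that a frame header would cost more than the bytes it replaces.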
6. The method according to any one of claims 3 to 5, wherein the performing a differential comparison calculation on the original file data, extracting differential byte data in the original file data, and obtaining differential byte information corresponding to the differential byte data comprises:
dividing the original file data according to a preset block length to obtain at least two data blocks;
acquiring a target data block from the at least two data blocks, and performing differential comparison processing on the target data block and all data blocks before the target data block to obtain differential byte data of the target data block;
acquiring differential byte information corresponding to the differential byte data, acquiring a next data block of the target data block, taking the next data block as the target data block, and returning to the step of performing differential comparison processing on the target data block and all data blocks before the target data block;
and when the next data block does not exist, obtaining all the differential byte information in the original file data.
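The block loop of claim 6 can be sketched as follows. The claim does not say how a target block is compared with the preceding blocks; this sketch assumes an exact-match search over all earlier bytes, and the tuple layout of the results is hypothetical.

```python
from typing import List, Tuple


def split_blocks(data: bytes, block_length: int) -> List[bytes]:
    """Divide the data into blocks of the preset length (the last block may be shorter)."""
    return [data[i:i + block_length] for i in range(0, len(data), block_length)]


def diff_blocks(data: bytes, block_length: int) -> List[Tuple]:
    """Compare each block with everything that precedes it, in order."""
    results: List[Tuple] = []
    for index, block in enumerate(split_blocks(data, block_length)):
        start = index * block_length
        history = data[:start]          # all data blocks before the target block
        address = history.find(block)   # -1 when no identical bytes exist earlier
        if address >= 0:
            results.append(("match", index, address, len(block)))
        else:
            results.append(("new", index, block))
    return results
```

For example, `diff_blocks(b"ABCDABCDEFGH", 4)` reports the second block as a match at address 0 and the other two blocks as new data.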
7. The method according to claim 6, wherein the differential byte information at least includes first differential information, second differential information, and third differential information, and the obtaining at least one second type data frame of the original file data by performing data encoding on the differential byte data according to a second encoding method corresponding to the differential byte data includes:
acquiring first differential information of the target data block, wherein the first differential information comprises first differential data, a second sequence of the first differential data in all data frames and a second data length corresponding to the first differential data, and generating a new data frame corresponding to the original file data based on the first differential information; and/or
acquiring second differential information of the target data block, wherein the second differential information includes second differential data, a third sequence of the second differential data in all data frames, first reference data of the second differential data in all data blocks before the target data block, a first data address corresponding to the first reference data, and a third data length corresponding to the first reference data, and generating a first differential data frame of the original file data based on the third sequence, the first data address, the third data length, and the second differential data, wherein the second differential data is determined by performing a logical operation on the first reference data and the original differential data corresponding to the second differential information; and/or
acquiring third differential information of the target data block, wherein the third differential information includes third differential data, a fourth sequence of the third differential data in all data frames, second reference data of the third differential data in all data blocks before the target data block, a second data address corresponding to the second reference data, and a fourth data length corresponding to the second reference data, and generating a second differential data frame of the original file data based on the fourth sequence, the fourth data length, and the second data address.
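The three frame kinds of claim 7 can be illustrated with a small encoder. Several points are assumptions made for this sketch: XOR stands in for the unspecified "logical operation", the reference search is brute force, and the `max_diff_ratio` cut-off and all class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NewFrame:       # first differential information: literal bytes
    seq: int
    length: int
    data: bytes


@dataclass
class XorDiffFrame:   # second differential information: reference plus XOR delta
    seq: int
    address: int
    length: int
    diff: bytes


@dataclass
class CopyFrame:      # third differential information: pure back-reference
    seq: int
    address: int
    length: int


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode_block(seq: int, block: bytes, history: bytes, max_diff_ratio: float = 0.25):
    """Pick a frame kind for one block, given all bytes that precede it."""
    address = history.find(block)
    if address >= 0:
        return CopyFrame(seq, address, len(block))
    # Brute-force search for the same-length window with the fewest differing bytes.
    best_address, best_changed = -1, len(block) + 1
    for addr in range(0, len(history) - len(block) + 1):
        ref = history[addr:addr + len(block)]
        changed = sum(x != y for x, y in zip(ref, block))
        if changed < best_changed:
            best_address, best_changed = addr, changed
    if best_address >= 0 and best_changed <= max_diff_ratio * len(block):
        ref = history[best_address:best_address + len(block)]
        return XorDiffFrame(seq, best_address, len(block), xor_bytes(ref, block))
    return NewFrame(seq, len(block), block)
```

Because XOR is its own inverse, a decoder can recover the block by XOR-ing the stored delta with the same reference bytes. A mostly-zero delta also tends to compress well in the later compression step.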
8. The method according to claim 6 or 7, wherein there are a plurality of preset block lengths, the method further comprising:
determining, for each of the plurality of preset block lengths, the frame coding efficiency of the at least one second type data frame corresponding to that preset block length;
determining the highest frame coding efficiency among the frame coding efficiencies, and determining at least one target second type data frame indicated by the highest frame coding efficiency;
the obtaining of the at least one second type data frame of the original file data includes:
and taking the at least one target second type data frame as at least one second type data frame of the original file data.
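The block-length selection of claim 8 can be sketched by encoding once per candidate length and keeping the best result. The claims do not define "coding efficiency"; this sketch assumes it means original size divided by encoded size, and it uses a hypothetical cost model with a fixed 9-byte frame header and exact-match references only.

```python
from typing import Dict, Iterable, Tuple

HEADER_SIZE = 9  # assumed per-frame overhead (type, sequence, length, address)


def encoded_size(data: bytes, block_length: int) -> int:
    """Estimate the encoded size: a matched block costs a header, a new block header plus payload."""
    size = 0
    for start in range(0, len(data), block_length):
        block = data[start:start + block_length]
        if data[:start].find(block) >= 0:
            size += HEADER_SIZE
        else:
            size += HEADER_SIZE + len(block)
    return size


def pick_block_length(data: bytes, candidates: Iterable[int]) -> Tuple[int, Dict[int, float]]:
    """Return the candidate block length with the highest efficiency (data must be non-empty)."""
    efficiency = {n: len(data) / encoded_size(data, n) for n in candidates}
    best = max(efficiency, key=efficiency.get)
    return best, efficiency
```

For a 128-byte input made of `b"ABCDEFGH"` repeated 16 times, the candidates 4, 8, and 64 give estimated sizes of 296, 152, and 82 bytes, so 64 is selected under this cost model.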
9. The method according to any one of claims 2 to 5, wherein after performing the repeated byte calculation on the original file data, extracting repeated byte data in the original file data, and obtaining repeated byte information corresponding to the repeated byte data, the method further comprises:
filtering the repeated byte data in the original file data to obtain filtered target file data;
the differential comparison calculation of the original file data, the extraction of the differential byte data in the original file data, and the acquisition of the differential byte information corresponding to the differential byte data includes:
and carrying out differential comparison calculation on the target file data, extracting differential byte data in the target file data, and acquiring differential byte information corresponding to the differential byte data.
10. The method according to any one of claims 1 to 5, wherein the compressing the various types of data frames to obtain the data compression packet corresponding to the original file data comprises:
and acquiring a target compression algorithm corresponding to the original file data, and compressing the various data frames based on the target compression algorithm to obtain a data compression packet corresponding to the original file data.
11. The method of claim 10, wherein acquiring the target compression algorithm corresponding to the original file data comprises:
respectively compressing the various types of data frames based on a plurality of compression algorithms to generate compressed data packets respectively corresponding to the plurality of compression algorithms;
determining compression rates respectively corresponding to the plurality of compression algorithms based on the compression data packets, determining the highest compression rate in the compression rates, and taking the compression algorithm indicated by the highest compression rate as a target compression algorithm;
or,
and acquiring a preset compression algorithm corresponding to the original file data.
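The first branch of claim 11 amounts to trying each candidate compressor and keeping the one with the best result. The sketch below assumes the candidates are Python's standard `zlib`, `bz2`, and `lzma` modules, which the patent does not name, and treats "highest compression rate" as the smallest compressed output.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict, Tuple

# Candidate algorithms are an assumption; any bytes -> bytes compressor fits.
CANDIDATES: Dict[str, Callable[[bytes], bytes]] = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def pick_compressor(frame_bytes: bytes,
                    candidates: Dict[str, Callable[[bytes], bytes]] = CANDIDATES
                    ) -> Tuple[str, Dict[str, int]]:
    """Compress with every candidate and return the name that yields the smallest output."""
    sizes = {name: len(fn(frame_bytes)) for name, fn in candidates.items()}
    best = min(sizes, key=sizes.get)
    return best, sizes
```

The second branch of the claim, a preset algorithm, would skip the trial and look the name up directly.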
12. The method according to claim 10 or 11, characterized in that the method further comprises:
and acquiring a data frame decoding file and a decompression file of the target compression algorithm, and generating a data packaging packet based on the data compression packet, the decompression file and the data frame decoding file.
13. A method for data decompression, the method comprising:
acquiring a data compression packet;
decompressing the data compression packet to obtain at least one type of data frame;
and performing data decoding on the at least one type of data frames based on the decoding modes corresponding to the various types of data frames in the at least one type of data frames to obtain original file data corresponding to the data compression packet.
14. The method of claim 13, wherein obtaining the data compression packet and the decompression file and the data frame decoding file corresponding to the data compression packet comprises:
and acquiring a data encapsulation packet, and decapsulating the data encapsulation packet to obtain a data compression packet, and a decompression file and a data frame decoding file corresponding to the data compression packet.
15. The method according to claim 13 or 14, wherein the decoding the at least one type of data frame based on the decoding manner corresponding to each type of data frame in the at least one type of data frame to obtain the original file data corresponding to the data compression packet comprises:
and carrying out de-difference processing on at least one first type data frame and at least one second type data frame based on the data frame decoding file to obtain original file data subjected to de-difference processing.
16. The method according to claim 15, wherein said de-differentiating said at least one first type of data frame and said at least one second type of data frame based on said data frame decoding file to obtain de-differentiated original file data comprises:
acquiring a target data frame from among the at least one first type data frame and the at least one second type data frame based on the data frame decoding file;
performing de-difference processing on the target data frame to obtain an original data frame corresponding to the target data frame;
acquiring a next data frame corresponding to the target data frame, taking the next data frame as the target data frame, and executing the step of performing the de-difference processing on the target data frame;
and when the next data frame does not exist, generating original file data based on all the original data frames.
17. The method according to claim 16, wherein the performing the de-difference processing on the target data frame to obtain an original data frame corresponding to the target data frame comprises:
when the target data frame is the first type data frame, performing first de-difference processing on the target data frame to obtain original filling data;
and when the target data frame is the second type data frame, performing second de-difference processing on the target data frame to obtain original differential data.
18. The method of claim 17, wherein performing the first de-difference processing on the target data frame to obtain original filling data comprises:
acquiring repeated byte information corresponding to the target data frame, wherein the repeated byte information comprises the repeated bytes, a filling length, a first data length corresponding to the repeated bytes and a first sequence of the repeated bytes in all data frames;
and performing the first de-difference processing based on the repeated byte information to obtain original filling data corresponding to the target data frame.
19. The method of claim 17, wherein the second type of data frame comprises at least a new data frame, a first differential data frame, and a second differential data frame,
and performing the second de-difference processing on the target data frame to obtain original differential data comprises:
when the second type data frame is a new data frame, acquiring first differential data of the new data frame, a second sequence of the first differential data in all data frames and a second data length corresponding to the first differential data, and performing the second de-difference processing based on the first differential data, the second sequence and the second data length to obtain original new data;
when the second type data frame is a first differential data frame, acquiring a third data length, second differential data, and a first data address of the first differential data frame; acquiring first reference data indicated at the first data address based on the third data length and the first data address; and performing the second de-difference processing on the first reference data and the second differential data to obtain first original differential data;
when the second type data frame is a second differential data frame, acquiring a fourth data length and a second data address of the second differential data frame, acquiring second reference data indicated at the second data address, and performing the second de-difference processing based on the second reference data and the fourth data length to obtain second original differential data.
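The frame-by-frame decoding of claims 16 to 19 can be sketched as a loop that rebuilds the output in frame order. The dictionary-based frame layout and the kind names `fill`, `new`, `xor`, and `copy` are hypothetical, and XOR is again assumed as the logical operation. Addresses are taken to refer to bytes already reconstructed.

```python
from typing import Dict, List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def decode_frames(frames: List[Dict]) -> bytes:
    """Rebuild the original data by processing frames in sequence order."""
    out = bytearray()
    for frame in sorted(frames, key=lambda f: f["seq"]):
        kind = frame["kind"]
        if kind == "fill":      # first type frame: expand the repeated byte
            out += bytes([frame["byte"]]) * frame["length"]
        elif kind == "new":     # new data frame: literal bytes
            out += frame["data"][:frame["length"]]
        elif kind == "xor":     # first differential frame: reference XOR delta
            start = frame["address"]
            ref = bytes(out[start:start + frame["length"]])
            out += xor_bytes(ref, frame["diff"])
        elif kind == "copy":    # second differential frame: copy the reference
            start = frame["address"]
            out += out[start:start + frame["length"]]
        else:
            raise ValueError(f"unknown frame kind: {kind!r}")
    return bytes(out)
```

Sorting by `seq` first means the frames may arrive in any order, which matches the claims' use of a frame sequence for every frame type.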
20. The method according to claim 13, wherein before performing data decoding on the at least one type of data frame based on the decoding manner corresponding to each type of data frame in the at least one type of data frame, the method further comprises:
and acquiring the frame sequence and the frame check value of each type of data frame in the at least one type of data frame, and performing data check based on the frame sequence and the frame check value.
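The pre-decoding check of claim 20 can be sketched as a sequence-continuity test plus a per-frame checksum. The claim does not name the check algorithm; CRC-32 via `zlib.crc32` and the `(seq, payload, check)` tuple layout are assumptions made here.

```python
import zlib
from typing import List, Tuple


def check_frames(frames: List[Tuple[int, bytes, int]]) -> bool:
    """Verify that sequences run 0, 1, 2, ... and every payload matches its check value."""
    for expected_seq, (seq, payload, check) in enumerate(frames):
        if seq != expected_seq:
            return False  # missing, duplicated, or reordered frame
        if zlib.crc32(payload) & 0xFFFFFFFF != check:
            return False  # payload corrupted
    return True
```

A decoder would call this before any de-difference processing and reject the packet on failure, since one bad reference frame would corrupt every later frame that points at it.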
21. A data compression apparatus for performing the method of any one of claims 1 to 12, the apparatus comprising:
the data information acquisition module is used for acquiring at least one type of characteristic data contained in original file data and acquiring data information of various types of characteristic data in the at least one type of characteristic data;
the data information coding module is used for respectively carrying out data coding on various data information based on coding modes corresponding to the data information of various characteristic data to obtain data frames corresponding to the various data information;
and the data compression packet generation module is used for compressing the data frames corresponding to the various data information respectively to obtain the data compression packet corresponding to the original file data.
22. A data decompression apparatus for performing the method according to any one of claims 13 to 20, the apparatus comprising:
the compressed packet acquisition module is used for acquiring a data compressed packet;
the compressed packet decompression module is used for decompressing the data compressed packet to obtain at least one type of data frame;
and the data frame decoding module is used for carrying out data decoding on the at least one type of data frames based on the decoding modes corresponding to the various types of data frames in the at least one type of data frames to obtain the original file data corresponding to the data compression packet.
23. A computer storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor and to perform the method steps according to any one of claims 1 to 12 or claims 13 to 20.
24. An electronic device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps according to any one of claims 1 to 12 or claims 13 to 20.
CN202011001267.4A 2020-09-22 2020-09-22 Data compression method and device, data decompression method and device, storage medium and electronic equipment Pending CN112165331A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011001267.4A CN112165331A (en) 2020-09-22 2020-09-22 Data compression method and device, data decompression method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011001267.4A CN112165331A (en) 2020-09-22 2020-09-22 Data compression method and device, data decompression method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN112165331A true CN112165331A (en) 2021-01-01

Family

ID=73862547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011001267.4A Pending CN112165331A (en) 2020-09-22 2020-09-22 Data compression method and device, data decompression method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112165331A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112732319A (en) * 2021-01-22 2021-04-30 百度在线网络技术(北京)有限公司 File upgrading method, device, equipment and storage medium
CN112732319B (en) * 2021-01-22 2024-02-23 百度在线网络技术(北京)有限公司 File upgrading method, device, equipment and storage medium
CN113094346A (en) * 2021-03-10 2021-07-09 北京四达时代软件技术股份有限公司 Big data coding and decoding method and device based on time sequence
CN115150348A (en) * 2021-03-30 2022-10-04 奇安信科技集团股份有限公司 Mail attachment restoring method and system
CN115150348B (en) * 2021-03-30 2024-05-03 奇安信科技集团股份有限公司 Mail attachment restoring method and system
CN113595557A (en) * 2021-09-30 2021-11-02 阿里云计算有限公司 Data processing method and device
CN114500689A (en) * 2022-01-30 2022-05-13 合肥美的电冰箱有限公司 Bus communication method, device, communication board, household appliance and storage medium
CN114500689B (en) * 2022-01-30 2023-09-08 合肥美的电冰箱有限公司 Bus communication method, device, communication board, household electrical appliance and storage medium
CN114564253A (en) * 2022-03-02 2022-05-31 重庆紫光华山智安科技有限公司 Task creation method, system, electronic device and readable storage medium
CN114564253B (en) * 2022-03-02 2023-06-09 重庆紫光华山智安科技有限公司 Task creation method, system, electronic device and readable storage medium
CN114978425A (en) * 2022-05-09 2022-08-30 清华大学 Remote measuring data elastic transmission method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112165331A (en) Data compression method and device, data decompression method and device, storage medium and electronic equipment
EP2605133A1 (en) Software version upgrading method, terminal and system
CN111262876B (en) Data processing method, device and equipment based on block chain and storage medium
US7821426B2 (en) Adaptive entropy coding compression output formats
CN110334066A (en) A kind of Gzip decompression method, apparatus and system based on FPGA
JP2014027658A (en) Compression encoding and decoding method and apparatus
CN113630125A (en) Data compression method, data encoding method, data decompression method, data encoding device, data decompression device, electronic equipment and storage medium
JP2014526098A (en) Method and system for downloading font files
US6748520B1 (en) System and method for compressing and decompressing a binary code image
JP2003524983A (en) Method and apparatus for optimized lossless compression using multiple coders
CN108880559B (en) Data compression method, data decompression method, compression equipment and decompression equipment
KR101470505B1 (en) Apparatus for compressing spatial data and method thereof, and apparatus for decompressing spatial data and method thereof
CN113055455A (en) File uploading method and equipment
CN109474826B (en) Picture compression method and device, electronic equipment and storage medium
CN108829872A (en) Immediate processing method, equipment, system and the storage medium of lossless compression file
CN114282141A (en) Processing method and device for compression format data, electronic equipment and readable storage medium
CN114035822A (en) File updating method and equipment
CN113986820A (en) Method for converting LZ4 format file into GZIP format file
CN113014551B (en) Data decompression method, data transmission method based on data decompression method, computer device and readable storage medium
CN112905575A (en) Data acquisition method, system, storage medium and electronic equipment
JP2005521324A (en) Method and apparatus for lossless data compression and decompression
US7564383B2 (en) Compression ratio of adaptive compression algorithms
CN110545107A (en) data processing method and device, electronic equipment and computer readable storage medium
CN114365096A (en) Memory allocation method, device, terminal and computer readable storage medium
CN116170115B (en) Digital fountain coding and decoding method, device and system based on codebook

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20210126

Address after: 201800 4th floor, building 43, 1485 Jialuo Road, Jiading District, Shanghai

Applicant after: SHANGHAI CETC-MOTOR Co.,Ltd.

Address before: 201800 4th floor, building 43, 1485 Jialuo Road, Jiading District, Shanghai

Applicant before: SHANGHAI CETC-MOTOR Co.,Ltd.

Applicant before: SHANGHAI OFILM INTELLIGENT VEHICLE Co.,Ltd.

WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210101
